Data dimensionality reduction based on genetic selection of feature subsets

K. M. Faraoun; A. Rabhi

Published: Jun 1, 2007

Keywords:

Feature selection, genetic algorithms, pattern classification

K. M. Faraoun

UDK University

A. Rabhi

UDK University

Abstract

In this paper, we show that a multi-class classification process can be significantly enhanced by selecting an optimal subset of the features used as input for training. Selecting such a subset reduces the dimensionality of the data samples and eliminates the redundancy and ambiguity introduced by some attributes. The classifier can then learn from the selected features only. A genetic search is used to explore the set of all possible feature subsets, whose size grows exponentially with the number of features. A new measure is proposed to compute the information gain provided by each feature subset, and it is used as the fitness function of the genetic search. Experiments are performed on the KDD99 dataset to classify DoS network intrusions described by the 41 available features. The optimality of the obtained feature subset is then tested using a multi-layered neural network. The results show that the proposed approach can improve both the classification rate and the learning runtime.
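
To make the search space concrete: with 41 features there are 2^41 (about 2.2 × 10^12) candidate subsets, which is why an exhaustive scan is impractical. The sketch below illustrates a generic genetic search over bitmask-encoded feature subsets. It is not the authors' implementation: the operators (binary tournament selection, one-point crossover, bit-flip mutation, elitism), all parameter values, and the names `genetic_feature_search`, `toy_fitness`, and `INFORMATIVE` are assumptions chosen for illustration, and the toy fitness merely stands in for the information-gain measure proposed in the paper, which the abstract does not specify.

```python
import random


def genetic_feature_search(n_features, fitness, pop_size=30, generations=40,
                           crossover_rate=0.8, mutation_rate=0.02, seed=0):
    """Search bitmask-encoded feature subsets with a simple generational GA.

    Assumes n_features >= 2. All operators and defaults are illustrative.
    """
    rng = random.Random(seed)
    # Each individual is a list of 0/1 flags: 1 means the feature is kept.
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        # Elitism: carry the best subset found so far forward unchanged.
        next_gen = [best[:]]
        while len(next_gen) < pop_size:
            # Binary tournament selection for each parent.
            p1 = max(rng.sample(population, 2), key=fitness)
            p2 = max(rng.sample(population, 2), key=fitness)
            child = p1[:]
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, n_features)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            # Bit-flip mutation: each flag flips with a small probability.
            child = [1 - b if rng.random() < mutation_rate else b
                     for b in child]
            next_gen.append(child)
        population = next_gen
        best = max(population, key=fitness)
    return best


# Hypothetical stand-in for the paper's fitness measure: reward a made-up
# set of "informative" feature indices and penalise large subsets.
INFORMATIVE = {0, 3, 7}


def toy_fitness(mask):
    hits = sum(mask[i] for i in INFORMATIVE)
    return hits - 0.1 * sum(mask)


best = genetic_feature_search(12, toy_fitness)
selected = [i for i, bit in enumerate(best) if bit]
print(selected)
```

In a real experiment, `toy_fitness` would be replaced by a measure computed from the training data, and the selected indices would define the reduced input vector passed to the classifier.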

How to Cite

Faraoun, K. M., & Rabhi, A. (2007). Data dimensionality reduction based on genetic selection of feature subsets. INFOCOMP Journal of Computer Science, 6(2), 9–19. Retrieved from https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/169

Issue

Vol. 6 No. 2 (2007): June, 2007

Section

Articles

Upon receipt of accepted manuscripts, authors will be invited to complete a copyright license to publish the paper. At least the corresponding author must send the signed copyright form for publication. It is a condition of publication that authors grant an exclusive licence to the INFOCOMP Journal of Computer Science. This ensures that requests from third parties to reproduce articles are handled efficiently and consistently, and it also allows the article to be disseminated as widely as possible. In assigning the copyright license, authors may use their own material in other publications, provided that the INFOCOMP Journal of Computer Science is acknowledged as the original place of publication.
