Feature Selection For Genomic Data By Combining Filter And Wrapper Approaches
Abstract
Gene expression data usually contain a large number of genes but only a small number of samples. Feature selection for gene expression data aims to find a set of genes that best discriminates between biological samples of different types. In this paper, we propose a two-stage selection algorithm for genomic data that combines MRMR (Minimum Redundancy Maximum Relevance) with a GA (Genetic Algorithm). In the first stage, MRMR filters out noisy and redundant genes in high-dimensional microarray data. In the second stage, the GA uses classifier accuracy as its fitness function to select the most discriminating genes. The proposed method is tested on five open datasets (NCI, Lymphoma, Lung, Leukemia and Colon) using Support Vector Machine and Naïve Bayes classifiers. A comparison of MRMR-GA with the MRMR filter alone and the GA wrapper alone shows that our method finds the smallest gene subset that yields the highest classification accuracy under leave-one-out cross-validation (LOOCV).
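The two-stage idea can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions made here for brevity: expression values are discretized into three states, MRMR uses the difference form (relevance minus mean redundancy), a nearest-centroid classifier stands in for the SVM and Naïve Bayes classifiers named in the abstract so that only the Python standard library is needed, and the fitness adds a small subset-size penalty. All function names, parameters and the toy data are hypothetical.

```python
import math
import random
from collections import Counter

def mutual_information(a, b):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((c / n) * math.log(c * n / (pa[x] * pb[y]))
               for (x, y), c in pab.items())

def discretize(column):
    """Map values to -1 / 0 / +1 using mean +/- half a standard deviation."""
    mu = sum(column) / len(column)
    sd = math.sqrt(sum((v - mu) ** 2 for v in column) / len(column))
    return [(v > mu + sd / 2) - (v < mu - sd / 2) for v in column]

def mrmr(X, y, k):
    """Stage 1 (filter): greedily pick k genes, k <= number of genes."""
    cols = [discretize([row[j] for row in X]) for j in range(len(X[0]))]
    relevance = [mutual_information(c, y) for c in cols]
    selected = [max(range(len(cols)), key=relevance.__getitem__)]
    while len(selected) < k:
        def score(j):
            # Relevance to the class minus mean redundancy with chosen genes.
            red = sum(mutual_information(cols[j], cols[s]) for s in selected)
            return relevance[j] - red / len(selected)
        rest = [j for j in range(len(cols)) if j not in selected]
        selected.append(max(rest, key=score))
    return selected

def loocv_accuracy(X, y, genes):
    """LOOCV accuracy of a nearest-centroid classifier on the given genes."""
    if not genes:
        return 0.0
    hits = 0
    for i in range(len(X)):
        sums, counts = {}, Counter()
        for r, (row, label) in enumerate(zip(X, y)):
            if r == i:
                continue  # leave sample i out of training
            acc = sums.setdefault(label, [0.0] * len(genes))
            for t, g in enumerate(genes):
                acc[t] += row[g]
            counts[label] += 1
        def dist(label):
            return sum((X[i][g] - sums[label][t] / counts[label]) ** 2
                       for t, g in enumerate(genes))
        hits += (min(sums, key=dist) == y[i])
    return hits / len(X)

def ga_select(X, y, candidates, pop_size=20, generations=15, p_mut=0.1, seed=0):
    """Stage 2 (wrapper): evolve bit masks over the MRMR candidates."""
    rng = random.Random(seed)
    def fitness(mask):
        genes = [g for g, bit in zip(candidates, mask) if bit]
        # Accuracy first; the small penalty (an assumption) favours fewer genes.
        return loocv_accuracy(X, y, genes) - 1e-3 * len(genes)
    pop = [[rng.randint(0, 1) for _ in candidates] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]  # the fitter half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(candidates))  # one-point crossover
            children.append([bit ^ (rng.random() < p_mut)
                             for bit in a[:cut] + b[cut:]])
        pop = parents + children
    best = max(pop, key=fitness)
    return [g for g, bit in zip(candidates, best) if bit]

# Toy data (made up): gene 0 separates the two classes, the rest is noise.
rng = random.Random(1)
y = [0] * 10 + [1] * 10
X = [[4 * label + rng.uniform(-0.3, 0.3)] + [rng.gauss(0, 1) for _ in range(9)]
     for label in y]
candidates = mrmr(X, y, k=5)          # stage 1: shortlist of genes
chosen = ga_select(X, y, candidates)  # stage 2: subset of the shortlist
print(candidates, chosen, loocv_accuracy(X, y, chosen))
```

Running the filter first keeps the GA's search space small: the wrapper only evaluates subsets of the shortlist, so the costly LOOCV fitness is computed over a handful of genes rather than the thousands in a typical microarray.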
Article Details
How to Cite
Akadi, A. E., Amine, A., El Ouardighi, A., & Aboutajdine, D. (2009). Feature Selection For Genomic Data By Combining Filter And Wrapper Approaches. INFOCOMP Journal of Computer Science, 8(4), 28–36. Retrieved from https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/279
Upon receipt of accepted manuscripts, authors will be invited to complete a copyright license to publish the paper. At least the corresponding author must send the signed copyright form for publication. It is a condition of publication that authors grant an exclusive licence to the INFOCOMP Journal of Computer Science. This ensures that requests from third parties to reproduce articles are handled efficiently and consistently, and it also allows the article to be disseminated as widely as possible. In assigning the copyright license, authors may use their own material in other publications and must ensure that the INFOCOMP Journal of Computer Science is acknowledged as the original place of publication.