Abstract
Recent tools that analyze microarray expression data have exploited correlation-based approaches such as clustering analysis. We describe a new method for assessing the importance of genes for sample classification based on expression data. Our approach combines a genetic algorithm (GA) and the k-nearest neighbor (KNN) method to identify genes that can jointly discriminate between two types of samples (e.g., normal vs. tumor). First, many such subsets of differentially expressed genes are obtained independently using the GA. Then, the overall frequency with which genes were selected is used to deduce the relative importance of genes for sample classification. Sample heterogeneity is accommodated; that is, the method should be robust against the existence of distinct subtypes. We applied GA/KNN to expression data from normal versus tumor tissue from human colon. Two distinct clusters were observed when the 50 most frequently selected genes were used to classify all of the samples in the data sets studied, and the majority of samples were classified correctly. Identification of a set of differentially expressed genes could aid in tumor diagnosis and could also serve to identify disease subtypes that may benefit from distinct clinical approaches to treatment.
Keywords: Gene expression, Genetic algorithm (GA), K-nearest neighbor (KNN), Pattern recognition, Gene selection, High-dimensional, Microarray
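The two-stage procedure described in the abstract (a GA searches for gene subsets that discriminate the two sample classes under KNN, and genes are then ranked by how often they are selected across independent runs) can be illustrated with a minimal sketch. This is not the authors' implementation. The fitness function (leave-one-out accuracy of majority-vote KNN with k = 3), the GA operators (keep the fitter half, mutate one gene), the stopping target of 0.9, the subset size d = 3, and the synthetic data from the hypothetical helper make_toy_data are all assumptions chosen for brevity.

```python
import random
from collections import Counter

def knn_loo_accuracy(X, y, genes, k=3):
    """Leave-one-out accuracy of majority-vote KNN using only `genes` (assumed fitness)."""
    correct = 0
    for i in range(len(X)):
        # Squared Euclidean distance from sample i to every other sample.
        dists = sorted(
            (sum((X[i][g] - X[j][g]) ** 2 for g in genes), y[j])
            for j in range(len(X)) if j != i
        )
        votes = Counter(label for _, label in dists[:k])
        if votes.most_common(1)[0][0] == y[i]:
            correct += 1
    return correct / len(X)

def mutate(subset, n_genes, rng):
    """Replace one gene in the subset with a gene not currently in it."""
    child = list(subset)
    pos = rng.randrange(len(child))
    child[pos] = rng.choice([g for g in range(n_genes) if g not in subset])
    return tuple(child)

def ga_select_subset(X, y, d, rng, pop_size=20, generations=20, target=0.9):
    """One GA run: evolve subsets of d genes until one reaches the target accuracy."""
    n_genes = len(X[0])

    def fitness(subset):
        return knn_loo_accuracy(X, y, subset)

    pop = [tuple(rng.sample(range(n_genes), d)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        if fitness(pop[0]) >= target:
            break
        parents = pop[: pop_size // 2]  # assumed selection rule: keep the fitter half
        pop = parents + [mutate(rng.choice(parents), n_genes, rng) for _ in parents]
    return max(pop, key=fitness)

def gene_frequencies(X, y, d=3, runs=20, seed=0):
    """Run the GA independently `runs` times and count how often each gene is selected."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(runs):
        counts.update(ga_select_subset(X, y, d, rng))
    return counts

def make_toy_data(rng, n_per_class=10, n_genes=30, n_informative=3, shift=4.0):
    """Hypothetical data: the first `n_informative` genes are shifted in class 1."""
    y = [0] * n_per_class + [1] * n_per_class
    X = [
        [rng.gauss(shift * label if g < n_informative else 0.0, 1.0)
         for g in range(n_genes)]
        for label in y
    ]
    return X, y

X, y = make_toy_data(random.Random(1))
ranking = gene_frequencies(X, y).most_common(5)
print(ranking)  # (gene index, selection count) pairs, most frequent first
```

In this toy setting one would expect the shifted genes (indices 0 to 2) to tend toward the top of the ranking, since any subset containing one of them separates the classes well. The frequency ranking, rather than any single GA solution, is what the abstract uses to judge gene importance.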
Combinatorial Chemistry & High Throughput Screening
Title: Gene Assessment and Sample Classification for Gene Expression Data Using a Genetic Algorithm / k-nearest Neighbor Method
Volume: 4 Issue: 8
Author(s): Leping Li, Thomas A. Darden, Clarice R. Weinberg, A. J. Levine and Lee G. Pedersen
Cite this article as:
Leping Li, Thomas A. Darden, Clarice R. Weinberg, A. J. Levine and Lee G. Pedersen, Gene Assessment and Sample Classification for Gene Expression Data Using a Genetic Algorithm / k-nearest Neighbor Method, Combinatorial Chemistry & High Throughput Screening 2001; 4 (8). https://dx.doi.org/10.2174/1386207013330733
DOI: https://dx.doi.org/10.2174/1386207013330733
Print ISSN: 1386-2073
Publisher Name: Bentham Science Publisher
Online ISSN: 1875-5402