Abstract
In the spirit of reporting valid and reliable Quantitative Structure-Activity Relationship (QSAR) models, the aim of our research was to assess how leverage (analyzed with the hat matrix, hi) and influence (analyzed with Cook’s distance, Di) in QSAR models may reflect the models’ reliability and characteristics. The datasets included in this research were collected from previously published papers. Seven datasets that met the inclusion criteria were analyzed. Three models were obtained for each dataset (full-model, hi-model and Di-model), and several statistical validation criteria were applied to the models. In 5 out of 7 sets the correlation coefficient increased when compounds with either hi or Di higher than the threshold were removed. The number of withdrawn compounds varied from 2 to 4 for hi-models and from 1 to 13 for Di-models. Validation statistics showed that Di-models possess systematically better agreement than both full-models and hi-models. Removal of influential compounds from the training set significantly improves the model and is recommended during the development of quantitative structure-activity relationships. The Cook’s distance approach should be combined with hat matrix analysis in order to identify the compounds that are candidates for removal.
Keywords: Influential points, leverage effect, model sensitivity, model validation, quantitative structure-activity relationship (QSAR), Cook’s distance.
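The two diagnostics named in the abstract can be illustrated with a minimal sketch for an ordinary least-squares model. This is not the authors' implementation: the function names `leverage_and_cooks` and `flag_compounds` are hypothetical, and the cutoffs used (hi above a multiple of p/n, Di above 4/n) are common textbook conventions assumed here, since the abstract does not state which thresholds were applied.

```python
import numpy as np

def leverage_and_cooks(X, y):
    """Return (h, D): hat-matrix diagonal and Cook's distance for an OLS fit.

    X holds the descriptors (one row per compound); an intercept is added.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    A = np.column_stack([np.ones(len(y)), X])  # design matrix with intercept
    n, p = A.shape

    # Leverage: diagonal of H = A (A^T A)^-1 A^T, computed stably as Q Q^T.
    Q, _ = np.linalg.qr(A)
    h = np.sum(Q ** 2, axis=1)

    # Residuals and mean squared error of the full model.
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    mse = resid @ resid / (n - p)

    # Cook's distance: D_i = e_i^2 / (p * MSE) * h_i / (1 - h_i)^2.
    D = resid ** 2 / (p * mse) * h / (1.0 - h) ** 2
    return h, D

def flag_compounds(h, D, p, h_mult=3.0, d_cut=None):
    """Boolean masks of compounds above the ASSUMED cutoffs.

    h_mult * p / n and 4 / n are conventional choices, not values
    taken from the source.
    """
    n = len(h)
    if d_cut is None:
        d_cut = 4.0 / n
    return h > h_mult * p / n, D > d_cut

if __name__ == "__main__":
    # Toy data (invented for illustration): the last point lies far from
    # the others in descriptor space.
    x = np.array([1, 2, 3, 4, 5, 20], dtype=float)
    y = np.array([1.1, 1.9, 3.2, 3.9, 5.1, 10.0])
    h, D = leverage_and_cooks(x, y)
    hi_mask, di_mask = flag_compounds(h, D, p=2, h_mult=2.0)
    print("high leverage:", np.where(hi_mask)[0])
    print("influential:  ", np.where(di_mask)[0])
```

Under these assumptions, an hi-model or Di-model in the sense of the abstract would be obtained by refitting after dropping the compounds flagged by the corresponding mask.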
Combinatorial Chemistry & High Throughput Screening
Title: The Effect of Leverage and/or Influential on Structure-Activity Relationships
Volume: 16 Issue: 4
Author(s): Sorana D. Bolboaca and Lorentz Jantschi
Cite this article as:
Bolboaca D. Sorana and Jantschi Lorentz, The Effect of Leverage and/or Influential on Structure-Activity Relationships, Combinatorial Chemistry & High Throughput Screening 2013; 16 (4). https://dx.doi.org/10.2174/1386207311316040003
DOI: https://dx.doi.org/10.2174/1386207311316040003
Print ISSN: 1386-2073
Publisher Name: Bentham Science Publisher
Online ISSN: 1875-5402