Variable Subset Selection in the Presence of Flagged Observations and Multicollinear Descriptors in QSAR
Peter P. Mager
A major problem in traditional quantitative structure-activity relationship (QSAR) analysis is selecting suitable chemical descriptors from a large pool of variables. Decisions for or against a particular descriptor depend entirely on the results of statistically based hypothesis testing. Uncertain results may be produced in the presence of multicollinear descriptors and flagged observations (high-leverage points, outliers, influential data). To satisfy the assumptions of hypothesis testing, diagnostic statistics and subsequent design repair are employed. Here we show an example with nonnucleoside HIV-1 reverse transcriptase inhibitors.
Keywords: computer-assisted drug design, quantitative structure-activity relationships, regression analysis, hypothesis testing, design repair, diagnostic statistics, artificial neural networks, nonnucleoside HIV-1 reverse transcriptase inhibitors