Challenges of Target/Compound Data Integration from Disease to Chemistry: A Case Study of Dihydrofolate Reductase Inhibitors

Steven J. Potts; David J. Edwards; Remy Hoffman

Abstract

Despite the improvements in informatics associated with initiatives in the structure-based design and genomics fields, no straightforward links are available between a given disease class and drug chemistry. Establishing such links requires connecting disease to protein targets, and then mapping these targets to drug chemistry. In practice, protein-ligand structural analyses and high-throughput screening experiments generate the links between targets implicated in disease and chemical leads. Large volumes of relevant data are also being produced by high-throughput X-ray crystallography and in silico docking initiatives. Each of these efforts takes a distinctly different approach to how data are managed and mined, which makes it difficult to share data across these areas. This review discusses the diverse approaches taken to data management in these areas, and the challenges associated with constructing a data warehouse that meets the needs of every data type. Using the current work available for dihydrofolate reductase inhibitors, we demonstrate the challenges and opportunities associated with data mining from disease to drug chemistry.

Keywords: data integration, dihydrofolate reductase, chemical genomics, high-throughput crystallography, high-throughput screening, virtual docking, informatics