The Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial was a large,
randomized controlled trial of cancer screening that also evolved over time into a unique epidemiologic
cohort. Vast quantities of data have been collected since the beginning of the trial in 1993.
Screening data were obtained through 2006. Questionnaire-based risk factor data (collected at baseline
and at other points in the trial), vital status, cancer diagnoses and treatment, biospecimen data, and data from additional
ancillary efforts continue to be collected.
Accurate data collection and efficient management methods are required to ensure high-quality data and valid and consistent
analyses of trial outcomes. Information Management Services (IMS) was and continues to be responsible for processing
and converting the collected raw PLCO data into comprehensive and accessible datasets. IMS also continues to provide
a wide spectrum of analytic support including support for trial monitoring, data sharing, and epidemiologic research.
In this paper, we describe the data processing and management requirements from the analytic team's perspective, highlighting
the various data sources and their complexity. We also illustrate the construction of usable analytic data files and
discuss the wide range of analytic support provided. Instructions for accessing PLCO data are also included.