The gas chromatographic programmed-temperature retention indices (PTRIs) of 178 volatile organic compounds with highly diverse chemical structures, including terpenes, oxygenated terpenes such as esters, aldehydes, ketones, acids and alcohols, oxygenated benzene compounds, and alkanes, have been modeled quantitatively using topological indices. A prediction model was developed using a recently proposed method, modeling based on subspace orthogonal projection (MSOP), with Monte-Carlo cross-validation (MCCV). The model gave a correlation coefficient of R = 0.9993 and a root mean square error of prediction (RMSEP) of 36.0 i.u. A prediction dataset of 20 compounds collected from the NIST WebBook was then used to verify the stability and accuracy of the constructed model, and a low RMSEP of 32.6 i.u. was obtained. Consequently, the developed prediction model can support less ambiguous identification of natural volatile compounds by GC-MS when retention data for candidate structures are not available.
Keywords: Gas chromatography-mass spectrometry (GC-MS), Modeling based on subspace orthogonal projection (MSOP), Programmed-temperature retention indices (PTRIs), Quantitative structure-retention relationship (QSRR), Topological index (TI), Volatile constituents, Monte-Carlo cross-validation (MCCV), Natural compounds, Multiple linear regression (MLR), Prediction errors
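
As an illustration of the validation scheme named in the abstract, the following minimal sketch shows how an RMSEP can be estimated by Monte-Carlo cross-validation: the data are split at random many times, a model is fitted on each training part, and the squared prediction errors on the held-out parts are pooled. It does not reproduce MSOP. An ordinary least squares fit on a single hypothetical descriptor stands in for the regression step. The data are synthetic, and the function names, the number of splits, and the holdout fraction are all assumptions made for illustration.

```python
import math
import random


def rmsep(observed, predicted):
    """Root mean square error of prediction between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def fit_line(x, y):
    """Ordinary least squares fit y = a + b*x (stand-in for the real model)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def mccv_rmsep(x, y, n_splits=200, holdout_fraction=0.3, seed=0):
    """Pooled RMSEP over repeated random train/holdout splits (MCCV)."""
    rng = random.Random(seed)
    n = len(x)
    n_test = max(1, int(round(holdout_fraction * n)))
    squared_error = 0.0
    count = 0
    for _ in range(n_splits):
        idx = list(range(n))
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        for i in test:
            squared_error += (y[i] - (a + b * x[i])) ** 2
            count += 1
    return math.sqrt(squared_error / count)


if __name__ == "__main__":
    # Synthetic data only: a hypothetical descriptor and noisy "retention indices".
    rng = random.Random(1)
    descriptor = [float(v) for v in range(10, 40)]
    retention = [100.0 * d + rng.gauss(0.0, 30.0) for d in descriptor]
    print("MCCV RMSEP (synthetic):", mccv_rmsep(descriptor, retention))
```

Pooling the squared errors over all splits before taking the square root gives one overall RMSEP rather than an average of per-split values. Either convention is used in practice, and the abstract does not say which one was applied.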