Background: Traditional quantitative structure-property/activity relationships
(QSPRs/QSARs) are based on the representation of molecular structure by a molecular
graph or the simplified molecular input-line entry system (SMILES). It is an attractive idea to
develop predictive models for large molecules in general and for peptides in particular.
However, representing these molecules by a molecular graph or SMILES is problematic
owing to their large size. A possible alternative to SMILES is the
representation of peptides as sequences of amino acid abbreviations.
Method: Models for hemolysis and cytotoxicity of peptides are suggested. These models
are based on the representation of peptides by their amino acid sequences. Correlation
weights, calculated for each amino acid using the Monte Carlo method, are the basis
for quantitative sequence-activity relationships (QSARs) for antimicrobial peptides. The
correlation weights are the basis for optimal descriptors, which are correlated with experimental
data for hemolysis and cytotoxicity. The basic hypothesis is that if the optimal descriptors
correlate with the endpoints of peptides in the training set, they should also
correlate with the endpoints in the validation set.
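The Method paragraph above can be illustrated with a minimal sketch. The abstract does not give the descriptor formula or the details of the Monte Carlo procedure, so the sketch assumes that the optimal descriptor is the sum of the correlation weights of the amino acids in a sequence, and that each Monte Carlo step randomly perturbs one weight and keeps the change only if the training-set correlation improves. All function names, parameters, and the toy data are hypothetical and serve only as illustration.

```python
import random

# Hypothetical toy data: (one-letter amino acid sequence, endpoint value).
# These values are made up for illustration; they are not experimental data.
TOY_TRAIN = [("KLAK", 1.2), ("GIGK", 0.8), ("KKLL", 1.9),
             ("AGGA", 0.3), ("LLKA", 1.5)]


def descriptor(seq, weights):
    """Assumed descriptor: sum of correlation weights over the sequence.

    Amino acids without a weight contribute 0.0 (an assumption).
    """
    return sum(weights.get(aa, 0.0) for aa in seq)


def pearson_r(xs, ys):
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def optimize_weights(train, n_steps=2000, step=0.1, seed=0):
    """Monte Carlo search for weights that maximize training-set r."""
    rng = random.Random(seed)
    letters = sorted({aa for seq, _ in train for aa in seq})
    weights = {aa: 1.0 for aa in letters}  # arbitrary starting point
    ys = [y for _, y in train]

    def score():
        return pearson_r([descriptor(s, weights) for s, _ in train], ys)

    best = score()
    for _ in range(n_steps):
        aa = rng.choice(letters)
        old = weights[aa]
        weights[aa] = old + rng.uniform(-step, step)  # random perturbation
        r = score()
        if r > best:
            best = r            # keep the improvement
        else:
            weights[aa] = old   # otherwise revert the change
    return weights, best
```

Under these assumptions, the hypothesis stated above would be checked by computing `pearson_r` between `descriptor(seq, weights)` and the endpoint values for peptides held out in a validation set, using the weights obtained from the training set alone.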
Results: Examination of the correlations between the above-mentioned descriptors and the antimicrobial
activity of peptides (cytotoxicity or hemolysis) showed that these models
have good predictive potential.
Conclusion: The suggested approach can be used as a tool to develop predictive models of the
biological activity of peptides as a mathematical function of their amino acid sequences.