Semantic Similarities as Discriminative Features of Protein Complexes
Pietro Hiram Guzzi,
Biological data about genes, proteins, and other biologically relevant molecules stored in databases may be
associated with biological information (knowledge) such as experiments, properties, functions, and response to drugs.
Such knowledge is formally structured into ontologies, which provide the best formalism to organize and store knowledge. In
the biological field, Gene Ontology (GO) provides both a categorization of annotating terms and a source of annotations
for genes and proteins. Consequently, it is possible to introduce novel analysis methodologies that are based on the use
of ontologies. Recently, semantic similarities, i.e. the calculation of the similarity of two or more proteins starting
from their annotations, have attracted growing interest. For instance, semantic measures have been used for the prediction
of protein complexes. Despite the importance of this research, some problems remain unsolved: the assessment of
semantic measures with respect to biological features, and an in-depth study of the impact of the chosen measure on the
obtained results. This paper focuses on the use of semantic similarity measures in the protein complex prediction
pipeline. To this end, we investigated whether there is a bias among different measures, and whether semantic
similarity is higher among proteins that participate in the same complex. Results confirm that proteins belonging to the
same complex have a higher average semantic similarity than the average over the proteome.
This supports a possible use of semantic similarity measures within protein complex prediction algorithms and suggests a
way to choose the best one among them.
Keywords: Ontologies, protein interaction networks, semantic similarity measures.
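
The comparison described in the abstract (average semantic similarity within a complex versus the average over the whole proteome) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy annotations, the protein and term labels, and the use of Jaccard overlap of annotation sets as a simple stand-in for a semantic similarity measure. It is not one of the measures evaluated in the paper.

```python
from itertools import combinations
from statistics import mean

# Hypothetical toy data: protein -> set of annotating terms.
# Labels "t1".."t5" are placeholders, not real GO identifiers.
annotations = {
    "P1": {"t1", "t2", "t3"},
    "P2": {"t1", "t2"},
    "P3": {"t2", "t3"},
    "P4": {"t4"},
    "P5": {"t4", "t5"},
}


def jaccard(a, b):
    """Stand-in similarity: shared terms divided by all terms of the pair."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def average_pairwise_similarity(proteins, annotations, sim=jaccard):
    """Mean similarity over all unordered pairs of the given proteins."""
    pairs = list(combinations(sorted(proteins), 2))
    if not pairs:
        return 0.0
    return mean(sim(annotations[p], annotations[q]) for p, q in pairs)


# Hypothetical complex made of three of the five proteins.
complex_members = {"P1", "P2", "P3"}

within_complex = average_pairwise_similarity(complex_members, annotations)
background = average_pairwise_similarity(annotations.keys(), annotations)

print(f"within complex: {within_complex:.3f}")
print(f"proteome-wide:  {background:.3f}")
```

Because `sim` is a parameter, any other measure with the same two-argument signature could be swapped in, so the same within-complex versus proteome-wide comparison can be repeated for each measure.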