Building a Biological Space Based on Protein Sequence Similarities and Biological Ontologies

Paul Kersey; David Lonsdale; Nicky J. Mulder; Robert Petryszak; Rolf Apweiler

Abstract

Assignment of function to protein sequences is a task of growing importance in the life sciences, as new high-throughput DNA sequencing technologies generate ever-increasing quantities of genomic and metagenomic data. Patterns within the sequence space, caused by the evolutionary conservation and assembly of protein domains, make it possible to infer function from sequence similarity. Clustering similar sequences is a useful technique for finding conserved sequences; the CluSTr database is a publicly available database that arranges proteins in a hierarchy structured by similarity. The protein classification tool InterProScan builds on this approach by applying a range of methods to detect proteins that contain signatures indicative of the presence of particular conserved domains. The use of ontologies to describe protein function provides a flexible and abstract language for classifying proteins. Together, these techniques can provide an understanding of the shape of the protein space, and can be used to explore the uncharted waters of the emerging metagenomic world.

Keywords: CluSTr, clustering, genomes, GO, InterPro, metagenomes, orthology, paralogy

Combinatorial Chemistry & High Throughput Screening

Volume: 11 Issue: 8

About this article

Cite this article as:

Kersey Paul, Lonsdale David, Mulder Nicky J., Petryszak Robert and Apweiler Rolf, Building a Biological Space Based on Protein Sequence Similarities and Biological Ontologies, Combinatorial Chemistry & High Throughput Screening 2008; 11(8). https://dx.doi.org/10.2174/138620708785739925

DOI: https://dx.doi.org/10.2174/138620708785739925
Print ISSN: 1386-2073
Online ISSN: 1875-5402
Publisher Name: Bentham Science Publishers
