Domain-based Comparative Analysis of Bacterial Proteomes: Uniqueness, Interactions, and the Dark Matter

Author(s): Liang Wang*, Jianye Yang, Yaping Xu, Xue Piao, Jichang Lv.

Journal Name: Current Genomics

Volume 20, Issue 2, 2019


Background: A protein may contain no domains, one domain, two domains, or multiple domains, and a single domain may appear in many proteins. The distribution patterns of domains may affect bacterial physiology and lifestyle.

Objective: This study examines how domains are distributed and duplicated in bacterial proteomes in order to better understand bacterial physiology and lifestyles.

Methods: We screened 944 bacterial reference proteomes against 16,712 hidden Markov models, using an E-value threshold of < 0.001. For each species, we calculated the number of non-redundant domains and the duplication rate of redundant domains, and identified unique domains, if any. In addition, the physicochemical properties of no-domain proteins were investigated.
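The per-species counts described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the input layout (tuples of protein ID, domain ID, and E-value), the function names `domain_stats` and `unique_domains`, and in particular the definition of the duplication rate (the fraction of domain occurrences that repeat a domain already present in the proteome) are assumptions made for illustration, since the abstract does not give the exact formula.

```python
from collections import Counter

E_VALUE_THRESHOLD = 1e-3  # threshold stated in the Methods (E-value < 0.001)


def domain_stats(hits, threshold=E_VALUE_THRESHOLD):
    """Summarise domain hits for one proteome.

    hits: iterable of (protein_id, domain_id, e_value) tuples
          (a hypothetical layout, not a real tool's output format).
    """
    # Keep only hits that pass the E-value threshold and count each domain.
    counts = Counter(domain for _, domain, e_value in hits if e_value < threshold)
    total = sum(counts.values())
    non_redundant = len(counts)
    # ASSUMED definition: share of occurrences that duplicate an existing domain.
    duplication_rate = (total - non_redundant) / total if total else 0.0
    return {
        "total_occurrences": total,
        "non_redundant_domains": non_redundant,
        "duplication_rate": duplication_rate,
    }


def unique_domains(domains_by_species):
    """Return, per species, the domains found in no other species.

    domains_by_species: dict mapping species name -> iterable of domain IDs.
    """
    # Count in how many species each domain appears at least once.
    species_count = Counter(
        domain for domains in domains_by_species.values() for domain in set(domains)
    )
    return {
        species: {d for d in set(domains) if species_count[d] == 1}
        for species, domains in domains_by_species.items()
    }


if __name__ == "__main__":
    example_hits = [
        ("p1", "PF1", 1e-5),
        ("p1", "PF2", 1e-4),
        ("p2", "PF1", 1e-10),
        ("p3", "PF3", 0.5),  # fails the threshold and is ignored
    ]
    print(domain_stats(example_hits))
    print(unique_domains({"sp_a": ["PF1", "PF2"], "sp_b": ["PF1", "PF3"]}))
```

A different definition of the duplication rate (for example, the mean copy number of redundant domains only) would change the numbers but not the overall structure of the calculation.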

Results: The number of non-redundant domains in a bacterial proteome increases following an asymptotic trend. The domain duplication rate is positively correlated with proteome size and increases more rapidly. A high percentage of single-domain proteins is associated with small proteome size. Unique domains were also identified for each proteome. Moreover, no-domain proteins differ from the other three groups (single-, double-, and multi-domain proteins) in several of the physicochemical properties analysed in this study.
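The four protein groups referred to above (no-domain, single-domain, double-domain, and multi-domain) and the percentage of single-domain proteins could be computed along the following lines. This is a sketch under stated assumptions: "multiple" is taken to mean three or more domains, domain instances rather than distinct domain families are counted, and the names `domain_group` and `group_percentages` are hypothetical.

```python
from collections import Counter

GROUPS = ("no-domain", "single-domain", "double-domain", "multi-domain")


def domain_group(n_domains):
    """Map a protein's domain count to one of the four groups."""
    if n_domains == 0:
        return "no-domain"
    if n_domains == 1:
        return "single-domain"
    if n_domains == 2:
        return "double-domain"
    return "multi-domain"  # ASSUMPTION: "multiple" means three or more


def group_percentages(domains_per_protein):
    """Percentage of proteins in each group for one proteome.

    domains_per_protein: dict mapping protein ID -> list of domain IDs
                         (an empty list for a protein with no domain hit).
    """
    groups = Counter(domain_group(len(d)) for d in domains_per_protein.values())
    total = sum(groups.values())
    if total == 0:
        return {group: 0.0 for group in GROUPS}
    return {group: 100.0 * groups[group] / total for group in GROUPS}


if __name__ == "__main__":
    proteome = {
        "p1": [],
        "p2": ["PF1"],
        "p3": ["PF1", "PF2"],
        "p4": ["PF1", "PF2", "PF3"],
    }
    print(group_percentages(proteome))
```

Comparing the `single-domain` percentage across proteomes of different sizes is one way to examine the association reported in the Results.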

Conclusion: The study confirms that a low domain duplication rate and a high percentage of single-domain proteins are more likely to be associated with a host-dependent or restricted niche-adapted bacterial lifestyle. In addition, analysis of species-specific domains and of core domain interactions or co-occurrences revealed distinctive lifestyles and physiology.

Keywords: Bacterial proteome, Hidden Markov model, Pfam, Bacterial lifestyle, Domain interaction, Domain redundancy.



Article Details

Year: 2019
Page: [115 - 123]
Pages: 9
DOI: 10.2174/1389202920666190320134438
