Current Protein & Peptide Science


ISSN (Print): 1389-2037
ISSN (Online): 1875-5550

Review Article

Free Open Source Software for Protein and Peptide Mass Spectrometry-based Science

Author(s): Filippo Rusconi*

Volume 22, Issue 2, 2021

Published on: 18 January, 2021

Page: [134 - 147] Pages: 14

DOI: 10.2174/1389203722666210118160946


In biology, and specifically in protein and peptide science, the power of mass spectrometry lies in its applicability to a vast range of problems. Mass spectrometry can be applied to identify proteins and peptides in complex mixtures, to identify and locate post-translational modifications, to characterize the structure of proteins and peptides to the most detailed level, or to detect protein-ligand non-covalent interactions. Thanks to the Free and Open Source Software (FOSS) movement, scientists have limitless opportunities to deepen their software development skills and to write software that solves mass spectrometric data analysis problems. Once raw data files have been converted into open standard formats, the entire spectrum of data analysis tasks can now be performed entirely on FOSS platforms, such as GNU/Linux, using only FOSS solutions. This review presents a brief history of open file formats for mass spectrometry and then describes FOSS projects that are commonly used in two fields of protein and peptide mass spectrometry: identification projects that involve mostly automated pipelines, like proteomics and peptidomics, and bio-structural characterization projects that most often involve manual scrutiny of the mass data. Projects of the latter kind usually involve software that lets the user delve into the mass data in an interactive, graphics-oriented manner. Software projects are thus categorized according to two criteria: software libraries for software developers versus desktop-based graphical user interface software for the end user, and automated pipeline-based data processing versus interactive graphics-based mass data scrutiny.

Keywords: Free Software, open source, mass spectrometry, proteins, peptides, structural biology.


© 2024 Bentham Science Publishers