![Page 1: The mzTab data standard format for reporting MS-based peptide, protein and small molecule identification and quantification results](https://reader035.vdocument.in/reader035/viewer/2022062419/558986e8d8b42a1e278b459d/html5/thumbnails/1.jpg)
mzTab - Reporting MS-based Proteomics and Metabolomics Results
Dr. Juan A. Vizcaíno on behalf of
Dr. Johannes Griss
Proteomics Services Team
EMBL-EBI
Hinxton, Cambridge, UK
Division of Immunology, Allergy and Infectious Diseases
Department of Dermatology
Medical University of Vienna, Austria
![Page 2: The mzTab data standard format for reporting MS-based peptide, protein and small molecule identification and quantification results](https://reader035.vdocument.in/reader035/viewer/2022062419/558986e8d8b42a1e278b459d/html5/thumbnails/2.jpg)
Johannes [email protected]
HUPO 2014
Overview
• Need for mzTab
• Details about the data format (mzTab 1.0)
• Existing software implementations
• Extension of mzTab 1.0 for metabolomics
![Page 3: The mzTab data standard format for reporting MS-based peptide, protein and small molecule identification and quantification results](https://reader035.vdocument.in/reader035/viewer/2022062419/558986e8d8b42a1e278b459d/html5/thumbnails/3.jpg)
HUPO Proteomics Standards Initiative
• Develops data format standards for proteomics.
• Both data representation and annotation standards.
• Involves data producers, database providers, software producers, publishers, …
• Active Workgroups: MI, MS, PI, Mod, (Protein Separation).
• Inter-group activities: MIAPE and Controlled Vocabularies.
• Started in 2002, so some experience already…
www.psidev.info
![Page 4: The mzTab data standard format for reporting MS-based peptide, protein and small molecule identification and quantification results](https://reader035.vdocument.in/reader035/viewer/2022062419/558986e8d8b42a1e278b459d/html5/thumbnails/4.jpg)
PSI-MS/PI Standard File Formats before mzTab
• TraML: SRM
• mzQuantML: quantitation
• mzIdentML: identification
• mzML: MS data
![Page 5: The mzTab data standard format for reporting MS-based peptide, protein and small molecule identification and quantification results](https://reader035.vdocument.in/reader035/viewer/2022062419/558986e8d8b42a1e278b459d/html5/thumbnails/5.jpg)
Reasons for an additional file format (mzTab)
• mzIdentML and mzQuantML (necessary) focus on complete representation of proteomics results
• Complex XML-based file formats
• Specialised software required for visualisation
• In-depth bioinformatics understanding required to create and use files
• No simple method to communicate final results to non-proteomics experts
• No simple method to utilise files through scripting languages and standard statistical software
![Page 6: The mzTab data standard format for reporting MS-based peptide, protein and small molecule identification and quantification results](https://reader035.vdocument.in/reader035/viewer/2022062419/558986e8d8b42a1e278b459d/html5/thumbnails/6.jpg)
mzTab – Aims
• Store final results of MS-based experiment in a single file
• Quantitation data
• Identification data
• Small Molecule data
• Reduce complexity to make data accessible to non-proteomics / bioinformatics experts
• Be easily accessible using “standard” software
![Page 7: The mzTab data standard format for reporting MS-based peptide, protein and small molecule identification and quantification results](https://reader035.vdocument.in/reader035/viewer/2022062419/558986e8d8b42a1e278b459d/html5/thumbnails/7.jpg)
mzTab – Aims
• What the format does NOT aim at:
• Replace mzIdentML or mzQuantML for proteomics approaches
• Contain the complete data of an MS-based experiment
• Provide fully detailed evidence for the data
• Allow a researcher to recreate the process which led to the results
![Page 8: The mzTab data standard format for reporting MS-based peptide, protein and small molecule identification and quantification results](https://reader035.vdocument.in/reader035/viewer/2022062419/558986e8d8b42a1e278b459d/html5/thumbnails/8.jpg)
Why a tab-delimited file?
• Using XML-based formats requires sophisticated bioinformatics expertise
• Many researchers are still used to using MS Excel to “look” at or exchange their data.
• Standard tab-delimited file formats for transcriptomics (MAGE-TAB) and molecular interactions (MI-TAB) data were already successful
![Page 10: The mzTab data standard format for reporting MS-based peptide, protein and small molecule identification and quantification results](https://reader035.vdocument.in/reader035/viewer/2022062419/558986e8d8b42a1e278b459d/html5/thumbnails/10.jpg)
mzTab - Sections
• Metadata: basic information about experiment and sample (key-value pairs)
• Protein: basic information about protein identifications (table-based)
• Peptide: information about quantified peptides (table-based)
• PSM: information about identified spectra (table-based)
• Small Molecule: basic information about identified small molecules (table-based)
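The section layout above can be read with a few lines of a scripting language. The following is a minimal sketch, not a validating parser: it assumes the three-letter line prefixes of mzTab 1.0 (MTD for metadata, PRH/PRT, PEH/PEP, PSH/PSM and SMH/SML for the header and data rows of each table) and that each header row precedes its data rows. The function name `read_mztab` and the toy input are made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Line prefixes assumed from mzTab 1.0: header rows and data rows per table.
HEADER_PREFIXES = {"PRH": "protein", "PEH": "peptide",
                   "PSH": "psm", "SMH": "small_molecule"}
ROW_PREFIXES = {"PRT": "protein", "PEP": "peptide",
                "PSM": "psm", "SML": "small_molecule"}


def read_mztab(handle):
    """Split an mzTab stream into a metadata dict and per-section row lists."""
    metadata = {}
    headers = {}
    tables = defaultdict(list)
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        if not fields or not fields[0]:
            continue  # skip empty lines between sections
        prefix = fields[0]
        if prefix == "MTD" and len(fields) >= 3:
            # Metadata section: one key-value pair per line.
            metadata[fields[1]] = fields[2]
        elif prefix in HEADER_PREFIXES:
            # Remember the column names of this table.
            headers[HEADER_PREFIXES[prefix]] = fields[1:]
        elif prefix in ROW_PREFIXES:
            # Pair each value with its column name (header must come first).
            section = ROW_PREFIXES[prefix]
            tables[section].append(dict(zip(headers[section], fields[1:])))
        # Any other prefix (for example COM comment lines) is ignored.
    return metadata, dict(tables)


# Toy input invented for this sketch; real files carry many more columns.
EXAMPLE = (
    "MTD\tmzTab-version\t1.0.0\n"
    "MTD\tmzTab-mode\tSummary\n"
    "MTD\tmzTab-type\tIdentification\n"
    "\n"
    "PRH\taccession\tdescription\n"
    "PRT\tP12345\tExample protein\n"
)

metadata, tables = read_mztab(io.StringIO(EXAMPLE))
print(metadata["mzTab-mode"])
print(tables["protein"][0]["accession"])
```

Because every line is self-describing through its prefix, the same file also opens directly in a spreadsheet program without any parsing code.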
![Page 12: The mzTab data standard format for reporting MS-based peptide, protein and small molecule identification and quantification results](https://reader035.vdocument.in/reader035/viewer/2022062419/558986e8d8b42a1e278b459d/html5/thumbnails/12.jpg)
mzTab – Modes and Types
• Modes (depending on the level of detail):
• ‘Summary’: only the ‘final results’.
• ‘Complete’: detailed information for each individual assay or replicate is provided.
• Types:
• ‘Identification’: Only identification results.
• ‘Quantification’: Quantification results; these files can also contain identification results.
• Overall, 4 different file “flavors” are possible, making the design very flexible.
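The four flavors are simply every combination of the two modes and the two types. The sketch below enumerates them and classifies a file from its metadata. It assumes the mode and type are stored under the metadata keys `mzTab-mode` and `mzTab-type`, as in mzTab 1.0; the helper name `flavor_of` is hypothetical.

```python
from itertools import product

MODES = ("Summary", "Complete")
TYPES = ("Identification", "Quantification")

# Every (mode, type) combination: the four possible file flavors.
FLAVORS = list(product(MODES, TYPES))


def flavor_of(metadata):
    """Return the (mode, type) pair declared in a metadata dict.

    Raises ValueError if either value is missing or not one of the
    values listed on the slide.
    """
    flavor = (metadata.get("mzTab-mode"), metadata.get("mzTab-type"))
    if flavor not in FLAVORS:
        raise ValueError(f"unsupported mode/type combination: {flavor}")
    return flavor


for mode, file_type in FLAVORS:
    print(f"{mode} {file_type}")
```

A reader can branch on the returned pair, for example to expect per-assay columns only in ‘Complete’ files.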
![Page 15: The mzTab data standard format for reporting MS-based peptide, protein and small molecule identification and quantification results](https://reader035.vdocument.in/reader035/viewer/2022062419/558986e8d8b42a1e278b459d/html5/thumbnails/15.jpg)
Peptide Section (label-free)
• Only used in “Quantification” files.
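This rule is easy to check mechanically. The following sketch scans raw lines and reports a problem when peptide rows appear in a file whose declared type is not ‘Quantification’. It assumes the PEH/PEP line prefixes and the `mzTab-type` metadata key of mzTab 1.0; the function name `peptide_section_problems` is made up for illustration, and this is not how any particular validator is implemented.

```python
def peptide_section_problems(lines):
    """Return a list of messages if a Peptide section is used outside
    a 'Quantification' file; an empty list means no problem was found."""
    file_type = None
    has_peptide_lines = False
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if fields[0] == "MTD" and len(fields) >= 3 and fields[1] == "mzTab-type":
            file_type = fields[2]
        elif fields[0] in ("PEH", "PEP"):
            has_peptide_lines = True
    if has_peptide_lines and file_type != "Quantification":
        return [f"Peptide section present but mzTab-type is {file_type!r}"]
    return []


# Toy lines invented for this sketch.
identification_file = [
    "MTD\tmzTab-type\tIdentification\n",
    "PEH\tsequence\taccession\n",
    "PEP\tPEPTIDEK\tP12345\n",
]
print(peptide_section_problems(identification_file))
```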
![Page 17: The mzTab data standard format for reporting MS-based peptide, protein and small molecule identification and quantification results](https://reader035.vdocument.in/reader035/viewer/2022062419/558986e8d8b42a1e278b459d/html5/thumbnails/17.jpg)
mzTab – Current implementations
• jmzTab (Java API): version 3.0 is now stable. Manuscript published in the journal Proteomics.
• mzTab Validator, PRIDE XML to mzTab converter (PRIDE team).
• mzIdentML and mzQuantML to mzTab converters (Andy Jones group).
• MaxQuant: a beta version of the exporter is available.
• OpenMS (version 1.10).
• R/Bioconductor package MSnbase (L. Gatto, Cambridge University).
• LipidDataAnalyzer (J. Hartler, University of Graz, see next talk).
• MetaboLights (EBI).
![Page 18: The mzTab data standard format for reporting MS-based peptide, protein and small molecule identification and quantification results](https://reader035.vdocument.in/reader035/viewer/2022062419/558986e8d8b42a1e278b459d/html5/thumbnails/18.jpg)
mzTab – ongoing development
• More detailed modelling of MS metabolomics data
• Led by S. Neumann (COSMOS EU FP7 project).
• Extension from one to three sections.
Example file exists at
https://github.com/sneumann/mtbls2/faahKO.mzTab
http://www.cosmos-fp7.eu/
![Page 19: The mzTab data standard format for reporting MS-based peptide, protein and small molecule identification and quantification results](https://reader035.vdocument.in/reader035/viewer/2022062419/558986e8d8b42a1e278b459d/html5/thumbnails/19.jpg)
mzTab format related publications
http://code.google.com/p/mztab/
J. Griss et al., MCP, 2014
Q.W. Xu et al., Proteomics, 2014
![Page 21: The mzTab data standard format for reporting MS-based peptide, protein and small molecule identification and quantification results](https://reader035.vdocument.in/reader035/viewer/2022062419/558986e8d8b42a1e278b459d/html5/thumbnails/21.jpg)
Current PSI-MS/PI Standard File Formats
• mzTab: final results
• TraML: SRM
• mzQuantML: quantitation
• mzIdentML: identification
• mzML: MS data
![Page 22: The mzTab data standard format for reporting MS-based peptide, protein and small molecule identification and quantification results](https://reader035.vdocument.in/reader035/viewer/2022062419/558986e8d8b42a1e278b459d/html5/thumbnails/22.jpg)
Acknowledgements
Johannes Griss, Qing-Wei Xu, Henning Hermjakob
Timo Sachsenberg, Mathias Walzer, Oliver Kohlbacher
http://mztab.googlecode.com
Andy Jones
S. Neumann and other COSMOS partners
PSI editor and reviewers
… and many others have also contributed
BBSRC PROCESS grant
BBSRC ProteoSuite grant