
Center for Mass Spectrometry and Proteomics | Phone | (612)625-2280 | (612)625-2279

© 2015 Regents of the University of Minnesota. All rights reserved.

Proteomics Data Analysis using Galaxy-P

Center for Mass Spectrometry and Proteomics

November 23rd 2015 Pratik Jagtap

http://www.cbs.umn.edu/msp


Proteomics Data Analysis using Galaxy-P
•  Proteomics Workflow
•  Search Databases
•  Galaxy Platform
•  Generating a Database within GalaxyP
•  Peaklist Conversion
•  Search algorithms
•  Using search algorithms within GalaxyP
•  Protein Inference
•  Using PeptideShaker within GalaxyP


Documentation: http://z.umn.edu/augworkshopgalaxyp


Eng et al. 2011 Mol Cell Proteomics. 10(11): R111.009522.

PROTEOMICS WORKFLOW


SEARCH DATABASES

[Figure: a mass spectrum is compared with a reference protein database (from genomic annotation) to produce a peptide spectral match.]


Swiss-Prot is the manually annotated and reviewed section of the UniProt Knowledgebase (UniProtKB). It is a high-quality, annotated and non-redundant protein sequence database that brings together experimental results, computed features and scientific conclusions. http://en.wikipedia.org/wiki/Swiss-Prot

TrEMBL contains high-quality computationally analyzed records, which are enriched with automatic annotation. The translations of annotated coding sequences in the EMBL-Bank/GenBank/DDBJ nucleotide sequence database are automatically processed and entered in TrEMBL. http://en.wikipedia.org/wiki/TrEMBL

PROTEOMIC DATABASES
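In a UniProt FASTA download, the two sections can be told apart by the header prefix: Swiss-Prot entries start with `>sp|` and TrEMBL entries with `>tr|`. The sketch below counts entries per section. The function names and the three example entries (accessions and sequences) are made up for illustration and are not taken from the slides.

```python
from collections import Counter


def uniprot_section(header):
    """Return 'Swiss-Prot', 'TrEMBL' or 'other' for a FASTA header line."""
    prefix = header.lstrip(">").split("|", 1)[0]
    return {"sp": "Swiss-Prot", "tr": "TrEMBL"}.get(prefix, "other")


def count_sections(fasta_text):
    """Count FASTA entries per UniProtKB section."""
    return Counter(
        uniprot_section(line)
        for line in fasta_text.splitlines()
        if line.startswith(">")
    )


# Hypothetical entries used only for illustration.
example = """>sp|P00000|EXMPL_HUMAN Made-up reviewed entry
MEEPQSDPSV
>tr|A0A000AAA0|A0A000AAA0_HUMAN Made-up unreviewed entry
MSSHEGGKKK
>contaminant_1 made-up non-UniProt header
MKWVTFISLL
"""

print(count_sections(example))
```

A count like this is a quick sanity check on how much of a search database is reviewed versus automatically annotated.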


CUSTOMIZED PROTEOMIC DATABASES

Sources for a customized database:

•  Proteomic databases and customized database repositories (CPTAC / UniMesh).
•  Genomic DNA sequences: six-frame translation.
•  Expressed sequence tags / cDNA sequences: three-frame translation.
•  Metagenomic databases: translation.
•  RNASeq data: translation and database reduction workflows.
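The translation steps listed on this slide can be sketched in a few lines. The example below is a minimal illustration, assuming the standard genetic code: it translates a nucleotide sequence in three forward frames (as for cDNA) or in all six frames (as for genomic DNA). The function names are hypothetical and this is not the Galaxy-P tool itself.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons; unknown codons become 'X', stops become '*'."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def frame_translations(dna, both_strands=True):
    """Return 3 forward translations, plus 3 reverse ones if both_strands."""
    dna = dna.upper()
    strands = [dna, reverse_complement(dna)] if both_strands else [dna]
    return [translate(s[offset:]) for s in strands for offset in range(3)]


six_frames = frame_translations("ATGGCCTAA")
three_frames = frame_translations("ATGGCCTAA", both_strands=False)
print(six_frames)
print(three_frames)
```

In practice the translated frames are usually split at stop codons and short fragments are discarded before the sequences are added to a search database.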


LOOKING BEYOND THE KNOWN PROTEOME

A mass spectrum can be searched against more than the reference protein database from genomic annotation:

•  Cancer / disease related databases such as COSMIC, IARC p53, OMIM…
•  Deep genome sequencing data from ICGC, TCGA and CPTAC.
•  RNASeq data (customized OR combined).
•  6-frame DNA sequences.
•  3-frame cDNA sequences.

Goal: identification of peptides corresponding to novel proteoforms.


GALAXY PLATFORM

Benefits of Galaxy
•  A web-based bioinformatics data analysis platform.
•  Software accessibility and usability.
•  Share-ability of tools, workflows and histories.
•  Reproducibility and ability to test and compare results after using multiple parameters.
•  Software tools can be used in a sequential manner to generate analytical workflows that can be reused, shared and creatively modified for multiple studies.

Goecks J et al Genome Biol. 2010;11(8):R86.


TOOLS & WORKFLOWS
•  Software tools can be used in a sequential manner to generate analytical workflows that can be reused, shared and creatively modified for multiple studies.

For example, Protein Database Downloader downloads UniProt protein FASTA databases of various organisms.


Galaxy-P: https://galaxyp.msi.umn.edu/


INPUTS: Mass spectral data and search database.

The dataset will be searched against a FASTA database containing human proteins, contaminant proteins, spiked-in proteins and a subset of the 3-frame translated cDNA database from Ensembl.

INPUTS:
a) MGF files from MGF Formatter (dataset collection).
b) ABRF-Spike4: FASTA sequences of the 4 spiked-in proteins.
c) FASTA file from Ensembl searches: subset of the 3-frame translated cDNA database from Ensembl (our template for identifying novel proteoforms).
d) Human UniProt FASTA file + contaminant proteins.

Sample processing:
•  HeLa cell lysate
•  4 proteins spiked in (10 fmol each)
•  Digested overnight with trypsin
•  Liquid chromatography fractionation (10 fractions)
•  Thermofinnigan Orbitrap Velos (Orbi MS, MS/MS HCD)
•  RAW files, converted with msconvert to mzML files
•  MGF files


•  Log in using your MSI login and password.
•  Click on http://z.umn.edu/history1, import the history and click on ‘start using this history’.
•  Click on http://z.umn.edu/workflow1 and choose import to copy the workflow into your user workflows. On the confirmation screen, select ‘start using this workflow’ to navigate to your user workflows.
•  In the workflows menu, select Run Workflow 1 from the drop-down menu.
•  Assign each input database from History 1 to the corresponding input of the workflow and ‘Run’ the workflow.

GENERATING A DATABASE
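The search database for this exercise combines several FASTA sources (human proteins, contaminants, spiked-in proteins and translated cDNA). Outside Galaxy, a merge of that kind could be sketched as below. This is an illustration only, not the implementation used by the Galaxy-P tools: the function names are hypothetical, and dropping entries with an identical sequence is an assumption about how duplicates might be handled.

```python
def read_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def merge_fasta(*sources):
    """Merge FASTA texts, keeping the first entry seen for each sequence."""
    seen, merged = set(), []
    for text in sources:
        for header, seq in read_fasta(text):
            if seq and seq not in seen:
                seen.add(seq)
                merged.append((header, seq))
    return merged


def write_fasta(entries, width=60):
    """Format (header, sequence) pairs as FASTA text with wrapped lines."""
    lines = []
    for header, seq in entries:
        lines.append(">" + header)
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


# Made-up miniature inputs standing in for the real FASTA files.
human = ">h1\nMKT\nAAA\n>h2\nPEPTIDE\n"
spike = ">s1\nPEPTIDE\n>s2\nGGG\n"

merged = merge_fasta(human, spike)
print(write_fasta(merged))
```

Keeping the first header for a duplicated sequence means the order of the inputs decides which annotation survives, so the most trusted source should be passed first.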


WORKFLOW 1


Tools used in the workflow


INPUT: Select History 1 → Import history → Start using this history

WORKFLOW: Select Workflow 1 → Import workflow → Start using this workflow → Run Workflow 1

http://z.umn.edu/history2




Eng et al. 2011 Mol Cell Proteomics. 10(11): R111.009522.

MASS SPECTRAL DATA


RAW DATA CONVERSION TOOL

•  .RAW to mzML: msconvert (ProteoWizard), http://z.umn.edu/msconvert
•  mzML to MGF: MGF Formatter, http://z.umn.edu/mgfformatter
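The MGF peaklists produced by this conversion are plain text: each spectrum sits between `BEGIN IONS` and `END IONS`, with `KEY=value` lines such as `TITLE`, `PEPMASS` and `CHARGE` followed by m/z and intensity pairs. The reader below is a minimal sketch for inspecting such files. The function name and the example spectrum are made up, and real files may carry extra fields that this sketch simply stores as text.

```python
def parse_mgf(text):
    """Parse MGF text into a list of {'params': dict, 'peaks': list} spectra."""
    spectra, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "BEGIN IONS":
            current = {"params": {}, "peaks": []}
        elif line == "END IONS":
            if current is not None:
                spectra.append(current)
            current = None
        elif current is not None:
            if "=" in line and not line[0].isdigit():
                # Parameter line, for example PEPMASS=500.25 12345.0
                key, value = line.split("=", 1)
                current["params"][key.upper()] = value
            else:
                # Peak line: m/z followed by an optional intensity
                fields = line.split()
                intensity = float(fields[1]) if len(fields) > 1 else 0.0
                current["peaks"].append((float(fields[0]), intensity))
    return spectra


# Made-up spectrum used only for illustration.
example_mgf = """BEGIN IONS
TITLE=hypothetical_scan_1
PEPMASS=500.25 12345.0
CHARGE=2+
100.1 10.0
200.2 20.5
END IONS
"""

spectra = parse_mgf(example_mgf)
precursor_mz = float(spectra[0]["params"]["PEPMASS"].split()[0])
print(len(spectra), precursor_mz, spectra[0]["peaks"])
```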


Click on http://z.umn.edu/history2b, import the history and click on ‘start using this history’.


A face in the crowd: recognizing peptides through database search. Eng et al 2011 Mol Cell Proteomics. 10(11)

PROTEOMICS WORKFLOW


[Figure: a mass spectrum is compared with a reference protein database (from genomic annotation) to produce a peptide spectral match.]

DATABASE SEARCH
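On the database side, a search engine first turns each protein into candidate peptides and their theoretical masses. The sketch below shows one common way to do that: an in-silico tryptic digest (cleave after K or R, but not before P) followed by a monoisotopic mass calculation from standard residue masses. The function names, the example protein and the minimum-length filter are assumptions for illustration, and real search engines add modifications, charge states and scoring on top of this.

```python
import re

# Standard monoisotopic residue masses in Da, rounded to 5 decimals.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00918, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04048, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def tryptic_peptides(protein, missed_cleavages=0, min_length=6):
    """Cleave after K or R (not before P) and return the resulting peptides."""
    sites = [m.end() for m in re.finditer(r"[KR](?!P)", protein)]
    bounds = [0] + [s for s in sites if s < len(protein)] + [len(protein)]
    peptides = []
    for i in range(len(bounds) - 1):
        last = min(i + 2 + missed_cleavages, len(bounds))
        for j in range(i + 1, last):
            peptide = protein[bounds[i]:bounds[j]]
            if len(peptide) >= min_length:
                peptides.append(peptide)
    return peptides


def monoisotopic_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER


# Made-up protein sequence; the K before P is not a cleavage site.
protein = "MAGICKPEPTIDERSEQVENCEK"
for peptide in tryptic_peptides(protein):
    print(peptide, round(monoisotopic_mass(peptide), 4))
```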


Nesvizhskii et al Nature Methods - 4, 787 - 797 (2007)

DATABASE SEARCH


SEARCHGUI

Vaudel M. et al Proteomics (2011) 11(5)

https://code.google.com/p/searchgui/


MULTIPLE SEARCH ALGORITHMS

•  MyriMatch: Tabb et al, J. Proteome Res., 2007, 6(2).
•  Comet: Eng et al, Proteomics, 2013, 13(1).
•  MS-GF+: Kim and Pevzner, Nat Commun., 2014, 5(1).
•  OMSSA: Geer et al, J Proteome Res., 2004, 3(5).
•  X!Tandem: Craig and Beavis, Bioinformatics, 2004, 20(9).
•  MS Amanda: Dorfer et al, J Proteome Res., 2014, 13(8).


Click on http://z.umn.edu/history3b, import the history and click on ‘start using this history’.


Identification Algorithms: OMSSA, MS-GF+ and Comet

Database Search Parameters
1: Precursor Accuracy Unit: ppm
2: Precursor Ion m/z Tolerance: 10.0
3: Fragment Ion m/z Tolerance: 0.01
4: Enzyme: Trypsin
5: Number of Missed Cleavages: Not implemented
6: Database: input_database.fasta
7: Forward Ion: b
8: Rewind Ion: y
9: Fixed Modifications: mmts on c
10: Variable Modifications: oxidation of m
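To make the precursor tolerance and modification settings concrete, the sketch below checks whether an observed precursor m/z falls within 10 ppm of a peptide once the fixed MMTS modification on cysteine and the variable oxidation on methionine are taken into account. The mass shifts are the standard monoisotopic values for methylthio and oxidation. The function names, the peptide `MCK` and the observed m/z values are assumptions chosen for illustration, and this is not how SearchGUI or the individual engines implement matching.

```python
PROTON = 1.007276          # mass of a proton in Da
MMTS_ON_C = 45.98772       # fixed modification: methylthio on cysteine
OXIDATION_ON_M = 15.99491  # variable modification: oxidation of methionine


def neutral_mass(mz, charge):
    """Convert an observed m/z and charge to a neutral mass."""
    return (mz - PROTON) * charge


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def candidate_masses(unmodified_mass, peptide):
    """Masses with MMTS on every C and 0..n oxidized methionines."""
    base = unmodified_mass + MMTS_ON_C * peptide.count("C")
    return [base + k * OXIDATION_ON_M for k in range(peptide.count("M") + 1)]


def within_tolerance(observed_mz, charge, unmodified_mass, peptide, tol_ppm=10.0):
    """True if any candidate mass matches the precursor within tol_ppm."""
    observed = neutral_mass(observed_mz, charge)
    return any(
        abs(ppm_error(observed, mass)) <= tol_ppm
        for mass in candidate_masses(unmodified_mass, peptide)
    )


# Unmodified monoisotopic mass of the illustrative peptide MCK
# (sum of standard residue masses plus water).
MCK_MASS = 380.15518

print(within_tolerance(222.076181, 2, MCK_MASS, "MCK"))  # oxidized form, 2+
print(within_tolerance(222.080000, 2, MCK_MASS, "MCK"))  # about 17 ppm off
```

Because a variable modification multiplies the number of candidate masses per peptide, each added variable modification enlarges the search space.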


SEARCHGUI PARAMETERS


PEPTIDESHAKER

Vaudel et al Nature Biotechnology, 33, (2015)

http://galaxyproteomics.github.io/peptideshaker/


Slide from Alexey Nesvizhskii's talk at http://www.scivee.tv/node/12671

PEPTIDESHAKER : PROTEIN INFERENCE


Click on http://z.umn.edu/history4b, import the history and click on ‘start using this history’.


4.3 Peptide Shaker in GalaxyP


PEPTIDESHAKER : TARGET-DECOY SEARCH
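In a target-decoy search, spectra are searched against real (target) sequences plus decoys such as reversed sequences, and the number of decoy hits above a score cutoff estimates how many target hits are false. The sketch below shows that generic estimate. It is not PeptideShaker's exact procedure, and the function names, scores and PSM list are made up for illustration.

```python
def make_decoy(sequence):
    """Build a decoy by reversing a target protein sequence."""
    return sequence[::-1]


def fdr_at_threshold(psms, threshold):
    """Estimate FDR as decoy hits / target hits at or above a score."""
    targets = sum(1 for score, is_decoy in psms
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms
                 if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0


def lowest_threshold_for_fdr(psms, max_fdr=0.01):
    """Return the lowest score cutoff whose estimated FDR is <= max_fdr."""
    for score in sorted({score for score, _ in psms}):
        if fdr_at_threshold(psms, score) <= max_fdr:
            return score
    return None


# Made-up PSMs as (score, is_decoy) pairs.
psms = [(90, False), (85, False), (80, False),
        (70, True), (60, False), (50, True)]

print(make_decoy("PEPTIDEK"))
print(lowest_threshold_for_fdr(psms, max_fdr=0.01))
print(lowest_threshold_for_fdr(psms, max_fdr=0.25))
```

With so few PSMs the estimate is coarse; on real data the same calculation runs over many thousands of matches.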


PEPTIDESHAKER: OUTPUTS


http://z.umn.edu/augworkshopgalaxyp


Complex Workflows

Galaxy-P provides an integrated platform for every step of proteogenomic analysis.
•  Build target database: download and translate EST databases or perform gene prediction with Augustus.
•  Numerous tools for identification and text manipulation.
•  Workflow utilizing BLAST to identify novel peptides.
•  Tool to assess peptide-spectrum matches and visualize spectra.
•  Visualize identified peptides on the genome.
•  140 steps: Seamless, integrated proteogenomic workflow.

Flexible and accessible workflows for improved proteogenomic analysis using Galaxy framework. J. Proteome Res., DOI: 10.1021/pr500812t Link: z.umn.edu/pgfirstlook
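The idea behind the novel-peptide step can be illustrated with a much simpler stand-in for BLAST: a peptide counts as a novel candidate if it does not occur anywhere in the known reference proteins. The sketch below uses an exact substring check that treats isoleucine and leucine as equivalent, since the two residues have the same mass. The function names and sequences are hypothetical, and unlike BLAST this check does not tolerate mismatches.

```python
def normalize(sequence):
    """Upper-case a sequence and treat I and L as the same residue."""
    return sequence.upper().replace("I", "L")


def novel_peptides(peptides, reference_proteins):
    """Return peptides that are not found in any reference protein."""
    reference = [normalize(protein) for protein in reference_proteins]
    return [
        peptide for peptide in peptides
        if not any(normalize(peptide) in protein for protein in reference)
    ]


# Made-up reference protein and identified peptides.
reference_proteins = ["MAGICKPEPTIDER"]
identified = ["PEPTLDER", "NOVELPEP"]

print(novel_peptides(identified, reference_proteins))
```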


Links to workflows, webcast, pages, documentation and publications.

Workflows
•  Proteogenomic studies: http://z.umn.edu/pg140
•  Metaproteomic studies: http://z.umn.edu/metaproteomics1

Webcast
•  Using ProteinPilot within Galaxy-P: z.umn.edu/ppingp

Pages
•  Proteogenomics page: z.umn.edu/proteinpilotpage
•  Metaproteomics page: z.umn.edu/metaproteomicspage

Workshop / Tutorial on proteogenomics
•  Mass Spectrometry-based Proteomics Data Analysis using Galaxy-P: z.umn.edu/gcc2015gp

Manuscripts
•  Metaproteomic analysis using the Galaxy framework. Proteomics. (2015) doi: 10.1002/pmic.201500074. PMID: 26058579.
•  Multi-omic data analysis using Galaxy. Nat Biotechnol. (2015) 33(2):137-9. doi: 10.1038/nbt.3134. PMID: 25658277.
•  Flexible and accessible workflows for improved proteogenomic analysis using the Galaxy framework. J Proteome Res. (2014) 13(12):5898-908. doi: 10.1021/pr500812t. PMID: 25301683.
•  Proteomic profiles in acute respiratory distress syndrome differentiates survivors from non-survivors. PLoS One. (2014) 9(10):e109713. doi: 10.1371/journal.pone.0109713. PMID: 25290099.
•  Sheynkman GM, Johnson JE, Jagtap PD, Shortreed MR, Onsongo G, Frey BL, Griffin TJ, Smith LM. Using Galaxy-P to leverage RNA-Seq for the discovery of novel protein variations. BMC Genomics. (2014) 15:703. doi: 10.1186/1471-2164-15-703. PMID: 25149441.


QUESTIONS?

Follow us on twitter.com/usegalaxyp

Visit http://usegalaxyp.org

or http://galaxyp.msi.umn.edu

or