computational strategies for improving the quantitative analysis …€¦ · analysis of peptides...

Post on 25-Jul-2020

0 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Computational Strategies for Improving the Quantitative Analysis of Peptides by Tandem Mass Spectrometry

Michael MacCossDepartment of Genome Sciences

University of Washington

Acquisition methods in shotgun LC-MS/MS

Data Dependent Acquisition (DDA)

Liquid chromatography

MS/MS analysisIsolation &

fragmentationMS analysis

• Irregular sampling across replicates• Limited to MS1 quant

• 3-10x less sensitive than MS2 quant• Poor interference relative to MS2

• Need an MS1 signal to trigger MS2

Acquisition methods in shotgun LC-MS/MS

Data Dependent Acquisition (DDA)

Parallel Reaction Monitoring (PRM)

MS/MS analysisIsolation &

fragmentationMS analysis

Liquid chromatography

The Challenge:

• In 2006, software for building targeted

proteomics methods didn’t exist.

• Neither did software for the analysis of

complex multiplexed SRM methods

• No software was capable of cross

vendor data analysis

Needs:

Protein and Peptide centric

Intuitive document editor

graphical interface

Full undo/redo support

Direct analysis of all vendor

RAW MS data without

conversion.

Ability to choose peptides

and transitions based on

peptide spectrum libraries

Free and open source

Developed and supported

professionallyhttp://skyline.maccosslab.org

Skyline: Software for the Development of Targeted Proteomics Methods and Analysis of the Data

Skyline is Vender Neutral

• Paulovich LabAppliedBiosystemsQTrap 4000

• MacCoss LabThermoTSQ Ultra

Skyline Project Overview

• 10th year in development

• 6 developers

• over 500,000 lines of code – open source

• 60,000 lines of code for testing

• over 85,000 installations

• over 10,000 registered users

• over 7500 instances started each week

Skyline is Not Just for SRM

Support Targeted Quantitation from Extracted Ion Chromatograms:

• Selected Reaction Monitoring (aka SRM and MRM)

• Targeted MS/MS (aka Parallel Reaction Monitoring or HMRM)

• MS1 Filtering (chromatogram extraction from the MS1 data)

• Data Independent Acquisition and SWATH

Immediate Visual Interaction with the Data

129 peptides, 42 replicates

External Tools for Skyline

External Tools – 13 Total

Msstats (Vitek)

grouped study statistics

QuaSAR (Carr)

response curve statistics

Population Variation

mutation frequency

Protter (Wollscheid)

transmembrane topology

SProCoP (Bereman)

system suitability

MS1 Probe (Gibson)

MS1 filtering statistics

Broudy, et. al, Bioinformatics

Tools in Skyline

Skyline Tester

Critical tool for professional translation companies

Automated access to any of 140 forms in Skyline

Automated access to 14 tutorials and 400 screenshots

Runs English, Chinese and Japanese UI

Runs French and Turkish number formats

Nightly runs tracking failures and memory leaks

Leveraging already extensive Skyline test suite

530 tests

60,000 lines of code

Custom Test Framework -- Skyline Tester

Custom Test Framework -- Skyline Tester

Memory Leak

Skyline Nightly – Test Web Log

Aggregating and Publishing• Store and manage Skyline documents

• Publish fully annotated Skyline documents

o Satisfies the newest MCP Guidelines for dissemination of targeted proteomics assays (Abbatiello et al, MCP 2017)

• Build chromatogram libraries

• Aggregate lab QC data

• Free hosted version (http://panoramaweb.org)

o 331 labs so far (including large projects from LINCS, CPTAC, & ABRF sPRG)

o >9589 data sets uploaded

o User controlled security

• Supported by a grant from the NIH and the Panorama Partners Program

o Roche (2013), Genentech (2013), Merck (2014), CONFIDENTIAL (2014), NCI (2017), and Sanofi 2018

• Panorama Public fulfills the MCP publication requirements for release of quantitative data.

• Locally installable server application

• Free and open source (Apache 2.0)Sharma, et. al, J. Proteome Res. 2014

Skyline/Panorama Workflow

Publishing to Panorama

Growth of PanoramaWeb

Lab Growth to 331 Skyline docs to 14568

Panorama Growth High

Viewing Runs - Peptides

Viewing Proteins

Viewing Peptides

Customizing Views, Sorts, Filters

Grouth Partially Enabled by AutoQC

Vendor Neutral AutoQC

Data Uploaded Automatically to Panorama

Track Performance Overtime

Backend to NIH Branded Sites

NCI CPTAC Assay Portal LINCS pilincs.org

http://passport.maccosslab.org

Alternatives to SRM or Targeted MS/MS?

• We want the selectivity of targeted SRM with the breadth of data dependent acquisition.

• If we have a new peptide we want to target, we don’t want to have to go back and rerun the experiment.o We want a digital molecular archive of

all samples.

Acquisition methods in shotgun LC-MS/MS

Data Independent Acquisition (DIA)

Data Dependent Acquisition (DDA)

Parallel Reaction Monitoring (PRM)

MS/MS analysisIsolation &

fragmentationMS analysis

Liquid chromatography

Data Independent Acquisition

20 20 m/z-wide windows = 400 m/z

m/z500 900

Tim

e

~2 seconds (Q-Exactive)

m/z500 900

780

m/z

800

m/z

51.2

51.351.4

51.551.6

51.7

400 600 800 1000 1200 1400m/z

0

1

2

3

4

Inte

nsi

ty x

10

-5

Targeted Chromatogram ExtractionVLENTFEIGSDSIFDK++ (790.4 m/z)

Extraction of Fragment Ions from DIA Data

Venable et al Nat Methods 2004

Dong et al Science 2007

Data Acquisition

Peptide Selection

Chromatogram Extraction

Peak Integration

Transition Refinement

Hypothesis Generation

Hypothesis Test

Data Acquisition

Peptide Selection

Hypothesis Generation

Hypothesis Test

Transition Selection or Refinement

RT Scheduling

Peak Integration

DIASRM

The Promise of DIA

▪ Acquire a “molecular image” or “digital archive” of the sample– Mine it over and over

– No scheduling

▪ Direct queries (and p-values) for peptides of interest

▪ Better quantitation than DDA

Source: http://edition.cnn.com/interactive/2017/01/politics/trump-inauguration-gigapixel/

PRM: High Fidelity; Targeted

Source: http://edition.cnn.com/interactive/2017/01/politics/trump-inauguration-gigapixel/

There Can be Value in Keeping the Full Picture

Source: https://www.nytimes.com/interactive/2017/01/20/us/politics/trump-inauguration-crowd.html

2009 Obama Inauguration 2017 Trump Inauguration

There Can be Value in Keeping the Full Picture

Things you can do post data acquisition

• Choose different peptides for a protein

• Choose different transitions

• Add additional proteins / proteoforms

Peptide Detections: Typical DIA library search workflow

Spectrum

LibraryCompute

Match

Features

Machine

Learning

ClassifierWide

Window

DIA File

Typical DIA library search

Peptide Detections: Typical DIA library search workflow

Compute

Match

Features

Machine

Learning

ClassifierWide

Window

DIA File

Typical DIA library search

Innovations

Protein/

Peptide

Sequences

Pecan: Detecting Peptides Directly from DIA Data

Percolator, Käll et al. 2007Decoy peptides

CVITNLAPTK

p-value

q-valuePercolator

APTNVTCILK

Peptides of interest

Fragment ions - raw score

- background subtracted score

- spectrum similarity

- number of contributing ions

- mass error mean

- mass error variance

- rank

Precursor ions- idotp (isotope dot product)

- mass error mean

- mass error variance

Peptide sequence- charge state

- peptide length

- SSR hydrophobicity index

Score

Ting et al, Nature Methods 2017

XcorDIA is a Pecan Like Algorithm that Uses Xcorr

Brian Searle ASMS 2018

DIA is a powerful tool for detecting peptides in HeLa

192

Experimental Details:

90 min linear gradient

QE-HF single injection

DDA:

400-1600 m/z

Top-12

DIA:

400-1000 m/z

24 m/z overlap

windows

(12 m/z effective)

Peptide Detections: EncyclopeDIA workflow

Spectrum

Library

Deconvolut

e

Overlappin

g

Windows

Compute

Match

Features

Percolator

Wide

Window

DIA File

Retention

Time Filtering

Automated

Transition

Refinement

Quantitation

Typical DIA library search

EncyclopeDIA innovations

Searle et al. https://www.biorxiv.org/content/early/2018/03/07/277822.article-info

The Promise of DIA

Precursor Sampling in DIA is Low Res

Spectrum Library Approach

Query Data DIA Data

DIA is a powerful tool for detecting peptides in HeLa

HeLa Spectrum Library

• 39 Injections

• ~166.4k Peptides

Pan-Human Spectrum

Library

• 331 Injections

• ~139k Peptides

Separating Detection from Quantitation

24 m/z overlapping12 m/z effective

4 m/z overlapping2 m/z effective

Searle et al Biorxiv 2018

EncyclopeDIA workflow

Chromatogram

Library

Deconvolut

e

Overlappin

g

Windows

Compute

Match

Features

Percolator

Wide

Window

DIA File

Retention

Time Filtering

Automated

Transition

Refinement

Quantitation

Typical DIA library search

EncyclopeDIA innovations

Chromatogram libraries

FASTA

or

Library

Narrow

Window

DIA File

PECAN/

EncyclopeDIA

DIA Chromatogram Library Approach

DIA Chromatogram Library Approach

Query Data

DIA Data

Chromatogra

m Library

DIA Chromatogram Library Approach

Query Spectrum

Chromatogra

m LibraryNot Found

DIA Chromatogram Library Approach

Chromatogram

Library DIA Data

On-column

Chromatogram

Library

Normalized Retention

Time Library

Negative Positive Delta RT

- 1 0 - 8 - 6 - 4 - 2 0 2 4 6 8 1 0

Delta RT

0

1 0

2 0

3 0

4 0

5 0

6 0

7 0

8 0

9 0

100

Co

un

t

Negative Positive Delta RT

- 1 0 - 8 - 6 - 4 - 2 0 2 4 6 8 1 0

Delta RT

0

100

200

300

400

500

600

700

800

900

1,000

1,100

1,200

Co

un

t

+/- 5 min +/- 10 sec

Chromatogram Library Intensities are Better

Correlated with DIA Data

DDA Library vs Wide DIA DataDIA Chromatogram Library vs

Wide DIA Data

DIA is a powerful tool for detecting peptides in HeLa

Chromatogram DIA Library • 6 Injections

HeLa Chromatogram Library from FASTA • 53.2k Peptides

DIA is a powerful tool for detecting peptides in HeLa

Chromatogram DIA Library • 6 Injections

HeLa Chromatogram Library from FASTA • 53.2k Peptides

HeLa Chromatogram Library from HeLa Spectrum Library • 99.6k Peptides

HeLa Chromatogram Library from Pan-Human Spectrum Library • 91.1k Peptides

Advantages of On-Column Chromatogram Library

• If you can’t detect a peptide with a narrow precursor window then you will never detect it with a wide-window.

• On-column RT calibration overcomes selectivity limitations of poor precursor isolation in DIA

• Simplifies workflow. No need to perform extensive fractionation to generate sample specific library data. Just 4-6 runs with narrow windows

• Can make use of large assay spectrum libraries if available.

• Reduces multiple testing. Only look for peptides you know are in the sample in the wide window data.

http://chorusproject.org

Managed by Stratus Biosciences – A Non-Profit Company

New Paradigm for Data Processing

• Proteomics data needs to be “un-siloed”– A single place for all data

• Volume of high value data is increasing• Crucial for peer review and critical analysis• Economics are driving demand for data

reuse• Cultural shift: Public access to data

collected with public funds• We need to bring algorithms to the data

Chorus Project

http://chorusproject.org

Sharing and Dissemination of RAW Data

67

Vendor Neutral Chromatogram and Spectrum Viewer

Instruments and Operators

Automatic Upload / Desktop Client

Link

Each instrument has an automatic client uploader

Organizing Data

Support Wiki Pages for Projects

Sharing Data and Projects

Conclusions

• Software for mass spectrometry analysis must be easy to use, well supported, well documented, well tested, and robust.

• We feel strongly about the power of being able to visualize data.

• Data sharing continues to be a problem and it will continue to get worse as the information content of mass spectrometry data gets larger.

75

Financial SupportNCI-IMAT

NIGMS BTRR P41 Program

NIA Nathan Shock Center

NIA Investigator Initiated R01 and P01

NIGMS Investigator Initiated R01s

Stratus BiosciencesAndrey BonderenkoChristine WuNathan Yates

University of WashingtonDepartment of Genome

SciencesBill NobleJohn StamatoyannopolousJudit Villen

KTH StockholmLukas Kall

UW Lab MedicineAndy HoofnagleGeoff BairdClark Henderson

Former Lab MembersJason GilmoreSandi Spencer

University of Washington

MacCoss LabJosh Aldrich

Nat Brace

Yuval Boss

Brian Connolly

Danielle Faivre

Austin Keller

Eric Huang

Rich Johnson

Brendan MacLean

Gennifer Merrihew

Brook Nunn

Lindsay Pino

Deanna Plumbell

Brian Pratt

Tobias Rohde

Paul Rudnick

Brian Searle

Vagisha Sharma

Nick Shulman

Emma Timmins-Schiffman

Stanford Tom Montine

ThermoFisherMary BlackburnRomain HuguetAndreas KuehnClaudia MartinsPhil RemesMike SenkoYue XuanVlad Zabrouskov

Acknowledgement

Software Vendor SupportAgilent Shimadzu

Bruker ThermoFisher

Sciex Waters

MacCoss Lab UW

top related