PROTEOMICS
Application of proteomics to interactome mapping and
biomarker discovery
BENOIT COULOMBE, PhD
Proteomics & Gene Transcription Laboratory
Université de Montréal
BCM2003, March 2012
Biomarkers and personalized medicine
Early diagnosis and prognosis of disease
Monitoring the response of patients to treatment
- Known therapies
- Development of new drugs
Stratification of patient cohorts in order to provide personalized treatments for specific disease phenotypes (e.g., IRESSA and lung cancer)
The process of biomarker discovery can also identify new targets for drug discovery and generate new knowledge on mechanisms of disease
The sequencing of the human genome has provided the list
and linear sequence of all (most) human proteins
Genome
Proteins
Proteins rarely work alone, but rather assemble into protein
complexes that integrate the function of multiple gene products
Genome
Proteins
Complexes
Machineries
Function
Large-scale genomic studies identify an increasing number of
genes associated with disease phenotypes
Genome
How the products of these genes participate in
the establishment and the progression of disease
is often unknown!
The interaction partners of disease-associated proteins can
reveal their role in disease (guilt by association)
Genome
Proteins
Complexes
Machineries
Role in disease
The interaction partners of disease-associated proteins are
likely to act as modulators of the disease phenotype
Genome
Proteins
Complexes
Machineries
Role in disease
Modulators
The disease-associated proteins and their interactors are
putative biomarkers and drug targets
Genome
Proteins
Complexes
Machineries
Role in disease
Biomarkers Drug targets
The P3MD platform: infrastructure, expertise, and data
[Figure: overview of the P3MD platform, organized in three layers (1. Laboratory Pipelines, 2. Software Applications, 3. Data) and two pipelines (A. INTERACTOME, B. BIOMARKERS). Labels: Selection of Disease-Associated Proteins (Baits); Cell Lines; Patient Samples; AP-MS; SRM; Computational classifier; PROTEUS; Computational tools; Candidate Biomarker Selection Engine; Candidate Biomarkers; Selection of Peptides (transitions); List of Interactions (Network Maps); List of Validated Biomarkers]
The human proteome
Identification of an association between a protein and a disease: the case of hypothetical protein D
The protein D interactome
High-quality generic map of the protein D interactome: affinity purification coupled with MS
Components of the protein D interactome are candidate drug targets and biomarkers
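The idea that a disease-associated protein and its interactors form the candidate set can be sketched in a few lines. This is only an illustration: the protein names and the interaction list are made up, and `candidate_set` is a hypothetical helper, not part of the platform described in these slides.

```python
from collections import defaultdict

def candidate_set(interactions, disease_proteins):
    """Return the disease-associated proteins plus their direct partners."""
    partners = defaultdict(set)
    for a, b in interactions:
        # Interactions are treated as undirected for this sketch.
        partners[a].add(b)
        partners[b].add(a)
    candidates = set(disease_proteins)
    for protein in disease_proteins:
        candidates |= partners[protein]
    return candidates

# Hypothetical interaction list; "D" plays the role of protein D.
interactions = [("D", "I1"), ("D", "I2"), ("I2", "I3"), ("X", "Y")]
candidates = candidate_set(interactions, {"D"})
print(sorted(candidates))
```

Only direct partners are collected here; whether second-degree partners (such as I3 in the toy data) should also count as candidates is a design choice the slides do not address.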
Identification of interactors using LC-MS/MS
+ Determination, for each interaction, of a "probability" of being
valid (False Discovery Rate, or FDR), using a computational
algorithm trained by machine learning to differentiate
valid from invalid interactions
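The slides do not describe the classifier itself, so the sketch below only illustrates how an FDR can be estimated once every interaction has a score. It assumes a labelled benchmark set of (score, is_valid) pairs; the scores, the labels, and the function name `fdr_at_threshold` are all hypothetical.

```python
def fdr_at_threshold(scored, threshold):
    """Estimate the FDR among benchmark interactions scoring >= threshold.

    scored: list of (score, is_valid) pairs from a labelled benchmark set.
    Returns None when no interaction passes the threshold.
    """
    accepted = [is_valid for score, is_valid in scored if score >= threshold]
    if not accepted:
        return None
    false_positives = sum(1 for is_valid in accepted if not is_valid)
    return false_positives / len(accepted)

# Hypothetical classifier scores with known validity labels.
benchmark = [
    (0.95, True), (0.90, True), (0.80, False),
    (0.70, True), (0.40, False), (0.20, False),
]
for t in (0.5, 0.75, 0.85):
    print(t, fdr_at_threshold(benchmark, t))
```

Raising the threshold lowers the estimated FDR at the cost of accepting fewer interactions, which is the trade-off any score cut-off has to settle.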
Preliminary validation of candidate biomarkers: modulation of components of the protein D interactome in disease cells (SRM)
[Figure legend: Upregulated; Isoform; Downregulated]
Selected Reaction Monitoring (SRM, also termed MRM)
Proteins can be quantified by spiking known amounts of chemically
synthesized, heavy-isotope-labelled peptides into the samples as internal standards
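The arithmetic behind quantification with a spiked heavy standard can be sketched as follows. The sketch assumes that the endogenous (light) and heavy-labelled peptides give the same signal per mole, so that the ratio of peak areas equals the ratio of amounts; the peak areas, the spiked amount, and the function name are hypothetical.

```python
def quantify_peptide(light_area, heavy_area, heavy_spiked_fmol):
    """Endogenous peptide amount (fmol) from the light/heavy peak-area ratio."""
    if heavy_area <= 0:
        raise ValueError("heavy standard peak area must be positive")
    # Assumption: equal response for light and heavy forms of the peptide.
    return (light_area / heavy_area) * heavy_spiked_fmol

# Hypothetical peak areas for one transition, with 50 fmol of standard spiked in.
amount = quantify_peptide(light_area=3.0e5, heavy_area=1.5e5, heavy_spiked_fmol=50.0)
print(amount)  # 100.0
```

In practice several transitions per peptide would be measured and combined; a single ratio is used here to keep the calculation visible.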
Proof of concept
Cellular machineries involved in mRNA synthesis
CHROMATIN-REMODELING MACHINERY
- ATP-dependent remodeling
- Histone-modifying enzymes
PRE-mRNA PROCESSING MACHINERY
- snRNPs U1-U6
- hnRNPs
- Capping enzymes
- Cleavage/polyA factors
GENERAL TRANSCRIPTION MACHINERY
- RNA polymerase II (RNAP II)
- General transcription factors
- Elongation factors
[Figure labels: 5', 3', +1, TATAAA, PRE-mRNA, RNAP II, TFIIB, TFIID, TFIIE, TFIIF, TFIIH, TFIIS, P-TEFb, Mediator]
Main Features of the Procedure:
1) Expression at physiological level
2) Cell lysis in mild conditions (soluble)
3) Purification in native conditions
4) Protein identification by sensitive MS
5) Efficient data analysis and validation
[Figure labels: Protein Identification; Computational Analysis (Reliability, Connectivity); Cell Lysis in Mild Conditions; Soluble fraction (supernatant); Insoluble fraction (pellet)]
Collection of HEK 293 cell lines engineered to express physiological levels of affinity-tagged polypeptides upon induction
A platform for the systematic characterization of protein
interaction networks in human cells
Human cDNA cloned into an ecdysone-inducible expression vector encoding the TAP tag
Transfection in EcR-293 cells and selection of expressing clones
Induction with ponasterone A to obtain near-physiological expression levels
Cell lysis in mild conditions
Centrifugation
Soluble fraction (whole-cell extract, WCE)
Insoluble fraction (pellet)
Tandem Affinity Purification
SDS gel analysis (high resolution)
Protein identification (Orbitrap MS: high sensitivity & mass precision)
The ecdysone-inducible mammalian expression system
Modified from Vickers and Sharrocks, 2002
Advantages:
• very low basal expression
• ecdysone analogs are inert in mammalian cells
• tight control over expression levels (physiological)
[Figure labels: pVgEcR; pIND; Western blot]
Tandem affinity purification in native conditions
Double-affinity Tag: Protein A IgG-binding sites, TEV protease cleavage site, Calmodulin binding peptide (CBP), fused to the protein of interest
1) Double-affinity tagged protein extract
2) Binding to IgG Sepharose
3) Release by TEV protease cleavage
4) Binding to Calmodulin Beads
5) Elution with EGTA
6) Purified Protein Complexes
[Figure labels, for each of the IgG Sepharose and Calmodulin Beads steps: Unbound; Bound; Beads; Eluate]
Identification of proteins by mass spectrometry
In-gel digestion (Yarmush and Jayaraman, 2002)
1) SDS-PAGE
2) Excised proteins
3) Trypsin digestion
4) Peptide mixture
5) Mass spectrometry (LC-MS/MS)
6) Computer Cluster
7) Peptide Sequence
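The trypsin digestion step can be mimicked in silico, which is how search software predicts the peptides expected from a protein sequence. The sketch below uses the common simplified rule (cleave after K or R unless the next residue is P) and ignores missed cleavages and modifications; the sequence is made up and `tryptic_digest` is a hypothetical helper.

```python
def tryptic_digest(sequence, min_length=1):
    """Cleave after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # C-terminal peptide that does not end in K or R.
        peptides.append(sequence[start:])
    return [p for p in peptides if len(p) >= min_length]

# Hypothetical sequence: the R at position 4 is followed by P, so it is not cleaved.
print(tryptic_digest("MAKRPSLKGGR"))
```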
Affinity purification of protein complexes
Reciprocal tagging
Reciprocal tagging makes it possible to validate some interactions and to enrich the data set
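The validation side of reciprocal tagging amounts to checking whether an interaction seen with A as bait and B as prey is also seen with B as bait and A as prey. A minimal sketch, with made-up bait and prey names and a hypothetical function name:

```python
def reciprocal_pairs(observations):
    """Return unordered pairs observed in both bait -> prey directions."""
    observed = set(observations)
    return {
        frozenset((bait, prey))
        for bait, prey in observed
        if bait != prey and (prey, bait) in observed
    }

# Hypothetical (bait, prey) observations from separate purifications.
observations = [("A", "B"), ("B", "A"), ("A", "C"), ("C", "D")]
confirmed = reciprocal_pairs(observations)
print(sorted(sorted(pair) for pair in confirmed))
```

In the toy data, tagging C confirms nothing about A but brings in a new prey, D, which is the sense in which reciprocal tagging also enriches the data set.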
Computational analysis (reliability & connectivity)
Comprehensive Maps of Human Protein Interaction
Networks
The protein interaction database Proteus
C. Poitras & D. Bergeron
Computational data analysis and validation
• keeping 83% of the interactions supported by the literature
• excluding 83% of the interactions judged likely false positives
805 distinct interactions were selected as high-confidence protein interactions
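The two percentages above are the kind of figures obtained when a score cut-off is checked against two reference sets. The sketch below shows that bookkeeping only; the scores, the reference sets, the cut-off, and the function name are hypothetical and do not reproduce the published analysis.

```python
def evaluate_threshold(scores, literature, likely_false, threshold):
    """Evaluate a score cut-off against two non-empty reference sets.

    Returns (fraction of literature-supported interactions kept,
             fraction of likely false positives excluded,
             number of interactions kept).
    """
    kept = {pair for pair, score in scores.items() if score >= threshold}
    literature_kept = len(kept & literature) / len(literature)
    false_excluded = len(likely_false - kept) / len(likely_false)
    return literature_kept, false_excluded, len(kept)

# Hypothetical scored interactions, identified by short labels.
scores = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.3,
          "w": 0.2, "x": 0.1, "y": 0.6, "z": 0.05, "n": 0.75}
literature = {"a", "b", "c", "d"}      # supported by the literature
likely_false = {"w", "x", "y", "z"}    # judged likely false positives
print(evaluate_threshold(scores, literature, likely_false, 0.5))
```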
Summary of our survey of protein complexes (2007)
Jeronimo et al. (2007) Mol Cell 27, 262
High-quality protein interaction network for the RNA polymerase II transcription machinery
Proof of concept
Myopathies
The myopathies are a heterogeneous group of diseases of skeletal muscle
They include:
Sporadic Inclusion-Body Myositis (sIBM)
Dermatomyositis
Polymyositis
Myofibrillar myopathies
Neurogenic muscular atrophy
IBMPFD
And many others
The RNA Polymerase II-Associated Proteins (RPAPs)
are previously uncharacterized proteins
Expression of components of the RNAPII network is specifically
modulated in tissues of patients suffering from
various muscular diseases
Muscle disease        No. of biopsies   RPAPx         RPAPy         RPAPz
Normal controls       15                Normal        Normal        Normal
Disease Phenotype A   12                Increased *   Normal        Normal
Disease Phenotype B   8                 Increased     Increased *   Increased
Disease Phenotype C   10                Increased     Normal        Increased
Increased expression of RPAPs is currently the only marker that
discriminates between two specific muscular disorders
The RPAPs are dysregulated in muscular diseases
Diseases under study
Myopathies
Leukodystrophies
Type 2 diabetes and insulin resistance
Hypercholesterolemia
Cancer
Thank you!