Protein-protein Interactions
June 18, 2015
Why PPI?
Protein-protein interactions determine outcome of most cellular processes
Proteins which are close homologues often interact in the same way
Protein-protein interactions place evolutionary constraints on protein sequence and structural divergence
Precursor to networks
PPI classification
Strength of interaction
Permanent or transient
Specificity
Location within polypeptide chain
Similarity of partners
Homo- or hetero-oligomers
Direct (binary) or a complex
Confidence score
Determining PPIs
Small-scale methods
Co-immunoprecipitation
Affinity chromatography
Pull-down assays
In vitro binding assays
FRET, Biacore, AFM
Structural (co-crystals)
PPIs by high-throughput methods
Yeast two hybrid systems
Affinity tag purification followed by mass spectrometry
Protein microarrays
Microarrays/gene co-expression: implied functional PPIs
Synthetic lethality: genetic interactions, implied functional PPIs
Yeast two hybrid system
Gal4 protein comprises DNA binding and activating domains
[Figure: the binding domain interacts with the promoter and the activating domain interacts with the polymerase; reporter enzyme activity is measured (e.g. blue colonies)]
Yeast two hybrid system
• Gal4 protein: the two domains do not need to be expressed as a single protein
• If they come into close enough proximity to interact, they will activate the RNA polymerase
[Figure: binding domain at the promoter and activating domain at the polymerase, brought together because two other protein domains (A & B) interact; reporter enzyme activity is measured (e.g. blue colonies)]
Yeast two hybrid system
This is achieved using gene fusion
Plasmids carrying different constructs can be expressed in yeast
Binding domain as a translational fusion with the gene encoding another protein in one plasmid.
Activating domain as a translational fusion with the gene encoding a different protein in a second plasmid.
If the two proteins interact, Gal4 function is reconstituted, the reporter gene is expressed, and blue colonies form
Yeast two hybrid
Advantages
Fairly simple, rapid and inexpensive
Requires no protein purification
No previous knowledge of proteins needed
Scalable to high-throughput
Is not limited to yeast proteins
Limitations
Works best with cytosolic proteins
Tendency to produce false positives
Mass spectrometry
Need to purify protein or protein complexes
Use an affinity-tag system
Need efficient method of recovering fusion protein in low concentration
TAP (tandem affinity purification)
[Figure: TAP tag cassette (spacer, calmodulin binding peptide (CBP), TEV site, Protein A) carried on a PCR product and inserted into the chromosome by homologous recombination, producing a fusion of the protein with the tag]
TAP process
"Taptag simple" by Chandres - Own work. Licensed under CC BY-SA 3.0 via Wikimedia Commons
TAP
Advantages
No prior knowledge of complex composition
Two-step purification increases specificity of pull-down
Limitations
Transient interactions may not survive 2 rounds of washing
Tag may prevent interactions
Tag may affect expression levels
Works less efficiently in mammalian cells
Other tags
HA, Flag and His
Anti-tag antibodies can interfere with MS analysis
Streptavidin binding peptide (SBP)
High affinity for streptavidin beads
10-fold increase in efficiency of purification compared to conventional TAP tag
Successfully used to identify components of complexes in the Wnt/β-catenin pathway
Nature Cell Biology 8:348-357 (2006)
The KLHL12-Cullin-3 ubiquitin ligase negatively regulates the Wnt-β-catenin pathway by targeting Dishevelled for degradation
Used Dsh-2 and Dsh-3 as bait proteins
Binding partners of Bruton’s tyrosine kinase
Protein Science 20:140-149 (2011)
Role in lymphocyte development & B-cell maturation
Databases of protein-protein interactions
MINT – Molecular Interaction Database
>240,000 interactions with 35,000 proteins
Covers multiple species
DIP – Database of Interacting Proteins (UCLA)
>79,000 interactions with >27,000 proteins
CCSB – Proteomics base interactomes (Harvard)
Human, viruses, C. elegans, S. cerevisiae
Some unpublished data
IntAct – EBI molecular interaction database
Curated data from multiple sources
Integrated Databases of PPIs
MiMI: Michigan Molecular Interactions
Data merged from several PPI databases; source provenance maintained
Links to literature sources for the PPI
Linked to Entrez Gene, InterPro, Gene ontology
Includes pathway data
Various methods of viewing the data
NOT CURATED
Data only as good as source data
http://mimi.ncibi.org
MiMI database
MiMI search results
MiMI Gene Detail
Gene Ontology
Pathways
Interactions
KEGG pathway
Each protein name is a link to another page
Arrows & lines provide information about the type of interaction
Other viewing options
MeSH terms that involve this gene
PPI with this gene in Cytoscape
Adaptive PubMed search
On average, two databases curating the same publication agree on 42% of their interactions. Discrepancies between sets of proteins annotated from the same publication are less pronounced, with an average agreement of 62%, but the overall trend is similar.
Better agreement on non-vertebrate model organism data sets than for vertebrates
Isoform complexity is a major issue
Literature curation of protein interactions: measuring agreement across databases. Turinsky A.L. et al. Database, Vol. 2010, Article ID baq026
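To make "agreement" concrete, here is a minimal Python sketch of how the overlap between two databases' annotations of one publication could be scored. It assumes the Sørensen-Dice coefficient as the overlap measure; the slide does not state the formula, so this is an illustration and not necessarily the paper's exact procedure. The function names and the toy protein IDs are hypothetical.

```python
def dice(set_a, set_b):
    """Sørensen-Dice coefficient of two sets (1.0 means identical)."""
    if not set_a and not set_b:
        return 1.0
    return 2 * len(set_a & set_b) / (len(set_a) + len(set_b))

def interaction_set(pairs):
    """Treat each binary interaction as unordered, so (A, B) equals (B, A)."""
    return {frozenset(pair) for pair in pairs}

def protein_set(pairs):
    """All proteins mentioned in a list of binary interactions."""
    return {protein for pair in pairs for protein in pair}

# Toy annotations of the same publication by two databases (made-up IDs).
db1 = [("A", "B"), ("B", "C"), ("C", "D")]
db2 = [("B", "A"), ("C", "E")]

interaction_agreement = dice(interaction_set(db1), interaction_set(db2))
protein_agreement = dice(protein_set(db1), protein_set(db2))
print(interaction_agreement, protein_agreement)
```

In this toy case the two databases share more proteins than interactions, which mirrors the direction of the 62% versus 42% figures quoted above.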
iRefWeb
Web interface to integrated database of protein-protein interactions
Better review of the records after pulling in the data from the various source databases
Can search by gene name or various IDs, including batch searches.
Does not have the pathway and other information, but has a better measure of confidence of PPI
http://wodaklab.org/iRefWeb/
iRef Web search
The search will try to match automatically, both name and species.
MI score: the MI (MINT-inspired) score is a measure of confidence in molecular interactions. For an interaction between A and B, it takes into account:
1. Total number of unique PubMed publications that support the interaction
2. Cumulative sum of weighted evidence from all
3. The cumulative sum of weighted evidence from all interologs, i.e. interactions containing homologous pairs A' and B'.
Interaction detail
STRING database
Search Tool for the Retrieval of Interacting Genes
Integrates information from existing PPI data sources
Provides confidence scoring of the interactions
Periodically runs interaction prediction algorithms on newly sequenced genomes
v.10 covers >2000 organisms
http://string-db.org/
Networks in STRING database
Starting protein
Networks can be expanded
3 indirect interactions
Information about the proteins
Transferring PPI annotation
Most of the high-throughput PPI work is done in model organisms
Can you transfer that annotation to a homologous gene in a different organism?
Defining homologs
The ortholog of a protein is usually defined as the best-matching homolog in another species
Candidates with significant BLASTP E-value (<10^-20)
Having ≥80% of residues in both sequences included in BLASTP alignment
Having one candidate as the best-matching homolog of the other candidate in the corresponding organism
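The three criteria above amount to a filtered reciprocal-best-hit search. The sketch below shows one way to express that in Python, assuming the BLASTP results have already been parsed into simple records. The `Hit` record, its field names and the toy protein IDs are hypothetical and do not correspond to any particular BLAST output format. Here the E-value and coverage filters are applied before the best hit is chosen, which is one possible reading of the criteria.

```python
from typing import NamedTuple

class Hit(NamedTuple):
    query: str          # query protein ID
    subject: str        # subject protein ID
    evalue: float       # BLASTP E-value
    query_cov: float    # fraction of query residues in the alignment
    subject_cov: float  # fraction of subject residues in the alignment

E_CUTOFF = 1e-20     # E-value must be below this (from the slide)
MIN_COVERAGE = 0.80  # both sequences must be at least 80% aligned (from the slide)

def best_hits(hits):
    """Map each query to its lowest-E-value subject among hits passing the filters."""
    best = {}
    for h in hits:
        if h.evalue >= E_CUTOFF:
            continue
        if h.query_cov < MIN_COVERAGE or h.subject_cov < MIN_COVERAGE:
            continue
        if h.query not in best or h.evalue < best[h.query].evalue:
            best[h.query] = h
    return {query: h.subject for query, h in best.items()}

def orthologs(a_vs_b, b_vs_a):
    """Return (a, b) pairs in which each protein is the other's best hit."""
    a_best = best_hits(a_vs_b)
    b_best = best_hits(b_vs_a)
    return sorted((a, b) for a, b in a_best.items() if b_best.get(b) == a)

# Toy data: worm proteins searched against yeast, and the reverse search.
worm_vs_yeast = [
    Hit("wA", "yA", 1e-50, 0.95, 0.90),
    Hit("wA", "yB", 1e-25, 0.85, 0.85),
    Hit("wC", "yC", 1e-10, 0.90, 0.90),  # fails the E-value cutoff
]
yeast_vs_worm = [
    Hit("yA", "wA", 1e-48, 0.90, 0.95),
    Hit("yB", "wA", 1e-25, 0.85, 0.85),
    Hit("yC", "wC", 1e-10, 0.90, 0.90),  # fails the E-value cutoff
]
print(orthologs(worm_vs_yeast, yeast_vs_worm))
```

Only the wA/yA pair survives: yB also hits wA, but wA's best hit is yA, so the relationship is not reciprocal.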
Interologs
If two proteins, A and B, interact in one organism and their orthologs, A’ and B’, interact in another species, then the pair of interactions A—B and A’—B’ are called interologs
Align the homologs (A & A’, B & B’) to each other.
Determine the percent identity and the E-value of both alignments
Then calculate the joint identity (JI) and the joint E-value (JE)
[Equations: joint identity; joint E-value]
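The formulas themselves did not survive in this transcript. A minimal sketch, assuming the joint values are the geometric means of the two individual alignment values (the definition I understand Yu et al. 2004 to use; check it against the paper before relying on it), could look like this. The function names are hypothetical.

```python
import math

def joint_identity(identity_a, identity_b):
    """Geometric mean of the percent identities of the A/A' and B/B' alignments."""
    return math.sqrt(identity_a * identity_b)

def joint_evalue(evalue_a, evalue_b):
    """Geometric mean of the two E-values.

    The square roots are taken before multiplying so that very small
    E-values do not underflow to zero in the intermediate product.
    """
    return math.sqrt(evalue_a) * math.sqrt(evalue_b)

# Example: A/A' aligns at 90% identity (E = 1e-80), B/B' at 40% (E = 1e-60).
print(joint_identity(90.0, 40.0))
print(joint_evalue(1e-80, 1e-60))
```

Because a geometric mean is pulled down by the weaker alignment, a pair scores highly only when both A/A' and B/B' are well conserved.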
Transfer of annotation
Compared interaction datasets between yeast, worm and fly
Assessed chance that two proteins interact with each other based on their joint sequence identities
Performed similar analysis based on joint E-values
All protein pairs with JI ≥ 80% with a known interacting pair will interact with each other
More than half of protein pairs with JE ≤ 10^-70 could be experimentally verified.
Yu, H. et al. (2004) Genome Res. 14: 1107-1118. PMID: 15173116
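As an illustration of how these thresholds could be applied, the sketch below maps known interactions from a source organism onto a target organism and keeps only the pairs that pass either cutoff. It again assumes geometric-mean joint values. The function name, the data layout and the toy protein IDs are hypothetical, and combining the two cutoffs with "or" is a choice made here for illustration.

```python
import math

JI_CUTOFF = 80.0   # joint identity threshold in percent (from the slide)
JE_CUTOFF = 1e-70  # joint E-value threshold (from the slide)

def transfer_interactions(interactions, homologs):
    """Predict target-species interactions from source-species interactions.

    interactions: iterable of (a, b) protein pairs known to interact.
    homologs: dict mapping a source protein to a tuple of
              (target protein, percent identity, E-value) for its homolog.
    Returns a list of (a', b', joint identity, joint E-value).
    """
    predicted = []
    for a, b in interactions:
        if a not in homologs or b not in homologs:
            continue  # no homolog known for one of the partners
        a_prime, identity_a, evalue_a = homologs[a]
        b_prime, identity_b, evalue_b = homologs[b]
        ji = math.sqrt(identity_a * identity_b)
        je = math.sqrt(evalue_a) * math.sqrt(evalue_b)
        if ji >= JI_CUTOFF or je <= JE_CUTOFF:
            predicted.append((a_prime, b_prime, ji, je))
    return predicted

# Toy data: two known interactions and the homologs of the proteins involved.
known = [("P1", "P2"), ("P1", "P3")]
homolog_table = {
    "P1": ("Q1", 90.0, 1e-120),
    "P2": ("Q2", 85.0, 1e-100),
    "P3": ("Q3", 30.0, 1e-15),  # weak homolog, so P1-P3 should not transfer
}
predictions = transfer_interactions(known, homolog_table)
print([(a, b) for a, b, _, _ in predictions])
```

Only Q1-Q2 is predicted: the P1-P3 pair is dragged below both cutoffs by the weak P3/Q3 alignment.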
Examples of Protein-Protein Interologs
In C. elegans, mpk-1 was experimentally shown to interact with 26 other proteins (by yeast 2-hybrid)
Ste5 is the homolog of Mpk-1 in S. cerevisiae
Based on the similarity between the interaction partners of mpk-1 and their closest homologs in S. cerevisiae, the interolog approach predicted 5 of the 6 subunits of the Ste5 complex in S. cerevisiae
This paper has been cited >100 times
Why the interest in predicting protein-protein interactions?
Determining protein-protein interactions is challenging and the high-throughput (genome-wide) methods are still difficult and expensive to conduct
Identifying candidate interaction partners for a targeted pull-down assay is a more viable strategy for most labs
BIPS: BIANA Interolog Prediction Server
• Based on concept of interolog
• Pre-defined alignments
• Can submit list of proteins to get predicted interaction partners
• Can filter predicted list to increase confidence
Today in computer lab
Tutorial on finding PPIs in your gene list using MiMI or iRefWeb
Exploring a subset of PPIs using the STRING database
Prediction of interactions from homologs using the BIPS server
Exercise 4 on protein domain analysis