Protein-protein Interactions

June 18, 2015

Why PPI?

Protein-protein interactions determine outcome of most cellular processes

Proteins which are close homologues often interact in the same way

Protein-protein interactions place evolutionary constraints on protein sequence and structural divergence

Precursor to networks

PPI classification

Strength of interaction

Permanent or transient

Specificity

Location within polypeptide chain

Similarity of partners

Homo- or hetero-oligomers

Direct (binary) or a complex

Confidence score

Determining PPIs

Small-scale methods

Co-immunoprecipitation

Affinity chromatography

Pull-down assays

In vitro binding assays

FRET, Biacore, AFM

Structural (co-crystals)

PPIs by high-throughput methods

Yeast two hybrid systems

Affinity tag purification followed by mass spectrometry

Protein microarrays

Microarrays/gene co-expression: implied functional PPIs

Synthetic lethality: genetic interactions, implied functional PPIs

Yeast two hybrid system

Gal4 protein comprises DNA binding and activating domains

Binding domain interacts with promoter

Measure reporter enzyme activity (e.g. blue colonies)

Activating domain interacts with polymerase

Yeast two hybrid system

• Gal4 protein: the two domains do not need to be expressed as a single protein

• If they come into close enough proximity to interact, they will activate the RNA polymerase

Binding domain interacts with promoter

Measure reporter enzyme activity (e.g. blue colonies)

Activating domain interacts with polymerase

A B

Two other protein domains (A & B) interact

Yeast two hybrid system

A B

This is achieved using gene fusion

Plasmids carrying different constructs can be expressed in yeast

Binding domain as a translational fusion with the gene encoding another protein in one plasmid.

Activating domain as a translational fusion with the gene encoding a different protein in a second plasmid.

If the two proteins interact, the Gal4 binding and activating domains are brought together, the reporter gene is expressed, and blue colonies form

Yeast two hybrid

Advantages:

Fairly simple, rapid and inexpensive

Requires no protein purification

No previous knowledge of proteins needed

Scalable to high-throughput

Is not limited to yeast proteins

Limitations:

Works best with cytosolic proteins

Tendency to produce false positives

Mass spectrometry

Need to purify protein or protein complexes

Use an affinity-tag system

Need an efficient method of recovering the fusion protein at low concentration

TAP (tandem affinity purification)

[Figure: TAP tag construct (spacer, calmodulin binding peptide (CBP), TEV site, Protein A) inserted into the chromosome as a PCR product by homologous recombination, producing a tagged fusion protein]

TAP process

"Taptag simple" by Chandres - Own work. Licensed under CC BY-SA 3.0 via Wikimedia Commons

TAP

Advantages:

No prior knowledge of complex composition

Two-step purification increases specificity of pull-down

Limitations:

Transient interactions may not survive 2 rounds of washing

Tag may prevent interactions

Tag may affect expression levels

Works less efficiently in mammalian cells

Other tags

HA, Flag and His

Anti-tag antibodies can interfere with MS analysis

Streptavidin binding peptide (SBP)

High affinity for streptavidin beads

10-fold increase in efficiency of purification compared to conventional TAP tag

Successfully used to identify components of complexes in the Wnt/β-catenin pathway

Nature Cell Biology 8:348-357 (2006)

The KLHL12-Cullin-3 ubiquitin ligase negatively regulates the Wnt-β-catenin pathway by targeting Dishevelled for degradation

Used Dsh-2 and Dsh-3 as bait proteins

Binding partners of Bruton’s tyrosine kinase

Protein Science 20:140-149 (2011)

Role in lymphocyte development & B-cell maturation

Databases of protein-protein interactions

MINT – Molecular Interaction Database

>240,000 interactions with 35,000 proteins

Covers multiple species

DIP – Database of Interacting Proteins (UCLA)

>79,000 interactions with >27,000 proteins

CCSB – Proteomics-based interactomes (Harvard)

Human, viruses, C. elegans, S. cerevisiae

Some unpublished data

IntAct – EBI molecular interaction database

Curated data from multiple sources

Integrated Databases of PPIs

MiMI: Michigan Molecular Interactions

Data merged from several PPI databases; source provenance maintained

Links to literature sources for the PPI

Linked to Entrez Gene, InterPro, Gene Ontology

Includes pathway data

Various methods of viewing the data

NOT CURATED

Data only as good as source data

http://mimi.ncibi.org

MiMI database

MiMI search results

MiMI Gene Detail

Gene Ontology

Pathways

Interactions

KEGG pathway

Each protein name is a link to another page

Arrows & lines provide information about the type of interaction

Other viewing options

MeSH terms that involve this gene

PPI with this gene in Cytoscape

Adaptive PubMed search

On average, two databases curating the same publication agree on 42% of their interactions.

Discrepancies between sets of proteins annotated from the same publication are less pronounced, with an average agreement of 62%, but the overall trend is similar.

Better agreement on non-vertebrate model organism data sets than on vertebrate ones

Isoform complexity is a major issue

Literature curation of protein interactions: measuring agreement across databases. Turinsky A.L. et al. Database, Vol. 2010, Article ID baq026
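To make the agreement figure concrete, here is a minimal sketch that scores the overlap between the interactions two databases curated from the same publication. It assumes a Jaccard-style ratio (shared interactions divided by all distinct interactions), which may not be the exact measure used by Turinsky et al.; the function name and the protein labels P1 to P5 are hypothetical.

```python
def interaction_agreement(db_a, db_b):
    """Fraction of distinct interactions that both curated sets contain.

    Each interaction is an unordered pair, so (A, B) and (B, A) count
    as the same record. Assumption: Jaccard-style overlap.
    """
    a = {frozenset(pair) for pair in db_a}
    b = {frozenset(pair) for pair in db_b}
    union = a | b
    if not union:
        return 1.0  # two empty sets agree trivially
    return len(a & b) / len(union)


# Hypothetical records curated from one paper by two databases
db1 = [("P1", "P2"), ("P2", "P3"), ("P3", "P4")]
db2 = [("P2", "P1"), ("P3", "P4"), ("P4", "P5"), ("P1", "P5")]

agreement = interaction_agreement(db1, db2)  # 2 shared out of 5 distinct
```

Storing each pair as a `frozenset` makes the comparison independent of the order in which a database lists the two partners.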

iRefWeb

Web interface to integrated database of protein-protein interactions

Better review of the records after pulling in the data from the various source databases

Can search by gene name or various IDs, including batch searches.

Does not have the pathway and other information, but has a better measure of confidence of PPI

http://wodaklab.org/iRefWeb/

iRef Web search

The search automatically tries to match both name and species.

MI score: the MINT-inspired (MI) score is a measure of confidence in a molecular interaction. For an interaction between A and B, it considers:

1. Total number of unique PubMed publications that support the interaction

2. Cumulative sum of weighted evidence from all

3. The cumulative sum of weighted evidence from all interologs, i.e. interactions containing homologous pairs A' and B'.

Interaction detail

STRING database

Search Tool for the Retrieval of Interacting Genes

Integrates information from existing PPI data sources

Provides confidence scoring of the interactions

Periodically runs interaction prediction algorithms on newly sequenced genomes

v.10 covers >2000 organisms

http://string-db.org/

Networks in STRING database

Starting protein

Networks can be expanded

3 indirect interactions

Information about the proteins

Transferring PPI annotation

Most of the high-throughput PPI work is done in model organisms

Can you transfer that annotation to a homologous gene in a different organism?

Defining homologs

An ortholog of a protein is usually defined as the best-matching homolog in another species

Candidates with a significant BLASTP E-value (<10^-20)

Having ≥80% of residues in both sequences included in the BLASTP alignment

Having one candidate as the best-matching homolog of the other candidate in the corresponding organism
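The three criteria can be sketched in Python. This is a minimal illustration under stated assumptions, not a published pipeline: the `Hit` record, its field names, and the toy values are hypothetical, and it assumes BLASTP output has already been reduced to one best hit per query in each direction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Best BLASTP hit for one query (hypothetical record layout)."""
    query: str
    subject: str
    evalue: float
    query_cov: float    # fraction of query residues inside the alignment
    subject_cov: float  # fraction of subject residues inside the alignment


def is_ortholog_pair(fwd, rev, max_evalue=1e-20, min_cov=0.80):
    """Check E-value, alignment coverage, and reciprocal best match."""
    significant = fwd.evalue < max_evalue and rev.evalue < max_evalue
    covered = min(fwd.query_cov, fwd.subject_cov) >= min_cov
    reciprocal = fwd.query == rev.subject and fwd.subject == rev.query
    return significant and covered and reciprocal


# Toy values, made up for illustration only
fwd = Hit("wormA", "yeastA", 1e-45, 0.92, 0.88)
rev = Hit("yeastA", "wormA", 3e-44, 0.88, 0.92)
weak_fwd = Hit("wormB", "yeastB", 1e-5, 0.95, 0.95)
weak_rev = Hit("yeastB", "wormB", 1e-5, 0.95, 0.95)

strong_ok = is_ortholog_pair(fwd, rev)          # passes all three checks
weak_ok = is_ortholog_pair(weak_fwd, weak_rev)  # fails the E-value check
```

The reciprocal check is what separates an ortholog candidate from a one-directional best match, for example when one genome carries extra paralogs.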

Interologs

If two proteins, A and B, interact in one organism and their orthologs, A’ and B’, interact in another species, then the pair of interactions A—B and A’—B’ are called interologs

Align the homologs (A & A’, B & B’) to each other.

Determine the percent identity and the E-value of both alignments

Then calculate the joint identity and the joint E-value

Joint identity Joint E-value
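A minimal sketch of the two joint measures, assuming they are defined as geometric means of the individual alignment values (the convention attributed here to Yu et al., 2004; treat the exact formula as an assumption). The function names are illustrative.

```python
import math


def joint_identity(identity_a, identity_b):
    """Geometric mean of the percent identities of A/A' and B/B'."""
    return math.sqrt(identity_a * identity_b)


def joint_evalue(evalue_a, evalue_b):
    """Geometric mean of the two E-values, computed in log space.

    Multiplying two very small E-values directly can underflow to 0.0,
    so the exponents are averaged instead.
    """
    if evalue_a == 0.0 or evalue_b == 0.0:
        return 0.0  # BLAST reports 0.0 for extremely strong matches
    mean_exponent = (math.log10(evalue_a) + math.log10(evalue_b)) / 2
    return 10 ** mean_exponent


ji = joint_identity(90.0, 40.0)   # one strong and one weak alignment
je = joint_evalue(1e-80, 1e-60)
```

With a geometric mean, one weak alignment pulls the joint value down more than an arithmetic mean would: identities of 90% and 40% give a joint identity of 60%, not 65%.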

Transfer of annotation

Compared interaction datasets between yeast, worm and fly

Assessed chance that two proteins interact with each other based on their joint sequence identities

Performed similar analysis based on joint E-values

All protein pairs with JI ≥ 80% with a known interacting pair will interact with each other

More than half of protein pairs with JE ≤ 10^-70 could be experimentally verified.

Yu, H. et al. (2004) Genome Res. 14: 1107-1118. PMID: 15173116
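The transfer step can be sketched as a filter over known interactions. This is an illustration only: it assumes the joint identity (JI) and joint E-value (JE) have already been computed for each pair, it accepts a pair when either threshold from the slide is met (an assumption about how the two cut-offs combine), and all names and values are hypothetical.

```python
def predict_interologs(known_pairs, orthologs, joint_stats,
                       min_ji=80.0, max_je=1e-70):
    """Map source-organism interactions onto target-organism orthologs.

    known_pairs : list of (A, B) interacting proteins in the source organism
    orthologs   : dict, source protein -> ortholog in the target organism
    joint_stats : dict, (A, B) -> (joint identity in %, joint E-value)
    """
    predictions = []
    for a, b in known_pairs:
        if a not in orthologs or b not in orthologs:
            continue  # nothing to transfer without both orthologs
        ji, je = joint_stats[(a, b)]
        if ji >= min_ji or je <= max_je:
            predictions.append((orthologs[a], orthologs[b]))
    return predictions


# Hypothetical toy data
known = [("P1", "P2"), ("P1", "P3"), ("P4", "P5")]
ortho = {"P1": "Q1", "P2": "Q2", "P3": "Q3", "P4": "Q4"}
stats = {
    ("P1", "P2"): (85.0, 1e-90),   # passes both thresholds
    ("P1", "P3"): (35.0, 1e-20),   # too divergent
    ("P4", "P5"): (90.0, 1e-100),  # P5 has no ortholog
}

predicted = predict_interologs(known, ortho, stats)
```

Predictions of this kind are candidates for a targeted experiment, not confirmed interactions.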

Examples of Protein-Protein Interologs

In C. elegans, mpk-1 was experimentally shown to interact with 26 other proteins (by yeast 2-hybrid)

Ste5 is the homolog of mpk-1 in S. cerevisiae

Based on the similarity between the interaction partners of mpk-1 and their closest homologs in S. cerevisiae, the interolog approach predicted 5 of the 6 subunits of the Ste5 complex in S. cerevisiae

This paper has been cited >100 times

Why the interest in predicting protein-protein interactions?

Determining protein-protein interactions is challenging and the high-throughput (genome-wide) methods are still difficult and expensive to conduct

Identifying candidate interaction partners for a targeted pull-down assay is a more viable strategy for most labs

BIPS: BIANA Interolog Prediction Server

• Based on concept of interolog

• Pre-defined alignments

• Can submit list of proteins to get predicted interaction partners

• Can filter predicted list to increase confidence

Today in computer lab

Tutorial on finding PPIs in your gene list using MiMI or iRefWeb

Exploring a subset of PPIs using the STRING database

Prediction of interactions from homologs using the BIPS server

Exercise 4 on protein domain analysis
