Biological networks
Bing Zhang
Department of Biomedical Informatics
Vanderbilt University
bing.zhang@vanderbilt.edu
BCHM352, Spring 2011 2
Protein-protein interaction (PPI)
Definition: Physical association of two or more protein molecules
Examples:
  Receptor-ligand interactions
  Kinase-substrate interactions
  Transcription factor-co-activator interactions
  Multiprotein complex, e.g. multimeric enzymes
[Figure: RNA polymerase II, 12 subunits (Cramer et al. Science 292:1863, 2001)]
Significance of protein interaction
Most proteins mediate their function through interacting with other proteins
  To form molecular machines
  To participate in various regulatory processes
Distortions of protein interactions can cause diseases
Yeast two-hybrid
Method:
  Bait strain: a protein of interest, bait (B), fused to a DNA-binding domain (DBD)
  Prey strains: ORFs fused to a transcriptional activation domain (AD)
  Mate the bait strain to prey strains and plate diploid cells on selective media (e.g. without histidine)
  If bait and prey interact in the diploid cell, they reconstitute a transcription factor, which activates a reporter gene whose expression allows the diploid cell to grow on selective media
  Pick colonies, isolate DNA, and sequence to identify the ORF interacting with the bait
Pros:
  High-throughput
  Can detect transient interactions
Cons:
  False positives
  Non-physiological (done in the yeast nucleus)
  Can’t detect multiprotein complexes
Uetz P. Curr Opin Chem Biol. 6:57, 2002
Tandem affinity purification
Method:
  TAP tag: Protein A, Calmodulin binding domain, TEV protease cleavage site
  Bait protein gene is fused with the DNA sequences encoding TAP tag
  Tagged bait is expressed in cells and forms native complexes
  Complexes purified by TAP method
  Components of each complex are identified through gel separation followed by MS/MS
Pros:
  High-throughput
  Physiological setting
  Can detect large stable protein complexes
Cons:
  High false positives
  Can’t detect transient interactions
  Can’t detect interactions not present under the given condition
  Tagging may disturb complex formation
  Binary interaction relationship is not clear
Chepelev et al. Biotechnol & Biotechnol 22:1, 2008
Large scale protein interaction identification
Experimental:
  Yeast two-hybrid
  Tandem affinity purification
Computational:
  Gene fusion
  Ortholog interaction
  Phylogenetic profiling
  Microarray gene co-expression
Valencia et al. Curr. Opin. Struct. Biol, 12:368, 2002
Protein interaction data in the public domain
Database of Interacting Proteins (DIP) http://dip.doe-mbi.ucla.edu/
The Molecular INTeraction database (MINT) http://mint.bio.uniroma2.it/mint/
The Biomolecular Interaction Network Database (BIND) http://www.binddb.org/
The General Repository for Interaction Datasets (BioGRID) http://www.thebiogrid.org/
Human Protein Reference Database (HPRD) http://www.hprd.org
Online Predicted Human Interaction Database (OPHID) http://ophid.utoronto.ca
The Munich Information Center for Protein Sequences (MIPS) http://mips.gsf.de
HPRD
Protein interaction networks
Saccharomyces cerevisiae: Jeong et al. Nature, 411:41, 2001
Drosophila melanogaster: Giot et al. Science, 302:1727, 2003
Caenorhabditis elegans: Li et al. Science, 303:540, 2004
Homo sapiens: Rual et al. Nature, 437:1173, 2005
Gene regulatory networks
Experimental:
  Chromatin immunoprecipitation (ChIP)
  ChIP-chip
  ChIP-seq
Computational:
  Promoter sequence analysis
  Reverse engineering from microarray gene expression data
Public databases:
  Transfac (http://www.gene-regulation.com)
  MSigDB (http://www.broadinstitute.org/gsea/msigdb)
  hPDI (http://bioinfo.wilmer.jhu.edu/PDI/)
Shen-Orr et al. Nat Genet, 31:64, 2002
KEGG metabolic network
Network visualization tools
Cytoscape http://www.cytoscape.org
Gehlenborg et al. Nature Methods, 7:S56, 2010
Graph representation of networks
Graph: a graph is a set of objects called nodes or vertices connected by links called edges. In mathematics and computer science, a graph is the basic object of study in graph theory.
[Figure: RNA polymerase II (Cramer et al. Science 292:1863, 2001), annotated with the labels "node" and "edge"]
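A graph as defined above is often stored as an adjacency list, where each node maps to the set of its neighbours. The following is a minimal sketch in Python; the node labels A to D are hypothetical placeholders, not real RNA polymerase II subunits, and `build_graph` is a helper name introduced here for illustration.

```python
from collections import defaultdict


def build_graph(edges):
    """Return an undirected graph as {node: set of neighbouring nodes}."""
    graph = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)
        graph[v].add(u)  # undirected: store the edge in both directions
    return dict(graph)


# Hypothetical edge list; the labels are placeholders, not real subunits.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
graph = build_graph(edges)

nodes = sorted(graph)
# Each undirected edge is stored twice, once per endpoint.
n_edges = sum(len(neighbours) for neighbours in graph.values()) // 2
print(nodes, n_edges)
```

The same dictionary-of-sets layout is assumed by the other sketches in this transcript.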
Undirected graph vs directed graph
Protein interaction network
Nodes: protein
Edges: physical interaction
Undirected
Transcriptional regulatory network
Nodes: transcription factors and genes
Edges: transcriptional regulation
Directed
TF->target gene
Metabolic network
Nodes: metabolites
Edges: enzymes
Directed
Substrate->Product
Figure sources: Krogan et al. Nature 440:637, 2006; Ravasz et al. Science 297:1551, 2002; Lee et al. Science 298:799, 2002
[Figure labels: Fhl1, RPL2B]
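The difference between undirected and directed edges can be made concrete in code. In this sketch an undirected edge is an unordered pair and a directed edge is an ordered (source, target) pair. The protein names P1 to P3 are hypothetical, and reading the figure labels Fhl1 and RPL2B as a TF and its target gene is an assumption made only for illustration.

```python
# Undirected edges (protein interaction): the order of the two proteins
# does not matter, so each edge is an unordered pair (frozenset).
ppi_edges = {frozenset(("P1", "P2")), frozenset(("P2", "P3"))}  # hypothetical

# Directed edges (transcriptional regulation): (source, target) tuples.
# Fhl1 -> RPL2B is assumed here for illustration only.
regulatory_edges = {("Fhl1", "RPL2B")}


def has_undirected_edge(edges, a, b):
    """True if a and b are linked, regardless of the order given."""
    return frozenset((a, b)) in edges


def has_directed_edge(edges, source, target):
    """True only if there is an edge pointing from source to target."""
    return (source, target) in edges


print(has_undirected_edge(ppi_edges, "P2", "P1"))
print(has_directed_edge(regulatory_edges, "RPL2B", "Fhl1"))
```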
Degree, path, shortest path
Degree: the number of edges incident to a node. A simple measure of node centrality.
Path: a sequence of nodes such that from each of its nodes there is an edge to the next node in the sequence.
Shortest path: a path between two nodes such that the sum of the lengths of its constituent edges is minimized. In an unweighted network, this is the path with the fewest edges.
Examples from the figures:
YDL176W: degree 3
Fhl1: out degree 4, in degree 0
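Degree and shortest path can be computed directly from the graph representation. The sketch below assumes an unweighted network, so every edge has length 1 and breadth-first search finds a shortest path. The degree values match the slide (YDL176W: 3; Fhl1: out degree 4, in degree 0), but all neighbour names (P1 to P4, T2 to T4) and the Fhl1 -> RPL2B edge are hypothetical.

```python
from collections import deque


def degree(graph, node):
    """Number of edges incident to node in an undirected graph."""
    return len(graph[node])


def in_out_degree(directed_edges, node):
    """Return (in degree, out degree) from a list of (source, target) edges."""
    in_deg = sum(1 for _, target in directed_edges if target == node)
    out_deg = sum(1 for source, _ in directed_edges if source == node)
    return in_deg, out_deg


def shortest_path(graph, start, goal):
    """Breadth-first search; returns a list of nodes, or None if unreachable."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            path = []
            while current is not None:  # walk back to the start node
                path.append(current)
                current = previous[current]
            return path[::-1]
        for neighbour in graph[current]:
            if neighbour not in previous:
                previous[neighbour] = current
                queue.append(neighbour)
    return None


# Hypothetical neighbourhood; only the degree of YDL176W follows the slide.
graph = {
    "YDL176W": {"P1", "P2", "P3"},
    "P1": {"YDL176W", "P4"},
    "P2": {"YDL176W"},
    "P3": {"YDL176W", "P4"},
    "P4": {"P1", "P3"},
}
# Hypothetical targets; only the in/out degree of Fhl1 follows the slide.
regulation = [("Fhl1", "RPL2B"), ("Fhl1", "T2"), ("Fhl1", "T3"), ("Fhl1", "T4")]

print(degree(graph, "YDL176W"))
print(in_out_degree(regulation, "Fhl1"))
print(shortest_path(graph, "P2", "P4"))
```

For weighted networks a weighted method such as Dijkstra's algorithm would be needed instead of breadth-first search.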
Obama vs Lady Gaga: who is more influential?
Name     Twitter followers (in degree)   Twitter following (out degree)
Obama    7,035,548                       701,301
Gaga     8,873,525                       144,263
Eminem   3,509,469                       0
Network properties (I): hubs
Random network
  130 nodes, 215 edges
  Homogeneous: most nodes have approximately the same number of links
  Five red nodes with the highest number of links reach 27% of the nodes
Scale-free network
  130 nodes, 215 edges
  Heterogeneous: the majority of the nodes have one or two links but a few nodes have a large number of links
  Five red nodes with the highest degrees (hubs) reach 60% of the nodes
Albert et al., Nature, 406:378, 2000
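The contrast between the two network types can be explored with a small simulation. This is a rough sketch, not the procedure used in the cited figure: the random network picks 215 edges uniformly at random, the scale-free-like network uses a simplified preferential-attachment growth, and "reach" is defined here as the fraction of all nodes that are direct neighbours of at least one of the five highest-degree nodes. All function names and the seed are assumptions, and the printed percentages will not reproduce the 27% and 60% on the slide.

```python
import random


def random_network(n, m, rng):
    """m distinct edges chosen uniformly at random among n nodes."""
    graph = {v: set() for v in range(n)}
    edges = 0
    while edges < m:
        u, v = rng.sample(range(n), 2)  # two distinct nodes
        if v not in graph[u]:
            graph[u].add(v)
            graph[v].add(u)
            edges += 1
    return graph


def preferential_network(n, m, rng):
    """Grow a network in which new nodes prefer high-degree targets.

    Simplified scheme: assumes n - 1 <= m <= 2 * n - 3. Every new node adds
    one link, and the earliest new nodes add a second link until m is reached.
    """
    graph = {0: {1}, 1: {0}}
    stubs = [0, 1]  # each node appears once per incident edge
    extra = m - (n - 1)  # links beyond one per new node
    for new in range(2, n):
        links = 2 if extra > 0 else 1
        targets = set()
        while len(targets) < links:
            targets.add(rng.choice(stubs))  # chance proportional to degree
        extra -= links - 1
        graph[new] = set()
        for target in targets:
            graph[new].add(target)
            graph[target].add(new)
            stubs.extend((new, target))
    return graph


def edge_count(graph):
    return sum(len(neighbours) for neighbours in graph.values()) // 2


def hub_reach(graph, k=5):
    """Fraction of nodes adjacent to at least one of the k top-degree nodes."""
    hubs = sorted(graph, key=lambda v: len(graph[v]), reverse=True)[:k]
    reached = set()
    for hub in hubs:
        reached |= graph[hub]
    return len(reached) / len(graph)


rng = random.Random(0)  # fixed seed so the run is repeatable
random_graph = random_network(130, 215, rng)
scale_free_graph = preferential_network(130, 215, rng)
print("random:", edge_count(random_graph), hub_reach(random_graph))
print("scale-free-like:", edge_count(scale_free_graph), hub_reach(scale_free_graph))
```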
Scale-free biological networks
Metabolic network, C. elegans: Jeong et al, Nature, 407:651, 2000
Protein interaction network, H. sapiens: Stelzl et al. Cell, 122:957, 2005
Gene co-expression network, S. cerevisiae: Noort et al, EMBO Reports, 5:280, 2004
Network properties (II): small world network
Stanley Milgram’s small world experiment
  Social network
  Average path length between two people
  Six degrees of separation
  "If you do not know the target person on a personal basis, do not try to contact him directly. Instead, mail this folder to a personal acquaintance who is more likely than you to know the target person."
  [Figure labels: Omaha, Wichita, Boston]
Small world network: a graph in which most nodes can be reached from every other by a small number of steps.
Biological interpretation: Efficiency in transfer of biological information
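The "small number of steps" in the definition above can be quantified as the average shortest path length over all connected pairs of nodes. The sketch below is a minimal illustration on a hypothetical six-node ring, with and without one shortcut edge; it assumes an unweighted, undirected graph stored as a dictionary of neighbour sets.

```python
from collections import deque


def bfs_distances(graph, source):
    """Number of steps from source to every reachable node."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in distance:
                distance[v] = distance[u] + 1
                queue.append(v)
    return distance


def average_path_length(graph):
    """Mean shortest path length over all ordered pairs of connected nodes."""
    total = 0
    pairs = 0
    for source in graph:
        for target, steps in bfs_distances(graph, source).items():
            if target != source:
                total += steps
                pairs += 1
    return total / pairs if pairs else 0.0


def ring(n):
    """Hypothetical toy network: n nodes arranged in a circle."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


plain_ring = ring(6)
ring_with_shortcut = ring(6)
ring_with_shortcut[0].add(3)  # one long-range link across the ring
ring_with_shortcut[3].add(0)

print(average_path_length(plain_ring))
print(average_path_length(ring_with_shortcut))
```

A single shortcut already lowers the average, which is the intuition behind short paths in networks with a few long-range links.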
Network properties (III): motifs
Network motifs: Patterns that occur in the real network significantly more often than in randomized networks.
Three-node patterns: Milo et al., Science, 298:824, 2002
Feed-forward loop
Feedback loop
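The two three-node patterns named above can be counted in a directed network. This sketch assumes a feed-forward loop means X->Y, Y->Z and X->Z, and a three-node feedback loop means X->Y->Z->X. It only counts occurrences; the comparison against randomized networks that defines a motif is not implemented here. The example networks and function names are hypothetical.

```python
def count_feed_forward_loops(targets):
    """Count triples (X, Y, Z) with edges X->Y, Y->Z and X->Z."""
    count = 0
    for x, x_targets in targets.items():
        for y in x_targets:
            if y == x:
                continue  # ignore self-regulation
            for z in targets.get(y, ()):
                if z != x and z != y and z in x_targets:
                    count += 1
    return count


def count_feedback_loops(targets):
    """Count three-node cycles X->Y->Z->X, each cycle counted once."""
    count = 0
    for x in targets:
        for y in targets.get(x, ()):
            for z in targets.get(y, ()):
                if len({x, y, z}) == 3 and x in targets.get(z, ()):
                    count += 1
    return count // 3  # every cycle is found from each of its three nodes


# Hypothetical networks: {regulator: set of target nodes}.
feed_forward = {"X": {"Y", "Z"}, "Y": {"Z"}, "Z": set()}
feedback = {"A": {"B"}, "B": {"C"}, "C": {"A"}}

print(count_feed_forward_loops(feed_forward), count_feedback_loops(feed_forward))
print(count_feed_forward_loops(feedback), count_feedback_loops(feedback))
```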
Network properties (IV): modularity
A module is a group of physically or functionally linked molecules (nodes) that work together to achieve a relatively distinct function; modularity refers to the organization of a network into such modules.
Examples:
  Transcriptional module: a set of co-regulated genes sharing a common function
  Protein complex: an assembly of proteins that builds up some cellular machinery; it commonly spans a dense sub-network of proteins in a protein interaction network
  Signaling pathway: a chain of interacting proteins propagating a signal in the cell
Protein interaction modules: Palla et al, Nature, 435:841, 2005
Gene co-expression modules: Shi et al, BMC Syst Biol, 4:74, 2010
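One simple way to check whether a candidate module forms a "dense sub-network" is to compute the fraction of possible edges among its members that are actually present. This is only an illustrative measure, not the module detection method of either cited paper; the network and the `subgraph_density` helper are hypothetical.

```python
def subgraph_density(graph, members):
    """Edges present among members divided by the maximum possible number."""
    members = set(members)
    k = len(members)
    if k < 2:
        return 0.0
    # Each internal edge is seen from both endpoints, hence the division by 2.
    internal = sum(len(graph[u] & members) for u in members) // 2
    return internal / (k * (k - 1) / 2)


# Hypothetical network: A-D form a fully connected complex, D-E-F a chain.
graph = {
    "A": {"B", "C", "D"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"A", "B", "C", "E"},
    "E": {"D", "F"},
    "F": {"E"},
}

print(subgraph_density(graph, ["A", "B", "C", "D"]))
print(subgraph_density(graph, ["D", "E", "F"]))
```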
Network distance vs functional similarity
Proteins that lie closer to one another in a protein interaction network are more likely to have similar functions and to be involved in similar biological processes.
Sharan et al. Mol Syst Biol, 3:88, 2007
Network-based disease gene prioritization
Kohler et al. Am J Hum Genet. 82:949, 2008
For a specific disease, candidate genes can be ranked based on their proximity to known disease genes.
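Ranking by proximity can be sketched with a random walk with restart, which is one possible proximity measure; the slide itself does not say which measure is used, and this sketch is not presented as the cited implementation. The walker starts at the known disease genes, moves to a random neighbour at each step, and jumps back to a start gene with a fixed probability. The restart value 0.3, the toy network, and all gene names are assumptions, and the graph is assumed to have no isolated nodes.

```python
def random_walk_with_restart(graph, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Steady-state visiting probability of each node for a walk from seeds."""
    start = {v: (1.0 / len(seeds) if v in seeds else 0.0) for v in graph}
    prob = dict(start)
    for _ in range(max_iter):
        # With probability `restart` the walker returns to a seed node.
        nxt = {v: restart * start[v] for v in graph}
        # Otherwise it moves to a uniformly chosen neighbour.
        for u in graph:
            share = (1.0 - restart) * prob[u] / len(graph[u])
            for v in graph[u]:
                nxt[v] += share
        change = sum(abs(nxt[v] - prob[v]) for v in graph)
        prob = nxt
        if change < tol:
            break
    return prob


# Hypothetical network: D1 and D2 are known disease genes, A-C are candidates.
graph = {
    "D1": {"A"},
    "D2": {"A"},
    "A": {"D1", "D2", "B"},
    "B": {"A", "C"},
    "C": {"B"},
}
scores = random_walk_with_restart(graph, seeds={"D1", "D2"})
candidates = ["A", "B", "C"]
ranking = sorted(candidates, key=lambda gene: scores[gene], reverse=True)
print(ranking)
```

Candidates closer to the known disease genes receive higher scores, so they appear earlier in the ranking.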
Summary
Biological networks
  Protein-protein interaction network
  Gene regulatory network
  Metabolic network
Graph representation of networks
  Graph, node, edge, undirected graph, directed graph, degree, path, shortest path
Network properties
  Hubs and scale-free degree distribution
  Small-world
  Motifs
  Modularity
Network-based applications
  Disease gene prioritization