Biological Networks
Can a biologist fix a radio?
Lazebnik, Cancer Cell, 2002
Building models from parts lists
Computational tools are needed to distill pathways of interest from large molecular interaction databases
Thinking computationally about biological processes may lead to more accurate models, which in turn can be used to improve the design of algorithms
Navlakha and Bar-Joseph, 2011
Jeong et al. Nature 411, 41 - 42 (2001)
The Protein-Protein Interaction Network in yeast
Gene A regulates gene B
Protein A binds protein B
Network Representation
edge (link)
Directional
Non-directional
node
Different types of Biological Networks
• Protein interaction network: nodes are proteins; edges are physical interactions (protein-protein)
• Metabolic network: nodes are metabolites; edges are enzymatic conversions (protein-metabolite)
• Transcriptional network: nodes are transcription factors and target genes; edges are transcriptional interactions (protein-DNA)
Small-world Network
Biological networks exhibit small-world network (SWN) characteristics
(similar to social networks, internet etc)
Every node can be reached from every other by a small number of steps
SWN vs Random Networks
[Figure: Small World Network (SWN) compared with a Random Network]
SWNs have a small number of highly connected nodes
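The "small number of steps" idea can be made concrete by measuring the average shortest-path length of a graph. The sketch below is only an illustration: the two toy graphs (a star with one hub and a ring without hubs) and all function names are assumptions, not data from the slides.

```python
from collections import deque

def bfs_distances(graph, source):
    """Number of steps from source to every reachable node (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def average_path_length(graph):
    """Mean shortest-path length over all ordered pairs of connected nodes."""
    total = 0
    pairs = 0
    for source in graph:
        for target, steps in bfs_distances(graph, source).items():
            if target != source:
                total += steps
                pairs += 1
    return total / pairs

def star(n):
    """Toy graph (assumption): node 0 is a hub linked to nodes 1..n-1."""
    graph = {0: set(range(1, n))}
    for leaf in range(1, n):
        graph[leaf] = {0}
    return graph

def ring(n):
    """Toy graph (assumption): every node is linked to its two neighbors only."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}

star_apl = average_path_length(star(8))
ring_apl = average_path_length(ring(8))
print(star_apl, ring_apl)
```

With the same number of nodes, the graph with a hub has the shorter average path, which is the intuition behind hubs keeping every node within a few steps of every other.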
What can we learn from a network?
What can we learn from Biological Networks
Hubs are highly connected nodes
• Hubs tend to be “older” proteins
• Hubs are evolutionarily conserved
Are hubs functionally important?
Hubs are usually critical proteins for the species
[Figure legend: Lethal, Slow-growth, Non-lethal, Unknown]
Jeong et al. Nature 411, 41 - 42 (2001)
Networks can help to predict function
Can the network help to predict function?
Begley TJ, Mol Cancer Res. 2002
• Systematic phenotyping of 1615 gene knockout strains in yeast
• Evaluation of growth of each strain in the presence of MMS (and other DNA damaging agents)
• Screening against a network of 12,232 protein interactions
Mapping the phenotypic data to the network
Begley TJ, Mol Cancer Res. 2002
Networks can help to predict function
Begley TJ, Mol Cancer Res. 2002.
A network approach to predict new drug targets
Aim: to identify critical positions on the ribosome that could be potential targets of new antibiotics
Case Study
Keats (1795-1821) Kafka (1883-1924) Orwell (1903-1950)
Mozart (1756-1791) Schubert (1797-1828) Chopin (1810-1849)
Today…
Infectious diseases are still the number one cause of premature death (0-44 years of age) worldwide.
They kill more than 13 million people annually (~33% of all deaths)
Antibiotic targets of the large ribosomal subunit
The ribosome is a target for approximately half of antibiotics characterized to date
Looking at the ribosome as a network
[Figure: ribosome network; labeled nucleotide: A1191]
1. Critical sites in the ribosome network may represent functional sites (not discovered before)
2. New functional sites may be good sites for drug design
Looking for critical positions in a network
Degree: the number of edges that a node has.
The node with the highest degree in the graph (HUB)
Closeness: measures how close a node is to all other nodes in the network.
The nodes with the highest closeness
Betweenness: quantifies the number of all shortest paths that pass through a node.
The node with the highest betweenness
[Figure: example network marking the node with the highest degree, the node with the highest betweenness, and the nodes with the highest closeness]
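The three measures can be computed on a small graph as in the sketch below. It is an illustration under stated assumptions: the toy graph (two triangles joined through a bridge node "d") is made up, closeness is taken as (n - 1) divided by the sum of shortest-path distances and assumes a connected graph, and betweenness uses Brandes' algorithm for unweighted, undirected graphs. It is not the procedure used in the cited ribosome study.

```python
from collections import deque

def degree(graph):
    """Degree: the number of edges that each node has."""
    return {v: len(graph[v]) for v in graph}

def closeness(graph):
    """(n - 1) / sum of shortest-path distances; assumes a connected graph."""
    result = {}
    for s in graph:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        result[s] = (len(graph) - 1) / sum(dist.values())
    return result

def betweenness(graph):
    """Brandes' algorithm: shortest paths through each node (undirected, unweighted)."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        order = []                          # nodes in order of discovery
        preds = {v: [] for v in graph}      # predecessors on shortest paths
        sigma = {v: 0 for v in graph}       # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):           # accumulate from the farthest nodes back
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # every undirected pair was counted once from each end
    return {v: b / 2 for v, b in score.items()}

# Toy graph (assumption): triangles a-b-c and e-f-g joined through bridge node d
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("e", "f"), ("e", "g"), ("f", "g")]
graph = {}
for u, v in edges:
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)

deg = degree(graph)
clo = closeness(graph)
btw = betweenness(graph)
print(deg["c"], deg["d"], btw["c"], btw["d"], clo["d"])
```

In this toy graph the bridge node d has only two edges yet the highest betweenness and closeness, while c and e have the highest degree, so the three measures do not have to agree on which position is most critical.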
Looking at macromolecular structures as a network
A1191 has the highest closeness, betweenness, and degree.
Which property best characterizes the known functional sites?
How can the network approach help identify functional sites in the ribosome?
Characterize the whole ribosome as a network
Calculate the network properties of each nucleotide
When mutating the critical sites on the ribosome, the bacteria will not grow
[Figure: strong mutations vs mild mutations]
Critical sites on the ribosome have unique network properties
[Figure: network properties for strong mutations vs mild mutations; reported p-values: p~0, p~0, p=0.01]
David-Eden et al, NAR (2008)
‘Druggability Index’ based on the network property
David-Eden et al. NAR (2010)
[Figure: bad site vs good site]
Pockets with the highest ‘Druggability Index’ overlap known drug binding sites
David-Eden et al. NAR (2010)
[Figure: binding pockets labeled DI=1, DI=0.98, DI=0.94, DI=0.93; drugs shown: Erythromycin, Telithromycin, Girodazole]
Course Summary
What did we learn
• Pairwise alignment
Local and global alignments
When? How?
Tools: for local alignment, blast2seq; for global alignment, best to use MSA tools such as Clustal X or MUSCLE
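The difference between global and local alignment shows up directly in the scoring recurrences. The sketch below is a minimal illustration using the textbook Needleman-Wunsch (global) and Smith-Waterman (local) dynamic programs with a linear gap penalty. The scoring values (match=1, mismatch=-1, gap=-2) and the example sequences are assumptions for illustration, not the defaults of blast2seq, Clustal X, or MUSCLE.

```python
def global_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch: best score aligning all of a against all of b."""
    prev = [j * gap for j in range(len(b) + 1)]   # first row: b aligned to gaps only
    for i in range(1, len(a) + 1):
        curr = [i * gap]                          # first column: a aligned to gaps only
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]

def local_score(a, b, match=1, mismatch=-1, gap=-2):
    """Smith-Waterman: best score over all pairs of substrings of a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(0, diag, prev[j] + gap, curr[j - 1] + gap)  # 0 restarts the alignment
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best

# Hypothetical sequences that share only the short region "ACGT"
seq1 = "TTTACGTAAA"
seq2 = "GGACGTCC"
print(global_score(seq1, seq2), local_score(seq1, seq2))
```

When two sequences share only a short region, the local score stays high while the global score is dragged down by the unrelated flanks, which is one way to decide which kind of alignment to ask for.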
What did we learn
• Multiple alignments (MSA)
When? How?
MSAs are needed as input for many different purposes: searching for motifs, phylogenetic analysis, protein and RNA structure prediction, conservation of specific nts/residues
Tools: Clustal X (for DNA and RNA), MUSCLE (for proteins)
Tools for phylogenetic trees: PHYLIP …
What did we learn
• Search a sequence against a database
When? How?
- BLAST: remember the different options for BLAST (blastp, blastn, …) and make sure to search the right database!
Do not forget: you can change the scoring matrices, gap penalties, etc.
- PSI-BLAST
Searching for remote homologs
- PHI-BLAST
Searching for a short pattern within a protein
What did we learn
• Motif search
When? How?
- Searching for known motifs in a given promoter (JASPAR)
- Searching for overabundance of unknown regulatory motifs in a set of sequences, e.g. promoters of genes which have a similar expression pattern (MEME)
Tools: MEME, logo
Databases of motifs: JASPAR (transcription factor binding sites), PRATT in PROSITE (searching for motifs in protein sequences)
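Searching a protein sequence for a known PROSITE-style pattern can be sketched with a regular expression. The example below is a simplified illustration: the converter handles only the common syntax elements (x, [..], {..}, repeat counts, and the < and > anchors outside brackets), the function names are made up, and the test sequence is hypothetical. The pattern shown, N-{P}-[ST]-{P}, is the PROSITE N-glycosylation site pattern.

```python
import re

def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern such as 'N-{P}-[ST]-{P}' into a regex string."""
    regex = pattern.rstrip(".")                            # patterns may end with a period
    regex = regex.replace("-", "")                         # '-' only separates positions
    regex = regex.replace("{", "[^").replace("}", "]")     # {P}  -> any residue except P
    regex = regex.replace("(", "{").replace(")", "}")      # x(2,4) -> repeat count
    regex = regex.replace("x", ".")                        # x    -> any residue
    regex = regex.replace("<", "^").replace(">", "$")      # N- and C-terminal anchors
    return regex

def find_motif(sequence, pattern):
    """Return (0-based start, matched text) for every hit, including overlapping ones."""
    lookahead = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start(), m.group(1)) for m in lookahead.finditer(sequence)]

# Hypothetical protein fragment, used only to exercise the search
protein = "MKNGTAANPSQNVSG"
hits = find_motif(protein, "N-{P}-[ST]-{P}")
print(hits)
```

The candidate site starting at "NPS" is skipped because proline is excluded at the second position, which shows how the {..} element narrows the search.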
What did we learn
• Protein Function Prediction
When? How?
- Pfam (database to search for protein motifs/domains; PfamA/PfamB)
- PROSITE
- Protein annotations in UniProt (Swiss-Prot/TrEMBL)
What did we learn
• Protein Secondary Structure Prediction
When? How?
- Helix/Beta/Coil (PHDsec, PSIPRED).
- Predicts transmembrane helices (PHDhtm, TMHMM).
- Solvent accessibility: important for the prediction of ligand binding sites (PHDacc).
What did we learn
• Protein Tertiary Structure Prediction
When? How?
- First we must look at sequence identity to a sequence with a known structure!
- Homology modeling/Threading
- ModBase: database of models
Remember: low quality models can be misleading!
Tools: SWISS-MODEL, GenTHREADER, ModBase
What did we learn
• RNA Structure and Function Prediction
When? How?
- RNAfold: good for local interactions, several predictions of low energy structures
- Alifold: adding information from MSA
- Rfam
- Specific databases and search tools: tRNA, microRNA …
What did we learn
• Gene expression
When? How?
- Many databases of gene expression: GEO …
- Clustering analysis: EPClust (different clustering methods: K-means, hierarchical clustering; transformations of rows/columns/both…)
- GO annotation (analysis of gene clusters…)
So How do we start …
• Given a hypothetical sequence predict its function….
What should we do???
Example
• Amyloids are proteins which tend to aggregate in solution. Abnormal accumulation of amyloid in organs is assumed to play a role in various neurodegenerative diseases.
Question: can we predict whether a protein X is an amyloid?