Systematic Analysis of Interactome: A New Trend in Bioinformatics

KOCSEA Technical Symposium 2010. Young-Rae Cho, Ph.D., Assistant Professor, Department of Computer Science, Baylor University


Page 1: Systematic Analysis of Interactome: A New Trend in Bioinformatics

Systematic Analysis of Interactome:

A New Trend in Bioinformatics

KOCSEA Technical Symposium 2010

Young-Rae Cho, Ph.D.

Assistant Professor

Department of Computer Science

Baylor University

Page 2: Systematic Analysis of Interactome: A New Trend in Bioinformatics

History of Bioinformatics

Stage 1. Sequence Analysis

• Gene sequencing

• Sequence alignment

• Homolog search

• Motif finding

Page 3: Systematic Analysis of Interactome: A New Trend in Bioinformatics

History of Bioinformatics

Stage 1. Sequence Analysis

• Gene sequencing

• Sequence alignment

• Homolog search

• Motif finding

Stage 2. Structure Analysis

• Protein folding

• Homolog search

• Binding site prediction

• Function prediction

Computational Biology

Page 4: Systematic Analysis of Interactome: A New Trend in Bioinformatics

History of Bioinformatics

Stage 1. Sequence Analysis

• Gene sequencing

• Sequence alignment

• Homolog search

• Motif finding

Stage 2. Structure Analysis

• Protein folding

• Homolog search

• Binding site prediction

• Function prediction

Stage 3. Expression Analysis

• Function prediction

• Gene clustering

• Sample classification

Functional Genomics, Computational Biology

Page 5: Systematic Analysis of Interactome: A New Trend in Bioinformatics

History of Bioinformatics

Stage 1. Sequence Analysis

• Gene sequencing

• Sequence alignment

• Homolog search

• Motif finding

Stage 2. Structure Analysis

• Protein folding

• Homolog search

• Binding site prediction

• Function prediction

Stage 3. Expression Analysis

• Function prediction

• Gene clustering

• Sample classification

Stage 4. Network Analysis

• Network modeling

• Interaction prediction

• Function prediction

• Pathway identification

• Module detection

Systems Biology, Functional Genomics, Computational Biology

Page 6: Systematic Analysis of Interactome: A New Trend in Bioinformatics

Biological Networks

Definition

Maps of biochemical reactions, interactions, and regulatory relationships among genes or proteins

Importance

Provide insights into the mechanisms of molecular function within a cell

Serve as a significant resource for functional characterization of genes or proteins

Require computational and systematic approaches

Examples

Metabolic networks

Protein-protein interaction networks

Genetic interaction networks

Gene regulatory networks (Signal transduction networks)

Page 7: Systematic Analysis of Interactome: A New Trend in Bioinformatics

Protein Interaction Networks

Determination

Experimental methods: yeast two-hybrid (Y2H), mass spectrometry (MS), protein microarray

Computational methods: homolog search, gene fusion analysis, phylogenetic profiles

Interactome: the genome-scale set of protein-protein interactions

Representation

Unweighted, undirected graph

Challenges

Unreliability of interaction data

Large scale

Complex connectivity
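To make the representation on this slide concrete, the following minimal sketch stores an interactome as an unweighted, undirected graph using a dictionary of neighbor sets. The function name `build_network` and the protein identifiers are hypothetical and serve only as an illustration; real interaction lists would come from an experimental or curated data set.

```python
from collections import defaultdict


def build_network(interactions):
    """Build an unweighted, undirected graph as a dict of neighbor sets."""
    graph = defaultdict(set)
    for a, b in interactions:
        if a == b:
            continue  # this sketch ignores self-interactions
        graph[a].add(b)  # undirected: store the edge in both directions
        graph[b].add(a)
    return dict(graph)


# Hypothetical protein identifiers; the last pair repeats an earlier interaction.
pairs = [("P1", "P2"), ("P2", "P3"), ("P3", "P1"), ("P3", "P4"), ("P2", "P1")]
network = build_network(pairs)
degrees = {protein: len(nbrs) for protein, nbrs in network.items()}
```

Using sets means a repeated report of the same interaction does not create a duplicate edge, which matters when several experimental sources are merged.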

Page 8: Systematic Analysis of Interactome: A New Trend in Bioinformatics

Network Re-structuring

Strategy

To resolve complex connectivity

Converts the complex graph to a hierarchical tree structure

Uses the concepts of path strength, functional linkage, and centrality

Process

Input: a protein interaction network

Output: a list of functional modules

Pipeline: unweighted network → edge weighting → weighted network → functional linkage measurement → score matrix → network restructuring → structured network → hub confidence measurement → hubs → network clustering → clusters

Page 9: Systematic Analysis of Interactome: A New Trend in Bioinformatics

Path Strength

Path Strength Model

Assumption: each node in a path chooses a succeeding edge based on the weighted probability

Path Strength Factors

Edge weights

Path length

Node weighted degree
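The slide names the three factors but the transcript does not preserve a formula. As one plausible reading of the stated assumption, the sketch below treats each step as a weighted random choice: the probability of following an edge is its weight divided by the weighted degree of the current node, and the path strength is the product of those probabilities. This is an illustrative assumption, not necessarily the author's exact definition; `path_strength`, `weighted_degree`, and the example weights are hypothetical.

```python
from math import prod


def weighted_degree(graph, node):
    """Sum of the weights of all edges incident to node."""
    return sum(graph[node].values())


def path_strength(graph, path):
    """Product of per-step edge-choice probabilities along path.

    Assumption: step probability = edge weight / weighted degree of the
    node the step leaves from.
    """
    return prod(
        graph[u][v] / weighted_degree(graph, u)
        for u, v in zip(path, path[1:])
    )


# Hypothetical weighted network: graph[u][v] is the weight of edge (u, v).
graph = {
    "A": {"B": 0.9, "C": 0.1},
    "B": {"A": 0.9, "C": 0.5},
    "C": {"A": 0.1, "B": 0.5},
}
one_step = path_strength(graph, ["A", "B"])
two_step = path_strength(graph, ["A", "B", "C"])
```

Under this reading all three factors appear naturally: heavier edges raise the score, every extra step multiplies in a factor of at most 1, and a node with a large weighted degree dilutes each of its outgoing choices.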

Page 10: Systematic Analysis of Interactome: A New Trend in Bioinformatics

Functional Linkage

Measurements

Path strength of the strongest path between two nodes

Computational problem

Needs a heuristic approach

Uses a user-specified threshold of the max path length

Formula

k-length path strength:

Functional linkage:

shortest path length threshold
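The formulas on this slide are not preserved, so the sketch below only illustrates the idea stated in the text: search for the strongest path between two nodes while capping the path length at a user-specified threshold. It assumes, hypothetically, that a step's contribution is the edge weight divided by the weighted degree of the node being left, and that the linkage score is the maximum strength found; the original may combine k-length strengths differently. The names `functional_linkage` and `max_len` are illustrative.

```python
def functional_linkage(graph, source, target, max_len):
    """Strength of the strongest simple path with at most max_len edges."""
    best = 0.0

    def extend(node, strength, visited, length):
        nonlocal best
        if strength <= best:
            return  # factors are at most 1, so this branch cannot improve
        if node == target:
            best = strength
            return
        if length == max_len:
            return  # path length threshold reached
        total = sum(graph[node].values())  # weighted degree of node
        for nxt, weight in graph[node].items():
            if nxt not in visited:
                extend(nxt, strength * weight / total, visited | {nxt}, length + 1)

    extend(source, 1.0, {source}, 0)
    return best


# Hypothetical weighted network: graph[u][v] is the weight of edge (u, v).
graph = {
    "A": {"B": 0.9, "C": 0.1},
    "B": {"A": 0.9, "C": 0.5},
    "C": {"A": 0.1, "B": 0.5},
}
direct_only = functional_linkage(graph, "A", "C", max_len=1)
up_to_two = functional_linkage(graph, "A", "C", max_len=2)
```

In the example the weak direct edge A-C is beaten by the two-step path through B once the threshold allows it. The threshold keeps the search tractable, since the number of simple paths grows quickly with length in a densely connected network.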

Page 11: Systematic Analysis of Interactome: A New Trend in Bioinformatics

Network Restructuring

Centrality

Weighted closeness:

Algorithm

Computes centrality for each node a

Selects a set of ancestor nodes, T(a), of a by

Selects a parent node, p(a), of a by

Example
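The selection rules for T(a) and p(a) are missing from the transcript, so the following is only a sketch of how such a restructuring could work, under three loudly stated assumptions: centrality is the sum of a node's linkage scores to all other nodes, the ancestor candidates of a are the nodes with strictly higher centrality, and the parent is the candidate most strongly linked to a. The function `restructure` and the score matrix are hypothetical.

```python
def restructure(score):
    """Turn a pairwise score matrix into a parent map (a tree or forest).

    score[a][b] is a linkage score between a and b (hypothetical input).
    """
    # Assumption: weighted closeness is approximated by the sum of scores.
    centrality = {a: sum(row.values()) for a, row in score.items()}
    parent = {}
    for a in score:
        # Assumption: ancestor candidates are the more central nodes.
        ancestors = [t for t in score if centrality[t] > centrality[a]]
        if ancestors:
            # Assumption: the parent is the most strongly linked candidate.
            parent[a] = max(ancestors, key=lambda t: score[a].get(t, 0.0))
        else:
            parent[a] = None  # no more central node: a is a root
    return centrality, parent


# Hypothetical symmetric score matrix for four proteins.
score = {
    "A": {"B": 0.8, "C": 0.6, "D": 0.1},
    "B": {"A": 0.8, "C": 0.2, "D": 0.4},
    "C": {"A": 0.6, "B": 0.2, "D": 0.1},
    "D": {"A": 0.1, "B": 0.4, "C": 0.1},
}
centrality, parent = restructure(score)
```

Because every node attaches only to a strictly more central node, the parent links cannot form a cycle, which is what makes the result a hierarchy. Nodes tied for the highest centrality each become a root in this sketch.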

Page 12: Systematic Analysis of Interactome: A New Trend in Bioinformatics

Hub Confidence

Measurement

Selects a set of child nodes, D(a), of a by

Selects a set of descendant nodes, L_a, of a by

Computes the hub confidence, H(a), of a by

Example
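The hub confidence formula itself is not preserved in the transcript, so this sketch covers only the two tree sets it is built on: the children D(a) and the descendants L_a of a node, derived from a parent map. The parent map and function names are hypothetical inputs chosen for illustration.

```python
from collections import defaultdict


def children_of(parent):
    """Invert a parent map into a map from each node to its child list."""
    children = defaultdict(list)
    for node, p in parent.items():
        if p is not None:
            children[p].append(node)
    return children


def descendants_of(children, a):
    """All nodes below a in the tree, excluding a itself."""
    found, stack = [], list(children.get(a, []))
    while stack:
        node = stack.pop()
        found.append(node)
        stack.extend(children.get(node, []))
    return found


# Hypothetical parent map: A is the root, E is the deepest node.
parent = {"A": None, "B": "A", "C": "A", "D": "B", "E": "D"}
children = children_of(parent)
below_a = sorted(descendants_of(children, "A"))
below_b = sorted(descendants_of(children, "B"))
```

Any score that compares a node with its sub-tree, such as a hub confidence, can be computed from these two sets once its definition is known.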

Page 13: Systematic Analysis of Interactome: A New Trend in Bioinformatics

Clustering

Algorithm

Iteratively select a hub a with the highest hub confidence

Output the sub-tree L_a including a as a cluster (functional module)

Cluster Depth

The max path length from the root of the sub-tree to a leaf

Example
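The two steps above can be sketched as follows, assuming the hub confidence values are already available as input (their formula is not in the transcript, so the numbers below are made up). The sketch emits one sub-tree per hub in decreasing order of confidence and reports its cluster depth; whether the original method stops early or treats nested sub-trees differently is not stated, so that part is an assumption. All names are hypothetical.

```python
def subtree(children, root):
    """The root together with all of its descendants."""
    nodes, stack = [], [root]
    while stack:
        node = stack.pop()
        nodes.append(node)
        stack.extend(children.get(node, []))
    return nodes


def cluster_depth(children, root):
    """Max path length (in edges) from root down to a leaf."""
    depth, frontier = 0, list(children.get(root, []))
    while frontier:
        depth += 1
        frontier = [c for n in frontier for c in children.get(n, [])]
    return depth


def clusters_by_hub(children, hub_confidence):
    """One (hub, members, depth) tuple per hub, highest confidence first."""
    ranked = sorted(hub_confidence, key=hub_confidence.get, reverse=True)
    return [
        (hub, sorted(subtree(children, hub)), cluster_depth(children, hub))
        for hub in ranked
    ]


# Hypothetical tree and hypothetical hub confidence values.
children = {"A": ["B", "C"], "B": ["D"], "D": ["E"]}
hub_confidence = {"A": 0.9, "B": 0.7, "D": 0.3}
clusters = clusters_by_hub(children, hub_confidence)
```

Because each cluster is a sub-tree, clusters rooted lower in the hierarchy are nested inside those rooted higher up, which yields modules at several levels of generality.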

Page 14: Systematic Analysis of Interactome: A New Trend in Bioinformatics

Topological Assessment of Hubs

Network Vulnerability

Random attack: repeatedly disrupt a randomly selected node

Degree-based hub attack: repeatedly disrupt the highest-degree node

Structural hub attack: repeatedly disrupt the node with the highest hub confidence

For each iteration, observes the size of the largest component

Results

Figure: fraction of largest component (y-axis, 0.60 to 1.00) versus number of nodes (x-axis, 0 to 160) under random attack, degree-based hub attack, and structural hub attack
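The attack experiment can be sketched as a loop that removes one node per iteration and records the size of the largest connected component. The sketch below covers the random and degree-based strategies; a structural hub attack would plug in a selector that ranks nodes by hub confidence, which is omitted because its formula is not in the transcript. Dividing by the original network size and recomputing degrees after each removal are assumptions, and the function names and toy network are hypothetical.

```python
import random


def largest_component_size(graph, removed):
    """Size of the largest connected component, ignoring removed nodes."""
    seen, best = set(removed), 0
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for nbr in graph[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        best = max(best, size)
    return best


def attack(graph, pick_next, steps):
    """Remove one node per step; return the largest-component fractions."""
    removed, fractions = set(), []
    for _ in range(steps):
        removed.add(pick_next(graph, removed))
        # Assumption: the fraction is relative to the original network size.
        fractions.append(largest_component_size(graph, removed) / len(graph))
    return fractions


def pick_random(graph, removed, rng=random.Random(0)):
    """Random attack; the shared seeded generator keeps runs repeatable."""
    return rng.choice(sorted(n for n in graph if n not in removed))


def pick_highest_degree(graph, removed):
    """Degree-based hub attack, using degrees in the remaining network."""
    alive = [n for n in graph if n not in removed]
    return max(alive, key=lambda n: sum(1 for m in graph[n] if m not in removed))


# Hypothetical toy network: hub H plus one extra edge between A and B.
graph = {
    "H": {"A", "B", "C", "D"},
    "A": {"H", "B"},
    "B": {"H", "A"},
    "C": {"H"},
    "D": {"H"},
}
degree_curve = attack(graph, pick_highest_degree, steps=2)
random_curve = attack(graph, pick_random, steps=3)
```

In the toy network, removing the hub first splits the graph immediately, so the degree-based curve drops faster than a random one typically would. That contrast is what the plotted curves compare.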

Page 15: Systematic Analysis of Interactome: A New Trend in Bioinformatics

Biological Assessment of Hubs

Protein Lethality

Determines lethal / viable proteins by knock-out experiment

Lethality represents functional essentiality

Orders proteins by degree and by hub confidence

Observes the cumulative proportion of lethal proteins for every 10 proteins

Results

Figure: average lethality (y-axis, 0.4 to 1.0) versus number of hubs (x-axis, 0 to 140) for structural hubs and degree-based hubs
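The lethality curve described on this slide can be computed as a running proportion sampled at fixed intervals of the ranking. The sketch below assumes a ranked protein list and a set of lethal proteins as inputs; the identifiers, the function name `cumulative_lethality`, and the small step size used in the example are hypothetical (the slide uses a step of 10).

```python
def cumulative_lethality(ranked_proteins, lethal, step=10):
    """Cumulative proportion of lethal proteins, sampled every step proteins."""
    points, lethal_count = [], 0
    for i, protein in enumerate(ranked_proteins, start=1):
        lethal_count += protein in lethal  # True counts as 1
        if i % step == 0:
            points.append((i, lethal_count / i))
    return points


# Hypothetical ranking (by degree or by hub confidence) and lethal set.
ranked = ["p1", "p2", "p3", "p4", "p5", "p6"]
lethal = {"p1", "p2", "p4"}
curve = cumulative_lethality(ranked, lethal, step=2)
```

Running the same function on the degree-based ranking and on the hub-confidence ranking gives the two curves that the figure compares.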

Page 16: Systematic Analysis of Interactome: A New Trend in Bioinformatics

Topological Assessment of Clusters

Modularity

A combined measure of density within each cluster and separability among clusters

Estimated by the ratio of the number of edges within a cluster (sub-graph) to the number of all edges starting from the nodes in the cluster (sub-graph)

Observes the average modularity of clusters with respect to the cluster depth

Results

More specific functional modules have higher modularity

This supports the general-to-specific concept of hierarchical functional modules

Figure: average modularity (y-axis, 0 to 180) versus cluster depth (x-axis, 1 to 12)
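The modularity ratio defined on this slide can be sketched directly from an adjacency structure. The sketch interprets "all edges starting from the nodes in the cluster" as every edge with at least one endpoint in the cluster, counted once; that interpretation is an assumption, and the function name and toy network are hypothetical.

```python
def modularity(graph, cluster):
    """Edges inside the cluster divided by all edges touching the cluster."""
    cluster = set(cluster)
    internal = boundary = 0
    for node in cluster:
        for nbr in graph[node]:
            if nbr in cluster:
                internal += 1  # seen twice, once from each endpoint
            else:
                boundary += 1  # seen once, from the endpoint inside
    internal //= 2
    total = internal + boundary
    return internal / total if total else 0.0


# Hypothetical network: a triangle A-B-C with a tail C-D-E.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}
dense = modularity(graph, {"A", "B", "C"})
loose = modularity(graph, {"C", "D"})
```

The triangle scores higher than the pair C, D because most of its edges stay inside the cluster, which matches the slide's reading of modularity as density combined with separability.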

Page 17: Systematic Analysis of Interactome: A New Trend in Bioinformatics

Biological Assessment of Clusters

f-Measure

Compares each output cluster X with the real functional annotation Y (from MIPS)

Recall = (# of common proteins of X and Y) / (# of proteins in Y)

Precision = (# of common proteins of X and Y) / (# of proteins in X)

f-measure = 2 × Recall × Precision / (Recall + Precision)

Results

Compared with the results from previous hierarchical clustering methods, e.g., edge-betweenness (top-down approach) and ProDistIn (bottom-up approach)
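The three formulas on this slide translate directly into set operations. In the sketch below, the protein identifiers are hypothetical, and returning 0 when X and Y share no proteins is an added convention to avoid dividing by zero.

```python
def f_measure(cluster, annotation):
    """f-measure of an output cluster X against a reference annotation Y."""
    cluster, annotation = set(cluster), set(annotation)
    common = len(cluster & annotation)
    if common == 0:
        return 0.0  # convention for the no-overlap case
    recall = common / len(annotation)
    precision = common / len(cluster)
    return 2 * recall * precision / (recall + precision)


# Hypothetical cluster X and annotation Y sharing three proteins.
x = {"p1", "p2", "p3", "p4"}
y = {"p2", "p3", "p4", "p5"}
score = f_measure(x, y)
```

Here recall and precision are both 3/4, so the f-measure is also 3/4. A cluster that is much larger than the annotation keeps high recall but loses precision, and the harmonic mean penalizes that imbalance.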

Page 18: Systematic Analysis of Interactome: A New Trend in Bioinformatics

Conclusion

Motivation

Significant functional knowledge in protein interaction networks (interactome)

Complex connectivity

Contributions

Convert an unstructured network to a structured network

Conserve functional information through pathways

Hubs with high network vulnerability and low functional lethality as candidate drug targets

Applicable to various fields, e.g., social networks, WWW

Provide a foundation for studying structural dynamics during network evolution

Page 19: Systematic Analysis of Interactome: A New Trend in Bioinformatics

Reference

Y.-R. Cho and A. Zhang, "Identification of functional modules by converting interactome networks into hierarchical ordering of proteins," BMC Bioinformatics, 11(Suppl 3):S3, 2010

Questions?