20131216 stat journal

Topological network alignment 20131216 Statistics journal

Upload: medku

Post on 11-May-2015


DESCRIPTION

Kuchaiev O, Milenkovic T, Memisevic V, Hayes W, Przulj N. Topological network alignment uncovers biological function and phylogeny. J R Soc Interface. 2010 Sep 6;7(50):1341-54. doi: 10.1098/rsif.2010.0063. Epub 2010 Mar 17. http://www.ncbi.nlm.nih.gov/pubmed/20236959

Milenković T, Przulj N. Uncovering biological network function via graphlet degree signatures. Cancer Inform. 2008;6:257-73. Epub 2008 Apr 14. http://www.ncbi.nlm.nih.gov/pubmed/19259413

TRANSCRIPT

Page 1: 20131216 Stat Journal

Topological network alignment

20131216 Statistics journal

Page 2: 20131216 Stat Journal
Page 3: 20131216 Stat Journal

Result

G(V, E) H(U, F)

EC = 0.089

Page 4: 20131216 Stat Journal

Motivation

Human / Yeast

Are two networks the same or similar?

Large-scale networks such as interactomes

Page 5: 20131216 Stat Journal

Theoretical background

Network or graph
Collection of nodes (vertices) and connections between them (edges).
Examples: biology, social communication, and web pages

Page 6: 20131216 Stat Journal

Theoretical background

Graphs G and H
Node sets V and U (|V| ≤ |U|)
Edge sets E ⊆ V × V and F ⊆ U × U
Possible graphs: for G

G(V, E) H(U, F)

Page 7: 20131216 Stat Journal

Theoretical background

Graph comparison
Subgraph isomorphism: is G an exact subgraph of H?
NP-complete: efficient algorithms are not known.

Graph alignment
Fitting G into H
Edge correctness (EC): the percentage of edges in E that are aligned to edges in F
NP-hard

G(V, E) H(U, F)
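Edge correctness as defined on this slide can be sketched in a few lines. This is a minimal illustration, assuming undirected edges stored as pairs and a node mapping stored as a dictionary; the function name `edge_correctness` and the toy graphs are hypothetical and not taken from the slides.

```python
def edge_correctness(edges_g, edges_h, mapping):
    """Return the fraction of edges of G whose images under `mapping` are edges of H."""
    # Store H's edges as unordered pairs so (a, b) and (b, a) match.
    h_edges = {frozenset(e) for e in edges_h}
    aligned = sum(
        1 for a, b in edges_g
        if frozenset((mapping[a], mapping[b])) in h_edges
    )
    return aligned / len(edges_g)


# Toy example (assumed data): G is a triangle, H is a 4-node path.
edges_g = [(1, 2), (2, 3), (1, 3)]
edges_h = [("a", "b"), ("b", "c"), ("c", "d")]
mapping = {1: "a", 2: "b", 3: "c"}
ec = edge_correctness(edges_g, edges_h, mapping)
print(f"EC = {ec:.3f}")
```

Two of the three edges of the triangle land on edges of the path, so the mapping cannot reach an EC of 1 here; multiplying the returned fraction by 100 gives the percentage used on the slide.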

Page 8: 20131216 Stat Journal

Previous approaches

Local alignment: ambiguous, because one node can get different pairings
Mappings are chosen independently for local regions of similarity.
PathBLAST: homology information
NetworkBLAST: conserved protein clusters, found with a likelihood method
MaWISh: evolution (sequence alignment)
GRAEMLIN: dense conserved subgraphs with phylogeny

Global alignment
Provides a unique alignment from each node in the smaller graph to exactly one node in the larger graph.
ISORANK: maximizes the overall match
GRAEMLIN: trained on known graph alignments and phylogeny

Page 9: 20131216 Stat Journal

New approaches

Requires no a priori information
Sequence, homology, clusters, phylogeny, and known alignments

Topological similarity
Orbit, graphlet, and signature similarity

Of course, a priori information can still be used.

Yes, with GRAAL.

Page 10: 20131216 Stat Journal

n-node graphlet and automorphism orbits

Page 11: 20131216 Stat Journal

n-node graphlet and automorphism orbits

Figure labels: graphlet, orbit, topologically relevant

Page 12: 20131216 Stat Journal

Graphlet Degree Vector

Page 13: 20131216 Stat Journal

Graphlet Degree Vector

Page 14: 20131216 Stat Journal

Graphlet Degree Vector

Page 15: 20131216 Stat Journal

Graphlet Degree Vector

Page 16: 20131216 Stat Journal

n-node graphlet and automorphism orbits

A node at orbit 15 touches orbits 0, 1, 4, and 15 once each.
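The idea of counting how often a node touches each orbit can be illustrated on the smallest graphlets. The sketch below is an assumption-laden toy, not the authors' implementation: it covers only orbits 0 to 3 (the 2-node and 3-node graphlets) rather than all orbits of the 2- to 5-node graphlets, and the function name `graphlet_degrees_0_to_3` is hypothetical.

```python
from itertools import combinations


def graphlet_degrees_0_to_3(adj, v):
    """Count how often node v touches orbits 0-3 (graphlets with 2 or 3 nodes).

    adj maps each node to the set of its neighbours (undirected, no self-loops).
    """
    nbrs = adj[v]
    orbit0 = len(nbrs)  # orbit 0: v is an endpoint of an edge (its degree)
    # Orbit 1: v is an end of an induced 3-node path v - a - b.
    orbit1 = sum(1 for a in nbrs for b in adj[a] if b != v and b not in nbrs)
    # Orbits 2 and 3: pairs of neighbours of v that are unconnected
    # (v is the middle of a path) or connected (v is in a triangle).
    orbit2 = orbit3 = 0
    for a, b in combinations(nbrs, 2):
        if b in adj[a]:
            orbit3 += 1
        else:
            orbit2 += 1
    return [orbit0, orbit1, orbit2, orbit3]


# Toy graph (assumed data): triangle 1-2-3 with a pendant node 4 attached to node 3.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
for node in sorted(adj):
    print(node, graphlet_degrees_0_to_3(adj, node))
```

The same principle extends to larger graphlets, where a node at one orbit also touches the orbits of the smaller graphlets contained in it, as the orbit 15 example on this slide shows.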

Page 17: 20131216 Stat Journal

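Signature similarity
Weight vector

$w_i = 1 - \frac{\log(o_i)}{\log(73)}$, where o_i is the number of orbits that affect orbit i (including itself).

w_i lies in [0, 1]; w_i = 1 means that orbit i is not affected by any other orbit.

o_15 = 4, o_44 = 5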

Page 18: 20131216 Stat Journal

Signature similarity

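For a node u, u_i denotes the i-th coordinate of its signature vector. The distance between the i-th orbits of nodes u and v is

$$D_i(u, v) = w_i \cdot \frac{|\log(u_i + 1) - \log(v_i + 1)|}{\log(\max\{u_i, v_i\} + 2)}$$

The total distance between u and v is

$$D(u, v) = \frac{\sum_{i=0}^{72} D_i(u, v)}{\sum_{i=0}^{72} w_i}$$

The signature similarity is

$$S(u, v) = 1 - D(u, v)$$

S = 1 means that the signatures of u and v are identical (D = 0).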
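A small sketch of the signature similarity may help make the definitions concrete. It follows the weight, distance, and similarity formulas of the cited graphlet degree signature paper as best they can be reconstructed; the function names, the truncated 4-orbit signatures, and the dependency counts `o` used in the example are illustrative assumptions, not values from the slides.

```python
import math


def orbit_weights(o, num_orbits=73):
    """w_i = 1 - log(o_i) / log(num_orbits); o_i = number of orbits affecting orbit i."""
    return [1 - math.log(o_i) / math.log(num_orbits) for o_i in o]


def signature_similarity(sig_u, sig_v, weights):
    """Return S(u, v) = 1 - D(u, v) for two signature vectors of equal length."""
    total = 0.0
    for u_i, v_i, w_i in zip(sig_u, sig_v, weights):
        # Weighted, log-scaled difference between the two orbit counts.
        diff = abs(math.log(u_i + 1) - math.log(v_i + 1))
        total += w_i * diff / math.log(max(u_i, v_i) + 2)
    return 1 - total / sum(weights)


# Assumed example: signatures truncated to orbits 0-3, with illustrative o_i values.
weights = orbit_weights([1, 2, 2, 2])
sig_a = [2, 1, 0, 1]
sig_b = [1, 2, 0, 0]
s_ab = signature_similarity(sig_a, sig_b, weights)
print(f"S(a, b) = {s_ab:.3f}")
print(f"S(a, a) = {signature_similarity(sig_a, sig_a, weights):.3f}")
```

Adding 1 inside the logarithms and 2 in the denominator keeps the expression defined when an orbit count is 0.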

Page 19: 20131216 Stat Journal

GRAph ALigner algorithm (GRAAL)

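Compute the cost C(v, u) of aligning each node v in G with each node u in H:

$$C(v, u) = 2 - \left( (1 - \alpha) \cdot \frac{\deg(v) + \deg(u)}{\mathrm{maxdeg}(G) + \mathrm{maxdeg}(H)} + \alpha \cdot S(v, u) \right)$$

The first term reflects density and the second reflects topology; deg(v) is the degree of node v.

The costs form a matrix with rows V and columns U (all pairs of nodes).
Align the densest parts first: the node pair with the minimal cost is the seed.
Greedily align the nodes in the spheres around the seed.
Repeat the seed and sphere steps until all nodes in the smaller graph are aligned.

GRAAL uses only topological information. Biological information can also be incorporated into the cost.

G(V, E) H(U, F)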
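The cost computation and seed selection can be sketched as follows. This assumes the cost function of the cited GRAAL paper, with a density term built from node degrees and a topology term built from signature similarity; the function names, the default `alpha=0.8`, and the toy degrees and similarities are hypothetical choices for illustration.

```python
def alignment_cost(deg_v, deg_u, max_deg_g, max_deg_h, similarity, alpha=0.8):
    """Cost of aligning v with u; lower means a better (denser, more similar) pair."""
    density = (deg_v + deg_u) / (max_deg_g + max_deg_h)
    return 2 - ((1 - alpha) * density + alpha * similarity)


def cost_matrix(degrees_g, degrees_h, similarity, alpha=0.8):
    """Return {(v, u): cost} for all pairs of nodes v in G and u in H."""
    max_g = max(degrees_g.values())
    max_h = max(degrees_h.values())
    return {
        (v, u): alignment_cost(degrees_g[v], degrees_h[u], max_g, max_h,
                               similarity[v, u], alpha)
        for v in degrees_g
        for u in degrees_h
    }


# Assumed toy data: node degrees and signature similarities S(v, u).
degrees_g = {"v1": 2, "v2": 1}
degrees_h = {"u1": 3, "u2": 1}
similarity = {("v1", "u1"): 0.9, ("v1", "u2"): 0.4,
              ("v2", "u1"): 0.3, ("v2", "u2"): 0.8}
costs = cost_matrix(degrees_g, degrees_h, similarity)
seed = min(costs, key=costs.get)  # minimal-cost pair; ties resolve to the first found
print(seed, round(costs[seed], 2))
```

Note that `min` breaks ties deterministically, whereas the slides state that a seed is chosen at random among minimal-cost pairs.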

Page 20: 20131216 Stat Journal

GRAAL
Search for the densest part and align it.

Find the node pair with the minimal cost (the seed). If several pairs share the minimal cost, choose one at random.

G(V, E) H(U, F)

Page 21: 20131216 Stat Journal

GRAAL
Search for the densest part and align it.

Seed node pair

G(V, E) H(U, F)

Page 22: 20131216 Stat Journal

GRAAL
Make spheres and align.

Make spheres S_G(v, r) and S_H(u, r) around the seed nodes v and u.
Greedily align the nodes in S_G(v, r) and S_H(u, r) with the minimal cost.

G(V, E) H(U, F)

Distance: length of the shortest path

Page 23: 20131216 Stat Journal

GRAAL
Make spheres and align.

Figure labels: v, u, S_G(v, r), S_H(u, r)

Page 24: 20131216 Stat Journal

GRAAL
Make spheres and align.

Align

Distance: length of the shortest path
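The sphere step can be sketched with a breadth-first search followed by a greedy matching. This is a minimal illustration under stated assumptions: a sphere is taken to be the set of nodes at shortest-path distance exactly r from the seed, ties are broken by node name rather than at random, and the names `sphere` and `greedy_align` as well as the toy graphs and costs are hypothetical.

```python
from collections import deque


def sphere(adj, center, radius):
    """Nodes whose shortest-path distance from `center` is exactly `radius`."""
    dist = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        if dist[node] == radius:
            continue  # no need to explore beyond the requested radius
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return {n for n, d in dist.items() if d == radius}


def greedy_align(sphere_g, sphere_h, cost, aligned):
    """Align the cheapest still-unaligned pairs from the two spheres, one by one."""
    used_h = set(aligned.values())
    pairs = sorted((cost[v, u], v, u) for v in sphere_g for u in sphere_h)
    for _, v, u in pairs:
        if v not in aligned and u not in used_h:
            aligned[v] = u
            used_h.add(u)
    return aligned


# Assumed toy data: two star graphs with seed pair (v, u) already aligned.
adj_g = {"v": {"a", "b"}, "a": {"v"}, "b": {"v"}}
adj_h = {"u": {"x", "y", "z"}, "x": {"u"}, "y": {"u"}, "z": {"u"}}
cost = {("a", "x"): 1.2, ("a", "y"): 1.5, ("a", "z"): 1.9,
        ("b", "x"): 1.1, ("b", "y"): 1.6, ("b", "z"): 1.8}
aligned = greedy_align(sphere(adj_g, "v", 1), sphere(adj_h, "u", 1), cost, {"v": "u"})
print(aligned)
```

In the example, node z of the larger graph stays unaligned, which is expected because each node of the smaller graph is matched to exactly one node of the larger one.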

Page 25: 20131216 Stat Journal

GRAAL
Expand radii of spheres and align.

Make spheres S_G(v, r) and S_H(u, r) around the seed nodes v and u.
Greedily align the nodes in S_G(v, r) and S_H(u, r) with the minimal cost.

Distance: length of the shortest path

G(V, E) H(U, F)

Aligned node

Page 26: 20131216 Stat Journal

GRAAL
Expand radii of spheres and align.

Aligned node

radii :

Page 27: 20131216 Stat Journal

GRAAL
Expand radii of spheres up to 3.

Aligned node

Page 28: 20131216 Stat Journal

GRAAL
Expand radii of spheres up to 3.

Aligned node

radii :

Page 29: 20131216 Stat Journal

GRAAL
Expand radii of spheres up to 3.

Some nodes are not aligned.

Aligned node

radii :

Page 30: 20131216 Stat Journal

GRAAL
Repeat with new edge networks G^p(V, E^p) and H^p(U, F^p).

In G^p(V, E^p), two nodes are joined by an edge when the distance between them is at most p (p ≤ 2).
Spheres S_{G^p}(v, r) and S_{H^p}(u, r) start again from r = 1.

Distance: length of the shortest path

Aligned node

Page 31: 20131216 Stat Journal

GRAAL
Repeat with new edge networks G^p(V, E^p).

p ≤ 2, r = 1

Path 6 → 12 → 25 can be replaced by a single edge between its end nodes, which is analogous to an insertion or deletion.
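The construction of the new edge network can be sketched as a graph power: every pair of nodes within distance p receives an edge. The sketch assumes this reading of the slide and uses a hypothetical function name, `graph_power`; the three-node path reuses the node labels 6, 12, and 25 from the slide.

```python
from collections import deque


def graph_power(adj, p):
    """Return the adjacency sets of G^p: nodes joined when 1 <= distance <= p."""
    power = {}
    for source in adj:
        # Breadth-first search from `source`, stopping at depth p.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            if dist[node] == p:
                continue
            for nbr in adj[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        power[source] = set(dist) - {source}
    return power


# Path 6 - 12 - 25: in G^2 the end nodes 6 and 25 become directly connected.
adj = {6: {12}, 12: {6, 25}, 25: {12}}
g2 = graph_power(adj, 2)
print(sorted(g2[6]))
```

With p = 1 the function returns the original network, so the first pass of the algorithm is the special case p = 1 of the same procedure.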

Page 32: 20131216 Stat Journal

GRAAL
Repeat with new edge networks G^p(V, E^p) and H^p(U, F^p), p ≤ 2.

New seed

Spheres S_{G^p}(v, r) and S_{H^p}(u, r) with r = 1 are made around the new seed.

Distance: length of the shortest path

Aligned node

Page 33: 20131216 Stat Journal

GRAAL
Repeat with new edge networks G^p(V, E^p) and H^p(U, F^p), p ≤ 2.

New seed

Page 34: 20131216 Stat Journal

GRAAL
Each node in G is aligned to exactly one node in H.

Aligned node

Distance: length of the shortest path

G(V, E) H(U, F)

Page 35: 20131216 Stat Journal

Alignment score
Edge correctness: the percentage of edges in G that are aligned to edges in H.

Node correctness: the percentage of nodes in G that are aligned to the correct nodes in H. The correct mapping must be known.

Interaction correctness: the percentage of interactions that are aligned correctly. The correct interactions must be known.

G(V, E) H(U, F)

GRAAL function and the correct node mapping from G to H:
g: V → U, f: V → U
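Node correctness needs a known correct mapping to compare against. The sketch below assumes both mappings are dictionaries from nodes of G to nodes of H; the function name `node_correctness` and the example mappings are hypothetical.

```python
def node_correctness(alignment, correct):
    """Fraction of nodes of G that the alignment maps to their correct node in H."""
    matches = sum(1 for v, u in alignment.items() if correct.get(v) == u)
    return matches / len(alignment)


# Assumed toy data: the alignment gets two of four nodes right.
alignment = {1: "a", 2: "b", 3: "d", 4: "c"}
correct = {1: "a", 2: "b", 3: "c", 4: "d"}
nc = node_correctness(alignment, correct)
print(f"NC = {nc:.2f}")
```

Unlike edge correctness, this score cannot be computed for real networks whose true node correspondence is unknown.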

Page 36: 20131216 Stat Journal

Statistical significance

Consider a random mapping between the nodes of G(V, E) and H(U, F). The probability P of successfully aligning k or more edges by chance is the tail of the hypergeometric distribution:

$$P = \sum_{i=k}^{m_2} \frac{\binom{m_2}{i} \binom{N - m_2}{m_1 - i}}{\binom{N}{m_1}}$$

G(V, E) H(U, F)

n_1 = |V|, n_2 = |U|, m_1 = |E|, m_2 = |F|
k: the number of edges from G that are aligned to edges in H.
N: the number of node pairs in H.
EC: edge correctness
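The hypergeometric tail can be evaluated exactly with integer binomial coefficients. This is a sketch under the assumption that N is the number of unordered node pairs in H, n2(n2 - 1)/2, and that m1 and m2 are the edge counts of G and H; the function name `alignment_p_value` and the example numbers are hypothetical.

```python
from math import comb


def alignment_p_value(n2, m1, m2, k):
    """P(k or more of G's m1 edges land on H's m2 edges under a random mapping)."""
    total_pairs = n2 * (n2 - 1) // 2  # N: number of node pairs in H
    upper = min(m1, m2)  # terms with i > m1 are zero, so the sum can stop here
    tail = sum(
        comb(m2, i) * comb(total_pairs - m2, m1 - i)
        for i in range(k, upper + 1)
    )
    return tail / comb(total_pairs, m1)


# Assumed toy numbers: H has 4 nodes and 3 edges, G has 2 edges.
p_two = alignment_p_value(n2=4, m1=2, m2=3, k=2)
print(f"P(k >= 2) = {p_two:.2f}")
```

For a real alignment, k would be derived from the edge correctness, for example as the number of edges of G that the alignment maps onto edges of H.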

Page 37: 20131216 Stat Journal

Result

G(V, E) H(U, F)

EC = 0.089