![Page 1: TORQUE: T OPOLOGY -F REE Q UERYING OF P ROTEIN I NTERACTION N ETWORKS Sharon Bruckner 1, Falk Hüffner 1, Richard M. Karp 2, Ron Shamir 1, and Roded Sharan](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2a5503460f94c4457a/html5/thumbnails/1.jpg)
TORQUE: TOPOLOGY-FREE QUERYING OF PROTEIN INTERACTION NETWORKS
Sharon Bruckner 1, Falk Hüffner 1, Richard M. Karp 2, Ron Shamir 1, and Roded Sharan 1
1 School of Computer Science, Tel Aviv University
2 International Computer Science Institute, Berkeley, CA
![Page 2: TORQUE: T OPOLOGY -F REE Q UERYING OF P ROTEIN I NTERACTION N ETWORKS Sharon Bruckner 1, Falk Hüffner 1, Richard M. Karp 2, Ron Shamir 1, and Roded Sharan](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2a5503460f94c4457a/html5/thumbnails/2.jpg)
OUR GOAL: NETWORK QUERYING
Start with a protein-protein interaction network of some species A.
We seek subnetworks that match complexes or pathways.
Network Querying: Given a protein complex from another species B, identify the subnetwork of A that is most similar to it.
Why network querying? A match hints at an evolutionarily conserved region, so we can infer the functionality of the matched region.
![Page 3: TORQUE: T OPOLOGY -F REE Q UERYING OF P ROTEIN I NTERACTION N ETWORKS Sharon Bruckner 1, Falk Hüffner 1, Richard M. Karp 2, Ron Shamir 1, and Roded Sharan](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2a5503460f94c4457a/html5/thumbnails/3.jpg)
PREVIOUS METHODS
Assume knowledge of the interactions within the query complex (the topology).
Look for a match in the network with the same topology.
Examples: QNet (Dost et al., 2008), GraphFind (Ferro et al., 2008).
![Page 4: TORQUE: T OPOLOGY -F REE Q UERYING OF P ROTEIN I NTERACTION N ETWORKS Sharon Bruckner 1, Falk Hüffner 1, Richard M. Karp 2, Ron Shamir 1, and Roded Sharan](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2a5503460f94c4457a/html5/thumbnails/4.jpg)
NO NEED FOR TOPOLOGY!
Interaction information is noisy and incomplete, and for some species it is not available at all.
![Page 5: TORQUE: T OPOLOGY -F REE Q UERYING OF P ROTEIN I NTERACTION N ETWORKS Sharon Bruckner 1, Falk Hüffner 1, Richard M. Karp 2, Ron Shamir 1, and Roded Sharan](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2a5503460f94c4457a/html5/thumbnails/5.jpg)
THE PROBLEM
Input:
Graph G = (V, E), |V| = n, |E| = m
Color set {1, 2, ..., k}
A coloring of the network vertices
![Page 6: TORQUE: T OPOLOGY -F REE Q UERYING OF P ROTEIN I NTERACTION N ETWORKS Sharon Bruckner 1, Falk Hüffner 1, Richard M. Karp 2, Ron Shamir 1, and Roded Sharan](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2a5503460f94c4457a/html5/thumbnails/6.jpg)
THE PROBLEM
We seek: Is there a connected subgraph of G that has exactly one vertex of each color?
Call such a subgraph “colorful”.
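The definition can be made concrete with a small checker. This is only an illustrative sketch: the function name `is_colorful_connected`, the adjacency-dictionary representation, and the toy graph are assumptions, not part of the talk. It tests whether a given vertex subset is colorful (exactly one vertex of each color 1..k) and induces a connected subgraph.

```python
from collections import deque

def is_colorful_connected(adj, color, subset, k):
    """Return True if `subset` has exactly one vertex of each color 1..k
    and induces a connected subgraph of the graph given by `adj`."""
    subset = set(subset)
    if not subset:
        return False
    # Colorful: the colors of the chosen vertices are exactly 1, 2, ..., k.
    if sorted(color[v] for v in subset) != list(range(1, k + 1)):
        return False
    # Connected: breadth-first search that never leaves the subset.
    start = next(iter(subset))
    seen = {start}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u in subset and u not in seen:
                seen.add(u)
                queue.append(u)
    return seen == subset

# Hypothetical toy network: the path a - b - c - d - e.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
color = {"a": 1, "b": 2, "c": 3, "d": 1, "e": 2}
print(is_colorful_connected(adj, color, {"a", "b", "c"}, 3))
print(is_colorful_connected(adj, color, {"a", "c", "e"}, 3))
```

Checking one subset is easy; the difficulty of the problem lies in the number of candidate subsets.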
![Page 7: TORQUE: T OPOLOGY -F REE Q UERYING OF P ROTEIN I NTERACTION N ETWORKS Sharon Bruckner 1, Falk Hüffner 1, Richard M. Karp 2, Ron Shamir 1, and Roded Sharan](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2a5503460f94c4457a/html5/thumbnails/7.jpg)
ABOUT THE PROBLEM
NP-complete.
Hard even when the graph is a tree with maximum degree 3, via a reduction from 3SAT (Fellows et al., 2007).
Our contributions:
A fixed-parameter dynamic programming algorithm.
An integer linear program.
Fast heuristics.
An implementation using a combination of the above.
![Page 8: TORQUE: T OPOLOGY -F REE Q UERYING OF P ROTEIN I NTERACTION N ETWORKS Sharon Bruckner 1, Falk Hüffner 1, Richard M. Karp 2, Ron Shamir 1, and Roded Sharan](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2a5503460f94c4457a/html5/thumbnails/8.jpg)
DEFINING THE BASIC DP ALGORITHM
Input: A graph where each vertex is colored by one of k colors.
Output: A colorful tree.
Every connected subgraph has a spanning tree.
Every colorful connected subgraph therefore has a colorful spanning tree.
Instead of looking for a colorful subgraph, look for a colorful tree.
Input: A graph where each vertex is colored by one of k colors.
Output: The highest-scoring colorful tree.
![Page 9: TORQUE: T OPOLOGY -F REE Q UERYING OF P ROTEIN I NTERACTION N ETWORKS Sharon Bruckner 1, Falk Hüffner 1, Richard M. Karp 2, Ron Shamir 1, and Roded Sharan](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2a5503460f94c4457a/html5/thumbnails/9.jpg)
DYNAMIC PROGRAMMING ALGORITHM (FELLOWS ET AL., 2008)
One row for each vertex; one column for each subset of colors, in increasing size.

| vertices | S1 | S2 | S3 | S4 |
|---|---|---|---|---|
| v1 | 0 | 0 | None | 3.4 |
| v2 | 0 | None | 2.3 | 2 |
| v3 | None | 0 | 3.15 | None |
| v4 | None | None | 13.5 | 7.42 |
| v5 | 0 | 0 | 6.4 | 8.1 |

Entry (v3, S3): score of the best tree rooted in v3 that is colored exactly by S3.
IDEA: Instead of looking at all $n^k$ possible subgraphs, look only at all $2^k$ color sets.
![Page 10: TORQUE: T OPOLOGY -F REE Q UERYING OF P ROTEIN I NTERACTION N ETWORKS Sharon Bruckner 1, Falk Hüffner 1, Richard M. Karp 2, Ron Shamir 1, and Roded Sharan](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2a5503460f94c4457a/html5/thumbnails/10.jpg)
DYNAMIC PROGRAMMING ALGORITHM
The last column contains, for every vertex v, the score of the highest-scoring tree rooted in v that is colored by all the colors of the query!
Running time: $O(3^k |E|)$.

$$T(v, S) = \max_{\substack{u \in N(v) \\ S_1 \uplus S_2 = S}} \bigl\{ T(v, S_1) + T(u, S_2) + w(u, v) \bigr\}$$
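A minimal sketch of how the recurrence could be filled in, assuming colors are numbered 0..k-1, color sets are stored as bitmasks, and each vertex has exactly one color. The function name `colorful_tree_scores` and the input format are illustrative assumptions, not the TORQUE implementation. The base case, which the slide leaves implicit, is taken to be a score of 0 for a single vertex with its own color.

```python
from math import inf

def colorful_tree_scores(n, edges, color, k):
    """T[v][S] = best score of a tree rooted in v whose vertices use exactly
    the colors in bitmask S, each once (-inf if no such tree exists).
    `edges` is a list of (u, v, weight); `color[v]` is in range(k)."""
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    T = [[-inf] * (1 << k) for _ in range(n)]
    for v in range(n):
        T[v][1 << color[v]] = 0.0          # base case: the single vertex v
    for S in range(1, 1 << k):             # every proper submask of S is < S
        for v in range(n):
            cv = 1 << color[v]
            if not (S & cv) or S == cv:
                continue
            best = -inf
            S1 = (S - 1) & S                # enumerate proper non-empty submasks
            while S1:
                if (S1 & cv) and T[v][S1] > -inf:
                    S2 = S ^ S1             # the colors handed to the neighbor's tree
                    for u, w in adj[v]:
                        if T[u][S2] > -inf:
                            best = max(best, T[v][S1] + T[u][S2] + w)
                S1 = (S1 - 1) & S
            T[v][S] = best
    return T

# Hypothetical example: path 0 - 1 - 2 with three distinct colors.
T = colorful_tree_scores(3, [(0, 1, 1.0), (1, 2, 2.0)], [0, 1, 2], 3)
print(max(T[v][0b111] for v in range(3)))
```

The answer to the query is the maximum of the last column, `T[v][(1 << k) - 1]`, over all vertices v. Enumerating all pairs (S, S1) with S1 a subset of S is what gives the $3^k$ factor in the running time.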
![Page 11: TORQUE: T OPOLOGY -F REE Q UERYING OF P ROTEIN I NTERACTION N ETWORKS Sharon Bruckner 1, Falk Hüffner 1, Richard M. Karp 2, Ron Shamir 1, and Roded Sharan](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2a5503460f94c4457a/html5/thumbnails/11.jpg)
EXAMPLE
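[Figure: worked example of the recurrence for a vertex v, a neighbor u, and the edge weight w; the color set in T(v, { }) is shown graphically.]

$$T(v, S) = \max_{\substack{u \in N(v) \\ S_1 \uplus S_2 = S}} \bigl\{ T(v, S_1) + T(u, S_2) + w(u, v) \bigr\}$$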
![Page 12: TORQUE: T OPOLOGY -F REE Q UERYING OF P ROTEIN I NTERACTION N ETWORKS Sharon Bruckner 1, Falk Hüffner 1, Richard M. Karp 2, Ron Shamir 1, and Roded Sharan](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2a5503460f94c4457a/html5/thumbnails/12.jpg)
EXTENSION 1: ALLOWING DELETIONS: MATCHING WITH FEWER COLORS
![Page 13: TORQUE: T OPOLOGY -F REE Q UERYING OF P ROTEIN I NTERACTION N ETWORKS Sharon Bruckner 1, Falk Hüffner 1, Richard M. Karp 2, Ron Shamir 1, and Roded Sharan](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2a5503460f94c4457a/html5/thumbnails/13.jpg)
EXTENSION 2: ALLOWING INSERTIONS: SPECIAL NON-COLORED VERTICES, ARBITRARY VERTICES
![Page 14: TORQUE: T OPOLOGY -F REE Q UERYING OF P ROTEIN I NTERACTION N ETWORKS Sharon Bruckner 1, Falk Hüffner 1, Richard M. Karp 2, Ron Shamir 1, and Roded Sharan](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2a5503460f94c4457a/html5/thumbnails/14.jpg)
ALLOWING NON-COLORED INSERTIONS
For j insertions, we would expect a running time of $O(3^{k+j} m)$.
Can show: $O(3^k m j)$. Make j copies of each column, and recursively solve:
B(v, S, j’) = Highest score of a tree, rooted in v, colored by S, using exactly j’ insertions
![Page 15: TORQUE: T OPOLOGY -F REE Q UERYING OF P ROTEIN I NTERACTION N ETWORKS Sharon Bruckner 1, Falk Hüffner 1, Richard M. Karp 2, Ron Shamir 1, and Roded Sharan](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2a5503460f94c4457a/html5/thumbnails/15.jpg)
FORMULA & EXAMPLE
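$$B(v, S, j') = \max_{\substack{u \in N(v) \\ S_1 \uplus S_2 = S \\ j_1 + j_2 = j'}} \bigl\{ B(v, S_1, j_1) + B(u, S_2, j_2) + w(u, v) \bigr\} \quad \text{(otherwise)}$$

[The formula on the slide has an additional first case, which is not legible in the source.]
[Figure: example graph on the vertices a, b, c, d, e, f, g.]
Running time: $O(3^k m j)$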
![Page 16: TORQUE: T OPOLOGY -F REE Q UERYING OF P ROTEIN I NTERACTION N ETWORKS Sharon Bruckner 1, Falk Hüffner 1, Richard M. Karp 2, Ron Shamir 1, and Roded Sharan](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2a5503460f94c4457a/html5/thumbnails/16.jpg)
EXTENSION 3: ALLOWING MULTIPLE COLORS PER VERTEX
![Page 17: TORQUE: T OPOLOGY -F REE Q UERYING OF P ROTEIN I NTERACTION N ETWORKS Sharon Bruckner 1, Falk Hüffner 1, Richard M. Karp 2, Ron Shamir 1, and Roded Sharan](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2a5503460f94c4457a/html5/thumbnails/17.jpg)
PUTTING IT TOGETHER…
[Figure: example network with weighted edges.]
![Page 18: TORQUE: T OPOLOGY -F REE Q UERYING OF P ROTEIN I NTERACTION N ETWORKS Sharon Bruckner 1, Falk Hüffner 1, Richard M. Karp 2, Ron Shamir 1, and Roded Sharan](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2a5503460f94c4457a/html5/thumbnails/18.jpg)
A SECOND APPROACH
Formulate the problem as an integer linear program (ILP).
Use efficient ILP solvers.
![Page 19: TORQUE: T OPOLOGY -F REE Q UERYING OF P ROTEIN I NTERACTION N ETWORKS Sharon Bruckner 1, Falk Hüffner 1, Richard M. Karp 2, Ron Shamir 1, and Roded Sharan](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2a5503460f94c4457a/html5/thumbnails/19.jpg)
ILP at a glance
Want: a subset T of the vertices.
Formulate colorfulness:
Only vertices in T are colored.
Every vertex should get at most one color.
Every color should be given to at most one vertex.
Formulate connectivity. Find a flow such that:
Only vertices in T can be involved in the flow.
Flow of k-1, with a single sink and k-1 sources.
Every source is connected to the sink via flow edges.
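One way these requirements could be written as linear constraints is sketched below. This is an assumption-laden illustration, not the program shown in the talk: the variable names $c(v,\gamma)$, $x(v)$, $r(v)$, $f(u,v)$ and the color list $\Gamma(v)$ of a vertex are introduced here, no objective is given, and the sketch covers the case without deletions, so every color is used exactly once.

```latex
\begin{align*}
& c(v,\gamma) \in \{0,1\} \quad \forall v,\ \gamma \in \Gamma(v)
  && \text{vertex } v \text{ is assigned color } \gamma \\
& x(v) = \sum_{\gamma \in \Gamma(v)} c(v,\gamma) \le 1 \quad \forall v
  && \text{at most one color per vertex; } x(v) = 1 \iff v \in T \\
& \sum_{v} c(v,\gamma) = 1 \quad \forall \gamma
  && \text{each color goes to exactly one vertex} \\
& r(v) \in \{0,1\},\quad r(v) \le x(v),\quad \sum_{v} r(v) = 1
  && \text{a single sink, chosen inside } T \\
& 0 \le f(u,v) \le (k-1)\,x(u),\quad f(u,v) \le (k-1)\,x(v)
  && \text{flow only between vertices of } T \\
& \sum_{u \in N(v)} f(v,u) - \sum_{u \in N(v)} f(u,v) = x(v) - k\,r(v) \quad \forall v
  && \text{each source emits } 1 \text{, the sink absorbs } k-1
\end{align*}
```

Under these constraints every non-sink vertex of T must push one unit of flow to the sink along edges whose endpoints both lie in T, which is possible exactly when T is connected.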
![Page 20: TORQUE: T OPOLOGY -F REE Q UERYING OF P ROTEIN I NTERACTION N ETWORKS Sharon Bruckner 1, Falk Hüffner 1, Richard M. Karp 2, Ron Shamir 1, and Roded Sharan](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2a5503460f94c4457a/html5/thumbnails/20.jpg)
The Integer Linear Program
![Page 21: TORQUE: T OPOLOGY -F REE Q UERYING OF P ROTEIN I NTERACTION N ETWORKS Sharon Bruckner 1, Falk Hüffner 1, Richard M. Karp 2, Ron Shamir 1, and Roded Sharan](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2a5503460f94c4457a/html5/thumbnails/21.jpg)
Heuristic Speedups
First do data reduction:
Only 5% of the vertices are associated with one or more query colors.
Many non-colored vertices are too far from any colored vertex to be useful.
For each remaining connected component:
Try a shortest-paths based heuristic that does not allow mismatches.
If this fails: if there are few colors but the instance is large, use dynamic programming; otherwise, use ILP.
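The data reduction step could look roughly like the sketch below, which drops every non-colored vertex that is more than `max_dist` hops from the nearest colored vertex. The function name `prune_far_vertices`, the adjacency-dictionary format, and the choice of threshold are assumptions. One safe choice, if at most j insertions are allowed, is `max_dist = j`: an inserted vertex in a solution reaches a colored vertex of the same tree through at most j-1 other inserted vertices.

```python
from collections import deque

def prune_far_vertices(adj, colored, max_dist):
    """Keep the colored vertices and every vertex within `max_dist` hops of
    one of them; return the adjacency dict restricted to the kept vertices."""
    dist = {v: 0 for v in colored}
    queue = deque(colored)                 # multi-source breadth-first search
    while queue:
        v = queue.popleft()
        if dist[v] == max_dist:
            continue                       # do not expand beyond the threshold
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    keep = set(dist)
    return {v: [u for u in adj[v] if u in keep] for v in adj if v in keep}

# Hypothetical example: path a - b - c - d - e where only "a" is colored.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
reduced = prune_far_vertices(path, ["a"], 2)
print(sorted(reduced))
```

The connected components of the reduced graph can then be handled one at a time.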
![Page 22: TORQUE: T OPOLOGY -F REE Q UERYING OF P ROTEIN I NTERACTION N ETWORKS Sharon Bruckner 1, Falk Hüffner 1, Richard M. Karp 2, Ron Shamir 1, and Roded Sharan](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2a5503460f94c4457a/html5/thumbnails/22.jpg)
IMPLEMENTATION, EXPERIMENTS & RESULTS
![Page 23: TORQUE: T OPOLOGY -F REE Q UERYING OF P ROTEIN I NTERACTION N ETWORKS Sharon Bruckner 1, Falk Hüffner 1, Richard M. Karp 2, Ron Shamir 1, and Roded Sharan](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2a5503460f94c4457a/html5/thumbnails/23.jpg)
Experiments
We applied our method to query complexes within: yeast (5430 proteins, 39936 interactions), fly (6650 proteins, 21275 interactions), and human (7915 proteins, 28972 interactions).
Queries: yeast, fly, human, bovine, mouse, and rat.
![Page 24: TORQUE: T OPOLOGY -F REE Q UERYING OF P ROTEIN I NTERACTION N ETWORKS Sharon Bruckner 1, Falk Hüffner 1, Richard M. Karp 2, Ron Shamir 1, and Roded Sharan](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2a5503460f94c4457a/html5/thumbnails/24.jpg)
COMPARISON WITH OTHER METHODS
Most previous work tested queries with a known topology.
We compare our results with those of QNet (Dost et al., 2008), designed to tackle topology-based queries.
QNet uses color coding to tackle the subgraph homeomorphism problem, allowing insertions and deletions.
![Page 25: TORQUE: T OPOLOGY -F REE Q UERYING OF P ROTEIN I NTERACTION N ETWORKS Sharon Bruckner 1, Falk Hüffner 1, Richard M. Karp 2, Ron Shamir 1, and Roded Sharan](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2a5503460f94c4457a/html5/thumbnails/25.jpg)
Comparison with QNet
![Page 26: TORQUE: T OPOLOGY -F REE Q UERYING OF P ROTEIN I NTERACTION N ETWORKS Sharon Bruckner 1, Falk Hüffner 1, Richard M. Karp 2, Ron Shamir 1, and Roded Sharan](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2a5503460f94c4457a/html5/thumbnails/26.jpg)
Results Evaluation
Functional coherence: used GO TermFinder for functional enrichment in T.
Specificity:
Looked at the overlap between T and known complexes in the target species.
Compared to overlap between random subgraphs and the known complexes.
Corrected for multiple testing using FDR (q<0.05).
Quality match: Functionally coherent and specific.
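The slide does not say which FDR procedure was used. As an illustration only, the sketch below assumes the common Benjamini-Hochberg step-up procedure; the function name `benjamini_hochberg` and the p-values are made up for the example.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans, True where the hypothesis is rejected
    while controlling the false discovery rate at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank r (1-based) whose sorted p-value is <= r * q / m.
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * q / m:
            cutoff = rank
    # Reject the hypotheses with the `cutoff` smallest p-values.
    rejected = [False] * m
    for i in order[:cutoff]:
        rejected[i] = True
    return rejected

# Made-up p-values for ten candidate matches.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(pvals, q=0.05))
```

A match would then count as specific only if its hypothesis is rejected at the chosen level.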
![Page 27: TORQUE: T OPOLOGY -F REE Q UERYING OF P ROTEIN I NTERACTION N ETWORKS Sharon Bruckner 1, Falk Hüffner 1, Richard M. Karp 2, Ron Shamir 1, and Roded Sharan](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2a5503460f94c4457a/html5/thumbnails/27.jpg)
SELECTED RESULTS
![Page 28: TORQUE: T OPOLOGY -F REE Q UERYING OF P ROTEIN I NTERACTION N ETWORKS Sharon Bruckner 1, Falk Hüffner 1, Richard M. Karp 2, Ron Shamir 1, and Roded Sharan](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2a5503460f94c4457a/html5/thumbnails/28.jpg)
SUMMARY
The PPI network querying problem motivates the colorful connected subgraph problem. A fixed-parameter dynamic programming algorithm, allowing insertions, deletions, and multiple colors per vertex, along with an ILP formulation and heuristics, obtains good results.
Thanks: Nir Yosef, the TAU Computational Genomics group, and the Computational System Biology group.
Israel Science Foundation, Edmond J. Safra Bioinformatics Program, Tel Aviv Univ.
![Page 29: TORQUE: T OPOLOGY -F REE Q UERYING OF P ROTEIN I NTERACTION N ETWORKS Sharon Bruckner 1, Falk Hüffner 1, Richard M. Karp 2, Ron Shamir 1, and Roded Sharan](https://reader036.vdocument.in/reader036/viewer/2022070403/56649f2a5503460f94c4457a/html5/thumbnails/29.jpg)
REFERENCES
[FFHV07] M. R. Fellows, G. Fertin, D. Hermelin, and S. Vialette. Borderlines for finding connected motifs in vertex-colored graphs. In Proc. ICALP’07, volume 4596, pages 340–351. Springer-Verlag, 2007.
[N06] R. Niedermeier. Invitation to Fixed-Parameter Algorithms. Number 31 in Oxford Lecture Series in Mathematics and Its Applications. Oxford University Press, 2006.
[BFKN08] N. Betzler, M. R. Fellows, C. Komusiewicz, and R. Niedermeier. Parameterized algorithms and hardness results for some graph motif problems. In Proc. 19th CPM, volume 5029 of LNCS, pages 31–43. Springer, 2008.
[AYZ95] N. Alon, R. Yuster, and U. Zwick. Color coding. Journal of the ACM, 42:844–856, 1995.
[DSGRBS08] B. Dost, T. Shlomi, N. Gupta, E. Ruppin, V. Bafna, and R. Sharan. QNet: A tool for querying protein interaction networks. Journal of Computational Biology, 15(7):913–925, 2008.