Post on 18-Dec-2015
Identifying functional subnetworks in large-scale datasets
Benno Schwikowski
Institut Pasteur – Systems Biology Group
http://systemsbiology.fr
Benno Schwikowski
The three levels of this talk
1. Discovery of pathways active in HepC infection
2. Cytoscape plug-ins
3. Cytoscape platform
Hepatitis C infection
• One person out of 30 is infected
• No vaccine exists
• In 20% of chronic infections, liver fibrosis and cirrhosis
• Frequently requires liver transplants
Studying HepC infection mRNA changes
• 50% of transplant livers become re-infected with Hepatitis C
• Study expression of 7000 genes in re-infected livers after transplantation
– 1-24 months post-transplant
– Samples in 3-6 month intervals
• 28 biopsies from 11 patients
– Mixture of hepatocytes, hepatic stellate cells, Kupffer cells, various types of blood cells
• Compare against pre-transplant reference pool
Result of mRNA expression analysis
• Most genes (5968 of 7000) were significantly under- or overexpressed in one or more experiments
• High patient-to-patient variation
Our approach
1. Construct seed network among known molecular players
2. Expand seed network to include differentially expressed genes
3. Identify putative pathways by the Active Modules approach
Seed network
Types of interactions: Protein-protein, Protein-DNA, Phosphorylation, Activation, Repression, Covalent bond, Methylation
InteractionFetcher plug-in
Purpose
• Dynamically retrieves remote information for selected nodes
– From SQL database
– Requests data via XML-RPC protocol
Currently implemented types
• Protein/gene synonyms
• Orthologs
• Sequences (DNA, protein, DNA upstream)
– Gene, protein,
• Interactions/associations
Options
• Cross-species queries
• Ortholog information from Homologene
• Inferred interactions (interologs)
• Interactive links to source Web pages
100% open-source (client and server)
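The slide names XML-RPC as the transport but gives neither the server URL nor its method names. As a rough sketch of what such a request could look like on the wire, the snippet below serializes a call with Python's standard `xmlrpc.client` module without sending it anywhere. The method name `fetcher.getInteractions`, its parameters, and the gene names are hypothetical placeholders, not the plug-in's actual API.

```python
import xmlrpc.client


def build_request(method_name, gene_names, species):
    """Serialize an XML-RPC method call to a string (nothing is sent)."""
    params = (list(gene_names), species)
    return xmlrpc.client.dumps(params, methodname=method_name)


# Hypothetical method name and arguments, for illustration only.
payload = build_request("fetcher.getInteractions", ["TP53", "MDM2"], "Homo sapiens")
print(payload)

# Decoding the payload recovers the parameters and the method name.
decoded_params, decoded_method = xmlrpc.client.loads(payload)
```

A real client would send the call through `xmlrpc.client.ServerProxy` pointed at the server's URL, which the slides do not give.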
2. Expand seed network
Purpose
• Bring significantly up-/downregulated genes “into the picture”
Approach
• Add interactions with differentially expressed genes (“in silico pull-down”)
– Use BIND, HPRD databases
– Only human-curated interactions
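The “in silico pull-down” step can be sketched as a simple filter over a list of curated interactions. This is a minimal illustration that assumes the expansion adds only direct interaction partners of seed nodes; the function name and the toy identifiers are made up, and no real BIND or HPRD records are used.

```python
def expand_seed_network(seed_nodes, curated_interactions, differential_genes):
    """Return curated interactions that link a seed node to a
    differentially expressed gene outside the seed network."""
    seed = set(seed_nodes)
    differential = set(differential_genes)
    added_edges = []
    for a, b in curated_interactions:
        if a in seed and b in differential and b not in seed:
            added_edges.append((a, b))
        elif b in seed and a in differential and a not in seed:
            added_edges.append((b, a))  # store the seed node first
    return added_edges


# Toy data; the identifiers are invented for illustration.
seed = {"S1", "S2"}
interactions = [("S1", "G1"), ("G2", "S2"), ("G3", "G4"), ("S1", "S2")]
differential = {"G1", "G2", "G4"}
new_edges = expand_seed_network(seed, interactions, differential)
print(new_edges)  # [('S1', 'G1'), ('S2', 'G2')]
```

G4 is differentially expressed but stays out because it has no curated interaction with a seed node.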
Identifying putative pathways: why clustering can be problematic
• Many clustering methods are not model-based, so the significance of clusters is unclear
• Any given cluster may not be supported by all experiments – noise problem
• Clusters tend to contain unrelated genes with vaguely similar profiles
The three levels of this talk
1. Discovery of pathways active in HepC infection
2. Cytoscape plug-ins
3. Cytoscape platform
How can the clustering issues be addressed? The ActiveModules plug-in
• Define “up-/downregulated” on the basis of a well-defined statistical model
• Also derive clusters supported by only some of the input experiments
• Use additional evidence to focus on “plausible” clusters: protein interactions
A lot of interaction data is becoming available
Databases on...
• Protein-protein interactions
• Protein-DNA interactions
• Genetic interactions
• Metabolic pathways
• Cell signaling pathways, similarity relationships, literature-based relationships
Multi-criteria detection of modules
[Figure: 1. Interaction network between genes/proteins; 2. Differential gene/protein abundances/activities, shown as a matrix of genes by experiments (perturbations/conditions)]
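Scoring a module candidate
Rank adjustment: binomial summation
$P_z = 1 - \Phi(z_{A(j)})$
$p_{A(j)} = \sum_{h=j}^{m} \binom{m}{h} P_z^{h} (1 - P_z)^{m-h}$
$r_{A(j)} = \Phi^{-1}(1 - p_{A(j)})$
$\Phi$ = standard normal cumulative distribution function
m = total number of conditions
j = size of subset of conditions
Final score
Ideker, Ozier, Schwikowski, Siegel (2002): Bioinformatics 18, S233-240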
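The rank adjustment can be sketched in a few lines of Python using only the standard library. This sketch rests on two assumptions not spelled out on the slide: that the j-th score refers to the j-th largest of a module's m per-condition z-scores, and that the final score is the maximum of the rank-adjusted scores over j. The function names and the clamping constant `EPS` are illustrative choices.

```python
from math import comb
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1
EPS = 1e-12  # keeps inv_cdf arguments strictly inside (0, 1)


def rank_adjusted_scores(z_scores):
    """Return the rank-adjusted score r for every subset size j = 1..m."""
    m = len(z_scores)
    ordered = sorted(z_scores, reverse=True)  # assumption: j-th largest score
    scores = []
    for j in range(1, m + 1):
        # Probability that a single condition reaches this z-score by chance.
        p_z = 1.0 - STD_NORMAL.cdf(ordered[j - 1])
        # Binomial summation: at least j of the m conditions score this high.
        p_aj = sum(
            comb(m, h) * p_z**h * (1.0 - p_z) ** (m - h)
            for h in range(j, m + 1)
        )
        # Convert back to a z-like score, clamping to avoid domain errors.
        q = min(max(1.0 - p_aj, EPS), 1.0 - EPS)
        scores.append(STD_NORMAL.inv_cdf(q))
    return scores


def final_score(z_scores):
    """Assumed final score: the best rank-adjusted score over all j."""
    return max(rank_adjusted_scores(z_scores))


print(final_score([2.1, 0.3, 1.8, -0.5]))
```

With a single condition (m = 1) the adjustment leaves the z-score unchanged, which is a convenient sanity check.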
The three levels of this talk
1. Discovery of pathways active in HepC infection
2. Cytoscape plug-ins
3. Cytoscape platform
Active Modules plug-in applied to HCV re-infection data
• Iterative application results in four significant, highly overlapping subnetworks
• Repeat analysis only retaining “late-active” re-infection experiments
– Eliminates pathways activated by transplant operation
– Cutoff: 8 months
Which observations can we make locally?
Network after InteractionFetcher expansion
Bold: Differentially regulated subnetwork
Red/Green: Late-active subnetwork
Cytotalk plug-in
• Analysis of overrepresentation of genes in Gene Ontology classes, using the Cytotalk plug-in and R
• Cytotalk enables interactive communication with
– C/C++ programs
– Java processes
– Python
– UNIX shell scripts
– R, R scripts
• Can be run on same machine or any other Internet-connected machine
• Can function as Cytoscape plug-in
• 100% open-source
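The slides do not say which statistical test was run in R for the Gene Ontology overrepresentation analysis. A common choice for this kind of question is the one-sided hypergeometric test, sketched below in plain Python as an assumption rather than a record of the actual analysis; the function name and the example numbers are illustrative.

```python
from math import comb


def hypergeom_upper_tail(population, class_size, sample_size, overlap):
    """P(X >= overlap) when sample_size genes are drawn without
    replacement from population genes, class_size of which belong
    to the Gene Ontology class of interest."""
    total = comb(population, sample_size)
    upper = min(class_size, sample_size)
    favourable = sum(
        comb(class_size, x) * comb(population - class_size, sample_size - x)
        for x in range(overlap, upper + 1)
    )
    return favourable / total


# Illustrative numbers only: 7000 measured genes, a class of 100 genes,
# a subnetwork of 50 genes, 5 of which fall into the class.
p_value = hypergeom_upper_tail(7000, 100, 50, 5)
print(p_value)
```

A small p-value suggests that the class is overrepresented in the subnetwork; with many classes tested, a multiple-testing correction would normally follow.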
The three levels of this talk
1. Discovery of pathways active in HepC infection
2. Cytoscape plug-ins
3. Cytoscape platform
Some Network Visualization Tools
• Pajek - Slovenia
• Osprey - SLRI, Toronto
• VisANT - BU
• Biolayout - EBI
• GraphViz
• PowerPoint
• Others
• Cytoscape (only open-source biology)
Cytoscape Basic Concepts
• Objects: visualized as nodes
• Relationships: visualized as edges
• Attributes (name, sequence, source, ...)
• Mapping of attributes to drawing: customizable through visual mapper
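To make the idea of mapping attributes to drawing properties concrete, here is a small language-neutral sketch in Python. It is not Cytoscape's Java API: the `Node` class, the attribute name `log_ratio`, and the colour scheme are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    attributes: dict = field(default_factory=dict)


def expression_to_color(log_ratio, limit=2.0):
    """Map a log expression ratio to an (r, g, b) tuple: green for
    down-regulation, white for no change, red for up-regulation."""
    clipped = max(-limit, min(limit, log_ratio))
    fade = int(round(255 * (1.0 - abs(clipped) / limit)))
    if clipped >= 0:
        return (255, fade, fade)
    return (fade, 255, fade)


# Hypothetical node with one attribute driving its fill colour.
node = Node("GENE0", {"log_ratio": 1.0})
fill = expression_to_color(node.attributes["log_ratio"])
print(node.name, fill)
```

Keeping the mapping separate from the node data is what lets the same network be redrawn under a different visual style.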
Cytoscape file formats
Sample interaction file
YDR216W pd YIL056W
YDR216W pd YKR042W
YDR216W pd YGL096W
YDR216W pd YDR077W
[...]
GENE DESC exp0.sig exp1.sig exp0.sig exp1.sig
GENE0 G0 0.0 0.0 23.2 11.5
GENE1 G1 0.0 0.0 34.6 5.2
GENE2 G2 0.0 0.0 10.0 28.0
GENE3 G3 0.0 0.0 1.64 4.77
[...]
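Both sample files are plain whitespace-delimited text, so they can be read with a few lines of Python. The sketch below is not Cytoscape's own parser; it assumes that each interaction line has the form source, interaction type, then one or more targets, and that the description column of the tabular file is a single token. Because the sample header repeats column names, the numeric values are kept by position rather than by name.

```python
def parse_sif(text):
    """Parse interaction lines into (source, type, target) triples."""
    edges = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank lines and nodes without interactions
        source, interaction_type, targets = fields[0], fields[1], fields[2:]
        edges.extend((source, interaction_type, t) for t in targets)
    return edges


def parse_expression(text):
    """Parse a header row plus rows of gene, description, numeric values."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    columns = lines[0].split()[2:]
    rows = {}
    for line in lines[1:]:
        fields = line.split()
        rows[fields[0]] = [float(v) for v in fields[2:]]
    return columns, rows


sif_text = "YDR216W pd YIL056W\nYDR216W pd YKR042W\n"
table_text = (
    "GENE DESC exp0.sig exp1.sig exp0.sig exp1.sig\n"
    "GENE0 G0 0.0 0.0 23.2 11.5\n"
)
edges = parse_sif(sif_text)
columns, rows = parse_expression(table_text)
print(edges)
print(rows["GENE0"])
```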
Cytoscape
http://www.cytoscape.org/
Display
• gene & protein expression
• protein interactions (physical and non-physical)
• protein classifications
Analysis plug-in modules
• Java: platform independent + web-start
• 100% open-source
Cytoscape Core – Differences to most other approaches
• Emphasis on data analysis & integration
• No built-in semantics (added by plug-ins)
• Very simple concepts
• Human-readable input formats
• Extensibility
Cytoscape extensibility
• Core: 100% open source Java
– Plug-in API
– Plug-ins are independently licensed
• “Just need to do the biology”
• Template code samples
Biomodules plug-in
Prinz S, Avila-Campillo I, Aldridge C, Srinivasan A, Dimitrov K, Siegel AF, and Galitski T. Genome Res. 2004 14: 380-390
Cytoscape Plugins
• Modules in Complex Networks (Iliana Avila-Campillo, Tim Galitski)
• Discovering Regulatory and Signaling Circuits in Molecular Interaction Networks (Trey Ideker, Owen Ozier, Benno Schwikowski, Andrew Siegel)
• Data Integration in Juvenile Diabetes Research (Marta Janer, Paul Shannon)
• A network motif sampler (David Reiss, Benno Schwikowski)
Cytoscape Core Features
• Visualize and lay out networks
• Display network data using visual styles
• Easily organize multiple networks
• Bird’s eye view navigation of large networks
• Supports SIF and GML, molecular profiling formats, node/edge attributes
• Functional annotation from GO + KEGG
• Metanode support (hierarchical groupings)
• Extensible through plugins (20 developed)
Collaborators: HCV
Institute for Systems Biology, Seattle, WA
• David Reiss
• Iliana Avila-Campillo
• Vesteinn Thorsson
• Tim Galitski
Collaborators: Cytoscape
• ISB: Leroy Hood, Rowan Christmas
• Agilent Technologies
• Unilever PLC
• Long-term funding from NIH and participating institutions
• UCSD: Trey Ideker, Chris Workman
• Memorial Sloan-Kettering Cancer Center: Chris Sander, Gary Bader, Ethan Cerami
• Pasteur: Melissa Cline, Andrea Splendiani, Tero Aittokallio
Shannon, P., et al. (2003). Cytoscape: A software environment for integrated models of biomolecular interaction networks. Genome Res 13, 2498-504.
Collaborators: Active Networks
• Trey Ideker
• Owen Ozier
• Andrew Siegel
• Richard Karp