pathway studio workgroup/enterprise training course

©2006 Ariadne Genomics. All Rights Reserved. Pathway Studio Workgroup/Enterprise training course

Upload: vachel

Post on 08-Jan-2016

DESCRIPTION

Pathway Studio Workgroup/Enterprise training course. DAY 1: Technology overview, System architecture. Products: Pathway Studio Desktop, Pathway Studio Workgroup, Pathway Studio Enterprise. Main functionality: Data mining and pathway building; Analysis of high-throughput data.

TRANSCRIPT

Page 1: Pathway Studio Workgroup/Enterprise training course

©2006 Ariadne Genomics. All Rights Reserved.

Pathway Studio Workgroup/Enterprise training course

Page 2: Pathway Studio Workgroup/Enterprise training course

DAY 1
Technology overview
System architecture

Page 3: Pathway Studio Workgroup/Enterprise training course

Products

• Pathway Studio Desktop
• Pathway Studio Workgroup
• Pathway Studio Enterprise

Main functionality:
1) Data mining and pathway building
2) Analysis of high-throughput data
3) Text-mining and fact extraction

Page 4: Pathway Studio Workgroup/Enterprise training course

Ariadne Corporate Offering
Software solution for knowledge management and pathway analysis of high-throughput data

Diagram labels:
• Knowledge Databases
• ResNet Biological Association Networks
• Pathway Building, Pathway collection
• MedScan: 1000 abstracts/min
• Proprietary data
• Public interaction data
• Analysis of High-Throughput data
• Text-mining

Page 5: Pathway Studio Workgroup/Enterprise training course

Accomplishments (April, 2007)

188 publications using AGI software and ResNet database
• Gene expression microarray analysis (105)
• Pathway Analysis (80)
• Disease mechanism (64)
• Human genetics (7)
• Publication by Ariadne Authors (13)
• Text processing (9)
• Reviews (6)
• Databases (3)
• Drug discovery (16)
• Toxicogenomics (4)

Page 6: Pathway Studio Workgroup/Enterprise training course

Pathway Studio Workgroup client-server architecture

Diagram labels:
• Database
• Read-only users
• Data curators
• Third party tools, in-house applications, API SQL interface, bulk data management
• PSW administrator

Page 7: Pathway Studio Workgroup/Enterprise training course

PathwayExpert Architecture

Diagram labels:
• Bioinformaticians via Pathway Studio
• Database
• Application server
• Read-only users via web browser
• Data editors via web browser
• Third party tools, in-house applications, API SQL interface, bulk data management

Page 8: Pathway Studio Workgroup/Enterprise training course

“Everyone is an Expert” decentralized deployment schema

Hundreds or thousands of users, some with read-only roles and some with editor or publisher roles, access one central database via Pathway Studio and/or a Web browser to analyze experiments, browse the pathway collection, do literature mining, and share data and analysis results.

Page 9: Pathway Studio Workgroup/Enterprise training course

“Bioinformatics service group” centralized deployment schema

A bioinformatics group serves scientists across the entire company by analyzing their experimental data and mining the literature. Analysis results are published to end users via the Web browser interface.

Bioinformatics group:
1) Analysis of experimental data
2) Text-mining and Pathway Building

View-only access to pathways and analysis networks annotated with experimental data, via web browser and links to PathwayExpert Web Services

End users:
1) Experimental data
2) Search requests

Page 10: Pathway Studio Workgroup/Enterprise training course

“Disease area” decentralized clusters deployment schema
Disease area groups have bioinformaticians, biologists and chemists working as a team with a focus on one disease.

• Cardiovascular group
• Cancer group
• Digestive disorders group
• CNS group

Page 11: Pathway Studio Workgroup/Enterprise training course

Day 1
Introduction to MedScan technology

Page 12: Pathway Studio Workgroup/Enterprise training course

Ariadne MedScan Text-To-Knowledge Technology
Extracting biological association networks from text

Diagram labels:
• Knowledge Databases
• ResNet Biological Association Networks
• Pathway Analysis in ResNet database
• MedScan: 1000 abstracts/min
• Pathway Studio to navigate knowledgebase
• MedScan output: RNEF XML

Page 13: Pathway Studio Workgroup/Enterprise training course

How does MedScan extract facts from text?

• Sentence in PubMed: “Axin binds beta-catenin and inhibits GSK-3beta.”
• Identify Proteins in Dictionary (in red): “Axin binds beta-catenin and inhibits GSK-3beta.”
• Identify Interaction Type (in black): “Axin binds beta-catenin and inhibits GSK-3beta.”
• Extracted Facts:
  – Axin - beta-catenin, relation: Binding
  – Axin -> GSK-3beta, relation: Regulation, effect: Negative

Syntactic Layer: Noun Phrase, Verb Phrase, Noun Phrase
Semantic Layer: Protein, Protein Relations, Protein
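
The two-step idea on this slide (dictionary lookup for proteins, then matching the interaction verb) can be sketched in a few lines of Python. This is only a toy illustration: the dictionary `PROTEINS`, the verb table `VERBS` and the function `extract_facts` are hypothetical names invented here, and the regular expression handles only the simple "Subject verb Object and verb Object" shape of the example sentence. MedScan's real dictionaries, grammar and rules are proprietary and far richer.

```python
import re

# Hypothetical mini-dictionary and verb table (not MedScan's real resources).
PROTEINS = ["Axin", "beta-catenin", "GSK-3beta"]
VERBS = {
    "binds": {"relation": "Binding", "directed": False},
    "inhibits": {"relation": "Regulation", "effect": "Negative", "directed": True},
}


def extract_facts(sentence):
    """Extract facts from a sentence shaped like 'Subject verb Object (and verb Object)'."""
    # Longest names first so that a longer name wins over any shorter prefix.
    protein_re = "|".join(re.escape(p) for p in sorted(PROTEINS, key=len, reverse=True))
    verb_re = "|".join(VERBS)

    # Step 1: the leading protein name is taken as the subject of every verb.
    subject = re.match(r"\s*(" + protein_re + r")\b", sentence)
    if subject is None:
        return []

    # Step 2: every 'verb protein' pair becomes one fact about the subject.
    facts = []
    for verb, obj in re.findall(r"\b(" + verb_re + r")\s+(" + protein_re + r")", sentence):
        facts.append({"source": subject.group(1), "target": obj, **VERBS[verb]})
    return facts


facts = extract_facts("Axin binds beta-catenin and inhibits GSK-3beta.")
for fact in facts:
    print(fact)
```

Running it on the slide's sentence yields one Binding fact (Axin, beta-catenin) and one negative Regulation fact (Axin -> GSK-3beta), matching the "Extracted Facts" line above.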

Page 14: Pathway Studio Workgroup/Enterprise training course

Describing MedScan

• Manually curated: dictionaries and grammar rules

• Fast: 14 million PubMed abstracts in 2 days on a modern PC

• Comprehensive: fact recovery rate > 90%

• Removes redundancy: 7,647,282 non-distinct relations => 1,000,000 distinct relations

• Accurate: 10% false positive rate

• Customizable: dictionaries and patterns
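
The "removes redundancy" step above (many non-distinct mentions collapsing into fewer distinct relations) can be pictured as grouping mentions by a relation key while keeping the supporting references. The sketch below is an assumption about how such a collapse could work, not a description of MedScan internals; the function name `collapse_relations`, the tuple layout, the choice to treat Binding as undirected, and the placeholder reference labels are all hypothetical.

```python
from collections import defaultdict


def collapse_relations(mentions):
    """Group raw (source, relation, target, reference) mentions into distinct relations."""
    distinct = defaultdict(set)
    for source, relation, target, reference in mentions:
        # Assumption: Binding is undirected, so A-B and B-A are the same relation.
        if relation == "Binding":
            source, target = sorted((source, target))
        distinct[(source, relation, target)].add(reference)
    # Each distinct relation keeps the sorted list of references that support it.
    return {key: sorted(refs) for key, refs in distinct.items()}


mentions = [
    ("Axin", "Binding", "beta-catenin", "ref-1"),  # placeholder reference labels
    ("beta-catenin", "Binding", "Axin", "ref-2"),
    ("Axin", "Regulation", "GSK-3beta", "ref-1"),
]
for key, refs in collapse_relations(mentions).items():
    print(key, refs)
```

Three mentions collapse into two distinct relations here, and the reference count per relation remains available as evidence.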

Page 15: Pathway Studio Workgroup/Enterprise training course

MedScan Architecture

Diagram labels:
• Modules: Entity recognizer, Semantic processor, Pattern matcher
• Stages: Entity detection, Relationship extraction
• Resources: Dictionaries, Rules, Patterns
• Cartridges: Mammals, Plants, Toxicology, Yeast, Drosophila, C. elegans
• Customizable by user
• Output: RNEF XML

Future:
• New modules: ConceptScan
• New cartridges: Immunology, Clinical

Page 16: Pathway Studio Workgroup/Enterprise training course

Overview of MedScan Architecture

Diagram labels:
• Components: Preprocessor, Tokenizer, Syntactic Parser, Semantic Interpreter, Ontological interpreter, Pattern Matcher, Converter
• Data passed between components: Input Text, Tagged Sentences, Sequence of Words, Sentence Structure, Semantic tree, Extracted facts
• Resources: Protein names dictionary, Lexicon, Grammar, Extraction rules, Extraction patterns, Database of relations

Notes:
• Dictionary-based: identifies proteins and small molecules
• Context-free grammar: grammar and lexicon are proprietary. They are domain-independent by design but focused on the biomedical field.
• Rule-based: rules are equivalent to ontology

Page 17: Pathway Studio Workgroup/Enterprise training course

MedScan Applications

Diagram labels:
• PubMed, Open access, Google
• MedScan
• Entity-based index, Semantic Index
• Automatic reader’s digest, Document Summary

Indexing the scientific literature

Extracting interactions to create databases for systems biology

Page 18: Pathway Studio Workgroup/Enterprise training course

Text-mining tools in Pathway Studio

• Tools -> Start MedScan Reader
  – Web browser enhanced with MedScan technology
  – Search PubMed and manually select abstracts for fact extraction
  – Search Google Scholar and extract facts from top 100 hits
  – Search Google and extract facts from top 30 hits
  – Search Highwire and BioMed Central and extract facts from the individual full-text articles
• Tools -> MedScan: Extract pathways from text
  – search PubMed
  – from file
  – from location
• Tools -> Update pathway
• Tools -> Pathway Reference summary
  – Export to EndNote

Page 19: Pathway Studio Workgroup/Enterprise training course

MedScan Reader settings

1) Specifying MedScan cartridge

2) Tracking favorite entities via highlight

3) Filtering for favorite entities and relations

4) Filtering against entities and relations

Page 20: Pathway Studio Workgroup/Enterprise training course

Day 1
Ariadne ResNet database construction

Page 21: Pathway Studio Workgroup/Enterprise training course

ResNet Mammal Database

• Shipped with >1,000,000 unique relations derived by MedScan between proteins, metabolites, chemicals, cell processes and diseases
• ResNet physical interactions are manually curated
• 712 manually curated pathways
• Gene Ontology
• Optional pathway updates:
  – >300 Regulome pathways
  – >2500 Biological processes pathways
  – >200 Cellular component pathways
  – High-throughput interaction data
• Automatic curation of ResNet is possible to remove redundancy and clean up false positives

Page 22: Pathway Studio Workgroup/Enterprise training course

Pathways collection in ResNet

• Canonical pathways (included, curated)
• Signaling line pathways (included, curated)
• Regulome pathways (optional, automatic)
• Biological processes pathways (optional, automatic)
• Cellular component pathways (optional, automatic)
• KEGG metabolic pathways (optional, imported)
• STKE (commercial)
• Metabolic vision (commercial)
• PathArt (commercial)

Page 23: Pathway Studio Workgroup/Enterprise training course

Ariadne databases for other organisms

All databases contain:
- Relations extracted by the MedScan organism-specific cartridge from organism-specific abstracts and full-text articles
- Entrez Gene protein annotation
- Protein interactions from Entrez Gene (including BIND, HPRD, BioGRID and EcoCyc datasets)
- Gene Ontology annotation

Model Organism databases:
• ResNet Plant: >400,000 relations, supports 6 plant species
  – Optional entity co-occurrence data
  – Additional protein physical interactions predicted by TAIR
• ResNet Drosophila
  – Additional interactions from published high-throughput datasets
• ResNet C-elegans
  – Additional interactions from published high-throughput datasets
• ResNet Yeast
  – Additional interactions from published high-throughput datasets
• ResNet Bacteria (beta version)
  – Additional interactions from published high-throughput datasets

Databases for non-model organisms containing interactions predicted from the closest model organism are available from: http://www.ariadnegenomics.com/support/downloads/databases/

Page 24: Pathway Studio Workgroup/Enterprise training course

Additional Commercial Datasets

• KEGG: >130 metabolic pathways from Kyoto University
• STKE: >70 pathways from AAAS
• Metabolic vision: >10,000 curated pathways for 587 organisms from Integrated Genomics Inc
• Hynet: adds over 100,000 new protein physical interactions to ResNet 5.0 from Prolexys Inc
• PathArt: >600 disease pathways from Jubilant Inc

Page 25: Pathway Studio Workgroup/Enterprise training course

Day 1
Pathway Studio maintenance, administration and technical support

Page 26: Pathway Studio Workgroup/Enterprise training course

Hardware requirements for Pathway Studio

• Pathway Studio desktop or workgroup client
  – CPU: 2 GHz or more
  – RAM: 512 MB or more
  – Disk space for application: 500 MB
  – Disk space for one local database: 2 GB
• Pathway Studio workgroup server
  – 1 CPU for 1-5 concurrent users: >3.0 GHz
  – 2 CPU for 6-10 concurrent users: >3.0 GHz
  – RAM for 1-5 concurrent users: >2 GB
  – RAM for 6-10 concurrent users: >3 GB
  – Disk space: 20 GB for the database
  – Optimal disk configuration:
    • for 1-5 concurrent users: 4 hard drives in RAID 0
    • for 6-10 concurrent users: RAID 10 mode
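
For capacity planning, the server figures above can be written down as a small lookup. The function name `workgroup_server_spec` and the dictionary keys are hypothetical; the numbers are copied from this slide, and the slide gives no guidance beyond 10 concurrent users, so the sketch refuses to guess outside that range.

```python
def workgroup_server_spec(concurrent_users):
    """Return the slide's suggested workgroup server sizing for a given load."""
    if 1 <= concurrent_users <= 5:
        return {
            "cpus": 1,
            "cpu_ghz_above": 3.0,
            "ram_gb_above": 2,
            "database_disk_gb": 20,
            "disk_layout": "4 hard drives in RAID 0",
        }
    if 6 <= concurrent_users <= 10:
        return {
            "cpus": 2,
            "cpu_ghz_above": 3.0,
            "ram_gb_above": 3,
            "database_disk_gb": 20,
            "disk_layout": "RAID 10",
        }
    # The slide only covers 1-10 concurrent users.
    raise ValueError("no sizing guidance for this number of concurrent users")


print(workgroup_server_spec(4))
print(workgroup_server_spec(8))
```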

Page 27: Pathway Studio Workgroup/Enterprise training course

Pathway Studio software requirements

• Pathway Studio desktop or workgroup client
  – Microsoft Windows Server (2000, 2003), Windows XP (Professional), Windows Vista (Professional, Ultimate, Corporate)
• Pathway Studio workgroup server
  – MS SQL Server 2000 or 2005 (Developer, Workgroup, Standard or Enterprise Edition) on Windows 2000, Windows 2003 Server, Windows XP Professional
  – Oracle 10g or later on any supported Oracle platform including Windows 2003 Server, Linux, etc.

Page 28: Pathway Studio Workgroup/Enterprise training course

Connecting to the central workgroup database

Page 29: Pathway Studio Workgroup/Enterprise training course

Connecting to the server enterprise database

Page 30: Pathway Studio Workgroup/Enterprise training course

Database Index folder

• Database statistics

• Viewing entities in the list pane

• Viewing pathways

• Viewing groups

• Expression experiments folder

• Simulation model folder

Page 31: Pathway Studio Workgroup/Enterprise training course

PS Workgroup Admin console
User roles in Workgroup environment

• Administrator
• Editor – can edit public objects
• Publisher – can publish private pathways
• Regular user – can work only in their private space

Ask your PSW administrator to get an account and choose your role
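
The role list above can be read as a small permission table. The sketch below is illustrative only: the permission names, the `can` helper, and the assumptions that every role can also work in a private space and that the administrator holds every permission are not stated on the slide.

```python
from enum import Enum


class Role(Enum):
    ADMINISTRATOR = "administrator"
    EDITOR = "editor"
    PUBLISHER = "publisher"
    REGULAR = "regular user"


# Hypothetical permission names. Only "edit public objects", "publish private
# pathways" and private-space work come from the slide; the rest is assumed.
PERMISSIONS = {
    Role.REGULAR: {"work_in_private_space"},
    Role.EDITOR: {"work_in_private_space", "edit_public_objects"},
    Role.PUBLISHER: {"work_in_private_space", "publish_private_pathways"},
    Role.ADMINISTRATOR: {
        "work_in_private_space",
        "edit_public_objects",
        "publish_private_pathways",
        "manage_accounts",
    },
}


def can(role, action):
    """Return True if the given role is allowed to perform the action."""
    return action in PERMISSIONS[role]


print(can(Role.EDITOR, "edit_public_objects"))
print(can(Role.REGULAR, "publish_private_pathways"))
```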

Page 32: Pathway Studio Workgroup/Enterprise training course

Ariadne Technical Support http://www.ariadnegenomics.com/products/support.html

Page 33: Pathway Studio Workgroup/Enterprise training course

Summary of the introduction slides

• MedScan technology

• Software architecture, hardware and software requirements

• User roles

• ResNet database overview

• Ariadne’s technical support

Page 34: Pathway Studio Workgroup/Enterprise training course

Summary for the rest of the day

• Working with objects in database
• Working with pathway diagram and layout algorithms
• Database search in PS
• Build pathway tool and strategy
• Data import/export
• Pathways in ResNet
• Pathway comparison and statistical algorithms: Find groups/pathways
• Text-mining in PS
• Microarray analysis: data import options and algorithms
• Pathway kinetics simulation in PS

Page 35: Pathway Studio Workgroup/Enterprise training course

DAY 1
Pathway Building in Pathway Studio

• Manual
• Automatic using Graph navigation tools
• Using text-mining with MedScan

Page 36: Pathway Studio Workgroup/Enterprise training course

Viewing and editing pathways in Pathway Studio

• Viewing entities in the List Pane
• Entity and relation tables
• Show all references
• Pathway Reference summary
• Export protein list
• Display styles: By type, By effect, By reference count
• UI options:
  – magnifier
  – fit text to entities
  – simple and full graph view
  – fit to window
  – rotate
  – move
  – zoom by rectangle
  – advanced graph scaling
• Resizing nodes in pathway pane

Page 37: Pathway Studio Workgroup/Enterprise training course

Finding entities and relations in Pathway Studio database

• Quick search
• String search
• Search by attribute
• Build pathway tool

Page 38: Pathway Studio Workgroup/Enterprise training course

Viewing and editing entity/relation properties

• Edit Entity property dialog, URN identifier
• Links to external databases
• Adding new properties, declaring new properties in the database

Page 39: Pathway Studio Workgroup/Enterprise training course

Palette pane

• Making a figure legend for your publication

• Viewing group display styles

• Drag & drop entity icon into pathway pane

Page 40: Pathway Studio Workgroup/Enterprise training course

Images pane

• Drag & drop images into pathway pane

• Importing your own images

• Image properties

Page 41: Pathway Studio Workgroup/Enterprise training course

KEGG pathways layout
node cloning in pathway graph

• 131 metabolic pathways
• 20,972 connected proteins

Page 42: Pathway Studio Workgroup/Enterprise training course

Several methods for adding objects and relations to Pathway pane

Adding objects:

• Drag & drop from the palette

• Drag & drop from the list pane

Adding relations:

• Connect selected entities button

• Enter a fact box

• Drag & drop from the list pane

Page 43: Pathway Studio Workgroup/Enterprise training course

Building pathways by manual curation in Pathway Studio

In GenMAPP

In Pathway Studio

Page 44: Pathway Studio Workgroup/Enterprise training course

Building pathways by manual curation in Pathway Studio

• Complex Nodes
• Adding components to Complex Nodes

In GenMAPP

In Pathway Studio

Page 45: Pathway Studio Workgroup/Enterprise training course

Questionnaire about the previous slides

• How many chemical reactions are in the ResNet database?

• What is the default image for a Transcription factor in PS?

• How many images for the cell membrane can there be in PS?

• What is the quickest search in PS?

• What is the quickest way to add a relation to your pathway diagram?

Page 46: Pathway Studio Workgroup/Enterprise training course

DAY 1
Automatic Pathway Building using Graph navigation
Build pathway tool

Page 47: Pathway Studio Workgroup/Enterprise training course

Mining regulatory relations in database

Basic principle:
Regulatory interactions are mediated by the physical interaction network

– Regulomes
– Biological processes pathways
– Disease pathways

Page 48: Pathway Studio Workgroup/Enterprise training course

Build Pathway dialog

•Build pathway options

•Filtering by direction

•Number of steps

•Build pathway filter

The main application of the Build pathway tool is to quickly find connections between entities of interest; therefore, its button is available from all panes.

Page 49: Pathway Studio Workgroup/Enterprise training course

Build pathway filters

• Using entity filters to answer different biological questions

• Using relation filter to analyze different types of high-throughput data

• Filtering by properties

Page 50: Pathway Studio Workgroup/Enterprise training course

Build pathway Edit Results

• Display filtering
• Selecting results based on local connectivity
• IsNew column

Page 51: Pathway Studio Workgroup/Enterprise training course

Automatic layout options

• Direct force layout
  – charges and springs
  – Good to find hubs in the pathway
• Hierarchical layout
  – Directed graph
  – Good for metabolic pathways (KEGG, ERGO)
• Symmetric layout (Centric graph)
  – Good for Expand pathway
• Cell localization layout (Circular and linear membrane)
  Configurable:
  – Cell localization annotation
  – Organelle images layout
  – Association of Cell localization value and Organelle image
• Dynamic layout
  – Direct-force like with adjustable spring force
  – Use cell localization if organelle

Page 52: Pathway Studio Workgroup/Enterprise training course

Regulome pathways: algorithm input

Page 53: Pathway Studio Workgroup/Enterprise training course

Regulome pathways: algorithm result

Page 54: Pathway Studio Workgroup/Enterprise training course

Building pathways by Data mining
converting regulatory network to protein physical interaction network for Cell Processes, Diseases, Regulomes

Page 55: Pathway Studio Workgroup/Enterprise training course

Disease networks
2300 diseases, 230 cancers in ResNet 5.0

Entities associated with Endothelial cells cancer in ResNet

Page 56: Pathway Studio Workgroup/Enterprise training course

Endothelial cells cancer network

Page 57: Pathway Studio Workgroup/Enterprise training course

Data-mining techniques and hints

• Different filter settings answer different biological questions. Know the meaning of each relation type
• Use the directional filter to perform upstream/downstream analysis
• Relax the search by including the Regulation relations
• To mine for more specific relations, use search Relation by Sentence include “your focus keyword”
  – Find relations mentioned in a certain tissue
  – Find a specific mechanism: trans-activation, cleavage, etc.
• Filter by relation confidence using the Relation table to increase network confidence
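
The filtering hints above (direction, relation type, confidence) can be sketched on a plain list of relations. Everything in this example is hypothetical: the `Relation` record, the `neighbors` function, the entity names A, B and C, and the use of the reference count as a stand-in for confidence are assumptions made for illustration, not the Pathway Studio data model or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    source: str
    target: str
    type: str
    references: int  # assumed proxy for confidence: number of supporting references


def neighbors(relations, entity, direction="downstream", types=None, min_references=1):
    """Return relations touching `entity`, filtered by direction, type and support."""
    hits = []
    for r in relations:
        if types is not None and r.type not in types:
            continue
        if r.references < min_references:
            continue
        if direction in ("downstream", "both") and r.source == entity:
            hits.append(r)
        elif direction in ("upstream", "both") and r.target == entity:
            hits.append(r)
    return hits


# Hypothetical toy network.
network = [
    Relation("A", "B", "Expression", 5),
    Relation("A", "C", "Regulation", 1),
    Relation("C", "A", "Binding", 3),
]

print(neighbors(network, "A", direction="downstream", types={"Expression"}))
print(neighbors(network, "A", direction="upstream"))
print(neighbors(network, "A", direction="both", min_references=3))
```

Relaxing the search, as the third hint suggests, corresponds here to adding "Regulation" to `types` or passing `types=None`.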

Page 58: Pathway Studio Workgroup/Enterprise training course

DAY 1 Build pathway settings asking different biological questions

Page 59: Pathway Studio Workgroup/Enterprise training course

Finding major regulators among DE genes

Screenshot callouts (numbered 1-3 on the slide):
• First choice for expression data
• Second choice for expression data
• Third choice for expression data

Page 60: Pathway Studio Workgroup/Enterprise training course

Upstream analysis of DE genes and gene clusters

Screenshot callouts (numbered 1-3 on the slide):
• First choice for expression data
• Second choice for expression data
• Third choice for expression data

Page 61: Pathway Studio Workgroup/Enterprise training course

Analysis of proteomics co-IP data

Page 62: Pathway Studio Workgroup/Enterprise training course

Analysis of proteomics phosphoprofiling experiments

Page 63: Pathway Studio Workgroup/Enterprise training course

Analysis of metabolomics experiment

[Screenshots with callouts numbered 1-3]

Importing metabolomics experiment

Page 64: Pathway Studio Workgroup/Enterprise training course

Relaxing Build pathway settings

• Replace Find only direct interactions by Find shortest path

• Increase Maximum number of steps in Find common regulators or in Find shortest path
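
The difference between "find only direct interactions" and "find shortest path" with a maximum number of steps can be illustrated with a breadth-first search over a directed graph. This is a generic graph sketch, not the Build pathway implementation; the function name `shortest_path`, the edge-list format and the entity names are hypothetical.

```python
from collections import deque


def shortest_path(edges, start, goal, max_steps):
    """Breadth-first search for the shortest directed path of at most max_steps edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)

    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        if len(path) - 1 >= max_steps:
            continue  # this path already uses the allowed number of steps
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no connection within the step limit


# Hypothetical relations A -> B -> C -> D.
edges = [("A", "B"), ("B", "C"), ("C", "D")]

print(shortest_path(edges, "A", "C", max_steps=1))  # direct interactions only
print(shortest_path(edges, "A", "C", max_steps=2))  # relaxed: one intermediate allowed
```

With `max_steps=1` the search behaves like "direct interactions only" and finds nothing between A and C; raising the limit to 2 returns the path through B.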

Page 65: Pathway Studio Workgroup/Enterprise training course

Day 1
Pathway Building by text-mining

Non-melanoma skin cancer: >1,000,000 cases (<2,000 deaths) in USA

Page 66: Pathway Studio Workgroup/Enterprise training course

MedScan Reader: PubMed search

Keep searching and adding relations

At the end, send the extracted relations to Pathway Studio

Page 67: Pathway Studio Workgroup/Enterprise training course

MedScan Reader: Import top 100 Hits from Google Scholar search: downloads found articles and processes them with MedScan

Page 68: Pathway Studio Workgroup/Enterprise training course

MedScan Reader: Import top 30 Hits from Google search: downloads found web-pages and processes them with MedScan

Page 69: Pathway Studio Workgroup/Enterprise training course

Full-text article found on Highwire press with “non-melanoma skin cancer” text search

Page 70: Pathway Studio Workgroup/Enterprise training course

“Non-melanoma skin cancer” literature network – result of text-mining by MedScan Reader

Every entity in this network was mentioned in the context of non-melanoma skin cancer

Page 71: Pathway Studio Workgroup/Enterprise training course

Protein interaction network for non-melanoma skin cancer using information from entire ResNet

Compare this pathway with your experimental patient data

Page 72: Pathway Studio Workgroup/Enterprise training course

Text-mining techniques and hints
controlling relevance of literature networks

• Searching full-text articles with keywords, followed by MedScan fact extraction, associates keywords with facts only loosely: you find all facts mentioned in the same article as your keywords

• Searching PubMed abstracts with keywords, followed by MedScan fact extraction, gives better relevance of the extracted facts to your keywords: you find all facts mentioned in the same abstract as your keywords

• Searching the sentences extracted by MedScan with keywords gives the highest relevance of the extracted facts to your keywords: you find all facts mentioned in the same sentence as your keywords

[Chart: Relevance Vs. Recovery. Y axis: %, 0 to 120. X axis: full-text, abstract, sentence. Series: Relevance, Recovery]
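
The trade-off described above comes from the scope at which a keyword is tied to a fact. The sketch below contrasts document-level scope (article or abstract) with sentence-level scope on made-up placeholder data; `facts_for_keyword`, the sentence texts and the fact labels are hypothetical and carry no biological meaning.

```python
def facts_for_keyword(documents, keyword, scope):
    """Collect facts associated with a keyword at 'document' or 'sentence' scope.

    documents: list of documents, each a list of (sentence_text, facts) pairs.
    """
    keyword = keyword.lower()
    found = []
    for sentences in documents:
        if scope == "document":
            # Any keyword hit in the document associates ALL of its facts.
            if any(keyword in text.lower() for text, _ in sentences):
                for _, facts in sentences:
                    found.extend(facts)
        elif scope == "sentence":
            # Only facts from sentences that contain the keyword are kept.
            for text, facts in sentences:
                if keyword in text.lower():
                    found.extend(facts)
        else:
            raise ValueError("scope must be 'document' or 'sentence'")
    return found


# Placeholder documents with placeholder facts.
documents = [
    [("P1 binds P2 in skin cancer.", ["P1-P2"]), ("P3 inhibits P4.", ["P3-P4"])],
    [("P5 binds P6.", ["P5-P6"])],
]

print(facts_for_keyword(documents, "skin cancer", "document"))
print(facts_for_keyword(documents, "skin cancer", "sentence"))
```

Document scope recovers more facts but includes ones only loosely tied to the keyword; sentence scope returns fewer, more relevant facts.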

Page 73: Pathway Studio Workgroup/Enterprise training course

DAY 1
Data Import/Export

Page 74: Pathway Studio Workgroup/Enterprise training course

Tools ->Import Protein List

• Choice of identifiers

• Lookup preview

• Paste and Load from file

• Import as New group of proteins

Page 75: Pathway Studio Workgroup/Enterprise training course

Tools -> Import Protein Network

• Choice of identifiers
• Lookup preview
• Paste and Load from file
• Import of Regulatory relations

Page 76: Pathway Studio Workgroup/Enterprise training course

Importing ChIP-on-chip data as PromoterBinding relations using Tools->Import Protein Network

Page 77: Pathway Studio Workgroup/Enterprise training course

Import creates a new pathway with new relations

Page 78: Pathway Studio Workgroup/Enterprise training course

Database ->Import Wizard

• Importing from Internet
• Import formats and options
• Specifying source for entities and relations
• Specifying source folder for pathways

Page 79: Pathway Studio Workgroup/Enterprise training course

Database ->Export Wizard

• Exporting pathways
• Export filters
• Export strategy
• Exporting entity annotations in Plain text format

Page 80: Pathway Studio Workgroup/Enterprise training course

DAY 1
Data management, pathway comparison, find groups/pathways

Page 81: Pathway Studio Workgroup/Enterprise training course

Working with groups in Pathway Studio

• Create group

• Add Entities to a group

• Add group as a node into pathway pane

• Select/Highlight by group

• Maintaining group hierarchy

Page 82: Pathway Studio Workgroup/Enterprise training course

Edit -> Combine Pathway

• Union

• Intersection

• Subtract
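
The three Combine Pathway modes correspond to ordinary set operations when a pathway is viewed as a set of relations. The sketch below makes that assumption explicit; `combine_pathways` and the tuple representation of a relation are hypothetical and say nothing about how Pathway Studio stores pathways.

```python
def combine_pathways(pathway_a, pathway_b, mode):
    """Combine two pathways, each modeled as a set of (source, type, target) relations."""
    operations = {
        "union": pathway_a | pathway_b,         # relations found in either pathway
        "intersection": pathway_a & pathway_b,  # relations shared by both pathways
        "subtract": pathway_a - pathway_b,      # relations in A that are not in B
    }
    return operations[mode]


# Hypothetical pathways built from the example entities used earlier in the course.
pathway_a = {("Axin", "Binding", "beta-catenin"), ("Axin", "Regulation", "GSK-3beta")}
pathway_b = {("Axin", "Binding", "beta-catenin")}

for mode in ("union", "intersection", "subtract"):
    print(mode, sorted(combine_pathways(pathway_a, pathway_b, mode)))
```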

Page 83: Pathway Studio Workgroup/Enterprise training course

Tools for pathways comparison in Pathway Studio

• Combine pathways
• Select
• Highlight

Page 84: Pathway Studio Workgroup/Enterprise training course

Statistical algorithms for pathway comparison in Pathway Studio

• Find Pathways

• Find Groups

• Gene Ontology analysis
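
The slide does not say which statistic these tools use. A common way to score the overlap between a gene list and a pathway, group or Gene Ontology category is a one-sided hypergeometric test, shown here purely as an assumed stand-in; the function name `overlap_p_value` and its parameters are hypothetical.

```python
from math import comb


def overlap_p_value(population, group_size, sample_size, overlap):
    """P(X >= overlap) for a sample of sample_size genes drawn without replacement
    from a population that contains group_size members of the group."""
    upper = min(group_size, sample_size)
    total = comb(population, sample_size)
    tail = sum(
        comb(group_size, k) * comb(population - group_size, sample_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers: 10 genes in total, a group of 3, a sample of 3 that hits all 3.
print(overlap_p_value(population=10, group_size=3, sample_size=3, overlap=3))
```

A small p-value means the overlap is unlikely to arise by chance, which is the kind of ranking a "find groups/pathways" search needs.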

Page 85: Pathway Studio Workgroup/Enterprise training course

DAY 2
Analysis of high-throughput data in Pathway Studio

Page 86: Pathway Studio Workgroup/Enterprise training course

Experiment types

• Gene expression
  – Find major regulators
  – Find biomarkers
  – Gene clustering
• Metabolomics
  – Find major metabolism regulators
  – Combined analysis with gene expression
• Proteomics
  – Mass-spec protein level
  – Finding major kinases/phosphatases for phosphoprofiles

Page 87: Pathway Studio Workgroup/Enterprise training course


Data model in ResNet database

Use different networks for different types of experimental data

Expression

PromoterBinding

DirectRegulation

ProtModification

Binding

MolSynthesis

MolTransport

Regulation

Interpretation of Gene Expression data

Interpretation of Proteomics data

Interpretation of Metabolomics data, Biomarkers prediction and validation

…MORE….

Page 88: Pathway Studio Workgroup/Enterprise training course


Analysis of gene expression microarray data: import and selection of responsive genes

• Data import
– Tab-delimited and Excel files
– Affymetrix CEL files (with RMA normalization)
– GenePix (GPR)
Result: save the experiment in the Expression favorites

• Selection of responsive genes
– Find differentially expressed genes (significance analysis via t-test) for analysis of two samples measured in multiple replicas
– Gene clustering via correlation networks (Pearson correlation)
– Find responsive genes in third-party software for statistical analysis of microarray data and import them as a list (Tools->Import protein list)
Result: save as group of genes in Groups folder

Page 89: Pathway Studio Workgroup/Enterprise training course


Analysis of gene expression microarray data: Pathway Analysis

• Network analysis
– Identification of differentially expressed (DE) protein complexes and physical networks:
  • Build pathway: Find direct regulation, filter for physical interactions (Binding, DirectRegulation, ProtModification)
  • Build differentially expressed networks, filter by Binding (PS Enterprise only)
– Identification of major regulators and targets in expression network:
  • Build pathway: Find direct regulation, filter for Expression and/or PromoterBinding interactions, use hierarchical layout
  • Find significant regulators (network enrichment analysis), filter by Expression, PromoterBinding (PS Enterprise only)
Result: save as pathway

• Functional analysis
– Find groups/pathways
  • Gene ontology analysis
  • Comparative gene ontology analysis
– Build pathway: Find common targets, filter by CellProcess
– Find DE groups/pathways (Gene Set Enrichment Analysis, GSEA)
Result: list of groups/pathways with p-values indicating statistical significance of differential expression. Save as a group, as analysis results or export to Excel

Page 90: Pathway Studio Workgroup/Enterprise training course


Most common workflow for microarray analysis in Pathway Studio for disease

• Identify genes differentially expressed in disease (DE genes)

• Identify genes known to be associated with the disease according to the literature, using Pathway Studio

• Identify DE genes that are linked to known disease genes using Pathway Studio

• Report novel disease genes

Page 91: Pathway Studio Workgroup/Enterprise training course


Expression Data Import wizard

• Generic tab-delimited format
– Import any expression matrix containing expression values and/or p-values. Minimum requirement: one column with gene identifiers and one sample column

• Import of Affymetrix CEL (RMA averaging)

• Import of Molecular Devices GenePix format with Vera & Sam normalization
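The minimum requirement for the generic tab-delimited format (one identifier column plus at least one sample column) can be illustrated with a short Python sketch. It is not the import wizard's code; the function name, the column layout, and the handling of empty cells are assumptions made for the example:

```python
import csv
import io


def read_expression_matrix(handle, id_column=0):
    """Return (sample_names, {gene_id: [values]}) from a tab-delimited file.
    Empty cells are stored as None."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    if len(header) < 2:
        raise ValueError("need a gene identifier column and at least one sample column")
    sample_names = [name for i, name in enumerate(header) if i != id_column]
    matrix = {}
    for row in reader:
        if not row or not row[id_column].strip():
            continue  # skip blank lines and rows without an identifier
        values = [
            float(cell) if cell.strip() else None
            for i, cell in enumerate(row)
            if i != id_column
        ]
        matrix[row[id_column].strip()] = values
    return sample_names, matrix


# Hypothetical file content: one identifier column and two sample columns.
example = "Gene\tcontrol\ttreated\nTP53\t7.1\t9.4\nSP1\t5.0\t\n"
samples, data = read_expression_matrix(io.StringIO(example))
print(samples, data)
```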

Page 92: Pathway Studio Workgroup/Enterprise training course


Expression experiment viewer in Pathway Studio

• Experiment properties

• Gene identifier column: views, sorting, find

• Heat map scale

• Filter genes by value

• Filter genes by pathway

• Text view for expression matrix

• Create group from selection

Page 93: Pathway Studio Workgroup/Enterprise training course


Finding differentially expressed genes in Pathway Studio (significance analysis): Two-sample t-test = Between groups t-test

Finds genes that are differentially expressed between two classes of samples measured independently on single-color microarrays. Examples: multiple replicas of one untreated sample (1) and multiple replicas of one treated sample (2); multiple replicas of one normal sample (1) and multiple replicas of one disease sample (2). Calculated p-values indicate the significance of the expression difference between replicas marked 1 and replicas marked 2.
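For one gene, the between-groups comparison can be sketched in Python as follows. The slide names only a two-sample t-test; the unequal-variance (Welch) form used here is an assumption, and the function name and data are hypothetical. The p-value would then be read from a t distribution with the returned degrees of freedom:

```python
from math import sqrt
from statistics import mean, variance


def welch_t(values, labels):
    """Welch's t statistic and degrees of freedom for samples labelled 1 vs 2."""
    g1 = [v for v, lab in zip(values, labels) if lab == 1]
    g2 = [v for v, lab in zip(values, labels) if lab == 2]
    if len(g1) < 2 or len(g2) < 2:
        raise ValueError("each class needs at least two replicas")
    se1 = variance(g1) / len(g1)  # squared standard error of the class 1 mean
    se2 = variance(g2) / len(g2)  # squared standard error of the class 2 mean
    # Note: raises ZeroDivisionError if both classes have zero variance.
    t = (mean(g2) - mean(g1)) / sqrt(se1 + se2)
    df = (se1 + se2) ** 2 / (se1 ** 2 / (len(g1) - 1) + se2 ** 2 / (len(g2) - 1))
    return t, df


# Hypothetical log2 intensities for one gene:
# three untreated replicas (1) and three treated replicas (2).
labels = [1, 1, 1, 2, 2, 2]
gene_values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
t_stat, dof = welch_t(gene_values, labels)
print(t_stat, dof)
```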

Page 94: Pathway Studio Workgroup/Enterprise training course


Finding differentially expressed genes in Pathway Studio (significance analysis): Paired samples t-test, usually for two channel microarray platform

Finds genes that are differentially expressed between two classes of samples when the comparison is performed within one experiment (two-color or two-channel microarray) and repeated multiple times. The first class is marked by a positive integer, and the corresponding sample from the second class measured on the same array is marked by the negative integer with the same absolute value. Calculated p-values indicate the significance of the expression difference between the two sample classes.
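The positive/negative labelling scheme can be illustrated with a small Python sketch that pairs samples by the absolute value of their label and computes a paired t statistic for one gene. The function name and data are hypothetical, and the p-value lookup (t distribution with n - 1 degrees of freedom) is left out:

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(values, labels):
    """Paired t statistic from samples labelled +k (first class) and -k (second class)."""
    first = {lab: v for v, lab in zip(values, labels) if lab > 0}
    second = {-lab: v for v, lab in zip(values, labels) if lab < 0}
    pairs = sorted(set(first) & set(second))
    if len(pairs) < 2:
        raise ValueError("need at least two complete pairs")
    diffs = [first[k] - second[k] for k in pairs]
    # Note: raises ZeroDivisionError if all differences are identical.
    t = mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical data for one gene on three two-channel arrays;
# the two channels of the same array share the same absolute label.
labels = [1, -1, 2, -2, 3, -3]
values = [5.0, 4.0, 6.0, 4.0, 8.0, 5.0]
t_stat, dof = paired_t(values, labels)
print(t_stat, dof)
```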

Page 95: Pathway Studio Workgroup/Enterprise training course


Finding differentially expressed genes in Pathway Studio (significance analysis): DE genes in multiple experimental log ratio samples

If you have imported your data as pre-calculated log ratios of the normalized expression values, use this test to find differentially expressed genes across multiple replicas. Calculated p-values indicate how far the ratio of a given gene deviates from the global mean of ratios across all genes and samples.
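The slide does not give the exact statistic behind this test. One plausible reading, sketched here as an assumption, is a one-sample t statistic comparing each gene's replicate log ratios with the global mean over all genes and samples; the function name and data are hypothetical:

```python
from math import sqrt
from statistics import mean, stdev


def log_ratio_t(log_ratios):
    """For each gene, a one-sample t statistic of its replicate log ratios
    against the global mean over all genes and samples."""
    all_values = [v for reps in log_ratios.values() for v in reps]
    global_mean = mean(all_values)
    scores = {}
    for gene, reps in log_ratios.items():
        spread = stdev(reps)
        if spread == 0:
            scores[gene] = None  # statistic undefined for identical replicas
            continue
        scores[gene] = (mean(reps) - global_mean) / (spread / sqrt(len(reps)))
    return global_mean, scores


# Hypothetical log2 ratios, three replicas per gene.
data = {
    "geneA": [2.0, 3.0, 4.0],
    "geneB": [-1.0, 0.0, 1.0],
    "geneC": [-4.0, -3.0, -2.0],
}
center, scores = log_ratio_t(data)
print(center, scores)
```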

Page 96: Pathway Studio Workgroup/Enterprise training course


Gene expression clustering using Relevance network

Expression -> Build network from expression -> Pearson correlation

Page 97: Pathway Studio Workgroup/Enterprise training course


Parameters for Pearson correlation

Major parameters:
• Percent of genes to remove – removes less variable genes. Controls number of vertices in the graph. Keep number of proteins under 1000 in the network
• Threshold – allows correlation links above threshold. Controls number of edges in the graph.
• Number of permutations – turns on automatic Threshold calculation using randomized expression samples.
• P-value – selects most non-random correlation links. Controls number of edges in the graph. Value 0.01 corresponds to 10% of all possible links equal to (number of vertices)^2
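How the first two parameters shape the graph can be seen in a small Python sketch of a relevance network. It is an illustration, not Pathway Studio's algorithm: the function names are hypothetical, links are kept by absolute correlation (an assumption, since the slide does not say how negative correlations are treated), and the permutation-based threshold and p-value options are omitted:

```python
from itertools import combinations
from math import sqrt
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson correlation coefficient; 0.0 if either profile is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    norm = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return cov / norm if norm else 0.0


def relevance_network(expression, percent_to_remove, threshold):
    """Drop the least variable genes (fewer vertices), then link gene pairs
    whose absolute Pearson correlation reaches the threshold (fewer edges)."""
    ranked = sorted(expression, key=lambda g: pvariance(expression[g]))
    n_drop = int(len(ranked) * percent_to_remove / 100)
    kept = ranked[n_drop:]
    edges = {}
    for g1, g2 in combinations(sorted(kept), 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= threshold:
            edges[(g1, g2)] = r
    return kept, edges


# Hypothetical expression profiles across four samples.
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with g1
    "g3": [4.0, 3.0, 2.0, 1.0],  # perfectly anti-correlated with g1
    "g4": [5.0, 5.1, 5.0, 5.1],  # nearly flat, lowest variance
}
kept, edges = relevance_network(profiles, percent_to_remove=25, threshold=0.9)
print(sorted(kept), sorted(edges))
```

Raising the percentage of removed genes shrinks the vertex set before any correlation is computed, which is why it is the main lever for keeping the network under the recommended size.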

Page 98: Pathway Studio Workgroup/Enterprise training course


Finding upstream regulators for a gene cluster using the Build pathway option Find common regulators
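Conceptually, a common-regulator search looks for entities with outgoing relations to several members of the cluster. A minimal Python sketch of that idea on a toy edge list; the function name, the `min_targets` cutoff, and the relations are hypothetical and do not come from ResNet:

```python
from collections import defaultdict


def common_regulators(relations, cluster, min_targets=2):
    """Rank upstream entities by how many genes in the cluster they regulate."""
    targets_by_regulator = defaultdict(set)
    for regulator, target in relations:
        if target in cluster:
            targets_by_regulator[regulator].add(target)
    hits = [
        (regulator, sorted(targets))
        for regulator, targets in targets_by_regulator.items()
        if len(targets) >= min_targets
    ]
    # Most shared regulators first; ties broken alphabetically.
    return sorted(hits, key=lambda item: (-len(item[1]), item[0]))


# Hypothetical directed relations (regulator, target).
relations = [
    ("TF_A", "gene1"), ("TF_A", "gene2"), ("TF_A", "gene3"),
    ("TF_B", "gene1"), ("TF_B", "gene2"),
    ("TF_C", "gene3"),
    ("TF_A", "gene9"),  # target outside the cluster, ignored
]
cluster = {"gene1", "gene2", "gene3"}
ranked = common_regulators(relations, cluster)
print(ranked)
```

Reversing the direction of the edges gives the analogous search for common targets.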

Page 99: Pathway Studio Workgroup/Enterprise training course


Finding major transcription regulators among differentially expressed genes

Use Build pathway tool option Find direct interactions with filtering for PromoterBinding and Expression to reduce the complexity of your differential expression pattern

Page 100: Pathway Studio Workgroup/Enterprise training course


Build pathway filter stringencies

Gene Expression:
• Promoter Binding > Expression > Regulation > Co-occurrence
• Protein > Complex > Functional Class

Metabolomics:
• MolSynthesis > Regulation

Proteomics:
• Direct Regulation > ProtModification > Binding > Regulation
• Protein > Complex > Functional Class

Page 101: Pathway Studio Workgroup/Enterprise training course


Questionnaire Day 1

• What is the quickest Entity search in Pathway Studio?

• What is the most comprehensive Entity search in Pathway Studio?

• How do you create a group in Pathway Studio and add entities to it?

• How do you build a pathway from the up-regulated genes in your microarray experiment?

Page 102: Pathway Studio Workgroup/Enterprise training course


Workflow 1. Build pathway for EDG regulation

• Using the GenMAPP pathway as a guide, build the EDG1 pathway in Pathway Studio:
– Find proteins for EDG1 pathway
– Find relations for EDG1 pathway
– Create additional relations missing from ResNet database
– Arrange nodes by cell localization
– Save pathway as HTML for web publication

Page 103: Pathway Studio Workgroup/Enterprise training course


Workflow 2. Create a pathway containing groups and sub-pathways as nodes

• Continue building EDG pathway by adding sub-pathways and groups

• Complete the pathway by text-mining search with filtering

Page 104: Pathway Studio Workgroup/Enterprise training course


Workflow 3. Find drugs regulating kinases

• Find kinases in the database with connectivity >0
– Search by attribute for Functional class = Kinase and Connectivity >0

• Find drugs regulating these kinases
– Expand pathway from kinases with filter by small molecules
– Select drugs in the expanded pathway
– Select neighbors for drugs
– Copy selection into a new pathway

Page 105: Pathway Studio Workgroup/Enterprise training course


Workflow 4. Find biological processes regulated by proteins involved in prostate cancer

• Find prostate cancer disease node

• Find proteins regulating prostate cancer

• Find cell processes affected by these proteins

• Sort found processes by number of prostate cancer protein regulators
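The last step, sorting processes by their number of disease-protein regulators, amounts to counting distinct regulators per process. A minimal Python sketch on placeholder data; the entity names and the function name are hypothetical and are not ResNet content:

```python
from collections import Counter

# Placeholder proteins assumed to regulate the disease node.
disease_regulators = {"proteinA", "proteinB", "proteinC"}

# Placeholder (protein, cell process) regulation relations.
process_relations = [
    ("proteinA", "apoptosis"),
    ("proteinB", "apoptosis"),
    ("proteinC", "apoptosis"),
    ("proteinA", "cell proliferation"),
    ("proteinB", "cell proliferation"),
    ("proteinC", "angiogenesis"),
    ("proteinX", "angiogenesis"),  # not a disease regulator, so ignored
]


def rank_processes(regulators, relations):
    """Count, for each cell process, the distinct disease regulators acting on it."""
    unique_pairs = {(p, proc) for p, proc in relations if p in regulators}
    counts = Counter(proc for _, proc in unique_pairs)
    # Highest count first; ties broken alphabetically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


ranking = rank_processes(disease_regulators, process_relations)
print(ranking)
```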

Page 106: Pathway Studio Workgroup/Enterprise training course


Day 2: Advanced workflows in Pathway Studio

Page 107: Pathway Studio Workgroup/Enterprise training course


Workflow 1. Comparative Gene ontology analysis (Folberg’s experiment)

Import of CEL files
1) Calculation of the differentially expressed genes
2) Creating a group from DE genes
3) Finding statistically significant GO groups
4) Creating a pathway from GO groups
5) Comparing two lists of GO groups
6) Finding DE genes in GO groups

Comparing lists of the differentially expressed GO groups rather than DE genes is more sensitive when comparing the responses in two cell lines, patients and other samples.

Page 108: Pathway Studio Workgroup/Enterprise training course


Comparing lists of the differentially expressed GO groups rather than DE genes is more sensitive when comparing the responses in two cell lines, patients and other samples

Subtracting groups: 6 genes

Subtracting genes: no significant groups

Page 109: Pathway Studio Workgroup/Enterprise training course


Two groups of genes differentially expressed during growth in 3D culture vs. flat culture for aggressive and non-aggressive tumors are selected

[Diagram: non-aggressive tumors, flat culture vs. 3D culture (no growth); aggressive tumors, flat culture vs. 3D culture (growth)]

1. Genes of interest
2. Groups of interest

Page 110: Pathway Studio Workgroup/Enterprise training course


Comparative GO group analysis of aggressive vs. non-aggressive uveal melanoma

Open DE GO groups from aggressive tumors

Compare with DE GO groups from non-aggressive tumors

Page 111: Pathway Studio Workgroup/Enterprise training course


Select GO groups related to your experimental goals (cell adhesion DE groups unique for aggressive tumors)

These groups are significant in aggressive melanoma when we compare its growth in 3D matrix vs. flat culture

These groups are NOT significant in non-aggressive melanoma when we compare its growth in 3D matrix vs. flat culture

Page 112: Pathway Studio Workgroup/Enterprise training course


A network of genes differentially expressed in aggressive uveal melanoma and involved in cell adhesion

Page 113: Pathway Studio Workgroup/Enterprise training course


23 SP1 targets among DE genes in cell adhesion network unique for aggressive uveal melanoma during 3D growth

Page 114: Pathway Studio Workgroup/Enterprise training course


Supportive evidence for SP1 role in melanoma aggressiveness

Page 115: Pathway Studio Workgroup/Enterprise training course


Workflow 2. Three methods to find biological processes affected by DE genes

1) Find groups from Biological processes Gene Ontology classification
2) Find pathways indicating biological processes
3) Build pathway option Find common targets filtering for Cell Process

Includes:
- Finding proteins using Search by attribute (cell localization) and then determining their biological processes

Page 116: Pathway Studio Workgroup/Enterprise training course


Workflow 3. Three ways to find biomarkers in Pathway Studio

• By text-mining
– Extract pathways from text: PubMed Search for your Disease

• By data-mining
– Search for disease of interest in the database
– Use Build Pathway: Expand option to find Disease biomarkers

• By gene expression data analysis
– Identify Differentially expressed genes
– Use Build pathway: Direct interaction option to find proteins that are downstream of many DE genes. These proteins are most likely biomarkers according to your expression data (See also next slide)

Page 117: Pathway Studio Workgroup/Enterprise training course


Workflow 4. Building disease network using Build pathway tool

Includes:
• Finding disease of interest in the database
• Finding proteins contributing to disease
• Finding biomarkers for a disease
• Building disease networks using:
– Build pathway Find direct interactions for proteins regulating disease
– Build pathway Expand pathway for protein biomarkers
– Combining two pathways
– Layout by cell localization
– Text-mining: updates?

Page 118: Pathway Studio Workgroup/Enterprise training course


Workflow 5. Building pathway by text-mining for Li-Fraumeni syndrome

Includes:
• Creating new local database
• Use of Search PubMed option (Db import)
• Consolidation of the db (db updates / groups)
• Understanding the major protein players in Li-Fraumeni syndrome
• Understanding regulators / targets / cell processes associated with Li-Fraumeni syndrome