EGAN Tutorial: A Basic Use-case

October, 2009

Jesse Paquette

UCSF Helen Diller Family Comprehensive Cancer Center

jesse.paquette@cc.ucsf.edu

Preamble

• This document has many slides with multi-step animations
  – Best viewed in Slide Show mode

• The EGAN graphical user interface is evolving
  – Icons may change
  – Menus may change
  – Button/widget placement may change
  – This document probably won’t change as quickly
  – Please contact the developers if you notice major discrepancies between this and EGAN

A basic EGAN use-case: Overview

• This document will guide you through a brief demonstration of EGAN functionality; you will:
  – Select gene nodes using experiment results
  – Show gene nodes on the Network View
  – Perform an automatic layout of the Network View
  – Save a custom gene set
  – Navigate the Network View
  – Link out to Entrez Gene
  – Link out to PubMed
  – Calculate enrichment scores for association nodes
  – Show association nodes on the Network View
  – Export a screenshot of the Network View
  – Export a Node Table with enrichment statistics

• Yes, that’s just the brief demonstration
  – There is a large amount of functionality that won’t be covered

• Launch the EGAN Demo to follow along
  – Try to make your screen match the screenshots on each slide

A basic EGAN use-case: Select gene nodes

Here’s where we begin. You should see this screen after the EGAN demo loads. In step one we will select gene nodes to be placed on the graph.

Drag the vertical divider to the left in order to give the Entrez Gene Node Table maximum visible width.

Next, click the Expts. tab to show the Experiments Table in the bottom panel.

The Experiments Table shows two experiment results from Neve (2006); one focused (blue), one unfocused (gray).

For each experiment result there are three columns in the Entrez Gene Node Table.

Column one shows the summary statistic value from the experiment for each gene. In this example, values represent each gene’s expression correlation with Herceptin sensitivity.

Column two indicates the sign of each gene’s summary statistic value and a color indicating the position of that value in the overall distribution. Positive values are green, negative values are blue, and brighter colors indicate statistics near the tails.

Column three indicates the p-value from the experiment for each gene.

Click on the header of column three to sort the Entrez Gene Node Table by p-value.

These are the most significant genes in the focused experiment. Note that there are genes that correlate with both Herceptin sensitivity and Herceptin resistance. We are going to construct a network using only the genes associated with Herceptin resistance.

Click on the header of column two to sort the Entrez Gene Node Table by the sign of the correlation statistic.

This sorts the gene nodes into two groups, - and +. Within each group, the nodes are still sorted by p-value. Now we can easily select all nodes in the - group.

Left-click on the top gene row (POLR2G) and drag downward until you reach gene rows that have p-values greater than 0.0.
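The two-step sort above works because the second sort keeps the earlier p-value order within each sign group. Here is a minimal Python sketch of the same idea outside EGAN. The gene symbols come from this tutorial, but the statistic and p-value numbers are made up for illustration, `GENE_A` to `GENE_C` are placeholders, and the 0.01 cutoff is an assumption borrowed from the set name suggested later in this tutorial.

```python
# Hypothetical rows: (gene symbol, correlation statistic, p-value).
# The numbers are made up for illustration; they are not EGAN demo data.
rows = [
    ("TAP1", -0.61, 0.004),
    ("GENE_A", 0.72, 0.001),
    ("POLR2G", -0.80, 0.0001),
    ("GENE_B", 0.35, 0.20),
    ("SEC61A1", -0.55, 0.009),
    ("GENE_C", -0.20, 0.30),
]

# Step 1: sort by p-value (like clicking the column three header).
by_p = sorted(rows, key=lambda r: r[2])

# Step 2: sort by sign (like clicking the column two header).
# Python's sort is stable, so rows with the same sign keep their
# p-value order from step 1. False sorts before True, so the
# negative ("-") group comes first.
by_sign = sorted(by_p, key=lambda r: r[1] >= 0)

# Step 3: "drag-select" the negative rows below an assumed p-value cutoff.
P_CUTOFF = 0.01  # assumption, see the lead-in
selected = [gene for gene, stat, p in by_sign if stat < 0 and p < P_CUTOFF]

print(selected)
```

With these made-up rows the selection is POLR2G, TAP1 and SEC61A1, in that order.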

Your selection block will end with the gene SEC61A1. To confirm the number of gene nodes now selected, click on the Nodes tab below to show the Node Types Table.

A basic EGAN use-case: Show selected gene nodes

The Selected Nodes column value for the Entrez Gene row shows that there are 41 genes selected. Remember, these are the top 41 genes having expression values correlated with Herceptin resistance.

We’re ready to show these gene nodes on the graph. Drag the Node Table divider back to the right to give room to the Network View.

Click the Show selected button to show all selected nodes on the graph.

A basic EGAN use-case: Perform layout

And there they are! All stacked on top of each other. To separate them, click the Force layout button above.

A basic EGAN use-case: Group selected gene nodes

Ok, we’re almost ready to explore the graph. But first, let’s save this group of 41 genes so we can quickly retrieve it later. Click the Group selected button to the right.

Give this gene set a very descriptive name – you may not revisit this set until a future analysis, at which point you will need to know exactly what this set represents. Suggested: “Top 41 genes with expression correlation to Herceptin resistance (Neve 2006), p < 0.01”

To confirm that this set was created, left-click the Custom Node row in the Node Types Table. This will show the Custom Node Node Table to the right.

The group now appears as an association node in the Custom Node Node Table.

Now that our new gene set is saved, deselect all nodes by clicking the Deselect all button.

A basic EGAN use-case: Navigating the Network View

Now for some Network View basics: right-click on the Network View in empty space (i.e. not on a node or an edge). While holding down the button, drag downward.

You can also zoom in and out with the mouse wheel or the buttons above the Network View. Pan and zoom the Network View to focus on the cluster of 4 inter-connected genes to the left.

A basic EGAN use-case: Link out to Entrez Gene

Right-click on the node TAP1.

This brings up the Node Menu. The first item, Summary, shows the Entrez Gene summary information for TAP1. Next, select Link out -> Link out ‘TAP1’ from this Node Menu.

If you have an active internet connection, using Link out from the Node Menu will open the database entry corresponding to the node. In this case, Link out will load the Entrez Gene entry for TAP1.

A basic EGAN use-case: Link out to PubMed

Now let’s consider the edges connecting these gene nodes. You will notice that there are three different edge colors: pink, orange and gray. To understand what each edge represents, click the Edges tab to bring up the Edge Types Table.

The demo version of EGAN contains 3 pre-collated edge types:

Protein-protein interactions (defined by BIND, BioGRID, HPRD, IntAct and MINT)

Chromosomal sequence proximity (an edge exists if the genes are adjacent on the chromosome)

PubMed co-occurrence (an edge exists if the genes are discussed in the same article)

To view how many articles support each edge, click Display options -> Edges -> Reference count labels.
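To make the PubMed co-occurrence idea concrete, here is a small Python sketch of how co-occurrence edges and their reference counts could be derived from a list of articles. This is not EGAN's actual code, just an illustration of the definition above. The article IDs are placeholders (not real PubMed IDs), and the gene-to-article assignments are made up.

```python
from collections import Counter
from itertools import combinations

# Hypothetical mapping: article -> genes discussed in it.
# IDs are placeholders and the contents are illustrative only.
articles = {
    "article_1": {"PSMB8", "PSMB10"},
    "article_2": {"PSMB8", "PSMB10", "TAP1"},
    "article_3": {"PSMB10", "PSMB8"},
    "article_4": {"TAP1"},
}

def cooccurrence_edges(articles):
    """Count, for every gene pair, how many articles mention both genes."""
    counts = Counter()
    for genes in articles.values():
        # Sort so each pair is always stored in the same order.
        for gene_a, gene_b in combinations(sorted(genes), 2):
            counts[(gene_a, gene_b)] += 1
    return counts

edges = cooccurrence_edges(articles)
for (gene_a, gene_b), reference_count in sorted(edges.items()):
    print(gene_a, gene_b, reference_count)
```

Each key is one edge, and each count plays the role of a reference count label on that edge.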

We can now see that EGAN is aware of three articles in PubMed that mention both PSMB10 and PSMB8. Right-click on that edge and select Link out -> PubMed pages for all references for this edge. Those articles should load in your browser.

A basic EGAN use-case: Calculate enrichment scores

We’re almost done with the tutorial. Two things left to cover: enrichment statistics and exporting results. To calculate association node enrichment (i.e. over-representation) in the set of visible gene nodes, click Enrichment options -> Association visible enrichment below.

Next, click the Nodes tab to bring up the Node Types Table.

Enrichment scores have been calculated for all association nodes (i.e. gene sets) in EGAN. Let’s explore this information. Click the Gene Ontology Process row in the Node Types Table.

There are now two new columns in the Gene Ontology Process Node Table.

Column one shows the number of genes in the Network View that are also connected to each Gene Ontology Process association node.

Column two shows the p-value for the corresponding hypergeometric enrichment test.
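For readers curious about what a hypergeometric enrichment test computes, here is a minimal Python sketch. It assumes a one-sided (over-representation) test, i.e. the probability of seeing at least the observed overlap by chance. EGAN's exact implementation is not described in this tutorial, so treat the function and its parameter names as illustrative assumptions. The background size and gene set size in the example are made up; only the 41 visible genes come from this tutorial.

```python
from math import comb

def hypergeom_enrichment_p(total_genes, set_size, visible_genes, overlap):
    """Upper-tail hypergeometric p-value, P(X >= overlap).

    total_genes   -- number of genes in the background
    set_size      -- number of background genes in the gene set
    visible_genes -- number of genes shown on the Network View
    overlap       -- visible genes that also belong to the gene set
    """
    max_overlap = min(set_size, visible_genes)
    favorable = sum(
        comb(set_size, k) * comb(total_genes - set_size, visible_genes - k)
        for k in range(overlap, max_overlap + 1)
    )
    return favorable / comb(total_genes, visible_genes)

# Made-up example: 20000 background genes, a gene set of 100 genes,
# 41 visible genes, 5 of which are in the gene set.
p_value = hypergeom_enrichment_p(20000, 100, 41, 5)
print(p_value)
```

A small p-value means the overlap is larger than expected if the visible genes had been drawn at random from the background.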

Click the Visible Enrichment column header to sort the Gene Ontology Process Node Table by enrichment p-value.

A basic EGAN use-case: Show enriched association nodes

Click the checkboxes in the Visible column to selectively show “response to biotic stimulus”, “cellular macromolecule catabolic process” and “negative regulation of ubiquitin-protein ligase activity during mitotic cell cycle” on the Network View.

Next, click the Force layout button.

Congratulations, you’ve just created your first gene association network. Let’s add some enriched KEGG association nodes. Click the KEGG row in the Node Types Table.

Then sort the KEGG Node Table by the Visible Enrichment column.

Click the corresponding checkboxes to show “Glycan structures – biosynthesis 1” and “N-Glycan biosynthesis” on the Network View. Then click the Force layout button above.

Now let’s show enriched Cytoband association nodes. Show the Cytoband Node Table, sort the table by Visible Enrichment and selectively show Cytoband association nodes enriched with p-values less than 0.01. After that, click the Force layout button.

A basic EGAN use-case: Export to PDF

You should have shown nodes 1p36 and 6p21.3. Easy, eh? Note that your layout might look slightly different than this screenshot – this is because the Force layout algorithm is non-deterministic.
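If you wonder why a force layout can differ between runs, the sketch below shows one common reason: the nodes start at random positions. This is a generic Fruchterman-Reingold-style layout written for illustration, not EGAN's actual algorithm, which this tutorial does not describe. The function name, parameters and the edges in the example are all assumptions.

```python
import math
import random

def force_layout(nodes, edges, seed=None, iterations=200, size=1.0):
    """Tiny force-directed layout: nodes repel, edges pull together."""
    rng = random.Random(seed)
    # Random starting positions are the source of run-to-run variation.
    pos = {n: [rng.uniform(0, size), rng.uniform(0, size)] for n in nodes}
    k = size / math.sqrt(len(nodes))  # preferred distance between nodes
    temperature = size / 10           # maximum step per iteration
    for _ in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}
        # Repulsion between every pair of nodes.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = math.hypot(dx, dy) or 1e-9
                force = k * k / dist
                disp[a][0] += dx / dist * force
                disp[a][1] += dy / dist * force
                disp[b][0] -= dx / dist * force
                disp[b][1] -= dy / dist * force
        # Attraction along each edge.
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            dist = math.hypot(dx, dy) or 1e-9
            force = dist * dist / k
            disp[a][0] -= dx / dist * force
            disp[a][1] -= dy / dist * force
            disp[b][0] += dx / dist * force
            disp[b][1] += dy / dist * force
        # Move each node, limited by the current temperature.
        for n in nodes:
            dx, dy = disp[n]
            length = math.hypot(dx, dy) or 1e-9
            step = min(length, temperature)
            pos[n][0] += dx / length * step
            pos[n][1] += dy / length * step
        temperature *= 0.97  # cool down so the layout settles
    return pos

# Hypothetical mini-network; these edges are made up for illustration.
nodes = ["TAP1", "PSMB8", "PSMB10", "POLR2G"]
edges = [("TAP1", "PSMB8"), ("PSMB8", "PSMB10"), ("TAP1", "PSMB10")]

print(force_layout(nodes, edges, seed=1))
print(force_layout(nodes, edges, seed=2))
```

With a fixed seed the sketch gives the same layout every time; with a different seed (or none) the positions change even though the network is identical.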

Let’s export the Network View to PDF. First, we want to manipulate the Network View so that we can take the best screenshot. Click the Maximize button above to give full screen space to the Network View.

Let’s assume that we’ll want to print this network to paper at some point. It’s best to switch to the white color scheme to save on black ink. Click Display options -> Background -> White.

To export the Network View to PDF, click Screenshot options -> Network-only PDF…

A basic EGAN use-case: Export the Node Table

Ok, almost done…just one last useful tip. You can also export any Node Table to a tab-delimited (Excel-ready) file. Click the Show all tables button above.

Then drag the divider to the left to give more space to the Node Table.

Finally, click the Export button at the top of the Node Table. You can export the Node Table for every node type shown in the Node Types Table below.
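If you have not worked with tab-delimited files before, the Python sketch below shows what such a file looks like and how one can be written. It is only an illustration of the format: the exact columns EGAN exports are not listed in this tutorial, so the header names (apart from Visible Enrichment), the placeholder gene set names and the numbers are all assumptions.

```python
import csv
import io

def write_node_table(handle, header, rows):
    """Write a header and rows as tab-delimited text to an open file handle."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)

# Hypothetical table contents: placeholder names and made-up numbers.
header = ["Node", "Visible Genes", "Visible Enrichment"]
rows = [
    ["gene_set_A", 6, 0.0004],
    ["gene_set_B", 3, 0.02],
]

# Write to an in-memory buffer so the result can be inspected directly.
buffer = io.StringIO()
write_node_table(buffer, header, rows)
text = buffer.getvalue()
print(text)

# To write a real file instead:
# with open("node_table.tsv", "w", newline="") as handle:
#     write_node_table(handle, header, rows)
```

Each row becomes one line and each column is separated by a tab, which is why spreadsheet programs can open the file directly.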

That’s all for this tutorial. Thanks for taking the time to learn EGAN!

Questions/comments?

• Visit http://groups.google.com/group/ucsf-egan for downloads, documentation and discussion
  – Requires an account with Google Groups
