Working with gene lists: Finding data using GEO & BioMart June 5, 2014

Upload: suzan-eaton

Post on 17-Dec-2015


Page 1: Working with gene lists: Finding data using GEO & BioMart June 5, 2014

Working with gene lists: Finding data using GEO & BioMart

June 5, 2014

Page 2: Working with gene lists: Finding data using GEO & BioMart June 5, 2014

Analyzing a gene list

With hundreds of genes but a limited budget and lab personnel, you need to prioritize the gene list down to candidate genes for follow-up.

Pick ones that are “interesting”:
- Known to be involved in other related processes but not (yet) in your process of interest
- Has protein features which suggest a function in your process, but it has not been characterized
- No known function or domain, but it shows up in other, related high-throughput experiments, suggesting a key role in your process of interest

Page 3: Working with gene lists: Finding data using GEO & BioMart June 5, 2014

Our approach

Analyzing gene lists by:

1. Finding overlap with other high-throughput experiments

2. Finding additional information using BioMart
   1. Mouse/human homologs
   2. Protein domain content
   3. GO classification
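Step 1 of the approach, finding overlap with another experiment, comes down to a set intersection once both lists use the same kind of identifier. A minimal sketch, assuming both lists are plain gene identifiers and that matching can ignore case and stray whitespace. Apart from C04F5.7, the identifiers are made-up placeholders, and the function names are illustrative only.

```python
def normalize(gene_ids):
    """Return a set of upper-cased, whitespace-stripped identifiers (empty entries dropped)."""
    return {g.strip().upper() for g in gene_ids if g.strip()}


def overlap(my_genes, other_genes):
    """Split two gene lists into shared genes and genes unique to each list."""
    mine, other = normalize(my_genes), normalize(other_genes)
    return {
        "shared": sorted(mine & other),
        "only_mine": sorted(mine - other),
        "only_other": sorted(other - mine),
    }


# Placeholder lists: only C04F5.7 comes from the slides, the rest are invented.
my_list = ["C04F5.7", "geneA", "geneB "]
published_list = ["c04f5.7", "GENEB", "geneC"]

result = overlap(my_list, published_list)
print(result["shared"])
```

Genes in the shared set are the ones that also show up in the other experiment, which is one of the "interesting" criteria for follow-up. Upper-casing is a convenience here. If the identifiers in use are case-sensitive, drop that step.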

Page 4: Working with gene lists: Finding data using GEO & BioMart June 5, 2014

GEO (Gene Expression Omnibus)

GEO Datasets
- Curated gene expression datasets, i.e. there is a backlog of experiments that haven’t made it into the database
- Can search for experiments and conduct differential gene expression queries on some datasets
- Can download datasets & do offline analyses

GEO Profiles
- Profiles of expression data for genes

Page 5: Working with gene lists: Finding data using GEO & BioMart June 5, 2014

Why search GEO?

- What other experiments have been done that are similar to yours? GEO datasets
- How do my genes of interest behave in other large-scale experiments? GEO profiles

Page 6: Working with gene lists: Finding data using GEO & BioMart June 5, 2014

GEO Profile search

Search on a gene name (C04F5.7):

Page 9: Working with gene lists: Finding data using GEO & BioMart June 5, 2014

GEO Dataset search

“C. elegans”: 4434

Page 10: Working with gene lists: Finding data using GEO & BioMart June 5, 2014

GEO Dataset searches

Query                           Total datasets   C. elegans datasets
C. elegans                      4434             4072
C. elegans AND response         131              121
C. elegans AND host response    5                5
C. elegans AND immune           24               20
C. elegans AND antimicrobial    109              94
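Searches like those in the table can also be written as a query against NCBI's E-utilities rather than typed into the web form. A minimal sketch that only builds the request URL and does not contact NCBI. It assumes the `gds` database name for GEO DataSets, an `[Organism]` field tag and `rettype=count`, so check these against the E-utilities documentation before relying on them. The function name is illustrative, and counts returned today will differ from the 2014 numbers above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def geo_dataset_search_url(organism, keywords=()):
    """Build an ESearch URL for GEO DataSets: organism AND keyword AND ..."""
    term = f'"{organism}"[Organism]'
    for keyword in keywords:
        term += f" AND {keyword}"
    # rettype=count asks for the number of hits only (assumed, see lead-in).
    params = {"db": "gds", "term": term, "rettype": "count"}
    return ESEARCH + "?" + urlencode(params)


url = geo_dataset_search_url("Caenorhabditis elegans", ["immune"])
# Decode the query string again to show the search term that would be sent.
print(parse_qs(urlsplit(url).query)["term"][0])
```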

Page 11: Working with gene lists: Finding data using GEO & BioMart June 5, 2014

Once a dataset is identified

Download data
- SOFT format: tab-delimited data
- Issues:
  - Not necessarily processed such that they have the ratios of experiment/control
  - If starting with raw data, you may not be able to replicate exactly what the authors did, or may lack the expertise/software to generate a list of differentially expressed (DE) genes

Look for supplementary data from the publication
- Usually they provide a list of all DE genes
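When a downloaded table has per-sample values but no experiment/control ratios, the ratios can be computed offline. A minimal sketch, assuming metadata lines start with `^`, `!` or `#`, the remaining lines form one tab-delimited table with a header row, and the values are positive, non-log-transformed intensities. The column names and numbers below are invented for illustration, and averaging replicates is a simplification, not a substitute for a proper DE analysis.

```python
import csv
import io
import math


def read_soft_table(handle):
    """Yield one dict per data row, skipping blank lines and lines starting with ^, ! or #."""
    data_lines = (line for line in handle if line.strip() and line[0] not in "^!#")
    yield from csv.DictReader(data_lines, delimiter="\t")


def log2_ratio(row, experiment_cols, control_cols):
    """log2(mean experiment / mean control) for one row of unlogged values."""
    exp = sum(float(row[c]) for c in experiment_cols) / len(experiment_cols)
    ctl = sum(float(row[c]) for c in control_cols) / len(control_cols)
    return math.log2(exp / ctl)


# Invented example data in a SOFT-like layout.
example = io.StringIO(
    "^DATASET = EXAMPLE\n"
    "!dataset_table_begin\n"
    "ID_REF\tIDENTIFIER\tctl_1\tctl_2\texp_1\texp_2\n"
    "probe_1\tC04F5.7\t100\t100\t400\t400\n"
    "probe_2\tgeneB\t200\t200\t100\t100\n"
    "!dataset_table_end\n"
)

ratios = {
    row["IDENTIFIER"]: log2_ratio(row, ["exp_1", "exp_2"], ["ctl_1", "ctl_2"])
    for row in read_soft_table(example)
}
print(ratios)
```

If the values in a dataset are already log-transformed, subtract the control mean from the experiment mean instead of dividing.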

Page 12: Working with gene lists: Finding data using GEO & BioMart June 5, 2014

Choice of dataset for comparison

In class demo

Page 13: Working with gene lists: Finding data using GEO & BioMart June 5, 2014

BioMart: EBI Ensembl

Use a series of menus
- Data source: organism (genes, variation, etc.)
- Filters: reduce the number of results
- Attributes: what data to return

Can set up very precise and multilayered queries
Can query across multiple organisms

Simple query: given a list of gene IDs, you can obtain attributes or sequences for the entire list

Tools
- ID converter: very useful, easy to use
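The same three menu choices (data source, filters, attributes) map onto the XML query that BioMart services accept. A minimal sketch that only builds the XML text. The dataset, filter and attribute names are assumptions to be checked against the mart's own listings, the gene IDs are placeholders, and the service endpoint is not given in the slides, so it is left out here.

```python
import xml.etree.ElementTree as ET


def build_biomart_query(dataset, filters, attributes):
    """Return a BioMart XML query string: one dataset, its filters, its attributes."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",       # tab-separated output
        "header": "1",            # include a header row
        "uniqueRows": "1",        # collapse duplicate rows
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    for name, value in filters.items():
        # A list of IDs is sent as one comma-separated filter value.
        if isinstance(value, (list, tuple)):
            value = ",".join(value)
        ET.SubElement(ds, "Filter", {"name": name, "value": value})
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


xml_query = build_biomart_query(
    dataset="celegans_gene_ensembl",                     # assumed dataset name
    filters={"ensembl_gene_id": ["WBGene_PLACEHOLDER_1", "WBGene_PLACEHOLDER_2"]},
    attributes=["ensembl_gene_id", "external_gene_name"],  # assumed attribute names
)
print(xml_query)
```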

Page 14: Working with gene lists: Finding data using GEO & BioMart June 5, 2014

Two sites for BioMart access

www.biomart.org

Page 15: Working with gene lists: Finding data using GEO & BioMart June 5, 2014

Database journal issue on BioMart

Page 17: Working with gene lists: Finding data using GEO & BioMart June 5, 2014

Filtering in BioMart

Page 18: Working with gene lists: Finding data using GEO & BioMart June 5, 2014

Attributes in BioMart

Page 19: Working with gene lists: Finding data using GEO & BioMart June 5, 2014

BioMart

Filters
- C. elegans genes with a human homolog
- Specify only genes with >= # isoforms
- Protein coding genes with a transmembrane domain

Attributes
- Entrez Gene IDs, WormBase IDs, Affy IDs
- Sequence data: transcript, protein, UTRs, flanking regions, etc.
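Once ID attributes are exported as a tab-delimited table, converting between ID types is a lookup. A minimal sketch, assuming a header row and that one source ID may map to several target IDs or to none. The column headers and IDs below are invented placeholders, not real BioMart output.

```python
import csv
import io
from collections import defaultdict


def build_id_map(tsv_text, source_col, target_col):
    """Map each source ID to a sorted list of target IDs, skipping empty cells."""
    mapping = defaultdict(set)
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        source, target = row[source_col].strip(), row[target_col].strip()
        if source and target:
            mapping[source].add(target)
    return {source: sorted(targets) for source, targets in mapping.items()}


# Invented export: WBGene_A has two Entrez IDs, WBGene_B has none.
export = (
    "WormBase ID\tEntrez ID\n"
    "WBGene_A\t1001\n"
    "WBGene_A\t1002\n"
    "WBGene_B\t\n"
    "WBGene_C\t1003\n"
)

id_map = build_id_map(export, "WormBase ID", "Entrez ID")
unmapped = [g for g in ["WBGene_A", "WBGene_B", "WBGene_C"] if g not in id_map]
print(id_map)
print(unmapped)
```

Keeping the unmapped IDs visible matters when comparing lists across experiments, since genes silently dropped during conversion look the same as genes that do not overlap.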

Page 20: Working with gene lists: Finding data using GEO & BioMart June 5, 2014

BioMart

In class demo

Page 21: Working with gene lists: Finding data using GEO & BioMart June 5, 2014

Today’s exercise

- Compare the current dataset from the PLoS Pathogens paper to data from a different dataset
- Identify & retrieve additional information about C. elegans genes using BioMart