NCBI resources III: GEO and expression data analysis
![Page 1: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/1.jpg)
NCBI resources III: GEO and expression data analysis
Yanbin Yin
Fall 2014
![Page 2: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/2.jpg)
Homework assignment 2
• Given the publication http://www.ncbi.nlm.nih.gov/pubmed/19723656, find the GEO datasets that are associated with the paper.
• Choose the first data series and perform a GEO2R analysis.
• Find the top two differentially expressed genes, search for their gene symbols in the Gene database, and explain what they are.
• Write a report (in Word or PPT) that includes all the operations and screenshots.
Due on 9/23 (send by email or bring a printed hard copy to class)
Office hours: Tue, Thu and Fri 2-4pm, MO325A
Or email: [email protected]
![Page 3: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/3.jpg)
Gene Expression Omnibus (GEO)
http://www.ncbi.nlm.nih.gov/geo/
GEO is an international public repository that archives and freely distributes microarray, next-generation sequencing, and other forms of high-throughput functional genomics data submitted by the research community.
The three main goals of GEO are to:
• Provide a robust, versatile database in which to efficiently store high-throughput functional genomic data
• Offer simple submission procedures and formats that support complete and well-annotated data deposits from the research community
• Provide user-friendly mechanisms that allow users to query, locate, review and download studies and gene expression profiles of interest (query and analysis)
![Page 4: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/4.jpg)
Basic intro to microarray
Cyanine
![Page 5: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/5.jpg)
People are moving from microarray to high-throughput sequencing
![Page 6: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/6.jpg)
http://jermdemo.blogspot.com/2012/01/when-can-we-expect-last-damn-microarray.html
When can we expect the last microarray paper?
![Page 7: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/7.jpg)
What data does GEO have?
• Submitter supplied: Platform, Sample, Series
• NCBI curated: DataSets and Profiles
• Tools: GEO BLAST and GEO2R
http://www.ncbi.nlm.nih.gov/geo/
Omics data: Genomics, Transcriptomics, Epigenomics, Proteomics, …
![Page 8: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/8.jpg)
GEO accession number
(GPLxxx)
GSMxxx
GSExxx
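As a rough illustration of how the accession prefixes map to record types, the sketch below classifies an accession string by its prefix. The function name `classify_accession` is made up for this example, and the GDS prefix (curated DataSets) is included here as an assumption about what a reader will also want to recognise.

```python
import re

# Prefix -> record type, following the GEO accession naming (GPL, GSM, GSE, GDS).
GEO_PREFIXES = {
    "GPL": "Platform",
    "GSM": "Sample",
    "GSE": "Series",
    "GDS": "DataSet",
}

# An accession is taken to be a known prefix followed by digits only.
_ACCESSION_RE = re.compile(r"^(GPL|GSM|GSE|GDS)(\d+)$")


def classify_accession(accession):
    """Return the GEO record type for an accession, or None if it is not recognised."""
    match = _ACCESSION_RE.match(accession.strip().upper())
    if match is None:
        return None
    return GEO_PREFIXES[match.group(1)]
```

For example, `classify_accession("GSE1234")` gives `"Series"`, while a string that does not follow the pattern gives `None`.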
![Page 9: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/9.jpg)
Microarray
NGS
![Page 10: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/10.jpg)
Expression
Genome variation
DNA-binding
Methylation/Epigenomics
Protein array
ncRNAs
![Page 11: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/11.jpg)
![Page 12: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/12.jpg)
![Page 13: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/13.jpg)
Platform, Sample, Series
Experiment centric
Data of a GEO Series are reassembled by GEO staff into GEO DataSet records (GDSxxx).
A DataSet represents a curated collection of biologically and statistically comparable GEO Samples and forms the basis of GEO's suite of data display and analysis tools.
Not all submitted data are suitable for DataSet assembly, so not all Series have corresponding DataSet record(s).
Profiles are derived from DataSets
A Profile consists of the expression measurements for an individual gene across all Samples in a DataSet.
Gene centric
http://www.ncbi.nlm.nih.gov/geo/info/overview.html
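To make the DataSet/Profile relationship concrete, here is a minimal sketch that treats a DataSet as a gene-by-Sample table and a Profile as one row of it. The gene names, Sample labels and values are all invented for illustration, and `get_profile` is a hypothetical helper, not a GEO function.

```python
# Hypothetical DataSet: one list of expression values per gene,
# with one value for each Sample (same order as `samples`).
samples = ["GSM_a", "GSM_b", "GSM_c"]
dataset = {
    "geneA": [5.1, 7.3, 9.8],
    "geneB": [2.0, 2.1, 1.9],
}


def get_profile(dataset, samples, gene):
    """Return one gene's measurements across all Samples as {sample: value}."""
    values = dataset[gene]
    if len(values) != len(samples):
        raise ValueError("expected exactly one value per Sample")
    return dict(zip(samples, values))


profile = get_profile(dataset, samples, "geneA")
```

The DataSet view answers "what happened in this experiment?", while the Profile view answers "how did this one gene behave across the experiment?".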
![Page 14: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/14.jpg)
Hands on exercise 1
GEO browse and query
![Page 15: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/15.jpg)
http://www.ncbi.nlm.nih.gov/geo/
![Page 16: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/16.jpg)
Try: cancer, colon cancer, arabidopsis, ecoli
These are only DataSets
Type the keyword in the search box and click search
![Page 17: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/17.jpg)
stem development AND arabidopsis[organism]
Construct queries to narrow down the results:
term [field] OPERATOR term [field]
![Page 18: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/18.jpg)
term [field] OPERATOR term [field]
http://www.ncbi.nlm.nih.gov/geo/info/qqtutorial.html
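The `term [field] OPERATOR term [field]` pattern can also be assembled programmatically. The sketch below builds such a query string and, as an assumption beyond the slides, wraps it in an NCBI E-utilities ESearch URL against the `gds` database (the Entrez name for GEO DataSets). The helper names `build_query` and `esearch_url` are made up, and the URL is only constructed, not requested.

```python
from urllib.parse import urlencode


def build_query(terms, operator="AND"):
    """Join (term, field) pairs into 'term[field] OPERATOR term[field]'.

    `field` may be None for an unrestricted term.
    """
    if operator not in {"AND", "OR", "NOT"}:
        raise ValueError("operator must be AND, OR or NOT")
    parts = []
    for term, field in terms:
        parts.append(f"{term}[{field}]" if field else term)
    return f" {operator} ".join(parts)


def esearch_url(query, db="gds"):
    """Build (but do not send) an E-utilities ESearch URL for the query."""
    base = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
    return base + "?" + urlencode({"db": db, "term": query})


query = build_query([("stem development", None), ("arabidopsis", "organism")])
```

Here `query` is the same text used in the hands-on exercise, `stem development AND arabidopsis[organism]`; restricting the first term to the title field would be `("stem development", "title")`.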
![Page 19: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/19.jpg)
![Page 20: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/20.jpg)
Hands on exercise 2
GEO gene profiles
![Page 21: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/21.jpg)
Search for a gene: GAUT1
![Page 22: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/22.jpg)
![Page 23: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/23.jpg)
Click here
Scroll down to find record 17
![Page 24: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/24.jpg)
![Page 25: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/25.jpg)
Profile neighbors: what are the co-expressed genes sharing similar expression profiles?
Go back to result page
![Page 26: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/26.jpg)
Chromosome neighbors: are neighboring genes co-expressed?
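One common way to put a number on "similar expression profiles" is the Pearson correlation between two profiles. The sketch below ranks candidate genes by their correlation with a query profile. It is only an illustration of the idea of co-expression: the slides do not say which similarity measure GEO uses, and the gene names and values are invented.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical profiles measured over the same four Samples.
query_profile = [1.0, 2.0, 3.0, 4.0]
candidates = {
    "geneX": [2.0, 4.0, 6.0, 8.0],  # rises with the query profile
    "geneY": [4.0, 3.0, 2.0, 1.0],  # falls as the query profile rises
}

# Most similar profile first.
ranked = sorted(
    candidates,
    key=lambda gene: pearson(query_profile, candidates[gene]),
    reverse=True,
)
```

The same function could be applied to genes that sit next to each other on a chromosome to ask whether physical neighbors are also co-expressed.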
![Page 27: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/27.jpg)
Hands on exercise 3
GEO DataSets analysis tool
![Page 28: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/28.jpg)
stem development AND arabidopsis[organism]
Click on 893
![Page 29: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/29.jpg)
![Page 30: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/30.jpg)
We want to use this DataSet to identify differentially expressed genes in stem development.
How: define two groups of samples and run a two-sample t-test
Click on step 2 to define two groups of samples
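For readers who want to see what a two-sample t-test computes for a single gene, here is a minimal sketch of the t statistic. It uses Welch's unequal-variance form, which is an assumption made for this example; the slides do not specify which variant the GEO tool applies, and the numbers are invented.

```python
import math
from statistics import mean, variance


def welch_t(group1, group2):
    """Welch's t statistic for one gene measured in two groups of samples."""
    if len(group1) < 2 or len(group2) < 2:
        raise ValueError("each group needs at least two samples")
    # Standard error of the difference between the two group means.
    se = math.sqrt(variance(group1) / len(group1) + variance(group2) / len(group2))
    if se == 0:
        raise ValueError("both groups are constant; t is undefined")
    return (mean(group1) - mean(group2)) / se


# Hypothetical expression values for one gene in two groups of samples.
t_stat = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
```

A large absolute value of the statistic suggests the gene is expressed differently in the two groups. Turning it into a p-value additionally requires the t distribution, which is left to a statistics package.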
![Page 31: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/31.jpg)
Click samples to select
![Page 32: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/32.jpg)
Step 1: you can choose different statistical methods for the analysis
Step 3: perform the analysis
![Page 33: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/33.jpg)
Group 1
Group 2
The result page lists the genes with significantly different expression between the two groups of samples
![Page 34: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/34.jpg)
GEO2R: differentially expressed genes
http://www.youtube.com/watch?v=EUPmGWS8ik0
“Analyze DataSet” is for GEO DataSets
“GEO2R” is for GEO Series
![Page 35: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/35.jpg)
stem development AND arabidopsis[organism]
Click on 893
![Page 36: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/36.jpg)
![Page 37: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/37.jpg)
Hard to choose? Let’s modify the query text to narrow down the results
stem development[title] AND arabidopsis[organism]
Click on the title to get detailed info about this data series
![Page 38: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/38.jpg)
Description of experiments
Platform and sample data
![Page 39: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/39.jpg)
• Click on Define groups and type in the group names
• Select samples from the table and click on a defined group to assign them to that group
• Click on Top 250 at the bottom of the page to run the job
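The idea behind a "Top 250" list can be sketched as sorting per-gene results by p-value and keeping the first n. This is only a toy stand-in: GEO2R produces its table with its own R pipeline, and the record layout, probe IDs and p-values below are invented for illustration.

```python
def top_genes(results, n=250):
    """Return the n records with the smallest p-values, smallest first."""
    return sorted(results, key=lambda record: record["p_value"])[:n]


# Hypothetical per-probe results from a two-group comparison.
results = [
    {"id": "probe_1", "p_value": 0.20},
    {"id": "probe_2", "p_value": 0.001},
    {"id": "probe_3", "p_value": 0.03},
]

top_two = top_genes(results, n=2)
```

With these made-up values, `top_two` holds `probe_2` followed by `probe_3`, which mirrors the "top two differentially expressed genes" step of the homework.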
![Page 40: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/40.jpg)
On the result page, clicking on an ID displays the expression graph for that gene
The 4 groups have different profiles for each gene
![Page 41: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/41.jpg)
FTP stands for File Transfer Protocol.
HTTP stands for Hypertext Transfer Protocol.
When ftp appears in a URL, it means that the user is connecting to a file server rather than a Web server, and that some form of file transfer is going to take place.
When http appears in a URL, it means that the user is connecting to a Web server rather than a file server. Files are still transferred to the receiving device, but a browser usually displays them directly instead of saving them to disk.
http://wiki.answers.com/Q/What_is_the_difference_between_FTP_and_HTTP
ftp
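The protocol is simply the part of the URL before `://`, so it can be read off programmatically. The sketch below uses Python's standard `urllib.parse` to do that; `server_kind` is a made-up helper name and the returned labels are just descriptive strings.

```python
from urllib.parse import urlparse


def server_kind(url):
    """Describe the kind of server a URL points to, based on its scheme."""
    scheme = urlparse(url).scheme.lower()
    if scheme == "ftp":
        return "file server (FTP)"
    if scheme in ("http", "https"):
        return "Web server (HTTP)"
    return "unknown"


geo_site = server_kind("http://www.ncbi.nlm.nih.gov/geo/")
ncbi_ftp = server_kind("ftp://ftp.ncbi.nlm.nih.gov/")
```

Here `geo_site` is labelled as a Web server and `ncbi_ftp` as a file server.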
![Page 42: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/42.jpg)
ftp server of NCBI
![Page 43: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/43.jpg)
ftp resources
• RefSeq genomes, proteins, mRNAs
• Microbial genomes
• Plant genomes
• Fungal genomes
• BLAST database folder
• SRA reads
• GEO datasets
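As a hedged example of locating GEO data on the ftp server, the sketch below builds the directory path for a Series accession. It assumes the commonly documented layout in which Series are grouped into folders named by replacing the last three digits with `nnn` (for example `GSE1nnn`); that layout is not shown on the slides and should be checked against the live server. `geo_series_dir` is a hypothetical helper, and nothing is downloaded.

```python
import re

# Assumed host name of the NCBI ftp server.
NCBI_FTP_HOST = "ftp.ncbi.nlm.nih.gov"


def geo_series_dir(accession):
    """Return the assumed ftp directory for a GEO Series accession."""
    if re.fullmatch(r"GSE\d+", accession) is None:
        raise ValueError("expected an accession of the form GSE<digits>")
    digits = accession[3:]
    # Drop the last three digits and append 'nnn' to get the grouping folder.
    stub = "GSE" + digits[:-3] + "nnn"
    return f"/geo/series/{stub}/{accession}/"


def geo_series_url(accession):
    """Full ftp URL for the assumed Series directory."""
    return f"ftp://{NCBI_FTP_HOST}{geo_series_dir(accession)}"
```

For instance, `geo_series_dir("GSE1234")` yields `/geo/series/GSE1nnn/GSE1234/` under this assumed layout.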
![Page 44: NCBI resources III: GEO and expression data analysis](https://reader030.vdocument.in/reader030/viewer/2022032313/56812b48550346895d8f679f/html5/thumbnails/44.jpg)
Next lecture: EBI resources I