A Bioinformatics Meta-analysis of Differentially Expressed Genes in Colorectal Cancer
Simon Chan, [email protected]
Thursday Trainee Seminar – October 11th, 2007
Introduction to Colorectal Cancer (CRC)
• Cancerous growths in the colon, rectum or appendix
• In 2007, an estimated 20,800 Canadians will be diagnosed with CRC and approximately 8,700 will die of it (Source: Canadian Cancer Society)
• Stages of CRC (Image Source: Cardoso J, et al. 2007)
High throughput gene expression analysis
• Many high throughput gene expression analyses have been performed and published:
– Cancer versus Normal
– Cancer versus Adenoma
– Adenoma versus Normal
• Various technologies used:
– Serial Analysis of Gene Expression (SAGE)
– Oligo-nucleotide microarrays
– cDNA microarrays
• Goal: To determine candidate diagnostic and prognostic molecular biomarkers in CRC
Problems
• Unfortunately, there is low overlap between expression profiling studies
• Why?
– Different methods of obtaining tissues (i.e. Laser Capture Microdissection vs. Microdissection)
– Tissue heterogeneity
– Inadequate sample numbers
– Use of different gene expression platforms (SAGE, microarray, etc.)
– Different statistical methods, fold change thresholds, etc. applied
• Questions:
– Which genes are actually differentially expressed in CRC?
– Which genes would make good CRC biomarkers?
One Solution
• Determine the intersection of a comprehensive collection of high throughput gene expression studies.
• Expect that genes biologically relevant to CRC will be reported the most often.
• System-specific spurious genes should be under-represented.
• However, the statistical significance of this overlap is often not considered
• A certain level of overlap among studies can be expected due to chance alone
Table source: Cardoso J et al, 2007
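To make the point about chance overlap concrete, here is a minimal Python sketch for the simplest case of two studies. It is an illustration only and not part of the analysis presented in this talk. It assumes both studies draw their gene lists uniformly at random from the same pool of genes. The function names and the numbers in the usage lines are hypothetical.

```python
from math import comb


def expected_overlap(n_genes: int, a: int, b: int) -> float:
    """Expected number of shared genes when two lists of sizes a and b are
    drawn at random, without replacement, from the same pool of n_genes."""
    return a * b / n_genes


def prob_overlap_at_least(n_genes: int, a: int, b: int, k: int) -> float:
    """Hypergeometric tail probability P(overlap >= k) under the same
    random-draw assumption."""
    total = comb(n_genes, b)
    upper = min(a, b)
    favourable = sum(
        comb(a, i) * comb(n_genes - a, b - i) for i in range(k, upper + 1)
    )
    return favourable / total


# Hypothetical numbers: two studies on a shared 10,000-gene platform,
# reporting 200 and 300 differentially expressed genes respectively.
print(expected_overlap(10_000, 200, 300))          # 6.0 genes expected by chance
print(prob_overlap_at_least(10_000, 200, 300, 15))
```

Even with purely random gene lists, a handful of shared genes is the expected outcome, so an observed overlap only means something when compared against this kind of baseline.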
Meta-analysis Method
• Developed a vote-counting strategy to rank differentially expressed genes based on the following criteria, in order of importance:
– Number of studies reporting a gene as differentially expressed
– Number of tissue samples showing this differential expression
– Fold Change of differential expression
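The three criteria above can be read as a lexicographic sort key. The following Python sketch shows one way to express that. It is not the code used for this work, and the `GeneRecord` type and `rank_genes` function are hypothetical names. It also assumes that a larger mean fold change ranks higher. The example values are taken from the results table later in this talk.

```python
from typing import List, NamedTuple


class GeneRecord(NamedTuple):
    symbol: str
    n_studies: int           # studies reporting the gene as differentially expressed
    n_samples: int           # total tissue samples across those studies
    mean_fold_change: float  # assumed here to be a magnitude (larger ranks higher)


def rank_genes(records: List[GeneRecord]) -> List[GeneRecord]:
    """Sort by number of studies, then sample size, then fold change,
    each in descending order, so earlier criteria dominate later ones."""
    return sorted(
        records,
        key=lambda r: (r.n_studies, r.n_samples, r.mean_fold_change),
        reverse=True,
    )


records = [
    GeneRecord("GDF15", 7, 230, 7.42),
    GeneRecord("MYC", 7, 329, 5.02),
    GeneRecord("IFITM1", 9, 351, 7.52),
    GeneRecord("SPARC", 7, 244, 6.30),
    GeneRecord("TGFβI", 9, 369, 8.94),
]
print([r.symbol for r in rank_genes(records)])
# ['TGFβI', 'IFITM1', 'MYC', 'SPARC', 'GDF15']
```

Because tuples compare element by element, fold change only breaks ties between genes that already agree on study count and sample size.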
Published Gene Expression Studies
• Collected 25 published gene expression studies
– 23 studies compared Cancer versus Normal
– 7 studies compared Adenoma versus Normal
– 5 studies compared Cancer versus Adenoma

| Platform | Count (Total: 25) |
|---|---|
| Commercial cDNA microarrays | 12 |
| Custom cDNA microarrays | 7 |
| Affymetrix oligo-nucleotide microarrays | 3 |
| Oligo-nucleotide microarrays | 2 |
| SAGE | 1 |
[Diagram: each study (Study 1, Study 2, ..., Study 25) yields two lists: a differentially expressed gene list and a platform gene list.]
Example
• Croner RS et al, 2005
– Compared Cancer versus Normal
– Utilized Affymetrix HG-U133A GeneChip
• Obtained platform annotation file for HG-U133A from Affymetrix website
– Mapped Affy probe IDs to Entrez Gene IDs (platform gene list)
• Mapped differentially expressed genes to Entrez Gene IDs (differentially expressed gene list)
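The probe-to-gene mapping step might look roughly like the Python sketch below. It is an illustration under stated assumptions, not the script used here. It assumes the annotation file is a CSV with a probe column and an Entrez Gene column, that multiple gene IDs in one cell are separated by `///`, and that comment lines start with `#`. The default column names, the function names and the toy probe IDs are all hypothetical and should be checked against the actual annotation file.

```python
import csv
import io
from typing import Dict, Iterable, Set


def probe_to_entrez(
    lines: Iterable[str],
    probe_col: str = "Probe Set ID",  # assumed column name
    gene_col: str = "Entrez Gene",    # assumed column name
) -> Dict[str, Set[str]]:
    """Map each probe ID to the set of numeric Entrez Gene IDs listed for it.
    Probes with no numeric gene ID are dropped."""
    data_lines = (line for line in lines if not line.startswith("#"))
    mapping: Dict[str, Set[str]] = {}
    for row in csv.DictReader(data_lines):
        ids = {g.strip() for g in row[gene_col].split("///")}
        ids = {g for g in ids if g.isdigit()}
        if ids:
            mapping[row[probe_col]] = ids
    return mapping


def platform_gene_list(mapping: Dict[str, Set[str]]) -> Set[str]:
    """All Entrez Gene IDs covered by the platform."""
    return set().union(*mapping.values()) if mapping else set()


# Toy annotation data (hypothetical probe IDs, gene IDs from the next slide).
toy = io.StringIO(
    "#comment line\n"
    '"Probe Set ID","Entrez Gene"\n'
    '"probe_a","759"\n'
    '"probe_b","10581 /// 11234"\n'
    '"probe_c","---"\n'
)
mapping = probe_to_entrez(toy)
print(sorted(platform_gene_list(mapping), key=int))
```

The same mapping can then be applied to the probes a study reports as differentially expressed, so that both lists for a study use Entrez Gene IDs.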
• Therefore, for each study, two files would be produced:
– File 1: All genes (represented by Entrez Gene IDs) covered on the platform:
  • 759
  • 10581
  • 11234
  • 76013
  • etc.
– File 2: Differentially expressed Entrez Gene IDs
  • 759 UP
  • 1434 DOWN
  • 1112 UP
  • etc.
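A minimal Python sketch of reading these two files is shown below. It assumes one entry per line, with the gene ID and its direction separated by whitespace in File 2. That layout and the function names are assumptions for illustration, not a description of the actual files.

```python
from typing import Dict, Iterable, Set


def read_platform_genes(lines: Iterable[str]) -> Set[str]:
    """File 1: one Entrez Gene ID per line."""
    return {line.strip() for line in lines if line.strip()}


def read_de_genes(lines: Iterable[str]) -> Dict[str, str]:
    """File 2: 'ENTREZ_ID DIRECTION' per line, where DIRECTION is UP or DOWN."""
    calls: Dict[str, str] = {}
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) != 2 or parts[1].upper() not in ("UP", "DOWN"):
            raise ValueError(f"unexpected line: {line!r}")
        calls[parts[0]] = parts[1].upper()
    return calls


platform = read_platform_genes(["759\n", "10581\n", "11234\n", "76013\n"])
de_genes = read_de_genes(["759 UP\n", "1434 DOWN\n", "1112 UP\n"])
n_up = sum(1 for d in de_genes.values() if d == "UP")
n_down = sum(1 for d in de_genes.values() if d == "DOWN")
print(len(platform), n_up, n_down)  # 4 2 1
```

The per-study counts of up-regulated and down-regulated genes are what the simulation step needs as input.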
Simulations
– Developed custom Perl scripts to perform Monte Carlo simulation.
– For 10,000 iterations,
  • For each study,
    – Determine number of up-regulated (X) and down-regulated (Y) genes reported in the study
    – Randomly choose X genes from the platform gene list and label as up-regulated
    – Randomly choose Y genes from the platform gene list and label as down-regulated
  • Determine number of overlapping genes across the studies in this simulation
– Calculate the average number of genes with overlap of 2, 3, 4, etc. and associated P-values
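The original simulation was written in Perl and is not shown in these slides. The Python sketch below is one possible reading of the procedure, resting on several assumptions: a gene counts as overlapping only when studies agree on direction, the X and Y genes within one study are drawn without repeats, "overlap of k" means reported by exactly k studies, and the P-value is the fraction of iterations whose multi-study total is at least the observed one. All names are hypothetical.

```python
import random
from collections import Counter
from typing import Dict, Iterable, List, Tuple

Study = Tuple[Iterable[str], int, int]  # (platform gene list, X up, Y down)


def simulate_overlap(
    studies: List[Study], n_iter: int = 10_000, seed: int = 0
) -> Tuple[Dict[int, float], List[int]]:
    """Return (average number of genes seen in exactly k studies for k >= 2,
    per-iteration count of genes seen in two or more studies)."""
    rng = random.Random(seed)
    prepared = [(sorted(p), x, y) for p, x, y in studies]  # sample() needs a sequence
    totals: Counter = Counter()
    multi_counts: List[int] = []
    for _ in range(n_iter):
        votes: Counter = Counter()  # (gene, direction) -> number of studies
        for platform, n_up, n_down in prepared:
            picked = rng.sample(platform, n_up + n_down)
            for gene in picked[:n_up]:
                votes[(gene, "UP")] += 1
            for gene in picked[n_up:]:
                votes[(gene, "DOWN")] += 1
        per_k = Counter(v for v in votes.values() if v >= 2)
        totals.update(per_k)
        multi_counts.append(sum(per_k.values()))
    mean_per_k = {k: totals[k] / n_iter for k in sorted(totals)}
    return mean_per_k, multi_counts


def empirical_p_value(observed_multi: int, multi_counts: List[int]) -> float:
    """Fraction of simulated iterations with at least the observed overlap."""
    hits = sum(1 for c in multi_counts if c >= observed_multi)
    return hits / len(multi_counts)


# Toy input: three hypothetical studies on a shared 50-gene platform.
genes = [str(i) for i in range(50)]
toy_studies = [(genes, 5, 5), (genes, 4, 6), (genes, 8, 2)]
mean_per_k, multi_counts = simulate_overlap(toy_studies, n_iter=1_000)
print(mean_per_k)
print(empirical_p_value(10, multi_counts))
```

With this estimator, a result of zero hits in 10,000 iterations would be reported as P < .0001 rather than P = 0.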
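[Figure: Cancer versus Normal. Bar chart of number of genes (y-axis, 0 to 450) by number of studies reporting the gene (x-axis: 2, 3, 4, 5, 6, 7, 9, 11), comparing Actual Overlap with Simulation. Data labels surviving extraction: 410, 95, 30, 20, 10, 5 (integer counts) and 258.3, 18.37, 1.14 (decimal averages); a final label "12" is ambiguous.]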
Summary of Comparisons Analyzed for Overlap
| Comparison | Total Num of Studies | Total Num of Differentially Expressed Genes Reported (mapped) | Total Num of Differentially Expressed Genes with Multi-study Confirmation | P-value |
|---|---|---|---|---|
| Cancer versus Normal | 23 | 6537 (5886) | 573 | < .0001 |
| Adenoma versus Normal | 7 | 1101 (986) | 39 | < .0001 |
| Cancer versus Adenoma | 5 | 538 (415) | 5 | .08 |
| Gene Name | Description | Studies Reporting this Gene | Total Sample Sizes | Mean Fold Change | Validation |
|---|---|---|---|---|---|
| TGFβI | Transforming growth factor, beta induced, 68kDa | 9 | 369 | 8.94 | RT-PCR |
| IFITM1 | Interferon induced transmembrane protein 1 (9-27) | 9 | 351 | 7.52 | RT-PCR |
| MYC | V-myc myelocytomatosis viral oncogene homolog (avian) | 7 | 329 | 5.02 | RT-PCR |
| SPARC | Secreted protein, acidic, cysteine-rich (osteonectin) | 7 | 244 | 6.30 | IHC |
| GDF15 | Growth differentiation factor 15 | 7 | 230 | 7.42 | RT-PCR |
Future Studies
• Purchased antibodies for certain high ranking candidates
• Validate protein expression level on colorectal tissue microarrays
• Correlate to certain prognostic outcomes
Conclusions
• Low overlap of results between many colorectal cancer high throughput gene expression studies
• Meta-analysis method identified consistently reported differentially expressed genes
• Cancer versus Normal and Adenoma versus Normal, but not Cancer versus Adenoma, studies resulted in genes consistently reported at a statistically significant frequency
Acknowledgements:
• Dr. Steven Jones
• Dr. Isabella Tai
• Obi Griffith
• Chan SK, Griffith OL, Tai IT, Jones SJM. Meta-analysis of Colorectal Cancer Gene Expression Profiling Studies Identifies Consistently Reported Candidate Biomarkers. Manuscript in review with Cancer Epidemiology, Biomarkers & Prevention.