![Page 1: Normalization for cDNA Microarray Data Yee Hwa Yang, Sandrine Dudoit, Percy Luu and Terry Speed. SPIE BIOS 2001, San Jose, CA January 22, 2001](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d3a5503460f94a144c9/html5/thumbnails/1.jpg)
Normalization for cDNA Microarray Data
Yee Hwa Yang, Sandrine Dudoit, Percy Luu and Terry Speed.
SPIE BIOS 2001, San Jose, CA
January 22, 2001
![Page 2: Normalization for cDNA Microarray Data Yee Hwa Yang, Sandrine Dudoit, Percy Luu and Terry Speed. SPIE BIOS 2001, San Jose, CA January 22, 2001](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d3a5503460f94a144c9/html5/thumbnails/2.jpg)
Normalization issues
Within-slide
– What genes to use
– Location
– Scale
Paired-slides (dye swap)
– Self-normalization
Between slides
![Page 3: Normalization for cDNA Microarray Data Yee Hwa Yang, Sandrine Dudoit, Percy Luu and Terry Speed. SPIE BIOS 2001, San Jose, CA January 22, 2001](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d3a5503460f94a144c9/html5/thumbnails/3.jpg)
Within-Slide Normalization
—Normalization balances red and green intensities.
—Imbalances can be caused by
– Different incorporation of dyes
– Different amounts of mRNA
– Different scanning parameters
—In practice, we usually need to increase the red intensity a bit to balance the green.
![Page 4: Normalization for cDNA Microarray Data Yee Hwa Yang, Sandrine Dudoit, Percy Luu and Terry Speed. SPIE BIOS 2001, San Jose, CA January 22, 2001](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d3a5503460f94a144c9/html5/thumbnails/4.jpg)
Methods?
log2(R/G) -> log2(R/G) - c = log2(R/(kG)), where c = log2(k)
Standard practice (in most software):
c is a constant such that normalized log-ratios have zero mean or median.
Our preference:
c is a function of overall spot intensity and print-tip-group.
What genes to use?
— All genes on the array
— Constantly expressed genes (housekeeping)
— Controls
– Spiked controls (e.g. plant genes)
– Genomic DNA titration series
— Other set of genes
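A minimal sketch of the constant-c approach described on this slide, assuming the median is used as the centering statistic. The function names, the `use` argument (a hypothetical way to restrict the estimate to housekeeping genes or controls) and the intensities are illustrative only and do not come from the talk.

```python
import math
from statistics import median

def log_ratios(red, green):
    """Return M = log2(R/G) for each spot."""
    return [math.log2(r / g) for r, g in zip(red, green)]

def constant_normalize(red, green, use=None):
    """Subtract a single constant c from every log-ratio.

    `use` is an optional list of spot indices (hypothetical: e.g. the
    housekeeping genes or controls); by default all spots are used.
    """
    m = log_ratios(red, green)
    chosen = m if use is None else [m[i] for i in use]
    c = median(chosen)
    return [x - c for x in m], c

# Made-up intensities: green is twice as bright as red for most spots,
# and the last spot is a genuinely changed gene.
red = [100.0, 200.0, 400.0, 800.0, 6400.0]
green = [200.0, 400.0, 800.0, 1600.0, 3200.0]

normalized, c = constant_normalize(red, green)

# Subtracting c is the same as rescaling green by k = 2**c.
k = 2 ** c
rescaled = [math.log2(r / (k * g)) for r, g in zip(red, green)]
```

With these numbers the unchanged spots end up at a log-ratio of zero and the changed spot keeps its two-fold offset relative to them.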
![Page 5: Normalization for cDNA Microarray Data Yee Hwa Yang, Sandrine Dudoit, Percy Luu and Terry Speed. SPIE BIOS 2001, San Jose, CA January 22, 2001](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d3a5503460f94a144c9/html5/thumbnails/5.jpg)
Experiment
KO #8
Probes: ~6,000 cDNAs, including 200 related to lipid metabolism.
mRNA samples
R = Apo A1 KO mouse liver
G = Control mouse liver
(All C57Bl/6)
![Page 6: Normalization for cDNA Microarray Data Yee Hwa Yang, Sandrine Dudoit, Percy Luu and Terry Speed. SPIE BIOS 2001, San Jose, CA January 22, 2001](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d3a5503460f94a144c9/html5/thumbnails/6.jpg)
M vs. A
M = log2(R / G)
A = log2(R*G) / 2
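As a small illustration of these two definitions, the sketch below computes M and A from red and green intensities and inverts them again. The function names and the intensities are made up for the example.

```python
import math

def ma_values(red, green):
    """Return (M, A): M = log2(R/G), A = log2(R*G)/2 for each spot."""
    m = [math.log2(r / g) for r, g in zip(red, green)]
    a = [math.log2(r * g) / 2 for r, g in zip(red, green)]
    return m, a

def to_intensities(m, a):
    """Invert the transform: log2 R = A + M/2 and log2 G = A - M/2."""
    red = [2 ** (ai + mi / 2) for mi, ai in zip(m, a)]
    green = [2 ** (ai - mi / 2) for mi, ai in zip(m, a)]
    return red, green

# Made-up intensities (powers of two keep the arithmetic exact).
red = [1024.0, 256.0]
green = [256.0, 256.0]

m, a = ma_values(red, green)        # m == [2.0, 0.0], a == [9.0, 8.0]
red_back, green_back = to_intensities(m, a)
```

M is the log fold change and A is the average log intensity, so the M vs. A plot is the log2 R vs. log2 G scatter plot rotated by 45 degrees and rescaled.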
![Page 7: Normalization for cDNA Microarray Data Yee Hwa Yang, Sandrine Dudoit, Percy Luu and Terry Speed. SPIE BIOS 2001, San Jose, CA January 22, 2001](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d3a5503460f94a144c9/html5/thumbnails/7.jpg)
Normalization - Median
—Assumption: Changes roughly symmetric
—First panel: smooth density of log2G and log2R.
—Second panel: M vs. A plot with median set to zero
![Page 8: Normalization for cDNA Microarray Data Yee Hwa Yang, Sandrine Dudoit, Percy Luu and Terry Speed. SPIE BIOS 2001, San Jose, CA January 22, 2001](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d3a5503460f94a144c9/html5/thumbnails/8.jpg)
Normalization - lowess
— Global lowess
— Assumption: changes roughly symmetric at all intensities.
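A rough sketch of intensity-dependent normalization, M -> M - c(A), where c(A) is a smooth fit of M on A. The smoother here is a simplified stand-in for lowess (tricube-weighted local linear fits, without the robustness iterations of full lowess), written with the standard library only. The function names, the `frac` default and the data are assumptions made for this example, not the settings used in the talk.

```python
import math

def lowess_fit(x, y, frac=0.4):
    """Simplified lowess: tricube-weighted local linear fit at each x[i].

    No robustness iterations, so outliers are not down-weighted.
    """
    n = len(x)
    k = max(2, math.ceil(frac * n))          # points in each local window
    fitted = []
    for xi in x:
        dist = [abs(xj - xi) for xj in x]
        h = sorted(dist)[k - 1] or 1e-12     # distance to the k-th nearest point
        w = [(1 - min(d / h, 1.0) ** 3) ** 3 for d in dist]
        sw = sum(w)
        mx = sum(wj * xj for wj, xj in zip(w, x)) / sw
        my = sum(wj * yj for wj, yj in zip(w, y)) / sw
        sxx = sum(wj * (xj - mx) ** 2 for wj, xj in zip(w, x))
        sxy = sum(wj * (xj - mx) * (yj - my) for wj, xj, yj in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (xi - mx))
    return fitted

def lowess_normalize(m, a, frac=0.4):
    """Subtract the fitted trend c(A) from each log-ratio M."""
    trend = lowess_fit(a, m, frac)
    return [mi - ti for mi, ti in zip(m, trend)]

# Made-up data with a purely intensity-dependent dye bias.
a_vals = [6 + 0.5 * i for i in range(20)]
m_vals = [0.3 * a - 2.5 for a in a_vals]

normalized = lowess_normalize(m_vals, a_vals)
```

Because the toy bias is exactly linear in A, the local fits recover it and the normalized log-ratios are zero up to rounding. On real data a library lowess with robustness iterations would be the more usual choice.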
![Page 9: Normalization for cDNA Microarray Data Yee Hwa Yang, Sandrine Dudoit, Percy Luu and Terry Speed. SPIE BIOS 2001, San Jose, CA January 22, 2001](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d3a5503460f94a144c9/html5/thumbnails/9.jpg)
Normalisation - print-tip-group
Assumption: For every print-tip-group, changes roughly symmetric at all intensities.
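The idea of fitting the intensity-dependent trend separately within each print-tip-group can be sketched as follows. To keep the example short, the smoother is pluggable and the default is an ordinary least-squares line, which is only a stand-in for lowess. All names and numbers are illustrative assumptions.

```python
from collections import defaultdict

def line_fit(x, y):
    """Least-squares line evaluated at x; a stand-in for a lowess smoother."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx if sxx > 0 else 0.0
    return [my + slope * (xi - mx) for xi in x]

def print_tip_normalize(m, a, tip, smoother=line_fit):
    """Subtract a separately fitted trend c_i(A) within each print-tip-group."""
    groups = defaultdict(list)
    for idx, t in enumerate(tip):
        groups[t].append(idx)
    out = [0.0] * len(m)
    for members in groups.values():
        trend = smoother([a[i] for i in members], [m[i] for i in members])
        for i, t in zip(members, trend):
            out[i] = m[i] - t
    return out

# Made-up data: two print-tip-groups with different intensity-dependent biases.
a_vals = [8.0, 9.0, 10.0, 11.0, 8.0, 9.0, 10.0, 11.0]
tip = [1, 1, 1, 1, 2, 2, 2, 2]
m_vals = [0.2 * a - 1.0 for a in a_vals[:4]] + [-0.1 * a + 2.0 for a in a_vals[4:]]

normalized = print_tip_normalize(m_vals, a_vals, tip)
```

A single global fit could not remove both biases at once, since the two groups trend in opposite directions. Fitting per group removes each one.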
![Page 10: Normalization for cDNA Microarray Data Yee Hwa Yang, Sandrine Dudoit, Percy Luu and Terry Speed. SPIE BIOS 2001, San Jose, CA January 22, 2001](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d3a5503460f94a144c9/html5/thumbnails/10.jpg)
M vs. A - after print-tip-group normalization
![Page 11: Normalization for cDNA Microarray Data Yee Hwa Yang, Sandrine Dudoit, Percy Luu and Terry Speed. SPIE BIOS 2001, San Jose, CA January 22, 2001](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d3a5503460f94a144c9/html5/thumbnails/11.jpg)
Effects of Location Normalisation
Panels: before normalisation; after print-tip-group normalisation
![Page 12: Normalization for cDNA Microarray Data Yee Hwa Yang, Sandrine Dudoit, Percy Luu and Terry Speed. SPIE BIOS 2001, San Jose, CA January 22, 2001](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d3a5503460f94a144c9/html5/thumbnails/12.jpg)
Within print-tip-group box plots for print-tip-group normalized M
![Page 13: Normalization for cDNA Microarray Data Yee Hwa Yang, Sandrine Dudoit, Percy Luu and Terry Speed. SPIE BIOS 2001, San Jose, CA January 22, 2001](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d3a5503460f94a144c9/html5/thumbnails/13.jpg)
Taking scale into account
Assumptions:
– All print-tip-groups have the same spread.
True log-ratio is $\mu_{ij}$, where $i$ indexes print-tip-groups and $j$ indexes spots.
Observed is $M_{ij}$, where
$M_{ij} = a_i \mu_{ij}$
Robust estimate of $a_i$ is
$\hat{a}_i = \mathrm{MAD}_i \Big/ \sqrt[I]{\prod_{k=1}^{I} \mathrm{MAD}_k}$
where $I$ is the number of print-tip-groups and
$\mathrm{MAD}_i = \mathrm{median}_j \{ |M_{ij} - \mathrm{median}_j(M_{ij})| \}$
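A minimal sketch of this scale adjustment, assuming the log-ratios have already been location-normalized and that every group has a nonzero MAD. Each group's scale factor is its MAD divided by the geometric mean of all group MADs, and the group's log-ratios are divided by that factor. The function names and toy values are assumptions for illustration.

```python
import math
from statistics import median

def mad(values):
    """Median absolute deviation from the median."""
    med = median(values)
    return median([abs(v - med) for v in values])

def scale_factors(groups):
    """Estimate a_i = MAD_i / geometric mean of all MADs.

    `groups` maps a print-tip-group label to its list of log-ratios.
    Assumes every MAD is strictly positive.
    """
    mads = {t: mad(vals) for t, vals in groups.items()}
    log_gm = sum(math.log(x) for x in mads.values()) / len(mads)
    gm = math.exp(log_gm)
    return {t: mads[t] / gm for t in mads}

def scale_normalize(groups):
    """Divide each group's log-ratios by its estimated scale factor."""
    a_hat = scale_factors(groups)
    return {t: [v / a_hat[t] for v in vals] for t, vals in groups.items()}

# Made-up log-ratios: group 2 is four times as spread out as group 1.
groups = {
    1: [-2.0, -1.0, 0.0, 1.0, 2.0],
    2: [-8.0, -4.0, 0.0, 4.0, 8.0],
}
a_hat = scale_factors(groups)
scaled = scale_normalize(groups)
```

The estimated factors multiply to one by construction, so the adjustment equalizes the spread across groups without changing the overall scale of the slide.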
![Page 14: Normalization for cDNA Microarray Data Yee Hwa Yang, Sandrine Dudoit, Percy Luu and Terry Speed. SPIE BIOS 2001, San Jose, CA January 22, 2001](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d3a5503460f94a144c9/html5/thumbnails/14.jpg)
Effect of location + scale normalization
![Page 15: Normalization for cDNA Microarray Data Yee Hwa Yang, Sandrine Dudoit, Percy Luu and Terry Speed. SPIE BIOS 2001, San Jose, CA January 22, 2001](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d3a5503460f94a144c9/html5/thumbnails/15.jpg)
Effect of location + scale normalization
![Page 16: Normalization for cDNA Microarray Data Yee Hwa Yang, Sandrine Dudoit, Percy Luu and Terry Speed. SPIE BIOS 2001, San Jose, CA January 22, 2001](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d3a5503460f94a144c9/html5/thumbnails/16.jpg)
Comparing different normalisation methods
![Page 17: Normalization for cDNA Microarray Data Yee Hwa Yang, Sandrine Dudoit, Percy Luu and Terry Speed. SPIE BIOS 2001, San Jose, CA January 22, 2001](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d3a5503460f94a144c9/html5/thumbnails/17.jpg)
Follow-up Experiment
— 50 distinct clones with largest absolute t-statistics from the first experiment.
— 72 other clones.
— Spot each clone 8 times.
— Two hybridizations:
Slide 1: treatment (ttt) -> red, control (ctl) -> green.
Slide 2: treatment (ttt) -> green, control (ctl) -> red.
![Page 18: Normalization for cDNA Microarray Data Yee Hwa Yang, Sandrine Dudoit, Percy Luu and Terry Speed. SPIE BIOS 2001, San Jose, CA January 22, 2001](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d3a5503460f94a144c9/html5/thumbnails/18.jpg)
Follow-up Experiment
![Page 19: Normalization for cDNA Microarray Data Yee Hwa Yang, Sandrine Dudoit, Percy Luu and Terry Speed. SPIE BIOS 2001, San Jose, CA January 22, 2001](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d3a5503460f94a144c9/html5/thumbnails/19.jpg)
Paired-slides: dye swap
— Slide 1, M = log2 (R/G) - c
— Slide 2, M’ = log2 (R’/G’) - c’
Combine by subtracting the normalized log-ratios:
[ (log2 (R/G) - c) - (log2 (R’/G’) - c’) ] / 2
= [ log2 (R/G) + log2 (G’/R’) ] / 2
= [ log2 (RG’/GR’) ] / 2
provided c = c’
Assumption: the separate normalizations are the same.
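The cancellation can be checked numerically. The sketch below assumes a made-up spot with a true four-fold change and a red channel that reads at half efficiency on both slides, so that c = c’. The function name and all numbers are illustrative.

```python
import math

def self_normalize(r1, g1, r2, g2):
    """Return (M - M')/2 per spot for a dye-swap pair of slides.

    (r1, g1) are red/green intensities on slide 1, (r2, g2) on slide 2,
    where the two samples have swapped dyes.
    """
    return [
        (math.log2(a / b) - math.log2(c / d)) / 2
        for a, b, c, d in zip(r1, g1, r2, g2)
    ]

# One spot: treatment is truly 4x control, red reads at half efficiency.
# Slide 1: treatment in red   -> R  = 400 * 0.5 = 200, G  = 100
# Slide 2: treatment in green -> R' = 100 * 0.5 = 50,  G' = 400
r1, g1 = [200.0], [100.0]
r2, g2 = [50.0], [400.0]

combined = self_normalize(r1, g1, r2, g2)

# Equivalent single-log form: log2(R G' / (G R')) / 2
single_log = [math.log2(a * d / (b * c)) / 2 for a, b, c, d in zip(r1, g1, r2, g2)]
```

Each slide alone is off by one log2 unit (M = 1 and M’ = -3 rather than 2 and -2), but the half-difference recovers log2(4) = 2 without estimating c.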
![Page 20: Normalization for cDNA Microarray Data Yee Hwa Yang, Sandrine Dudoit, Percy Luu and Terry Speed. SPIE BIOS 2001, San Jose, CA January 22, 2001](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d3a5503460f94a144c9/html5/thumbnails/20.jpg)
Verify Assumption
![Page 21: Normalization for cDNA Microarray Data Yee Hwa Yang, Sandrine Dudoit, Percy Luu and Terry Speed. SPIE BIOS 2001, San Jose, CA January 22, 2001](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d3a5503460f94a144c9/html5/thumbnails/21.jpg)
Result of Self-Normalization
Plot of (M - M’)/2 vs. (A + A’)/2
![Page 22: Normalization for cDNA Microarray Data Yee Hwa Yang, Sandrine Dudoit, Percy Luu and Terry Speed. SPIE BIOS 2001, San Jose, CA January 22, 2001](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d3a5503460f94a144c9/html5/thumbnails/22.jpg)
Summary
Case 1: A few genes that are likely to change
Within-slide:
– Location: print-tip-group lowess normalization.
– Scale: for all print-tip-groups, adjust MAD to equal the geometric mean of the MADs of all print-tip-groups.
Between slides (experiments):
– An extension of within-slide scale normalization (future work).
Case 2: Many genes changing (paired-slides)
– Self-normalization: taking the difference of the two log-ratios.
– Check using controls or known information.
![Page 23: Normalization for cDNA Microarray Data Yee Hwa Yang, Sandrine Dudoit, Percy Luu and Terry Speed. SPIE BIOS 2001, San Jose, CA January 22, 2001](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d3a5503460f94a144c9/html5/thumbnails/23.jpg)
http://www.stat.berkeley.edu/users/terry/zarray/Html/
Technical Reports from Terry’s group:
http://www.stat.Berkeley.EDU/users/terry/zarray/Html/papersindex.html
— Comparison of Discrimination Methods for the Classification of Tumors Using Gene Expression Data
— Statistical methods for identifying differentially expressed genes in replicated cDNA microarray experiments.
— Comparison of methods for image analysis on cDNA microarray data.
— Normalization for cDNA Microarray Data
Statistical software R
http://lib.stat.cmu.edu/R/CRAN/
![Page 24: Normalization for cDNA Microarray Data Yee Hwa Yang, Sandrine Dudoit, Percy Luu and Terry Speed. SPIE BIOS 2001, San Jose, CA January 22, 2001](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d3a5503460f94a144c9/html5/thumbnails/24.jpg)
Acknowledgments
Terry Speed
Sandrine Dudoit
Natalie Roberts
Ben Bolstad
Matt Callow (LBL)
John Ngai’s Lab (UCB)
Percy Luu
Dave Lin
Vivian Pang
Elva Diaz