Plate Effects in cDNA Microarray Data. Henrik Bengtsson, hb@maths.lth.se, Mathematical Statistics, Centre for Mathematical Sciences, Lund University.
![Page 1: Henrik Bengtsson hb@maths.lth.se Mathematical Statistics Centre for Mathematical Sciences](https://reader036.vdocument.in/reader036/viewer/2022062400/56814dfd550346895dbb686f/html5/thumbnails/1.jpg)
Henrik Bengtsson, hb@maths.lth.se
Mathematical Statistics, Centre for Mathematical Sciences
Lund University
Plate Effects in cDNA Microarray Data
![Page 2: Henrik Bengtsson hb@maths.lth.se Mathematical Statistics Centre for Mathematical Sciences](https://reader036.vdocument.in/reader036/viewer/2022062400/56814dfd550346895dbb686f/html5/thumbnails/2.jpg)
Outline
• Intensity dependent effects
• A new way of plotting microarray data
• Plate effects
• Plate normalization
• Measure of Fitness
• Results
• Discussion
![Page 3: Henrik Bengtsson hb@maths.lth.se Mathematical Statistics Centre for Mathematical Sciences](https://reader036.vdocument.in/reader036/viewer/2022062400/56814dfd550346895dbb686f/html5/thumbnails/3.jpg)
Data
• Matt Callow's ApoAI experiment (2000):
– (8 ApoAI-KO mice vs. pool of 8 control mice), 8 control mice vs. pool of 8 control mice.
– 5357 ESTs/genes (6 triplicates, 175 duplicates, 4989 single spotted) & 840 blanks => 6384 spots in all.
– Labeled using Cy3-dUTP and Cy5-dUTP.
– Signals extracted from images by Spot.
![Page 4: Henrik Bengtsson hb@maths.lth.se Mathematical Statistics Centre for Mathematical Sciences](https://reader036.vdocument.in/reader036/viewer/2022062400/56814dfd550346895dbb686f/html5/thumbnails/4.jpg)
Intensity dependent effects
The log-ratio, M, depends on the intensity of the spot, A.
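The slide does not spell out M and A. A minimal sketch, assuming the usual definitions from the red (R) and green (G) channel signals, M = log2(R/G) and A = (1/2)·log2(R·G); the function name `ma_transform` and the sample intensities are made up for illustration:

```python
import math

def ma_transform(red, green):
    """Return (M, A) lists for paired red and green spot intensities.

    Assumed definitions (not stated on the slide):
      M = log2(R / G)          log-ratio
      A = 0.5 * log2(R * G)    average log-intensity
    """
    m_values, a_values = [], []
    for r, g in zip(red, green):
        m_values.append(math.log2(r / g))
        a_values.append(0.5 * math.log2(r * g))
    return m_values, a_values

# Made-up intensities for three spots.
red = [1024.0, 512.0, 4096.0]
green = [256.0, 512.0, 1024.0]
M, A = ma_transform(red, green)
print(M)
print(A)
```

Plotting M against A is what makes an intensity dependent trend in the log-ratios visible.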
![Page 5: Henrik Bengtsson hb@maths.lth.se Mathematical Statistics Centre for Mathematical Sciences](https://reader036.vdocument.in/reader036/viewer/2022062400/56814dfd550346895dbb686f/html5/thumbnails/5.jpg)
Print-tip effects
The log-ratio (and its variance) depends on the print-tip group.
How are the spots printed…?
![Page 6: Henrik Bengtsson hb@maths.lth.se Mathematical Statistics Centre for Mathematical Sciences](https://reader036.vdocument.in/reader036/viewer/2022062400/56814dfd550346895dbb686f/html5/thumbnails/6.jpg)
Print order plot
The spots are ordered according to when they were spotted/dipped onto the glass slide(s).
![Page 7: Henrik Bengtsson hb@maths.lth.se Mathematical Statistics Centre for Mathematical Sciences](https://reader036.vdocument.in/reader036/viewer/2022062400/56814dfd550346895dbb686f/html5/thumbnails/7.jpg)
Plate effects
The log-ratios depend on the plate the spotted clone comes from.
(384-well plates from 6 different labs were used)
![Page 8: Henrik Bengtsson hb@maths.lth.se Mathematical Statistics Centre for Mathematical Sciences](https://reader036.vdocument.in/reader036/viewer/2022062400/56814dfd550346895dbb686f/html5/thumbnails/8.jpg)
Plate Normalization
Assumption: The genes from one plate are on average non-differentially expressed.
Correctness? Are clones on the plates selected randomly? Spots on plates are less random than, for instance, spots in print-tip groups.
The ApoAI mouse experiment is a comparison between 8 control mice and the pool of them. Even if clones on plates were from different tissues, e.g. plates 9-12 from brain, in this setup it should not affect the ratios, just the strength of the signals.
![Page 9: Henrik Bengtsson hb@maths.lth.se Mathematical Statistics Centre for Mathematical Sciences](https://reader036.vdocument.in/reader036/viewer/2022062400/56814dfd550346895dbb686f/html5/thumbnails/9.jpg)
Removing plate biases
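The slide shows the result only as a figure and does not say which estimator was used. A minimal sketch of one way to remove plate biases, assuming each plate's bias is estimated by the median log-ratio of its spots and then subtracted; `remove_plate_bias` and the sample values are hypothetical:

```python
from collections import defaultdict
from statistics import median

def remove_plate_bias(m_values, plates):
    """Subtract from each log-ratio the median log-ratio of its plate.

    Assumption: the plate bias is estimated by the plate median, which
    matches the idea that genes on one plate are on average
    non-differentially expressed.
    """
    by_plate = defaultdict(list)
    for m, plate in zip(m_values, plates):
        by_plate[plate].append(m)
    plate_median = {plate: median(vals) for plate, vals in by_plate.items()}
    return [m - plate_median[plate] for m, plate in zip(m_values, plates)]

# Made-up log-ratios for six spots coming from two plates.
M = [0.5, 0.7, 0.6, -0.2, -0.4, -0.3]
plates = [1, 1, 1, 2, 2, 2]
normalized = remove_plate_bias(M, plates)
print(normalized)
```

After this step the median log-ratio within every plate is zero.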
![Page 10: Henrik Bengtsson hb@maths.lth.se Mathematical Statistics Centre for Mathematical Sciences](https://reader036.vdocument.in/reader036/viewer/2022062400/56814dfd550346895dbb686f/html5/thumbnails/10.jpg)
Intensity normalization
• Intensities (A) also have plate effects.
• Intensity normalization => plate biases again!
Should we normalize A for plate? Probably not!
Blanks and "brain" spots have lower intensities, whereas the "liver" spots have higher...
![Page 11: Henrik Bengtsson hb@maths.lth.se Mathematical Statistics Centre for Mathematical Sciences](https://reader036.vdocument.in/reader036/viewer/2022062400/56814dfd550346895dbb686f/html5/thumbnails/11.jpg)
Sources of Artifacts
[Diagram: the cDNA microarray workflow. Production: cDNA clones, PCR product amplification and purification, printing. Test sample and reference sample: RNA to cDNA, then hybridization. Scanning: excitation by red and green lasers, emission, overlay images, giving data (R, G, ...). Marked sources of artifacts: plate effects(?), intensity effects (labelling efficiency), intensity effects (quenching).]
![Page 12: Henrik Bengtsson hb@maths.lth.se Mathematical Statistics Centre for Mathematical Sciences](https://reader036.vdocument.in/reader036/viewer/2022062400/56814dfd550346895dbb686f/html5/thumbnails/12.jpg)
Several possible approaches ;(
Decisions to make:
• Background correction?
• Plate normalization?
• Intensity (slide, print-tip or scaled print-tip) normalization?
• Platewise-intensity normalization?
If both plate and intensity normalization, in what order? Maybe plate-intensity-plate-intensity-plate-... and so on?
Need a way to compare different approaches...
![Page 13: Henrik Bengtsson hb@maths.lth.se Mathematical Statistics Centre for Mathematical Sciences](https://reader036.vdocument.in/reader036/viewer/2022062400/56814dfd550346895dbb686f/html5/thumbnails/13.jpg)
Measure of Fitness
Median absolute deviation (MAD) for gene i:
$$d_i = 1.4826 \cdot \operatorname{median}_j |r_{ij}|$$
where $r_{ij} = M_{ij} - \operatorname{median}_j M_{ij}$ is residual j for gene i.
The measure of fitness is defined as the mean of the genewise MADs:
$$\text{m.o.f.} = \frac{1}{N} \sum_{i=1}^{N} d_i$$
where N is the number of genes. (... or look at the density of the $d_i$'s)
Important. Compare on the same scale!
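The measure of fitness can be sketched in a few lines. The sketch assumes that j indexes the replicate log-ratios of gene i (for example one per slide); the names `gene_mad` and `measure_of_fitness` and the sample values are made up for illustration:

```python
from statistics import mean, median

def gene_mad(log_ratios):
    """Scaled median absolute deviation of one gene's log-ratios."""
    center = median(log_ratios)
    residuals = [abs(m - center) for m in log_ratios]
    return 1.4826 * median(residuals)

def measure_of_fitness(m_by_gene):
    """Mean of the genewise MADs over all genes."""
    return mean([gene_mad(values) for values in m_by_gene.values()])

# Made-up replicate log-ratios for two genes.
m_by_gene = {
    "gene_a": [0.0, 1.0, 2.0],   # MAD = 1.4826 * 1
    "gene_b": [0.5, 0.5, 0.5],   # MAD = 0
}
mof = measure_of_fitness(m_by_gene)
print(mof)
```

A lower value means the replicate log-ratios of each gene agree better, but values are only comparable when the log-ratios are on the same scale.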
![Page 14: Henrik Bengtsson hb@maths.lth.se Mathematical Statistics Centre for Mathematical Sciences](https://reader036.vdocument.in/reader036/viewer/2022062400/56814dfd550346895dbb686f/html5/thumbnails/14.jpg)
Visual comparison between the "best"
Slidewise intensity normalization: (m.o.f. = 0.228)
Plate + print-tip intensity + plate normalization: (m.o.f. = 0.188)
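As a quick worked check, using only the two m.o.f. values quoted on this slide, the relative reduction in the measure of fitness is:

```latex
\frac{0.228 - 0.188}{0.228} = \frac{0.040}{0.228} \approx 0.175
```

That is, the plate + print-tip intensity + plate approach gives a measure of fitness about 17.5% lower than slidewise intensity normalization alone.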
![Page 15: Henrik Bengtsson hb@maths.lth.se Mathematical Statistics Centre for Mathematical Sciences](https://reader036.vdocument.in/reader036/viewer/2022062400/56814dfd550346895dbb686f/html5/thumbnails/15.jpg)
Results
bg – background corrected, P – plate biases removed, S – slide-intensity normalized, B – print-tip-intensity normalized, sB – scaled print-tip-intensity normalized.
[Figure: m.o.f. per normalization approach]
• Removing plate biases first significantly lowers the gene variabilities (15-20% lower than with intensity normalization only).
• It is critical not to do background correction.
• Using the measure of fitness is helpful in deciding what to do.
![Page 16: Henrik Bengtsson hb@maths.lth.se Mathematical Statistics Centre for Mathematical Sciences](https://reader036.vdocument.in/reader036/viewer/2022062400/56814dfd550346895dbb686f/html5/thumbnails/16.jpg)
Discussion
• What are the reasons for plate effects and where do they actually occur? i) On the plates, ii) during printing or iii) at hybridization?
• How should one best standardize the measure of fitness? i) Based on all spots, ii) on a subset (blanks?), or iii) ?
![Page 17: Henrik Bengtsson hb@maths.lth.se Mathematical Statistics Centre for Mathematical Sciences](https://reader036.vdocument.in/reader036/viewer/2022062400/56814dfd550346895dbb686f/html5/thumbnails/17.jpg)
Acknowledgements
Statistics Dept, UC Berkeley:
* Sandrine Dudoit
* Terry Speed
* Yee Hwa Yang
Lawrence Berkeley National Laboratory:
* Matt Callow
Ernest Gallo Research Center, UCSF:
* Karen Berger
Mathematical Statistics, Lund University:
* Ola Hössjer
com.braju.sma - object oriented extension to sma (free): http://www.braju.com/R/
[R] Software (free): http://www.r-project.org/
The Statistical Microarray Analysis (sma) library (free): http://www.stat.berkeley.edu/users/terry/zarray/Software/smacode.html