Transcript

The microarray data analysis

Ana Deckmann, Carla Judice, Jorge Lepikson, Jorge Mondego, Leandra Scarpari, Marcelo Falsarella Carazzolle, Michelle Servais, Tais Herig

Summary
- Statistics background
- Introduction to microarray
- Pre-processing microarray data
- Statistics analysis
- Applications on the LGE
- Gene Chip

Statistics background

- Error model
measurement = truth + error
error = bias + variance
Normalization
Experimental replicates (technical and biological) and statistics
Bias describes a systematic tendency of the measurement. Ex: the dyes Cy3 and Cy5 do not have the same efficiency.
Variance is often normally distributed. Ex: instrumentation imperfection and biological variation.

- Standard deviation
[Figure: Gaussian function, with the mean, mean(x), and the standard deviation marked]

- Mean vs median
Assume data with one outlier: x = (8, 85, 7, 9, 5, 4, 13, 6, 8)
The mean of all x's, i.e. (x_1 + x_2 + ... + x_K)/K, is affected by the outlier: mean(x) = 16.1 (7.5 without the outlier).
The median of all x's, i.e. the middle value of the ordered (x_1, x_2, ..., x_K), is not (if < 50% of the values are outliers):
x ordered = (4, 5, 6, 7, 8, 8, 9, 13, 85)
median(x) = 8.0
Use the median instead of the mean if you expect artifacts. (If there are a lot of measurements and the errors are symmetrically distributed, the median will give the same result as the mean without outliers.)

- Quantiles
A quantile is the value below which a given fraction (or percent) of the points fall. That is, the 0.3 (or 30%) quantile is the point at which 30% of the data fall below and 70% fall above that value.
Q_p, p = 30%
x = (0, 10, 40, 25, 15, 50, 70, 60)
x ordered = (0, 10, 15, 25, 40, 50, 60, 70)
Quantile(x; 30%) = (0, 10, 15)
1st quartile = 10
3rd quartile = 60
Median = (25 + 40)/2 = 32.5

Introduction to microarray

- Three different microarray technologies:
  - Spotted cDNA microarrays (500 to 2500 bp)
  - Spotted oligonucleotide microarrays (30 to 70 bp)
  - Affymetrix chips (25 bp)
- Can be used for:
  - Differential gene expression studies, gene co-regulation studies, gene function identification studies, time-course studies, dose-response studies, clinical diagnosis

Two color architecture

Codelink architecture (one color)
Probes: 30-mers, 90% at 550 bases from the 3' end
Targets: 10 µg of biotinylated cRNA

Scanning
[Figure: excitation with a red laser and a green laser, emission, and overlay of the two images. Higher frequency, more energy; lower frequency, less energy. Panels A-H and a-k. Source: Scarpari, Leandra, 2006, doctoral thesis]

Ludwig flags: (0) Int
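
The numbers on the mean vs median and quantiles slides can be checked with a short Python sketch using only the standard library. It is an illustration, not part of the original slides. The helper fraction_below is a hypothetical name introduced here, and the choice of the sample standard deviation (dividing by K - 1) is an assumption, since the slides do not say which estimator they use. The quartiles returned by statistics.quantiles are interpolated, so they will generally differ from the 10 and 60 shown on the slide, which appear to follow a different quantile convention.

```python
import statistics

# Data from the mean-vs-median slide: one outlier (85) among nine values.
x = [8, 85, 7, 9, 5, 4, 13, 6, 8]
x_clean = [v for v in x if v != 85]  # same data with the outlier removed

mean_all = statistics.mean(x)          # pulled upward by the outlier
mean_clean = statistics.mean(x_clean)  # mean without the outlier
median_all = statistics.median(x)      # middle value of the ordered data

# Sample standard deviation (divides by K - 1). This estimator is an
# assumption; the slides do not state which one they use.
sd_all = statistics.stdev(x)
sd_clean = statistics.stdev(x_clean)


def fraction_below(values, threshold):
    """Fraction of points strictly below threshold (hypothetical helper)."""
    return sum(v < threshold for v in values) / len(values)


# Data from the quantiles slide.
y = [0, 10, 40, 25, 15, 50, 70, 60]
median_y = statistics.median(y)            # average of the two middle values
q1, q2, q3 = statistics.quantiles(y, n=4)  # interpolated quartiles

print(f"mean = {mean_all:.1f}, without outlier = {mean_clean:.1f}")
print(f"median = {median_all}")
print(f"stdev = {sd_all:.1f}, without outlier = {sd_clean:.1f}")
print(f"median(y) = {median_y}, quartiles = {q1}, {q2}, {q3}")
print(f"fraction of y below 25 = {fraction_below(y, 25)}")
```

A single outlier moves the mean from 7.5 to about 16.1 while the median stays at 8, which is the reason the slides recommend the median when artifacts are expected.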

