Panu Somervuo, March 19, 2007 1

cDNA microarrays

• small slides with several measurement units (spots)
• e.g. a 2.5 cm by 7.6 cm glass slide with 30,000 spots
• each spot contains specific nucleotide sequences (probes)
• in the hybridization process, labeled (Cy5, Cy3) samples attach to the probes
• comparative genomic hybridization (CGH): DNA samples
• gene expression: RNA samples
• the relative intensity of hybridization (Cy5 vs. Cy3) can be measured


Data flow

• biological data, DNA/RNA extraction, fluorescence dye labeling, hybridization → array
• scanning → image
• image processing: spot segmentation → data file
• data preprocessing and normalization
• data analysis 1: statistical tests to find differentially expressed genes → gene lists
• data analysis 2: biological interpretation of results


Image processing

• segmentation: spot signals are extracted from the background
• intensity information from both spot foreground and background
• other information, such as spot size and shape


Image analysis results file


Plotting data


Logarithm of ratio

• log(Cy5/Cy3) = log(Cy5) - log(Cy3)
• log2(4/1) = 2
• log2(2/1) = 1
• log2(1/1) = 0
• log2(1/2) = -1
• log2(1/4) = -2
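The identities above can be checked numerically. The following is a minimal Python sketch, not part of the original slides; the function name `log2_ratio` and the intensity values are hypothetical and chosen only to reproduce the ratios listed above.

```python
import math

def log2_ratio(cy5, cy3):
    """Return log2(Cy5/Cy3) for one spot, computed as log2(Cy5) - log2(Cy3)."""
    if cy5 <= 0 or cy3 <= 0:
        raise ValueError("intensities must be positive to take a logarithm")
    return math.log2(cy5) - math.log2(cy3)

# Hypothetical (Cy5, Cy3) intensity pairs with ratios 4, 2, 1, 1/2 and 1/4
spots = [(4000, 1000), (2000, 1000), (1000, 1000), (1000, 2000), (1000, 4000)]
ratios = [log2_ratio(cy5, cy3) for cy5, cy3 in spots]

for (cy5, cy3), value in zip(spots, ratios):
    print(f"Cy5={cy5} Cy3={cy3} log2 ratio={value:+.2f}")
```

On the log scale a two-fold increase and a two-fold decrease are symmetric around zero (+1 and -1), which is the main reason for working with log ratios rather than raw ratios.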


Plotting data

• scatterplot
• MA plot (ratio vs. intensity)
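For an MA plot, each spot is usually described by M, the log2 ratio, and A, the average log2 intensity of the two channels. The sketch below computes these coordinates in Python; the function name `ma_values` and the example intensities are assumptions for illustration, and the actual plotting is left out.

```python
import math

def ma_values(cy5, cy3):
    """Return (M, A) for one spot.

    M = log2(Cy5) - log2(Cy3)          (log ratio, vertical axis)
    A = (log2(Cy5) + log2(Cy3)) / 2    (mean log intensity, horizontal axis)
    """
    log_cy5 = math.log2(cy5)
    log_cy3 = math.log2(cy3)
    return log_cy5 - log_cy3, 0.5 * (log_cy5 + log_cy3)

# Hypothetical (Cy5, Cy3) intensities for three spots
spots = [(1024, 1024), (2048, 512), (256, 1024)]
points = [ma_values(cy5, cy3) for cy5, cy3 in spots]
```

In a scatterplot of log2(Cy3) against log2(Cy5), unchanged spots lie on the diagonal; the MA plot rotates that picture so that unchanged spots lie on the horizontal line M = 0.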


Normalization

• goal: to remove the effects of non-biological causes from data (dye-effect, hybridization, scanning, noise) and keep the biological information as well as possible

• normalization can be based on the behavior of the majority of the spots on the array, or on a small set of special control spots

• each normalization method is based on some assumption about the data


Spot background subtraction

• how to know if spot signal is real and not just noise?
• comparison against background signal
• global versus local background
• should background subtraction be used or not?
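One simple way to make these questions concrete is sketched below in Python. It is only an illustration, not a method from the slides: the function names, the rule "foreground must exceed background by k standard deviations", and the default k = 2 are all assumptions.

```python
def subtract_background(foreground, background):
    """Return the background-corrected intensity, or None if it is not positive.

    A non-positive value cannot be log-transformed, so such spots are flagged
    with None here instead of being passed on.
    """
    corrected = foreground - background
    return corrected if corrected > 0 else None

def is_above_background(foreground, background, background_sd, k=2.0):
    """Hypothetical rule: signal counts as real if it exceeds background + k * SD."""
    return foreground > background + k * background_sd

# Hypothetical spots: (foreground, local background, background SD)
spots = [(1500.0, 200.0, 50.0), (230.0, 200.0, 50.0), (180.0, 200.0, 50.0)]
corrected = [subtract_background(fg, bg) for fg, bg, _ in spots]
reliable = [is_above_background(fg, bg, sd) for fg, bg, sd in spots]
```

With a local background, each spot uses the estimate from its own surroundings as in this sketch; with a global background, the same value would be passed for every spot on the array.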


Normalization

• can be applied to both single-channel and ratio data
• mean
• variance


Mean normalization

• global mean vs. intensity-dependent mean
• Loess/Lowess normalization
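The difference between a global and an intensity-dependent mean can be sketched in Python as follows. Note that the second function is not loess: it is a crude stand-in that subtracts the mean log ratio within bins of spots ordered by intensity, used here only because it fits in a few lines. All names and numbers are hypothetical.

```python
from statistics import mean

def global_mean_normalize(m_values):
    """Subtract one global mean from all log ratios (M values)."""
    center = mean(m_values)
    return [m - center for m in m_values]

def binned_mean_normalize(m_values, a_values, n_bins=4):
    """Crude intensity-dependent normalization (NOT loess).

    Spots are ordered by average log intensity A, split into n_bins groups
    of roughly equal size, and the mean M of each group is subtracted.
    """
    order = sorted(range(len(m_values)), key=lambda i: a_values[i])
    bin_size = max(1, -(-len(order) // n_bins))  # ceiling division
    normalized = [0.0] * len(m_values)
    for start in range(0, len(order), bin_size):
        members = order[start:start + bin_size]
        center = mean(m_values[i] for i in members)
        for i in members:
            normalized[i] = m_values[i] - center
    return normalized

# Hypothetical data in which the dye bias grows with intensity
m = [0.5, 0.6, 0.4, 1.5, 1.4, 1.6]
a = [8.0, 8.1, 8.2, 12.0, 12.1, 12.2]
global_norm = global_mean_normalize(m)
binned_norm = binned_mean_normalize(m, a, n_bins=2)
```

In this example the global version leaves the low-intensity spots below zero and the high-intensity spots above zero, while the binned version centers both groups. A loess fit does the same with a smooth curve of M against A instead of step-wise bins.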


Print tip loess normalization


Control spots (spike-in controls)

• fold change up 3: log2(3) = 1.58
• fold change up 10: log2(10) = 3.32
• fold change down 10: log2(1/10) = -3.32
• fold change down 3: log2(1/3) = -1.58
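Because the fold changes of spike-in controls are known, their measured log ratios can be compared against the expected values. The Python sketch below recomputes the expected log2 values listed above and the deviation of some observed values; the observed numbers and the dictionary keys are invented for illustration.

```python
import math

# Known fold changes of the spike-in controls listed above
expected_fold = {"up3": 3, "up10": 10, "down10": 1 / 10, "down3": 1 / 3}
expected_log2 = {name: math.log2(fold) for name, fold in expected_fold.items()}

# Hypothetical measured log2 ratios of the same control spots
observed_log2 = {"up3": 1.40, "up10": 2.90, "down10": -3.00, "down3": -1.50}

# Positive deviation: measured ratio higher than expected; negative: lower
deviation = {name: observed_log2[name] - expected_log2[name] for name in expected_log2}

for name in expected_log2:
    print(f"{name}: expected {expected_log2[name]:+.2f}, deviation {deviation[name]:+.2f}")
```

Large or systematic deviations would suggest that the normalization, or the control spots themselves, should be examined more closely.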


What is the best normalization method?

• each method is based on some assumption → each method can fail

• if utilizing the behavior of the majority of the spots, the array should represent all genes

• if utilizing control spots, check that they are reliable

• lots of methods have been introduced, lots of methods will be introduced…


Finding differentially expressed genes

• Manually set fold change cutoff
• Fold change cutoff based on data
• Statistical test, p-value
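The three approaches can be sketched in Python as follows. The thresholds (a two-fold manual cutoff, two standard deviations for the data-based cutoff) and all gene names and values are assumptions for illustration. The statistical test is shown only as a one-sample t-statistic over replicate log ratios; converting it to a p-value needs the t distribution, which is left out here.

```python
import math
from statistics import mean, stdev

def manual_cutoff(m_by_gene, cutoff=1.0):
    """Genes whose absolute log2 ratio is at least the cutoff (1.0 = two-fold)."""
    return [gene for gene, m in m_by_gene.items() if abs(m) >= cutoff]

def data_based_cutoff(m_by_gene, k=2.0):
    """Genes more than k standard deviations from the mean log2 ratio."""
    values = list(m_by_gene.values())
    center, spread = mean(values), stdev(values)
    return [gene for gene, m in m_by_gene.items() if abs(m - center) > k * spread]

def one_sample_t(replicates):
    """t-statistic for the hypothesis that the mean log2 ratio is zero."""
    n = len(replicates)
    return mean(replicates) / (stdev(replicates) / math.sqrt(n))

# Hypothetical log2 ratios for six genes on one array
m_by_gene = {"g1": 0.0, "g2": 0.1, "g3": -0.1, "g4": 0.05, "g5": -0.05, "g6": 3.0}
by_manual = manual_cutoff(m_by_gene)
by_data = data_based_cutoff(m_by_gene)

# Hypothetical replicate log2 ratios of one gene across three arrays
t_stat = one_sample_t([1.0, 1.2, 0.8])
```

A fold change cutoff uses only the size of the change, while the t-statistic also takes into account how consistent the change is across replicates.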


Limma package in R

• analysis of microarray data
– data import
– data plotting
– data normalization
– statistical tests → differentially expressed genes
• online help and tutorial available

> help(package=limma)
> library(limma)
> limmaUsersGuide()