Introduction to Microarray Analysis and Technology
Dave Lin - November 5, 2001



Overview

- Why biologists care about genomics
- Why statisticians/computer scientists may care about genomics
• Preprocessing issues
• Sources of variability in constructing microarrays
• Postprocessing issues
• Analysis of data


What makes one cell different from another?

Liver vs. brain

Cancerous vs. non-cancerous

Treatment vs. control


Old Days

100,000 genes in mammalian genome

each cell expresses 15,000 of these genes

each gene is expressed at a different level

estimated total of 100,000 copies of mRNA/cell

1-5 copies/cell - “rare” (~30% of all genes)

10-200 copies/cell - “moderate”

200 copies/cell and up - “abundant”


Cells can be defined by:
- Complement of genes (which genes are expressed)
- How much of each gene is expressed (quantity)

What makes one cell different from another?
- Try to find genes that are differentially expressed
- Study the function of these genes
- Find which genes interact with your favorite gene

Extremely time-consuming.

Huge amounts of effort expended to find individual genes that may differ between two conditions


Genomics: an almost useless term; it defines many different concepts and applications.

Microarrays
- massively parallel analysis of gene expression
- screen an entire genome at once
- find not only individual genes that differ, but groups of genes that differ
- find relative expression level differences
- how quantitative can they be?


Microarrays

Based on an old technique
Many flavors; the majority are of two essential varieties

cDNA arrays
- printing on glass slides
- miniaturization, throughput
- fluorescence-based detection

Affymetrix arrays
- in situ synthesis of oligonucleotides
- will not consider Affymetrix arrays further


Department of Statistics, University of California, Berkeley, and Division of Genetics and Bioinformatics, The Walter and Eliza Hall Institute of Medical Research,

THE PROCESS

Building the Chip:
- MASSIVE PCR
- PCR PURIFICATION and PREPARATION
- PREPARING SLIDES
- PRINTING
- POST PROCESSING

Preparing RNA:
- CELL CULTURE AND HARVEST
- RNA ISOLATION
- cDNA PRODUCTION

Hybing the Chip:
- PROBE LABELING
- ARRAY HYBRIDIZATION
- DATA ANALYSIS


Building the Chip:

MASSIVE PCR
Full yeast genome = 6,500 reactions

PCR PURIFICATION and PREPARATION
IPA precipitation + EtOH washes + 384-well format

PREPARING SLIDES
Polylysine coating for adhering PCR products to glass slides

PRINTING
The arrayer: high-precision spotting device capable of printing 10,000 products in 14 hrs, with a plate change every 25 mins

POST PROCESSING
Chemically converting the positive polylysine surface to prevent non-specific hybridization
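
The numbers on this slide lend themselves to a quick back-of-the-envelope check. The sketch below assumes one PCR product per well of a 384-well plate and nothing more; the function names are hypothetical and do not come from any arrayer software.

```python
import math

WELLS_PER_PLATE = 384  # plate format named on the slide


def plates_needed(n_products, wells_per_plate=WELLS_PER_PLATE):
    """Number of plates required, assuming one PCR product per well."""
    return math.ceil(n_products / wells_per_plate)


def seconds_per_product(n_products, hours):
    """Average wall-clock time per printed product over a whole run."""
    return hours * 3600 / n_products


yeast_plates = plates_needed(6500)            # full yeast genome
run_plates = plates_needed(10000)             # one 14-hour arrayer run
avg_seconds = seconds_per_product(10000, 14)  # averaged over the run

print(yeast_plates, run_plates, round(avg_seconds, 2))
```

Under these assumptions the yeast set fills 17 plates and a 10,000-product run uses 27 plates, averaging about 5 seconds per product.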


Fabrication of “Spotted Arrays”

- Arrayed library (normalized/subtracted)
- 20,000 PCR reactions
- 20,000 precipitations
- 20,000 resuspensions
- Consolidate for printing
- Spot on glass slides


Printing Approaches

Non - Contact

• Piezoelectric dispenser

• Syringe-solenoid ink-jet dispenser

Contact (using rigid pin tools, similar to filter array)

• Tweezer

• Split pin

• Micro spotting pin


Micro Spotting pin



Microarray Gridder


Practical Problems

- Surface chemistry: an uneven surface may lead to high background.

- Dipping the pin into a large volume requires pre-printing to drain off excess sample.

- Spot variation can be due to mechanical differences between pins. Pins can become clogged during the printing process.

- Spot size and density depend on surface and solution properties.

- Pins need good washing between samples to prevent sample carryover.


Hybing the Chip:

PROBE LABELING
Two RNA samples are labelled with Cy3 or Cy5 monofunctional dyes via a chemical coupling to AA-dUTP. Samples are purified using a PCR cleanup kit.

ARRAY HYBRIDIZATION
Cy3 and Cy5 RNA samples are simultaneously hybridized to the chip. Hybs are performed for 5-12 hours and then chips are washed.

DATA ANALYSIS
Ratio measurements are determined via quantification of the 532 nm (Cy3) and 635 nm (Cy5) channel values. Data are uploaded to the appropriate database where statistical and other analyses can then be performed.
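
As an illustration of the ratio measurement, the sketch below turns one spot's two channel values into a log ratio and an average log intensity, commonly written M and A in the two-colour array literature. The spot values are made up, the function name is hypothetical, and background correction is assumed to have been done already.

```python
import math


def log_ratio_and_intensity(red, green):
    """Return (M, A) for one spot from its two channel intensities.

    M = log2(red / green) is the log ratio.
    A = 0.5 * log2(red * green) is the average log intensity.
    """
    if red <= 0 or green <= 0:
        # Background-corrected intensities can be zero or negative.
        raise ValueError("intensities must be positive")
    m = math.log2(red / green)
    a = 0.5 * (math.log2(red) + math.log2(green))
    return m, a


# Hypothetical spots as (635 nm Cy5 value, 532 nm Cy3 value)
spots = [(2000.0, 1000.0), (500.0, 500.0), (300.0, 1200.0)]
for red, green in spots:
    m, a = log_ratio_and_intensity(red, green)
    print(f"R={red:.0f} G={green:.0f} M={m:+.2f} A={a:.2f}")
```

Working on the log2 scale makes two-fold changes in either direction symmetric around zero (M = +1 and M = -1).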


Labeling of RNAs with Cy3 or Cy5

Two general methods

- Dye-conjugated nucleotide

- Amino-allyl indirect labeling


Direct labeling of RNA

[Diagram: poly(A) RNA primed with oligo(dT); cDNA synthesis incorporates Cy3-dUTP or Cy5-dUTP directly into the cDNA.]


Indirect labeling of RNA

[Diagram: cDNA synthesis incorporates a modified nucleotide; Cy3 is then added to the modified nucleotide in a separate step.]


Dye effect issues

Direct method
- Unequal incorporation of Cy5 vs. Cy3
- Very poor overall incorporation of direct-conjugated nucleotide = more starting RNA needed for labeling

Indirect method
- Presumably less bias in initial incorporation of activated nucleotide, but not clear if more or less dye is added

Both methods
- Cy3 fluoresces more brightly than Cy5
- Labeling is very highly sequence dependent


Micrograph of a portion of hybridization probe from a yeast microarray (after hybridization).


Layout of the cDNA Microarrays

- Sequence-verified, normalized mouse cDNAs
- 19,200 spots in two print groups of 9,600 each
  - 4 x 4 grid, each with 25 x 24 spots
  - Controls on the first 2 rows of each grid

[Figure: array layout showing the two print groups, pg1 and pg2]
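
The layout figures can be checked with a few lines of arithmetic. The sketch below reads "25 x 24" as 25 rows by 24 columns and uses row-major spot ordering; both are assumptions made for illustration, and the function names are hypothetical.

```python
GRID_ROWS, GRID_COLS = 4, 4      # 4 x 4 grids per print group
SPOT_ROWS, SPOT_COLS = 25, 24    # assumed reading of "25 x 24 spots"
PRINT_GROUPS = 2
CONTROL_ROWS = 2                 # controls on the first 2 rows of each grid

spots_per_grid = SPOT_ROWS * SPOT_COLS
spots_per_group = GRID_ROWS * GRID_COLS * spots_per_grid
total_spots = PRINT_GROUPS * spots_per_group
control_spots = PRINT_GROUPS * GRID_ROWS * GRID_COLS * CONTROL_ROWS * SPOT_COLS


def locate(index):
    """Map a 0-based spot index to
    (print group, grid row, grid col, spot row, spot col).

    Row-major ordering is an illustrative convention, not the ordering
    used by any particular arrayer or image-analysis package.
    """
    group, rest = divmod(index, spots_per_group)
    grid, spot = divmod(rest, spots_per_grid)
    grid_row, grid_col = divmod(grid, GRID_COLS)
    spot_row, spot_col = divmod(spot, SPOT_COLS)
    return group, grid_row, grid_col, spot_row, spot_col


def is_control(index):
    """True if the spot lies in one of the control rows of its grid."""
    return locate(index)[3] < CONTROL_ROWS


print(spots_per_group, total_spots, control_spots)
```

The totals agree with the slide: 9,600 spots per print group and 19,200 overall. The control count depends on the assumed row length of 24.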


Practical Problems 1

• Comet Tails
• Likely caused by insufficiently rapid immersion of the slides in the succinic anhydride blocking solution.


Practical Problems 2


Practical Problems 3

High Background
• 2 likely causes:
  - Insufficient blocking.
  - Precipitation of the labeled probe.

Weak Signals


Practical Problems 4

Spot overlap
Likely cause: too much rehydration during post-processing.


Practical Problems 5

Dust


Pin-specific printing differences


Normalization - lowess

• Global lowess

• Assumption: changes roughly symmetric at all intensities.
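
A minimal sketch of the global lowess idea: fit a smooth curve to log ratio M as a function of average log intensity A over all spots, then subtract it. This is a single-pass, pure-Python local linear fit with tricube weights. The robustness iterations of full lowess are left out, and the function names and the `frac` default are assumptions for illustration.

```python
import math


def lowess_fit(x, y, frac=0.4):
    """Locally weighted linear regression of y on x, evaluated at each x[i]."""
    n = len(x)
    k = min(n, max(2, math.ceil(frac * n)))   # points in each local window
    fitted = []
    for xi in x:
        dists = [abs(xj - xi) for xj in x]
        h = sorted(dists)[k - 1]               # window half-width
        if h > 0:
            # Tricube weights: 1 at the centre, 0 at and beyond the edge.
            w = [(1 - (d / h) ** 3) ** 3 if d < h else 0.0 for d in dists]
        else:
            # Heavily tied x values: average the points sitting at xi.
            w = [1.0 if d == 0 else 0.0 for d in dists]
        sw = sum(w)
        mx = sum(wi * xj for wi, xj in zip(w, x)) / sw
        my = sum(wi * yj for wi, yj in zip(w, y)) / sw
        sxx = sum(wi * (xj - mx) ** 2 for wi, xj in zip(w, x))
        sxy = sum(wi * (xj - mx) * (yj - my) for wi, xj, yj in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (xi - mx))
    return fitted


def lowess_normalize(a_values, m_values, frac=0.4):
    """Subtract the intensity-dependent trend c(A) from each log ratio M."""
    trend = lowess_fit(a_values, m_values, frac)
    return [m - c for m, c in zip(m_values, trend)]


# Hypothetical data: a purely intensity-dependent dye bias, no real change.
a_values = [6.0 + 0.5 * i for i in range(20)]
m_values = [0.8 - 0.1 * a for a in a_values]
normalized = lowess_normalize(a_values, m_values)
```

The symmetry assumption matters here: if most genes changed in one direction at some intensity, the fitted curve would absorb that real signal along with the dye bias.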


Normalisation - print-tip-group

Assumption: For every print group, changes roughly symmetric at all intensities.
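
Print-tip-group normalisation applies the same kind of correction separately within each print-tip group. The sketch below shows only the grouping step. The default per-group normalizer is plain median-centering, used as a simple stand-in for a per-group lowess fit, and all names are hypothetical.

```python
from collections import defaultdict
from statistics import median


def median_center(a_values, m_values):
    """Stand-in normalizer: subtract the group's median log ratio.

    a_values is unused here but kept so that an intensity-dependent
    fit can be swapped in with the same signature.
    """
    centre = median(m_values)
    return [m - centre for m in m_values]


def normalize_by_print_tip(tips, a_values, m_values, normalizer=median_center):
    """Apply `normalizer` separately within each print-tip group.

    tips[i] is the print-tip label of spot i. The result is returned
    in the original spot order.
    """
    groups = defaultdict(list)
    for i, tip in enumerate(tips):
        groups[tip].append(i)
    out = [0.0] * len(m_values)
    for idx in groups.values():
        adjusted = normalizer([a_values[i] for i in idx],
                              [m_values[i] for i in idx])
        for i, value in zip(idx, adjusted):
            out[i] = value
    return out


# Hypothetical data: two print tips with different overall offsets.
tips = [1, 1, 1, 2, 2, 2]
a_values = [8.0, 9.0, 10.0, 8.0, 9.0, 10.0]
m_values = [0.5, 0.6, 0.7, -0.3, -0.2, -0.1]
adjusted = normalize_by_print_tip(tips, a_values, m_values)
```

Passing an intensity-dependent fit as `normalizer` would give a per-group version of the lowess correction.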


Pre-processing Issues

- Definition of what a real signal is
  what is a spot, and how to determine what should be included in the analysis?

- How to determine background
  local (surrounding spot) vs. global (across slide)

- How to correct for dye effect

- How to correct for spatial effect
  e.g. print-tip, others

- How to correct for differences between slides
  e.g. scale normalization
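
One way to picture scale normalization between slides is to rescale each slide's log ratios so that all slides have the same spread. The sketch below uses the median absolute deviation (MAD) as the spread measure and the geometric mean MAD as the common target. Both choices are assumptions for illustration, as are the function names, and the log ratios are assumed to be centred near zero already.

```python
import math
from statistics import median


def mad(values):
    """Median absolute deviation from the median."""
    centre = median(values)
    return median([abs(v - centre) for v in values])


def scale_normalize(slides):
    """Rescale each slide's log ratios so that all slides share one MAD.

    `slides` is a list of lists of log ratios, one inner list per slide.
    """
    mads = [mad(s) for s in slides]
    if any(m == 0 for m in mads):
        raise ValueError("a slide with zero spread cannot be rescaled")
    geo_mean = math.exp(sum(math.log(m) for m in mads) / len(mads))
    return [[v * geo_mean / m for v in s] for s, m in zip(slides, mads)]


# Hypothetical slides: same pattern, but the second is four times as spread out.
slides = [[-1.0, -0.5, 0.0, 0.5, 1.0], [-4.0, -2.0, 0.0, 2.0, 4.0]]
scaled = scale_normalize(slides)
```

After rescaling, a given log ratio means roughly the same thing on every slide, so slides with inflated spread no longer dominate a combined analysis.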


Experimental Design Issues

What is the best means of performing the experiment to obtain the desired answer?

Biologists' and statisticians' assumptions differ.

Biologist viewpoint
make everything exactly the same so that differences will stand out

Statistician viewpoint
make everything as random as possible so that real trends will stand out


Most biologists will ask: what are the differences between two samples?

Implicit questions associated with microarrays:

What is the best way to determine this?
  e.g. design; replicates; conditions.

How do I obtain the most reliable results?
  e.g. measurements, normalization

How do I determine what a significant difference is?
  Do I care about “subtle” changes, or just the extremes?

How is information best extracted?
  Is correlation useful? What type of clustering?

How is information combined?
  How do you model the interactions of 1000s of genes?


Design: Two Ways to Do the Comparisons


Advantages of Our Design

- Lower variability
- Increased precision
- Increase in measurement of expression -> increased precision
