1 MBG-487 Microarrays - I

Upload: jared-crawford

Post on 21-Jan-2016


MBG-487

Microarrays - I


HUMAN GENOME PROJECT

What are some practical benefits to learning about DNA?

Knowledge about the effects of DNA variations among individuals can lead to revolutionary new ways to diagnose, treat, and someday prevent the thousands of disorders that affect us. Besides providing clues to understanding human biology, learning about nonhuman organisms' DNA sequences can lead to an understanding of their natural capabilities that can be applied toward solving challenges in health care, agriculture, energy production, environmental remediation, and carbon sequestration.


Genome

• The complete complement of an organism’s genes; an organism’s genetic material.

GOALS OF HUMAN GENOME PROJECT

• identify all the approximately 20,000-25,000 genes in human DNA,
• determine the sequences of the 3 billion chemical base pairs that make up human DNA,
• store this information in databases,
• improve tools for data analysis,
• transfer related technologies to the private sector, and
• address the ethical, legal, and social issues (ELSI) that may arise from the project.


Drosophila melanogaster

Caenorhabditis elegans

Arabidopsis thaliana

Saccharomyces cerevisiae

E. coli

Mus musculus

Bacteriophage

Fugu rubripes

Homo sapiens

Genome sequencing helps in:
• identifying new genes (“gene discovery”)
• looking at chromosome organization and structure
• finding gene regulatory sequences
• comparative genomics

These in turn lead to advances in:
• medicine
• agriculture
• biotechnology
• understanding evolution and other basic science questions

Some current and potential applications of genome research include:

• Molecular medicine
• Energy sources and environmental applications
• Risk assessment
• Bioarchaeology, anthropology, evolution, and human migration
• DNA forensics (identification)
• Agriculture, livestock breeding, and bioprocessing


Molecular Medicine

• Improved diagnosis of disease

• Earlier detection of genetic predispositions to disease

• Rational drug design

• Gene therapy and control systems for drugs

• Pharmacogenomics "custom drugs"

Bioarchaeology, Anthropology, Evolution, and Human Migration

• Study evolution through germline mutations in lineages
• Study migration of different population groups based on female genetic inheritance
• Study mutations on the Y chromosome to trace lineage and migration of males
• Compare breakpoints in the evolution of mutations with ages of populations and historical events

• Understanding genomics will help us understand human evolution and the common biology we share with all of life.
• Comparative genomics between humans and other organisms such as mice has already led to the identification of similar genes associated with diseases and traits.
• Further comparative studies will help determine the yet-unknown function of thousands of other genes.

What’s in a genome?

Genes (i.e., protein coding)

But . . . only <2% of the human genome encodes proteins

Other than protein coding genes, what is there?
• genes for noncoding RNAs (rRNA, tRNA, miRNAs, etc.)
• structural sequences (scaffold attachment regions)
• regulatory sequences
• “junk” (including transposons, retroviral insertions, etc.)

It’s still uncertain/controversial how much of the genome is composed of any of these classes.

The answers will come from experimentation and bioinformatics.

Why sequence is not enough

• Identifying genes and control regions is not enough to decipher the inner workings of the cell:
• We need to determine the function of genes.
• We would like to determine which genes are activated in which cells and under which conditions.
• We would like to know the relationships between genes (protein-DNA, protein-protein interactions etc.).
• We would like to model the various dynamic systems in the cell.

The “Central Dogma”

DNA → (TRANSCRIPTION) → RNA → (TRANSLATION) → PROTEIN

Genes can be regulated at many levels

• transcription
• post transcription (RNA stability)
• post transcription (translational control)
• post translation (not considered gene regulation)

Usually, when we speak of gene regulation, we are referring to transcriptional regulation.

the “transcriptome”

Because of the vast amounts of data that are generated, we need new approaches

• high throughput assays
• robotics
• high speed computing
• statistics
• bioinformatics

High-throughput Technologies and ‘OMICS’ Science

Terms and Definitions

Genomics
Analysis of an organism's genome: identification of single genes and their function

Functional genomics
Global and dynamic survey of gene expression; detection of functional relationships

Proteomics
Analysis of the protein sequences, expression patterns and protein interactions of a given organism

Bioinformatics
Computer-aided processing of biological data:
• detection of complex interrelations
• interpretation and conclusion
• structuring, saving, search

Functional genomics

The study of genome-wide patterns of gene expression and of the mechanisms by which gene expression is coordinated.


High-throughput analysis

Finding gene expression

Idea: find which genes are expressed by measuring the amount of mRNA in the cell (or other materials).

Microarrays can show us when and where genes are expressed.

But what regulates this expression?

Looking at the transcriptome: DNA microarrays

One way of looking at the transcriptome is with DNA microarrays. With microarrays, the expression of thousands of genes can be assessed in a single experiment.

cDNAs or oligonucleotides representing all genes in the genome are deposited on a glass slide using a robotic arrayer.

Benfey, P. and Protopapas, A. Genomics. 2005. New Jersey: Pearson Prentice Hall. pp. 131-2

Why use microarrays?

• Each cell type expresses ~10,000-20,000 genes
• Physiological and pathophysiological responses are linked to changes in gene expression
• Knowledge of gene expression variation at different states may create new hypotheses about gene function and underlying mechanisms

Microarray Technology

• Microarray:
  – New technology (first paper: 1995)
    • Allows study of thousands of genes at the same time
  – Glass slide of DNA molecules
    • Molecule: string of bases (25 bp – 500 bp)
    • Uniquely identifies the gene or unit to be studied

http://kbrin.a-bldg.louisville.edu/CECS694/

Fabrication of Microarrays

• Size of a microscope slide

Images: http://www.affymetrix.com/

Differing Conditions

• Ultimate goal:
  – Understand the expression level of genes under different conditions
• Helps to:
  – Determine genes involved in a disease
  – Identify pathways to a disease
  – Serve as a screening tool


Gene Conditions

• Cell types (brain vs. liver)

• Developmental (fetal vs. adult)

• Response to stimulus

• Gene activity (wild type vs. mutant)

• Disease states (healthy vs. diseased)

Expressed Genes

• Genes under a given condition
  – mRNA extracted from cells
  – mRNA labeled
  – Labeled mRNA is mRNA present in a given condition
  – Labeled mRNA will hybridize (base pair) with corresponding sequence on slide
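The base-pairing rule behind hybridization can be sketched in a few lines of Python. This is a simplified illustration only: it treats both probe and target as DNA strings, requires a perfect match, and the sequences and function names are hypothetical, not taken from the slides.

```python
# Watson-Crick pairing: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))

def hybridizes(probe: str, target: str) -> bool:
    """True if the target contains a stretch perfectly complementary to the probe."""
    return reverse_complement(probe) in target.upper()

probe = "ATGGCGTAC"            # hypothetical probe fixed on the slide
target = "TTTGTACGCCATAAA"     # hypothetical labeled target washed over the slide
print(hybridizes(probe, target))
```

Real hybridization tolerates some mismatches and depends on temperature and salt conditions, which this sketch ignores.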


DNA microarray Probes

Production of high-density DNA microarrays is complex and requires:
- sequence information of the organism
- gene transcript analysis
- gene clustering and annotation
- probe design

cDNA: reverse-transcribed from the cellular mRNA population. cDNA libraries (~10^5 clones) represent a snapshot of cellular gene expression.

PCR samples for probe generation (300-800 nt): amplified DNA needs purification from enzymes and nucleotides; such contaminants can interfere with the microarray analysis.

Oligonucleotides: 50-70mers, or multiple 25mers; less time and effort; precision.

Surface chemistry: to facilitate the attachment of probes to the slide.

Chip design and content

Standard size: 1“ x 3“ (2.54 x 7.62 cm) glass slide. DNA fragments (corresponding to a particular gene) are spotted onto the array’s surface along a defined grid. Spot size: ~100 μm; >20,000 individual samples.

Microarray platforms: full genome chips

Affymetrix:
- GeneChips: A, B, C sets
- in situ 25mer oligos, 16 probes/gene
- one color, biotinylated targets, post labeling with SA-PE
- closed system

Agilent:
- 22k, 44k
- 60mer oligos, 1 probe/gene
- two color labeling (Cy dyes)
- open source

MIAME - Minimum Information About a Microarray Experiment

• to enable the interpretation of the results
• to potentially reproduce the experiment and verify the conclusions
• to make microarray data available to the scientific community

MIAME principles

Experiment design
– The goal of the experiment
– Keywords, e.g. time course, cell type comparison
– Experimental factors: parameters or conditions tested

Samples used, extract preparation and labeling
– The origin of each biological sample
– Manipulation of samples and protocols used

Hybridization procedures and parameters

Measurement data and specifications

Data extraction and processing protocols
– Image scanning hardware and software
– Processing procedures and parameters
– Normalization, transformation and data selection

Array design
– General array design, including the platform type
– Array feature and annotation


Microarray Flow


Sample Preparation


Two major technologies

• cDNA arrays

- probes are placed on the slides

- allows comparison of different cell types

• Oligonucleotide arrays

- partial sequences are printed on the array

- measure values in one tissue type

Two Different Types of Microarrays

• Custom spotted arrays (up to 20,000 sequences)
  – cDNA
  – Oligonucleotide
• High-density (up to 100,000 sequences) synthetic oligonucleotide arrays
  – Affymetrix (25 bases)

Custom Arrays

• Mostly cDNA arrays
• 2-dye (2-channel)
  – RNA from two sources (cDNA created)
    • Source 1: labeled with red dye
    • Source 2: labeled with green dye

Two Channel Microarrays

• Microarrays measure gene expression
• Two different samples:
  – Control (green label)
  – Sample (red label)
• Both are washed over the microarray
  – Hybridization occurs
  – Each spot is one of 4 colors


cDNA microarray experiments

mRNA levels compared in many different contexts

• Different tissues, same organism (brain vs. liver)
• Same tissue, same organism (treatment vs. control, tumor vs. non-tumor)
• Same tissue, different organisms (wild type vs. knockout, transgenic, or mutant)
• Time course experiments (effect of treatment, development)
• Other special designs (e.g. to detect spatial patterns).


cDNA Microarray

• Measure the relative levels of expression

• Parallel analysis

• Competitive hybridization

• Need cDNA library

mRNA → (reverse transcription) → cDNA

[Figure: cDNA microarray workflow. Panel labels: PCR Amplification, Printing, Samples, Reverse Transcription, Labeling, Hybridization, Laser Scan, Expression Data]


Exponential Amplification of a Gene


Labeling and Hybridization of

Sample cDNAs


cDNA microarrays

Compare the gene expression in two samples of cells

PRINT: cDNA from one gene on each spot

SAMPLES: cDNA labelled red/green, e.g. treatment / control, normal / tumor tissue


HYBRIDIZE

Add equal amounts of labelled cDNA samples to microarray.

SCAN

Laser Detector

Looking at the transcriptome: DNA microarrays

[Figure: mRNA is extracted from cell type A and cell type B, made into labeled cDNA, and hybridized to the microarray. Spot colors mark genes with more expression in “A”, more in “B”, or equal in A & B.]


Microarrays provide a means to measure gene expression

(Slide source: http://www.bsi.vt.edu/)

Microarray Image Analysis

• Each spot on a two-channel microarray shows one of 4 colors:
  – Green: high in control
  – Red: high in sample
  – Yellow: equal in both
  – Black: none (no signal in either)
• The problem is to quantify the image signals
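As a rough illustration of how the four colors relate to the two channel intensities, the sketch below assigns a color label to a spot. The detection floor and the two-fold cutoff are illustrative assumptions, not values given in the slides.

```python
def spot_color(red: float, green: float,
               floor: float = 100.0, fold: float = 2.0) -> str:
    """Classify a two-channel spot by its background-corrected intensities.

    `floor` (minimum detectable signal) and `fold` (ratio cutoff) are
    hypothetical parameters chosen only for this example.
    """
    if red < floor and green < floor:
        return "black"    # no signal in either channel
    if red >= fold * green:
        return "red"      # higher in the sample
    if green >= fold * red:
        return "green"    # higher in the control
    return "yellow"       # roughly equal in both

print(spot_color(5000, 400), spot_color(300, 4000),
      spot_color(1000, 1100), spot_color(50, 40))
```

In practice the colors are a continuous blend, which is why the image has to be quantified rather than classified by eye.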

Information Extraction

– Spot intensities
  – mean (pixel intensities)
  – median (pixel intensities)
– Background values
  – Local
  – Morphological opening
  – Constant (global)
  – None
– Quality information

[Figure: spot with Signal and Background regions marked; take the average]

Speed Group Microarray Page
http://stat-www.berkeley.edu/users/terry/zarray/Html/image.html
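A minimal sketch of spot quantification with a local background estimate, using the mean or the median of the pixel intensities as listed above. The pixel values and the function name are made up for illustration; real image-analysis software also has to segment the spot first.

```python
from statistics import mean, median

def spot_signal(foreground_pixels, background_pixels, use_median=True):
    """Summarize foreground and local background pixels, then subtract.

    The result is clamped at zero so that a spot dimmer than its
    background does not yield a negative intensity.
    """
    summarize = median if use_median else mean
    fg = summarize(foreground_pixels)
    bg = summarize(background_pixels)
    return max(fg - bg, 0.0)

fg_pixels = [820, 790, 805, 4000, 810]   # hypothetical spot with one dust pixel
bg_pixels = [95, 100, 105, 98, 102]      # hypothetical surrounding pixels

print(spot_signal(fg_pixels, bg_pixels))                    # median-based
print(spot_signal(fg_pixels, bg_pixels, use_median=False))  # mean-based
```

The single outlier pixel inflates the mean-based estimate far more than the median-based one, which is one reason both summaries are usually reported.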

Data verification

• Gene expression ratio?

[Figure: expression level (low to high) of Gene A and Gene B in Sample 1 and Sample 2]


Quantification of expression

For each spot on the slide we calculate

Red intensity = Rfg - Rbg

(fg = foreground, bg = background) and

Green intensity = Gfg - Gbg

and combine them in the log (base 2) ratio

Log2( Red intensity / Green intensity)
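The calculation above can be written as a short Python function. This is a sketch under one added assumption: spots whose background-corrected intensity is zero or negative are returned as `None`, since the log ratio is undefined there (the slides do not say how such spots are handled).

```python
import math

def log2_ratio(r_fg, r_bg, g_fg, g_bg):
    """Log2(Red intensity / Green intensity) for one spot.

    fg = foreground, bg = background, as in the formula above.
    """
    red = r_fg - r_bg
    green = g_fg - g_bg
    if red <= 0 or green <= 0:
        return None   # assumption: flag the spot instead of guessing a value
    return math.log2(red / green)

print(log2_ratio(900, 100, 300, 100))   # red four times green
print(log2_ratio(300, 100, 900, 100))   # green four times red
```

Taking the logarithm makes up- and down-regulation symmetric: a four-fold increase and a four-fold decrease give values of equal size and opposite sign.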

Data Normalization

• Purpose: adjust for bias from variation in microarray technology, e.g. differences between labeling, scanner settings, spatial positions
• Within-array normalization: logarithmic transformation of the ratio, then subtraction of the mean log ratio

Red     Green   Difference  Ratio (G/R)  Log2 ratio  Centered log2 ratio
16500   15104   -1396       0.915        -0.128      -0.048
357     158     -199        0.443        -1.175      -1.095
8250    8025    -225        0.973        -0.039       0.040
978     836     -142        0.855        -0.226      -0.146
65      89       24         1.369         0.453       0.533
684     1368     684        2.000         1.000       1.080
13772   11209   -2563       0.814        -0.297      -0.217
856     731     -125        0.854        -0.228      -0.148
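The within-array normalization described above can be reproduced with a short Python sketch. It uses the eight Red/Green pairs from the table and takes the ratio as G/R, as the table header does; the function name is hypothetical.

```python
import math

# (Red, Green) intensities for the eight spots in the table
spots = [(16500, 15104), (357, 158), (8250, 8025), (978, 836),
         (65, 89), (684, 1368), (13772, 11209), (856, 731)]

def center_log_ratios(pairs):
    """Log2-transform each G/R ratio, then subtract the mean log ratio."""
    log_ratios = [math.log2(g / r) for r, g in pairs]
    mean_lr = sum(log_ratios) / len(log_ratios)
    return [lr - mean_lr for lr in log_ratios]

centered = center_log_ratios(spots)
for (r, g), c in zip(spots, centered):
    print(f"{r:>6} {g:>6} {c:7.3f}")
```

After centering, the log ratios on the array average to zero, so a global dye or scanner bias no longer shifts every spot in the same direction. The printed values can differ from the table in the last digit because the table was computed from rounded intermediate values.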

Gene Expression Data

Gene expression data on p genes for n slides: p is O(10,000), n is O(10-100), but growing.

Rows are genes, columns are slides. Each entry, e.g. the gene expression level of gene 5 in slide 4, is Log2(Red intensity / Green intensity).

Gene   slide 1  slide 2  slide 3  slide 4  slide 5  …
1       0.46     0.30     0.80     1.51     0.90    ...
2      -0.10     0.49     0.24     0.06     0.46    ...
3       0.15     0.74     0.04     0.10     0.20    ...
4      -0.45    -1.03    -0.79    -0.56    -0.32    ...
5      -0.06     1.06     1.35     1.09    -1.09    ...

These values are conventionally displayed on a red (>0), yellow (0), green (<0) scale.

Microarray gene expression data

• Microarray data converted to n x p table (p – gene number, n – sample number)

         Sample 1  Sample 2
Gene 1   1.04      2.08
Gene 2   3.2       10.5
Gene 3   3.34      1.05
Gene 4   1.85      0.09

Statistical Analysis

• Differences in ratios due to
  – random variation
  – meaningful changes
• Convention
  – ratio >= 2 or ratio <= ½
• Analysis of variance (ANOVA)
  – 4 and 10 replicates of each treatment
  – statistical significance
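The two-fold convention can be expressed as a small Python function. This sketch covers only the ratio cutoff; it does not attempt the ANOVA step, and the labels "up", "down" and "unchanged" are names chosen for this example.

```python
def fold_change_call(ratio: float, threshold: float = 2.0) -> str:
    """Apply the convention: ratio >= 2 or ratio <= 1/2 counts as a change."""
    if ratio <= 0:
        raise ValueError("expression ratio must be positive")
    if ratio >= threshold:
        return "up"
    if ratio <= 1.0 / threshold:
        return "down"
    return "unchanged"

for ratio in (2.0, 0.443, 1.369, 0.5):
    print(ratio, fold_change_call(ratio))
```

A fixed cutoff ignores how noisy each gene is, which is why replicate measurements and a significance test are listed alongside it.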

Single Color Microarrays

• Prefabricated
  – Affymetrix (25mers)
• Custom
  – cDNA (500 bases or so)
  – Spotted oligos (70-80 bases)


Single Color Microarrays

• Expressed sequences washed over chips

• Expressed genes hybridize

• Light passed under to see intensity (or hybridized oligos show dark color)

Affymetrix GeneChip System

• Large number of genes and ESTs
• Several species
• Oligonucleotide arrays for expression monitoring are designed and synthesized based on sequence information alone, without the need for physical intermediates such as clones, PCR products, or cDNAs.
• Printed oligos are of the same length, allowing for equal hybridization.


Affymetrix Technology

Biotin (one dye) instead of 2 colors

One treatment per chip
• For two conditions, need two slides
• Compare patterns of both slides to get results

11, 16, or 20 probe pairs per gene

DESOKY, 2003


Affymetrix Genechip: experimental steps

Lithography

• It is a printing technology.
• Lithography was invented by Alois Senefelder in Germany in 1798.
• The printing and non-printing areas of the plate are all at the same level, as opposed to intaglio and relief processes in which the design is cut into the printing block.
• Lithography is based on the chemical repellence of oil and water.

Affymetrix Technology: Light-directed synthesis of DNA chips

• Synthetic linkers modified with photochemically removable protecting groups are attached to a glass substrate, and light is directed through a photolithographic mask to specific areas on the surface to produce localized photodeprotection.

• The first of a series of chemical building blocks, hydroxyl-protected deoxynucleosides, is incubated with the surface, and chemical coupling occurs at those sites that have been illuminated in the preceding step.

• Next, light is directed to different regions of the substrate by a new mask, and the chemical cycle is repeated.

• Current technology allows for 300,000 polydeoxynucleotides in a 1.28 x 1.28 cm array.
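The mask-and-couple cycle can be illustrated with a toy simulation. This is a simplified sketch, not the manufacturer's procedure: it assumes the four nucleotides are offered in a fixed A, C, G, T order, ignores synthesis direction and coupling efficiency, and uses three made-up 3-mer probes.

```python
def synthesize(probes, cycle_order="ACGT"):
    """Grow all probes in parallel, one offered nucleotide per step.

    For each step, the 'mask' is the list of features whose next base
    equals the offered nucleotide; only those are deprotected and extended.
    Returns the finished sequences and the list of (base, mask) steps.
    """
    if any(base not in cycle_order for p in probes for base in p):
        raise ValueError("probes may only contain bases from cycle_order")
    grown = ["" for _ in probes]
    masks = []
    while any(len(g) < len(p) for g, p in zip(grown, probes)):
        for base in cycle_order:
            lit = [i for i, (g, p) in enumerate(zip(grown, probes))
                   if len(g) < len(p) and p[len(g)] == base]
            if lit:                      # skip steps where no feature needs this base
                masks.append((base, lit))
                for i in lit:
                    grown[i] += base
    return grown, masks

probes = ["ACG", "AGT", "CCA"]           # hypothetical features on the chip
grown, masks = synthesize(probes)
for base, lit in masks:
    print(f"couple {base} at features {lit}")
```

Because each mask addresses many features at once, the number of synthesis steps depends on probe length rather than on the number of probes, which is what makes very dense arrays practical.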


Affymetrix Array Construction

STROMBERG, 2003

Affymetrix Design of probes

PM (perfect match) to maximize hybridization

MM (mismatch) to ascertain the degree of cross-hybridization

Affy Tech – Number of Features

[Figure: gene sequence (5’ to 3’) covered by multiple 25-mer oligo probes (features)]

– Use multiple oligos per gene
– Redundancy improves detection and quantification of the target gene

DESOKY, 2003

Affy Tech – Mismatches for Control

[Figure: gene sequence (5’ to 3’) covered by multiple 25-mer oligo probes, each with a Perfect Match and a Mismatch version]

• Each probe has a “control”: a DNA sequence which differs only slightly from the feature
• In a 25-mer, the mismatch sequence differs at the 13th position, where the base is swapped for its complement (A↔T or G↔C)

DESOKY, 2003
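Deriving a mismatch probe from a perfect match probe takes only a few lines of Python. The 25-mer below is a made-up sequence used for illustration, and the function name is hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def mismatch_probe(pm: str, position: int = 13) -> str:
    """Return the MM partner of a PM probe.

    The base at `position` (1-based; 13 is the center of a 25-mer)
    is replaced by its complement, all other bases are unchanged.
    """
    if len(pm) < position:
        raise ValueError("probe is shorter than the mismatch position")
    i = position - 1
    return pm[:i] + COMPLEMENT[pm[i]] + pm[i + 1:]

pm = "ATGCATGCATGCATGCATGCATGCA"   # hypothetical 25-mer perfect match probe
mm = mismatch_probe(pm)
print(pm)
print(mm)
```

Placing the mismatch in the center destabilizes hybridization of the intended target the most, so signal on the MM probe mainly reflects non-specific binding.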


Probe Tiling Strategy

• Gene expression monitoring with oligonucleotide arrays. Expression probe and array design. Oligonucleotide probes are chosen based on uniqueness criteria and composition design rules. For eukaryotic organisms, probes are chosen typically from the 3´ end of the gene or transcript (nearer to the poly(A) tail) to reduce problems that may arise from the use of partially degraded mRNA. The use of the PM minus MM differences averaged across a set of probes greatly reduces the contribution of background and cross-hybridization and increases the quantitative accuracy and reproducibility of the measurements.

[Figure: a probe set is made up of probe pairs; each probe pair consists of a PM probe and an MM probe]

STROMBERG, 2003

Affymetrix Data

• Each gene labeled as “present”, “marginal”, or “absent.”
  – Present: gene expressed and reliably detected in the RNA sample
• Label chosen based on a p-value
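A detection call of this kind can be sketched as a pair of p-value cutoffs. The two thresholds below are illustrative assumptions for this example only; the slides do not state which values are used.

```python
def detection_call(p_value: float, alpha1: float = 0.04, alpha2: float = 0.06) -> str:
    """Map a detection p-value to a present / marginal / absent label.

    alpha1 and alpha2 are hypothetical cutoffs: below alpha1 the gene is
    called present, between the two it is marginal, otherwise absent.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must lie between 0 and 1")
    if p_value < alpha1:
        return "present"
    if p_value < alpha2:
        return "marginal"
    return "absent"

for p in (0.001, 0.05, 0.3):
    print(p, detection_call(p))
```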


Why Probe redundancy?

• Use of multiple independent detectors for the same molecule improves signal-to-noise ratios (due to averaging over the intensities of multiple array features), improves the accuracy of RNA quantification (averaging and outlier rejection), increases the dynamic range, mitigates effects due to cross-hybridization, and drastically reduces the rate of false positives and miscalls.

• An additional level of redundancy comes from the use of mismatch (MM) control probes that are identical to their perfect match (PM) partners except for a single base difference in a central position. The MM probes act as specificity controls that allow the direct subtraction of both background and cross-hybridization signals, and allow discrimination between ‘real’ signals and those due to non-specific or semi-specific hybridization (hybridization of the intended RNA molecules produces more signal for the PM probes than for the MM probes, resulting in consistent patterns that are highly unlikely to occur by chance).
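The PM minus MM averaging across a probe set can be sketched as follows. The intensities are made-up numbers for one hypothetical probe set, and this simple average omits the outlier rejection mentioned above.

```python
def average_difference(pm, mm):
    """Average of PM - MM over all probe pairs in a probe set."""
    if len(pm) != len(mm) or not pm:
        raise ValueError("need one MM intensity per PM intensity")
    return sum(p - m for p, m in zip(pm, mm)) / len(pm)

pm_intensities = [1200, 980, 1500, 1100]   # hypothetical perfect match signals
mm_intensities = [300, 280, 400, 220]      # hypothetical mismatch signals

print(average_difference(pm_intensities, mm_intensities))
```

Subtracting each MM from its own PM partner removes the background and cross-hybridization component pair by pair before the probes are averaged.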


Gene Expression Data

Gene expression data on p genes for n samples (rows: genes, columns: mRNA samples)

Gene expression level of gene i in mRNA sample j
= Log(Red intensity / Green intensity) for two-color arrays, or
= Log(Avg. PM - Avg. MM) for Affymetrix arrays

Gene   sample1  sample2  sample3  sample4  sample5  …
1       0.46     0.30     0.80     1.51     0.90    ...
2      -0.10     0.49     0.24     0.06     0.46    ...
3       0.15     0.74     0.04     0.10     0.20    ...
4      -0.45    -1.03    -0.79    -0.56    -0.32    ...
5      -0.06     1.06     1.35     1.09    -1.09    ...


What is gene expression?

Gene expression = expression degree of a gene in a particular experiment (protein)

[Figure: heat map of genes against experiments (over time); colors mark baseline expression, higher expression compared to baseline, and lower expression compared to baseline]

Spellman et al., Mol. Biol. Cell 1998

Looking at the transcriptome: microarrays

[Figure: matrix of genes by conditions (condition 1, condition 2, condition 3), followed by statistical processing and analysis]


Microarrays yield information

Image: bioinfo.mbb.yale.edu/~mbg/ fun3/microarray-mona/

High-throughput Technologies and ‘OMICS’ Science

Are they important for clinical use?

Why can gene expression profiles classify cancer types?

Cancers of different origins are derived from cells that pass through different developmental stages. Expression profiles of cells coming from different developmental stages differ from each other.

[Figure: tissues of origin: Adrenal Gland, Endometrium, Pancreas, Brain, Breast, Uterus, Esophagus, Gall Bladder, Kidney, Liver, Lung, Ovary, Skin, Bone, Stomach, Thyroid, Head & Neck, Prostate, Germ Cell, Soft Tissue, Lymph, Cervix, Bladder, GIST, Colon]


Revolution of Breast Cancer Classification

DNA Chip Analysis

Breast Cancer Classification

(Baselga and Norton, 2002)

Breast Cancer Classification

Sorlie et al., 2001

Portrait of Breast Cancer

[Figure: subtypes Basal-like, HER-2, “Normal”, Luminal B, Luminal A]

Sørlie et al. Proc Natl Acad Sci U S A. 2001 Sep 11;98(19):10869-10874.

Subtypes of breast cancer identified by gene expression patterns (Sorlie et al, PNAS, 98: 10869-74, 2001)

Gene expression profiles provide classification of the sub-types of cancers with different clinical outcomes.

Two ER positive subgroups:
• Luminal A (best clinical outcome)
• Luminal B

Three ER negative subgroups:
• “Normal” breast-like
• ERBB2+ (ERBB2 amplicon high expression)
• Basal-like (worst clinical outcome)

Molecular Grading of Breast Cancer

Sotiriou C, et al. Gene expression profiling in breast cancer: understanding the molecular basis of histologic grade to improve prognosis. J Natl Cancer Inst. 2006 Feb 15;98(4):262-72.

• Gene expression profile data show two molecular grades in breast cancer.
• Histologic grade 2 cases are distributed between molecular grades 1 and 2.
• Molecular grading performs better than multivariate analyses of conventional prognostic factors such as ER/PR and HER2.

Gene-based breast cancer test: MammaPrint Array

FDA Approved

MammaPrint is a DNA microarray-based test that measures the activity of 70 genes.

- The test measures the expression of each of these genes in the woman's breast cancer sample and uses a specific calculation to determine whether the patient's cancer has a low or a high risk of spreading to other sites.
- It helps guide who needs to be treated….

FDA Approves Gene-Based Breast Cancer Test*

“MammaPrint is a DNA microarray-based test that measures the activity of 70 genes... The test measures each of these genes in a sample of a woman's breast-cancer tumor and then uses a specific formula to determine whether the patient is deemed low risk or high risk for the spread of the cancer to another site.”

BUILDING A GENE SIGNATURE WITH DNA MICROARRAY ANALYSIS - MammaPrint ARRAY

Primary breast tumors from 78 young, lymph node-negative patients were used:

- The gene expression profiles of the 34 patients who developed distant metastases within 5 years were compared with those of the 44 patients who remained disease-free for 5 years.
- The analyses yielded a 70-gene expression set that allows breast tumors to be classified into good or poor prognosis groups.

Intra-operative Cancer Detection

“The rapid RT-PCR assay has found a breast cancer stem cell related mRNA signature in the sentinel lymph node”