How to do successful gene expression analysis Jan Hellemans, PhD Center for Medical Genetics Biogazelle qPCR meeting June 25 th 2010 Sienna, Italy


How to do successful gene expression analysis

Jan Hellemans, PhD

Center for Medical Genetics

Biogazelle

qPCR meeting – June 25th 2010 – Sienna, Italy

Introduction

qPCR: reference technology for nucleic acid quantification
• sensitivity and specificity
• wide dynamic range
• speed
• relatively low cost
• conceptual and practical simplicity

easy to perform ≠ easy to do it right
• many steps involved
• all need to be right

Introduction

[Figure: steps involved in a qPCR experiment]
• choice of chemistry
• choice of RT
• RNA quality assessment
• sample selection and handling
• data reporting
• sample extraction
• RT and PCR primer design
• cDNA synthesis strategy
• assay validation
• data analysis

prepare – cycle – report

Prepare

experiment design

• power analysis

• sample vs gene maximization

• run layout

samples

• preparation

• quality control

• pre amplification

assays

• design

• in silico validation

• empirical validation

reference gene

• selection

• validation

Power analysis

determination of the number of data points needed to reach statistical significance for a given
• difference
• variability
• technical constraints

confidence interval (CI): ≥ 3 data points (~ critical t-value t*)

CI = SEM x t*

[Figure: critical t-value (y-axis, 0 to 14) versus number of data points (x-axis: 2, 3, 4, 5, 10, 20, 100)]
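
The relation CI = SEM x t* can be illustrated with a short sketch. The critical values below are standard two-sided 95% t-table entries; the 95% level is an assumption, since the slide does not state the confidence level, and `ci_half_width` is a hypothetical helper name.

```python
import math
import statistics

# Two-sided 95% critical t-values keyed by number of data points n (df = n - 1).
# Standard t-table entries; the 95% level is an assumption.
T_CRIT_95 = {2: 12.706, 3: 4.303, 4: 3.182, 5: 2.776, 10: 2.262, 20: 2.093, 100: 1.984}

def ci_half_width(values):
    """Half-width of the confidence interval: SEM multiplied by t*."""
    n = len(values)
    sem = statistics.stdev(values) / math.sqrt(n)
    return sem * T_CRIT_95[n]

# The penalty for very small n is visible directly in t*.
for n, t_star in T_CRIT_95.items():
    print(n, t_star)
```

Going from 2 to 3 data points cuts t* by roughly a factor of three, which is why 3 is a sensible lower bound.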

Power analysis

minimum number of data points needed to reach statistical significance
• confidence interval (CI): ≥ 3
• Mann-Whitney test: nA + nB ≥ 8
• Wilcoxon test: ≥ 6 pairs

http://www.cs.uiowa.edu/~rlenth/Power/
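
One way to see where the minimum sizes for the rank-based tests come from is to compute the smallest p-value each test can produce. This sketch assumes exact two-sided tests, no ties, and a significance level of 0.05; the function names are hypothetical.

```python
from math import comb

def min_p_mann_whitney(n_a, n_b):
    """Smallest two-sided exact p-value: the 2 fully separated rank
    orderings out of C(n_a + n_b, n_a) possible ones."""
    return 2 / comb(n_a + n_b, n_a)

def min_p_wilcoxon_signed_rank(n_pairs):
    """Smallest two-sided exact p-value: all differences share one sign,
    i.e. 2 of the 2**n_pairs possible sign patterns."""
    return 2 / 2 ** n_pairs

print(min_p_mann_whitney(3, 4), min_p_mann_whitney(4, 4))
print(min_p_wilcoxon_signed_rank(5), min_p_wilcoxon_signed_rank(6))
```

With 3 + 4 samples or 5 pairs, even a perfect separation cannot reach p < 0.05. Note that 8 samples are only sufficient when the groups are reasonably balanced: a 2 + 6 split still fails.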

Sample vs gene maximization

how to set up an experiment with
• 3 genes of interest (GOI) & 3 reference genes (REF)
• 11 samples (S) & 1 no template control (NTC)

[Figure: plate layouts for both strategies. Sample maximization: columns S1 to S11 and NTC, rows per gene (REF1, REF2, REF3, GOI1, GOI2, GOI3). Gene maximization: columns REF1, REF2, REF3, GOI1, GOI2, GOI3, with rows S1 to S7 and NTC on one plate and rows S1, S2, S3, S8, S9, S10, S11 and NTC on the other.]

Sample vs gene maximization

sample maximization: to be preferred
• no added variation, because all samples for a gene are in the same run (no inter-run variation)
• suitable for retrospective studies and controlled experiments

gene maximization
• introduces (under-estimated) inter-run variation
• applicable for prospective studies or large studies in which the samples no longer fit in a single run

inter-run variation can be measured and corrected for using inter-run calibrators (IRC) through a procedure called inter-run calibration
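
Whether sample maximization is feasible comes down to simple well counting. The sketch below is only an illustration: the function name, the default of duplicate PCR replicates, and the 96-well default are assumptions, not part of the slides.

```python
import math

def sample_maximization_runs(n_genes, n_samples, n_ntc=1, replicates=2, wells=96):
    """Number of runs needed when every gene is measured on all samples
    within a single run. Returns None when one gene no longer fits on a
    plate, i.e. when gene maximization with inter-run calibrators is needed."""
    wells_per_gene = (n_samples + n_ntc) * replicates
    genes_per_run = wells // wells_per_gene
    if genes_per_run == 0:
        return None
    return math.ceil(n_genes / genes_per_run)

# Slide example: 6 genes, 11 samples + 1 NTC, assumed duplicates, 96 wells.
print(sample_maximization_runs(6, 11))   # 2
print(sample_maximization_runs(6, 60))   # None
```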

Preparation

cDNA synthesis
• most variable step in the workflow (hence RT replicates)
• different performance of the enzymes
• linearity and yield are important

DNase treatment
• retropseudogenes (15%) and single exon genes (5%)
• on column vs. in solution
• verify absence of DNA: qPCR for a genomic DNA target with RNA as input

Quality control – RNA integrity value

Evaluate integrity of 18S and 28S rRNA
• Agilent Bioanalyzer
• Bio-Rad Experion
• Caliper GX
• Qiagen QIAxcel
• Shimadzu MultiNA

Quality control – 5’-3’ ratio

• universally expressed low abundant reference
• anchored oligo(dT) reverse transcription
• increasing delta-Cq values upon artificial RNA degradation

[Figure: transcript with poly(A) tail (AAAAAA) and two amplicons, one at the 5’ end (Cq 5’) and one at the 3’ end (Cq 3’)]

[Figure: thermal degradation. 5’-3’ delta Ct (y-axis, 0 to 9) per sample (x-axis: 109, 109*, 109**, 275, 275*, 275**, 539, 539*, 539**)]

Quality control – SPUD assay for inhibition

spiking of synthetic sequence lacking homology with any known human sequence into RNA, followed by RT-qPCR

reaction          Cq
SPUD + H2O        22
SPUD + heparin    27
SPUD + RNA1       22
SPUD + RNA2       25
SPUD + RNA3       22

ΔCq > 1: presence of inhibitors
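
The ΔCq > 1 rule can be applied programmatically. This sketch uses the Cq values shown on the slide; `flag_inhibited` is a hypothetical helper name.

```python
def flag_inhibited(cq_by_sample, cq_water, threshold=1.0):
    """Return the samples whose SPUD Cq is delayed by more than
    `threshold` cycles relative to the SPUD + H2O control."""
    return [name for name, cq in cq_by_sample.items() if cq - cq_water > threshold]

# Cq values from the slide; SPUD + H2O gave Cq 22.
cq_values = {"heparin": 27.0, "RNA1": 22.0, "RNA2": 25.0, "RNA3": 22.0}
inhibited = flag_inhibited(cq_values, cq_water=22.0)
print(inhibited)  # ['heparin', 'RNA2']
```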

Pre-amplification

methods
• WT-Ovation (NuGEN)
• limited cycle PCR (PreAmp - Applied Biosystems)

preservation of differential expression (fold changes) before (B) and after (A) sample pre-amplification

\[ \frac{(G_1S_1)_B}{(G_1S_2)_B} = \frac{(G_1S_1)_A}{(G_1S_2)_A} \qquad \frac{G_{1,B}}{G_{2,B}} \neq \frac{G_{1,A}}{G_{2,A}} \]

gene G, sample S, before B, after A

the fold change of one gene between two samples is preserved; the ratio between two different genes is not
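
The preservation criterion can be checked on the Cq scale, where a fold change between two samples becomes a Cq difference. This is a minimal sketch: the function name, the 0.5-cycle tolerance and the example Cq values are all hypothetical.

```python
def fold_change_preserved(cq_before, cq_after, tolerance=0.5):
    """cq_before and cq_after are (Cq in sample 1, Cq in sample 2) for one
    gene, measured before and after pre-amplification. The between-sample
    Cq difference should stay the same within `tolerance` cycles
    (tolerance is an assumed value, not taken from the slides)."""
    delta_before = cq_before[0] - cq_before[1]
    delta_after = cq_after[0] - cq_after[1]
    return abs(delta_before - delta_after) <= tolerance

# Hypothetical values: absolute Cq drops after pre-amplification,
# but the difference between the samples should not change.
print(fold_change_preserved((30.0, 28.0), (20.1, 18.0)))  # True
print(fold_change_preserved((30.0, 28.0), (20.0, 19.5)))  # False
```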

http://www.rtprimerdb.org

Assay design guidelines

location

sequence repeats, protein domains

splice variants

intron spanning vs intra exonic

short amplicons: 80-150bp

SNPs

primers

dTm < 2°C

identical Tm for all assays

maximum 2 GC in last 5 nucleotides

use software to design assays

Primer3(Plus), BeaconDesigner, RTprimerDB
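
Some of the guidelines above are easy to check automatically. This sketch takes the melting temperatures as inputs rather than computing them, because the slides do not specify a Tm model; the function names and example primer sequences are hypothetical.

```python
def gc_in_3prime_end(primer, window=5):
    """Count G and C bases in the last `window` nucleotides of a primer."""
    return sum(base in "GC" for base in primer.upper()[-window:])

def check_assay(forward, reverse, tm_forward, tm_reverse, amplicon_length):
    """Apply the slide's rules: dTm < 2 °C, amplicon of 80-150 bp,
    at most 2 G/C in the last 5 nucleotides of each primer."""
    return {
        "tm_difference_ok": abs(tm_forward - tm_reverse) < 2.0,
        "amplicon_length_ok": 80 <= amplicon_length <= 150,
        "forward_3prime_gc_ok": gc_in_3prime_end(forward) <= 2,
        "reverse_3prime_gc_ok": gc_in_3prime_end(reverse) <= 2,
    }

# Made-up sequences, for illustration only.
report = check_assay("AGCTTGACCTGAAGCTATCA", "TTGACCGTAGCTAGGCGCGG",
                     tm_forward=59.5, tm_reverse=60.4, amplicon_length=112)
print(report)
```

Dedicated design software such as the tools listed above applies many more criteria; this only covers the numeric rules stated on the slide.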

In silico assay validation

do thorough in silico assay evaluation

BLAST/BiSearch specificity analysis

mfold secondary structure

SNP analysis of primer annealing regions

splice variant specificity

streamline in silico analyses with RTprimerDB pipeline

Empirical assay validation

specificity

size analysis (only once)

• agarose or polyacrylamide gel

• capillary electrophoresis

melting curves (SYBR, repeated)

[sequence / restriction digest]

amplification efficiency

standard curve

• range & number dilution points

• representative sample

[single curve efficiency algorithms]

for absolute quantification

linear range and limit of detection
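
Amplification efficiency is commonly derived from the slope of a standard curve (Cq against log10 of the input quantity). The sketch below uses an ordinary least-squares fit; the function name and the synthetic dilution series are assumptions for illustration.

```python
import math

def amplification_efficiency(quantities, cq_values):
    """Fit Cq = intercept + slope * log10(quantity) by least squares.
    Returns (slope, amplification base E, efficiency in percent),
    with E = 10 ** (-1 / slope); E = 2 corresponds to 100%."""
    x = [math.log10(q) for q in quantities]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(cq_values) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, cq_values))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    base = 10 ** (-1 / slope)
    return slope, base, (base - 1) * 100

# Synthetic 10-fold dilution series with perfect doubling per cycle.
quantities = [1000, 100, 10, 1]
cqs = [20 + i * math.log2(10) for i in range(4)]
slope, base, efficiency = amplification_efficiency(quantities, cqs)
print(round(slope, 3), round(base, 3), round(efficiency, 1))
```

A slope near -3.32 corresponds to a base of 2, that is, 100% efficiency.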

Single reference gene

quantitative RT-PCR analysis of 10 reference genes (belonging to different functional and abundance classes) on 85 samples from 13 different human tissues

[Figure: ACTB, HMBS, HPRT1, TBP and UBC (y-axis scale 0 to 4) across samples A to G]

Single vs multiple reference genes

single reference gene
• errors related to the use of a single reference gene:
  > 3 fold in 25% of the cases
  > 6 fold in 10% of the cases

multiple reference genes
• developed a robust algorithm for assessment of expression stability of candidate reference genes
• proposed the geometric mean of at least 3 reference genes for accurate and reliable normalisation
• geNorm analysis in pilot study

Vandesompele et al. Genome Biol. 2002 Jun 18;3(7):RESEARCH0034.
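
The two ideas above, a stability measure for candidate reference genes and a geometric-mean normalization factor, can be sketched as follows. This is a simplified illustration of the stability value M (average standard deviation of pairwise log2 ratios) and omits the stepwise exclusion of the least stable gene performed by geNorm. Function names and data are hypothetical.

```python
import math
import statistics

def stability_m(rq):
    """rq maps gene name -> relative quantities (same sample order).
    M for a gene is the mean, over all other genes, of the standard
    deviation of the log2 ratios between the two genes across samples."""
    m = {}
    for gene_j, values_j in rq.items():
        sds = []
        for gene_k, values_k in rq.items():
            if gene_k == gene_j:
                continue
            log_ratios = [math.log2(a / b) for a, b in zip(values_j, values_k)]
            sds.append(statistics.stdev(log_ratios))
        m[gene_j] = statistics.mean(sds)
    return m

def normalization_factors(rq):
    """Per-sample geometric mean of the reference gene quantities."""
    return [statistics.geometric_mean(sample) for sample in zip(*rq.values())]

# Made-up quantities: A and B co-vary perfectly, C deviates in sample 3.
rq = {"A": [1.0, 2.0, 4.0], "B": [2.0, 4.0, 8.0], "C": [1.0, 2.0, 8.0]}
m_values = stability_m(rq)
factors = normalization_factors(rq)
print(m_values)
print(factors)
```

Lower M means more stable expression; here gene C receives the highest M.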

geNorm

validation

insensitive to outliers

reduce most of the variation

statistically more significant results

accurate assessment of small expression differences

de facto standard for reference gene validation

2 400 citations of the geNorm technology

~12 000 geNorm software downloads in 112 countries

genormPLUS

Cycle

cycle

• instrument

• chemistry

• controls

Instrument

fast PCR
• fast ramping ≠ fast qPCR experiment

96-well vs 384-well
• 384-well system is slightly more expensive
• 384-well plates harder to pipet (multichannel pipets or pipetting robot)
• 384-well run gives 4x more data in same time
• 384-well plates require smaller volumes

plate homogeneity test

Chemistry

choose probes for

multiplexing

genotyping

absolute sensitivity (detection past cycle 40) (e.g. clinical-diagnostic setting, GMO detection)

choose SYBR Green I for

all other applications

low cost

seeing what you do

Controls

melting curve
• unique melt peak for all samples?

replicates
• delta-Cq < 0.5 cycles?

controls
• negative control really blank?
• delta-Cq samples/NTC > 5?
• positive controls with expected Cq?

amplification plot shape (kinetic outlier detection)

Report

relative quantification

• efficiency correction

• multiple reference gene normalization

• inter-run calibration

• error propagation

bio statistical analysis

• biological replicates

• log transform data

• selection of statistical test

reporting guidelines

• RDML

• MIQE

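Calculation methods

Cq → RQ → NRQ (normalization) → CNRQ (inter-run calibration)

with amplification efficiency E and the geometric mean of n reference genes (ref) or n inter-run calibrators (irc):

\[ RQ = E^{\Delta Cq} \]

\[ NRQ = \frac{RQ_{toi}}{\sqrt[n]{\prod_{i}^{n} RQ_{ref_i}}} \]

\[ CNRQ = \frac{NRQ_{soi}}{\sqrt[n]{\prod_{i}^{n} NRQ_{irc_i}}} \]

special case with E = 2 and a single reference gene and inter-run calibrator:

\[ RQ = 2^{\Delta Cq} \qquad NRQ = \frac{RQ_{toi}}{RQ_{ref}} \qquad CNRQ = \frac{NRQ_{soi}}{NRQ_{irc}} \]

toi: target of interest, soi: sample of interest

Hellemans et al. Genome Biol. 2007;8(2):R19.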
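
The chain from Cq to CNRQ can be written as three small functions. This is a sketch of the formulas on this slide, not the authors' software; the function names are hypothetical, and ΔCq is assumed to be a chosen reference Cq (for example the mean Cq of the gene across samples) minus the sample Cq.

```python
from statistics import geometric_mean

def rq(base, cq_reference_point, cq_sample):
    """Relative quantity: base ** deltaCq, where base is the amplification
    base E (2 for 100% efficiency) and deltaCq = reference Cq - sample Cq
    (the choice of reference point is an assumption)."""
    return base ** (cq_reference_point - cq_sample)

def nrq(rq_target, rq_references):
    """Normalized relative quantity: target RQ divided by the geometric
    mean of the reference gene RQs in the same sample."""
    return rq_target / geometric_mean(rq_references)

def cnrq(nrq_sample, nrq_calibrators):
    """Calibrated NRQ: sample NRQ divided by the geometric mean of the
    inter-run calibrator NRQs from the same run."""
    return nrq_sample / geometric_mean(nrq_calibrators)

# Made-up numbers for illustration.
target_rq = rq(2.0, 25.0, 23.0)          # 2 ** 2
sample_nrq = nrq(target_rq, [1.0, 4.0])  # divided by geometric mean 2
sample_cnrq = cnrq(sample_nrq, [0.5, 2.0])
print(target_rq, sample_nrq, sample_cnrq)
```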

Data processing - relative quantification

Quality controls

PCR replicates
• ∆Cq < 0.5 cycles

no template control
• no signal (no Cq value)
• Cq (NTC) > Cq (samples) + 5

reference gene stability
• M < 0.5 (M < 1 for heterogeneous samples)
• CV < 25% (CV < 50% for heterogeneous samples)

normalization factors
• no unexpectedly high variation
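
The replicate and no-template-control criteria can be checked as follows. The function names are hypothetical, and comparing the NTC against the highest sample Cq is an assumed (conservative) reading of "Cq (NTC) > Cq (samples) + 5".

```python
def replicates_ok(cq_values, max_range=0.5):
    """PCR replicates pass when their Cq values span less than 0.5 cycles."""
    return max(cq_values) - min(cq_values) < max_range

def ntc_ok(ntc_cq, sample_cqs, min_gap=5.0):
    """The NTC passes when it gives no signal (None) or when its Cq lies
    more than `min_gap` cycles after the latest sample Cq (assumption:
    the highest sample Cq is used for the comparison)."""
    if ntc_cq is None:
        return True
    return ntc_cq > max(sample_cqs) + min_gap

print(replicates_ok([24.1, 24.3]))    # True
print(replicates_ok([24.1, 24.9]))    # False
print(ntc_ok(None, [24.0, 30.0]))     # True
print(ntc_ok(33.0, [24.0, 30.0]))     # False
```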

Replicates

technical vs biological replicates

repeated measures vs. replication

PCR replicates (pipetting error & Poisson’s law)

RT replicates

repeated RNA extraction from same sample

repeated cell cultures / patient sampling

true biological replicates (from different subjects)

no statistics on repeated measures

type of replicates dictates conclusions that can be drawn

Statistical tests

relative quantities are not normally distributed
• log transformation makes them more symmetrical

relevant tests in the field of relative quantification

comparison of 2 unpaired groups
• t test
• Mann-Whitney
• randomization test

comparison of 2 paired groups
• ratio t test (paired t test on log values)
• Wilcoxon signed-rank test

correlation analysis
• Pearson
• Spearman

linear regression

correct for multiple testing
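
The ratio t test mentioned above is a paired t test on log-transformed values. A minimal sketch of the test statistic, using only the standard library; the function name and the example quantities are hypothetical, and the p-value would still have to be looked up in a t table for the returned degrees of freedom.

```python
import math
import statistics

def ratio_t_statistic(group_a, group_b):
    """Paired t statistic on log2 ratios of relative quantities.
    Returns (t, degrees of freedom)."""
    log_ratios = [math.log2(b / a) for a, b in zip(group_a, group_b)]
    n = len(log_ratios)
    mean_diff = statistics.mean(log_ratios)
    sem = statistics.stdev(log_ratios) / math.sqrt(n)
    return mean_diff / sem, n - 1

# Made-up paired relative quantities (e.g. before and after treatment).
t_value, df = ratio_t_statistic([1.0, 2.0, 4.0, 8.0], [2.0, 8.0, 8.0, 32.0])
print(t_value, df)
```

Working on the log scale treats a 2-fold increase and a 2-fold decrease symmetrically, which is the point of the log transformation.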

MIQE

http://www.rdml.org/miqe

Bustin et al. Clin Chem. 2009 Apr;55(4):611-22.

authors

improve quality of qPCR experiments

reliable and unequivocal interpretation of results

reviewers and editors

assess technical merit

full disclosure of reagents and analysis methods

consumers of published research

published results easier to reproduce

MIQE checklist for authors, reviewers and editors

experimental design

sample

nucleic acid extraction

reverse transcription

target information

oligonucleotides

qPCR protocol

qPCR validation

data analysis

E – essential

D – desirable

RDML

http://www.rdml.org

Lefever et al. Nucleic Acids Res. 2009 Apr;37(7):2065-9.

acknowledgements

Jo Vandesompele

Stefaan Derveaux

http://www.biogazelle.com