advanced rna-seq course introduction...2011/08/25  · top genes sage and cage cage gene ratio...

Advanced RNA-Seq course Introduction Peter-Bram ’t Hoen

Advanced RNA-Seq course

Introduction

Peter-Bram ’t Hoen


Expression profiling

• DNA → mRNA → protein

• Comprehensive RNA profiling possible: determine the abundance of all mRNA molecules in a cell / tissue


Expression profiling: applications

• Qualitative: which part of the genome is expressed, in which cells, which mRNA isoforms

• Quantitative: compare across conditions, understand biological processes / mechanisms

• Tumor vs. Normal tissue

• Knock-out vs. wild-type mouse

• Changing nutrient conditions in yeast

• Etc.


Transcriptome analysis

• Genome-wide expression profiling

  • Serial Analysis of Gene Expression (SAGE)

  • Expression Microarrays

  • Digital Gene Expression (DeepSAGE)

  • Shotgun RNA sequencing


Expression microarray

• Relative abundance

• Limited by content


Serial analysis of gene expression (SAGE)

• Sequence and count short tags representative of a transcript

• Absolute abundance of transcript


NGS-based sequencing vs. microarray

• RNA-seq

  • Counting

  • Absolute abundance of transcript

  • All transcripts present

• Expression microarray

  • Recording hybridization signal to complementary probe

  • Relative abundance

  • Cross-hybridization possible

  • Content limited

[Figure: numbered grid positions (1 to 9) shown alongside example sequence reads T G C T A C G A T … and T T T T T T T G T …]


Absolute transcript abundance

• Dynamic range dependent on sequencing method, sequencing depth and cell type

• Millions of reads required

  • NOT Roche 454

  • BUT Illumina, SOLiD, Helicos


Deep sequencing-based expression profiling

• Tag-based: one read per transcript

  • DeepSAGE: most 3’ CATG

  • DeepCAGE: 5’-end

  • PolyA: ultimate 3’-end

• RNA-Seq: multiple reads per transcript

  • Whole mRNA sequencing after fragmentation

  • miRNA (short RNA) sequencing
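
The tag-based idea can be illustrated with a minimal Python sketch: find the most 3’ CATG in a transcript sequence and take the bases that follow it as the tag. The function name `most_3prime_tag`, the 17-base tag length and the toy sequence are assumptions made for illustration; they do not come from the course material.

```python
def most_3prime_tag(transcript, site="CATG", tag_length=17):
    """Return `site` plus the bases after its most 3' occurrence, or None.

    tag_length is an assumed value chosen for illustration only.
    """
    seq = transcript.upper()
    pos = seq.rfind(site)  # rfind returns the last (most 3') occurrence
    if pos == -1:
        return None  # no anchoring site, so no tag can be derived
    start = pos + len(site)
    # A tag that runs into the transcript end is returned shorter
    return site + seq[start:start + tag_length]


# Hypothetical toy transcript containing two CATG sites
toy = "GGGCATGAAACCCTTTGGGCATGACGTACGTACGTACGTAAAA"
print(most_3prime_tag(toy))
```

Only the second (most 3’) CATG contributes a tag here, which is why a tag-based method yields one countable read per transcript.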


DeepSAGE – sample preparation

PCR enrichment and gel purification (~85bp)


Example gene: Gapd

14542

12555


Example gene: alternative polyadenylation

97

99


CAGE (Cap analysis of gene expression)


Example CAGE


More new transcription start sites (CAGE)

Better annotation of promoter regions


General RNA-seq sample prep

1. Isolation of polyA+ mRNA with oligo-dT

2. Fragmentation by heating for 8 min at 94°C

3. Random-primed first and second strand cDNA synthesis

4. End repair

5. Fragmentation

6. Adenylation of 3’-ends

7. Ligation of adapters (containing barcodes)

8. PCR amplification (15 cycles)

9. Clean-up


Example RNA-Seq


Alternative splicing

Mortazavi et al. Nature Methods 5, 621 - 628 (2008)


Strand-specific random-primed sequencing

Cloonan, Nature Methods 2008


Ovation: not so random-primed

• No polyA+ selection

• No fragmentation


Helicos single molecule sequencing


Example RNA-Seq (Helicos)

ADAMTS8

ADAMTS15

NOV

Peter Henneman


Example polyA profiling on Helicos

Eleonora de Klerk


Example polyA profiling

Eleonora de Klerk


miRNA sequencing

• SOLiD small RNA (whole transcriptome) seq kit


Helicos direct RNA sequencing


Analysis steps - Introduction

1. Alignment to genome (transcriptome)

2. Remapping of unaligned reads

3. (Determining transcript isoform structures)

4. Quantifying transcript abundance

   RPKM: reads per kilobase of transcript per million mapped reads

   FPKM: fragments per kilobase of transcript per million mapped fragments

5. Statistical testing for differential expression
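
The RPKM definition in step 4 amounts to a simple normalization by transcript length and library size. Below is a minimal Python sketch; the function name `rpkm` and the example numbers are hypothetical, chosen only to make the arithmetic easy to follow. FPKM follows the same formula with fragments (read pairs) counted instead of individual reads.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    kilobases = transcript_length_bp / 1_000
    millions = total_mapped_reads / 1_000_000
    return read_count / (kilobases * millions)


# Hypothetical numbers: 500 reads on a 2,000 bp transcript
# in a library of 10 million mapped reads
example = rpkm(500, 2_000, 10_000_000)
print(example)  # 25.0
```

Dividing by length corrects for longer transcripts yielding more fragments, and dividing by library size makes samples of different sequencing depth comparable.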


Comparison to microarrays


Illumina features: Excellent reproducibility

[Figure panels: raw data; square root-transformed and scaled data]


Excellent reproducibility between labs


Analysis of replicate samples

• Pooling: small contaminations can have large effect on outcome

• Technical replicates: not really necessary when sufficient sequencing depth is reached

• Biological replicates important for determination of biological variation


Power comparison

Van Iterson, BMC Genomics, 2009

[Figure: power plotted against number of samples]


Power comparison (2)

[Figure: intensity range; power plotted against number of samples]


CAGE vs. SAGE

C2C12 myoblast


Correlation CAGE vs. SAGE (gene level)

Logratio differentiated vs. proliferating

Differentially expressed genes (rows: CAGE; columns: SAGE)

                               Unchanged   Differentially expressed (*)
Unchanged                      3234        1702
Differentially expressed (*)   2160        2144

* Bayesian error rate < 0.05; Vencio et al. BMC Bioinformatics 5: 119 (2004)

Only detected with CAGE: 1169

Only detected with SAGE: 1747
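
The four gene counts on this slide (3234, 1702, 2160, 2144) can be summarized as the fraction of genes on which CAGE and SAGE agree. The short Python sketch below does that arithmetic; the variable names are hypothetical, and the result does not depend on which method is on which axis of the table.

```python
# Gene counts taken from the CAGE vs. SAGE slide
both_unchanged = 3234   # unchanged according to both methods
both_de = 2144          # differentially expressed according to both
discordant = 1702 + 2160  # called differently by the two methods

total = both_unchanged + both_de + discordant
concordance = (both_unchanged + both_de) / total
print(total, round(concordance, 3))
```

Under these counts the two methods agree on a bit more than half of the genes detected by both.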


Top genes SAGE and CAGE

CAGE gene       Ratio    Microarray    SAGE gene        Ratio    Microarray
Hfe2            4,073    NA            RP23-36P22.5     576      NA
Myom3           1,624    NA            Neb              525      NA
Lmod2           1,305    NA            Mylpf            504      Yes
Myh7            1,124    Yes           Ttn              380      NA
Mb              908      Yes           Myh3             368      Yes
RP23-36P22.5    735      NA            Xirp1            306      Yes
Pygm            717      Yes           1110002H13Rik    263      NA
Myl4            614      Yes           Tnnc1            232      Yes
Synpo2l         595      NA            Cav3             150      Yes
Myh1            561      Yes           Cbfa2t3          133      Yes
…               …                      …                …

CAGE: 13 out of 30 not found by microarray

SAGE: 10 out of 30 not found by microarray


Most significant pathways

CAGE GO                                     SAGE GO                                     Microarray GO
Regulation of striated muscle contraction   Regulation of muscle contraction            Cyclin-dependent protein kinase inhibitor activity
Cardiac muscle contraction                  Cardiac muscle contraction                  Myogenesis
Myogenesis                                  Myogenesis                                  Skeletal muscle development
Regulation of muscle contraction            Regulation of striated muscle contraction   Myoblast differentiation
Skeletal muscle development                 Skeletal muscle development                 6-phosphofructokinase activity
Muscle development                          Myofibril assembly                          Muscle development
Striated muscle contraction                 Muscle development                          Muscle cell differentiation
Myoblast differentiation                    Myoblast fusion                             Tumor suppressor activity
Muscle cell differentiation                 Striated muscle contraction                 Myofibril assembly
Sarcomere organization                      Muscle cell differentiation                 Heart development
10/10 muscle related                        10/10 muscle related                        7/10 muscle related


Conclusions

• Next generation sequencing provides higher power, sensitivity and reproducibility than expression microarrays

• Deep sequencing offers more than microarrays

  • Alternative transcription start site usage

  • Alternative splicing

  • Alternative polyadenylation

  • Allele-specific expression


Acknowledgements

Yavuz Ariyurek
Henk Buermans
Tassos Mastrokolias
Matt Hestand
Johan den Dunnen
Gertjan van Ommen

DNAFORM

Andreas Klinghoff
Matthias Scherf
Thomas Werner

Matthias Harbers
Makoto Suzuki

Wilbert van Workum