aug2014 abrf interlaboratory study plans

The ABRF Next Generation Sequencing Study: Multi-Platform and Cross-Methodological Reproducibility of RNA and DNA Profiling. Genome in a Bottle Consortium Workshop, August 2014. Don A. Baldwin, Ph.D., CSO, Pathonomics LLC

DESCRIPTION

ABRF update

TRANSCRIPT

Page 1: Aug2014 abrf interlaboratory study plans

The ABRF Next Generation Sequencing Study:
Multi-Platform and Cross-Methodological Reproducibility of RNA and DNA Profiling

Genome in a Bottle Consortium Workshop

August 2014

Don A. Baldwin, Ph.D.
CSO, Pathonomics LLC

Page 2: Aug2014 abrf interlaboratory study plans
Page 3: Aug2014 abrf interlaboratory study plans

ABRF is an international organization of over 700 scientists from shared research resource core facilities and biotechnology laboratories.

Members represent over 250 core labs in academic and research institutions, government, and industry.

“Yellow pages” and “MarketPlace” databases of members at www.ABRF.org

Electronic discussion group facilitates sharing of technical advice and core facility networking.

The Journal of Biomolecular Techniques covers genomics, proteomics, imaging, and other biotechnologies, and core facility operational management.

Page 4: Aug2014 abrf interlaboratory study plans

www.abrf.org

Metagenomics (MGRG)

Page 5: Aug2014 abrf interlaboratory study plans

The ABRF Next Generation Sequencing (NGS) Study:

• Produce reference data sets to establish baseline performance
• Promote the use of standard samples
• Provide public access to data for self-evaluation, performance monitoring and methods development

Phase I: RNA-Seq and degraded RNA-Seq (2011-2013)
Phase II: DNA-Seq and hard-to-sequence regions and samples (2014-2016)
Phase III: Clinical genetics sequencing panels

Page 6: Aug2014 abrf interlaboratory study plans

Phase I Study Design

Page 7: Aug2014 abrf interlaboratory study plans

Major Conclusions

Intraplatform concordance: Spearman rank R > 0.86

Interplatform concordance: R > 0.83

Base quality scores: Q10 to Q60, most variation at read starts and ends

Higher alignment rates with platform-specific algorithms vs. STAR

Higher single-base mismatch and indel rates with platform-specific algorithms vs. STAR

Wide range of efficiencies and costs for splice junction profiling

Highly similar profiles from rRNA-depleted and polyA-enriched samples

Effective analysis of degraded RNA after rRNA depletion
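The concordance figures above are Spearman rank correlations. As a rough illustration of how such a value is computed, the sketch below ranks two expression profiles (averaging tied ranks) and takes the Pearson correlation of the ranks. The function names and the two small input lists are hypothetical and are not study data; the study's own analysis pipeline is not described on these slides.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of sorted positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical per-gene expression values from two runs (illustration only).
run_a = [5.0, 120.0, 33.0, 0.5, 78.0, 12.0]
run_b = [7.0, 60.0, 41.0, 0.9, 98.0, 15.0]
rho = spearman(run_a, run_b)
print(f"Spearman rank R = {rho:.3f}")
```

Because only ranks are compared, the statistic is insensitive to platform-specific scaling of expression values, which is presumably why it suits cross-platform comparison.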

Page 8: Aug2014 abrf interlaboratory study plans

The ABRF NGS Study, Phase I

Nature Biotechnology, September 2014
6 figures, 2 tables
39 supplementary figures, 7 supplementary tables

26 primary scientists
34 contributing scientists
21 research institutions

4.3 billion reads
447 billion nucleotides

Funded by:
Vendor donations of sample preparation and sequencing reagents
Participating laboratories
ABRF

Page 9: Aug2014 abrf interlaboratory study plans

The ABRF NGS Study, Phase II

DNA sequencing topics were brainstormed and prioritized by the study consortium

Samples were chosen based on the August 2013 Genome in a Bottle Workshop

Page 10: Aug2014 abrf interlaboratory study plans

Phase II DNA sequencing aims

Reference data sets
• Intra- and inter-lab replication to model the range of performance expected under normal service laboratory conditions

Reference samples
• Easily accessible for self-evaluation by comparison to the reference data
• Standardized, stably reproduced, suitable for methods development

Immediate utility
• Performance metrics and data applicable to methods used now or in the near future by sequencing core facilities

Page 11: Aug2014 abrf interlaboratory study plans

Projects
in no particular order, with project scope and sequencing coverage to be prioritized by interest and funding:

Performance using different platforms and technical protocols
• NIST GiaB designated human genomic DNA
• Measure sequencing accuracy and coverage

Performance using damaged DNA and chimeric cell populations
• DNA from formalin-fixed, paraffin-embedded cells
• Measure sequencing accuracy, coverage, and limits of detection for somatic mutations

Performance on small genomes over a range of GC content
• NIST GiaB (with FDA) designated bacterial genomic DNA
• Measure sequencing accuracy and coverage

Page 12: Aug2014 abrf interlaboratory study plans

Samples: Performance using different platforms and technical protocols

Sample ID | DNA source | Sequencing project | Breadth per replicate | Depth per replicate
A | Ashkenazim Jew PGP, maternal cell line | 1 | genome | 35x
B | Ashkenazim paternal cell line | 1 | genome | 35x
C | Ashkenazim child cell line from NIST stock | 1 | genome | 35x

Page 13: Aug2014 abrf interlaboratory study plans

Samples: Performance using damaged DNA and chimeric cell populations

Sample ID | DNA source | Sequencing project | Breadth per replicate | Depth per replicate
M | pool of FFPE DNA from mutant AcroMetrix lines #1, #2 and #3 plus Horizon Dx line #4: lines 1 and 4 at 40% each, lines 2 and 3 at 10% each by copy number | - | - | -
Syn | Accugenomics pool of synthetic templates for the mutations in lines #1-#4; tagged, stock = 40:40:10:10 | - | - | -
C2 | Ashkenazim child cell line from Coriell stock | 2 | exome | 100x
C2f | Ashkenazim child cell line suspension, formalin-fixed, paraffin-embedded | 2 | exome | 100x
Mf0 | 100% M DNA from FFPE; spike gDNA with Syn at molarity = single copy gene in total M DNA | 2 | exome | 100x
Mf1 | 25% C2f, 75% M (each target’s copy number = 15% or 3.75%); spike gDNA with Syn = M | 2 | exome | 100x
Mf2 | 50% C2f, 50% M (targets = 10% or 2.5%); spike gDNA with Syn = M | 2 | exome | 100x
Mf3 | 75% C2f, 25% M (targets = 5% or 1.25%); spike gDNA with Syn = M | 2 | exome | 100x
Mf4 | 90% C2f, 10% M (targets = 2% or 0.5%); spike gDNA with Syn = M | 2 | exome | 100x
Mf5 | 95% C2f, 5% M (targets = 1% or 0.25%); spike gDNA with Syn = M | 2 | exome | 100x
Mf6 | 99% C2f, 1% M (targets = 0.2% or 0.05%) | - | - | -
Mf7 | 99.5% C2f, 0.5% M (targets = 0.1% or 0.025%) | - | - | -
Mf8 | 99.9% C2f, 0.1% M (targets = 0.02% or 0.005%) | - | - | -
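The target percentages listed for blends Mf1 to Mf8 can be reproduced from the blend ratios. The sketch below is one reading of that arithmetic, not a calculation given on the slides: it assumes each target equals (fraction of M in the blend) x (that line's share of pool M, 40% or 10%) x 0.5. The 0.5 factor is an assumption, consistent with a mutation carried on one of two alleles, and was chosen because it matches every listed value. All names in the code are hypothetical.

```python
# Share of pool M contributed by each cell line, by copy number (from the table):
# lines #1 and #4 at 40% each, lines #2 and #3 at 10% each.
HIGH_SHARE = 0.40
LOW_SHARE = 0.10

# ASSUMPTION (not stated on the slide): the mutant allele is one of two copies,
# so the mutant fraction is half of the line's DNA contribution.
ALLELE_FACTOR = 0.5

# Fraction of pool M DNA in each blend; the remainder is C2f (from the table).
M_FRACTION = {
    "Mf1": 0.75, "Mf2": 0.50, "Mf3": 0.25, "Mf4": 0.10,
    "Mf5": 0.05, "Mf6": 0.01, "Mf7": 0.005, "Mf8": 0.001,
}


def target_percent(m_fraction, pool_share):
    """Expected mutant target level in a blend, as a percentage."""
    return round(100 * m_fraction * pool_share * ALLELE_FACTOR, 6)


def expected_variant_reads(percent, depth):
    """Mean number of reads carrying the variant at a given depth."""
    return depth * percent / 100


targets = {
    name: (target_percent(f, HIGH_SHARE), target_percent(f, LOW_SHARE))
    for name, f in M_FRACTION.items()
}

for name, (high, low) in targets.items():
    print(f"{name}: targets = {high}% or {low}%")

# At the planned 100x exome depth, the lowest Mf5 target averages well under
# one supporting read per site.
print(expected_variant_reads(targets["Mf5"][1], depth=100))
```

Under this reading, the lower targets in Mf5 already fall below one expected variant read at 100x. That may be why no 100x exome entry is listed for Mf6 to Mf8, though the slides do not say so.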

Page 14: Aug2014 abrf interlaboratory study plans

Oncogenic mutations

• BRAF V600E
• KRAS G12C
• EGFR c.2235_2249 del15
• EML4-ALK

Page 15: Aug2014 abrf interlaboratory study plans

Samples: Performance on small genomes over a range of GC content

Sample ID | DNA source | Sequencing project | Breadth per replicate | Depth per replicate
Sta | Staphylococcus aureus | 3 | genome | 100x
Sae | Salmonella enterica | 3 | genome | 100x
Psa | Pseudomonas aeruginosa | 3 | genome | 100x
Cls | Clostridium sporogenes | 3 | genome | 100x
P | pooled metagenomic sample with all four bacterial genomes | 3 | genome | 100x

Page 16: Aug2014 abrf interlaboratory study plans

Small genomes project: sizes and GC content

Species | Genome (bp) | Avg % GC | Reference strain | Distributor
Staphylococcus aureus | 2.8 x 10^6 | 33 | NRS77 (NCTC 8325) | NARSA #NRS77
Salmonella enterica subsp. enterica serovar Typhimurium | 4.9 x 10^6 | 52 | LT2 | ATCC #700720
Pseudomonas aeruginosa | 6.7 x 10^6 | 67 | PAO1 | ATCC #47085
Clostridium sporogenes | 4.1 x 10^6 | 28 | Metchnikoff | ATCC #15579
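Given the genome sizes above and the 100x depth planned for each bacterial sample, the sequencing yield needed per replicate follows from depth = total bases / genome size. The sketch below works this out. The 150-base read length is a hypothetical value chosen for illustration, since the slides do not specify read lengths, and the function names are likewise illustrative.

```python
import math

# Genome sizes in base pairs, from the table above.
GENOME_BP = {
    "Sta": 2_800_000,  # Staphylococcus aureus
    "Sae": 4_900_000,  # Salmonella enterica
    "Psa": 6_700_000,  # Pseudomonas aeruginosa
    "Cls": 4_100_000,  # Clostridium sporogenes
}
TARGET_DEPTH = 100       # planned depth per replicate
READ_LENGTH = 150        # ASSUMPTION: hypothetical read length in bases


def bases_needed(genome_bp, depth):
    """Total sequenced bases required for the given mean depth."""
    return genome_bp * depth


def reads_needed(genome_bp, depth, read_length):
    """Minimum read count, assuming every base of every read aligns."""
    return math.ceil(bases_needed(genome_bp, depth) / read_length)


for sample, size in GENOME_BP.items():
    total = bases_needed(size, TARGET_DEPTH)
    reads = reads_needed(size, TARGET_DEPTH, READ_LENGTH)
    print(f"{sample}: {total:,} bases, at least {reads:,} reads of {READ_LENGTH} bases")
```

These are lower bounds on mean depth only. Uneven coverage across regions of extreme GC content, which this project is designed to measure, would call for additional yield.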

Page 17: Aug2014 abrf interlaboratory study plans

Platforms and library methods

Platform | Project 1 samples | Project 2 samples | Project 3 samples
Illumina X10 | A, B, C, C2 | - | -
Illumina 1 T | A, B, C, C2 | - | -
Illumina NextSeq 500 | A, B, C, C2 | - | Sta, Sae, Psa, Cls, P
Illumina HiSeq 2500 | A, B, C, C2 | C2, C2f, Mf0-Mf5 | -
Illumina 2500 Rapid run | C for long-read scaffold | - | -
Illumina MiSeq | C for long-read scaffold | - | Sta, Sae, Psa, Cls, P
Life Technologies Proton | A, B, C | C2, C2f, Mf0-Mf5 | Sta, Sae, Psa, Cls, P
Life Technologies PGM | - | - | Sta, Sae, Psa, Cls, P
Pacific Biosciences | C for long-read scaffold | - | Sta, Sae, Psa, Cls, P
Qiagen GeneReader | - | - | Sta, Sae, Psa, Cls, P

Library protocol | Project 1 samples | Project 2 samples | Project 3 samples
Illumina Moleculo | A, B, C | - | -
Nextera on HiSeq | A, B, C | - | Sta, Sae, Psa, Cls
NuGEN on HiSeq | A, B, C | - | Sta, Sae, Psa, Cls
New England Biolabs on HiSeq | A, B, C | - | Sta, Sae, Psa, Cls
Kapa on HiSeq | A, B, C | - | Sta, Sae, Psa, Cls
Rubicon on HiSeq | A, B, C | - | Sta, Sae, Psa, Cls
Bioo on HiSeq | A, B, C | - | Sta, Sae, Psa, Cls
EXOME: Agilent Sure Select | - | C2, C2f, Mf0-Mf5 | -
EXOME: Roche Nimblegen SeqCap EZ | - | C2, C2f, Mf0-Mf5 | -
EXOME: Ampliseq Exome Panel for Proton | - | C2, C2f, Mf0-Mf5 | -

Page 18: Aug2014 abrf interlaboratory study plans

An ABRF – GiaB collaboration

• Get vendor commitments for technical support and reagent donations
• Extract high-quality genomic DNA from cultured cells for A, B, C, C2, Sta, Sae, Psa and Cls
• Prepare equimolar blend of bacterial DNA for pool P
• Procure somatic mutation cell lines in FFPE blocks
• Extract genomic DNA from FFPE blocks of cell suspensions, prepare blends
• Assemble platform groups with at least 3 labs per instrument or method
• Each platform group will determine a consensus protocol for library preparation and sequencing
• Distribute aliquots of DNA reference stocks to participating study labs
• Construct and/or sequence libraries (intra-lab replicates encouraged)
• Collect and annotate data in a central repository
• Analyze sequencing performance

Status key: planned, started, complete
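The equimolar blend for pool P listed in the work plan can be sketched numerically. Assuming "equimolar" means equal numbers of genome copies per species, each species' share of DNA mass is proportional to its genome size, because a larger genome weighs more per copy. The sketch below uses the genome sizes and average GC values from the small genomes table; it ignores plasmids and any difference in extraction yield, and the variable names are illustrative.

```python
# Genome size (bp) and average % GC per species, from the small genomes table.
GENOMES = {
    "Sta": (2_800_000, 33),  # Staphylococcus aureus
    "Sae": (4_900_000, 52),  # Salmonella enterica
    "Psa": (6_700_000, 67),  # Pseudomonas aeruginosa
    "Cls": (4_100_000, 28),  # Clostridium sporogenes
}

# ASSUMPTION: equimolar = one genome copy of each species per unit of pool,
# so mass share scales with genome length.
total_bp = sum(size for size, _ in GENOMES.values())
mass_fraction = {name: size / total_bp for name, (size, _) in GENOMES.items()}

# Length-weighted mean GC of the pooled DNA under the same assumption.
pool_gc = sum(size * gc for size, gc in GENOMES.values()) / total_bp

for name, fraction in mass_fraction.items():
    print(f"{name}: {100 * fraction:.1f}% of pool DNA by mass")
print(f"Pool P average GC: {pool_gc:.1f}%")
```

One consequence of this design is that reads from an equimolar pool would be expected to split in roughly these mass proportions, so each genome reaches a similar depth. Departures from that split could then indicate GC-related bias.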

Page 19: Aug2014 abrf interlaboratory study plans

The ABRF NGS Study leadership group
in alphabetical order, with level of participation and devotion to be prioritized by alcoholic intake:

Name | email | Contact regarding:
Baldwin, Don | [email protected] | study design
Grills, George | [email protected] | vendor and partner relations
Mason, Chris | [email protected] | data analysis
Nicolet, Charlie | [email protected] | sequencing methods
Tighe, Scott | [email protected] | logistics