iwgsc-whole genome shotgun assembly … · curtis pozniak* ron maclachlan kirby nilsen krysta wiebe...

16
Crop Development Centre Durum Wheat Breeding and Genetics Program Dr. Curtis Pozniak IWGSC Workshop, PAG Jan 2016 ©Use and/or distribution of these slides is prohibited unless approval by the author is granted IWGSC-WHOLE GENOME SHOTGUN ASSEMBLY OF CHINESE SPRING

Upload: voque

Post on 28-Aug-2018

216 views

Category:

Documents


0 download

TRANSCRIPT

Crop Development Centre Durum Wheat Breeding and Genetics Program

Dr. Curtis PozniakIWGSC Workshop,

PAG Jan 2016

©Use and/or distribution of these slides is prohibited unless approval by the author is granted

IWGSC-WHOLE GENOME SHOTGUN ASSEMBLY OF CHINESE SPRING

Kellye Eversole* Jane Rogers *

Gil Ronen* Kobi BaruchOmer Barad

Nils Stein*Axel Himmelbach

Ines Walde Martin Mascher

Mike Thompson*

Jesse Poland*

Andrew Sharpe*Alvaro Hernandez

Curtis Pozniak*Ron MacLachlan

Kirby NilsenKrysta Wiebe

Assaf Distelfeld *

Principle Investigators (*) and Co-authors

David KonkinSateesh Kagale

Fred Choulet*

August Sept October November December January February

DNA Extraction

Agreement Go!

Libraries Complete

Sequencing Complete

Assembly V0.1

March

De novo assembly using Illumnia generated (200X) PE and MP data and NRGene Assembly pipline.

IWGS-Whole Genome Assembly Project

Assembly V0.2

Complete QA

IWGSC-WGA v0.2 MetricsTotal No. Scaffolds 138,484Assembly size: 14.5 GbGaps size: 262 Mb Gaps: 1.8 %L50: 7.1 MbN50 (#sequences): 566L90: 1.26 MbN90 (#sequences): 2,363Max size 46 Mb

Assembly Comparisons

• Alignment to MTP-based assemblies

• Alignment to physical maps

• Genic/Intergenic space

• Anchoring of assembly

Preliminary QCof IWGS-WGA

Martin Mascher,IPK Gatersleben

Fred ChouletINRA Clermont

Alignment to MTP-based Assemblies>95% of the 3B Pseudomolecule is present in the IWGSC-WGA

position in 3B reference scaffold (Mb)

1 2 3 4 5

Posit

ion

in N

RGen

esc

affo

ld (M

b)

SSt1 ta3B Interval (4 Mb)

Comparison of Physical Maps to CS-WGS AssemblyMapping of whole genome profiling tags of LTC-derived physical maps

to IWGSC-WGA reveals near perfect colinearity

Analysis of Genic / Intergenic Space

IWGSC-CSS Gene Models

4315 scaffolds (14.0 Gb) with 1 + CSS genes

3197 scaffolds (13.6 Gb) with 3 + CSS genes

Full length 1 Scaffold Partial Sequence >1 Scaffold Missing Genes

97,553*

Gene Models

*99% identity, 90% coverage

TE junction-based markers

Mapped ISBPs Unmapped ISBPs

62,631ISBP Markers

Chrom 1B.

>97% of the CSS coding exons are present on a single scaffold>98% of ISBP markers map to the IWGSC-WGA

Anchoring of IWGSC-WGA

Source No. markers % Mapped

CSS Survey Sequence (337,000contigs) 99.3%

Cs x Re 272,218 92.1%

Syn x Opata (437,973contigs*) 98.3%

90K iSelect 81,987 97.6%

U of Bristol Axiom** 819,572 98.1%

High density marker data allows anchoring of 14.0 Gb to genetic maps

>14.1 Gb (96.5 %) assigned to

chromosome arms

*Number of JGI Contigs anchored to Syn x Opata genetic map** U of Bristol WISP Project (www.cerealsdb.uk.net)

>431,617 anchored scaffolds

map to Syn x Opata

Chromosome arm assignment and Hi-C

Ines WaldeAxl HimmelbachNils Stein Martin Mascher

BioNano map 84

7A group scaffolds

Bionano single molecule maps of chrom. 7A confirm assembly order and allow super-scaffolding of IWGSC-WGA

Optical Mapping

Analysis courtesy of Rudi Appels and Gabriel Keeble

BioNano map 49

Approx. 5 % scaffolds >1 Mb are putative chimeras

Evidence for Chimeric Assemblies

IWGSC Whole Genome Assembly Data Release

IWGSC Data Repository hosted by INRA-URGIhttp://wheat-urgi.versailles.inra.fr/Seq-Repository

After QC, IWGSC-WGA will be released under Toronto agreement through the IWGSC sequence repository

April 2016

Initial QA suggests that the IWGSC-WGA is impressively

complete

Chromosome-specificMTP assembliesIWGSC-WGA

Physical Maps

Optical / Bionano map / Genetic maps

Other WGS assemblies:TGAC

PACBio/IlluminaCSS

WGP Tags of All MTPs

Discussion initiated and ongoing!

A Path Forward: Integration of all Resources A Complete Wheat Genome Sequence by 2017

Thanks to Our Funders!