WGP Tomato, EU-SOL meeting, July 15, 2009, Antoine Janssen
Post on 01-Jan-2016
Overview
- Whole Genome Profiling: the concept
- PoP in Arabidopsis
- WGP melon
- Combining WGP and WGS
- WGP Tomato
Whole Genome Profiling:
Sequence-based physical mapping of BAC clones using the Illumina Genome Analyzer (Solexa)
Next-generation sequencing technologies have accelerated whole genome re-sequencing approaches and reduced their costs dramatically
but,
de novo construction of genomes in complex organisms is still costly
therefore,
An improved de novo draft genome sequencing strategy is needed taking full advantage of the power of next-generation sequencing
The challenge
BAC libraries
- BACs with 125 kb average insert size, covering the genome 5-20 times (genome equivalents, GE)
[Figure: BAC clones BAC1 to BAC5 positioned along a chromosome]
TTAA……ACTTAGTTAGCTTGGACTAACGAATTCGTAGGCATAGTGACTAGCATTG…..……TTAA
Sites in the fragment: MseI (TTAA) at both ends, EcoRI (GAATTC) in between
Restriction fragments
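As a minimal sketch of how the sites in the example fragment can be located, the Python snippet below scans a sequence for the EcoRI (GAATTC) and MseI (TTAA) recognition sequences. The function name `find_sites` is illustrative, and only the bases between the ellipses of the example fragment are used.

```python
def find_sites(sequence, site):
    """Return the 0-based start position of every occurrence of `site`."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        # Continue one base further so overlapping occurrences are not missed.
        start = sequence.find(site, start + 1)
    return positions


# Internal part of the example fragment (the bases between the ellipses).
fragment = "ACTTAGTTAGCTTGGACTAACGAATTCGTAGGCATAGTGACTAGCATTG"

ecori_sites = find_sites(fragment, "GAATTC")
msei_sites = find_sites(fragment, "TTAA")
print("EcoRI sites:", ecori_sites)
print("MseI sites:", msei_sites)
```

In the slide the MseI sites sit at the fragment ends, outside the bases shown, so only the internal EcoRI site is expected within this stretch.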
Arabidopsis genome: 125 Mbp
6144 BACs (5 GE) in 384-well plates
Each Illumina GA lane:
• 768 BACs
• ~3 M reads
Total: 8 lanes
Individual BAC target preparation is too time-consuming and costly.
Therefore: 2D pooling of BACs, with each pool identified by a unique sample identification tag
Pooling BAC clones
R1 - CTACT
R2 - CAGGT
R3 - GCATC
R4 - TGCAG
R5 - TACTA
R6 - CCTAG
Column pool tags (shown for columns 19-24): TCTGT, AGACT, GAGTC, GTGCA, ATCAC, GTATC
384-well plate = 384 BACs, combined into row pools and column pools
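The 2D pooling idea can be sketched as follows. This is a simplified illustration, assuming a standard 384-well layout of 16 rows by 24 columns and a hypothetical deconvolution rule: a tag is assigned to a well only when it is seen in exactly one row pool and one column pool. The slides do not spell out the actual rule, and the names `pools_for_well` and `deconvolute` are made up for this example.

```python
from itertools import product

ROWS = [f"R{r}" for r in range(1, 17)]     # 16 row pools (assumed plate layout)
COLUMNS = [f"C{c}" for c in range(1, 25)]  # 24 column pools


def pools_for_well(row, column):
    """Each BAC (well) contributes DNA to exactly one row pool and one column pool."""
    return (f"R{row}", f"C{column}")


def deconvolute(pools_with_tag):
    """Assign a tag to a well if it was seen in exactly one row and one column pool.

    Returns (row_pool, column_pool), or None when the assignment is ambiguous,
    for example when several BACs on the plate share the tag.
    """
    rows = sorted(p for p in pools_with_tag if p.startswith("R"))
    cols = sorted(p for p in pools_with_tag if p.startswith("C"))
    if len(rows) == 1 and len(cols) == 1:
        return (rows[0], cols[0])
    return None


wells = list(product(range(1, 17), range(1, 25)))
print(len(wells), "wells covered by", len(ROWS) + len(COLUMNS), "pools")
print(deconvolute({"R3", "C19"}))
print(deconvolute({"R3", "R5", "C19", "C20"}))
```

Under these assumptions a plate needs 40 tagged pools instead of 384 individual preparations, at the price of some tags that cannot be assigned to a single BAC.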
Illumina Genome Analyzer
[Figure: raw Illumina Genome Analyzer reads shown as one continuous block of sequence; each read consists of a 5-base sample identification tag, the restriction site sequence CAATTC and a 20-base genome sequence tag]
Illumina sequence reads:
TCTGT CAATTC TAGTACCAAGCTTGCCATGA
TAAGG CAATTC GTTCCCGGGCCTTGTACACA
GTCGC CAATTC CATCCAATAAATAGCTCTAT
GCATC CAATTC TAGTACCAAGCTTGCCATGA
TATTA CAATTC AATTAGAAGAAATGATATTC
Whole Genome Profiling Sequence Tags
sample identification tag (“barcode”)
Restriction site part of the primer
20 base genome sequence tag flanking RE site
= pool R3
= pool C19
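A read can be split into its three labelled parts with a few lines of Python. This is a sketch, assuming every read is laid out as a 5-base sample identification tag, the restriction site sequence CAATTC and a 20-base genome sequence tag; the row pool tags are the six listed on the pooling slide, and `parse_read` is an illustrative name.

```python
# Row pool tags as listed on the pooling slide.
ROW_TAGS = {
    "CTACT": "R1", "CAGGT": "R2", "GCATC": "R3",
    "TGCAG": "R4", "TACTA": "R5", "CCTAG": "R6",
}
BARCODE_LEN = 5
SITE = "CAATTC"  # restriction site sequence as it appears in the reads
TAG_LEN = 20


def parse_read(read):
    """Split a read into (pool, genome_tag); return None if the layout does not match."""
    site_start = BARCODE_LEN
    tag_start = site_start + len(SITE)
    barcode = read[:BARCODE_LEN]
    site = read[site_start:tag_start]
    tag = read[tag_start:tag_start + TAG_LEN]
    if site != SITE or len(tag) != TAG_LEN:
        return None
    # Barcodes missing from the lookup table are reported as "unknown".
    return ROW_TAGS.get(barcode, "unknown"), tag


print(parse_read("GCATCCAATTCTAGTACCAAGCTTGCCATGA"))
```

Reads whose barcode is not among the six row tags are kept but labelled "unknown"; a full implementation would also need the column pool tags.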
70% of sequence 20-mer tags are unique in rice; > 85% in Arabidopsis
[Figure: fraction of unique tags (0% to 100%) plotted against tag length including the RE site (0 to 30 bp), for EcoMse-At, PstMse-At, EcoMse-Os and PstMse-Os]
Impact of sequence tag length
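The effect of tag length on uniqueness can be illustrated with a small in-silico sketch. It assumes a toy sequence rather than a real genome, takes only the tag that starts at each EcoRI site on one strand, and uses the illustrative names `tags_at_sites` and `fraction_unique`.

```python
from collections import Counter


def tags_at_sites(genome, site, tag_len):
    """Return the tags of `tag_len` bases (site included) starting at each site."""
    tags = []
    start = genome.find(site)
    while start != -1:
        tag = genome[start:start + tag_len]
        if len(tag) == tag_len:  # skip tags truncated by the sequence end
            tags.append(tag)
        start = genome.find(site, start + 1)
    return tags


def fraction_unique(tags):
    """Fraction of tags whose sequence occurs exactly once in the list."""
    if not tags:
        return 0.0
    counts = Counter(tags)
    return sum(1 for tag in tags if counts[tag] == 1) / len(tags)


# Toy sequence with three EcoRI sites; two of them share the same first flanking bases.
toy_genome = "GAATTCAAAAGG" + "GAATTCAAAACCC" + "GAATTCTTTTTT"

for tag_len in (8, 10, 12):
    tags = tags_at_sites(toy_genome, "GAATTC", tag_len)
    print(tag_len, round(fraction_unique(tags), 2))
```

In the toy sequence the two similar sites only become distinguishable once the tag is long enough to reach the bases where they differ, which is the same trend the figure shows for real genomes.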
FingerPrinted Contigs (FPC) assembly
[Figure: BAC1 and BAC2 assembled by FPC]
Assembly of the physical BAC map using adapted FPC
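The idea of building contigs from shared sequence tags can be sketched as below. This is not FPC's algorithm: FPC applies its own statistical overlap scoring, whereas this example substitutes a plain threshold on the number of shared tags and groups BACs with a union-find structure. The BAC names, tag names and the `min_shared` parameter are hypothetical.

```python
from itertools import combinations


def build_contigs(bac_tags, min_shared=2):
    """Group BACs into contigs: BACs sharing at least `min_shared` tags are joined."""
    parent = {bac: bac for bac in bac_tags}

    def find(bac):
        # Follow parent links to the representative, shortening the path on the way.
        while parent[bac] != bac:
            parent[bac] = parent[parent[bac]]
            bac = parent[bac]
        return bac

    for a, b in combinations(bac_tags, 2):
        if len(bac_tags[a] & bac_tags[b]) >= min_shared:
            parent[find(a)] = find(b)

    groups = {}
    for bac in bac_tags:
        groups.setdefault(find(bac), []).append(bac)
    return sorted(sorted(members) for members in groups.values())


bac_tags = {
    "BAC1": {"t1", "t2", "t3", "t4"},
    "BAC2": {"t3", "t4", "t5", "t6"},
    "BAC3": {"t5", "t6", "t7"},
    "BAC4": {"t9", "t10"},
}
contigs = build_contigs(bac_tags)
print(contigs)
```

BACs that end up alone in a group correspond to singletons; raising `min_shared` makes the grouping stricter.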
Whole Genome Arabidopsis
Arabidopsis genome: 125 Mbp
6144 BACs (6 GE) in 384-well plates
Each Illumina GA lane:
• 768 BACs
• ~3 M reads
Total: 8 lanes
Results 6 GE Arabidopsis
4599 BACs, 65,000 tags
234 contigs (2 – 125 BACs), 541 singletons
85% coverage
[Figure: FPC assembly of BAC1 and BAC2]
WGP Arabidopsis thaliana ecotype Columbia
- 6144 BACs (5 GE); WGP using one Illumina GA classic run
- 65,000 sequence tags
- Assembly of 4599 BACs (75%): 234 contigs (2 – 125 BACs/contig)
Validation on genome sequence by BLAST analysis of WGP sequence tags:
- 52,000 tags with 100% hits, covering 99% of the genome; max. gap 125 kbp
- 50,000 unique hits; average 2,355 bp between tags
- 86% of all EcoRI sites represented
PoP Arabidopsis thaliana
[Figure: tag presence/absence matrix, with X marking a sequence tag found in a BAC. BACs shown: BAC852, BAC4124, BAC1373, BAC285, BAC2544, BAC704, BAC3536, BAC2070, BAC4237, BAC5328, BAC3912, BAC1461]
(X marks) Sequence Chrom bp
X X X X X GAATTCTCAGTCACCGTCGGCGTTTG 3 17059264
X X X X X X GAATTCCACCAGCTACGATACCAACT 3 17059284
X X X X X GAATTCGAGAAACTCGGAGGAATCGA 3 17064605
X X X X X X GAATTCAGACTCTGGTAAGCTTTCTT 3 17064625
X X GAATTCTCTGTCTTCTATTCTTGCTG 3 17070747
X X X X X X GAATTCAAGAGTACCTTTCAAGGGAG 3 17070767
X X X X X X X GAATTCCAGTGTATCCATTAGGCCCT 3 17075442
X X X X X X X GAATTCCAAGTTCTTGTTGCAGCCAT 3 17075462
X X X X X X X GAATTCAATCGAGTAAACTCTTCGCA 3 17078725
X X X X X X X GAATTCTCCCTGAGGAACTATAATTG 3 17078745
X X X X X X GAATTCAGAAGAACCCTAGACTAAAT 3 17100591
X X X X X X X GAATTCAATGCATTTTTGATTTTCCA 3 17100611
X X X X X X X GAATTCTATCCCTAAGTGCTACAACA 3 17103098
X X X X X X X GAATTCCATAAAGTTCTCGGATCACA 3 17103118
X X GAATTCGCTAGTTTTAAGATCATTAT 3 17107409
X X X X X GAATTCGGATTTAAACGCGTTCTCGA 3 17107429
X X X X X X GAATTCAACACGGTATCAATGAACAA 3 17108386
X X X X X X GAATTCACGGTAATGTTGAGCTTGCA 3 17109561
X X GAATTCGGAGATGAATCTTTGGTTTC 3 17110126
X X X X X X X GAATTCAGCATGGAAAAAGTGGTGCT 3 17117547
X GAATTCACTAAATTAATCAAACCTCA 3 17117567
X X X GAATTCATGGTTAATTTGTATAGATT 3 17121474
X X X X X X X GAATTCTATGATACACTTATGTAGTT 3 17124404
X X X X X X X GAATTCCTCTTGTCAAAAAATTTATC 3 17124424
X X X X GAATTCAGGTATTCGATGGTTAATTT 3 17124841
X X X X X X X GAATTCTACACTACACTAATGAGGTC 3 17124861
X X X X X X X GAATTCGCCACCAGAACTACTCAGGT 3 17125227
X X X X X X X GAATTCAACACCAATAGTGGATTTAG 3 17125247
X X GAATTCGGTTTATTAATTATGGCAGC 3 17127568
X X X X X X X GAATTCAGAATATACATTCCTTACTT 3 17127588
X X X X X X X GAATTCCGTCAGTTGTGCACCCATCG 3 17129227
X X X X X X GAATTCCGCAGGAAACAGTGGTCCAG 3 17129392
X X X X X X X X GAATTCTACTATGGGTCCAACGTATG 3 17132377
X X X X X X X GAATTCGTTTTCTACCTTACACATTC 3 17132397
X X GAATTCTTGATCGATATATAGACATG 3 17133709
X X X X X X X X GAATTCATAGAACCTCTAACAAATGT 3 17133729
X GAATTCCATCAGATGTGCACCTTATG 3 17134538
X X X X X X X X GAATTCTAGCCGCATTTGATGATGCC 3 17134558
X X X X X X X X X GAATTCCCCATAAACTAAGCATATAT 3 17145004
X X X X X X X X X GAATTCCCAAAAGAGTAAGGAAAAAG 3 17145024
X X X X X X X X X GAATTCGAATCCTTTTGTGCGGTTTC 3 17148314
X X X X X X X X X GAATTCAACATGTGATCTTCATCTAA 3 17148334
BACs in order of their FPC result
PoP Arabidopsis thaliana
450 Mbp estimated genome size
47,000 BACs (EcoRI and HindIII libraries), ~13 GE in total
Available for contig building:
- 5 GA runs
- 300 M reads
- 196,000 unique sequence tags
- 40,000 BACs (85%) uniquely tagged, average 33 tags/BAC
WGP melon
WGP melon: results
- 549 contigs, 6416 singleton BACs
- Median 21 BACs / contig
- 78% genome coverage
[Figure: Contig size distribution, melon. Axes: contig size (#BACs/contig) and # contigs]
Whole Genome Profiling
Combining WGP and WGS
Roche GS FLX Titanium and Illumina Genome Analyzer II
Whole Genome Profiling and Whole Genome Sequencing
GS FLX Titanium sequencing (15 X):
- 10 GS FLX Titanium random shotgun runs
- 3 3-kb and 4 long jump paired-end (p.e.) GS FLX Titanium runs
Illumina GA II paired-end sequencing (30 X): 500 bp, 2 kb and 10 kb
Status:
- GS and GA sequencing completed
- GS assembly completed
- GA assembly in progress
WGS melon genome
Combining WGP and WGS
[Figure: WGP BAC contigs carrying WGP sequence tags at EcoRI sites, 2 - 3 kb apart, shown together with (paired-end) WGS contigs built from 400 nt Titanium reads and 36 nt GA II reads]
Combining WGP and WGS
Advantages:
- WGP provides sequence-based anchor points for WGS
  - Use WGP to create a high-resolution sequence-based physical BAC map, e.g. 10 X BAC library coverage
  - Use WGS to generate (deep) coverage whole genome sequence
- Superior assembly: the WGP map contains far fewer contigs (549) than genomes sequenced by conventional random shotgun WGS strategies (tens of thousands) and produces more accurate maps than fingerprint-based physical mapping
- Cost reduction: no Sanger sequencing required
- Direct access to BAC clones in regions of interest
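How WGP tags could serve as anchor points for WGS contigs can be sketched with exact string matching. This is an illustration only: the two tags are taken from the Arabidopsis tag list, while the WGS contig sequences, the BAC contig names and the function `anchor_contigs` are hypothetical, and a real pipeline would need to tolerate sequencing errors.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence made of A, C, G, T."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def anchor_contigs(wgs_contigs, tag_to_bac_contig):
    """Map each WGS contig name to the set of WGP BAC contigs whose tags it contains."""
    anchors = {}
    for name, sequence in wgs_contigs.items():
        hits = set()
        for tag, bac_contig in tag_to_bac_contig.items():
            # A tag may occur on either strand of the WGS contig.
            if tag in sequence or reverse_complement(tag) in sequence:
                hits.add(bac_contig)
        anchors[name] = hits
    return anchors


tag1 = "GAATTCTCAGTCACCGTCGGCGTTTG"
tag2 = "GAATTCGAGAAACTCGGAGGAATCGA"
tag_to_bac_contig = {tag1: "WGP_ctg1", tag2: "WGP_ctg2"}  # hypothetical assignment

wgs_contigs = {
    "wgs_A": "ACGTTGCA" + tag1 + "TTGACC",
    "wgs_B": "CC" + reverse_complement(tag2) + "GG",
    "wgs_C": "ACGTACGTACGT",
}
anchors = anchor_contigs(wgs_contigs, tag_to_bac_contig)
print(anchors)
```

A WGS contig with no matching tag stays unanchored, and one matching tags from several BAC contigs would suggest either a link between them or a misassembly.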
Status: WGP Tomato
4 types of BAC libraries:
- HindIII: 15360 clones, 120 kbp insert
- EcoRI: 15360 clones, 120 kbp insert
- MboI: 15360 clones, 120 kbp insert
  (20 pools; total 5.5 Gbp / 950 Mbp = 5.7 x)
- Random sheared (Lucigen): 50688 clones, 90 kb insert
  (16 pools; total 4.6 Gbp / 950 Mbp = 4.8 x)
Total nr of clones: 96768, of which 92160 are analyzed (95%)
- Approximately 85% of RE BACs deconvolutable
- Approximately 60% of sheared BACs deconvolutable
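The coverage figures follow from clones × insert size / genome size. The sketch below redoes that arithmetic with the clone counts, insert sizes and 950 Mbp genome size given on the slide; the function name `genome_equivalents` is illustrative.

```python
GENOME_SIZE_BP = 950e6  # tomato genome size used on the slide (950 Mbp)

# Library name -> (number of clones, insert size in bp), as listed on the slide.
libraries = {
    "HindIII": (15360, 120_000),
    "EcoRI": (15360, 120_000),
    "MboI": (15360, 120_000),
    "Random sheared": (50688, 90_000),
}


def genome_equivalents(clones, insert_bp, genome_bp=GENOME_SIZE_BP):
    """Library coverage expressed as multiples of the genome size."""
    return clones * insert_bp / genome_bp


re_coverage = sum(
    genome_equivalents(*libraries[name]) for name in ("HindIII", "EcoRI", "MboI")
)
sheared_coverage = genome_equivalents(*libraries["Random sheared"])
total_clones = sum(clones for clones, _ in libraries.values())

print(f"Restriction enzyme libraries: {re_coverage:.2f} x")
print(f"Random sheared library: {sheared_coverage:.2f} x")
print(f"Total clones: {total_clones}, analyzed: {92160 / total_clones:.0%}")
```

With unrounded inputs the restriction enzyme libraries come out at roughly 5.8 x and the sheared library at roughly 4.8 x, close to the values on the slide, and the four clone counts sum to 96768.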
Comparison WGP results

                                       Arabidopsis   Melon     Tomato
genome size (Mbp)                      130           450       950
tag length (incl. restriction site)    26            26        26
nr BACs tested                         6,144         47,616    92,160
genome equivalents BACs tested         5.9           13.2      12.1
enzyme combination                     E/M           E/M       E/M
nr OK reads generated (M)              28.2          191       326.9
nr deconvolutable reads (M)            12.1          96.0      136.7
% deconvolutable reads                 43%           50%       42%
nr unique tags                         65,734        181,254   336,258
nr tagged BACs (FPC ready)             4,599         39,935    67,742
% tagged BACs (FPC ready)              75%           84%       74%
average nr tags/BAC                    40            31        40
average nr reads/tag                   71            57        51
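The two percentage rows of the comparison can be recomputed from the count rows. The sketch below assumes each fused row holds three values in the order Tomato, Melon, Arabidopsis, which is an interpretation of the extracted text; the dictionary keys and the helper `percent` are illustrative.

```python
# Counts read from the comparison table (reads in millions).
results = {
    "Arabidopsis": {"bacs_tested": 6144, "tagged_bacs": 4599,
                    "ok_reads_m": 28.2, "deconv_reads_m": 12.1},
    "Melon": {"bacs_tested": 47616, "tagged_bacs": 39935,
              "ok_reads_m": 191.0, "deconv_reads_m": 96.0},
    "Tomato": {"bacs_tested": 92160, "tagged_bacs": 67742,
               "ok_reads_m": 326.9, "deconv_reads_m": 136.7},
}


def percent(part, whole):
    """Percentage of `part` in `whole`, rounded to a whole number."""
    return round(100 * part / whole)


for species, row in results.items():
    deconvolutable = percent(row["deconv_reads_m"], row["ok_reads_m"])
    tagged = percent(row["tagged_bacs"], row["bacs_tested"])
    print(f"{species}: {deconvolutable}% deconvolutable reads, {tagged}% tagged BACs")
```

Under that reading the recomputed percentages agree with the percentage rows of the table, which supports the way the fused numbers were split.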
What next
- Finish last 5% (planned for next run)
- Contig building with FPC
- Deliver data
EU-SOL: Integrate with WGS data
Thanks to:
Amplicon Express: Robert Bogden, Keith Stormo, Quanzhou Tao
454 Life Sciences / Roche Applied Science: Jason Affourtit, Brian Desany, Hans Lunstroo
University of Udine: Michele Morgante
CBSG / EU SOL: Willem Stiekema, Roeland van Ham, René Klein Lankhorst
BioSeeds companies: Rijk Zwaan, Enza Zaden, Vilmorin & Cie, Takii & Co
Keygene N.V.:
- Upstream Research: Marcel Prins, Marjo de Ruiter, Hein van der Poel, Marjolijn Kelder, Anita Bonné, Nathalie van Orsouw, Esther Verstege, Taco Jesse
- Applied Research: René Hofstede, Anker Sørensen, Richard Feron, Martin Zevenbergen, Linda de Leeuw, Alberto Maurer, Marco van Schriek, Jeroen Rombout
- Bio-informatics: Jan van Oeveren, Antoine Janssen, Hanne Volpin, Jifeng Tang
- ICT: Kornelis Stol, Harold Verstegen
- Business Development: Jon Wittendorp, Herco van Liere, Mark van Haaren
Contact: michiel.van-eijk@keygene.com, herco.van-liere@keygene.com
Keygene N.V. owns patents and patent applications covering its Whole Genome technologies