Genotyping By Sequencing (GBS) Method Overview Charlotte B. Acharya Institute for Genomic Diversity Cornell University http://www.igd.cornell.edu/


Page 1: Genotyping By Sequencing (GBS) Method Overviewcbsu.tc.cornell.edu/lab/doc/September_2012_class_CBA.pdf · Genotyping By Sequencing (GBS) Method Overview Charlotte B. Acharya ... 1

Genotyping By Sequencing (GBS)

Method Overview

Charlotte B. Acharya

Institute for Genomic Diversity

Cornell University

http://www.igd.cornell.edu/


•Background/Goals

•GBS lab protocol

•Illumina sequencing review

•GBS adapter system

•How GBS differs from RAD

•Modifying GBS for different species

•GBS Workflow

Topics Presented


Background

Genotyping by sequencing (GBS) in any large genome

species requires reduction of genome complexity.

I. Target enrichment

•Long range PCR of specific genes or genomic subsets

•Molecular inversion probes

•Sequence capture approaches, hybridization-based (microarrays)

II. Restriction Enzymes (REs)

*Technically less challenging*

• Methylation sensitive REs filter out repetitive genomic fraction


QTL are often located in non-coding regions

Vgt1, Tb, B regulatory regions 60-150 kb from gene

[Figure: gene model with four exons; labels: Exon capture, QTL.]

Map large numbers of genome-wide markers


Create inexpensive, robust multiplex

sequencing protocol

Low- or high-coverage

Sequencing

Illumina GA

Informatics Pipelines

Anchor markers across the genome

Impute missing data

if needed

Combine genotypic & phenotypic data for

QTL mapping, GS and GWAS

We have created a public genotyping/informatics

platform based on next-generation sequencing


Open Source

• Method available for anyone to use / modify.

• Analysis pipeline details and code are public.

• Promote dataset compatibility.

• Method published in PLoS ONE to promote

accessibility.

• Genotype calls available for public projects.


Overview of Genotyping by Sequencing (GBS)

• Focuses NextGen sequencing power to ends of restriction fragments

• Both SNPs and presence/absence markers can be scored

• Small indels are identified but are not scored

[Figure: restriction fragments (< 450 bp) compared between Sample1 and Sample2; labels: Restriction site, ( ) sequence tag, Loss of cut site.]


• Reduced sample handling

• Few PCR & purification steps

• No DNA size fractionation

• Efficient barcoding system

• Simultaneous marker discovery

& genotyping

• Scales very well

GBS is a simple, highly multiplexed system for

constructing libraries for next-gen sequencing


GBS 96- or 384-plex Protocol (http://www.maizegenetics.net/gbs-overview)

1. Plate DNA & adapter pair

2. Digest DNA with RE

3. Ligate adapters


GBS Adapters and Enzymes

[Figure: barcode adapter and common adapter, each with “sticky ends”; barcode length 4-8 bp; labels: P1, P2, 5’, 3’, Illumina Sequencing Primer 1, Illumina Sequencing Primer 2.]

Restriction Enzymes (the space marks the cut position)

ApeKI    G CWGC
PstI     CTGCA G
EcoT22I  ATGCA T
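The recognition sequences on this slide (ApeKI G^CWGC, PstI CTGCA^G, EcoT22I ATGCA^T, where W is A or T) can be used for a simple in silico digest. The following is a minimal sketch, not part of the original slides: it scans only the given strand, and the function names and toy sequences are illustrative assumptions.

```python
import re

# IUPAC codes needed for the three enzymes on the slide (W = A or T).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]"}

# Recognition sequence and number of bases before the cut on the given strand.
ENZYMES = {
    "ApeKI": ("GCWGC", 1),    # G^CWGC
    "PstI": ("CTGCAG", 5),    # CTGCA^G
    "EcoT22I": ("ATGCAT", 5), # ATGCA^T
}

def cut_positions(seq, enzyme):
    """Return 0-based cut positions of `enzyme` in `seq`."""
    site, offset = ENZYMES[enzyme]
    pattern = "".join(IUPAC[base] for base in site)
    # A lookahead is used so that overlapping sites are all reported.
    return [m.start() + offset for m in re.finditer(f"(?={pattern})", seq.upper())]

def fragment_lengths(seq, enzyme):
    """Return the lengths of the fragments produced by a complete digest."""
    cuts = [0] + cut_positions(seq, enzyme) + [len(seq)]
    return [end - start for start, end in zip(cuts, cuts[1:])]

# Toy sequence (hypothetical) with two ApeKI sites: GCAGC and GCTGC.
toy = "AAAGCAGCTTTTTGCTGCAAA"
apeki_cuts = cut_positions(toy, "ApeKI")
apeki_fragments = fragment_lengths(toy, "ApeKI")
```

Running the same function over a reference genome would give a rough fragment-size distribution for a candidate enzyme, although it ignores methylation sensitivity.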


GBS 96- or 384-plex Protocol (http://www.maizegenetics.net/gbs-overview)

1. Plate DNA & adapter pair

2. Digest DNA with RE

3. Ligate adapters

4. Pool samples

Clean-up

5. PCR (Primers)


[Figure: Pooled Digestion/Ligation Reactions, then PCR, give the GBS “Library”. Labels: P1, P2, Insert; PCR primers: O1, O2.]


GBS 96- or 384-plex Protocol (http://www.maizegenetics.net/gbs-overview)

1. Plate DNA & adapter pair

2. Digest DNA with RE

3. Ligate adapters

4. Pool DNA aliquots

Clean-up

5. PCR (Primers)

Clean-up

6. Evaluate fragment sizes


Perform Titration to Minimize Adapter

Dimers Before Sequencing

NOTE: Done once with a small number of samples.

Adapter dimers constitute only 0.05% of raw sequence reads

[Figure: two fragment-size traces (fluorescence intensity vs. time) with size standards at 15 bp and 1500 bp; peaks labeled Library and Adapter Dimer; panels: Non-optimized library, Optimal adapter amount.]


Small Fragments are Enriched in GBS Libraries

[Figure: histogram of total fragments (y-axis, 0% to 40%) by ApeKI fragment size in bp (x-axis, 0 to 1000 in 50-bp bins, plus larger bins up to >10000). Series: REFGENOME (B73 RefGen v1) and IBM (B73 x Mo17) RILs.]


384-plex GBS Results for Maize

[Figure: reads per line; y-axis: Reads, 0 to 1,000,000.]

Mean read count per line = 528,000

c.v. = 0.22
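The coefficient of variation (c.v.) reported here is the standard deviation of the per-line read counts divided by their mean. A minimal sketch of the calculation follows; the counts are made-up illustrative values, and the use of the population standard deviation is an assumption, since the slide does not say which estimator was used.

```python
from statistics import mean, pstdev

def read_count_cv(counts):
    """Coefficient of variation: standard deviation divided by the mean."""
    return pstdev(counts) / mean(counts)

# Hypothetical per-line read counts, NOT the maize data from the slide.
example_counts = [400_000, 500_000, 600_000]
example_mean = mean(example_counts)
example_cv = read_count_cv(example_counts)
```

A low c.v. indicates that reads are spread evenly across the pooled lines, which is what careful DNA quantification before pooling aims for.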


Flowcell

8 channels

Solid Phase Oligos

Illumina Sequencing by Synthesis Review

Based on solid phase-PCR


Flow cell with bound oligos

Denatured “Library”

Cluster Formation Amplifies Sequencing Signal

Linearization

CBot

Bridge Amplification

PCR

Cleavage


Sequencing by Synthesis

[Figure: sequencing by synthesis from the P1 primer on the flowcell (HiSeq 2000); base letters from the diagram omitted.]


[Figure: aligned copies of a template sequence, “In Phase” versus “Out of Phase”.]


GBS captures barcode and insert DNA sequence in single read

[Figure: read layouts. Illumina: Sequencing Primer 1, Sequencing Primer 2, Barcode, Insert DNA, Read 1, Read 2. GBS: Sequencing Primer 1, Barcode, RE cut-site, Insert DNA, Read 1.]
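Because the barcode, the restriction cut-site remnant, and the insert all sit in a single read, a read can be assigned to a sample by matching its first bases. The sketch below illustrates the idea for ApeKI, whose overhang leaves CAGC or CTGC immediately after the barcode. The barcodes and sample names are hypothetical, and this is not the production pipeline's code.

```python
# Hypothetical variable-length barcodes (4-8 bp), not taken from a real GBS set.
BARCODES = {"CTCC": "sample_A", "TGCGA": "sample_B", "ACAAGT": "sample_C"}

# ApeKI (G^CWGC) leaves CWGC at the start of the insert: CAGC or CTGC.
CUT_REMNANTS = ("CAGC", "CTGC")

def demultiplex(read):
    """Return (sample, insert) if the read starts with barcode + remnant, else None."""
    # Longer barcodes are tried first so a shorter one cannot shadow a longer one.
    for barcode in sorted(BARCODES, key=len, reverse=True):
        if read.startswith(barcode):
            rest = read[len(barcode):]
            if rest.startswith(CUT_REMNANTS):
                return BARCODES[barcode], rest
    return None

hit_b = demultiplex("TGCGACAGCTTAGG")
hit_a = demultiplex("CTCCCTGCAAAT")
miss = demultiplex("GGGGCAGC")
```

Requiring the cut-site remnant directly after the barcode is a cheap sanity check that rejects reads with a damaged or chimeric start.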


Variable Length GBS Barcodes Solve Sequence Phasing Issues

•First 12 nt used to calculate phasing.

•Algorithm assumes random nt distribution.

•Incorrect phasing causes incorrect base calls.

•Good design and modulating the RE cut site position with variable length barcodes produce an even nt distribution.

[Figure: read layouts. Illumina: Barcode, Insert DNA, Read 1, Read 2. GBS: Barcode, RE cut-site, Insert DNA, Read 1.]
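The effect of barcode length on per-cycle nucleotide balance can be checked by tabulating base fractions at each of the first 12 positions. The sketch below uses toy reads built from hypothetical barcodes followed by an ApeKI remnant (CAGC); it is an illustration of the principle only.

```python
from collections import Counter

def base_fractions(reads, n_cycles=12):
    """Fraction of A, C, G, T at each of the first n_cycles read positions."""
    per_cycle = []
    for i in range(n_cycles):
        counts = Counter(read[i] for read in reads if len(read) > i)
        total = sum(counts.values())
        per_cycle.append({b: counts[b] / total for b in "ACGT"} if total else {})
    return per_cycle

def toy_reads(barcodes):
    # Barcode + cut-site remnant + a stand-in insert (all hypothetical).
    return [bc + "CAGC" + "ATATATAT" for bc in barcodes]

# Fixed-length barcodes: the remnant always starts at cycle 5 (index 4).
fixed = toy_reads(["ACGT", "CGTA", "GTAC", "TACG"])
# Variable-length barcodes: the remnant starts at a different cycle in each read.
variable = toy_reads(["ACGT", "CGTAA", "GTACGT", "TACGTTA"])

fixed_cycle5 = base_fractions(fixed)[4]
variable_cycle5 = base_fractions(variable)[4]
```

With fixed-length barcodes every read shows C at cycle 5, while the variable-length set spreads that cycle evenly over all four bases.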


Invariant GBS barcodes cause loss of signal intensity.


Successful GBS sequencing run.


GBS Adapter Design


Barcode Design Considerations

• Barcode sets are enzyme specific

– Must not recreate the enzyme recognition site

– Must have complementary overhangs

• Sets must be of variable length

• Bases must be well balanced at each position

• Must be different enough from each other to avoid confusion if there is a sequencing error.

– At least 3 bp differences among barcodes.

• Must not nest within other barcodes

• No mononucleotide runs of 3 or more bases

http://www.deenabio.com
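Several of the barcode rules above lend themselves to an automated check. The following sketch tests a candidate set for four of them: recreating the recognition site at the barcode/insert junction, mononucleotide runs of 3 or more, nesting, and fewer than 3 differences. It is a simplified illustration under stated assumptions: differences are counted over the shorter barcode's length, base balance is not checked, and the function names and example barcodes are hypothetical.

```python
from itertools import combinations

# ApeKI recognition site (GCWGC) expanded, and the remnant that follows the barcode.
APEKI_SITES = ("GCAGC", "GCTGC")
APEKI_REMNANTS = ("CAGC", "CTGC")

def prefix_mismatches(a, b):
    """Mismatches between two barcodes, compared over the shorter length."""
    return sum(x != y for x, y in zip(a, b))

def barcode_problems(barcodes, sites, remnants, min_diff=3, max_run=2):
    """Return a list of human-readable rule violations (empty if none)."""
    problems = []
    for bc in barcodes:
        if any(site in bc + rem for site in sites for rem in remnants):
            problems.append(f"{bc}: recreates the recognition site at the junction")
        if any(base * (max_run + 1) in bc for base in "ACGT"):
            problems.append(f"{bc}: mononucleotide run longer than {max_run}")
    for a, b in combinations(barcodes, 2):
        short, long_ = sorted((a, b), key=len)
        if len(short) < len(long_) and short in long_:
            problems.append(f"{short}: nested within {long_}")
        elif prefix_mismatches(a, b) < min_diff:
            problems.append(f"{a}/{b}: fewer than {min_diff} differences")
    if len({len(bc) for bc in barcodes}) < 2:
        problems.append("all barcodes have the same length")
    return problems

good = barcode_problems(["CTCC", "TGCGA", "ACAAGT"], APEKI_SITES, APEKI_REMNANTS)
bad = barcode_problems(["AAAG", "AAAGC"], APEKI_SITES, APEKI_REMNANTS)
```

In the failing example, AAAG ends in G, so ligation to a CAGC overhang would rebuild GCAGC; it also contains an AAA run and nests inside AAAGC.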


Most significant GBS technical issues?

• DNA quality

• DNA quantification


GBS does not use standard “Y” adapters

[Figure: Standard: ligation gives 100% of fragments with different ends. GBS: ligation gives 50% different ends and 50% same ends, followed by PCR.]
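The 50% / 50% split follows from simple counting if each fragment end independently receives either of the two adapters with equal probability. That equal-probability model is an assumption made here for illustration; the slide itself gives only the percentages.

```python
from fractions import Fraction
from itertools import product

ADAPTERS = ("barcode", "common")

def end_pair_fractions():
    """Enumerate the adapter pairs a two-ended fragment can receive."""
    pairs = list(product(ADAPTERS, repeat=2))  # 4 equally likely outcomes
    different = sum(left != right for left, right in pairs)
    return {
        "different": Fraction(different, len(pairs)),
        "same": Fraction(len(pairs) - different, len(pairs)),
    }

fractions = end_pair_fractions()
```

Two of the four outcomes (barcode/common and common/barcode) have different ends, which gives one half.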


Same-ended Fragments Do Not Form Clusters

[Figure: denatured “library” applied to a flow cell with bound oligos; bridge amplification; cluster formation & “linearization”. Labels on same-ended fragments: No P1 binding site; Cleaved from surface.]


GBS vs. RAD

• Focuses NextGen sequencing power to ends of restriction fragments

• Scores both SNPs and presence/absence markers

[Figure: restriction fragments (< 450 bp) compared between Sample1 and Sample2; labels: Restriction site, ( ) sequence tag, Loss of cut site.]


RAD: Digest, Ligate adapters, Pool, Random shear, Size select, Ligate Y adapters, PCR

GBS: Digest, Ligate adapters, Pool, PCR

Reference: Davey et al. 2011


Modifying GBS

Considerations for using GBS with new

species and / or different enzymes.


Why Modify the GBS Protocol?

• More markers

• Fewer markers

(deeper sequence coverage per locus)

• Increase multiplexing

• More genome appropriate

(avoid more repetitive DNA classes)

• Other novel applications

(e.g., bisulfite sequencing)


Genome Sampling Strategies Vary by Species

Dependent on Factors that Affect Diversity:

• Mating System influences heterozygosity (Outcrosser, inbreeder, clonal?)

• Ploidy (Haploid, diploid, auto- or allopolyploid?)

• Geographical Distribution (Island population, cosmopolitan?)


Other Factors

• Genome size

– The size of the genome has some bearing on the number

of fragments in the sequencing pool.

– The amount of repetitive DNA is directly correlated with genome size.

• Genome composition

– The base composition of the genome can affect the frequency and distribution of the cut sites.

– How repetitive DNA is organized in the genome affects

library profiles.


Sampling large genomes with methylation-

sensitive restriction enzymes.

[Figure labels: 5-base cutter, 6-base cutter, Methylated DNA, Unmethylated DNA, GBS Library, Sequenced Fragments.]


Scrub jay

Vole

Giant squid

Deer Mouse

Tunicate

Yeast

Solitary Bee

Grape Maize Cacao

Rice

Raspberry

Barley

Cassava

Goose

Sorghum

Cassava

Shrub willow

Optimizing GBS in New Species


Maize

Sorghum

Rice

Barley

Switchgrass

Brachypodium

Pearl Millet

Teosinte

Lily

Andropogon

Fonio

Finger Millet

Grape

Cassava

Cacao

Watermelon

Apple

Hop

Pine

Spruce

Conifers

Flowering Plants

Strawberry

Ragweed

Silene

Sunflower

Safflower

Soybean

Goldenberry

Jatropha

Pepper

Cucumber

Squash

Pea

Gourd

Arabidopsis

Willow

Tea

Potato

Cherry

Flax


Neurospora

Verticillium

Solitary Bee

Corn Ear Worm

Plant Bug

Mexican Tetra

Killifish

Scrub Jay

Goose

Chickadee

Deer Mouse

Vole

Killer Whale

Pig

Fox

Cow


Choosing Appropriate Restriction Enzymes:

Generalizations from the Bench


ApeKI

ApeKI works well for grasses.

Maize, sorghum, teosinte, rice, barley, millet, switchgrass, brachypodium.


PstI

PstI works well for most mammals.

Deer mouse, vole, cow, pig.


Most frequently asked question for new species:

How many SNPs will I get?


Answer: It depends……

• Genome size and expected heterozygosity affect the size of the fragment pool for the desired amount of sequence coverage

(enzyme choice and multiplex level).

• Amount of extant diversity and how well

your sample reflects that diversity.

• Reference genome sequence? 3-4X more SNPs are obtained by aligning to a reference sequence.


How many SNPs will I get?

Species                         Genome size (Mb)  Enzyme   Sample size  No. SNPs
Maize                           2,600             ApeKI    33,000       1,200K
Grape                           500               ApeKI    1,000        200K
Cow                             3,000             PstI     48           64K
Rice                            400               ApeKI    850          60K
Pine*                           16,000            ApeKI    12           63K
Vole*                           3,400             PstI     283          53K
Willow*                         460               ApeKI    459          23K
Fox*                            2,400             EcoT22I  48           16K
Verticillium (fungus isolates)  40                ApeKI    2            10K

*No reference genome. The UNEAK analysis pipeline was used for analysis. To avoid homology/paralogy issues, this pipeline calls SNPs very conservatively.


SNP calls in Sorghum bicolor: Lots of Missing Data

SNP alleles Taxa1 Taxa2 Taxa3 Taxa4 Taxa5 Taxa6 Taxa7 Taxa8 Taxa9 Taxa10 Taxa11 Taxa12 Taxa13 Taxa14 Taxa15 Taxa16 Taxa17 Taxa18 Taxa19

SNP 1 C/A A N A C A N N C C N N N N N N N N C N

SNP 2 A/C C N C A C N N A A N N N N N N N N A N

SNP 3 T/C C N C T C N N T T N N N N N N N N T N

SNP 4 C/G N N C N G N N N N N N N N N C N N N N

SNP 5 T/C T T N N T N T N N N N C N N N N N N N

SNP 6 C/T C C N N C N C N N N N T N N N N N N N

SNP 7 G/A G A G A G N G G N N G G G R N A N R G

SNP 8 G/A G N N A G A G N N G G G N N A N G G G

SNP 9 T/C T C T C T C T N N T T N N N C N N T T

SNP 10 T/C T C N N N N N T T N N N N N N N N N N

SNP 11 G/A G N N A G N N N N N N N G A A N G G N

SNP 12 G/A G A N A G A N N N G N G N N N N G G N

SNP 13 G/A G A N A N N N G N N G N N N N N G N G

SNP 14 T/G G T N T G N G N N G G N G T T T N G N

SNP 15 C/T T C N C T N T N N T T N T C C C N T N

SNP 16 G/A G A G N G A G G G N G N G A A N G G G

SNP 17 C/G N G C S C G C C C N C C N C G G N C N

SNP 18 G/A N A G A G A G G N N G G G N A A G G N

SNP 19 C/T C N N T N N N N C N C N N N N N N C C

SNP 20 T/G T N T N T G T N T T T T T N N G T N T

SNP 21 T/G T G T G T G N T T T T T T N G G T T T

SNP 22 G/A N A N A N N G G G N G N N N A N N N N

SNP 23 G/T G T G T G N N G G G N N N N N T G G G

SNP 24 C/T C T C T C N N C C C N N N N N T C C C

SNP 25 T/C T C T C T C N N T T N N T N N N T N N

SNP 26 C/A N N N N A N A N N N N N N N N N N N N

SNP 27 G/A N N N N A N A N N N N N N N N N N N N

SNP 28 C/T N N C N N N N C T C N N C T N N C N C

SNP 29 T/C T N T N N N N N N N N N N N N N N T T

SNP 30 G/T G N G N N N N G G G N N N N N N N N N

SNP 31 A/T A T N N A N N A A A N A A N T T A A A

SNP 32 A/T A T A T N N N A N A N A N N T T A N A

SNP 33 C/T N N C N C N C C C N N N C T N N C N C

SNP 34 C/T C T C T C N N C N C C N C T T N C C N

SNP 35 A/C A C A N A N A A A A A A A C C N A A A

SNP 36 T/C T C T C T C N T T T T N T N C N T T T
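The amount of missing data in a table like this can be summarized as a per-SNP call rate, the fraction of taxa with a call other than N. The sketch below parses three rows copied from the table (SNP 1, SNP 7, SNP 35) and applies a call-rate threshold. The 0.5 threshold and the function names are illustrative assumptions, not TASSEL settings. Codes such as R (A/G) and S (C/G) are IUPAC heterozygote calls and count as called.

```python
MISSING = "N"

# Three rows copied from the table above: name, alleles, then 19 taxa calls.
ROWS = """\
SNP 1 C/A A N A C A N N C C N N N N N N N N C N
SNP 7 G/A G A G A G N G G N N G G G R N A N R G
SNP 35 A/C A C A N A N A A A A A A A C C N A A A
"""

def parse_row(line):
    """Split a table row into (name, alleles, list of calls)."""
    fields = line.split()
    return " ".join(fields[:2]), fields[2], fields[3:]

def call_rate(calls):
    """Fraction of taxa with a non-missing genotype call."""
    return sum(call != MISSING for call in calls) / len(calls)

def filter_snps(text, min_call_rate):
    """Return names of SNPs whose call rate meets the threshold."""
    kept = []
    for line in text.strip().splitlines():
        name, _alleles, calls = parse_row(line)
        if call_rate(calls) >= min_call_rate:
            kept.append(name)
    return kept

rates = {parse_row(l)[0]: round(call_rate(parse_row(l)[2]), 3)
         for l in ROWS.strip().splitlines()}
kept = filter_snps(ROWS, min_call_rate=0.5)
```

SNP 1 is called in only 7 of 19 taxa and is dropped at this threshold, while SNP 7 (14 of 19) and SNP 35 (16 of 19) are kept.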


Filtering SNPs to remove most of the missing data.

Will be covered later in discussion of

TASSEL (http://www.maizegenetics.net/)


Missing Data Strategies

• Impute Missing SNPs.

- Many algorithms for doing this.

• Technical Options

– Reduce the multiplexing level

– Sequence the same library multiple times

• Molecular Options

– Choose less frequently cutting enzymes
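The technical and molecular options all raise the expected number of reads per site per sample. A rough back-of-the-envelope sketch follows; the formula assumes reads are spread evenly over samples and sites, and the site count is a hypothetical value. The lane yield is derived from the maize figures given earlier (528,000 reads per line at 384-plex).

```python
def mean_depth_per_site(reads_per_lane, plex, n_sites, n_runs=1):
    """Expected reads per site per sample under perfectly even sampling."""
    reads_per_sample = reads_per_lane * n_runs / plex
    return reads_per_sample / n_sites

LANE_READS = 528_000 * 384   # about 202.8 million reads per lane
N_SITES = 1_000_000          # hypothetical number of sequenced tag sites

depth_384 = mean_depth_per_site(LANE_READS, plex=384, n_sites=N_SITES)
depth_96 = mean_depth_per_site(LANE_READS, plex=96, n_sites=N_SITES)
depth_384_twice = mean_depth_per_site(LANE_READS, plex=384, n_sites=N_SITES, n_runs=2)
depth_rare_cutter = mean_depth_per_site(LANE_READS, plex=384, n_sites=N_SITES // 4)
```

Under this model, dropping from 384-plex to 96-plex, or choosing an enzyme with a quarter as many sites, each gives four times the depth, and sequencing the library twice doubles it.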


GBS workflow at IGD

http://www.igd.cornell.edu/index.cfm/page/projects/GBS.htm

[Figure: workflow diagram. DNA sample info entered via webform; HTS database; Project approved?; Samples shipped; GBS libraries made; DNA sequence data; Lab Analysis Pipelines; Reference genome: HapMap File (SNPs/Sample, Coordinates); Non-reference genome: HapMap File (SNPs/Sample).]


GBS Team

Bioinformatics: Jeff Glaubitz, Qi Sun, Katie Hyma, Fei Lu

Method Development: Rob Elshire, Ed Buckler, Sharon Mitchell

Laboratory/Production: Charlotte Acharya, Wenyan Zhu, Lisa Blanchard, Shane Cieri

Workshop Coordinator: Theresa Fulton