Post on 29-Jan-2016

Genotyping and Genetic Maps

Bas Heijmans, Leiden University Medical Centre, The Netherlands

Pedigree file in linkage format

Columns: family id, person id, father, mother, sex, disease status, marker data (1 marker, two alleles)

1 1 0 0 1 1 1 2
1 2 0 0 2 1 3 4
1 3 1 2 2 1 2 3
1 4 1 2 1 1 2 4
2 1 0 0 1 1 0 0
2 2 0 0 2 1 0 0
2 3 1 2 1 1 5 8
2 4 1 2 2 1 6 7
2 5 1 2 1 1 5 7
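To make the column layout concrete, here is a minimal Python sketch that reads the nine rows of the example pedigree. It assumes whitespace-separated columns and the usual linkage coding (sex 1 = male, 2 = female; 0 = unknown parent or missing allele). The assignment of the sex and disease-status columns is a reading of the slide, and the names `Person`, `parse_pedigree` and `founders` are illustrative, not part of any linkage package.

```python
from typing import NamedTuple


class Person(NamedTuple):
    family: str
    person: str
    father: str    # "0" means no parent in the pedigree (founder)
    mother: str
    sex: int       # linkage convention: 1 = male, 2 = female
    status: int    # disease status code
    alleles: tuple # two allele codes for the single marker; 0 = missing


# The example pedigree, one row per individual (values read off the slide).
PEDIGREE_TEXT = """\
1 1 0 0 1 1 1 2
1 2 0 0 2 1 3 4
1 3 1 2 2 1 2 3
1 4 1 2 1 1 2 4
2 1 0 0 1 1 0 0
2 2 0 0 2 1 0 0
2 3 1 2 1 1 5 8
2 4 1 2 2 1 6 7
2 5 1 2 1 1 5 7
"""


def parse_pedigree(text):
    """Parse whitespace-separated pedigree rows with one marker (8 columns)."""
    people = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        family, person, father, mother = fields[:4]
        sex, status, a1, a2 = (int(x) for x in fields[4:8])
        people.append(Person(family, person, father, mother, sex, status, (a1, a2)))
    return people


def founders(people):
    """Individuals whose parents are both coded 0."""
    return [p for p in people if p.father == "0" and p.mother == "0"]


people = parse_pedigree(PEDIGREE_TEXT)
for p in people:
    print(p)
```

In this example the two founders of family 2 have no genotype (0 0), which is the situation where IBD estimation has to work from the children alone.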

Marker choice for genome-wide linkage scans

Short tandem repeats (STR, a.k.a. microsatellites) because:

• High heterozygosity (1 STR ~ 5 SNPs)

• There are more than enough (1/30kb thus >>1/cM)

• Reliable genetic maps (Marshfield, Decode)

• Optimized marker sets, spacing down to 5cM (Marshfield/Applied Biosystems)

• Reasonably automated measurement (2 people: 40,000 checked genotypes in the database per week)

• Low cost per genotype (<$0.15 for consumables)

• Reasonable success and error rates (>92% and <0.8%)

Short tandem repeats

Tetranucleotide repeat (top strand / complementary strand):

Paternal allele: AACTAACTAACTAACT / TTGATTGATTGATTGA (4 repeats)

Maternal allele: AACTAACT / TTGATTGA (2 repeats)

Dinucleotide repeat (top strand / complementary strand):

Paternal allele: CACACACACACACACA / GTGTGTGTGTGTGTGT (8 repeats)

Maternal allele: CACACA / GTGTGT (3 repeats)

And there are also tri- and pentanucleotide repeats.
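As a small illustration of what "number of repeats" means, the sketch below counts the longest run of back-to-back copies of a motif in a sequence. The function name `count_tandem_repeats` is illustrative; real genotyping does not read the sequence but infers the repeat number from fragment length, as the following slides describe.

```python
def count_tandem_repeats(sequence: str, motif: str) -> int:
    """Return the longest run of consecutive copies of `motif` in `sequence`."""
    if not motif:
        raise ValueError("motif must not be empty")
    best = 0
    for start in range(len(sequence)):
        run = 0
        pos = start
        # Extend the run while the motif keeps repeating from this position.
        while sequence.startswith(motif, pos):
            run += 1
            pos += len(motif)
        best = max(best, run)
    return best


# Alleles from the slide (top strand only).
print(count_tandem_repeats("AACTAACTAACTAACT", "AACT"))  # paternal tetranucleotide allele
print(count_tandem_repeats("AACTAACT", "AACT"))          # maternal tetranucleotide allele
print(count_tandem_repeats("CACACACACACACACA", "CA"))    # paternal dinucleotide allele
print(count_tandem_repeats("CACACA", "CA"))              # maternal dinucleotide allele
```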

Principle of genotyping methods

• Short tandem repeats: length differences (e.g. CACACACACACACACA vs. CACACA)

• SNPs: only a sequence difference (e.g. a G/C base pair vs. an A/T base pair)

  • Destruction of a restriction site (RFLP)

  • Hybridization differences (TaqMan)

  • One-base-pair sequencing reaction, i.e. primer extension (Sequenom, Orchid)

  • Ligation assay (Illumina)

• VNTR, insertion/deletion polymorphisms (1 bp to ~300 bp for Alu repeat)

Genotyping STRs – step 1: PCR

Allele CACA / GTGT: 20 + 35 + 25 + 4 + 20 = 104 bp

Allele CACACACA / GTGTGTGT: 20 + 35 + 25 + 8 + 20 = 108 bp

Reaction mix: genomic DNA + primers + Taq DNA polymerase + dNTPs (ACGT) + buffer
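The two sums on this slide (104 bp and 108 bp) differ only in the length of the repeat tract. A minimal sketch of that relationship follows. It assumes that the 20, 35, 25 and 20 bp terms are the fixed, non-repeat parts of the PCR product (the slide does not label them), and the function names are illustrative.

```python
# Fixed part of the PCR product: the four non-repeat terms from the slide.
# Assumption: these do not vary between alleles of the same marker.
FIXED_BP = 20 + 35 + 25 + 20  # = 100 bp


def product_length(n_repeats: int, motif_length: int = 2) -> int:
    """Expected PCR product length for an allele with `n_repeats` repeat units."""
    return FIXED_BP + n_repeats * motif_length


def repeats_from_length(length_bp: int, motif_length: int = 2) -> int:
    """Invert product_length: infer the repeat number from a measured length."""
    repeat_bp = length_bp - FIXED_BP
    if repeat_bp < 0 or repeat_bp % motif_length != 0:
        raise ValueError(f"{length_bp} bp does not fit a whole number of repeats")
    return repeat_bp // motif_length


print(product_length(2))         # CACA allele
print(product_length(4))         # CACACACA allele
print(repeats_from_length(108))  # back from length to repeat number
```

This is why step 2 only needs to measure fragment length: once the fixed part is known, length and repeat number carry the same information.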

Genotyping STRs – step 1: PCR in practice

Genotyping STRs – step 2: electrophoresis

Detect length differences

Agarose or polyacrylamide slab gel

• DNA is negatively charged

• Longer fragments migrate slower than shorter ones through the polymer network

[Figure: gel between the − electrode and the + electrode]

To scan the whole human genome…

• 1 short tandem repeat every 10 cM

• makes 400 markers per individual

• Assuming 1000 individuals (preferably 1000s)

• One whole genome scan = 400,000 genotypings
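The arithmetic behind these bullets can be written out as a short sketch. The figures are the ones on the slide; the genome length in cM is simply what 400 markers at 10 cM spacing implies, not an independent estimate, and the cohort sizes beyond 1000 are illustrative.

```python
MARKER_SPACING_CM = 10        # one STR every 10 cM (from the slide)
MARKERS_PER_INDIVIDUAL = 400  # from the slide

# Genetic map length implied by the two numbers above.
implied_map_length_cm = MARKERS_PER_INDIVIDUAL * MARKER_SPACING_CM


def genotypings(n_individuals: int, n_markers: int = MARKERS_PER_INDIVIDUAL) -> int:
    """Total number of genotypes for one whole-genome scan."""
    return n_individuals * n_markers


print(implied_map_length_cm, "cM covered")
for cohort in (1000, 2000, 5000):  # "preferably 1000s"
    print(cohort, "individuals:", genotypings(cohort), "genotypings")
```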

Not like this… but like this:

96-well plates

384-well plates

Not like this… but like this:

• 96 capillaries (no lanes) (ABI3700)

• Load the machine and everything runs automatically

• Primers are labelled with fluorescent dye

• Machine detects PCR products with a laser

Electrophoresis using automated sequencer

[Figure: labelled PCR fragments migrating through one capillary from the − electrode to the + electrode, past the laser and detector. Typically 15 markers in one capillary. Panels: start; a bit later; 2.5 h.]

Through-put

A 384-well plate takes about one night

• 384 samples minus 16 controls = 368

• 15 markers per sample

• makes 5520 genotypes (if success rate is 100%)
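A rough sketch of what this throughput means for a full scan. It combines the plate figures here with the 400,000 genotypings per genome scan and the >92% success rate quoted earlier in the talk. Treating failed genotypes as simply lost, with no repeats, is an assumption made only to keep the example short.

```python
import math

WELLS = 384
CONTROLS = 16
MARKERS_PER_SAMPLE = 15
SCAN_GENOTYPINGS = 400_000  # one whole-genome scan (400 markers x 1000 individuals)


def genotypes_per_plate(success_rate: float = 1.0) -> float:
    """Usable genotypes from one 384-well plate run."""
    samples = WELLS - CONTROLS
    return samples * MARKERS_PER_SAMPLE * success_rate


def plates_needed(total: int, success_rate: float = 1.0) -> int:
    """Number of plate runs (about one night each) to reach `total` genotypes."""
    return math.ceil(total / genotypes_per_plate(success_rate))


print(genotypes_per_plate())                  # ideal case from the slide
print(plates_needed(SCAN_GENOTYPINGS))        # plate runs at 100% success
print(plates_needed(SCAN_GENOTYPINGS, 0.92))  # plate runs at 92% success
```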

Tetranucleotide repeat marker (e.g. multiples of AACT)

• Detected length of PCR product depends on machine

• Standards are used to correct this (CEPH DNA samples)

• Take this into account when analysing data from different machines/labs
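One simple way to picture such a correction is a constant offset estimated from a control sample run on both machines. The slide does not say how the correction is actually done, so the constant-offset model is an assumption, and every number below is made up for illustration (none is real CEPH data).

```python
from statistics import mean


def estimate_offset(observed_control, reference_control):
    """Mean difference between observed and reference lengths of a control sample."""
    return mean(o - r for o, r in zip(observed_control, reference_control))


def correct(lengths, offset):
    """Subtract the machine offset and round to whole base pairs."""
    return [round(x - offset) for x in lengths]


# Hypothetical control alleles: agreed reference lengths vs. what this machine reports.
reference_control = [104, 108]
observed_control = [105.1, 109.0]

offset = estimate_offset(observed_control, reference_control)

# Hypothetical study sample measured on the same machine.
print(correct([105.0, 113.1], offset))
```

Without a step like this, the same allele could be called with different lengths in two labs and would then be treated as two different alleles when data sets are merged.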

Dinucleotide repeat marker (e.g. multiples of CA)

• Dinucleotide repeats give less clean pictures, but in practice this is no problem as long as the pattern is always the same

• However, markers not in standard 10 cM screening sets are often more problematic (different stutter patterns for different samples, non-constant ratio of ‘real peak’ to plus-A peak): increased error rates?

The result: allele lengths

Allele CACA / GTGT: 20 + 35 + 25 + 4 + 20 = 104 bp

Allele CACACACA / GTGTGTGT: 20 + 35 + 25 + 8 + 20 = 108 bp

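Pedigree file in linkage format

Columns: family id, person id, father, mother, sex, disease status, raw marker data (two allele lengths in bp), renumbered data (two allele codes)

1 1 0 0 1 1   102 104   1 2
1 2 0 0 2 1   106 110   3 4
1 3 1 2 2 1   104 106   2 3
1 4 1 2 1 1   104 110   2 4
2 1 0 0 1 1     0   0   0 0
2 2 0 0 2 1     0   0   0 0
2 3 1 2 1 1   111 118   5 8
2 4 1 2 2 1   112 114   6 7
2 5 1 2 1 1   111 114   5 7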
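The step from raw allele lengths to renumbered data can be sketched as follows: sort the distinct lengths seen for the marker and number them from 1, keeping 0 for missing genotypes. The raw values are those on the slide; the function name `renumber` is illustrative, and numbering in order of length is an assumption that happens to reproduce the renumbered column shown here.

```python
# Raw genotypes (allele lengths in bp) for the nine individuals, in pedigree order.
RAW_GENOTYPES = [
    (102, 104), (106, 110), (104, 106), (104, 110),      # family 1
    (0, 0), (0, 0), (111, 118), (112, 114), (111, 114),  # family 2
]


def renumber(genotypes):
    """Map allele lengths to consecutive integer codes; 0 (missing) stays 0."""
    lengths = sorted({a for g in genotypes for a in g if a != 0})
    code = {length: i for i, length in enumerate(lengths, start=1)}
    code[0] = 0
    return [tuple(code[a] for a in g) for g in genotypes], code


renumbered, code = renumber(RAW_GENOTYPES)
for raw, new in zip(RAW_GENOTYPES, renumbered):
    print(raw, "->", new)
```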

Genetic map of measured markers

For IBD estimation using Merlin or other software:

• Pedigree file

• Genetic map

Markers measured on chromosome 19

16 markers: d19s247, d19s1034, d19s391, d19s865, d19s394, d19s588, d19s49, d19s433, d19s47, d19s420, d19s178, apoc2, d19s246, d19s180, d19s210, d19s254

Genetic maps

Available from

• Marshfield Center for Medical Genetics: http://research.marshfieldclinic.org/genetics/

• Decode Genetics (most accurate): supplemental data to Kong et al. Nat Genet 2002;31:241-7; see F:\Bas\Genotyping&Maps\DecodeMap.xls

Merlin Map File

CHROMOSOME MARKER   LOCATION
19         d19s247    9.84
19         d19s1034  20.75
19         d19s391   28.83
19         d19s865   32.39
19         d19s394   34.25
19         d19s588   42.28
19         d19s49    50.81
19         d19s433   51.88
19         d19s47    63.10
19         d19s420   66.30
19         d19s178   68.08
19         apoc2     69.50
19         d19s246   78.08
19         d19s180   87.66
19         d19s210  100.01
19         d19s254  100.61
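To check a map file like this before handing it to linkage software, a short sketch can parse the three columns and report the spacing between adjacent markers. The marker names and positions are those of the chromosome 19 map on the slide; the helper names are illustrative and the code does not call Merlin itself.

```python
MAP_TEXT = """\
CHROMOSOME MARKER LOCATION
19 d19s247 9.84
19 d19s1034 20.75
19 d19s391 28.83
19 d19s865 32.39
19 d19s394 34.25
19 d19s588 42.28
19 d19s49 50.81
19 d19s433 51.88
19 d19s47 63.10
19 d19s420 66.30
19 d19s178 68.08
19 apoc2 69.50
19 d19s246 78.08
19 d19s180 87.66
19 d19s210 100.01
19 d19s254 100.61
"""


def parse_map(text):
    """Return (chromosome, marker, position_cM) tuples, skipping the header line."""
    rows = []
    for line in text.splitlines()[1:]:
        if not line.strip():
            continue
        chrom, marker, location = line.split()
        rows.append((chrom, marker, float(location)))
    return rows


def spacings(rows):
    """Distance in cM between each pair of adjacent markers."""
    return [(a[1], b[1], round(b[2] - a[2], 2)) for a, b in zip(rows, rows[1:])]


rows = parse_map(MAP_TEXT)
gaps = spacings(rows)
average_gap = (rows[-1][2] - rows[0][2]) / (len(rows) - 1)
largest_gap = max(gaps, key=lambda g: g[2])

print(len(rows), "markers, average spacing %.2f cM" % average_gap)
print("largest gap:", largest_gap)
```

A check like this also catches markers listed out of map order, which would show up as a negative spacing.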
