Welcome to UW-Madison, the WNPRC, and O’Connor Lab!
MHC Genotyping Workshop, November 7th – 11th, 2011
Madison, Wisconsin
Introductions
• Trainers (WNPRC Genetics Service)
– Roger Wiseman
– Julie Karl
– Simon Lank
– Gabe Starrett
– Francesca Norante
• Participants
– Wendy Garnica
– Mark Garthwaite
– Julie Holister-Smith
– Suzanne Queen
– Premeela Rajakumar
– Yuko Yuki
Schedule of Events
• Monday
– Welcome and Overview Presentation
– Begin bench work: cDNA synthesis & PCR (run #1)
• Tuesday
– PCR product purification, quantification & pooling (run #1)
– Begin emulsion PCR (run #1)
– Begin bench work (run #2)
• Wednesday
– Break & enrich DNA beads (run #1)
– Run Roche/454 GS Junior instrument (run #1)
– emPCR (run #2)
• Thursday
– View run #1 results
– Continue work on run #2
– Informatics presentation
– Data analysis
• Friday
– Run #2 results
– Continue data analysis & wrap-up
Overview of Presentation
• Our lab & research focus
• Evolution of DNA sequencing technology
• Discussion of Roche/454 technology & sample multiplexing
• MHC genotyping method overview
– NHP immunogenetics
– Genotyping strategy
– Workflow
• Genotyping results
Welcome to Madison!
WNPRC
The Wisconsin National Primate Research Center (WNPRC)
• Only federally funded National Primate Research Center in the Midwest
• Center holds ~1,100 rhesus macaques, 200 marmosets, and 100 cynomolgus macaques
• Research strengths:
– Immunogenetics & Virology
– Aging & Metabolism
– Reproductive & Regenerative Medicine
The O’Connor Laboratory
Genetics Services Members
The O’Connor Laboratory: Research
• NHP immunogenetics (MHC class I, class II, KIR)
– Cynomolgus Macaque (Mauritian, Indonesian, SE Asian)
– Rhesus Macaque (Indian & Chinese)
– Japanese Macaque, Vervet, Sooty Mangabey
• SIV pathogenesis (immunology) and viral evolution
• Human immunogenetics (HLA) and HIV variation
Sequencing Technology is Changing
• Micro sequencing reactions
– Pyrosequencing
– Single molecule sequencing
• Higher throughput
– Millions of sequences per day
• Lower cost
– $10,000 human genome (original HGP = $3 billion)
Sequencing Technology: Overview
• 1st Generation (previous): Sanger sequencing
– Applied Biosystems 3730xl: 1 x 10^3 reads / day, 500 to 1,000 bp read length
• 2nd Generation (current): 454, Illumina, SOLiD, Ion Torrent
– Roche / 454: 1 x 10^6 reads / day, 500 to 800 bp read length
– Illumina: 2 x 10^9 reads / week, 100 or 200 bp read length
• 3rd Generation (future): Pacific Biosciences, Nanopore sequencing, Complete Genomics
– Pacific Biosciences: 1 x 10^5 sequences / hour
– 1,000 to 10,000 bp reads (?)
– Single molecule sequencing
– Goal = $1,000 genome!
Sequencing Technology: Overview
• 1st Generation (previous): Sanger
– Slow, expensive, not clonal, easy to analyze
• 2nd Generation (current): 454, Illumina, SOLiD, Ion Torrent
– Faster, cheaper, clonal, hard to analyze
• 3rd Generation (future): Pacific Biosciences, Nanopore sequencing, Complete Genomics, Helicos
– Very fast, very cheap, impossible to analyze
Roche / 454 Sequencing
How does it work?
Output is a flowgram (instead of a Sanger chromatogram)
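To make the flowgram idea concrete, here is a minimal sketch of how flow signals could be turned into bases. It assumes the cyclic T, A, C, G flow order and that each signal is roughly proportional to homopolymer length; the function name and the example signal values are hypothetical, and real base calling applies signal correction and quality scoring that are not shown.

```python
# Assumption: nucleotides are flowed in a repeating T, A, C, G cycle.
FLOW_ORDER = "TACG"


def flowgram_to_sequence(signals, flow_order=FLOW_ORDER):
    """Convert per-flow light signals into a base string.

    Each signal is rounded to the nearest whole number, which is taken
    as the number of identical bases incorporated during that flow.
    """
    bases = []
    for i, signal in enumerate(signals):
        nucleotide = flow_order[i % len(flow_order)]
        run_length = int(signal + 0.5)  # round non-negative signal to nearest integer
        bases.append(nucleotide * run_length)
    return "".join(bases)


# Hypothetical signals for two flow cycles (T, A, C, G, T, A, C, G).
example_signals = [1.02, 0.05, 0.98, 1.01, 0.03, 2.10, 0.00, 0.96]
print(flowgram_to_sequence(example_signals))
```

A signal near 2 (the sixth flow above) is read as a two-base homopolymer, which is why long homopolymer runs are the hardest part of a flowgram to call.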
O’Connor Laboratory Sequencing
[Timeline figure: axis runs 2005 2006 2007 2008 2009 2010; platforms listed in order of adoption]
• Sanger sequencing
– NHP MHC class I genotyping with E. coli-based cloning and Sanger sequencing: throughput of ~8 animals per week
• Pilot with Roche sequencing center
– MHC class I genotyping pilot project: ~24 samples per week
• GS FLX at UIUC
– MHC class I genotyping at UIUC: ~48 samples per week
• Titanium pilot with Roche sequencing center
– MHC class I full-length sequencing project with Roche using Titanium chemistry
• GS Junior in lab
– MHC class I and viral sequencing projects run in-house (> 48 samples per week)
Roche/454 Sequencing Advantages
• Inherently clonal (no bacterial cloning needed)
• Far cheaper per base than Sanger (3 – 4 orders of magnitude)
• Reliable read number and data regularity
• Easy protocol: many people trained
GS Junior 5 Month Run Summary
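Project | Runs | Statistic | HQ reads | Median length (bp)
MHC Class I 568bp Amplicon | 9 | Average | 70,848 | 523
MHC Class I 568bp Amplicon | 9 | Highest | 101,711 | 526
MHC Class I 568bp Amplicon | 9 | Lowest | 33,552 | 521
SIV Whole Genome | 16 | Average | 101,846 | 360
SIV Whole Genome | 16 | Highest | 177,642 | 494
SIV Whole Genome | 16 | Lowest | 42,949 | 147
SIV Epitope Amplicons (Various Sizes) | 5 | Average | 80,244 | 369
SIV Epitope Amplicons (Various Sizes) | 5 | Highest | 107,605 | 388
SIV Epitope Amplicons (Various Sizes) | 5 | Lowest | 37,066 | 356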
Ease of Use
• Access to instrument since Jan 2010
• 34 different fully-trained operators to date
• 7 additional people have begun training, but have not yet completed a solo run
Ultra-Deep vs. Ultra-Wide Sequencing
• 2nd & 3rd Generation = thousands / millions of sequences per run
• Cost per run is high ($1000s)
• Can examine a polymorphic target at high depth (ultra-deep)
– expensive per sample
• Can sequence many samples at the same time (ultra-wide)
– cheap per sample
• Significantly improves sensitivity over traditional Sanger-based sequencing (500x vs 2x coverage)
• Ultra-deep applications
– Low frequency ARV resistance
– TCR sequencing
– Antibody sequencing
• Ultra-wide applications
– HLA typing
– Allele frequencies
– SNP detection
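The deep-versus-wide trade-off is simple arithmetic: a run yields a roughly fixed number of reads, so depth per sample falls as more samples are pooled. The sketch below uses the average of 70,848 HQ reads from the MHC class I rows of the GS Junior run summary; the function name and the choice of pool sizes are illustrative assumptions, and real pools are never perfectly even.

```python
def reads_per_sample(reads_per_run, n_samples):
    """Expected reads per sample if a run is shared evenly across a pool."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    return reads_per_run / n_samples


AVERAGE_HQ_READS = 70848  # average HQ reads per MHC class I run (run summary)

# Illustrative pool sizes: one sample (ultra-deep) up to 96 (ultra-wide).
for pool_size in (1, 24, 96):
    depth = reads_per_sample(AVERAGE_HQ_READS, pool_size)
    print(f"{pool_size:>3} samples per run: about {depth:,.0f} reads per sample")
```

With 96 samples the expected depth is about 738 reads per sample, which is still far above the roughly 2x coverage quoted for Sanger-based typing.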
Multiplexed (Ultra-wide) Amplicon Sequencing
Multiplex Identifier (MID) Tag
Methods to increase multiplexing
1. Physically subdividing plate (gasket)
2. Sample-specific MID sequence tags
3. Uniquely mixing 5’ & 3’ MID tags
Patient | MID
1 | ATCGTAGTCA
2 | TCCGATCGA
3 | GTGTAACGT
4 | CCATGGATC
5 | TGGATGCAG
6 | TAGTAGCCA
7 | GTAGTCTAA
8 | AACGATGCA
9 | GCGCTAGCA
Patient | 5' MID | 3' MID
1 | 1 | 1
2 | 1 | 2
3 | 1 | 3
4 | 2 | 1
5 | 2 | 2
6 | 2 | 3
7 | 3 | 1
8 | 3 | 2
9 | 3 | 3
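Methods 2 and 3 can be sketched in a few lines. The first function reproduces the 5'/3' pairing scheme from the table (three tags combine into nine patient identities); the second bins reads by a leading MID tag. Function names and the toy reads are hypothetical, matching is exact with no error tolerance, and this is not the software used in the workshop.

```python
from itertools import product


def assign_mid_pairs(n_tags):
    """Map patient numbers to (5' MID, 3' MID) pairs: n_tags tags give n_tags**2 patients."""
    pairs = product(range(1, n_tags + 1), repeat=2)
    return {patient: pair for patient, pair in enumerate(pairs, start=1)}


def demultiplex(reads, mid_to_sample):
    """Bin reads by an exact-match MID prefix and trim the tag from each read."""
    bins = {sample: [] for sample in mid_to_sample.values()}
    unassigned = []
    for read in reads:
        for mid, sample in mid_to_sample.items():
            if read.startswith(mid):
                bins[sample].append(read[len(mid):])
                break
        else:
            unassigned.append(read)  # no known tag at the start of the read
    return bins, unassigned


print(assign_mid_pairs(3))

# Two MID tags taken from the slide; the reads themselves are made up.
mids = {"TCCGATCGA": "Patient 2", "GTGTAACGT": "Patient 3"}
toy_reads = ["TCCGATCGAAAAC", "GTGTAACGTGGG", "NNNN"]
print(demultiplex(toy_reads, mids))
```

Pairing tags means the number of primers to order grows linearly while the number of distinguishable samples grows quadratically.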
O’Connor lab sequencing projects
• NHP comprehensive MHC genotyping & allele discovery (amplicons)
Importance of MHC Class I
MHC class I molecules dictate immunity to disease
High degree of polymorphism within the MHC class I peptide-binding domain
Specific MHC alleles associated with superior control of HIV infection
Source: modified from Yewdell et al., Nature Reviews Immunology 2003
Host Immune Genetics
NHP MHC Class I Allele Libraries
[Bar chart: Total # Alleles in GenBank (y-axis, 0 to 700) for Rhesus Macaque, Cynomolgus Macaque, Pig-tailed Macaque, Vervet, and Sooty Mangabey. Bar labels as extracted: 663, 0, 9, 156, 460.]
Human HLA class I = 5,400 alleles
Human HLA vs NHP MHC Class I
[Diagram: two haplotypes shown for each species]
• Human HLA class I: A, B, C
• Nonhuman primate MHC class I: A1 A2 A3 A4 B1 B2 B3 B4 ... BN
MHC Genotyping Design
• 568bp amplicon captures highly variable peptide binding region flanked by conserved sequences
• Amplifies in multiple primate species
• Longer reads provide better resolution of alleles
[Figure: % MHC Class I Variability (y-axis, 0 to 100) by Amino Acid Position (x-axis, 1 to 361) across the Leader Peptide, α1 Domain, α2 Domain, α3 Domain, Transmembrane, and Cytoplasmic regions; forward (F) and reverse (R) primers mark the 568bp Amplicon.]
MHC Genotyping Design
568bp Amplicon
Primer = Adapter (A or B) + MID + sequence-specific
[Figure: Within a single nonhuman primate sample]
[Figure: Within an MHC class I amplicon genotyping pool]
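The primer formula above is plain string concatenation, sketched below. The adapter and sequence-specific strings are placeholders invented for illustration, not the real Roche adapters or the workshop's MHC primers; only the MID is taken from the slide's tag table.

```python
def build_fusion_primer(adapter, mid, sequence_specific):
    """Primer = Adapter (A or B) + MID + sequence-specific, written 5' to 3'."""
    return adapter + mid + sequence_specific


# Placeholder sequences (assumptions for illustration only).
ADAPTER_A = "ACGTACGTAC"         # stands in for the forward (A) adapter
ADAPTER_B = "TGCATGCATG"         # stands in for the reverse (B) adapter
FORWARD_SPECIFIC = "GGGTTTCCC"   # stands in for the conserved-region forward primer
REVERSE_SPECIFIC = "CCCAAAGGG"   # stands in for the conserved-region reverse primer

MID_PATIENT_2 = "TCCGATCGA"      # MID for patient 2 from the slide's tag table

forward = build_fusion_primer(ADAPTER_A, MID_PATIENT_2, FORWARD_SPECIFIC)
reverse = build_fusion_primer(ADAPTER_B, MID_PATIENT_2, REVERSE_SPECIFIC)
print(forward)
print(reverse)
```

Because the MID sits between the adapter and the sequence-specific part, every read from a sample starts with that sample's tag, which is what makes pooled amplicons separable after sequencing.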
Roche/454 MHC Workflow
• Total RNA isolation and cDNA synthesis
– RNA isolation ~4 hrs; cDNA synthesis ~2 hrs
• Primary PCR amplification
– plus SPRI purification, quantification, pooling ~3 hrs
• emPCR
– set-up ~1 hr, run ~5.5 hrs
• Breaking and enrichment
– ~3 hrs
• GS Junior run
– set-up ~1.5 hrs; run time ~10 hrs
• Data processing and analysis
– run processing ~2 hrs
– analysis time varies
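Adding up the approximate times on this slide gives a feel for the minimum turnaround. The sketch below sums the listed durations and leaves out the open-ended analysis time; the step labels are paraphrased from the slide and the figures are the slide's own estimates, so the total is a rough guide only.

```python
# (step, approximate hours) as listed on the workflow slide.
WORKFLOW_STEPS = [
    ("RNA isolation", 4.0),
    ("cDNA synthesis", 2.0),
    ("Primary PCR, SPRI purification, quantification, pooling", 3.0),
    ("emPCR set-up", 1.0),
    ("emPCR run", 5.5),
    ("Breaking and enrichment", 3.0),
    ("GS Junior set-up", 1.5),
    ("GS Junior run", 10.0),
    ("Run processing", 2.0),
]


def total_hours(steps):
    """Sum the approximate duration of every step."""
    return sum(hours for _, hours in steps)


print(f"Approximate total before analysis: {total_hours(WORKFLOW_STEPS)} hours")
```

The total comes to about 32 hours, and the two instrument runs (emPCR and the GS Junior) account for almost half of it, which is consistent with the workshop spreading one run across several days.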
www.454.com
GS Junior Run Metrics – MHC
Reads per Sample
Sample | MID | Read Count | Sample | MID | Read Count
Monkey001 | 1 | 525 | Monkey049 | 49 | 585
Monkey002 | 2 | 392 | Monkey050 | 50 | 504
Monkey003 | 3 | 1,023 | Monkey051 | 51 | 673
Monkey004 | 4 | 504 | Monkey052 | 52 | 565
Monkey005 | 5 | 450 | Monkey053 | 53 | 893
Monkey006 | 6 | 722 | Monkey054 | 54 | 581
Monkey007 | 7 | 622 | Monkey055 | 55 | 623
Monkey008 | 8 | 489 | Monkey056 | 56 | 955
Monkey009 | 9 | 344 | Monkey057 | 57 | 698
Monkey010 | 10 | 635 | Monkey058 | 58 | 792
Monkey011 | 11 | 660 | Monkey059 | 59 | 655
Monkey012 | 12 | 796 | Monkey060 | 60 | 1,203
Monkey013 | 13 | 653 | Monkey061 | 61 | 428
Monkey014 | 14 | 731 | Monkey062 | 62 | 8
Monkey015 | 15 | 1,342 | Monkey063 | 63 | 391
Monkey016 | 16 | 628 | Monkey064 | 64 | 663
Monkey017 | 17 | 76 | Monkey065 | 65 | 411
Monkey018 | 18 | 481 | Monkey066 | 66 | 386
Monkey019 | 19 | 503 | Monkey067 | 67 | 625
Monkey020 | 20 | 633 | Monkey068 | 68 | 637
Monkey021 | 21 | 573 | Monkey069 | 69 | 367
Monkey022 | 22 | 463 | Monkey070 | 70 | 391
Monkey023 | 23 | 390 | Monkey071 | 71 | 585
Monkey024 | 24 | 723 | Monkey072 | 72 | 808
Monkey025 | 25 | 739 | Monkey073 | 73 | 594
Monkey026 | 26 | 560 | Monkey074 | 74 | 391
Monkey027 | 27 | 1,672 | Monkey075 | 75 | 578
Monkey028 | 28 | 559 | Monkey076 | 76 | 728
Monkey029 | 29 | 801 | Monkey077 | 77 | 612
Monkey030 | 30 | 590 | Monkey078 | 78 | 283
Monkey031 | 31 | 548 | Monkey079 | 79 | 475
Monkey032 | 32 | 748 | Monkey080 | 80 | 527
Monkey033 | 33 | 583 | Monkey081 | 81 | 27
Monkey034 | 34 | 374 | Monkey082 | 82 | 226
Monkey035 | 35 | 226 | Monkey083 | 83 | 113
Monkey036 | 36 | 791 | Monkey084 | 84 | 481
Monkey037 | 37 | 618 | Monkey085 | 85 | 52
Monkey038 | 38 | 558 | Monkey086 | 86 | 612
Monkey039 | 39 | 438 | Monkey087 | 87 | 733
Monkey040 | 40 | 666 | Monkey088 | 88 | 800
Monkey041 | 41 | 250 | Monkey089 | 89 | 647
Monkey042 | 42 | 451 | Monkey090 | 90 | 1,094
Monkey043 | 43 | 612 | Monkey091 | 91 | 522
Monkey044 | 44 | 673 | Monkey092 | 92 | 756
Monkey045 | 45 | 570 | Monkey093 | 93 | 624
Monkey046 | 46 | 207 | Monkey094 | 94 | 912
Monkey047 | 47 | 604 | Monkey095 | 95 | 610
Monkey048 | 48 | 180 | Monkey096 | 96 | 514
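Per-sample read counts vary widely within one pool, so a quick screen for under-sampled animals is a natural first check on a run. The sketch below uses a handful of counts from the table; the 100-read cutoff and the function name are assumptions chosen for illustration, not a threshold stated in the workshop.

```python
# Hypothetical cutoff: samples below this are treated as candidates for re-running.
LOW_READ_THRESHOLD = 100

# A subset of the per-sample read counts from the run metrics table.
read_counts = {
    "Monkey001": 525,
    "Monkey017": 76,
    "Monkey027": 1672,
    "Monkey062": 8,
    "Monkey081": 27,
    "Monkey085": 52,
    "Monkey096": 514,
}


def flag_low_coverage(counts, threshold=LOW_READ_THRESHOLD):
    """Return the sample names whose read count falls below the threshold, sorted."""
    return sorted(sample for sample, n in counts.items() if n < threshold)


print(flag_low_coverage(read_counts))
```

Samples flagged this way would simply be carried into the next pool, since the MID scheme makes it cheap to re-sequence a few animals alongside new ones.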
Allele Calls & Transcript Profiles
[Bar chart: % Total Reads (y-axis, 0 to 16) by MHC Class I Alleles for animals ChRh10, ChRh11, and ChRh12. Alleles on the x-axis: Mamu-A1*026:01, Mamu-A4*14g, Mamu-B*065:03, Mamu-B*090:01, Mamu-B*151:nov:01, Mamu-B*142:nov:01, Mamu-I*01g, Mamu-B*013:nov:01, Mamu-B*046g, Mamu-B*046:06, Mamu-E*03:01:01, Mamu-E*01:12.]
Lymphocyte Specific Expression
[Bar chart: % Total Reads (y-axis, 0 to 50) by MHC Class I Alleles for cell subsets CD16, CD20, CD4, CD8, and CD14. Alleles on the x-axis: Mafa-A1*063:01, Mafa-A2*05g, Mafa-A4*01:01, Mafa-B*104:01:01, Mafa-B*134:02:01, Mafa-B*144:02:01, Mafa-B*064:01:01, Mafa-B*057:01:01, Mafa-B*046:01:01, Mafa-B*131:02, Mafa-B*152:01N, Mafa-B*060:05:02, Mafa-E*01g, Mafa-E*01:nov:09.]
Same methods applicable to HLA typing
• We have developed a similar assay to genotype human samples: HLA Class I and DRB loci
• Cheaper, higher-resolution, and higher-throughput than existing methods
• Can genotype up to 96 individuals per GS-Jr run
High Resolution HLA Genotyping
[Figure: y-axis 0 to 0.7 plotted against nucleotide position (x-axis, 3 to 1095 bp) across the LP, α1 Domain, α2 Domain, α3 Domain, TM, and CT regions. Amplicon 1: 1kb-F / 581-R. Amplicon 2: 581-F / 1kb-R. Other extracted labels: bp, SBT.]
High-resolution Typing for 40 Reference Cell Lines
UW ID# | A* | B* | C*
HLA-Ref01 | A*31:01:02 | B*51:01:01 | C*15:02:01
HLA-Ref02 | A*32:01:01 | B*38:01:01 | C*12:03:01:01/02
HLA-Ref03 | A*02:16, A*03:01:01:01/03 | B*51:01:01 | C*07:04:01, C*15:02:01
HLA-Ref04 | A*24:02:01:01/02L, A*26:02 | B*40:06:01:01/02, B*51:01:01 | C*08:01:01, C*14:02:01
HLA-Ref05 | A*30:01:01 | B*13:02:01 | C*06:02:01:01/02
HLA-Ref06 | A*02:01:01:01/02L/03, A*02:07 | B*46:01:01 | C*01:02:01
HLA-Ref07 | A*33:03:01 | B*44:03:01 | C*14:03
HLA-Ref08 | A*30:01:01, A*68:02:01:01/02/03 | B*42:01:01 | C*1701
HLA-Ref09 | A*02:06:01, A*11:01:01 | B*15:01:01:01, B*35:01:01:01/02 | C*03:03:01, C*04:01:01:01/02/03
HLA-Ref10 | A*26:01:01 | B*08:01:01 | C*07:01:01
HLA-Ref11 | A*02:04 | B*51:01:01 | C*15:02:01
HLA-Ref12 | A*03:01:01:01/03 | B*47:01:01:01/02 | C*06:02:01:01/02
HLA-Ref13 | A*01:01:01:01 | B*57:01:01 | C*06:02
HLA-Ref14 | A*02:01:01:01/02L/03 | B*35:03:01 | C*12:03:01:01/02
HLA-Ref15 | A*02:01:01:01/02L/03 | B*35:01:01:01/02 | C*04:01:01:01/02/03
HLA-Ref16 | A*34:01:01 | B*15:21, B*15:35 | C*04:03, C*07:02:01:01/02/03
HLA-Ref17 | A*02:01:01:01/02L/03 | B*15:01:01:01 | C*03:04:01:01/02
HLA-Ref18 | A*01:01:01:01 | B*49:01:01 | C*07:01:01
HLA-Ref19 | A*25:01 | B*51:01:01 | C*01:02
HLA-Ref20 | A*30:02:01 | B*18:01:01:01 | C*05:01:01:01/02
HLA-Ref21 | A*01:01:01:01, A*02:05:01 | B*08:01:01, B*50:01:01 | C*06:02:01:01/02, C*07:01:01
HLA-Ref22 | A*01:01:01:01, A*03:01:01:01/03 | B*07:02:01, B*58:01:01 | C*07:01:01, C*07:02:01:01/02/03
HLA-Ref23 | A*01:01:01, A*02:01 | B*05:801, B*07:02 | C*07:01, C*07:02
HLA-Ref24 | A*01:01:01:01, A*24:02:01:01/02L | B*39:06:02, B*58:01:01 | C*07:01:01, C*07:02:01:01/02/03
HLA-Ref25 | A*01:01:01:01, A*01:37 | B*35:01:01:01/02, B*58:01:01 |
HLA-Ref26 | A*03:01:01:01/03 | B*07:02:01, B*35:01:01:01/02 | C*04:01:01:01/02/03, C*07:02:01:01/02/03
HLA-Ref27 | A*03:01:01:01/03 | B*07:02:01, B*35:01:01:01/02 | C*04:01:01:01/02/03, C*07:02:01:01/02/03
HLA-Ref28 | A*01:01:01:01, A*03:01:01:01/03 | B*35:01:01:01/02, B*58:01:01 | C*04:01:01:01/02/03, C*07:18 (701?)
HLA-Ref29 | A*03:01:01:01/03, A*24:02:01:01/02L | B*35:01:01:01/02, B*51:01:04 | C*04:01:01:01/02/03, C*07:04:01
HLA-Ref30 | A*02:01:01:01/02L/03, A*03:01:01:01/03 | B*07:02:01, B*37:01:01 | C*06:02:01:01/02, C*07:02:01:01/02/03
HLA-Ref31 | A*01:01:01:01, A*24:02:01:01/02L | B*39:06:02, B*58:01:01 | C*07:01:01, C*07:02:01:01/02/03
HLA-Ref32 | A*24:02:01:01/02L | B*07:02:01, B*51:01:01 | C*07:117
HLA-Ref33 | A*03:01:01:01/03 | B*07:02:01, B*35:01:01:01/02 | C*04:01:01:01/02/03, C*07:02:01:01/02/03
HLA-Ref34 | A*03:01:01:01/03, A*24:02:01:01/02L | B*35:01:01:01/02, B*39:06:02 | C*04:01:01:01/02/03, C*07:02:01:01/02/03
HLA-Ref35 | A*02:01:01:01/02L/03, A*24:02:01:01/02L | B*07:02:01, B*13:02:01 | C*06:02:01:01/02, C*07:02:01:01/02/03
HLA-Ref36 | A*24:02:01:01/02L, A*31:01:02 | B*07:02:01, B*40:01:02 | C*03:04:01:01/02, C*07:02:01:01/02/03
HLA-Ref37 | A*02:01:01:01/02L/03, A*24:02:01:01/02L | B*15:01:01:01, B*39:06:02 | C*03:03:01, C*07:02:01:01/02/03
HLA-Ref38 | A*3402, A*7401 | B*801, B*1503 | C*02:10, C*701
HLA-Ref39 | A*2308N, A*301 | B*440301, B*5129 | C*02:02:02, C*04
HLA-Ref40 | A*02:01:01:01/02L/03, A*29:02:01 | B*35:01:01:01/02, B*44:03:01 | C*04:01:01:01/02/03, C*16:01:01
Example High-Resolution HLA Genotypes with DRB
Sample | Allele | Reads | 1kbF | 581F | 581R | 1kbR | DRB-F | DRB-R
HIV_114 | A*36:01 | 122 | 35 | 41 | 23 | 23 | |
HIV_114 | A*68:01:01 | 150 | 50 | 45 | 50 | 5 | |
HIV_114 | B*41:02:01 | 74 | 16 | 24 | 25 | 9 | |
HIV_114 | B*53:01:01 | 223 | 36 | 87 | 61 | 39 | |
HIV_114 | C*04:01:01 | 99 | 14 | 52 | 13 | 20 | |
HIV_114 | C*17:01:01 (primer) | 45 | 2 | 32 | 2 | 9 | |
HIV_114 | DRB1*01:02:01 | 163 | | | | | 83 | 80
HIV_114 | DRB1*16:02:01 | 127 | | | | | 65 | 62
HIV_114 | DRB5*02-novel? | 60 | | | | | 60 | .
HIV_115 | A*03:01:01 | 60 | 24 | 16 | 7 | 13 | |
HIV_115 | A*11:01:01 | 70 | 32 | 16 | 9 | 13 | |
HIV_115 | B*07:02:01 | 120 | 28 | 48 | 12 | 32 | |
HIV_115 | B*51:01:01 | 177 | 53 | 53 | 35 | 36 | |
HIV_115 | C*07:02:01 | 62 | 30 | 15 | 16 | 1 | |
HIV_115 | C*15:02:01 | 109 | 60 | 20 | 19 | 10 | |
HIV_115 | DRB1*04:04:01 | 165 | | | | | 86 | 79
HIV_115 | DRB1*07:01:01 | 228 | | | | | 114 | 114
HIV_115 | DRB4*01:01:01:01 | 93 | | | | | 75 | 18
HIV_115 | DRB4*01:03:01:01 | 99 | | | | | 75 | 24
HIV_116 | A*01:01:01 | 122 | 37 | 31 | 49 | 5 | |
HIV_116 | A*02:01:01 | 97 | 40 | 17 | 31 | 9 | |
HIV_116 | B*08:01:01 | 213 | 57 | 71 | 63 | 22 | |
HIV_116 | B*15:01:01 | 129 | 21 | 58 | 32 | 18 | |
HIV_116 | C*03:04:01 | 103 | 27 | 43 | 21 | 12 | |
HIV_116 | C*07:01:01 | 114 | 46 | 22 | 41 | 5 | |
HIV_116 | DRB1*03:01:01 | 471 | | | | | 244 | 227
HIV_116 | DRB1*04:01:01 | 429 | | | | | 221 | 208
HIV_116 | DRB3*01:01:02 | 137 | | | | | 74 | 63
HIV_116 | DRB4*01:03:01:01 | 176 | | | | | 101 | 75
HIV_117 | A*26:01:01 | 167 | 24 | 74 | 40 | 29 | |
HIV_117 | A*29:02:01 | 96 | 24 | 31 | 24 | 17 | |
HIV_117 | B*44:03:01 (putative) | 286 | 112 | 53 | 59 | 62 | |
HIV_117 | B*44:10 (putative) | 210 | 113 | 51 | 46 | . | |
HIV_117 | C*04:01:01 | 245 | 38 | 130 | 26 | 51 | |
HIV_117 | | | | | | | |
HIV_117 | DRB1*03:01:01 | 173 | | | | | 94 | 79
HIV_117 | DRB1*07:01:01 | 171 | | | | | 81 | 90
HIV_117 | DRB3*02:02:01 | 50 | | | | | 25 | 25
HIV_117 | DRB4*01:03:01:01 | 44 | | | | | 29 | 15
HIV_118 | A*02:01:01 | 117 | 33 | 46 | 24 | 14 | |
HIV_118 | A*23:01:01 | 156 | 42 | 61 | 39 | 14 | |
HIV_118 | B*40:01:02 | 113 | 13 | 50 | 35 | 15 | |
HIV_118 | B*44:03:01 | 206 | 51 | 81 | 63 | 11 | |
HIV_118 | C*03:04:01 | 84 | 7 | 47 | 15 | 15 | |
HIV_118 | C*14:03 | 142 | 28 | 61 | 31 | 22 | |
HIV_118 | DRB1*04:01:01 | 151 | | | | | 80 | 71
HIV_118 | DRB1*10:01:01 | 195 | | | | | 96 | 99
HIV_118 | DRB4*01:03:01:01 | 57 | | | | | 33 | 24
HIV_119 | A*29:01:01:01 | 36 | 13 | 7 | 10 | 6 | |
HIV_119 | A*68:01:02 | 73 | 36 | 12 | 20 | 5 | |
HIV_119 | B*07:05:01 | 48 | 12 | 11 | 7 | 18 | |
HIV_119 | B*44:02:01:01 | 86 | 41 | 15 | 26 | 4 | |
HIV_119 | C*05:01:01 | 47 | 25 | 5 | 10 | 7 | |
HIV_119 | C*15:05:01/02 | 63 | 26 | 15 | 11 | 11 | |
HIV_119 | DRB1*04:04:01 | 233 | | | | | 89 | 144
HIV_119 | DRB1*07:01:01 | 250 | | | | | 105 | 145
HIV_119 | DRB4*01:03:01:01 | 77 | | | | | 33 | 44
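In this table the Reads column appears to be the sum of the per-primer columns (for example, 35 + 41 + 23 + 23 = 122 for HIV_114 A*36:01). A small check like the one below can confirm that for any row; the row selection and function name are illustrative, and the "sum of primer counts" reading is an inference from the numbers rather than something the slide states.

```python
# (sample, allele, total reads, per-primer read counts) copied from a few table rows.
rows = [
    ("HIV_114", "A*36:01", 122, [35, 41, 23, 23]),
    ("HIV_114", "C*17:01:01 (primer)", 45, [2, 32, 2, 9]),
    ("HIV_115", "DRB4*01:01:01:01", 93, [75, 18]),
    ("HIV_116", "B*08:01:01", 213, [57, 71, 63, 22]),
    ("HIV_117", "B*44:10 (putative)", 210, [113, 51, 46]),  # 1kbR shown as "." in the table
]


def inconsistent_rows(table_rows):
    """Return (sample, allele) for rows whose primer counts do not add up to the total."""
    return [
        (sample, allele)
        for sample, allele, total, per_primer in table_rows
        if sum(per_primer) != total
    ]


print(inconsistent_rows(rows))
```

Rows where one primer contributes almost nothing, such as the allele marked "(primer)", stand out in the same breakdown and may point to a primer mismatch rather than a missing allele.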