HLA-B*5701 genotype is a major determinant of drug-induced liver injury due to
flucloxacillin
Ann K Daly1, Peter T Donaldson1, Pallav Bhatnagar1, Yufeng Shen2, Itsik Pe'er2, Aris
Floratos2, Mark J Daly3, David B Goldstein4, Sally John5, Matthew R Nelson6, Julia
Graham1, B Kevin Park7, John F Dillon8, William Bernal9, Heather J Cordell1, Munir
Pirmohamed7, Guruprasad P Aithal10,11, Christopher P Day1,11 for the DILIGEN study12 and
International SAE Consortium12
Nature Genetics: doi:10.1038/ng.379
Supplementary note
DILIGEN study team
Investigators: AK Daly (PI), CP Day, PT Donaldson, C Donaldson (Newcastle University); GP
Aithal (Nottingham Digestive Diseases Centre); M Pirmohamed, BK Park, S Khoo, I Gilmore
(University of Liverpool); W Bernal (King's College Hospital, London). Research nurses: J
Henderson (Newcastle University); C Davies (Nottingham Digestive Diseases Centre); K Hawkins,
A Hanson, J Evely (University of Liverpool); J Calera (King's College Hospital). Other contributors
to flucloxacillin case recruitment: N Thompson (Freeman Hospital, Newcastle), R Williams and M
Morgan (Royal Free and University College Medical School), H Hussaini (Truro), EM Phillips
(Hexham), P Mills (Glasgow), M Groom and M Miller (Ninewells Hospital, Dundee), M Patel
(Merthyr Tydfil), H Mitchison (Sunderland), W Griffiths (Addenbrooke's Hospital, Cambridge), JG
Kingham (Singleton Hospital, Swansea), D Das (Stepping Hill Hospital, Stockport), J Collier (John
Radcliffe Infirmary, Oxford), A Brind (North Staffordshire), N Fisher (Dudley), J Shearman (South
Warwick), E Elias (Birmingham), A Grant (Leicester Royal Infirmary), S Hellier (Worcester), A
Austin (Derby).
International serious adverse events consortium (SAEC) management team
Arthur L Holden [SAEC], Brian Spear [Abbott], Joe Walker [Daiichi-Sankyo], Dan Burns
[GlaxoSmithKline], Lon Cardon [GlaxoSmithKline], Eric Lai [GlaxoSmithKline], Matt Nelson
[GlaxoSmithKline], Allen Roses [GlaxoSmithKline], Nadine Cohen [Johnson & Johnson],
Quingqin Serena Li [Johnson & Johnson], Joanne Meyer [Novartis], Steve Lewitzky [Novartis],
Sally John [Pfizer], Duncan McHale [Pfizer], Klaus Lindpaintner [Roche], Steven Kovacs [Sanofi-
Aventis], Leonardo Sahelijo [Takeda], Michael Dunn [Wellcome Trust], Maha C Karnoub [Wyeth],
and Michael E Burczynski [Wyeth].
Supplementary Results
Eigenstrat analysis
To account for the possibility of associations resulting from population stratification, we applied
EIGENSTRAT to the data set. The chi-square statistics from the trend test conditioned on the first 10
eigenvectors correlated closely with those from the original trend test (correlation coefficient = 0.91;
Supplementary Fig. 2a), and the signals from chromosome 6 and the top SNPs were still genome-wide
significant (Supplementary Fig. 2b).
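As an illustration of the unadjusted trend test mentioned above, the following minimal sketch computes a 1-df Cochran-Armitage trend statistic from genotype counts. It assumes that "trend test" means the Cochran-Armitage test with additive scores 0, 1, 2 (the note does not name the test), and it does not implement the EIGENSTRAT adjustment. The function name is hypothetical; the example counts are the rs2395029 genotype counts from Supplementary Table 3.

```python
import math


def cochran_armitage_trend(cases, controls, scores=(0, 1, 2)):
    """Return (chi-square, p-value) for a 1-df Cochran-Armitage trend test.

    cases, controls: genotype counts in the same order as `scores`.
    The statistic is N * r^2, where r is the Pearson correlation between
    the genotype score and case/control status.
    """
    n = sum(cases) + sum(controls)
    n_cases = sum(cases)
    sum_x = sum(s * (a + b) for s, a, b in zip(scores, cases, controls))
    sum_x2 = sum(s * s * (a + b) for s, a, b in zip(scores, cases, controls))
    sum_xy = sum(s * a for s, a in zip(scores, cases))

    cov = sum_xy / n - (sum_x / n) * (n_cases / n)
    var_x = sum_x2 / n - (sum_x / n) ** 2
    var_y = (n_cases / n) * (1 - n_cases / n)
    chi2 = n * cov ** 2 / (var_x * var_y)

    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# rs2395029 genotype counts (G/G, G/T, T/T) from Supplementary Table 3.
chi2, p = cochran_armitage_trend(cases=(8, 39, 4), controls=(252, 30, 0))
print(f"chi-square = {chi2:.1f}, p = {p:.2e}")
```

Under these assumptions the p-value comes out at the same order of magnitude as the one reported for rs2395029 in Supplementary Table 3.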
Copy Number Variation (CNV) analysis
We exported the Log R Ratio (LRR) and B Allele Frequency (BAF) values of all the Illumina 1M
probes for all samples from Illumina BeadStudio with default parameters. The LRR value is the
normalized signal intensity for each probe. We discarded samples with large LRR standard
deviations or low average LRR values to prevent spurious CNV calls (Supplementary Fig. 4).
Eleven cases and 10 population controls were removed. We used PennCNV (ref. 1) to call CNVs for
each individual sample based on LRR and BAF values (Supplementary Table 10). We then applied
PLINK v1.04 to test CNVs for association in the remaining 40 cases and 272 controls. Only
one CNV region, on chromosome 11, was significantly associated after permutation: 5 cases
carried 3-copy duplications and 2 cases carried 1-copy deletions, while none of the controls carried
either (p-value = 0.02 after genome-wide permutation). The size of the CNVs varied from 103 kb to
214 kb, with a common region of 50 kb (48,756,916 bp to 48,807,363 bp). The CNV region is about
200 kb away from the genes OR4A47 and FOLH1. It is flanked by segmental duplications, and CNVs
have been reported at this location in previous studies (Supplementary Fig. 5; refs. 2,3).
For the samples removed for failing quality control, calls for CNVs that are larger
than 100 kb and contain at least 10 probes are still likely to be correct. Among the 11 cases and
10 controls removed, there were two cases carrying chromosome 11 duplications similar to those
described above, one case carrying the deletion and one control carrying the duplication. When
combined with the data from samples that had passed quality control, nominal p-values of 1.4 x 10^-6
(permutation p-value 0.00014) for duplications only and 3 x 10^-8 (permutation p-value 2.3 x 10^-5)
for duplications and deletions were obtained.
References
1. Wang, K. et al. PennCNV: an integrated hidden Markov model designed for high-resolution copy number variation detection in whole-genome SNP genotyping data. Genome Res 17, 1665-74 (2007).
2. Iafrate, A.J. et al. Detection of large-scale variation in the human genome. Nat Genet 36, 949-51 (2004).
3. Redon, R. et al. Global variation in copy number in the human genome. Nature 444, 444-54 (2006).
Supplementary Table 1. Clinical and biochemical variables of DILI patients exposed to flucloxacillin

Variable                                                Value
Sex (F/M)                                               36/15
Age at onset (yrs)                                      63.1 ± 12.2
Time to onset (days)                                    25.4 ± 22.4
Total days on drug                                      11.3 ± 7.8
Pattern of liver injury
  Cholestatic                                           44 (86.3)
  Hepatocellular                                        4 (7.8)
  Mixed                                                 3 (5.9)
ICC scoring
  3-5 (possible)                                        4 (7.8)
  6-8 (probable)                                        18 (35.3)
  >8 (highly probable)                                  29 (56.9)
Peak Bilirubin (µmol/l)                                 271 ± 240
Peak ALT (U/l)                                          398 ± 264
Peak ALP (U/l)                                          644 ± 874
ALT/ALP decreased by ≥ 50% above ULN after drug discontinuation
  Yes                                                   46 (90.2)
  No                                                    3 (5.9)
  Not known                                             2 (3.9)
Time taken for ALT/ALP to decrease to ≥ 50%
after discontinuation                                   74 days ± 62 days

Percentages are shown in parentheses.
Supplementary Table 2. Number of markers removed in GWAS quality control steps

Criteria                           Threshold                  Number of markers removed
Genotyping missingness             > 0.05                     27,941
Hardy-Weinberg equilibrium test    p <= 10^-7 in controls     1,034
Minor allele frequency             < 0.01                     200,835 *
Total number of markers removed                               204,231 **

* This includes 136,223 CNV-specific markers, which are all non-polymorphic.
** The total is not the sum of the numbers from the three categories because of overlap.
Supplementary Table 3. rs2395029 genotypes in cases and POPRES controls

                    G/G          G/T          T/T        P-value         OR (95% CI)
Controls (n=282)    252 (89.4)   30 (10.6)    0
Cases (n=51)        8 (15.7)     39 (76.5)    4 (7.8)    8.7 x 10^-33    45 (19.4-105)

The odds ratio is for carriage of the T variant and assumes a dominant model. Numbers in
parentheses are percentages. The p-value is not corrected for multiple testing.
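The odds ratio for carriage of the T variant can be recomputed from the genotype counts in Supplementary Table 3. The sketch below collapses G/T and T/T into "carriers" (the dominant model stated in the table note) and attaches a 95% confidence interval using Woolf's logit method. The choice of interval method is an assumption, since the source does not say how the interval was derived, and the function name is hypothetical.

```python
import math


def odds_ratio_woolf(case_pos, case_neg, ctrl_pos, ctrl_neg, z=1.96):
    """Odds ratio for a 2x2 table with a Woolf (logit) confidence interval.

    Assumes all four cells are non-zero; z=1.96 gives a 95% interval.
    """
    odds_ratio = (case_pos * ctrl_neg) / (case_neg * ctrl_pos)
    se_log_or = math.sqrt(1 / case_pos + 1 / case_neg + 1 / ctrl_pos + 1 / ctrl_neg)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Dominant model on the Supplementary Table 3 counts:
# cases:    carriers = 39 (G/T) + 4 (T/T) = 43, non-carriers = 8 (G/G)
# controls: carriers = 30 (G/T) + 0 (T/T) = 30, non-carriers = 252 (G/G)
odds_ratio, lower, upper = odds_ratio_woolf(43, 8, 30, 252)
print(f"OR = {odds_ratio:.1f} (95% CI {lower:.1f}-{upper:.0f})")
```

With these counts the result agrees closely with the odds ratio and interval reported in the table.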
Supplementary Table 4. Distribution of HLA-DRB1 alleles in flucloxacillin DILI cases and drug-exposed controls

HLA-DRB1   DILI cases (n=51)          DILI controls (n=64)
allele     Positive     Negative      Positive     Negative     p-value         pc             OR (95% CI)
*01        7 (13.7)     44 (86.3)     13 (20.3)    51 (79.7)    0.459
*03        16 (31.4)    35 (68.6)     24 (37.5)    40 (62.5)    0.557
*04        10 (19.6)    41 (80.4)     22 (34.3)    42 (65.7)    0.096
*07        36 (70.6)    15 (29.4)     16 (25.0)    48 (75.0)    1.61 x 10^-6    1.9 x 10^-5    7.2 (3.15 - 16.45)
*08        5 (9.8)      46 (90.2)     1 (1.6)      63 (98.4)    0.086
*09        0 (0.0)      51 (100.0)    2 (3.1)      62 (96.9)    0.502
*10        0 (0.0)      51 (100.0)    1 (1.6)      63 (98.4)    1
*11        4 (7.8)      47 (92.2)     4 (6.3)      60 (93.7)    1
*13        9 (17.6)     42 (82.4)     14 (21.9)    50 (78.1)    0.643
*14        1 (1.9)      50 (98.1)     4 (6.3)      60 (93.7)    0.38
*15        5 (9.8)      46 (90.2)     18 (28.1)    46 (71.9)    0.018           0.21           0.27 (0.09 - 0.81)
*16        1 (1.9)      50 (98.1)     1 (1.6)      63 (98.4)    1

pc: corrected p-value (correction factor = 12). Numbers in parentheses represent percentages of
individuals positive or negative for the individual alleles. All *07 alleles were found to be *0701 by
high resolution typing.
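The corrected p-values (pc) in Supplementary Table 4 are consistent with multiplying each uncorrected p-value by the stated correction factor of 12. The sketch below shows that calculation; capping the result at 1 is an added convention, not something stated in the table note, and the function name is hypothetical.

```python
def corrected_p(p_value, correction_factor=12):
    """Multiply an uncorrected p-value by the correction factor, capped at 1."""
    return min(1.0, p_value * correction_factor)


# Uncorrected p-values for DRB1*07 and DRB1*15 from Supplementary Table 4.
for allele, p_value in (("*07", 1.61e-6), ("*15", 0.018)):
    print(allele, f"{corrected_p(p_value):.2g}")
```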
Supplementary Table 5. Extended HLA class II haplotypes in DRB1*0701-positive cases and controls

DRB1*0701-positive haplotypes    Cases (n=41)    Controls (n=19)    p-value
DRB1*0701-DQB1*02                8 (7.8)         13 (10.2)          0.647
DRB1*0701-DQB1*0303              33 (32.3)       6 (4.7)            1.91 x 10^-8

OR for DRB1*0701-DQB1*0303 possession = 9.72 (95% CI 3.88 - 24.36). Numbers in parentheses
represent percentages of all haplotypes among cases or controls. P-values are uncorrected.
Supplementary Table 6. TNF and HSPA1L genotypes

TNFα rs1799964      CC           CT           TT           p-value          Odds ratio (95% CI)
Cases (n=48)        11 (22.9)    31 (64.6)    6 (12.5)     5.64 x 10^-8     12.41 (4.55-33.8)
Controls (n=61)     1 (1.7)      21 (34.4)    39 (63.9)

TNFα rs1800629      AA           AG           GG           p-value          Odds ratio (95% CI)
Cases (n=51)        0 (0.0)      15 (29.4)    36 (70.6)    0.079            0.46 (0.21-0.99)
Controls (n=64)     3 (4.7)      27 (42.2)    34 (53.1)

TNFα rs1800630      AA           AC           CC           p-value          Odds ratio (95% CI)
Cases (n=50)        0 (0.0)      10 (20.0)    40 (80.0)    0.544            1.42 (0.58-3.49)
Controls (n=61)     1 (1.6)      15 (24.6)    45 (73.8)

TNFα rs361525       AA           AG           GG           p-value          Odds ratio (95% CI)
Cases (n=51)        5 (9.8)      34 (66.6)    12 (23.5)    4.33 x 10^-13    38.35 (12.5 - 117.4)
Controls (n=64)     0 (0.0)      5 (7.8)      59 (92.2)

HSPA1L rs2227956    TT           TC           CC           p-value          Odds ratio (95% CI)
Cases (n=51)        14 (27.4)    29 (56.9)    8 (15.7)     3.33 x 10^-6     6.25 (2.8 - 14.2)
Controls (n=64)     45 (70.3)    19 (29.7)    0 (0.0)

Odds ratios are for carriage of the minor allele. P-values are uncorrected. Numbers in parentheses
are percentages of individuals with the particular genotype.
Supplementary Table 7. Conditioning analysis on selected MHC gene data

                       Test marker
Conditioning marker    B*5701          TNF-α rs361525    HSPA1L rs2227956    DRB1*0701
HLA-B*5701             2.7 x 10^-16    0.242             0.931               0.256
TNF-α rs361525         0.00058         5.45 x 10^-14     0.755               0.19
HSPA1L rs2227956       6.9 x 10^-11    1.42 x 10^-8      7.44 x 10^-7        7.4 x 10^-4
DRB1*0701              4.1 x 10^-11    2.3 x 10^-8       3.51 x 10^-5        2.7 x 10^-4

All p-values were calculated using a likelihood ratio chi-square test. P-values for individual markers
are shown on the diagonal. The p-values on the off-diagonals correspond to testing
the effect of adding a particular marker (the test marker) to a model that already includes the
conditioning marker. The top row, conditioned on HLA-B*5701, shows no significant effect of
adding further markers, but in the other rows, conditioned on other markers, the model is
improved by adding B*5701 and, in some cases, other markers as well.
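The off-diagonal entries of Supplementary Table 7 come from comparing two nested models, one with the conditioning marker only and one with the conditioning marker plus the test marker. The sketch below shows only the final step of such a likelihood ratio test. It assumes each added marker contributes one degree of freedom (the note does not state the coding), and the log-likelihood values in the example are hypothetical placeholders, not values from the study.

```python
import math


def lrt_pvalue_1df(loglik_reduced, loglik_full):
    """Likelihood ratio test for nested models differing by one parameter.

    Returns (statistic, p-value), where the statistic is
    2 * (loglik_full - loglik_reduced) and the p-value is the upper tail of
    a chi-square distribution with 1 degree of freedom.
    """
    statistic = max(0.0, 2.0 * (loglik_full - loglik_reduced))
    # For 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical log-likelihoods: conditioning marker only vs. conditioning + test marker.
statistic, p_value = lrt_pvalue_1df(loglik_reduced=-100.0, loglik_full=-98.08)
print(f"LRT statistic = {statistic:.2f}, p = {p_value:.3f}")
```

In practice the two log-likelihoods would come from fitted logistic regression models; fitting them is outside the scope of this sketch.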
Supplementary Table 8. Effect of B*5701 genotype status on disease severity

                                                 B*5701 Positive    B*5701 Negative
                                                 (n=43)             (n=8)              p-value
Sex (F/M)                                        31/12              5/3                0.059
Age at onset (years)                             63.2 ± 12.5        62.6 ± 11.3        1.00
Time to onset from initial drug intake (days)    23.4 ± 11.3        35.7 ± 51.2        0.56
Total days on drug                               11.1 ± 7.4         12.6 ± 10.5        0.902
Pattern of liver injury                                                                1.00
  Cholestatic                                    37 (86.0)          7 (87.5)
  Hepatocellular                                 4 (9.3)            -
  Mixed                                          2 (4.7)            1 (12.5)
ICC scoring                                                                            0.267
  3-5 (possible)                                 4 (9.3)            -
  6-8 (probable)                                 16 (37.2)          2 (25.0)
  >8 (highly probable)                           23 (53.5)          6 (75.0)
Peak Bilirubin (µmol/l)                          291 ± 250.5        165.4 ± 140.8      0.195
Peak ALT (U/l)                                   390 ± 244.4        440.6 ± 371.2      0.866
Peak ALP (U/l)                                   689 ± 943.7        405.4 ± 181.4      0.265
Supplementary Table 9. Potential additional signals from chromosomes other than 6

The odds ratio (OR) was estimated from allele frequencies. The chromosomal positions are based
on NCBI human genome build 36.

Chromosome  Position   SNP name    Test   Risk allele freq.  Risk allele freq.  p-value   OR      Genes nearby
                                          in cases           in controls
3           99498527   rs1603605   TREND  0.3942             0.1738             3.12E-06  3.095   OR5H2
3           99508366   rs1472413   TREND  0.3942             0.1738             3.12E-06  3.095   OR5H2
3           99516090   rs7634235   TREND  0.3942             0.1738             3.12E-06  3.095   OR5H2
3           99517216   rs1497546   TREND  0.1346             0.0231             1.94E-07  6.569   OR5H2
3           99522310   rs11928290  TREND  0.3942             0.1738             3.12E-06  3.095   OR5H2
3           99540350   rs12630857  TREND  0.3942             0.1755             3.20E-06  3.057   OR5K4
9           26596060   rs10812425  TREND  0.4423             0.2252             2.20E-06  2.729   C9orf82
9           26604847   rs10812428  TREND  0.5673             0.3149             1.28E-06  2.852   C9orf82
12          36965842   rs1973293   TREND  0.6562             0.4029             3.31E-06  2.829   ALG10B
12          36978814   rs1825806   TREND  0.625              0.3865             4.71E-06  2.645   ALG10B
12          36987497   rs6582576   TREND  0.625              0.3865             4.71E-06  2.645   ALG10B
12          36994816   rs1843876   TREND  0.625              0.3865             4.71E-06  2.645   ALG10B
12          36996726   rs4882284   TREND  0.625              0.3865             4.71E-06  2.645   ALG10B
12          37003945   rs6582607   TREND  0.6154             0.3741             3.48E-06  2.677   ALG10B
12          37016722   rs10880934  TREND  0.6154             0.3706             2.74E-06  2.718   ALG10B
12          37029775   rs6582630   TREND  0.6346             0.3812             1.49E-06  2.819   ALG10B
12          37078615   rs7968322   TREND  0.6154             0.3723             4.08E-06  2.697   ALG10B
12          37207850   rs7980932   TREND  0.6154             0.3723             4.66E-06  2.697   ALG10B
15          92740512   rs4984390   TREND  0.1731             0.406              4.20E-06  0.3062  MCTP2
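Supplementary Table 9 states that its odds ratios were estimated from allele frequencies. A minimal sketch of that calculation, assuming the usual allelic odds ratio (odds of the listed allele in cases divided by its odds in controls), is given below. The function name is hypothetical, and small differences from the tabulated values are expected because the frequencies in the table are rounded.

```python
def allelic_odds_ratio(freq_cases, freq_controls):
    """Allelic odds ratio from allele frequencies in cases and controls."""
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls


# Frequencies taken from Supplementary Table 9.
print(round(allelic_odds_ratio(0.3942, 0.1738), 2))  # rs1603605
print(round(allelic_odds_ratio(0.1731, 0.406), 2))   # rs4984390
```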
Supplementary Table 10. The statistics of CNV calls from 40 cases and 272 controls

A.
Copy number    Cases (average number of CNVs)    Controls (average number of CNVs)
0              134 (3.4)                         914 (3.4)
1              1767 (44)                         12193 (45)
3              1081 (27)                         5191 (19)
4              10 (0.25)                         69 (0.25)

B.
                                              Cases    Controls
Total number of CNVs                          2992     18637
Average # of CNVs per sample                  75       68
Total size of CNVs (kb)                       2988     2459
Average total size of CNVs per sample (kb)    41       38
Supplementary Table 11. rs10937275 genotyping summary

                                               rs10937275-positive    rs10937275-negative
Cases positive for B*5701/rs2395029 (n=48)     25 (52.0)              23 (48.0)
Cases negative for B*5701/rs2395029 (n=10)     1 (10.0)               9 (90.0)
POPRES controls (n=282)                        50 (17.7)              232 (82.2)

Odds ratio for disease development in positive cases 5.04 (95% CI 2.65-9.60); p-value = 1.17 x 10^-6.
Numbers in parentheses represent percentages.
Supplementary Figure 1. Principal component analysis of population structure. Panel (a)
shows an overview of the entire POPRES control cohort and flucloxacillin cases. Panel (b) is an
enlarged view of the Northern European cluster.
Supplementary Figure 2. EIGENSTRAT analysis. The first 10 eigenvectors were included in
the analysis. Chi-square statistics were calculated from the trend test conditioned on the 10
components. Panel (a) shows the EIGENSTRAT chi-square plotted against the chi-square for trend.
Panel (b) shows the EIGENSTRAT chi-square plotted against chi-square quantiles.
Supplementary Figure 3. The p-values of SNPs from the MHC region. The x-axis is the position
on chromosome 6 (NCBI build 36). The y-axis is -log10 of the p-values from logistic regression.
The approximate positions of three genes (HLA-B, HCP5, and HLA-DQA1) are marked. (a) p-values
of case-control association by logistic regression. (b) p-values of case-control association
conditioned on rs2395029, which was modeled as an additional factor in the logistic regression.
The figure indicates that no other SNP in the MHC region shows a genome-wide significant
association independent of rs2395029.
Supplementary Figure 4. CNV quality control: removing poor-quality samples. The figure
shows LRR standard deviations (LRR SD) versus LRR mean values for all samples, each
represented by a dot. The gray lines mark the ad hoc cutoff values of LRR mean (-0.018) and LRR
SD (0.19). Samples in the top-left quarter were removed. It is noticeable that cases are more likely
than controls to be removed.
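The sample-level filter described in this legend can be sketched as follows, using the cutoffs quoted above (LRR mean -0.018, LRR SD 0.19). The sketch assumes the removed quadrant corresponds to samples with both a low LRR mean and a high LRR SD; because the Supplementary Results describe the criterion as large SD or low mean, the `require_both` switch is provided to cover either reading. The function name and the example values are hypothetical.

```python
def flag_for_removal(lrr_mean, lrr_sd, mean_cutoff=-0.018, sd_cutoff=0.19,
                     require_both=True):
    """Return True if a sample should be dropped before CNV calling.

    require_both=True  -> drop only samples with low mean AND high SD
                          (one reading of the removed quadrant in the figure).
    require_both=False -> drop samples with low mean OR high SD
                          (one reading of the Supplementary Results text).
    """
    low_mean = lrr_mean < mean_cutoff
    high_sd = lrr_sd > sd_cutoff
    return (low_mean and high_sd) if require_both else (low_mean or high_sd)


# Hypothetical per-sample summaries: (LRR mean, LRR SD).
samples = {"sample_A": (0.002, 0.12), "sample_B": (-0.050, 0.30), "sample_C": (-0.030, 0.15)}
kept = [name for name, (m, sd) in samples.items() if not flag_for_removal(m, sd)]
print(kept)
```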
Supplementary Figure 5. Genomic features around the CNV in chromosome 11. The red and
blue “PLINK CNV” tracks represent the deletions and duplications respectively of the cases.