Connecting biobanks - adding value in the genetics of complex traits
The Australian Twin Collections
Biobank
Nick Martin
Queensland Institute of Medical Research
Brisbane
MRC CAiTE Symposium Bristol
January 12, 2011
My brief…
• how biobanks can be beneficial for researchers
• what’s happening and what is accomplished
• some results of projects I’m involved in
How beneficial biobanks can be for research[ers] (1)
1 page of authors and affiliations!
2 pages of authors and affiliations!
How beneficial biobanks can be for research[ers] (2)
• Founded 1978
• Voluntary enrolment – schools, media, etc
• ~30,000 pairs enrolled (~15% of all pairs)
• Two adult cohorts studied
• 1893-1964 (5967 pairs), 1965-1971 (4629 pairs)
• Typical of population with respect to psychiatric symptoms, personality, social class & education (females)
• Males slightly more educated and middle class
• New cohort of ~8000 pairs (born 1972-85)
Australian Twin Registry
[Figure: timeline chart, 1980 to 2000, with rows for Cohort 1, Cohort 2, Siblings and Parents. Wave labels as shown on the slide: N12 5375; N12 6014; N23, A, D 3808 p / 576 s; N12, A, D 3051 p / 468 s; DSM-IIIR MD, PD 2456 p / 771 s; N12, A, D 1279 p / 558 s; N12, A, D 2270 p / 518 s; 765; N23, CIDI 1172; N23, CIDI 404; N23, CIDI 894]
Timetable of Questionnaires and Interviews
Quantitative phenotypes related to disease risk:
• Metabolic / cardiovascular risks
  Biochemical test results
  Lipids
  Glucose, insulin
  Urate, CRP, ferritin
  Liver enzymes GGT, ALT, AST, BCHE
• Personality, depression, anxiety, cognition, MRI, taste, smell
• Addictions (alcohol, nicotine, cannabis, opioids, gambling)
• Melanoma; endometriosis; asthma; migraine; twinning
QIMR GenEpi core interests
• Biochemical phenotypes: N ≈ 19,000 adults, N ≈ 2,500 adolescents
• GWAS: N ≈ 20,000
Data (Twins and families)
ENGAGE participation
• Meta-analysis of lipids, urate, alcohol, liver function tests, glucose
• Meta-analysis of iron markers, transferrin isoforms
Queensland Twin Registry
Adolescent twins + sibs
12yrs 14yrs 16yrs
Sun exposure -
Sun protective behaviour -
Mole counts and locations -
Melanoma family history -
Mosquito bite susceptibility -
Mouth ulcers -
Sociodemographic Variables
Eye, hair and skin colour
Personality (JEPQ, NEO)
Acne
Height, weight
Blood pressure
Fingerprints, handprints
Phenotypes measured on teenage twins (“-” = no information)
12yrs 14yrs 16yrs
Photoaging (skin mould)
Visual acuity
AutoRefractometry (myopia)
ENT (grommets, T&A)
Asthma, eczema
Laterality (hand, eye, foot)
Hand preference (peg board)
Binocular rivalry (bipolar) -
Computer Use - -
Reading Ability (CCRT) - -
Cognitive Ability (IQ – MAB) - -
Information Processing (IT) - -
Working Memory (DRT) - -
ERPs (DRT) - -
EEG (power, coherence) - -
Academic achievement (QCST) - -
Taste (PTC, bitter, sweet)
Smell (BSIT, NatGeo) - -
Psychiatric signs (SPHERE)
Relationships - -
Leisure activity - -
12yrs 14yrs 16yrs
Haemoglobin
Red blood cell count
Packed cell volume
Mean corpuscular volume
Platelet count
White blood cell count
Neutrophils
Monocytes
Eosinophils
Basophils
Total lymphocytes
CD3+ T-cells
CD4+ helper T-cells
CD8+ cytotoxic T-cells
CD19+ B cells
CD56+ natural killer cells
CD4+/CD8+ T-cell ratio
Blood groups (ABO, MNS, Rh) - -
Blood phenotypes
12yrs 14yrs 16yrs
Cholesterol, HDL, LDL
Triglyceride
Apolipoproteins A1, A2, B, E
Lp(a)
Glucose, Insulin
Ca, PO4
Creatinine
Urea, Uric acid
Alkaline phosphatase
Albumin, Bilirubin
AST, ALT, GGT
Fe, Ferritin, Transferrin
Heavy metals (Pb, As etc)
Serum biochemistry
Population 21 million
Area 7.7 million km²
Preparing Labmailers
Biobottle Box Incoming Blood Samples
Receipting the blood sample
Preparing FTA cards
External blood collection: Labmailer Process
Samples are collected in the following tubes:
2 x EDTA, 1 x SERUM, 1 x ACD, 1 x PAX, 1 x BUCCAL
4 x Red Blood Cells
4 x Plasma
2 x Buffy Coats
4 x Serum
Stored in Freezer for later RNA work
The 2 x EDTA & 1 x SERUM tubes are centrifuged at 3000 rpm for 10 min and then the fractions are collected. All fractions & 1 x Buffy Coat are stored in the −80 °C freezers
MNC Processing Buccal Extraction
1 x Buffy Coat Extraction
Standard blood collection and processing
(10ml EDTA blood collection)
Average DNA Yield per buffy coat
Mean = 171.291, Std. Dev = 68.5431, N = 3,554
Genetic Epidemiology
Frozen sample inventory
Fraction           Number of Samples
Plasma             128,012
Buffy Coats        101,333
Red Blood Cells    130,668
Serum              97,677
Buccals            5,591
FO Plasma          7,815
FO BC              7,387
FO RBC             7,500
Total              485,983
Genetic Epidemiology
DNA sample inventory
Fraction                     Number of Samples
DNA Dilutions at 50 ng/µl    44,926
DNA Stocks                   50,719
DNA Other                    16,443
Total                        112,088
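As a quick arithmetic check on the two inventory slides, the per-fraction counts can be summed and compared with the stated totals. This is only a sketch; the dictionary names are illustrative and the counts are copied from the slides.

```python
# Counts copied from the frozen-sample and DNA inventory slides.
frozen = {
    "Plasma": 128_012,
    "Buffy Coats": 101_333,
    "Red Blood Cells": 130_668,
    "Serum": 97_677,
    "Buccals": 5_591,
    "FO Plasma": 7_815,
    "FO BC": 7_387,
    "FO RBC": 7_500,
}
dna = {
    "DNA Dilutions at 50 ng/ul": 44_926,
    "DNA Stocks": 50_719,
    "DNA Other": 16_443,
}

frozen_total = sum(frozen.values())
dna_total = sum(dna.values())

# Compare with the totals printed on the slides.
print(frozen_total, frozen_total == 485_983)
print(dna_total, dna_total == 112_088)
```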
Study          Subjects        N        Platform       Site      Funding
CVD Risk       Adult MZ ff     923      Illumina 317k  Helsinki  EU
Migraine+ Nic  Adult twins     1,234    Illumina 610k  deCode    NHMRC
Alcohol (1)    Adult twins     2,736    Illumina 370k  deCode    NIH
Alcohol (2)    Adult sibships  4,477    Illumina 370k  CIDR      NIH
Depression     Adult cases     (1,257)  Affy 6.0       TGen      GAIN
Endometriosis  Adult cases     2,383    Illumina 660k  deCode    Wellcome
Adolescent     Twin families   4,556    Illumina 610k  deCode    NHMRC+
Asthma/Angst   Twin families   1,766    Illumina 610k  Brown     NHMRC
TOTAL                          19,257
GWAS studies at QIMR
Australia’s changing ethnic composition
NHGRI GWA Catalog
www.genome.gov/GWAStudies
Published Genome-Wide Associations through 6/2010
904 published GWA at p < 5×10⁻⁸ for 165 traits
• Genetic risks for complex traits are modest
• A genetic risk (OR) of 1.3 (2% variance) is large
• Most genetic risks are in the 1.1 to 1.2 range or less (<1% variance)
• This is true for most complex diseases (e.g. alcoholism, schizophrenia, bipolar disorder, lung cancer) and traits (height, BMI, lipids)
BUT not always… (use your Biobank!)
(Most) genetic effects are modest
• a waste product of the normal breakdown of red blood cells
• excreted from the body after being conjugated with glucuronic acid ~ UGT (Uridine Diphosphate Glucuronyltransferase) enzyme
• a diagnostic marker of liver and blood disorders
• acts as an antioxidant: an increase in bilirubin levels is associated with a reduced risk of cardiovascular diseases
Serum Bilirubin
rs2070959
Bilirubin in adolescents
Measure  Allele  Effect (b)  SE    R2   P Value
Age 12   A       -0.58       0.04  21%  3E-59
Age 14   A       -0.71       0.05  23%  1E-50
Age 16   A       -0.97       0.06  29%  4E-65
Age 18   A       -0.72       0.09  24%  5E-15
Mean     A       -0.76       0.03  28%  2.1E-115
– What genes affect iron status (e.g. serum iron, transferrin, saturation, ferritin), and the risk of either deficiency or overload in the general population?
Genetics of Iron Status
HFE, P = 5E-38
HFE, P = 1E-73
TMPRSS6, P = 7E-27
TF, P = 3E-104
HFE, P = 8E-83
TMPRSS6, P = 2E-27
HFE, P = 4E-12
ZNF521 (Zinc Finger Protein 521), P = 4E-08
Serum iron
Transferrin
Tf saturation
Ferritin
GWAS (N = 8942)
ENGAGE meta-analysis to find more iron metabolism genes
Large effects of TF and HFE variants
             TF mutation (rs3811647)               HFE mutation (rs1800562)
Measure      Effect        % var  p                Effect        % var  p
Iron         -.01±.10 SD   0      .81              .66±.10 SD    10     3.5 × 10⁻¹¹
Transferrin  .46±.06 SD    13     3 × 10⁻¹⁵        -.68±.10 SD   9      1.1 × 10⁻¹⁰
Saturation   -.17±.06 SD   2      .002             .80±.10 SD    13     4.3 × 10⁻¹⁵
Ferritin     -.13±.06 SD   1      .03              .44±.11       4      4.5 × 10⁻⁵
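To see how an effect size and its uncertainty relate to the p-values in the table above, a rough check is a two-sided Wald test, z = effect / SE. The sketch below assumes the ± values are standard errors (the slide does not say so) and uses rounded table entries, so it should land near the slide's p-values rather than reproduce them exactly. The function name is illustrative.

```python
import math


def two_sided_p(effect: float, se: float) -> float:
    """Two-sided normal p-value for a Wald test of effect / se."""
    z = abs(effect / se)
    # erfc keeps precision for very small tail probabilities.
    return math.erfc(z / math.sqrt(2.0))


# (effect, assumed standard error) in SD units, copied from the slide.
rows = {
    "Iron, HFE rs1800562": (0.66, 0.10),
    "Transferrin, TF rs3811647": (0.46, 0.06),
    "Saturation, HFE rs1800562": (0.80, 0.10),
    "Ferritin, HFE rs1800562": (0.44, 0.11),
}

for label, (effect, se) in rows.items():
    print(f"{label}: z = {effect / se:.2f}, p = {two_sided_p(effect, se):.1e}")
```

Because the effects and errors are rounded to two decimals, differences of an order of magnitude from the slide's p-values are possible for the strongest signals.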
• Enzyme found in plasma
• Rare variants in BCHE extensively studied because of pharmacogenetic effects
• Evidence of involvement with T2DM, CVD, Alzheimer disease (questionable)
Correlations ≥ 0.25 for:
BMI
Blood pressure
ApoB
ApoE
Total cholesterol
Triglycerides
GGT
+ significant but smaller correlations for ALT, AST, HDL-C, LDL-C, urate.
Butyrylcholinesterase (BCHE)
GWAS Meta-Analysis (3 studies, total N = 8781)
Cholinesterase
Before and After Adjustment for the BCHE K Variant – many other variants contributing…
QQ Plots
All SNPs with p ≤ 0.001 (Total 5662, of which 2003 mapped to 440 genes)
Ingenuity Pathway Analysis on all butyrylcholinesterase GWAS data
CD4+/CD8+ ratio: h² = 0.84 (0.79–0.87)
Not only blood variables show large SNP effects...
λ = 1.00008
Hair curliness – straight, wavy, curly
P = 10⁻³¹
Other peaks
GWAS for curliness in three independent Australian Cohorts
~6% variance
GWAS for hair curliness
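The λ = 1.00008 shown with the hair-curliness scan is presumably the genomic inflation factor: the median observed association chi-square divided by the median expected under the null. A minimal sketch of that calculation from a list of p-values follows, assuming 1-df tests; the function name and the evenly spaced example p-values are illustrative, not data from the study.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom (about 0.455).
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def genomic_inflation(p_values):
    """Median observed 1-df chi-square divided by its null expectation.

    Each p-value must lie in (0, 1]; it is converted back to a chi-square
    statistic as the square of the corresponding normal quantile.
    """
    norm = NormalDist()
    chi2 = [norm.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


# Evenly spaced p-values mimic a scan with no inflation.
null_like = [(i + 0.5) / 1000 for i in range(1000)]
lam = genomic_inflation(null_like)
print(f"lambda = {lam:.4f}")
```

A value close to 1 suggests little inflation from population stratification or cryptic relatedness; values well above 1 would call for correction.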
Trichohyalin is expressed in hair root sheaths
Heterogeneity of gene effects by age, and sex... and environment?
Several significant hits in the combined data, but not the expected one on Chr. 22
Heterogeneity between adult and adolescent results at this locus!
?
Liver function: gamma glutamyl transferase (GGT)
Multiple SNPs show heterogeneity between adult and adolescent results for GGT
Melanocytic naevi (common moles)
The largest risk factor for melanoma
IRF4 MTAP
Note inverse association signals for MTAP and IRF4 with flat and raised nevi
QIMR GWAS for total, flat and raised nevi
American Journal of Human Genetics 87, 6–16, 2010
Mole count: Interaction of IRF4 genotype with age
• 4 point rating (none to severe)
• 3 sites – face, chest, back
• at age 12 and 14
• at age 16 face only
• How to combine these 7 measures?
• Lots of missingness
• Item response modelling in WinBUGS enables Bayesian estimation of liability, allowing for twin relatedness and adjusting for age, sex
Teenage acne
Joint
F + M
Females
Males
GWAS for Acne – different genes for males and females ?
• Is sensitivity to the environment a function of genotype?
• For MZ twins |twin1 – twin2| is a pure measure of e
• does |twin1 – twin2| vary systematically between genotypes?
• A direct test of G x E
Gene – environment interaction
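The logic of the bullets above can be sketched with a small simulation: MZ co-twins share genotype and family background, so their absolute difference reflects only unshared environment, and a trend of that difference across genotypes points to G x E. Everything in the sketch is hypothetical (allele frequency, effect sizes, function names); it illustrates the idea and is not the analysis used in the study.

```python
import random
from statistics import mean


def slope(x, y):
    """Least-squares slope of y regressed on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def simulate_mz_pairs(n_pairs, env_sd_per_allele, seed=1):
    """Return genotypes and |twin1 - twin2| for simulated MZ pairs."""
    rng = random.Random(seed)
    genotypes, abs_diffs = [], []
    for _ in range(n_pairs):
        # Allele count 0/1/2 at a hypothetical SNP (allele frequency 0.3).
        g = sum(rng.random() < 0.3 for _ in range(2))
        # Genes and family background are identical in both twins.
        shared = rng.gauss(0.0, 1.0) + 0.2 * g
        # Unshared environment; its spread may depend on genotype (G x E).
        env_sd = 1.0 + env_sd_per_allele * g
        twin1 = shared + rng.gauss(0.0, env_sd)
        twin2 = shared + rng.gauss(0.0, env_sd)
        genotypes.append(g)
        abs_diffs.append(abs(twin1 - twin2))
    return genotypes, abs_diffs


gxe_slope = slope(*simulate_mz_pairs(5000, env_sd_per_allele=0.5))
null_slope = slope(*simulate_mz_pairs(5000, env_sd_per_allele=0.0))
print(f"slope with G x E: {gxe_slope:.3f}, without: {null_slope:.3f}")
```

Note that the shared component, including the main effect of genotype, cancels in the within-pair difference, which is why the test isolates environmental sensitivity.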
Systematic GWA search for GxE using MZ twins
• 1800 MZ female pairs aged 30-70 from AU, UK, NL, DK, SE
• GWAS using Illumina 317k array
• Focus on CVD risk factors (lipids), but other phenotypes as well (including depression)
GenomEUtwin
Genome-wide association scan of MZ pair mean levels of HDL cholesterol
1800 MZ female pairs from GenomEUtwin
A gene for environmental sensitivity on Chr 16 ?
GWAS of MZ pair |differences| of HDL cholesterol
- expression and epigenetic data
Adding value to your Biobank (1)
Study Design
[Diagram: 980 individuals, with PAX samples from each group. Groups as labelled: full families (parents + MZ / DZ / sib offspring); MZ and DZ twin pairs; MZ, DZ and sib. Split labels: ~2/3 of samples and ~1/3 of samples]
• Gene expression profiles for ~980 individuals
• Individuals from 3 ‘family’ groups
• Only PAX gene expression generated
• Expression levels generated using Illumina HumanHT-12 v4.0 chips
Expression levels can be correlated with all other phenotypes
eQTL Study
Study Design
[Diagram: 980 individuals, with methylation measured in each group. Groups as labelled: full families (parents + MZ / DZ / sib offspring); MZ and DZ twin pairs; MZ, DZ and sib. Split labels: ~2/3 of samples and ~1/3 of samples]
• From the same individuals as the full expression study
• Whole genome methylation levels determined
• Using Illumina methylation 450k chips
Methylation levels can be correlated with expression…and with MZ discordance !
Methylation levels
- widespread methylation differences
Changes in the pattern of DNA methylation associate with twin discordance in systemic lupus erythematosus.
Javierre BM et al. Genome Res. 20: 170–179, 2010
MZ pairs discordant for SLE
- keep adding new phenotypes !
Adding value to your Biobank (2)
Associated with:
testosterone exposure
aggression
ADHD
homosexuality
fertility
others
Multivariate Genetic Analyses of the 2D:4D Ratio: Examining the Effects of Hand and Measurement Technique in Data from 757 Twin Families. Sarah E. Medland and John C. Loehlin. Twin Research and Human Genetics 11: 335–341, 2008
Ratio of 2nd to 4th finger length
LIN28B SNP associated with:
2D:4D ratio
Age of menarche
Menopause
Height
A Variant in LIN28B Is Associated with 2D:4D Finger-Length Ratio, a Putative Retrospective Biomarker of Prenatal Testosterone Exposure. Sarah E. Medland…. David M. Evans. Am J Human Genetics 86, 519–525, 2010
Large consortia…..
Brisbane Adolescent Twin database (>700 scanned)
Data acquisition: 4 Tesla Bruker Medspec scanner – CMR, UQ
  MRI
  DTI (HARDI)
  fMRI (n-back)
  resting-fMRI
Processing and analysis:
  MRI – UCLA
  DTI (HARDI) – UCLA
  fMRI (n-back) – UQ
  resting-fMRI – UQ + NYU
Twin Imaging Study (TIMS)
http://enigma.loni.ucla.edu/
- sequencing !
Adding value to your Biobank (3)
Whole-genome sequencing
Why?
Discover novel, rare variants with potential relevance for disease, including CNVs.
These can then be imputed/genotyped and tested for association in large cohorts.
Pilot study: first look at data
14 cases + 1 control (including trio) sequenced with deep coverage using HiSeq.
Cases with strong family history, severe disease and other co-morbid phenotypes.
~97% concordance of sequence with KGP imputation (610k)
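A concordance figure like the one above is typically the fraction of sites, called in both data sets, at which the two genotype calls agree. The sketch below shows that calculation on made-up data; the site names, genotype encoding and function name are hypothetical, and the slide does not describe how the ~97% was actually computed.

```python
def genotype_concordance(sequenced, imputed):
    """Fraction of sites called in both sets where the genotypes match.

    Both arguments map a site identifier to a genotype string, or to None
    when no call was made at that site.
    """
    shared = [
        site for site, call in sequenced.items()
        if call is not None and imputed.get(site) is not None
    ]
    if not shared:
        raise ValueError("no sites called in both data sets")
    matches = sum(sequenced[site] == imputed[site] for site in shared)
    return matches / len(shared)


# Toy data: four sites are called in both sets, three of them agree.
sequenced = {"site1": "AA", "site2": "AG", "site3": "GG", "site4": "AG", "site5": None}
imputed = {"site1": "AA", "site2": "AG", "site3": "AG", "site4": "AG", "site5": "GG", "site6": "AA"}

concordance = genotype_concordance(sequenced, imputed)
print(f"concordance = {concordance:.2f}")
```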
Twins and their families for their participation
John Whitfield, Peter Visscher, David Duffy, Grant Montgomery, Dale Nyholt
Dixie Statham, Ann Eldridge, Marlene Grace, Anjali Henders and Megan Campbell, Leanne Wallace for the data collection and sample processing.
Allan McRae, Manuel Ferreira, Brian McEvoy, Scott Gordon, Sarah Medland, Gu Zhu, Beben Benyamin, Rita Middelberg, Margie Wright for helping with data & analysis
Harry Beeby and David Smyth for IT support
Collaborators:
Netherlands Twin Registry: Gonneke Willemsen, Jouke-Jan Hottenga, Eco de Geus, Brenda Penninx, Dorret Boomsma
UK Twin Registry: Tim Spector, Mangimo Massimo
ALSPAC Study: David Evans, George Davey Smith
Sanger Institute / U Helsinki: Aarno Palotie, Leena Peltonen
University of Queensland: Ian Frazer, Rick Sturm, Greig de Zubicaray
Washington University, St. Louis: Andrew Heath, Pam Madden
Acknowledgements