pamela linksted - ccace
TRANSCRIPT
2/11/2014
1
GS offers access to high quality, ethically
consented samples & data for genetic &
health related research
Pamela Linksted
CCACE Scientific Themes
Feb 2014
Scottish Family Health Study (SFHS): ~24,000
Genetic Health in the 21st Century (21CGH): ~2,000
Donor DNA Databank (3D): ~5,000
Combined total 30,000
Generation Scotland
GS Resources – Size & Confidentiality
[Figure: one circle per resource; size of circle depicts number of participants: SFHS 24,000; 3D 5,000; 21CGH 2,000. Labels: Anonymous, Coded, Record Linkage, Recontact.]
GS Resources – Data & Samples
[Figure: one circle per resource; size of circle depicts quantity of data collected.]
SFHS: Demographic, Intensive phenotype, Clinical; DNA, Serum, Urine, Blood, Biochemistry
21CGH: Demographic, Moderate phenotype, Clinical; DNA, Plasma, Cells
3D: Demographic, Minimal phenotype; DNA, Plasma
GS Resources - Key features
• Large population-based cohorts
• Rich sample resource
– Biochemistry data, DNA, serum, urine & cryopreserved blood
• Clinical measurements, health & demographic data
– Including cognitive function, mental health, cardiovascular, metabolic, respiratory, pain and musculoskeletal disease
• Genotype and biomarker data
• Consent for
– Routine health record linkage
– Re-contact for further research
• Efficient governance & management of access
• Broad Consent for Academic & Commercial Medical Research
What was collected?
[Figure: SFHS family size (7023 families). Histogram of number of families (axis 0 to 2000) by members in study (1 to 20+).]
Family cohort (SFHS)
~7000 families (n = 2-36, some complex pedigrees, some three generations, average size 4.25)
• ~3,500 parent-child trios
• ~12k sib, ~13k parental & ~9k avuncular kinship pairs
• ~14,000 unrelated
Primary care-based recruitment
• ~24,000, age 18 – 98yrs
• Informed by extensive public engagement
• Ethnicity: 99% white; 96% born in the UK; 87% in Scotland
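Counts such as the parent-child trios and sib pairs quoted for the family cohort can be derived from pedigree records. The following is a minimal sketch only, assuming a hypothetical record layout of (person, father, mother); the identifiers, the layout and the function name are illustrative and are not the GS pedigree format.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical pedigree records: (person_id, father_id, mother_id).
# None means that parent did not take part in the study.
PEDIGREE = [
    ("F1", None, None),
    ("M1", None, None),
    ("C1", "F1", "M1"),
    ("C2", "F1", "M1"),
    ("C3", "F1", "M1"),
    ("C4", None, "M1"),
]

def count_trios_and_sib_pairs(pedigree):
    """Count parent-child trios and full-sibling pairs in a pedigree.

    Simplification: siblings are only recognised when both of their
    parents are themselves participants.
    """
    participants = {person for person, _, _ in pedigree}
    children_by_parents = defaultdict(list)
    trios = 0
    for person, father, mother in pedigree:
        # A trio needs both parents recorded and present as participants.
        if father in participants and mother in participants:
            trios += 1
            children_by_parents[(father, mother)].append(person)
    # Every unordered pair of children sharing both parents is a sib pair.
    sib_pairs = sum(
        len(list(combinations(children, 2)))
        for children in children_by_parents.values()
    )
    return trios, sib_pairs

trios, sib_pairs = count_trios_and_sib_pairs(PEDIGREE)
print(trios, sib_pairs)
```

In this toy family, C4 has only one participating parent, so C4 contributes neither a trio nor a sib pair under the simplification stated in the docstring.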
Phenotype and Samples
Personal information
• Pedigree
• Demographics
Clinic measurements
• Body Measurement
• Ankle-Brachial Pressure Index
• Spirometry
• ECG
• Cognitive testing*
• SCID (major mental disorders)*
• Psychometric testing*
Biological Samples
• DNA
• Serum
• Cryopreserved blood
• Urine
Biological samples data
• Biochemistry
• Genotype
*validated methodology
Questionnaire
• Family History
• Family Health
• Medications
• Operations
• Chest Pain*
• Musculoskeletal
• Chronic Pain*
• Exercise
• Thoughts & experiences (SPQ-B, MDQ)*
• Diet
• Alcohol
• Smoking
• Education
• Occupation
• Household
• Women’s Health
Heart Disease
Stroke
High Blood Pressure
Diabetes
Alzheimer's Disease
Parkinson's Disease
Depression
Breast Cancer
Bowel Cancer
Lung Cancer
Prostate Cancer
Hip Fracture
Osteoarthritis
Rheumatoid Arthritis
Asthma
COPD
Validated Tests of Cognitive Function & Mental health
[Figure: Verbal fluency scores. Percentage of participants (0% to 20%) by words produced in 1 minute (0 to 30), for letters C, F and L.]
[Figure: Eysenck Personality Questionnaire. Percentage of participants (0% to 20%) by score out of 12 (0 to 12), for Neuroticism and Extraversion.]
1. Verbal Fluency*
2. Eysenck Personality Questionnaire*
3. Wechsler Memory Test (short-term and delayed)*
4. Digit Symbol test*
5. Mill Hill Vocabulary Scale*
6. General Health Questionnaire-28
7. SCID screen for DSM-IV Major Depressive Disorder
8. Schizotypal Personality Questionnaire (SPQ-B)
9. Mood Disorder Questionnaire (MDQ)
10. Choice Reaction time & Impulsivity
(*also collected in 21CGH)
>2,400 individuals with MDD or rMDD (~13%)
>400 families with 2 or more cases
Samples at time of recruitment
Biological samples
• DNA master stocks and working stock plates for all study participants (~24,000)
• Blood spots on FTA card (equivalent to Guthrie) (~20,500)
• Serum and Urine samples and cryopreserved blood (~20,000)
Data from samples
• Biochemistry from fresh blood: (~20,000) glucose, urea, creatinine,
potassium, sodium, total cholesterol, HDL cholesterol
GS Data Portal (www.gsaccess.org)
Email [email protected]
Application
Data browser
Access to the resource
GS Access
• Managed access as a ‘Research Tissue Bank’
• Consent for “Biomedical Research”
• Consent for recontact, linkage and commercial use
• Online application system www.gsaccess.org
• Dedicated management team
• Principles
– Regulatory Framework (NHS)
– Participant confidentiality & consent
– Maximise use for medical research
– Availability & protection of resource
– Commercial/IP implications
– Quality
– Sustainability
– Collaborative Approach
• Seeking HIS Accreditation within NHS Lothian
Consideration by GSAC
Analysis
Enquiry
GS Application Process
Often a multistage/multifaceted process
Who?
What?
Why?
When?
Where?
How?
Approvals?
Dependencies?
Funding?
Outline plan with timescales and resources required before costs and approval by GSAC.
Full plan required before signoff of the DMTA!
e-form review considering:
• Ethics & Confidentiality
• Within Consent
• Peer review
• Resource (Data/Samples)
• Overlap
Amendment & re-review
e-form submission (CPF)
Agreement (DMTA)
Withdrawn /deferred
Outline Plan
Full Plan
Costing
Approved - Finalise CPF
Collaboration Proposal
Publications
Implement Project:
• Data/sample selection e.g. case/control matching
• Data release/remote access
• Sample selection, analysis or release
• Linkage to NHS data
• Targeted re-contact
• etc.
Return data to GS
Acknowledgment & co-authorship: Condition of Collaboration
Growing GS Resources – Use to Date
• >180 Proposals and >300 Enquiries to date
• Academic & Commercial
• Types of Study
– Data only
– Commissioned DNA analysis (e.g. SNP analysis)
– GWAS on 14,000 participants
– Biomarker analysis
– Case and Control matching
– Pedigree analysis
– Targeted recruitment (based on e.g. genotype)
– Linkage to prescribing, SMR, SCI-DC data
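Case and control matching, listed among the study types above, can be illustrated with a small example. This is a sketch under stated assumptions, not the procedure GS uses: the record fields (`id`, `sex`, `age`), the 5-year age window and the greedy nearest-age rule are all hypothetical choices made for illustration.

```python
# Hypothetical participant records; field names are illustrative only.
CASES = [
    {"id": "case1", "sex": "F", "age": 52},
    {"id": "case2", "sex": "M", "age": 67},
]
POOL = [
    {"id": "p1", "sex": "F", "age": 50},
    {"id": "p2", "sex": "F", "age": 53},
    {"id": "p3", "sex": "M", "age": 80},
    {"id": "p4", "sex": "M", "age": 66},
]

def match_controls(cases, pool, max_age_gap=5):
    """Greedy 1:1 matching on sex and nearest age within max_age_gap years."""
    available = list(pool)
    matches = {}
    for case in cases:
        candidates = [
            c for c in available
            if c["sex"] == case["sex"]
            and abs(c["age"] - case["age"]) <= max_age_gap
        ]
        if not candidates:
            matches[case["id"]] = None  # no eligible control left
            continue
        best = min(candidates, key=lambda c: abs(c["age"] - case["age"]))
        matches[case["id"]] = best["id"]
        available.remove(best)  # each control is used at most once
    return matches

print(match_controls(CASES, POOL))
```

Greedy matching is simple but order-dependent: an early case can take the control that would have been the only eligible match for a later one, so a real study may prefer an optimal matching method.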
Using and Enhancing the resource
Samples analysis
Record Linkage
Recontact
Expert Working Groups
Current and future activity
Data from Samples
Genotype from stored DNA:
• (~10,000) 700k SNP GWAS and 250k exome array (+4000)
• (~900) whole exome sequence
• (~70) whole genome sequence
• plus numerous (~30) taqman/openarray involving ~14,000 SNPs (average samples per assay 548)
Frozen Serum:
• (~1800) IL-6 and CRP
• (~1500) Anti-Mullerian hormone
• (~400) miRNA
Cryopreserved bloods:
• (20) Pilot whole blood transformation and extraction of DNA
• (32) Feasibility of IL-1RI co-receptor TILRR mRNA measurements
Urine:
• (~4000) Urinary traits
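GWAS and Exome Sequence
[Figure: Venn diagram of overlap between the GWAS #1, GWAS #2, GS ExSeq and UK10K sample sets. Region counts as shown: 9216, 4212, 17, 209, 32, 3999, 2, 167. Ion Proton: 39, 3.]
• GWAS #1: 9905
• GWAS #2: 4168
• GS ExSeq: 442
• UK10K ExSeq: 428
• Proton ExSeq: 42
• + 18 Whole Genome Sequence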
Collaborations underway with GWAS Consortia
CARTA Tobacco & Alcohol
CHARGE Multiple Phenotypes inc. adiposity, lipids, diabetes, cognition, educational attainment, family studies
CKDgen Kidney
ECUT Urinary Traits
PGC Psychiatric Genomics Consortium
ReproGEN Reproductive Health
ROHgen Homozygosity & Health
SpiroMETA Lung Function
Record Linkage
[Figure: record linkage data-flow diagram. Labels: Personal data; Personal Unique ID; Community Health Index; GS Data Collected (Demographic, Phenotype); Samples (LIMS): DNA, Urine, Serum, Blood; Data from Samples (Genotype, Biochemistry); Data Linkage (Medical Records, Prescribing); Future Recontact (New Data).]
GS:SFHS Medical Record Linkage – events before and since participation
100% of GS:SFHS participants gave consent for their data to be linked to their “medical records for health related research”.
92% have community health index (CHI) and consent to linkage
• Enhance the baseline data collected at recruitment
• Long-term follow-up through linkage and recontact
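Linking study records to health records by CHI, and separating events before and since participation, can be sketched as below. Everything here is an assumption made for illustration: the CHI values are invented, the field names and the `link_events` helper are hypothetical, and actual linkage is carried out under the governance arrangements described elsewhere in these slides.

```python
from datetime import date

# Hypothetical study records keyed by study ID; CHI values are made up.
PARTICIPANTS = {
    "GS001": {"chi": "0101010001", "linkage_consent": True,
              "recruited": date(2008, 5, 1)},
    "GS002": {"chi": None, "linkage_consent": True,
              "recruited": date(2009, 3, 15)},
    "GS003": {"chi": "0202020002", "linkage_consent": True,
              "recruited": date(2010, 1, 10)},
}

# Hypothetical health-record events: (chi, event_date, description).
EVENTS = [
    ("0101010001", date(2005, 7, 4), "admission"),
    ("0101010001", date(2012, 2, 9), "admission"),
    ("0202020002", date(2011, 11, 30), "prescription"),
    ("0909090009", date(2011, 1, 1), "admission"),  # not a participant
]

def link_events(participants, events):
    """Attach events to participants by CHI, split around recruitment date."""
    # Only participants with a CHI and linkage consent can be linked.
    by_chi = {
        p["chi"]: study_id
        for study_id, p in participants.items()
        if p["chi"] is not None and p["linkage_consent"]
    }
    linked = {sid: {"before": [], "since": []} for sid in by_chi.values()}
    for chi, when, what in events:
        study_id = by_chi.get(chi)
        if study_id is None:
            continue  # event belongs to someone outside the linkable cohort
        recruited = participants[study_id]["recruited"]
        period = "before" if when < recruited else "since"
        linked[study_id][period].append((when, what))
    return linked

linked = link_events(PARTICIPANTS, EVENTS)
print(sorted(linked))
```

In this toy data GS002 has no CHI and so cannot be linked, which mirrors the gap between the proportion consenting and the proportion with both a CHI and consent.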
Types of data of interest to GS collaborators
• Hospital admissions data
• Maternity and neonatal data
• Mental health data
• Disease registry data
• GRO (deaths)
• GP data/SPIRE
• Prescribing and dispensing
• NHS Scottish Care Information (SCI-Store) - Lab measures
• NHS Scottish Care Information (SCI-DC) - Diabetes
• SSCA - Scottish Stroke Care Audit
• Dental Records
• Breast feeding
• Sleep
• Infections
SFHS Linkage - validation and characterisation study
GS:
• Clinical measurements
• Lifestyle questionnaire
• Biochemistry
• Cognitive tests
• SCID
• Pedigree
• Genotype
NHS:
• NHS – SMR admissions (ISD)
• NHS – Prescriptions (HIC)
SMR00 – Outpatient Attendance
SMR01 – General/Acute Inpatient & Day Case
SMR02 – Maternity Inpatient & Day Case
SMR04 – Mental Health
SMR06 – Cancer Registry
SMR11 – Neonatal Inpatient
Scottish Birth Record (2002+)
Linkage of GS study data to NHS SMR for ~22,000 participants and prescribing data (HIC) for ~17,700 participants
Data held in TASC safe haven in Dundee
SMR dataset timelines
http://www.adls.ac.uk/nhs-scotland/
GS:SFHS - Linkage dataset
Allowing:
• Comparison of self reported with outcome data
• Review of clinical endpoint
• Project future events
• Provide summary data to researchers for project planning
• Code up linked data
Approval sought for interpreted/coded data to be released from linked dataset for linkage in other projects e.g. with GWAS data
Addresses issue of:
• Need to interpret/pre-process GS-NHS data
• GWAS and exome array data held within MRC HGU servers
• Streamline linkages to GS data and approvals
Cardiovascular events
Group             ICD10 from  ICD10 to  ICD9 from  ICD9 to  Ever  Before GS  After GS  Only after GS
hypertension      I10         I15       401        405      1279  830        655       449
angina/MI         I20         I25       410        414      1039  776        489       263
heart failure     I50         I51       428        429      276   152        139       124
haemorrhage       I60         I69       430        438      297   195        126       102
aneurysm etc      I70         I74       440        444      212   135        112       77
any of the above  -           -         -          -        2182  1532       1102      650
Note:
• Coding of event (ICD9/10)
• Timing of event
• Multiple events
• Link/connection with other info e.g. prescribing
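Coding linked events into the cardiovascular groups above can be sketched as a lookup on the ICD-10 category. This is an illustrative sketch only: the group ranges and counts are copied from the table, while the function name and the reading of the count columns (ever, before GS, after GS, only after GS) are assumptions. Under that reading, "only after GS" equals "ever" minus "before GS" in every row, which the last lines check.

```python
# ICD-10 ranges from the table: letter I plus a two-digit category.
ICD10_GROUPS = {
    "hypertension": (10, 15),
    "angina/MI": (20, 25),
    "heart failure": (50, 51),
    "haemorrhage": (60, 69),
    "aneurysm etc": (70, 74),
}

def classify_icd10(code):
    """Return the cardiovascular group for an ICD-10 code, or None."""
    code = code.strip().upper()
    if len(code) < 3 or code[0] != "I" or not code[1:3].isdecimal():
        return None
    category = int(code[1:3])
    for group, (low, high) in ICD10_GROUPS.items():
        if low <= category <= high:
            return group
    return None  # an I-chapter code outside the listed ranges

# Counts from the table: (ever, before GS, after GS, only after GS).
COUNTS = {
    "hypertension": (1279, 830, 655, 449),
    "angina/MI": (1039, 776, 489, 263),
    "heart failure": (276, 152, 139, 124),
    "haemorrhage": (297, 195, 126, 102),
    "aneurysm etc": (212, 135, 112, 77),
    "any of the above": (2182, 1532, 1102, 650),
}

# People with a first event only after joining = ever minus before.
consistent = all(
    ever - before == only_after
    for ever, before, _after, only_after in COUNTS.values()
)
print(classify_icd10("I21.0"), consistent)
```

Events recorded in ICD-9 would need a second lookup on the numeric ranges in the table; that step is omitted here.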
Recontact
Recontact
98% of participants have given consent for recontact.
Allows GS to recontact participants for new data/samples collection or recruitment into new studies.
Wealth of information allows targeted recontact
Studies involving recontact
GS has written to:
• 1000 participants – matched controls for INTERSTROKE evaluating association between conventional and emerging risk factors and stroke.
• 80 participants – range of individuals to explore views on reconsent for future data and samples use.
GS currently writing to:
• ~500 participants – matched controls for study of Cognitive Impairment and Heart Failure involving validated neuropsychological techniques, MMR brain imaging and indices of cerebral blood flow.
• ~500 participants – identifying pre-eclampsia cases and controls for study of cardiovascular consequences of pre-eclampsia involving record-linkage, a biomarker strategy and high-fidelity cardiovascular phenotyping.*
• ~100 participants – identifying individuals at the extremes of the polygenic spectrum for MDD for a pilot study involving further mental health and clinical assessments and a brain scan. Funding bid submitted for main study involving ~3000 participants
26% response rate
Expert Working Groups
Enhance the quality and utility of the GS data
Expert Working Groups
• Invest resources and assume responsibility for ensuring integrity of the defined research area and related data (e.g. through checking, cleaning, validating, annotating)
• Make the cleaned data and primary analysis available for further collaborative work
• Undertake to publish primary research
• Support and/or facilitate other collaborative proposals in this area
• GS will refer research applicants to the EWG where relevant. Applicants are not obliged to work directly with them, but EWG contributions to publications should be acknowledged.
• EWG do not have preferential or exclusive access
Approved EWG
• Cognition and related traits: Professor Ian Deary, University of Edinburgh
• Mental Health and related traits: Professor Andrew McIntosh, University of Edinburgh
• Pain and related traits: Professor Blair Smith, University of Dundee and Dr Lynne Hocking, University of Aberdeen (jointly)
Current and future activity
• GS has received >180 proposals for collaboration with a further ~300 enquiries.
• Currently ~30 active projects requiring input or monitoring (6 new proposals at next GSAC review, 10-15 new enquiries per month)
• Some approved projects extending out to 2019.
Future activity
• Further enhance the resource through recontact for new data (and sample) collection – requires collaborative funding bid!
• Enhance commercial engagement
• Closer alignment with other bioresearch initiatives e.g. ISD, Farr Institute and Administrative Data Liaison Service (ADLS) for linkage and National Repositories for tissue governance and maintenance.
• Income generated from Access fee but additional funding required to support GS core team from April 2014!
Sources of Information
www.generationscotland.org www.gsaccess.org
Email [email protected]