Identification of repeat expansions with whole exome and whole genome sequencing in epilepsy patients

Melanie Bahlo, @MelanieBahlo
Population Health and Immunity Division, The Walter and Eliza Hall Institute of Medical Research (WEHI), Melbourne, Australia

Cleveland Epilepsy Symposium (Virtual), September 2020

Structural Repeat Epilepsies


Repeat Expansion Disorders: caused by expanded short tandem repeats

• ~40 human disorders caused by expansions, e.g. Huntington’s Disease, many spinocerebellar ataxias (SCAs)
  – Aberrant short tandem repeat profiles are also features of some cancers, e.g. colorectal cancer (microsatellite instability)
• Pathogenic molecular mechanisms vary (Hannan et al, Nat Rev Genet 2018)
• Overrepresented in neurological disorders. Longevity, slow turnover of neurons.
• Example: SCA1, coding CAG (Q = glutamine) expansion in ATXN1
  – Normal 6–38, affected 40–81

Human Genome Reference = (CAG)12 : an unaffected individual

[Figure: SCA6, expanded versus normal allele; label CTG8]

REs are captured by standard short read Whole Exome & Whole Genome Sequencing (WES & WGS)

Bioinformatics methods for repeat expansion detection with WES/WGS

[Figure: Repeat expansion disorders: normal and disease causing allele repeat lengths. x-axis: Repeat length (bp), log scale from 1 to 10000. Loci shown, with the bracketed number as given on the slide: SCA8 (3), SCA12 (3), EPM1 (12), SCA10 (5), SCA36 (6), FTDALS (6), DM2 (4), FRDA (3), DM1 (3), FRAXE (3), FRAXA (3), HDL2 (3), DRPLA (3), SCA17 (3), SCA7 (3), SCA6 (3), SCA3 (3), SCA2 (3), SCA1 (3), SBMA (3), HD (3). Location key: coding, exon_var_spliced, 5'UTR, 3'UTR, intron_1, intron_9, promoter, utRNA. Range key: Normal, Intermediate, Pathogenic, hg19 reference. Vertical bar lengths: 151bp HiSeq X; 397bp HiSeq X insert.]

from Rick Tankard, PhD thesis, 2018

[Figure: HD (coding CAG), norm: 19 (59bp), exp: 36 (108bp), score ECDF. x-axis: Repeated bases (x), 0 to 150; y-axis: Fn(x), 0.0 to 1.0. Legend: HD−1, SCA2−1, SCA6−1.]

[Figure: FRDA (intron_1 GAA), norm: 6 (20bp), exp: 200 (600bp), score ECDF. x-axis: Repeated bases (x), 0 to 150; y-axis: Fn(x), 0.0 to 1.0. Legend: WGSrpt_09, WGSrpt_11, controls.]

exSTRa: https://github.com/bahlolab/exSTRa

EHdn: https://github.com/Illumina/ExpansionHunterDenovo
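The score ECDF plots above show, for each sample, the empirical cumulative distribution Fn(x) of the number of repeated bases observed per read at a locus. The following is a minimal sketch of that idea only, assuming hypothetical per-read counts; it does not reproduce how exSTRa scores reads or tests for outlier samples.

```python
from bisect import bisect_right


def ecdf(values):
    """Return F, where F(x) is the fraction of values less than or equal to x."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one value")

    def F(x):
        # bisect_right counts how many sorted values are <= x
        return bisect_right(ordered, x) / n

    return F


# Hypothetical per-read counts of repeated bases at one locus (not real data)
control_reads = [30, 36, 42, 45, 48, 51, 54, 57, 57, 60]
case_reads = [33, 57, 60, 96, 102, 108, 120, 135, 141, 150]

F_control = ecdf(control_reads)
F_case = ecdf(case_reads)

# Fraction of reads with more than 90 repeated bases in each sample
threshold = 90
frac_control_above = 1 - F_control(threshold)
frac_case_above = 1 - F_case(threshold)
```

In a plot of this kind, a sample carrying an expansion appears as a curve shifted to the right of the controls, because a larger share of its reads contain long stretches of repeat.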

Repeat Expansion Detection TOOLS
• Expansion Hunter (Dolzhenko et al, Genome Research 2017)
• exSTRa (Tankard et al, AJHG 2018)
• STRetch (Dashnow et al, Genome Biol 2018)
• TREDPARSE (Tang et al, AJHG, 2017)
• GangSTR (Mousavi et al, NAR, 2019)
• TRhist (Doi et al, Bioinformatics, 2014)
• EHdn (Dolzhenko, Bennett et al, Genome Biology 2020)

• Sensitivity and specificity are high (>90%, >90%) for most known REs
• Known repeat expansions: testing ~40 known REs, not 1000s of SNPs/indels
• Low curation burden
• Not in standard clinical genomics analysis pipelines – YET!
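For reference, sensitivity and specificity as quoted above are computed from a comparison against samples of known status. A minimal sketch, using hypothetical counts chosen only for illustration (they are not results from any of the tools listed):

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical validation set: 20 samples with a known expansion, 300 without
sens, spec = sensitivity_specificity(tp=19, fn=1, tn=285, fp=15)
```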

Rick Tankard Mark Bennett

HD FRDA

Repeat Expansions contribute substantially to disease burden in both rare and complex neurological disorders*

• Fragile X RE – causes 50% of all X-linked ID and autism
• C9orf72 RE causes ALS/MND & FTD
  – in >1/10 ALS patients
  – in >1/10 FTD patients
  – differential diagnosis for HD, PD & other diseases
• HD (~30,000 USA affected, ~200,000 carriers)

*underestimates

ALS & FTD: C9orf72 (2015)

Unverricht Lundborg PME: CSTB dodecamer (1997)

Familial Adult Myoclonus Epilepsy (pentamers):
• FAME1 SAMD12 (2018)
• FAME6 TNRC6A (2018)
• FAME7 RAPGEF2 (2018)
• FAME2 STARD7 (2019)
• FAME3 MARCH6 (2019)
• FAME4 YEATS2 (2019)

Fragile-X: FMR1 (1991)

Spinocerebellar ataxia 1 (SCA1): ATXN1
Huntington’s Disease: HTT (1993)

Familial Adult Myoclonic Epilepsy (FAME)

‣ Cortical myoclonic tremor, onset ~20 yrs

‣ Myoclonic and generalised tonic-clonic seizures in ~1/3 of patients

‣ Linkage studies identified four loci: chromosomes 2, 3, 5, 8

‣ Evidence of anticipation

‣ Discovered in 2018/19 to be due to multiple repeat expansions

(Crompton et al. 2012, Arch Neurol)

Australian/New Zealand FAME2 family, caused by RE in STARD7 (Corbett et al. 2019, Nat Comms)

Pentamer Repeat Expansions in the vicinity of Alu Sequences

FAME3 (Florian et al. 2019)

FAME2 (Corbett et al. 2019)

FAME6 (Ishiura et al. 2018)

FAME7 (Ishiura et al. 2018)

FAME1 (Ishiura et al. 2018)

SCA37 (Seixas et al. 2017)

+ CANVAS (Cortese et al, 2019; Rafehi et al, 2019)

Families with FAME: Sri Lankan family, Indian family

Which FAME locus? Novel?

WGS to search for REs

Work with Sam Berkovic and Ingrid Scheffer

Targeted RE Search: Hit for FAME1

[Figure: FAME1 results from exSTRa, STRetch and Expansion Hunter. Motif labels: FAME1 benign reference TTTTA; FAME1 pathogenic novel TTTCA.]

Results validated with RP-PCR

Ishiura et al. 2018; Cen et al. 2018
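The FAME1 hit rests on distinguishing the benign reference motif TTTTA from the pathogenic novel motif TTTCA in sequencing reads. The sketch below illustrates that distinction on a single made-up read by counting the longest run of consecutive motif copies, allowing for any rotation of the motif and for either strand. It is an assumption-laden toy, not the algorithm used by exSTRa, STRetch or Expansion Hunter, which work from aligned reads.

```python
def rotations(motif):
    """All cyclic rotations of a motif, e.g. TTTCA -> TTCAT, TCATT, ..."""
    return {motif[i:] + motif[:i] for i in range(len(motif))}


def reverse_complement(seq):
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq))


def longest_run(read, motif):
    """Largest number of back-to-back motif copies found anywhere in the read."""
    units = rotations(motif) | rotations(reverse_complement(motif))
    k = len(motif)
    best = 0
    for unit in units:
        for start in range(len(read) - k + 1):
            run = 0
            pos = start
            # Walk forward one motif length at a time while copies keep matching
            while read.startswith(unit, pos):
                run += 1
                pos += k
            best = max(best, run)
    return best


# Hypothetical read: three reference-like units followed by four TTTCA units
read = "GC" + "TTTTA" * 3 + "TTTCA" * 4 + "AG"

tttca_run = longest_run(read, "TTTCA")
tttta_run = longest_run(read, "TTTTA")
reverse_strand_run = longest_run(reverse_complement(read), "TTTCA")
```

Searching rotations and the reverse complement matters because a read can start at any position within the repeat and can come from either strand.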

Mutation Dating: https://shiny.wehi.edu.au/rafehi.h/mutation-dating/

Method: Gandolfo, Bahlo & Speed, 2014, Genetics
Mutation dating web app: Haloom Rafehi

Haplotype History

[Figure: haplotype history tree. Labels: original mutation; Japanese founder; Chinese founder; Japanese families; Chinese families; Indian & Sri Lankan families. Ages shown on the tree: ~160 generations, ~650 generations, ~500 generations.]

Bennett et al, EJHG 2020
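The ages on the haplotype tree are given in generations. A minimal sketch of the conversion to years, assuming the 25 years per generation stated on the axis of the mutation dating plot; the function name is illustrative only.

```python
# Assumption: 25 years per generation, as labelled on the mutation dating plot
YEARS_PER_GENERATION = 25


def generations_to_years(generations, years_per_generation=YEARS_PER_GENERATION):
    return generations * years_per_generation


# The three ages shown on the haplotype history figure, in generations
ages_in_generations = [160, 500, 650]
ages_in_years = {g: generations_to_years(g) for g in ages_in_generations}
```

Under this assumption the largest figure, ~650 generations, converts to roughly 16,000 years, which is of the same order as the ~17,000 years quoted in the abstract for FAME1.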

Epi25 Repeat Expansion Analysis

• Epi25 - largest epilepsy sequencing project (still in progress)
  – WES only (Epi25 Consortium, AJHG 2019)

• Are any known repeat expansions enriched in Epi25?
  – Differential diagnosis and/or epilepsy genetic risk factor
  – WES caveat: can only assess those repeat expansions with coverage

• Future Analysis: Are there any novel repeat expansions in Epi25?
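The WES caveat above means each known RE locus has to be checked for usable read depth before it is tested, in both cases and controls. A minimal sketch of such a check follows. The depth values, the locus names used as dictionary keys and the threshold of 10 reads are all assumptions made for illustration; the slides do not state the criterion that was actually applied.

```python
from statistics import median


def coverage_summary(depths_by_group):
    """Median read depth at one locus for each group (e.g. cases, controls)."""
    return {group: median(depths) for group, depths in depths_by_group.items()}


def is_assessable(depths_by_group, min_median_depth=10):
    """Treat a locus as assessable only if every group reaches the threshold."""
    summary = coverage_summary(depths_by_group)
    return all(depth >= min_median_depth for depth in summary.values())


# Hypothetical per-sample depths at two RE loci in exome data (not real data)
locus_depths = {
    "coding_locus": {"cases": [42, 38, 51, 47], "controls": [40, 44, 39, 50]},
    "intronic_locus": {"cases": [0, 2, 1, 0], "controls": [1, 0, 0, 3]},
}

assessable_loci = [
    locus for locus, depths in locus_depths.items() if is_assessable(depths)
]
```

Summarising cases and controls separately also exposes depth differences between the two groups, which could otherwise be mistaken for a difference in repeat length.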

Repeat Expansion Analysis Pipeline

[Figure: pipeline flowchart]
• Inputs: BAMs (case / control); Repeat Database (known pathogenic, genome wide catalog)
• RE Methods: Raw Data Processing
  – targeted: ExpansionHunter, exSTRa, STRetch, TREDPARSE, GangSTR
  – novel: ExpansionHunterDenovo
• Post Processing
• Analysis
  – Consensus REs (targeted search)
  – novel candidate repeat expansions

Mark Bennett
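The "Consensus REs" step of the pipeline combines calls from several targeted tools. One simple way to do that is sketched below: keep a (sample, locus) call only if at least a minimum number of tools report it. The rule, the `min_tools` parameter and the example calls are assumptions for illustration; the slides do not specify how the consensus was actually defined.

```python
from collections import defaultdict


def consensus_calls(calls_by_tool, min_tools=2):
    """Map each (sample, locus) call to its supporting tools, keeping only
    calls reported by at least `min_tools` tools."""
    support = defaultdict(set)
    for tool, calls in calls_by_tool.items():
        for call in calls:
            support[call].add(tool)
    return {
        call: sorted(tools)
        for call, tools in support.items()
        if len(tools) >= min_tools
    }


# Hypothetical outlier calls per tool (sample and locus names are made up)
calls_by_tool = {
    "ExpansionHunter": {("sample_01", "FAME1"), ("sample_02", "SCA8")},
    "exSTRa": {("sample_01", "FAME1"), ("sample_03", "FAME2")},
    "STRetch": {("sample_01", "FAME1"), ("sample_02", "SCA8")},
}

consensus = consensus_calls(calls_by_tool)
```

Raising `min_tools` trades sensitivity for fewer candidates to confirm by an orthogonal method such as RP-PCR.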

Coverage of known REs – variability in cases v controls

[Figure: coverage of known RE loci in cases and controls; y-axis: coverage]

Therapies are looking promising for REs

Acknowledgements

WEHI Bahlo Lab: Mark Bennett, Haloom Rafehi, Liam Fearnley, Rick Tankard, Peter Degorski

MCRI: Paul Lockhart, Martin Delatycki

Illumina (EHdn Collaboration): Egor Dolzhenko, Michael Eberle

Brain Research Institute, UoM: Sam Berkovic, Saul Mullen, Michael Hildebrand

University of Adelaide: Jozef Gecz, Mark Corbett, Thessa Kroes

EPI25 Consortium
FAME Consortium

exSTRa and other software available from: https://github.com/bahlolab

Identification of repeat expansions with whole exome and whole genome sequencing in epilepsy patients

• Bennett MF1,2, Rafehi H1,2, Hildebrand MS3, Oliver KL3, Corbett MA4,5, Gecz J4,5, Sadleir LG3, Scheffer IE3, Berkovic SF3, Epi25 Consortium, Bahlo M1,2

• Recent discoveries have expanded the list of repeat expansions (REs) that cause epilepsy to seven loci. Repeat expansions are found in two types of epilepsies: (i) Unverricht-Lundborg disease (ULD), which is caused by a recessive GC-rich dodecamer repeat expansion in the promoter region of CSTB, and (ii) Familial Adult Myoclonic Epilepsy (FAME), which is caused by dominantly inherited, fully penetrant, pentamer TTTCA repeat expansions in six genes (SAMD12, STARD7, MARCH6, YEATS2, TNRC6A and RAPGEF2). These repeat expansions cause myoclonic epilepsies, with FAME being a less severe epilepsy than ULD, which is a form of progressive myoclonus epilepsy.

• Our study of two new families, one Indian and one Sri Lankan, together with whole genome sequencing data from FAME2 and FAME3 families, demonstrates that (i) the FAME REs can be easily detected with standard short-read whole genome sequencing, and (ii) the FAME1 RE is not confined to China and Japan but is also present more widely in Asia. These REs show founder effects, with the most recent common ancestors for FAME1 and FAME3 estimated to have arisen ~17,000 and ~5,000 years ago respectively. The FAME REs are the likely result of mutations in retrotransposable elements, which are a possible indicator of the presence of other REs, since similar pentamer expansions have also recently been identified for three types of ataxia (SCA31, SCA37 and CANVAS).

• Epi25 is a large-scale whole exome sequencing project of individuals with generalised epilepsy, focal epilepsy, and epileptic encephalopathy. Using RE detection methods developed by us, we are analysing Phase 1 and 2 Epi25 data (~13,000 individuals). We have detected >50 recognised REs such as SCA8, as well as 27 individuals with evidence of the TTTCA expansion which causes FAME. One of these individuals has been confirmed as having the FAME2 expansion, although others may be benign, with the majority requiring confirmatory testing. Our results indicate that testing for these REs is feasible and important, even in epilepsy patients not recruited based on FAME/ULD clinical diagnoses. Whether these REs represent missed diagnoses or contribute to the genetic risk burden of common epilepsies remains to be determined.

Haplotype History

[Figure: families by country. Japan: 50 families; China: 19 families; India: 1 family; Sri Lanka: 1 family.]

Mutation Dating: https://shiny.wehi.edu.au/rafehi.h/mutation-dating/

[Figure: Time to most recent common ancestor, with 95% Confidence Interval. Axes: generations; years (25 years per generation).]

Abstract…

‣ Thank you for agreeing to speak at the Epilepsy Genetics Update virtual conference taking place September 11-13 during US Eastern Daylight Time. I apologize in advance for the length of this email, but it contains important information about the conference.

‣ By now, all of you should have been contacted by Hannah Nussbaum of GlobalCastMD to schedule a day and time to pre-record your presentation. GlobalCast is producing the conference for us. Attached are some simple tips for creating a successful recording.

‣ A few other things to keep in mind before you record:
  · Make sure you have a working camera and microphone (most laptops have these features built in)
  · Find a filming location that is well lit (avoid backlighting) with a minimally distracting background. Short YouTube video with lighting tips: https://www.youtube.com/watch?v=OWgmw_pFMrI
  · Have your presentation prepared and rehearsed: we will be filming both your slides and your presentation at this time
  · Dress to impress! We want to make sure you look your best as these videos will be streamed during the event and saved in a video library. Also, if possible, wear the same clothing for the live sessions as you wear for the recording, for continuity.

‣ The recordings will be available online for up to a year after the program.

‣ A Q&A session will be held at the end of each segment with all of the speakers. You will receive a Zoom link for the day/time to join the live program. You’ll be able to discuss the questions to be addressed with the moderators before the live Q&A begins. Your offline discussion will not be heard by the participants.

‣ Action Items by September 1st:
  · Please send a short biography (4-6 sentences) to be used for introduction before your presentation. DONE
  · The presentation slides will be made available to the participants through the online course syllabus. The slides will be saved as a secure PDF file. If you have unpublished data or proprietary information that you do not want the participants to receive, please remove those slides from your PowerPoint file prior to sending to me by September 1, 2020.

‣ Martha Tobin

Instructions

Tips to make your pre-recorded movies look their best for live streaming

Record your video slide presentations in 1080 30p video standard
• 1920 x 1080 pixels
• 30 frames per second
• Widescreen, 16:9 aspect ratio (not 4:3 or 16:10)

While we can broadcast movies of just about any format, following these guidelines will ensure your presentation appears crisp, legible, and in its entirety while it is being live streamed. Any deviation from these recommendations could result in unwanted blurriness, artifacts, or cropping.