
module 4 McPherson 2012 - files.bioinformatics.ca

12-05-29

Canadian Bioinformatics Workshops

www.bioinformatics.ca


Module 4 Mapping and Genome Rearrangement

John McPherson, Ph.D.

[Figure: a DNA fragment with paired-end reads (ATCAA ... CTAAG)]

Platform complexity

Moving away from a "one-size-fits-all" platform

Cross-platform data integration needed.

[Figure: sequencing platforms arranged by increasing run time, increasing data per run, and cost ($). Throughput labels: 700Mb/23h, 150Mb/3h, 100Mb/1h, 2Gb/27h, 100Gb/15d, 90Gb/10d, 600Gb/10d, 14TB/run, 120Gb/1d. Proton? GridION?]


[Figure: sequencing platforms grouped by template (Single Molecule vs. Amplified Template) and by read length (Long Read vs. Short Read)]

Basecalling

•  How do we translate the machine data to base calls?
•  How do we estimate and represent sequencing errors?


Spectral overlap

http://www.olympusfluoview.com/theory/bleedthrough.html


Sources of error

Illumina: Pre-phasing & Phasing

What is a base quality?

Base Quality    Perror(obs. base)
3               50 %
5               32 %
10              10 %
20              1 %
30              0.1 %
40              0.01 %

PHRED values:
-  Sequence a known template and align reads
-  Determine error rate
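The table above follows the standard PHRED relationship Q = -10 log10(P). A minimal sketch of the conversion in both directions (the function names are illustrative, not from the slides):

```python
import math


def phred_to_error(q: float) -> float:
    """Error probability for a PHRED quality Q: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_to_phred(p: float) -> float:
    """PHRED quality for an error probability P: Q = -10 * log10(P)."""
    return -10 * math.log10(p)


# Reproduce the rows of the base quality table.
for q in (3, 5, 10, 20, 30, 40):
    print(f"Q{q}: {phred_to_error(q) * 100:.2f} %")
```

Q3 and Q5 come out as roughly 50 % and 32 %, which is why the table lists rounded values for the low qualities.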


Calibrating Base Qualities

[Figure: "Original" vs. recalibrated base qualities. Mark DePristo, Broad Institute, June 2009]

Error Profiles

Roche 454
•  error rate is low (< 0.5%)
•  most errors are INDELs

Illumina
•  error rate is also low
•  most errors are substitutions

[Figure: error rate over all bases: correct 99.5%, error 0.5%. Error breakdown chart with values 72%, 24%, 4% and legend: insertions, deletions, substitutions]

Slides by M. Stromberg


Mismatch by cycle

[Figure: mismatches plotted by sequencing cycle]

Fasta files

ASF-1.fa ASF-2.fa

•  Reads are often stored in fasta files
•  Separate file for forward and reverse pairs
•  header line -- read name/pairing info
•  sequence line -- nucleotides


Fastq files

ASF-1.fastq ASF-2.fastq

•  header line: @SEQUENCE_ID
•  sequence line
•  line beginning with +
•  encoded quality value line

•  Most reads are stored in fastq
•  4 lines per read
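The four-line record layout can be read with a few lines of Python. This is a minimal sketch, assuming unwrapped records (exactly 4 lines per read) and a Phred+33 quality encoding; older Illumina files use an offset of 64, so `offset` is left as a parameter. `FastqRecord` and `read_fastq` are illustrative names.

```python
import io
from typing import Iterator, List, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    name: str
    sequence: str
    qualities: List[int]


def read_fastq(handle: TextIO, offset: int = 33) -> Iterator[FastqRecord]:
    """Yield one record per 4 lines: @header, sequence, +, encoded qualities."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file
        sequence = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        # Each quality character encodes one PHRED value as chr(Q + offset).
        yield FastqRecord(header[1:], sequence, [ord(c) - offset for c in quality])


# A made-up single read, for illustration only.
example = io.StringIO("@read1/1\nACGT\n+\nII5#\n")
records = list(read_fastq(example))
print(records[0])
```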

Alignment

•  Reference-based alignment:
•  Goal: find position in genome from which read was sampled
   –  Comparison is to the human reference genome (e.g. HG19)
•  Can't we use BLAT or BLAST?
   –  optimized for long reads
   –  slow
•  Things to consider:
   –  support for your technology
   –  speed / sensitivity
   –  parallelism
   –  tolerance for gapped alignment
   –  handling of multiple good mappings


Feature columns: Illumina, AB SOLiD, Roche 454, Helicos, gapped, all alignments, multithreaded

Bowtie           X X X X
BWA              X X X X
BFAST            X X X X X X X
Corona Lite      X X
ELAND            X
GenomeMapper     X X X X
gnumap           X X X X
karma            X X X *
MAQ              X X
MOSAIK           X X X X X X X
MrFAST           X X X
MrsFAST          X X
Novoalign        X X X *
RMAP             X X
SeqMap           X X X
SHRiMP           X X X X X X
Slider           X X
SOAP2            X X X
SSAHA2           X X X X
SOCS             X X
SXOligoSearch    X X X X
Zoom             X X * X

Slides by M. Stromberg

Performance

Illumina 37 bp (human genome)

program      aligned reads/s
Karma        15,635
Bowtie       13,889
SOAP         13,580
BWA          10,314
ELAND2       8,859
MOSAIK       6,792
Srprism      2,768
BFAST        1,125
Novoalign    1,095

Slides by M. Stromberg


Genome Mapping

De novo assembly


Reference alignment

[Figure: reference alignments. A sequence read is placed against the reference genome ("?"); one candidate placement is marked "x x x", another "x x x x" with a "?"]


Alignment Quality

[Figure: actual alignment quality (0 to 70) plotted against assigned alignment quality (0 to 60); series: optimal, MOSAIK using PE AQs]

alignment errors due to heuristic algorithm

probability that the best hit is wrong

Slides by M. Stromberg


INDEL Cleaning

"You like tomato and I like tomahto"
                                        George Gershwin

•  NGS010 BRCA1 deletion variant

Reference    …CGCTTTAATTTATTTGTG…
Variant      …CGCTTTATTTGTG…

…CGC-----TTTATTTGTG…    c.1500_1504delTTTAA
…CGCTTTA-----TTTGTG…    c.1504_1508delATTTA

Calls compared: CAP/CLIA Sanger sequence, NGS pipeline
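Both descriptions of the BRCA1 deletion produce the same variant sequence, because the deleted bases sit in a short repeat. A small sketch that enumerates every placement of the deletion yielding the observed variant (positions are 0-based offsets into the 18-base snippet from the slide, not cDNA coordinates; the function name is illustrative):

```python
from typing import List, Tuple


def equivalent_deletions(reference: str, variant: str) -> List[Tuple[int, str]]:
    """Return every (start, deleted_bases) whose removal turns reference into variant."""
    size = len(reference) - len(variant)
    if size <= 0:
        raise ValueError("variant must be shorter than reference")
    placements = []
    for start in range(len(reference) - size + 1):
        if reference[:start] + reference[start + size:] == variant:
            placements.append((start, reference[start:start + size]))
    return placements


REFERENCE = "CGCTTTAATTTATTTGTG"
VARIANT = "CGCTTTATTTGTG"

placements = equivalent_deletions(REFERENCE, VARIANT)
for start, deleted in placements:
    print(start, deleted)
```

The first placement deletes TTTAA and the last deletes ATTTA, four bases apart, matching c.1500_1504delTTTAA and c.1504_1508delATTTA. Comparing calls between pipelines therefore needs a shared normalization convention.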


De novo assembly

[Figure sequence: de novo assembly of reads; labels: "Read from a repeat", "Long Reads"]

What are Paired Reads?

[Figure: a DNA fragment with paired-end reads (ATCAA ... CTAAG); the fragment length is labelled insert size (IS)]

Slides by M. Brudno


SAM/BAM

•  SAM = text, BAM = binary

SRR013667.1 99 19 8882171 60 76M = 8882214 119 NCCAGCAGCCATAACTGGAATGGGAAATAAACACTATGTTCAAAGCAGAGAAAATAGGAGTGTGCAATAGACTTAT #>A@BABAAAAADDEGCEFDHDEDBCFDBCDBCBDCEACB>AC@CDB@>>CB?>BA:D?9>8AB685C26091:77

Fields labelled on the slide: Read name, Flag, Reference, Position, CIGAR, Mate Position, Bases, Base Qualities
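The example record can be split into the 11 mandatory SAM columns. This is a minimal sketch: real SAM files are tab-delimited, but the slide text lost its tabs, so the code splits on any whitespace (an assumption that holds only because no mandatory field contains spaces). `SamRecord`, `parse_sam_line`, and `decode_flag` are illustrative names; the flag bit meanings follow the SAM specification.

```python
from typing import List, NamedTuple


class SamRecord(NamedTuple):
    qname: str   # read name
    flag: int    # bitwise flag
    rname: str   # reference name
    pos: int     # 1-based leftmost position
    mapq: int    # mapping quality
    cigar: str
    rnext: str   # mate reference ("=" means same as rname)
    pnext: int   # mate position
    tlen: int    # observed template length
    seq: str     # bases
    qual: str    # encoded base qualities


FLAG_BITS = [
    (0x1, "paired"), (0x2, "proper_pair"), (0x4, "unmapped"),
    (0x8, "mate_unmapped"), (0x10, "reverse"), (0x20, "mate_reverse"),
    (0x40, "first_in_pair"), (0x80, "second_in_pair"),
]


def parse_sam_line(line: str) -> SamRecord:
    """Parse the 11 mandatory columns; optional tags after them are ignored."""
    f = line.split()
    return SamRecord(f[0], int(f[1]), f[2], int(f[3]), int(f[4]), f[5],
                     f[6], int(f[7]), int(f[8]), f[9], f[10])


def decode_flag(flag: int) -> List[str]:
    """List the names of the low flag bits that are set."""
    return [name for bit, name in FLAG_BITS if flag & bit]


LINE = ("SRR013667.1 99 19 8882171 60 76M = 8882214 119 "
        "NCCAGCAGCCATAACTGGAATGGGAAATAAACACTATGTTCAAAGCAGAGAAAATAGGAGTGTGCAATAGACTTAT "
        "#>A@BABAAAAADDEGCEFDHDEDBCFDBCDBCBDCEACB>AC@CDB@>>CB?>BA:D?9>8AB685C26091:77")

record = parse_sam_line(LINE)
print(record.qname, record.pos, record.cigar, decode_flag(record.flag))
```

Flag 99 decodes to a properly paired first read whose mate is on the reverse strand, and the template length 119 equals 8882214 + 76 - 8882171.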

CIGAR

MD:3^A03C5
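A CIGAR string is a run of length/operation pairs. The sketch below parses one and computes how many reference and read bases it covers, using the operation letters from the SAM specification; the function names and the second example string are illustrative, not from the slides.

```python
import re
from typing import List, Tuple

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar: str) -> List[Tuple[int, str]]:
    """Split a CIGAR string such as '76M' into (length, operation) pairs."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Rebuilding the string catches characters the regex silently skipped.
    if "".join(f"{n}{op}" for n, op in ops) != cigar:
        raise ValueError(f"invalid CIGAR: {cigar!r}")
    return ops


def reference_span(cigar: str) -> int:
    """Reference bases covered: M, D, N, = and X consume the reference."""
    return sum(n for n, op in parse_cigar(cigar) if op in "MDN=X")


def query_length(cigar: str) -> int:
    """Read bases covered: M, I, S, = and X consume the read."""
    return sum(n for n, op in parse_cigar(cigar) if op in "MIS=X")


print(parse_cigar("76M"), reference_span("30M2D46M"), query_length("30M2D46M"))
```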


samtools

•  used for low level processing of BAM/SAM files
•  convert between BAM and SAM
•  sort alignments by position
•  create index for sorted BAM file

What kinds of variation are there?

•  Single Nucleotide Polymorphisms (SNPs)
•  Short indels (< read length)
•  Structural variations
   –  Large scale insertions and deletions
   –  Inversions
   –  Translocations
   –  Copy number variation


Structural variants

Mate-pair and paired-end reads can be used to detect structural variants

Genomic DNA

Mate-Pairs (1 - 20kb)
•  Fragmentation & circularization to an internal adaptor
•  Shear
•  Isolate internal adaptors and fragment ends
•  Add amplification and sequencing adaptors
•  Sequence

Paired-Ends (200 – 500bp)
•  Fragmentation
•  Add amplification and sequencing adaptors
•  Sequence

Clusters of aberrantly aligned read pairs

Mapping of read pairs to reference
•  Spanning unexpected distance
•  Unexpected orientation

Detection of:
•  Deletions
•  Insertions
•  Translocations
•  Inversions

[Figure: histogram of fragment number against fragment size; read-pair diagrams (Map vs. Seq, Build35) for concordant, insertion, deletion, inversion, and translocation (ChrA, ChrB) pairs]


Insertion: signature

[Figure: a matepair from the donor (don) mapped to the reference (ref), comparing insert size with mapped distance]

Mapped distance < IS - 2 s.d.

Size of insertion = Insert size - Mapped distance

Slides by M. Brudno
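The signature above can be written as a single test per matepair. A minimal sketch, assuming the library's mean insert size and standard deviation have already been estimated from concordant pairs; the function and parameter names are illustrative:

```python
from typing import Optional


def insertion_size(mapped_distance: float,
                   mean_insert_size: float,
                   insert_size_sd: float,
                   n_sd: float = 2.0) -> Optional[float]:
    """Return the implied insertion size, or None if the pair looks concordant.

    Signature from the slide: mapped distance < IS - 2 s.d.
    Size of insertion = insert size - mapped distance.
    """
    if mapped_distance < mean_insert_size - n_sd * insert_size_sd:
        return mean_insert_size - mapped_distance
    return None


# Hypothetical library: mean insert size 300 bp, s.d. 30 bp.
print(insertion_size(150, 300, 30))  # pair maps too close together
print(insertion_size(280, 300, 30))  # within 2 s.d., no signature
```

A single pair gives only a noisy estimate, which is why the following slides require several consistent matepairs.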

Insertion: consistency

1.  Overlap
2.  Size of insertion explained by X = Size of insertion explained by Y

[Figure: matepairs X and Y shown on the donor (don) and the reference (ref)]

Slides by M. Brudno


Insertion: narrowing down the location

[Figure: matepairs on the donor (don) and reference (ref), with the possible location of the insertion marked]

•  Insertion lies within spanning region of matepair
•  For clusters, it lies within the intersection

Slides by M. Brudno
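For a cluster, the candidate region is the intersection of the spanning regions of its matepairs. A minimal sketch, assuming each spanning region is a half-open (start, end) interval on the same chromosome; the coordinates are made up:

```python
from typing import List, Optional, Tuple

Interval = Tuple[int, int]  # half-open [start, end)


def intersect_spans(spans: List[Interval]) -> Optional[Interval]:
    """Intersection of all spanning regions, or None if they share no base."""
    if not spans:
        return None
    start = max(s for s, _ in spans)
    end = min(e for _, e in spans)
    return (start, end) if start < end else None


cluster = [(100, 400), (150, 450), (120, 380)]
print(intersect_spans(cluster))
```

Each extra matepair can only shrink the interval, so larger clusters localize the insertion more tightly.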

SV summary

Type: Mapped Distance / Orientation

Insertion: too small / correct
Deletion: too big / correct
Inversion: *
Tandem duplication: *
Interchromosomal: different chromosomes / N/A

Slides by M. Brudno
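The signatures can be combined into a per-pair classifier. This is a sketch under stated assumptions, not the method of any particular tool: it uses the insertion signature given earlier (mapped distance < IS - 2 s.d.) and its mirror image for deletions, assumes a paired-end library where concordant pairs map forward-reverse, and treats same-strand pairs as inversion evidence and reverse-forward pairs as tandem duplication evidence (the slide marks both only with "*"). All names are illustrative.

```python
from typing import NamedTuple


class ReadPair(NamedTuple):
    chrom1: str
    pos1: int     # outer coordinate of the first read
    strand1: str  # "+" or "-"
    chrom2: str
    pos2: int     # outer coordinate of the second read
    strand2: str


def classify_pair(pair: ReadPair, mean_is: float, sd: float, n_sd: float = 2.0) -> str:
    """Label one read pair by the structural variant signature it supports."""
    if pair.chrom1 != pair.chrom2:
        return "interchromosomal"
    # Order the two ends by position so orientation is read left to right.
    (left_pos, left_strand), (right_pos, right_strand) = sorted(
        [(pair.pos1, pair.strand1), (pair.pos2, pair.strand2)]
    )
    if left_strand == right_strand:
        return "inversion"            # assumption: same-strand pair
    if left_strand == "-" and right_strand == "+":
        return "tandem_duplication"   # assumption: everted pair
    distance = right_pos - left_pos
    if distance < mean_is - n_sd * sd:
        return "insertion"            # donor has extra sequence between the reads
    if distance > mean_is + n_sd * sd:
        return "deletion"             # donor lacks sequence present in the reference
    return "concordant"


print(classify_pair(ReadPair("chr1", 1000, "+", "chr1", 2000, "-"), 300, 30))
```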


Where can we go wrong: missed insertion

[Figure: donor (don) and reference (ref) with the insert size (IS) marked]

Insertions larger than IS cannot be detected with basic method

Somatic vs. Germline

•  tumor vs. normal sequencing
•  approach 1:
   –  find SVs separately in two samples
   –  filter out somatic SVs that overlap germline SVs
•  approach 2:
   –  find somatic SVs
   –  for each somatic SV, find any type of evidence in germline
      •  a single discordant but non-consistent matepair?
   –  filter out anything with evidence

Slides by M. Brudno
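Approach 1 reduces to an interval-overlap filter. A minimal sketch, assuming SV calls are half-open intervals on a named chromosome and that any overlap with a germline call is enough to discard a tumor call; the `SV` type and the coordinates are hypothetical:

```python
from typing import List, NamedTuple


class SV(NamedTuple):
    chrom: str
    start: int  # half-open interval [start, end)
    end: int
    kind: str


def overlaps(a: SV, b: SV) -> bool:
    """True if the two calls share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def somatic_candidates(tumor: List[SV], normal: List[SV]) -> List[SV]:
    """Keep tumor SVs that do not overlap any SV called in the matched normal."""
    return [sv for sv in tumor if not any(overlaps(sv, g) for g in normal)]


tumor_calls = [SV("chr1", 1000, 2000, "deletion"), SV("chr2", 500, 900, "insertion")]
normal_calls = [SV("chr1", 1500, 2500, "deletion")]
print(somatic_candidates(tumor_calls, normal_calls))
```

Approach 2 is stricter: it would also discard tumor calls supported by even one discordant matepair in the normal sample, which this sketch does not model.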


Variant detection - distinguishing novel variants from errors

Reference: ACGT …

•  Germline variants: 50% Ref : 50% Var; 40% for : 60% rev
•  Somatic mutations: 80% Ref : 20% Var
•  Strand bias
•  Misaligned reads - not consistent, so sometimes seen as somatic
•  PCR artifact
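The ratios on these slides are the variant allele fraction and the strand balance of the supporting reads. A minimal sketch of both quantities; the 10% strand-bias cutoff is a hypothetical threshold chosen for illustration, not a value from the slides:

```python
def variant_allele_fraction(ref_reads: int, var_reads: int) -> float:
    """Fraction of reads at a site that carry the variant allele."""
    return var_reads / (ref_reads + var_reads)


def forward_fraction(forward_reads: int, reverse_reads: int) -> float:
    """Fraction of variant-supporting reads on the forward strand."""
    return forward_reads / (forward_reads + reverse_reads)


def has_strand_bias(forward_reads: int, reverse_reads: int,
                    min_fraction: float = 0.1) -> bool:
    """Flag a site whose variant reads come almost entirely from one strand."""
    fraction = forward_fraction(forward_reads, reverse_reads)
    return fraction < min_fraction or fraction > 1 - min_fraction


print(variant_allele_fraction(50, 50))  # heterozygous germline pattern
print(variant_allele_fraction(80, 20))  # lower fraction, as in the somatic example
print(has_strand_bias(4, 6), has_strand_bias(10, 0))
```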

Structural Variants and Split Reads

[Figure: paired short reads aligned to the reference]

Most of these pairs can be aligned to the reference genome.

For some paired-end reads, one read of the pair may not be mapped because it spans the breakpoint of a structural variant. We call such reads split reads.

Slides by M. Brudno


Split read signatures

[Figure: split-read signatures on the donor (don) and reference (ref) for a deletion, an insertion, and two further cases]

Slides by M. Brudno

Pair informed split mapping

[Figure: a deletion on the reference (ref); the split read maps partly to reference region 1 and partly to reference region 2]

•  searching the whole genome for split mappings
   –  gives a lot of false mappings
   –  too slow
•  can exactly estimate breakpoint and indel sizes
•  can detect very small deletions

Slides by M. Brudno
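Restricting the search to a window near the mapped mate makes split mapping tractable. A toy sketch of the idea for deletions, assuming exact matching only (real tools tolerate mismatches) and a reference window already selected from the mate's position; the sequences and names are made up:

```python
from typing import Optional, Tuple


def split_map_deletion(read: str, ref_window: str,
                       min_anchor: int = 8) -> Optional[Tuple[int, int]]:
    """Find a deletion that explains an unmapped read.

    Tries each split point; the prefix must match the window exactly and the
    suffix must match further downstream. Returns the deleted interval as
    half-open (start, end) offsets into ref_window, or None.
    """
    for k in range(min_anchor, len(read) - min_anchor + 1):
        prefix, suffix = read[:k], read[k:]
        i = ref_window.find(prefix)
        if i < 0:
            continue
        j = ref_window.find(suffix, i + k)
        if j > i + k:  # a gap between the two anchors means deleted bases
            return (i + k, j)
    return None


LEFT, DELETED, RIGHT = "GATTACAGGCT", "TTTTTCCCCC", "AGCATCGGACT"
WINDOW = LEFT + DELETED + RIGHT
READ = LEFT[-9:] + RIGHT[:9]  # read spanning the breakpoint in the donor

breakpoint_interval = split_map_deletion(READ, WINDOW)
print(breakpoint_interval)
```

Because the two anchors are matched base by base, the breakpoint and deletion size come out exactly, which paired-distance signatures alone cannot provide.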


SV Software and Exercise

•  There are many:
•  GASV (will use this today)
   –  http://compbio.cs.brown.edu/software.html
   –  S. Sindi, E. Helman, A. Bashir, B.J. Raphael. (2009) A Geometric Approach for Classification and Comparison of Structural Variants. Bioinformatics. 25: i222-i230
•  Breakdancer
   –  http://breakdancer.sourceforge.net/
   –  http://www.nature.com/nmeth/journal/v6/n9/abs/nmeth.1363.html

Gene fusions

•  if a linking signature connects two genes, this might indicate a gene fusion

[Figure: Gene X on ChrA and Gene Y on ChrB joined into a Gene XY fusion, producing a Gene XY protein]
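Checking whether a linking signature connects two genes is an annotation lookup on both ends of the pair. A minimal sketch with hypothetical gene coordinates; `Gene`, `gene_at`, and `fusion_candidate` are illustrative names, and a real annotation would use an interval index rather than a linear scan:

```python
from typing import List, NamedTuple, Optional, Tuple


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int  # half-open interval [start, end)
    end: int


def gene_at(chrom: str, pos: int, genes: List[Gene]) -> Optional[str]:
    """Name of the first gene containing the position, if any."""
    for gene in genes:
        if gene.chrom == chrom and gene.start <= pos < gene.end:
            return gene.name
    return None


def fusion_candidate(end1: Tuple[str, int], end2: Tuple[str, int],
                     genes: List[Gene]) -> Optional[Tuple[str, str]]:
    """Return the two gene names if the pair's ends fall in different genes."""
    gene1 = gene_at(end1[0], end1[1], genes)
    gene2 = gene_at(end2[0], end2[1], genes)
    if gene1 and gene2 and gene1 != gene2:
        return (gene1, gene2)
    return None


# Hypothetical annotation for illustration only.
GENES = [Gene("GeneX", "ChrA", 1000, 5000), Gene("GeneY", "ChrB", 20000, 26000)]
print(fusion_candidate(("ChrA", 4200), ("ChrB", 21000), GENES))
```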


Things we have set up:

•  Loaded data files to an S3 bucket
•  We brought up an Ubuntu (Linux) instance, and loaded a whole bunch of software for NGS analysis.
•  We then cloned this, and made separate instances for everybody in the class.
•  We've simplified the security: you basically all have the same login and file access, and opened ports. In your own world you would be more secure.

All on Wiki!

http://bioinformatics.ca/workshop_wiki/

Login: FirstnameLastname
Password: guest


On Mac: Control+


# is your assigned student number

Ask your question, and then gather the data, the tools and hardware you need

•  Data and Databases: you will take workshops, you will read papers, and you will go on-line: SeqAnswers & maybe the bioinformatics.ca Links Directory
•  Tools: you will take workshops, you will read papers, and you will go on-line: SeqAnswers & maybe the bioinformatics.ca Links Directory
•  Hardware: you need to decide?


We are now going to start an exercise in mapping and structural variant detection.

We are on a Coffee Break & Networking Session

Cryptic Fusion Oncogene

•  Welch et al. JAMA 305; 1577-1584
   –  April 20, 2011
•  Acute Promyelocytic Leukemia (APL)
   –  >90% associated with gene fusion PML-RARA
   –  Rapid diagnosis is essential as adding all-trans retinoic acid to chemotherapy leads to substantially improved outcome (5yr event-free-survival of 69% compared with 29% with chemotherapy alone)

 

•  39 year old patient diagnosed with acute myeloid leukemia (AML) in first remission referred for allogeneic stem cell transplantation.
•  Cytogenetics indicated a poor prognosis and absence of a PML-RARA fusion.
•  Leukemic cytomorphology consistent with APL
•  Course of treatment uncertain: APL or AML with poor prognosis?


•  Whole genome sequencing revealed a 77kb insertion from chromosome 15 into the second intron of the RARA gene on chromosome 17 resulting in a classic PML-RARA fusion.
•  7 week turnaround; ~$40,000
•  Patient received ATRA and is in remission at 15 months.