

Source: bionanogenomics.com/.../2017/01/BeyondGenome-2014-WEB.pdf

Detection of Complex Rearrangements in Cancer Genomes: A Study on Multiple Myeloma

AWC Pang, H Dai, M Saghbini, EH Cho, A Hastie, X Yang, T Dickinson, A Nooka¹, JL Kaufman¹, SM Matulis¹, L Zhang¹, DF Saxe¹, LH Boise¹, KP Mann¹, DL Jaye¹, S Lonial¹, H Cao, MR Rossi¹

BioNano Genomics, San Diego, CA, USA; ¹Emory University, Atlanta, GA, USA

Abstract

Large DNA rearrangements are known to be associated with the initiation and progression of multiple myeloma. While karyotyping and FISH are routinely used for diagnosis and prognosis of this disease, microarrays and RNA and DNA sequencing are also used for variant discovery. In our experience, short-read sequencing has limited ability to detect translocation events reliably, and validating those calls by FISH or PCR is labor-intensive. Here we present the BioNano Genomics Irys System, which uses nanochannel technology to linearize long DNA molecules of hundreds of kilobases and high-resolution imaging for whole-genome mapping and de novo assembly. Leveraging ultra-long assembled genome maps, the platform is able to detect large DNA rearrangements such as translocations, amplifications and deletions. As a proof of concept, we ran the cell line KMS11 on the Irys System and detected known translocation events, such as t(4;14), as well as a gene-fusion event in this sample. In addition, we developed a pipeline that uses our long molecules to validate and potentially phase complex rearrangements detected by sequencing. By examining the molecule depth profile, we identified multiple gross copy number abnormalities. We then ran additional multiple myeloma clinical samples on the Irys platform and identified numerous rearrangements. Based on comparisons with variants found by SNP arrays, whole-genome sequencing and whole-transcriptome sequencing, we are confident that the BioNano Genomics Irys System will enable genome assembly finishing and rearrangement discovery, expand our view of genome architecture, and improve our understanding of the molecular mechanisms that drive hematological malignancies and solid tumors.

Background

Generating high-quality finished genomes, with accurate identification of structural variation and high completeness (minimal gaps), remains challenging using short-read sequencing technologies alone. Irys technology instead provides direct visualization of long DNA molecules in their native state, avoiding the statistical assumptions that are normally used to force sequence alignments of low-uniqueness elements. The resulting order and orientation of sequence elements are demonstrated here by anchoring NGS contigs and by detecting structural variation.

Methods

(1) Long DNA molecules are labeled with IrysPrep™ reagents by (2) incorporation of fluorophore-labeled nucleotides at a specific sequence motif throughout the genome. (3) The labeled genomic DNA is then linearized in the IrysChip™ nanochannels, and single molecules are imaged by Irys. (4) Single-molecule data are collected and detected automatically. (5) Each molecule's label pattern forms a signature that is uniquely identifiable and is used to assemble molecules into genome maps. (6) The maps can be used in a variety of downstream analyses with IrysView™ software.
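To make the labeling idea in steps (1), (2) and (5) concrete, the following minimal Python sketch computes where labels would fall on a known sequence and the distances between them. It is an illustration only, not the IrysPrep chemistry or the Irys software: the function names are hypothetical, and the motif GCTCTTC (the recognition sequence of the nickase Nt.BspQI) is an assumed example, since the poster does not name the motif used.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def label_positions(seq: str, motif: str) -> list[int]:
    """Return sorted 0-based start positions of the motif on either strand.

    Hypothetical helper: a motif on the reverse strand appears as its
    reverse complement on the forward strand, so both patterns are searched.
    """
    seq = seq.upper()
    motif = motif.upper()
    hits = set()
    for pattern in {motif, reverse_complement(motif)}:
        start = seq.find(pattern)
        while start != -1:
            hits.add(start)
            start = seq.find(pattern, start + 1)
    return sorted(hits)


def label_intervals(positions: list[int]) -> list[int]:
    """Distances between consecutive labels; this pattern is the 'signature'."""
    return [b - a for a, b in zip(positions, positions[1:])]


if __name__ == "__main__":
    toy_sequence = "AAGCTCTTCAAAAGAAGAGCTT"  # made-up sequence for illustration
    positions = label_positions(toy_sequence, "GCTCTTC")  # assumed motif
    print(positions, label_intervals(positions))
```

In practice the interval pattern, rather than the base sequence, is what gets compared between molecules and against a reference, which is why long molecules with many labels are easier to place uniquely.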

Conclusions

The Irys System uses nanochannel technology to image high-molecular-weight DNA for genome mapping of translocations, amplifications and deletions, and it accurately detected hallmark genetic mutations in multiple myeloma samples. We also demonstrate the use of long molecules to validate and phase breakpoints identified by sequencing. In addition, we are currently analyzing the broad range of structural variants called by Irys that are missed by short-read technologies and SNP arrays. These proof-of-concept data demonstrate that the Irys platform can reveal relevant mutations in complex genomes, filling the gap between cytogenetics and NGS/microarrays, and that it has broad applicability in genome research.


Translocation Profiles of the PE Genome

Translocations were detected in the PE genome. By aligning our long-range genome maps to the hg19 reference, we were able to detect high-confidence intra-chromosomal (blue) and inter-chromosomal (red) translocations. One example is translocation t(1;18), which was detected by a 3.85 Mb genome map. Note that multiple long molecules span the translocation junction.
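As a rough illustration of how a genome map that aligns in pieces to the reference can point to a translocation, the Python sketch below classifies the junction between consecutive aligned segments of one map. It is not the Irys or IrysView algorithm: the `Segment` record, the `classify_junctions` function and the 5 Mb distance threshold are hypothetical, and a real caller would also weigh orientation and alignment confidence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One aligned piece of a genome map (hypothetical record layout).

    Coordinates are in base pairs; ref_start <= ref_end is assumed.
    """
    map_start: int
    map_end: int
    chrom: str
    ref_start: int
    ref_end: int


def classify_junctions(segments, min_intra_distance=5_000_000):
    """Label junctions between consecutive segments along the map.

    Different chromosomes: inter-chromosomal candidate.
    Same chromosome but reference intervals at least min_intra_distance
    apart: intra-chromosomal candidate. The threshold is an assumption.
    """
    ordered = sorted(segments, key=lambda s: s.map_start)
    calls = []
    for left, right in zip(ordered, ordered[1:]):
        if left.chrom != right.chrom:
            calls.append(("inter-chromosomal", left.chrom, right.chrom))
            continue
        # Distance between the two reference intervals (<= 0 if they overlap).
        gap = max(left.ref_start, right.ref_start) - min(left.ref_end, right.ref_end)
        if gap >= min_intra_distance:
            calls.append(("intra-chromosomal", left.chrom, right.chrom))
    return calls


if __name__ == "__main__":
    # Made-up coordinates: one 3.85 Mb map split between chr1 and chr18.
    example_map = [
        Segment(0, 2_000_000, "chr1", 100_000_000, 102_000_000),
        Segment(2_000_000, 3_850_000, "chr18", 50_000_000, 51_850_000),
    ]
    print(classify_junctions(example_map))
```

Single molecules that span the junction, as noted above, are what distinguish a genuine rearrangement from a chimeric assembly, so a call like this would normally be checked against the molecule pileup.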

Rearranged Chromosome with Phase Breakpoints

1) IrysPrep Kit extraction of long DNA molecules

2) IrysPrep reagents label DNA at specific sequence motifs

3) IrysChip linearizes DNA in NanoChannels

4) Irys automates imaging of single molecules in NanoChannels

5) Molecules and labels detected in images by instrument software

6) IrysView software assembles genome maps

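Figure labels (labeling chemistry): nick site, displaced strand, polymerase, nickase recognition motif.

Figure labels (DNA conformation): free DNA in solution (Gaussian coil), DNA in a microchannel (partially elongated), DNA in a NanoChannel (linearized). Axis: Position (kb).

Figure labels (sample sources): blood, cells, tissue, microbes.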

Detecting Copy Number Aberrations

Chromosomal aberrations can also be detected by significant deviations in the number of molecule alignments compared to diploid regions. Here, we show a coverage profile of a breast cancer sample, which has significant amplifications and deletions throughout the genome. The green line represents the expected diploid level. This result shows another use of this technology for cancer research.
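The depth-based reasoning above can be sketched in a few lines of Python. This is a simplified illustration rather than the analysis behind the poster's coverage profile: the function name, the use of the median bin depth as the diploid baseline, and the log2 threshold of 0.4 are all assumptions, and the median only approximates the diploid level when most of the genome is copy-neutral.

```python
import math
from statistics import median


def copy_number_calls(bin_depths, log2_threshold=0.4):
    """Classify each genomic bin from its molecule alignment depth.

    bin_depths: number of aligned molecules per fixed-size bin.
    The median of the non-zero bins stands in for the diploid level
    (an assumption). Bins whose log2 ratio to that baseline exceeds the
    threshold are called amplified; those below its negative are called
    deleted; bins with zero depth are called deleted outright.
    """
    baseline = median(d for d in bin_depths if d > 0)
    calls = []
    for depth in bin_depths:
        if depth == 0:
            calls.append("deletion")
            continue
        log2_ratio = math.log2(depth / baseline)
        if log2_ratio >= log2_threshold:
            calls.append("amplification")
        elif log2_ratio <= -log2_threshold:
            calls.append("deletion")
        else:
            calls.append("neutral")
    return calls


if __name__ == "__main__":
    depths = [60, 62, 58, 90, 30, 61, 0]  # made-up depths for illustration
    print(copy_number_calls(depths))
```

For reference, a single-copy gain on a diploid background gives a log2 ratio of about 0.58 and a single-copy loss gives -1, so a threshold of 0.4 separates both from neutral bins in a pure sample; tumor samples mixed with normal cells would need a lower threshold.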

Single-molecule alignments to each of four fragments of chr12 confirm the structure of the derivative chromosome in this breast cancer sample.

Structural Variation in KMS11

Deletion at 11q21 (95.65-96 Mb) Caused MAML2-MTMR2 Fusion Transcript

The 360 kb deletion fuses the MAML2 and MTMR2 coding regions. This is confirmed by detection of the MAML2 exon 5-MTMR2 exon 16 fusion transcript in RNA-seq data.
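One way a deletion of this kind can show up in mapping data is as a pair of neighboring labels that sit much closer together on the genome map than on the reference. The Python sketch below illustrates that comparison under stated assumptions: `size_differences` is a hypothetical helper, the 5 kb minimum size is arbitrary, and the coordinates in the example are invented to produce a 360 kb difference, not taken from the KMS11 data.

```python
def size_differences(matched_labels, min_size=5_000):
    """Compare label spacing on the reference and on the genome map.

    matched_labels: (ref_pos, map_pos) pairs in base pairs, sorted along
    the alignment (hypothetical input format). A reference interval that
    is longer than the matching map interval suggests a deletion in the
    sample; a shorter one suggests an insertion.
    """
    calls = []
    for (ref0, map0), (ref1, map1) in zip(matched_labels, matched_labels[1:]):
        delta = (ref1 - ref0) - (map1 - map0)
        if delta >= min_size:
            calls.append(("deletion", ref0, ref1, delta))
        elif delta <= -min_size:
            calls.append(("insertion", ref0, ref1, -delta))
    return calls


if __name__ == "__main__":
    # Invented label pairs: the middle interval is 370 kb on the reference
    # but only 10 kb on the map, i.e. 360 kb is missing from the sample.
    labels = [
        (95_600_000, 100_000),
        (95_640_000, 140_000),
        (96_010_000, 150_000),
        (96_050_000, 190_000),
    ]
    print(size_differences(labels))
```

The breakpoints recovered this way are only as precise as the spacing of the flanking labels, which is why an orthogonal readout such as the RNA-seq fusion transcript is useful for pinning down the exact junction.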

Known KMS11 Translocation Detected by Genome Mapping

De Novo Assemblies of Cancer Genomes

The left panel shows the genome coverage by the de novo assemblies of the multiple myeloma cell line KMS11 and a multiple myeloma patient sample, called PE. Both assemblies were constructed using molecules > 150 kb. Grey loci represent N-base gaps in the hg19 reference.

Translocation t(1;18) Detected

Figure labels (t(1;18) panel): chr18, chr1, gap, genome map (aligned to chr18 and to chr1), molecule pileup.

Our data show that KMS11 is positive for t(4;14), a known variant in this cell line. FGFR3 can be dysregulated in myeloma by this translocation, which brings the gene into the vicinity of the IGH enhancers. The deletion in the IGH locus could be incidental to KMS11 or simply a common variant in the human population.

Figure labels (t(4;14) panel): chr4, chr14, derivative chr14, genome map, FGFR3, IGH, del, gap.

Figure labels (11q21 deletion panel): chr11, genome map, 360 kb deletion, MAML2 exon 5, MTMR2 exon 16.

Figure panels (rearranged chromosome): Potential derivative chromosome structure based on NGS; Validation of breakpoints by single molecule mapping; Phasing of breakpoints by single molecules.

Segment labels: 66,107,695-66,996,035 (- strand); 66,107,695-66,996,035 (+ strand); 112,763,376-112,822,660 (+ strand); 112,763,376-112,824,194 (- strand); 68.313-66.996M; 112.754-112.755M; 112.763-112.822M; 66.996-66.107M; 112.824-112.763M; 66.107-66.996M; 66.996-68.315M.

                                    KMS11      PE
Total Assembled Contig Length       2.62 Gb    2.80 Gb
Contig N50                          1.01 Mb    1.043 Mb
% hg19 Overlapping BNG Assemblies   86%        88.3%
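For readers unfamiliar with the N50 statistic reported in the table, the short Python sketch below shows the standard definition: the length of the contig at which the running total, taken from the longest contig downward, first reaches half of the total assembled length. The function name and the toy lengths are illustrative; the values in the table come from the poster, not from this code.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or genome map) lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the cumulative length first reaches at least
    half of the total assembled length.
    """
    if not lengths:
        raise ValueError("n50 requires at least one length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]  # arbitrary units, made up
    print(n50(toy_lengths))
```

Because N50 is weighted by length, a few very long maps can raise it substantially, so it is best read alongside the total assembled length and the fraction of the reference covered, as in the table.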