NEXT GENERATION SEQUENCING (BIG DATA), IT AND THE CLINICAL PRACTICE
Che Martin Ph.D | PMP
Katrina Fox-O’Malley
Bulent Oral
BIG DATA …WHAT IS IT REALLY?
Has been made more popular due to advances in computing and technology
Data that is:
High volume
Variable
Complex
Can be mined to extract valuable information
MODERN APPLICATIONS-INFORMATION EQUALS IMPROVED EFFICIENCY & PROFIT
Google: search engine and company/product-to-client matching
UPS: the ORION program uses data such as speed, direction, performance and other operations to optimize their performance.
Clinical and research
Research applications: identification of new regulatory elements in pathogens; identification of new drug targets
Clinical applications: precision medicine
Sources: Thomas H. Davenport and Jill Dyche, "Big Data in Big Companies," May 2013; McKinsey Global Institute, "Big data: The next frontier for innovation, competition, and productivity"
NEXT GENERATION SEQUENCING AND PRECISION MEDICINE
Precision medicine
Match treatment and diagnosis to a person's molecular profile.
Advances in molecular biology, genomics and other technologies allow:
The molecular/genetic characterization of a patient's cancer
In some cases, applying these results to treatment strategies that target the molecular basis of the particular patient's cancer
Involves diagnostic tests, one of which is NGS, to obtain molecular information about the cancer.
NGS STEPS -BIG DATA- HOW DOES IT WORK
Samples are subjected to targeted sequencing of known cancer genes that are well characterized as mutational hot spots.
The sequencer produces reads of sequence data: ~1.5 GB of wells data and ~0.1 GB of FASTQ files (which contain the reads) per sample.
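To make the "FASTQ files contain reads" point concrete, here is a minimal sketch of reading FASTQ records in Python. It assumes the common four-line record layout and Phred+33 quality encoding; the `Read` type, the `parse_fastq` name and the two example reads are illustrative and not taken from the slides.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class Read(NamedTuple):
    name: str
    sequence: str
    qualities: list[int]


def parse_fastq(handle: TextIO, offset: int = 33) -> Iterator[Read]:
    """Yield one Read per four-line FASTQ record (assumes Phred+33 encoding)."""
    while True:
        header = handle.readline().strip()
        if not header:
            break  # end of file (or a blank line) ends the sketch
        sequence = handle.readline().strip()
        handle.readline()  # '+' separator line, ignored
        quality = handle.readline().strip()
        yield Read(header[1:], sequence, [ord(c) - offset for c in quality])


# Two made-up reads, only to show the record layout.
example = "@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\n!!5I\n"
reads = list(parse_fastq(io.StringIO(example)))
```

Each quality character maps to one integer score per base, which is what the later quality check and clipping step works on.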
Source : https://rdp.cme.msu.edu/tutorials/init_process/RDPtutorial_INITIAL-PROCESS.html
NGS STEPS -BIG DATA- HOW DOES IT WORK
Reads are quality checked and clipped.
Clipped reads are:
Aligned to a reference genome via one of many algorithms.
Converted to produce a BAM file, ~0.3 GB per sample.
Saved for visualization as part of downstream analysis.
The BAM file is processed by variant-calling software (many different algorithms).
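The clipping step above can be sketched as a simple 3' quality trim. This is an assumption about what "clipped" means here: production pipelines use more elaborate trimming algorithms, and the threshold of 20 and the `clip_read` name are illustrative choices, not values from the slides.

```python
def clip_read(
    sequence: str, qualities: list[int], min_quality: int = 20
) -> tuple[str, list[int]]:
    """Drop trailing bases whose quality score is below min_quality."""
    end = len(sequence)
    while end > 0 and qualities[end - 1] < min_quality:
        end -= 1
    return sequence[:end], qualities[:end]


# Made-up read whose last two bases are low quality.
clipped_seq, clipped_qual = clip_read("ACGTAC", [38, 37, 35, 30, 12, 8])
```

Only the clipped sequence would then be passed on to alignment.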
NGS STEPS -BIG DATA- HOW DOES IT WORK
Variants, i.e. genetic differences from the reference genome (the “normal expectation”), are identified and confirmed via visualization.
Information is saved in VCF files.
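As a rough illustration of what a VCF file holds, the sketch below parses the eight fixed, tab-separated VCF columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO) into dictionaries. The example record, including its coordinates, depth and allele fraction, is made up for illustration and does not come from the slides; `parse_vcf` is a hypothetical helper name.

```python
def parse_vcf(text: str) -> list[dict]:
    """Parse VCF data lines into dicts, skipping '#' header lines."""
    records = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        chrom, pos, var_id, ref, alt, qual, flt, info = line.split("\t")[:8]
        info_fields = {}
        for item in info.split(";"):
            key, _, value = item.partition("=")
            info_fields[key] = value if value else True  # flag entries have no value
        records.append({
            "chrom": chrom,
            "pos": int(pos),
            "id": var_id,
            "ref": ref,
            "alt": alt.split(","),
            "qual": qual,
            "filter": flt,
            "info": info_fields,
        })
    return records


# Illustrative content only; the values are assumptions, not slide data.
example_vcf = (
    "##fileformat=VCFv4.1\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr12\t25398285\tCOSM516\tC\tA\t100\tPASS\tDP=1000;AF=0.25\n"
)
vcf_records = parse_vcf(example_vcf)
```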
Source : http://www.sustc-genome.org.cn/pgi/documentation.php
NGS STEPS -BIG DATA- HOW DOES IT WORK
Bioinformaticians:
Design pipelines to parse and annotate identified variants with information required for the clinical workflow:
Specific mutation (peptide)
Identifying relevant clinical trials
Identifying published references
Design verification pipelines (in some cases)
Design an infrastructure and pipelines to format this data to be received by clinical LIMS software.
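The annotate-and-format step could look roughly like the following. It is a sketch under stated assumptions: the annotation table is a hard-coded stand-in for whatever knowledge base a real pipeline queries, the JSON field names and the `NGS-0001` alias are hypothetical, and no actual LIMS interface is implied. The two variants are the ones shown in the example report later in the deck.

```python
import json

# Hypothetical lookup table standing in for an annotation source.
ANNOTATIONS = {
    ("KRAS", "c.34G>T"): {"protein_change": "p.G12C", "cosmic_id": "COSM516"},
    ("TP53", "c.830G>T"): {"protein_change": "p.C277F", "cosmic_id": "COSM10749"},
}


def annotate(variant: dict) -> dict:
    """Attach the peptide change and COSMIC ID when the variant is known."""
    extra = ANNOTATIONS.get((variant["gene"], variant["cdna_change"]), {})
    return {**variant, **extra, "annotated": bool(extra)}


def build_lims_payload(sample_alias: str, variants: list[dict]) -> str:
    """Serialize annotated variants as JSON for a downstream system."""
    payload = {
        "sample_alias": sample_alias,
        "variants": [annotate(v) for v in variants],
    }
    return json.dumps(payload, indent=2, sort_keys=True)


lims_json = build_lims_payload(
    "NGS-0001",  # hypothetical alias
    [
        {"gene": "KRAS", "cdna_change": "c.34G>T"},
        {"gene": "EGFR", "cdna_change": "c.0A>T"},  # deliberately unknown placeholder
    ],
)
```

Variants that are missing from the lookup table are passed through with `annotated` set to false, so they can be routed to manual review rather than silently dropped.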
[Diagram labels: Patient Registration, ADT, Orders, Electronic Medical Record, Laboratory Information System, Sequencer (Big Data), Bioinformatics Pipeline, VCF, Report Generation, Patient Report, Reports/Results Storage]
SHARING DATA WITH BIOINFORMATICS PIPELINE
Export aliases
Import new data and match with sample aliases
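The export/import handshake above can be sketched as a lookup from sample alias back to the specimen it stands for. The alias format, the field names and the `match_results` helper are hypothetical; only the surgical pathology number is taken from the example report in this deck.

```python
def match_results(
    exported_aliases: dict[str, str], pipeline_results: list[dict]
) -> tuple[list[dict], list[dict]]:
    """Split pipeline results into those matching an exported alias and the rest."""
    matched, unmatched = [], []
    for result in pipeline_results:
        specimen_id = exported_aliases.get(result["alias"])
        if specimen_id is None:
            unmatched.append(result)
        else:
            matched.append({**result, "specimen_id": specimen_id})
    return matched, unmatched


# Alias -> surgical pathology number, as exported before sequencing.
aliases = {"NGS-0001": "S15-8646"}
results = [
    {"alias": "NGS-0001", "vcf": "NGS-0001.vcf"},
    {"alias": "NGS-9999", "vcf": "NGS-9999.vcf"},  # no exported alias
]
matched, unmatched = match_results(aliases, results)
```

Keeping unmatched results in a separate list makes it easy to flag files that cannot be tied to a specimen instead of importing them.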
REPORT CONTENT: PATHOLOGY & SPECIMEN DETAILS
Cancer Gene Panel 50 w/ Interp
Targeted Next Generation Sequencing
Collected: 3/20/2015 11:17 AM
Received: 3/30/2015 8:26 AM
Verified Date/Time:
Specimen Information:
Surgical Path No.: S15-8646
Specimen Type: Paraffin Embedded Tissue
Tumor Type: Metastatic lung carcinoma
Block No.: B1
Neoplastic Cell Content: 50%
Institution:
REPORT CONTENT: CLASSIFICATION OF VARIANTS
Result: The following variants were detected in the patient's specimen:
Tier 1
Gene Variant: None detected
Type of Mutation:
Cosmic ID:
Tier 2
Gene Variant: KRAS, c.34G>T, p.G12C
Type of Mutation: SNV
Cosmic ID: COSM516
Variants in Tier 2 may be associated with clinical trials. Please check www.clinicaltrials.gov for details.
Tier 3
Gene Variant: TP53, c.830G>T, p.C277F
Type of Mutation: SNV
Cosmic ID: COSM10749
Classification of variants: Variants are classified based on current evidence for clinical actionability.
Tier 1 – Clinical utility has been demonstrated - Actionable / Clinically Relevant variants.
-Variants in genes with approved therapeutic implications in specified tumors.
-Variants with potential diagnostic/classification, prognostic implications.
Tier 2 – Clinical utility/actionability has been documented.
-Variants with approved therapeutic implications in a different tumor type.
-Novel variants in genes that have approved therapeutic implications.
-Variants associated with clinical trials.
Tier 3 – Variants of Uncertain Significance (VUS)
-Variants may be associated with little or no established cancer risk.
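A report generator has to lay out the tiered result section, including the "None detected" line for an empty tier. The sketch below shows one way to do that, assuming each variant already carries a tier assigned upstream; the dictionary keys and the `render_tiers` name are hypothetical, and the tier assignment itself (a clinical judgment) is not modeled.

```python
from collections import defaultdict


def render_tiers(variants: list[dict]) -> str:
    """Render variants grouped by tier, printing 'None detected' for empty tiers."""
    by_tier = defaultdict(list)
    for variant in variants:
        by_tier[variant["tier"]].append(variant)

    lines = []
    for tier in (1, 2, 3):
        lines.append(f"Tier {tier}")
        if not by_tier[tier]:
            lines.append("  Gene Variant: None detected")
            continue
        for v in by_tier[tier]:
            lines.append(
                f"  Gene Variant: {v['gene']}, {v['cdna_change']}, {v['protein_change']}"
            )
            lines.append(f"  Type of Mutation: {v['mutation_type']}")
            lines.append(f"  Cosmic ID: {v['cosmic_id']}")
    return "\n".join(lines)


# The two variants from the example report.
report_section = render_tiers([
    {"tier": 2, "gene": "KRAS", "cdna_change": "c.34G>T",
     "protein_change": "p.G12C", "mutation_type": "SNV", "cosmic_id": "COSM516"},
    {"tier": 3, "gene": "TP53", "cdna_change": "c.830G>T",
     "protein_change": "p.C277F", "mutation_type": "SNV", "cosmic_id": "COSM10749"},
])
```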
DISCLAIMERS/REFERENCES/E-SIGNATURE
METHOD: DNA was extracted from macrodissected, paraffin-embedded tumor of the patient using the QIAamp Kit (Qiagen, Valencia, CA). The extracted DNA was amplified and subjected to Next Generation Sequencing (NGS) using the Ion Ampliseq Cancer Panel hotspot v2 on the Ion Torrent Personal Genome Machine (Life Technologies). The targeted gene panel is designed to detect mutations/variants in 50 key cancer-related genes. This test was validated for mutations/Single Nucleotide Variants (SNV) in the BRAF, EGFR, KRAS and JAK2 genes. The limit of detection is precise and reproducible at approximately 5% with approximately 400X coverage and 2.5% with 1000X coverage. The data obtained was analyzed with the Torrent Suite Software v 4.2. DNA sequences used as references for this panel of genes can be found at http://www.ncbi.nlm.nih.gov/refseq/rsg/. The mutation nomenclature is based on the recommendations from the Human Genome Variation Society http://www.hgvs.org/mutnomen/.
Limitations: This mutation panel is designed to detect targeted mutations only. The 50 genes covered are not all sequenced in their entirety. Mutations outside the 207 interrogated amplicons will not be detected. Variants of uncertain origin (germline versus somatic origin) cannot be determined unequivocally.
REFERENCES:
Sequist LV et al. First-Line Gefitinib in Patients With Advanced Non–Small-Cell Lung Cancer Harboring Somatic EGFR Mutations. J. Clin. Oncol., 2008, 26, 2442-2449.
Verified Date/Time: 04/06/15 3:02 PM By: John Doe M.D. (Electronic Signature)
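The limit of detection quoted in the METHOD text (approximately 5% at about 400X coverage, 2.5% at 1000X) can be read as a minimum variant allele fraction that depends on depth: 400 × 0.05 = 20 supporting reads, and 1000 × 0.025 = 25. The sketch below encodes that reading. Treating the two figures as hard cut-offs, and treating anything under 400X as outside the validated range, are assumptions made for illustration; the function names are hypothetical.

```python
from typing import Optional


def limit_of_detection(depth: int) -> Optional[float]:
    """Return the minimum detectable allele fraction for a given coverage depth."""
    if depth >= 1000:
        return 0.025
    if depth >= 400:
        return 0.05
    return None  # assumed to be below the validated coverage


def is_reportable(alt_reads: int, depth: int) -> bool:
    """True when the observed allele fraction meets the limit for this depth."""
    lod = limit_of_detection(depth)
    if lod is None:
        return False
    return alt_reads / depth >= lod


checks = {
    "30 of 400 reads": is_reportable(30, 400),    # 7.5% vs 5% limit
    "10 of 400 reads": is_reportable(10, 400),    # 2.5% vs 5% limit
    "30 of 1000 reads": is_reportable(30, 1000),  # 3% vs 2.5% limit
    "30 of 200 reads": is_reportable(30, 200),    # coverage too low
}
```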