UNCLASSIFIED
Culture-Independent Identification
& Characterization of Infectious
Agents
Distribution Statement A: Approved for Public Release; Distribution is Unlimited
Jonathan Jacobs, PhD
Senior Advisor
Global Health Surveillance & Diagnostics
MRIGlobal
@bioinformer
PanGIA Program – General Requirements and Approach
• Goals: Develop and deploy a Sample to Sequence (S2S) system for infectious disease:
– Provide unbiased detection of all pathogens
– Accept both clinical and environmental samples
– Modular system to allow incorporation of new emerging technologies
– Remote analysis and data compression
• Approach: Best-of-breed evaluation of solutions and integration into a workflow
– Forty-three (43) manual / automated components were evaluated to establish a baseline workflow
– Custom bioinformatics analysis pipeline benchmarked against sixteen (16) analysis algorithms
– Integrated, hand-in-hand development of both wet-lab and bioinformatics methods
• Challenges
– Whole blood: high level of background host genetic material
– Environmental samples: highly inhibited samples with complex genetic material (swabs, soils, mosquitoes)
– Balancing analysis speed and hardware requirements with information content and accuracy
Sample Prep
Bioinformatics
PanGIA – Overview
• PanGIA (Pan-Genomics for Infectious Agents) is a sample to sequence system capable of detecting pathogens from clinical and environmental samples.
• Fully integrated End-to-End System:
o Sample preparation
o Next-generation sequencing
o Bioinformatics analysis for unbiased pathogen detection (metagenomics)
• All COTS sample processing components
• Unbiased detection with streamlined workflow
• Platform agnostic modular system developed for sustainability
• “Push-button” bioinformatics analytical workflow using commodity hardware
o No Internet Connection Required
o Actionable results
o Built-in quality metrics
o High-confidence BSAT characterization
PanGIA is an end-to-end solution for unbiased detection of pathogens
o Gram ± bacteria
o Fungal pathogens
o DNA / RNA viruses
o Extracellular / Intracellular pathogens
An Integrated Biosurveillance Pipeline
Sample Collection
Sample Prep
Next Generation Sequencing
Bioinformatics
Actionable Reports
• Each process in the pipeline is developed independently (typically by vendors) but depends on every preceding process. Integration is not trivial and requires experience.
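The dependency between pipeline stages can be illustrated with a minimal sketch. The stage names follow the slide; the `Stage` class, the `requires` keys, and the placeholder handlers are hypothetical and are not part of PanGIA. The point is only that a stage cannot start unless the preceding stage delivered what it needs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Stage:
    name: str
    requires: List[str]          # keys that must already exist (hypothetical names)
    run: Callable[[Dict], Dict]  # returns the new keys this stage produces


def run_pipeline(stages: List[Stage], sample: Dict) -> Dict:
    """Run stages in order; fail early if a stage's inputs are missing."""
    state = dict(sample)
    done: List[str] = []
    for stage in stages:
        missing = [key for key in stage.requires if key not in state]
        if missing:
            raise ValueError(f"{stage.name} cannot start, missing: {missing}")
        state.update(stage.run(state))
        done.append(stage.name)
    state["completed_stages"] = done
    return state


# Placeholder handlers: each one only passes a label forward.
STAGES = [
    Stage("Sample Collection", [], lambda s: {"specimen": s["sample_id"]}),
    Stage("Sample Prep", ["specimen"], lambda s: {"library": f"lib-{s['specimen']}"}),
    Stage("Next Generation Sequencing", ["library"],
          lambda s: {"reads": f"{s['library']}.fastq"}),
    Stage("Bioinformatics", ["reads"], lambda s: {"report": f"report for {s['reads']}"}),
]

result = run_pipeline(STAGES, {"sample_id": "S1"})
print(result["report"])
```

Dropping "Sample Prep" from `STAGES` makes the sequencing stage raise a `ValueError`, which is the kind of integration failure the slide warns about.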
PanGIA Sample Prep Overview
• Clinical and Environmental Workflows developed
– Whole blood
– Serum
– Plasma
– Forensic swabs
• Workflow Components:
– Sample Pre-processing
– Pre-lysis Host Depletion
– Nucleic Acid Purification
– Nucleic Acid Concentration
– Whole Transcriptome Amplification
– Library preparation
– Sequencing
Technologies for System Integration
Example of workflow for whole blood:
– Citrate EDTA mix
– Cyanase
– Norgen Plasma / Serum Prep
– RepliG Single Cell WTA
– Nextera XT
– Illumina MiSeq (2x75, 2x150)
– PanGIA
• Similar workflows have been established for serum, plasma, forensic swabs and soil.
• Additional workflows will be developed in FY18.
Sample Preprocessing
Pathogen Concentration
Pre-lysis Host Depletion
Nucleic Acid Purification
Post-purification Host Depletion
Whole Transcriptome Amplification
Library Prep
Sequencing
Bioinformatics Analysis
Summary Reporting
BD iMag Bead / BD FACSFocus
Sage Science Modified SageELF
BD CLiC
Sage Science Bump Array
SRI Sentinel System
Illumina MiSeq PanGIA*
GOTTCHA
KRAKEN
MetaPhlan2
PathoScope2
RealTime Genomics
CosmosID
OneCodex
SURPI
CLARK
Illumina NeoPrep
BD PPT
PAX Gene Blood RNA
TEMPUS Blood RNA
RNAGard Blood RNA
Citrate / K2 EDTA
InnovaPrep Concentrating Pipette
Acrodisk WBC Filter
Simple 5µm Filter Disk
MicroCon DNA Fast Flow column
PEPS6 Beads
BD Imag Beads
Cyanase
EL Buffer
Saponin
Simple Freeze/Thaw
OmniCleave
QiaAmp Circulating Nucleic Acid Kit
ViraPrep Mammal Kit
Norgen Circulating Nucleic Acid Kit
Norgen Preserved Blood RNA Kit
RNAGard Blood RNA
PAX Gene Blood RNA
E.Z.N.A. Blood RNA Midi Kit
Norgen Plasma/Serum Kit
QiaAmp DNA Blood Midi Kit
NEBNext Microbiome DNA Enrichment Kit
NEBNext rRNA Depletion Kit
Illumina Ribo-Zero
Illumina Globin-Zero
Qiagen GeneReads rRNA Depletion
Qiagen GeneReads Globin mRNA Removal
RepliG Single Cell WTA
Sigma Complete WTA
Illumina Nextera XT
Kapa HyperPlus
Rubicon PicoPlex
Legend:
• Technologies are compatible with the desired workflow and requirements.
• Evaluated technologies not optimal with sample workflow.
• Technology not tested due to logistic incompatibilities or not yet available.
• Technology to be evaluated for this project.
kallisto
ConStrains
ReadScan
Sequedex
LMAT
PathoSphere
BioVelocity
Oxford Nanopore MinION
Elapsed Time: 10 m, 2 h, 3 h, 6 h, 10 h, 22 h, 24 h
PAN-0125
Illumina NextSeq
PanGIA Implementation
• Minimal lab footprint
• No internet required.
• Suitable for small clinical lab, portable
container lab, or remote field station
Extraction WTA and Library Prep
Quantitation
Sequencing
Bioinformatics
Sample Prep Key Findings
With clinical samples, host depletion is key to improved detection
• Pre-lysis host depletion was more effective than post-purification host depletion (i.e., rRNA depletion)
• Cyanase endonuclease was optimal for digestion of free-circulating nucleic acids
Sequencing library prep input is a 1:1 ratio of sample WTA product and TNA
• The TNA sample is concentrated; half is subjected to WTA and half is reserved for library prep
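The split-and-pool step above can be written out as simple arithmetic. This is a minimal sketch only: the slide does not say whether the 1:1 ratio is by volume or by mass, so the volume basis, the function names, and the example volumes are all assumptions.

```python
def split_tna(concentrated_volume_ul: float) -> dict:
    """Split concentrated TNA in half: one half goes to WTA, one half is reserved."""
    if concentrated_volume_ul <= 0:
        raise ValueError("volume must be positive")
    half = concentrated_volume_ul / 2
    return {"to_wta_ul": half, "reserved_tna_ul": half}


def pool_one_to_one(wta_available_ul: float, tna_available_ul: float,
                    total_input_ul: float) -> dict:
    """Pool WTA product and reserved TNA 1:1 (by volume, an assumption)."""
    each = total_input_ul / 2
    if each > wta_available_ul or each > tna_available_ul:
        raise ValueError("not enough material for the requested input volume")
    return {"wta_ul": each, "tna_ul": each}


# Hypothetical volumes, for illustration only.
halves = split_tna(20.0)
library_input = pool_one_to_one(wta_available_ul=10.0,
                                tna_available_ul=halves["reserved_tna_ul"],
                                total_input_ul=5.0)
print(halves, library_input)
```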
PAN-0107
PAN-0128 + 129
Sample Prep Key Findings
• Norgen Plasma/Serum RNA Purification Midi Kit and Qiagen QIAamp DNA Blood Midi Kit performed similarly.
• MO BIO PowerMicrobiome RNA Isolation outperformed all other kits for detection sensitivity with environmental samples.
PAN-0059-60
PAN-0058/-0061/-0062/-0066
Key Improvements
• Improved detection of DNA viruses
– Confirmed detection of 1E2 pfu/ml Vaccinia from infected HeLa cells spiked into blood
• Optimization of pre-lysis host depletion method
– Incubation with Cyanase at room temperature, followed by resin inactivation
• Reduced sequencing time with 2 x 75 PE protocol vs. 2 x 150 PE
– No significant loss of detection sensitivity
PAN-0111
PAN-0154
PAN-0125
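The sequencing-time reduction from the shorter protocol follows from the cycle count. The sketch below only counts insert-read cycles; index reads and instrument overhead are ignored, so cycle count is a rough proxy for run time and not a measured duration. The function name is hypothetical.

```python
def total_cycles(read_length: int, paired_end: bool = True) -> int:
    """Sequencing cycles spent on insert reads (index reads are ignored here)."""
    return read_length * (2 if paired_end else 1)


cycles_2x75 = total_cycles(75)     # 2 x 75 PE
cycles_2x150 = total_cycles(150)   # 2 x 150 PE
cycle_fraction = cycles_2x75 / cycles_2x150

print(cycles_2x75, cycles_2x150, cycle_fraction)
```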
LOD in Whole Blood
- S. aureus 1E3 cfu/ml
- V. cholerae 1E2 cfu/ml
- Vaccinia MVA (pox virus) 1E2 pfu/ml
- VEEV (arbovirus) 1E2 pfu/ml
LOD in Forensic Swab
- B. anthracis (veg.) 1E3 cfu/ml
- V. cholerae 1E3 cfu/ml
- Vaccinia MVA (pox virus) 1E2 pfu/ml
- VEEV (arbovirus) 1E3 pfu/ml
LODs are generally between 1E2 and 1E3 cfu or pfu per ml
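The LOD values above can be held in a small lookup table to confirm the stated range and to see where the two matrices differ. The values are copied from the slide; the data layout and function names are illustrative choices and do not come from PanGIA.

```python
# LODs from the slide: value and unit, per matrix and agent.
LODS = {
    "Whole blood": {
        "S. aureus": (1e3, "cfu/ml"),
        "V. cholerae": (1e2, "cfu/ml"),
        "Vaccinia MVA": (1e2, "pfu/ml"),
        "VEEV": (1e2, "pfu/ml"),
    },
    "Forensic swab": {
        "B. anthracis (veg.)": (1e3, "cfu/ml"),
        "V. cholerae": (1e3, "cfu/ml"),
        "Vaccinia MVA": (1e2, "pfu/ml"),
        "VEEV": (1e3, "pfu/ml"),
    },
}


def lod_range(lods: dict) -> tuple:
    """Lowest and highest LOD across all matrices and agents."""
    values = [value for matrix in lods.values() for value, _unit in matrix.values()]
    return min(values), max(values)


def matrix_differences(lods: dict, a: str, b: str) -> dict:
    """Agents tested in both matrices whose LOD differs, as {agent: (lod_a, lod_b)}."""
    shared = lods[a].keys() & lods[b].keys()
    return {agent: (lods[a][agent][0], lods[b][agent][0])
            for agent in sorted(shared)
            if lods[a][agent][0] != lods[b][agent][0]}


overall = lod_range(LODS)
differences = matrix_differences(LODS, "Whole blood", "Forensic swab")
print(overall, differences)
```

Of the three agents tested in both matrices, V. cholerae and VEEV have a tenfold higher LOD in forensic swabs than in whole blood, while Vaccinia MVA is unchanged.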
Addition of BSAT Pipeline Eliminates False-Positives
[Figure: Log # mapped reads (axis from 1 to 1,000,000) vs. spike levels (cfu/ml) for B. anthracis in forensic swabs, w/o BSAT and with BSAT]
• The BSAT pipeline is currently being validated on multiple Tier 1 Select Agents
• To be fully integrated into PanGIA and validated by August 2018
• Readily adaptable to any pathogen of interest and/or for rapid AMR detection
B. anthracis in forensic swabs
Sample Preparation Workflow and Optimization
• Baseline workflow established during the Base Year based on literature review.
• Option Year 1 focused on improvements & optimization
• Modifications to the workflow resulted in
– Improved sensitivity
– Reduced turnaround time to <24 hours
– Reduced cost to ~$240 per sample
PanGIA: Pan-Genomics for Infectious Agents
KEY Advantages/Differentiators
• Uses a read-mapping approach against a custom database (RefSeq + IMG + your favorite strains)
• All the advantages of read-mapping, while still fast and tuned to overcome multi-mapped reads
• Confidence scores
• Comparative analysis to control samples
• Developed hand-in-hand with laboratory methods
• Easy, intuitive graphical user interface
• No Internet Required
• Runs on commodity hardware
• Open-Source, dockerized, and easy to implement
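To show why multi-mapped reads need special handling in a read-mapping approach, here is a minimal sketch of one generic heuristic: count uniquely mapped reads first, then split each multi-mapped read among its candidate references in proportion to those unique counts. This is not PanGIA's algorithm, which the slides do not describe; the function name and the toy input are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def assign_reads(hits: Dict[str, List[str]]) -> Dict[str, float]:
    """hits maps a read id to the distinct references it aligned to."""
    # Step 1: reads with exactly one hit count fully toward that reference.
    unique = Counter(refs[0] for refs in hits.values() if len(refs) == 1)
    abundance = {ref: float(n) for ref, n in unique.items()}

    # Step 2: split each multi-mapped read by the unique-read evidence.
    for refs in hits.values():
        if len(refs) < 2:
            continue
        weights = [unique.get(ref, 0) for ref in refs]
        total = sum(weights)
        for ref, weight in zip(refs, weights):
            # With no unique evidence at all, fall back to an even split.
            share = weight / total if total else 1 / len(refs)
            abundance[ref] = abundance.get(ref, 0.0) + share
    return abundance


toy_hits = {
    "r1": ["A"], "r2": ["A"], "r3": ["A"], "r4": ["B"],
    "r5": ["A", "B"],   # shared between A and B
    "r6": ["B", "C"],   # C has no uniquely mapped reads
}
abundance = assign_reads(toy_hits)
print(abundance)
```

In the toy input, reference C receives no credit because nothing maps to it uniquely, which is how this kind of weighting suppresses false positives from closely related genomes.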
Performance vs. best in class tools: In silico data
Performance on commodity hardware (8 cores)
[Figure: Walltime (h) for PanGIA, GOTTCHA, Kraken, Kaiju, and Metaphlan2 on clinical and environmental samples]
[Figure: Memory usage (GB) for PanGIA, GOTTCHA, Kraken, Kaiju, and Metaphlan2 on clinical and environmental samples]
PanGIA Hardware
• No Internet Required (but cloud version available)
• Very small hardware footprint and requirement
• A “BackPacker’s Bioinformatics Brick”
• Intel NUC ‘Skull Canyon’ SBC
– 32 GB of RAM
– 2TB Solid State Drive
– Quad-core i7 CPU
– Wi-Fi / 4G / LTE Connectivity
– No moving parts
• Analysis is typically under 15 minutes per sample on this system
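The per-sample analysis time translates into a rough daily capacity for one unit. The sketch assumes samples are analyzed one at a time, back to back, with no idle time; those assumptions and the function name are not from the slides.

```python
def min_samples_per_day(minutes_per_sample: float = 15.0, hours: float = 24.0) -> int:
    """Whole samples that fit in the time window at the given per-sample time."""
    return int(hours * 60 // minutes_per_sample)


capacity = min_samples_per_day()
print(capacity)
```

Because the slide gives 15 minutes as an upper bound, the result is a lower bound on bioinformatics throughput under those assumptions. Wet-lab steps and sequencing are not included.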
PanGIA User Interface
FY18 Development Efforts
• Development of optimized methods for additional matrices for both clinical and environmental workflows
• Further optimization of existing workflow
– Evaluate post-purification host depletion technologies, the potential for automation, and new sample prep technologies
• Implementation of end-to-end Quality Control guidelines
– Process controls spiked with PhiX, MS2, and synthetic xenoDNA/RNA
– Lot# control, metadata tracking
• Optimization and integration of BSAT characterization pipeline
• User-defined / custom databases
Additional matrices include nasopharyngeal swabs, serum, soil and surface water
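The planned process controls (PhiX, MS2, synthetic xenoDNA/RNA) suggest a simple run-level check: confirm that every spiked control was recovered before trusting a negative result. The sketch below is one hypothetical way to express that; the control labels, the `min_reads` threshold, and the input format are assumptions and are not PanGIA's QC guidelines.

```python
from typing import Dict, List, Tuple

# Control names taken from the slide; the labels used as keys are assumed.
EXPECTED_CONTROLS = ("PhiX", "MS2", "xenoDNA", "xenoRNA")


def check_process_controls(read_counts: Dict[str, int],
                           min_reads: int = 10) -> Tuple[bool, List[str]]:
    """Return (passed, controls_below_threshold) for one sequencing run."""
    missing = [control for control in EXPECTED_CONTROLS
               if read_counts.get(control, 0) < min_reads]
    return (not missing, missing)


# Hypothetical read counts for one run.
passed, missing = check_process_controls(
    {"PhiX": 500, "MS2": 40, "xenoDNA": 12, "xenoRNA": 3}
)
print(passed, missing)
```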
2017 – 2018 OCONUS Deployment
• Update system based on
OCONUS feedback
• Integrate automated
technologies
• Integration with other
sequencing devices
Clinical, BSV, Food Safety
Partner Labs Wanted
Patrick Chain, PhD
Karen Davenport, PhD
Paul Li, PhD
Chenchui Gao, PhD
Tom Slezak
Marissa Torres
Cleveland Clinic
Gary Procop, MD
Laura Strawn, PhD
Steve Rideout, PhD
Thank you!
Jonathan Jacobs, PhD
Richard Winegar, PhD
JR Aspinwall
Joseph Russell, PhD
Kyle Parker
Jennifer Stone
John Bagnoli
Dave Yarmosh
Brittney Campos
Benjamin Pinsky, MD PhD
Trish Simner, PhD
Matthew Robinson, MD
DOD SPONSORS
Defense Threat Reduction Agency (DTRA)
Joint Science & Technology Office (JSTO J9)