Building Capacity for Automated Phylodynamic Analysis: A look at the 2015 HIV Outbreak in rural Indiana. Ells Campbell, MS, Computational Biologist, Division of HIV/AIDS Prevention, National Center for HIV/AIDS, Viral Hepatitis, STDs, and Tuberculosis Prevention. CDC Health Information Innovation Consortium, August 1st, 2017


Page 1: Building Capacity for Automated Phylodynamic Analysis · 8/1/2017  · Building Capacity for Automated Phylodynamic Analysis A look at the 2015 HIV Outbreak in rural Indiana ... community

Building Capacity for Automated Phylodynamic Analysis
A look at the 2015 HIV Outbreak in rural Indiana

Ells Campbell, MS
Computational Biologist
Division of HIV/AIDS Prevention
National Center for HIV/AIDS, Viral Hepatitis, STDs, and Tuberculosis Prevention

CDC Health Information Innovation Consortium

August 1st, 2017

Page 2

Program Perspective – Data Silos

Laboratory
Epidemiology
Surveillance

Page 3

Program Perspective – Data Lake

Surveillance Laboratory Epidemiology

Page 4

Connecting the Dots

Incidence Surveillance Molecular Surveillance

Page 5

HIV-TRACE – Quantifying difference between sequences

HIV Sequences in FASTA file

Page 6

HIV-TRACE Results – Simple Example

[Figure: example network with nodes A, B, and D; links labeled with genetic distances of 0.1% and 1.4%]
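A minimal sketch of the idea behind the simple example: compute a distance for every pair of aligned sequences, then keep pairs at or below a threshold as links. HIV-TRACE itself uses the TN93 substitution model; the plain proportion of differing sites (p-distance) used here is a simplified stand-in. The toy sequences, the `mutate` helper, and the function names `p_distance` and `links_within` are hypothetical and exist only for illustration.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or ambiguous base are skipped.
    This is a simplification; it is NOT the TN93 distance used by HIV-TRACE.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            if x != y:
                differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def links_within(seqs: dict, threshold: float) -> list:
    """Return (id1, id2, distance) for every pair at or below the threshold."""
    links = []
    for (i, a), (j, b) in combinations(seqs.items(), 2):
        d = p_distance(a, b)
        if d <= threshold:
            links.append((i, j, d))
    return links


def mutate(seq: str, positions) -> str:
    """Hypothetical helper: substitute the base at each given position."""
    swap = {"A": "C", "C": "G", "G": "T", "T": "A"}
    chars = list(seq)
    for p in positions:
        chars[p] = swap[chars[p]]
    return "".join(chars)


# Toy 1,000-site alignment (made up for illustration).
base = "ACGT" * 250
seqs = {
    "A": base,
    "B": mutate(base, [0]),        # 1 difference from A  -> 0.1%
    "D": mutate(base, range(14)),  # 14 differences from A -> 1.4%
}

for i, j, d in links_within(seqs, threshold=0.015):
    print(f"{i}-{j}: {d:.1%}")
```

With a 1.5% threshold all three toy sequences are linked; tightening the threshold to 0.5% leaves only the A-B link.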

Page 7

Why build transmission networks?

Results directly comparable to traditional contact tracing data
• Can readily compare genetic distance and contact tracing

Fine-grain resolution of transmission dynamics
• Inform outbreak-specific intervention efforts
• Inform future prevention strategies

Identification of unreported direct or indirect transmission links
• Aid for partner services and outbreak investigations

Page 8

Complementary Perspectives

High-Risk Sexual Network (Epidemiological)

Page 9

Complementary Perspectives

High-Risk Sexual Network (Epidemiological)
HIV Genetic Distance Network (Laboratory)

Page 10

Complementary Perspectives

High-Risk Sexual Network (Epidemiological)
HIV Genetic Distance Network (Laboratory)
Both (Integrated)

Page 11

Analytics Enrichment

Integration
Both Contact & Genetic

Informatics
Minimum Spanning Tree (MST)

Page 12

Decision Support

Integration
Both Contact & Genetic

Informatics
Minimum Spanning Tree (MST)

Clinical
Recency of HIV Infection: Established, Acute

Page 13

Outbreak Response – HIV Outbreak among PWID in rural Indiana

Outbreak Details
• >220 HIV+ diagnoses
• >90% inject drugs
  – 4-15x/day
  – 1-6 partners/injection
• >90% HCV co-infection
• High prevalence of sex exchange for drugs or money

Page 14

Decision Support – Machine Learning – HIV Outbreak in Rural Indiana, 2015

[Figure: decision tree that splits first on IDU ≤ 3 vs. IDU ≥ 4, then on 1 partner vs. 2+ partners within each branch]

Page 15

Outbreak Simulations – HIV Outbreak in Rural Indiana, 2015

Page 16

Outbreak Response – HIV Outbreak among PWID in rural Indiana

Genetic Distance Network
polymerase region
1.5% distance threshold

• Most HIV sequences are highly similar, representing rapid and recent transmission

Page 17

Outbreak Response – HIV Outbreak among PWID in rural Indiana

Genetic Distance Network
0.1% distance threshold

[Figure legend: node size = Reported Needle Partners; node color = Group A, Group B, Group C, No Group]

• Pruning links with a lower threshold reveals community structure at the cost of historical resolution

• Subgroups may be evidence of biological, social, temporal, and/or geographic factors
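To illustrate how a lower threshold exposes subgroups, the sketch below finds connected components of a genetic distance network at the two thresholds used in the talk (1.5% and 0.1%). The node names and distances are hypothetical toy values, not outbreak data, and `clusters` is an illustrative helper rather than part of any tool named in the talk.

```python
def clusters(nodes, links, threshold):
    """Connected components after dropping links above the threshold.

    Uses a small union-find structure; `links` holds (a, b, distance) tuples.
    """
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent pointers to the root, compressing the path as we go.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b, d in links:
        if d <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), []).append(n)
    # Largest cluster first, then alphabetical, for stable output.
    return sorted((sorted(g) for g in groups.values()),
                  key=lambda g: (-len(g), g))


# Hypothetical pairwise distances (as proportions, e.g. 0.012 = 1.2%).
nodes = ["p1", "p2", "p3", "p4", "p5", "p6"]
links = [
    ("p1", "p2", 0.0005),
    ("p2", "p3", 0.0008),
    ("p4", "p5", 0.0009),
    ("p3", "p4", 0.012),
    ("p5", "p6", 0.014),
]

print(clusters(nodes, links, 0.015))  # one large cluster
print(clusters(nodes, links, 0.001))  # subgroups appear
```

In this toy network the 1.5% threshold yields a single cluster, while the 0.1% threshold splits it into three groups and discards the older, more distant links that joined them.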

Page 18

Outbreak Response – HIV Outbreak among PWID in rural Indiana

[Figure: composite of 100 unique MSTs]

• Minimum spanning trees (MSTs) select the most parsimonious links from the distribution of distances

• There can be many equally parsimonious MSTs, so we combine many unique MSTs to illustrate uncertainty
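One way to sketch the composite idea: run Kruskal's algorithm repeatedly with random tie-breaking, keep the distinct trees, and count how often each link appears. This is an illustrative approach under stated assumptions, not necessarily the method used for the slide; the toy graph, the seed, and the names `random_mst` and `support` are hypothetical.

```python
import random
from collections import Counter


def random_mst(nodes, edges, rng):
    """Kruskal's algorithm with ties between equal distances broken at random."""
    shuffled = edges[:]
    rng.shuffle(shuffled)
    # sort() is stable, so equal-distance edges stay in their shuffled order.
    shuffled.sort(key=lambda e: e[2])

    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    tree = set()
    for a, b, _ in shuffled:
        ra, rb = find(a), find(b)
        if ra != rb:  # edge joins two components, so it cannot form a cycle
            parent[ra] = rb
            tree.add((a, b))
    return frozenset(tree)


# Toy network: a, b, c carry identical sequences (distance 0), d is 0.1% away.
nodes = ["a", "b", "c", "d"]
edges = [
    ("a", "b", 0.0),
    ("a", "c", 0.0),
    ("b", "c", 0.0),
    ("c", "d", 0.001),
]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
unique_trees = {random_mst(nodes, edges, rng) for _ in range(100)}

# How many of the distinct MSTs contain each link.
support = Counter(edge for tree in unique_trees for edge in tree)
for edge, count in sorted(support.items()):
    print(edge, f"{count}/{len(unique_trees)}")
```

Links that appear in every tree (here c-d) are unambiguous, while links among the tied sequences appear in only some trees, which is the uncertainty a composite drawing conveys.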

Page 19

Complementary Perspectives – HIV Outbreak in Rural Indiana, 2015

[Figure: overlap of Close Genetic Links and High-Risk Contacts. Labels: N > 10,000; N > 1,500; N = 182; Mostly Uninformative; Noise; Probable Transmission Links]
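The overlap between the two perspectives can be sketched as a set intersection over undirected pairs. The person identifiers and links below are hypothetical, and `undirected` is an illustrative helper; real data would come from the genetic distance network and from contact tracing records.

```python
def undirected(pairs):
    """Normalize (a, b) and (b, a) to the same key so links compare equal."""
    return {tuple(sorted(p)) for p in pairs}


# Hypothetical links; neither list reflects real outbreak data.
genetic_links = undirected([
    ("p1", "p2"), ("p2", "p3"), ("p1", "p3"), ("p3", "p4"),
])
contact_links = undirected([
    ("p2", "p1"), ("p3", "p4"), ("p4", "p5"),
])

supported_by_both = genetic_links & contact_links  # candidate transmission links
genetic_only = genetic_links - contact_links       # possibly unreported contacts
contact_only = contact_links - genetic_links       # reported, not genetically close

print(sorted(supported_by_both))
print(sorted(genetic_only))
print(sorted(contact_only))
```

Sorting each pair before comparison matters because a contact reported as (p2, p1) and a genetic link stored as (p1, p2) describe the same relationship.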

Page 20

Outbreak Response – HIV Outbreak among PWID in rural Indiana

Genetic Distance Network
1.5% distance threshold

Minimum Spanning Trees
Mean Distance <0.1%

Inferred Transmission Network

Page 21

Integration is Complex and Resource Intensive

HIV Surveillance Data Integration Workflow

Page 22

Data Prep
Trifacta Wrangler

• Summary statistics
  – Interactive visualization (1)
• Data cleaning
  – Guided by machine learning (2)
  – Live preview of changes (3)
• Optimized for collaboration
  – Shareable scripts (4)
• Personal & Enterprise versions available

[Screenshot with callouts 1–4]

https://www.trifacta.com/start-wrangling/

Page 23

Exploratory Analysis
Alpine Chorus

Powerful collaborative and analytics tools via web forms

Intuitive workflow editor with drag and drop “operators”

Built-in analytics (decision trees, linear regression, PCA, k-means, and dozens more)

Custom code turned into drag & drop operators (R, Python, Java)

Open source and Enterprise
github.com/alpinedatalabs

Page 24

Automated Solution

Enables reproducible machine learning analytics

…without the programmer!

Model #1

Model #2

Model #3

Page 25

Network Visualization
Centrifuge - Enterprise

Network visualization, exploration, and animation

Dynamic model generation
• Person–Person
• County–County

Variables mapped via dropdown
• Shape = Risk Factor
• Color = Infection Status
• Size = # of Reported Partners
• Link width = genetic distance

Pattern Recognition

Low-cost ‘lite’ version on the way for public health

Page 26

MicrobeTRACE

Sequences + Epidemiologic Data Interactive Exploration of Transmission Networks

github.com/cdcgov/microbetrace

Page 27

MicrobeTRACE

Distance Histogram

Alignment Viewer

Maps

Networks

Tree Viewer

github.com/cdcgov/microbetrace

Page 28

Potential for Enhancement

Business
Partner with vendors for licensing innovation
• Public health “ROI” doesn’t translate to increased budget

Admin
Network administration and security must catch up and keep pace

Technology
Dynamic redaction of PII
• Submit necessary data for automated analysis
• Rapidly integrate sensitive data locally
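One possible reading of "dynamic redaction" is sketched below: drop direct identifiers and replace the case ID with a keyed pseudonym before submitting a record, then use the locally held key to re-link results. This is an assumption about how such a step could work, not a description of any CDC system; the field names, the key, and the functions `pseudonymize` and `redact` are hypothetical.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, key: bytes) -> str:
    """Keyed hash of an identifier; stable for the same key, hard to reverse without it."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def redact(record: dict, key: bytes, id_field: str = "case_id",
           pii_fields: tuple = ("name", "dob")) -> dict:
    """Return a copy safe to submit: PII fields dropped, ID replaced by a pseudonym."""
    shared = {k: v for k, v in record.items() if k not in pii_fields}
    shared[id_field] = pseudonymize(str(record[id_field]), key)
    return shared


# Hypothetical record and key; the key never leaves the local health department.
local_key = b"local-secret-key"
record = {"case_id": "IN-0001", "name": "Jane Doe",
          "dob": "1980-01-01", "risk": "PWID"}

shared = redact(record, local_key)
print(shared)

# Locally, results keyed by pseudonym can be joined back to the full record.
lookup = {pseudonymize(r["case_id"], local_key): r for r in [record]}
print(lookup[shared["case_id"]]["name"])
```

Because the pseudonym depends on a key held locally, the submitted record carries only what the automated analysis needs, while the submitting site can still integrate the results with its sensitive data.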

Page 29

On the Shoulders of Giants

Indiana
• Indiana State Department of Health
• Scott County Health Department
• Clark County Health Department
• Foundations Family Medicine
• Indiana University School of Medicine, Division of Infectious Diseases
• University of Louisville School of Medicine, Division of Infectious Diseases

• NCHHSTP, Division of HIV/AIDS Prevention
• NCHHSTP, Division of Viral Hepatitis
• NCEZID, Division of Scientific Resources, Specimen Management Branch and Biotech Core Facility
• Epidemic Intelligence Service

Surveillance
• HIV Incidence and Case Surveillance Branch (HICSB)
  – Molecular HIV Surveillance
• State, Territory, Local, and Tribal Public Health Departments

STOP Study
• NYC, NC, and SF public health
• NCHHSTP, Division of HIV/AIDS Prevention (DHAP)

Commercial Laboratories
• Quest
• LabCorp

Software and Support
• SciComp, ITSO and OCISO at CDC
• Leidos Inc.
  – Centrifuge Systems
  – Trifacta
  – Alpine Data Labs
• UCSD Viral Evolution Group (VEG)
• Stanford Visualization Group
• Seattle Interactive Data Lab
• Stanford HIV Drug Resistance DB Team
• Los Alamos National Laboratories
• Gephi Consortium