Accelerating drug discovery using HPC


HPC at NIBR

Nick Holway, NIBR Scientific Computing Group

Speedup 2016

September 15, 2016

Twitter: @nickholway

LinkedIn: https://ch.linkedin.com/in/nickholway

Novartis Institutes for Biomedical Research (NIBR)

Today’s talk

1. HPC at NIBR – a quick introduction

2. HPC in the cloud

3. Accelerating a “compound search engine” using HPC

4. Expediting drug discovery with GPUs

HPC at NIBR - Hardware

• x86 servers
  – Intel Xeon
  – 128-768GB RAM
  – FDR InfiniBand
  – 10GigE

• Nvidia GPUs on some servers

• Isilon storage
  – CIFS/NFS
  – 10GigE to Arista switches

• Lustre
  – Scratch

HPC at NIBR - Software

• RHEL 6.x

• Univa Grid Engine for scheduling

• Software compilation & configuration
  – EasyBuild
  – Modules
  – GCC, Intel, Nvidia compilers

• Languages: C++, Fortran, CUDA, Python, R, Matlab

HPC at NIBR - Humans

• Global team (Europe, USA, Asia)

• Complementary backgrounds and skills
  – Sysadmins
  – Mathematicians
  – Scientists

• HPCWire award winners in 2014

• NB: HPC exists elsewhere in the Company for Clinical Trial analysis, CFD etc.

HPC in the cloud

• NIBR have used Amazon EC2 for compute workloads
  – Cycle Computing

• ISVs, e.g. DNAnexus
  – Bioinformatics NGS

Docking at scale in the cloud

• Ligand-protein docking is “to predict the position and orientation of a ligand (a small molecule) when it is bound to a protein receptor or enzyme” (Wikipedia)

• Embarrassingly parallel - compute-heavy / data-light

• We used the cloud to screen 10 million molecules against a cancer target
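
A minimal sketch of why this workload is "embarrassingly parallel": the library is cut into independent chunks, each chunk is scored without reference to any other, and the results are merged at the end. The scoring function `mock_score`, the molecule identifiers, the chunk size and the "lower score is better" convention are all hypothetical placeholders; the slides do not name the docking code used or describe its interface. A thread pool stands in here for what would be separate cluster jobs or cloud instances.

```python
from concurrent.futures import ThreadPoolExecutor
import zlib


def mock_score(molecule_id: str) -> float:
    """Hypothetical stand-in for one docking run: a deterministic fake score."""
    return (zlib.crc32(molecule_id.encode()) % 1000) / 100.0


def chunked(items, size):
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def score_chunk(chunk):
    """Score one chunk; needs no data from any other chunk."""
    return [(mol, mock_score(mol)) for mol in chunk]


def screen(library, chunk_size=100, workers=4):
    """Score every molecule and return (id, score) pairs, best (lowest) first."""
    chunks = list(chunked(library, chunk_size))
    # Threads are used only to keep the sketch self-contained; at scale each
    # chunk would be its own job on a separate node or instance.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(score_chunk, chunks)
    scored = [pair for chunk_result in results for pair in chunk_result]
    return sorted(scored, key=lambda pair: pair[1])


library = [f"mol-{i:05d}" for i in range(1000)]  # assumed toy library
ranked = screen(library)
```

Because no chunk depends on another, adding workers shortens the wall-clock time roughly in proportion until scheduling and I/O overheads dominate, which is what makes spot instances a natural fit.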

How we did it

• Cycle Computing’s software (CycleServer, CycleCloud)

• Over 10,000 EC2 spot instances
  – Extensive benchmarking to select instance type

• Licence files used instead of licence servers, which cannot cope with the load

• Proprietary compounds run in NIBR’s VPC, others in “public”

• See http://opensource.nibr.com/videos/aws-litster/ and http://cyclecomputing.com/novartis-taps-cloud-hpc-for-faster-drug-discovery-better-science/

Where we’re going in the cloud

• “Cloud by default” for many non-HPC applications

• Clinical data (subject to “informed consent”)

• HPC where appropriate
  – IB etc. for tightly-coupled parallel jobs usually unavailable
  – Data locality challenging

Accelerating a compound target search engine

Slides courtesy Douglas Selinger

Introduction

• There are many disparate public and private sources of information, which are hard for experts to query and almost impossible for “normal” scientists

• Scientists would like to ask questions like “What is the target and Mechanism of Action (MoA) of my compound?”

• MOA Central is a web-based tool

“Flow” of a query

Diagrammatic network

Example output

Glivec (Imatinib)

Impact of HPC

• Our scientists developed a tool, MOA Central, which applies graph analysis techniques and was built with well-known workflow software

• MOA Central worked so well that the server couldn’t keep up with demand

• We helped port it to Python (Pandas, SciKit etc.)
  – Large queries and data preparation can now run on the cluster
  – Version control!

• Moving from CSV files, database queries & web services to HDF5 will improve scalability
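
As a rough illustration of the kind of graph analysis mentioned above, the sketch below runs a breadth-first search from a compound through a small network and reports every target reached, together with the shortest path to it. The network, the node names and the `type:name` labelling are invented for the example; the slides do not describe MOA Central's data model or algorithms, so this should be read as one plausible shape for such a query, not as the tool's method.

```python
from collections import deque

# Hypothetical toy network: each node maps to the nodes it links to.
NETWORK = {
    "compound:A": ["assay:1", "compound:B"],
    "compound:B": ["target:X"],
    "assay:1": ["target:Y"],
    "target:X": ["pathway:P"],
    "target:Y": [],
    "pathway:P": [],
}


def find_targets(network, start, max_hops=3):
    """Breadth-first search; returns {target: shortest path from start}."""
    paths = {start: [start]}  # doubles as the visited set
    queue = deque([start])
    found = {}
    while queue:
        node = queue.popleft()
        path = paths[node]
        if node.startswith("target:"):
            found[node] = path
        if len(path) - 1 >= max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in network.get(node, []):
            if neighbour not in paths:
                paths[neighbour] = path + [neighbour]
                queue.append(neighbour)
    return found


hits = find_targets(NETWORK, "compound:A")
for target, path in hits.items():
    print(target, "via", " -> ".join(path))
```

Returning the path along with the target lets a scientist see which evidence links a compound to a target instead of receiving a bare answer.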

Want to know more about MOA Central?

• Look for more information at https://www.researchgate.net/profile/Douglas_Selinger

Accelerating Motor Neuron Disease drug discovery with GPUs

Slides courtesy Imtiaz Hossain

In-vitro model for neuromuscular junctions

• Faulty junctions between motor neurons and muscle cells are implicated in MND

• We’d like to create a drug which corrects this

• Motor neurons & myotube (muscle fibre) cells were “co-cultured” in a “plate” to which drug candidates were added

• Cells were imaged in real time to measure their contractility

• This is very hard to see by eye and also hard to segment using computers

What do the cells look like?

Motion estimated with Optic Flow

Different contracting regions

Total area under contraction
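
The slides do not say which optical-flow algorithm was used or how it was implemented on the GPU. As a stand-in, the sketch below uses exhaustive block matching, a simpler motion-estimation technique, in plain Python: each block of the first frame is compared against shifted windows of the second frame, the shift with the smallest sum of squared differences is kept, and the area of all blocks that moved is totalled. The frame contents, the block size and the search range are assumptions chosen only to keep the example small.

```python
def ssd(frame_a, frame_b, r, c, dy, dx, size):
    """Sum of squared differences between a block and a shifted window."""
    total = 0
    for i in range(size):
        for j in range(size):
            diff = frame_a[r + i][c + j] - frame_b[r + dy + i][c + dx + j]
            total += diff * diff
    return total


def block_motion(frame_a, frame_b, size=4, search=2):
    """Return {(row, col): (dy, dx)} giving the best shift for each block."""
    rows, cols = len(frame_a), len(frame_a[0])
    # Smallest shifts are tried first so that ties resolve to "no motion".
    shifts = sorted(
        ((dy, dx) for dy in range(-search, search + 1)
         for dx in range(-search, search + 1)),
        key=lambda s: abs(s[0]) + abs(s[1]),
    )
    motion = {}
    for r in range(0, rows - size + 1, size):
        for c in range(0, cols - size + 1, size):
            best, best_cost = (0, 0), None
            for dy, dx in shifts:
                # Skip windows that would fall outside the second frame.
                if not (0 <= r + dy <= rows - size and 0 <= c + dx <= cols - size):
                    continue
                cost = ssd(frame_a, frame_b, r, c, dy, dx, size)
                if best_cost is None or cost < best_cost:
                    best, best_cost = (dy, dx), cost
            motion[(r, c)] = best
    return motion


def moving_area(motion, size=4):
    """Total area, in pixels, of the blocks with a non-zero shift."""
    return sum(size * size for shift in motion.values() if shift != (0, 0))


def make_frame(top, left, rows=12, cols=12):
    """Synthetic frame: dark background with one small textured patch."""
    frame = [[0] * cols for _ in range(rows)]
    for i in range(2):
        for j in range(2):
            frame[top + i][left + j] = 10 + 2 * i + j
    return frame


before = make_frame(5, 4)
after = make_frame(5, 5)  # same patch, moved one pixel to the right
motion = block_motion(before, after)
area = moving_area(motion)
```

Every block is searched independently of the others, which is the kind of per-block (or per-pixel) parallelism that maps well onto GPU threads.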

Impact of HPC

• A good joint project between bench scientists, lab automation experts & informaticians

• 80x increase in throughput compared to CPU

• NIBR scientists have access to a new method of monitoring myotube contractility

The future

• GPUs
  – Deep learning
  – Cryo-EM

• Real time collection & processing of data from clinical trials

• Integration of “big data” technologies such as Apache Spark into HPC

Thank you

Backup: MOA Central predicting a side effect
