A Big Big Data Platform

John Urbanic, Parallel Computing Scientist

© 2019 Pittsburgh Supercomputing Center

Optional!

If you choose to work on exercises, or leave, you will not feel uninformed for tomorrow's work.

We avoid platform-specific information as much as possible, but some of you do want to hear about an actual enterprise class environment, and real applications. We briefly do that here.


The Shift to Big Data

Pan-STARRS telescope (http://pan-starrs.ifa.hawaii.edu/public/)

Genome sequencers (Wikipedia Commons)

NOAA climate modeling (http://www.ornl.gov/info/ornlreview/v42_3_09/article02.shtml)

Collections (Horniman museum: http://www.horniman.ac.uk/get_involved/blog/bioblitz-insects-reviewed)

Legacy documents (Wikipedia Commons)

Environmental sensors: Water temperature profiles from tagged hooded seals (http://www.arctic.noaa.gov/report11/biodiv_whales_walrus.html)

Library of Congress stacks (https://www.flickr.com/photos/danlem2001/6922113091/)

Video (Wikipedia Commons)

Social networks and the Internet

New Emphases


Challenges and Software are Co-Evolving

Scientific Visualization

Statistics

Calculations on Data

Optimization (numerical)

Structured Data

Machine Learning

Optimization (decision-making)

Natural Language Processing

Image Analysis

Video

Sound

Unstructured Data

Graph Analytics

Information Visualization


Gateways and Tools for Building Them

Gateways provide easy-to-use access to Bridges’ HPC and data resources, allowing users to launch jobs, orchestrate complex workflows, and manage data from their browsers.

- Extensive leveraging of databases and polystore systems

- Great attention to human-computer interaction (HCI) is needed to get these right

Download sites for MEGA-6 (Molecular Evolutionary Genetics Analysis), from www.megasoftware.net

Interactive pipeline creation in GenePattern (Broad Institute)

Col*Fusion portal for the systematic accumulation, integration, and utilization of historical data, from http://colfusion.exp.sis.pitt.edu/colfusion/


Example: Causal Discovery Portal

Center for Causal Discovery, an NIH Big Data to Knowledge Center of Excellence

Architecture diagram labels:

- Browser-based UI: prepare and upload data; run causal discovery algorithms; visualize results

- Internet

- Web node: VM, Apache Tomcat, messaging; authentication, data, provenance

- Database node: VM, other DBs

- Omni-Path

- LSM node (3 TB) and ESM node (12 TB): execute causal discovery algorithms; memory-resident datasets

- Analytics: FGS and other algorithms, building on TETRAD

- Pylon filesystem: TCGA, fMRI


Bridges Virtual Tour: https://psc.edu/bvt

Purpose-built Intel® Omni-Path Architecture topology for data-intensive HPC

- 748 HPE Apollo 2000 (128 GB) compute nodes

- 16 HPE Apollo 2000 (128 GB) GPU nodes with 2 NVIDIA Tesla K80 GPUs each

- 32 HPE Apollo 2000 nodes, each with 2 NVIDIA Tesla P100 GPUs

- 42 HPE ProLiant DL580 (3 TB) compute nodes

- 4 HPE Integrity Superdome X (12 TB) compute nodes, each with 2 gateway nodes

- 12 HPE ProLiant DL380 database nodes

- 6 HPE ProLiant DL360 web server nodes

- 20 Storage Building Blocks, implementing the parallel Pylon storage system (10 PB usable)

- 4 MDS nodes

- 2 front-end nodes

- 2 boot nodes

- 8 management nodes

- 20 “leaf” Intel® OPA edge switches

- 6 “core” Intel® OPA edge switches: fully interconnected, 2 links per switch

- Intel® OPA cables

Bridges-AI: Maximum-Scale Deep Learning

- NVIDIA DGX-2 and 9 HPE Apollo 6500 Gen10 nodes: 88 NVIDIA Tesla V100 GPUs

Representative uses for AI

- Simulation, including AI-enabled

- ML, inferencing, DL development, Spark, HPC AI (Libratus, Pluribus)

- Distributed AI/ML, Spark, etc.

- Simulation & AI

- Large-memory Python, R, MATLAB, and Java

- User interfaces for AIaaS, BDaaS

- Project & community datasets

- Robust paths to parallel storage


Looking towards tomorrow: GPU Nodes

Bridges’ Phase 1 and Phase 2 GPU nodes accelerate both deep learning and simulation codes:

Phase 1: 16 nodes, each with:

• 2 × NVIDIA Tesla K80 GPUs

• 2 × Intel Xeon E5-2695 v3 (14c, 2.3/3.3 GHz)

• 128GB DDR4-2133 RAM

Phase 2: +32 nodes, each with:

• 2 × NVIDIA Tesla P100 GPUs

• 2 × Intel Xeon E5-2683 v4 (16c, 2.1/3.0 GHz)

• 128GB DDR4-2400 RAM


| Type | RAM | # | CPU / GPU / SSD | Server |
|---|---|---|---|---|
| ESM | 12 TB (b) | 2 | 16 × Intel Xeon E7-8880 v3 (18c, 2.3/3.1 GHz, 45 MB LLC) | HPE Integrity Superdome X |
| ESM | 12 TB (c) | 2 | 16 × Intel Xeon E7-8880 v4 (22c, 2.2/3.3 GHz, 55 MB LLC) | HPE Integrity Superdome X |
| LSM | 3 TB (b) | 8 | 4 × Intel Xeon E7-8860 v3 (16c, 2.2/3.2 GHz, 40 MB LLC) | HPE ProLiant DL580 |
| LSM | 3 TB (c) | 34 | 4 × Intel Xeon E7-8870 v4 (20c, 2.1/3.0 GHz, 50 MB LLC) | HPE ProLiant DL580 |
| RSM | 128 GB (b) | 752 | 2 × Intel Xeon E5-2695 v3 (14c, 2.3/3.3 GHz, 35 MB LLC) | HPE Apollo 2000 |
| RSM-GPU | 128 GB (b) | 16 | 2 × Intel Xeon E5-2695 v3 + 2 × NVIDIA Tesla K80 | HPE Apollo 2000 |
| RSM-GPU | 128 GB (c) | 32 | 2 × Intel Xeon E5-2683 v4 (16c, 2.1/3.0 GHz, 40 MB LLC) + 2 × NVIDIA Tesla P100 | HPE Apollo 2000 |
| GPU-AI16 | 1.5 TB (d) | 1 | 16 × NVIDIA V100 32GB SXM2 + 2 × Intel Xeon Platinum 8168 + 8 × 3.84 TB NVMe SSDs | NVIDIA DGX-2 delivered by HPE |
| GPU-A8 | 192 GB (d) | 9 | 2 × Intel Xeon Gold 6148 + 2 × 3.84 TB NVMe SSDs | HPE Apollo 6500 Gen10 |
| DB-s | 128 GB (b) | 6 | 2 × Intel Xeon E5-2695 v3 + SSD | HPE ProLiant DL360 |
| DB-h | 128 GB (b) | 6 | 2 × Intel Xeon E5-2695 v3 + HDDs | HPE ProLiant DL380 |
| Web | 128 GB (b) | 6 | 2 × Intel Xeon E5-2695 v3 | HPE ProLiant DL360 |
| Other (a) | 128 GB (b) | 16 | 2 × Intel Xeon E5-2695 v3 | HPE ProLiant DL360, DL380 |
| Gateway | 64 GB (b) | 4 | 2 × Intel Xeon E5-2683 v3 (14c, 2.0/3.0 GHz, 35 MB LLC) | HPE ProLiant DL380 |
| Gateway | 64 GB (c) | 4 | 2 × Intel Xeon E5-2683 v3 | |
| Gateway | 96 GB (d) | 2 | 2 × Intel Xeon | |
| Storage | 128 GB (b) | 5 | 2 × Intel Xeon E5-2680 v3 (12c, 2.5/3.3 GHz, 30 MB LLC) | Supermicro X10DRi |
| Storage | 256 GB (c) | 15 | 2 × Intel Xeon E5-2680 v4 (14c, 2.4/3.3 GHz, 35 MB LLC) | Supermicro X10DRi |
| Total | 286.5 TB | 920 | | |

a. Other nodes = front end (2) + management/log (8) + boot (2) + MDS (4)

b. DDR4-2133

c. DDR4-2400

d. DDR4-2666

Bridges-DL: the GPU-AI16 and GPU-A8 nodes
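The totals row can be cross-checked from the per-type rows. The sketch below is only an illustration: the row labels are shorthand invented here, and it assumes the slide counts 1 TB as 1024 GB, which is the convention that reproduces the 286.5 TB figure.

```python
# Cross-check of the node table: (label, RAM per node in GB, node count).
# Labels are shorthand for the table rows; TB = 1024 GB is an assumption.
TB = 1024

NODES = [
    ("ESM v3", 12 * TB, 2),
    ("ESM v4", 12 * TB, 2),
    ("LSM v3", 3 * TB, 8),
    ("LSM v4", 3 * TB, 34),
    ("RSM", 128, 752),
    ("RSM-GPU K80", 128, 16),
    ("RSM-GPU P100", 128, 32),
    ("GPU-AI16", 1.5 * TB, 1),
    ("GPU-A8", 192, 9),
    ("DB-s", 128, 6),
    ("DB-h", 128, 6),
    ("Web", 128, 6),
    ("Other", 128, 16),
    ("Gateway b", 64, 4),
    ("Gateway c", 64, 4),
    ("Gateway d", 96, 2),
    ("Storage b", 128, 5),
    ("Storage c", 256, 15),
]


def totals(nodes):
    """Return (total node count, total RAM in TB)."""
    total_nodes = sum(count for _, _, count in nodes)
    total_ram_gb = sum(ram_gb * count for _, ram_gb, count in nodes)
    return total_nodes, total_ram_gb / TB


total_nodes, total_ram_tb = totals(NODES)
print(total_nodes, total_ram_tb)
```

Both results agree with the Total row of the table, which also confirms that the "Other" category holds 16 nodes.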


The Heart of Bridges-DL: NVIDIA Volta

New Streaming Multiprocessor (SM) architecture, introducing Tensor Cores, independent thread scheduling, combined L1 data cache and shared memory unit, and 50% higher energy efficiency over Pascal.

Tensor Cores accelerate deep learning training and inference, providing up to 12× and 6× higher peak flops respectively over the P100 GPUs currently available in XSEDE.

NVLink 2.0 delivering 300 GB/s total bandwidth per GV100, nearly 2× higher than P100.

HBM2 bandwidth and capacity increases: 900 GB/s and up to 32GB.

Enhanced Unified Memory and Address Translation Services improve accuracy of memory page migration by providing new access counters.

Cooperative Groups and New Cooperative Launch APIs expand the programming model to allow organizing groups of communicating threads.

Volta-Optimized Software includes new versions of frameworks and libraries optimized to take advantage of the Volta architecture: TensorFlow, Caffe2, MXNet, CNTK, cuDNN, cuBLAS, TensorRT, etc.

Figure: NVIDIA Tesla V100 SXM2 Module with Volta GV100 GPU

Training ResNet-50 with ImageNet:

V100: 1075 images/s (a)

P100: 219 images/s (b)

K80: 52 images/s (b)

a. https://devblogs.nvidia.com/tensor-core-ai-performance-milestones/

b. https://www.tensorflow.org/performance/benchmarks
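To put the ResNet-50 throughputs in perspective, the short sketch below turns them into speedup ratios and a rough time per pass over the data. It assumes the ImageNet-1k training set of 1,281,167 images, and it treats the three figures as comparable even though they come from two different sources (a and b), so the ratios are indicative only.

```python
# Throughputs quoted on the slide (images/s, ResNet-50 training on ImageNet).
THROUGHPUT = {"V100": 1075, "P100": 219, "K80": 52}

# ImageNet-1k (ILSVRC 2012) training-set size; an assumption about which
# ImageNet subset the benchmarks refer to.
IMAGENET_TRAIN_IMAGES = 1_281_167


def speedup(new_gpu, old_gpu, table=THROUGHPUT):
    """Ratio of training throughput between two GPU generations."""
    return table[new_gpu] / table[old_gpu]


def hours_per_epoch(gpu, images=IMAGENET_TRAIN_IMAGES, table=THROUGHPUT):
    """Hours for one pass over the training set on a single GPU."""
    return images / table[gpu] / 3600


for gpu in THROUGHPUT:
    print(gpu, round(speedup("V100", gpu), 1), round(hours_per_epoch(gpu), 2))
```

On these numbers a V100 is roughly 5× faster than a P100 and roughly 20× faster than a K80 for this workload.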


Balancing AI Capability & Capacity: HPE Apollo 6500

Bridges-DL adds 9 HPE Apollo 6500 Gen10 servers

Each HPE Apollo 6500 couples 8 NVIDIA Tesla V100 SXM2 GPUs

– 40,960 CUDA cores and 5,120 tensor cores

Performance: 1Pf/s mixed-precision tensor, 125Tf/s 32b, 64Tf/s 64b

Memory: 128GB HBM2, 7.2TB/s aggregate memory bandwidth

2× Intel Xeon Gold 6148 CPUs and 192GB of DDR4-2666 RAM

– 20c, 2.4–3.7 GHz, 27.5 MB L3, 3 UPI links

2×4TB NVMe SSDs for user and system data

1× Intel Omni-Path host channel adapter

Hybrid cube-mesh topology connecting the 8 V100 GPUs and 2 Xeon CPUs, using NVLink 2.0 between the GPUs and PCIe3 to the CPUs

Figures: HPE Apollo 6500 Gen10 Server; HPE Apollo 6500 Gen10 hybrid cube-mesh topology
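The per-node figures for the Apollo 6500 follow from multiplying per-GPU V100 numbers by eight. The sketch below shows that arithmetic under stated assumptions: the 5,120 CUDA cores and 640 Tensor Cores per GPU come from NVIDIA's published V100 specification rather than from the slides, and the 16 GB of HBM2 per GPU is inferred from 128 GB across 8 GPUs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class V100:
    cuda_cores: int = 5120         # per GPU, NVIDIA V100 specification (not on the slide)
    tensor_cores: int = 640        # per GPU, NVIDIA V100 specification (not on the slide)
    hbm2_bandwidth_gbs: int = 900  # per GPU, as quoted on the Volta slide


def node_aggregate(num_gpus, hbm2_gb_per_gpu, gpu=V100()):
    """Aggregate GPU resources of one multi-GPU server."""
    return {
        "cuda_cores": num_gpus * gpu.cuda_cores,
        "tensor_cores": num_gpus * gpu.tensor_cores,
        "hbm2_gb": num_gpus * hbm2_gb_per_gpu,
        "hbm2_bandwidth_tbs": num_gpus * gpu.hbm2_bandwidth_gbs / 1000,
    }


# 16 GB per GPU is inferred from the slide's 128 GB total over 8 GPUs.
apollo_6500 = node_aggregate(num_gpus=8, hbm2_gb_per_gpu=16)

# Nine Apollo 6500 servers plus one 16-GPU DGX-2.
bridges_ai_v100s = 9 * 8 + 1 * 16

print(apollo_6500, bridges_ai_v100s)
```

The same function with `num_gpus=16` and 32 GB per GPU gives the aggregate figures of a 16-GPU server.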


Maximum DL Capability: NVIDIA DGX-2

Couples 16 NVIDIA Tesla V100 SXM2 GPUs

– 81,920 CUDA cores and 10,240 tensor cores

Performance: 2Pf/s mixed-precision tensor, 251Tf/s 32b, 125Tf/s 64b

Memory: 512GB HBM2, 14.4TB/s aggregate memory bandwidth

2× Intel Xeon Platinum 8168 CPUs and 1.5TB of DDR4-2666 RAM

– 24c, 2.7–3.7GHz, 33 MB L3, 3 UPI links

2×960GB NVMe SSDs host the Ubuntu Linux OS

8×3.84 TB NVMe SSDs (aggregate ~30 TB) for user data

8×Mellanox ConnectX adapters for EDR InfiniBand & 100 Gb/s Ethernet

The NVSwitch tightly couples the 16 V100 GPUs for capability & scaling

– Each of the 12 NVSwitch chips is an 18×18-port, fully-connected crossbar

– 50 GB/s/port and 900 GB/s/chip bidirectional bandwidths

– 2.4TB/s system bisection bandwidth

Figures: NVIDIA DGX-2; NVIDIA DGX-2 with NVSwitch internal topology
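The NVSwitch bandwidth figures can be reproduced from the per-port number. This is a back-of-the-envelope sketch: the six NVLink ports per V100 is taken from NVIDIA's V100 specification (it is consistent with the 300 GB/s per GV100 quoted earlier), and the bisection estimate assumes each of the 8 GPUs on one side of the cut can drive all of its links across it.

```python
NVLINK_PORT_GBS = 50     # bidirectional GB/s per NVLink 2.0 port (from the slide)
PORTS_PER_NVSWITCH = 18  # ports per NVSwitch chip (from the slide)
NVLINKS_PER_V100 = 6     # NVIDIA V100 specification; an assumption not stated on the slide
NUM_GPUS = 16


def per_chip_bandwidth_gbs():
    """Aggregate bidirectional bandwidth of one NVSwitch chip."""
    return PORTS_PER_NVSWITCH * NVLINK_PORT_GBS


def per_gpu_bandwidth_gbs():
    """Total NVLink bandwidth available to one V100."""
    return NVLINKS_PER_V100 * NVLINK_PORT_GBS


def bisection_bandwidth_tbs():
    """Bandwidth across a cut that splits the 16 GPUs into two groups of 8."""
    return (NUM_GPUS // 2) * per_gpu_bandwidth_gbs() / 1000


print(per_chip_bandwidth_gbs(), per_gpu_bandwidth_gbs(), bisection_bandwidth_tbs())
```

The three results match the 900 GB/s per chip, 300 GB/s per GPU, and 2.4 TB/s bisection figures quoted for the DGX-2.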


Appendix

For your leisurely perusal.

Questions Welcome!


AI @ PSC: The Evolution of No-Limit Texas Hold’em Poker


Blacklight

TartanianN

Claudico

Prof. Tuomas Sandholm

Noam Brown

2010

2015

• Imperfect information, representative of real-world challenges

• Heads-Up: 10^161 situations


2019

August 30, 2019

Surpassing Human Expertise


2017

Made possible with:

19M core-hours of computing

2.6PB knowledge base


Toward a Deeper Understanding of Cancer

Spatial maps of populations of tumor and immune cells provide a snapshot of the interaction between a tumor and the patient’s immune system.

Dr. Joel Saltz Dr. Raj Gupta

DEEP DIVE: AI QUANTIFICATION OF TUMOR-INFILTRATING LYMPHOCYTES


Pathomics Using Whole-Slide Images (WSIs)

Whole-slide images (WSIs) are generated from glass slides that are used by pathologists to detect and classify cancer and other diseases to guide patient care and treatment.

WSIs are large: up to 100,000 × 100,000 pixels, and 1–4 GB each.

Pathomics Data in Digital Pathology: generated by extracting and quantifying image-based phenotypic features of tissues, cells, nuclei, and other structures and objects in whole-slide images (WSIs) of tissue samples used for diagnostic testing of cancer. H. Le et al., arXiv:1905.10841v2
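Images of this size are normally processed patch by patch rather than as a whole. The sketch below estimates how many patches cover a maximum-size WSI and how large the raw pixel data would be. The 256-pixel patch size and the 3 bytes per pixel (24-bit RGB) are assumptions chosen for illustration, not values taken from the cited paper.

```python
import math


def tile_grid(width, height, patch, stride=None):
    """Return (columns, rows) of patches needed to cover a width x height image."""
    stride = stride or patch
    cols = math.ceil(max(width - patch, 0) / stride) + 1
    rows = math.ceil(max(height - patch, 0) / stride) + 1
    return cols, rows


def raw_size_gb(width, height, bytes_per_pixel=3):
    """Uncompressed size in GB, assuming 24-bit RGB pixels by default."""
    return width * height * bytes_per_pixel / 1e9


cols, rows = tile_grid(100_000, 100_000, patch=256)  # hypothetical patch size
print(cols * rows, raw_size_gb(100_000, 100_000))
```

Under these assumptions one slide yields about 150,000 non-overlapping patches and 30 GB of raw pixels, which suggests that the 1–4 GB files quoted above are stored compressed.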


Tumor Infiltrating Lymphocytes (TILs)

Tumor infiltrating lymphocytes (TILs) are particularly important in predicting response to cancer immune therapies.

Spatial maps of tumors and TILs provide a snapshot of the interaction and are clinically valuable, particularly for immunotherapy.

– High densities of TILs correlate with favorable clinical outcomes including longer disease-free survival or improved overall survival in multiple cancer types.

– The spatial context and the nature of cellular heterogeneity are important in cancer prognosis.

A: A multi-gigapixel whole slide breast cancer image obtained from the public Cancer Genome Atlas collection.

B, C: Predicted probability maps corresponding to tumor and infiltrating lymphocytes.

D: A map depicting the spatial distribution of lymphocytes in and near tumor regions.

H. Le et al., arXiv:1905.10841v2
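One simple way to picture how two probability maps (panels B and C) could be merged into a spatial map like panel D is to threshold each map per patch and label the overlap. The sketch below is only an illustration of that idea: the 0.5 thresholds, the label names, and the toy grids are hypothetical and are not the procedure used by Le et al.

```python
def combine_maps(tumor_prob, til_prob, tumor_thresh=0.5, til_thresh=0.5):
    """Label each patch: 'TL' tumor with lymphocytes, 'T' tumor only,
    'L' lymphocytes only, '.' neither. Thresholds are hypothetical."""
    combined = []
    for tumor_row, til_row in zip(tumor_prob, til_prob):
        row = []
        for t, l in zip(tumor_row, til_row):
            is_tumor, is_til = t >= tumor_thresh, l >= til_thresh
            if is_tumor and is_til:
                row.append("TL")
            elif is_tumor:
                row.append("T")
            elif is_til:
                row.append("L")
            else:
                row.append(".")
        combined.append(row)
    return combined


def til_fraction_in_tumor(combined):
    """Fraction of tumor patches that also contain lymphocytes."""
    tumor = sum(cell in ("T", "TL") for row in combined for cell in row)
    infiltrated = sum(cell == "TL" for row in combined for cell in row)
    return infiltrated / tumor if tumor else 0.0


# Toy patch-level probabilities (made up for illustration).
tumor = [[0.9, 0.8, 0.1], [0.7, 0.2, 0.3]]
tils = [[0.6, 0.1, 0.9], [0.2, 0.1, 0.8]]
tumor_til_map = combine_maps(tumor, tils)
print(tumor_til_map, til_fraction_in_tumor(tumor_til_map))
```

A summary statistic such as the infiltrated fraction is one possible way to quantify the patterns that the maps show visually.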


Deep Learning for Cancer and Lymphocyte Detection

• Independent networks were trained for cancer detection (C) and lymphocyte classification (L)

• Evaluated VGG16, ResNet-34, and Inception-v4

– ResNet-34 and Inception-v4: dimension of the output layer reduced from 1,000 classes to 2 classes for binary classification

– VGG16: size of the intermediate features of the classification layer reduced from 4,096 to 1,024, keeping only the first 4 classification layers

• CNNs pretrained on ImageNet and implemented in PyTorch

Le, Gupta, Hou, Abousamra, Fassler, Kurc, Samaras, Batiste, Zhao, Van Dyke, Sharma, Bremer, Almeida, and Joel Saltz, Utilizing Automated Breast Cancer Detection to Identify Spatial Distributions of Tumor Infiltrating Lymphocytes in Invasive Breast Cancer, arXiv:1905.10841v2.


AI Detection of Complex Heterogeneous Forms of Breast Cancer


AI-Driven Analyses of Multi-Gigapixel WSIs of Breast Cancer

Tissue that is diffusely infiltrated by TILs

TILs that are primarily outside and adjacent to the tumor at the invasive leading edge but unable to infiltrate the tumor

Limited scattered TILs in the tumor

Scant TILs in a focal area of the tumor

H&E stained WSIs of tissue sections

Probability of TILs mapped to the tissue

Combined tumor-TIL map


Tumor-Immune Interaction in Multiple Disease Sites: Pancreatic Cancer


How Bridges-AI Helps

• Bridges-AI is playing a crucial role in the development of methods to extract image-based biomarkers to use pathomics to steer patient treatment.

• Computationally intensive AI algorithms are indispensable to this work in creating highly detailed, quantitative, and nuanced characterization of the dynamic and complex immune responses within the tumor microenvironment.

• Algorithm development, evaluation, and validation are all critical.

• Both training and inference (predictions) are highly computationally intensive.

Future Directions

Accuracy thus far is the best published, but ResNet-50 and ResNet-152 have not even been tried. Their training times will be substantially higher.

1. Le, Gupta, Hou, Abousamra, Fassler, Kurc, Samaras, Batiste, Zhao, Van Dyke, Sharma, Bremer, Almeida, and Joel Saltz, Utilizing Automated Breast Cancer Detection to Identify Spatial Distributions of Tumor Infiltrating Lymphocytes in Invasive Breast Cancer, https://arxiv.org/abs/1905.10841 (2019).

2. Paola A. Buitrago, Nicholas A. Nystrom, Rajarsi Gupta, and Joel Saltz, Delivering Scalable Deep Learning to Research with Bridges-AI, CARLA 2019, Springer, to appear.


Deep Learning for Large Volume Cancer Imaging Analysis

Shandong Wu, University of Pittsburgh

Bridges resources: GPU, AI

Bridges GPU and AI enable the Intelligent Computing for Clinical Imaging (ICCI) lab to perform deep learning-based breast imaging analysis for:

– Breast cancer risk prediction (including risk of developing breast cancer in screening populations and risk of recurrence or new cancer development for patients with a breast cancer diagnosis);

– Breast tumor proliferation rate estimation;

– Reducing unnecessary call-backs for screening;

– Pre-reading of digital breast tomosynthesis images to aid radiologists for more efficient and accurate breast cancer diagnosis;

– Enhancing deep learning interpretability for different breast image classification;

– Incorporating physician expert knowledge into deep learning modeling.

Figure: Top: Deep learning modeling pipeline. Bottom-left: Model prediction performance. Bottom-right: Visualization of deep learning-identified imaging features in relation to the risk prediction.

AI on XSEDE Systems Promises Early Prediction of Breast Cancer, insideHPC, September 9, 2019. https://insidehpc.com/2019/09/ai-on-xsede-systems-promises-early-prediction-of-breast-cancer/

“Bridges has enabled us to process large volume, both 2D and 3D, and different modalities of breast images including digital mammography, breast magnetic resonance imaging (MRI), and digital breast tomosynthesis.” (Shandong Wu, UPitt)

Future Directions

Work to date has used reduced-resolution images. For clinical use, models will need to be trained on millions of full-resolution (3000×2000) mammograms.


Shandong Wu, Director

Asst. Professor of:

• Radiology

• Biomedical Informatics

• Bioengineering

• Computer Science & Intelligent

Systems

• Computational Biology

• Clinical and Translational Science

• Machine Learning (CMU adjunct)


Mapping Between Language Models and Brain Activity

Leila Wehbe, CMU

Cross-modality brain mapping during language processing

– Combining fMRI and MEG brain imaging modalities using language features from a deep network language model (ELMo) as an intermediary, using a Multiview Autoencoder: a novel architecture that encodes/decodes activity of multiple subjects into/from common representations.

– An encoding model takes ELMo features of a word as input and produces fMRI/MEG activity as output. The weights of this encoding model are then interpreted as the feature patterns that maximally activate the corresponding voxel/sensor. This is followed by a correlation analysis to see which voxels “look” for the same ELMo feature patterns as the MEG signal, for a time bin.
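The correlation step described above can be pictured as comparing weight vectors. The following sketch assumes each voxel and each MEG sensor/time bin has already been reduced to a vector of encoding-model weights over the same language features, then ranks voxels by Pearson correlation with the MEG weights. The voxel names and four-element vectors are toy values invented for illustration; this is not the project's actual pipeline.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def rank_voxels(voxel_weights, sensor_weights):
    """Sort voxels by how closely their encoding weights match the MEG weights."""
    scores = {name: pearson(w, sensor_weights) for name, w in voxel_weights.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical encoding weights over four language features.
meg_time_bin = [1.0, 2.0, 3.0, 4.0]
voxels = {
    "voxel_a": [2.0, 4.0, 6.0, 8.0],
    "voxel_b": [4.0, 3.0, 2.0, 1.0],
    "voxel_c": [1.0, 3.0, 2.0, 4.0],
}
ranking = rank_voxels(voxels, meg_time_bin)
print(ranking)
```

Voxels at the top of the ranking respond to the same feature pattern as the MEG signal in that time bin; repeating the ranking per time bin would give a coarse picture of which regions track the signal when.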

Syntactic representations in the human brain: going beyond effort-based metrics

– Recent neuroimaging experiments with natural stories have shown that the number of syntactic operations performed at each word correlates with the activity of certain brain regions.

– Wehbe proposes to supplement effort-based metrics with syntax feature spaces that describe the actual syntactic structure that evolves in a text when it is read word-by-word.

– Using fMRI recordings of participants reading a natural text word-by-word, they find that many brain regions associated with language processing are predicted with higher accuracy from their syntax feature spaces than from effort-based metrics (see figure).

Figure: Wehbe’s constructed syntactic structure embeddings (ConTreGE) are better predictors of brain activity than simple effort metrics (Node Count). ConTreGE embeddings computed with sub-trees of depth 4 appear to be best at predicting brain activity.

These results suggest that the temporal cortex is computing language structure of intermediate complexity.

Bridges resources: RM, AI


Scanning for Zebrafish Neurons using a 3D CNN

Joel Welling, Liyunshu Qian*, Richard Zhao*, Minyue Fan*, and Brian Leonard*, PSC

Bridges resources: AI

Bridges-AI is being used to scan a representative fraction of a larval zebrafish electron microscopy (EM) dataset for neurons using a novel deep neural network (DNN).

– Fish neurons within the dataset have been identified using a computer-assisted manual method. These known neurons were used to build a training set for the DNN, but only a tiny fraction of the fish can be included in the training set. The goals are to test the DNN in a more representative sample of the zebrafish volume and to add well-chosen cases to the training set.

– The DNN was developed by PI Welling, with assistance from several PSC interns. The DNN’s unique feature is that the calculation is being done in a spherically symmetric way, rather than slice-wise or in terms of a rectangular array of data voxels. It is a double convolutional neural network (CNN), whose first half performs spherically symmetric filtering on the input and then feeds into a conventional CNN.

– Results so far are promising. They are expanding their procedure to the edge of the available zebrafish data.

Figure: The grayscale surfaces show cuts through the 3D zebrafish EM data. Human-labeled neuron locations are blue; the orange and green surfaces form a double wall around regions identified by the DNN. The green inner wall appears where the volume boundary cuts the surface. Human-labeled neurons generally fall within the boundary.

Future Directions

The larval zebrafish brain is only ~750 µm long. Scaling to larger datasets, for example, mouse or a mm³ of the human cortex, will require vast computation.

* Undergraduate intern at PSC


Neural Map: Structured Memory for Deep Reinforcement Learning

Ruslan Salakhutdinov and Emilio Parisotto, Carnegie Mellon University

In reinforcement learning (RL) the agent is provided with a reward signal as a result of its continual interaction with its environment.

– The resulting mapping from observations to actions maximizes the rewards the agent receives.

– Deep reinforcement learning (DRL) parameterizes this mapping with a deep network. Long-term memory structures are crucial to scaling up DRL agents to master partially-observable tasks such as 3D games with long-term objectives and self-driving cars (partial occlusion).
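The reward-driven mapping from observations to actions can be illustrated with the simplest form of RL, a lookup table instead of a deep network. The sketch below applies the tabular Q-learning update to a made-up five-cell corridor in which only reaching the last cell is rewarded. It sweeps over every state-action pair rather than letting an agent explore, and the environment, discount factor, and learning rate are all hypothetical; it is not DRL and not the Neural Map architecture.

```python
GAMMA = 0.9    # discount factor (hypothetical value)
ALPHA = 1.0    # learning rate; 1.0 suffices because this toy environment is deterministic
N_STATES = 5   # corridor cells 0..4; reaching cell 4 ends the episode
LEFT, RIGHT = 0, 1


def step(state, action):
    """Toy environment: returns (next_state, reward, done)."""
    nxt = max(state - 1, 0) if action == LEFT else min(state + 1, N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def train(sweeps=50):
    """Apply the Q-learning update to every (state, action) pair, repeatedly."""
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(sweeps):
        for state in range(N_STATES - 1):  # the terminal cell needs no update
            for action in (LEFT, RIGHT):
                nxt, reward, done = step(state, action)
                target = reward + (0.0 if done else GAMMA * max(q[nxt]))
                q[state][action] += ALPHA * (target - q[state][action])
    return q


def greedy_policy(q):
    """Observation-to-action mapping: pick the highest-valued action per cell."""
    return [max((LEFT, RIGHT), key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = train()
policy = greedy_policy(q_table)
print(policy, q_table[0])
```

The learned policy moves right in every cell, and the value of the first cell is discounted once per remaining step. DRL replaces the table with a deep network so that the same idea can cope with image observations.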

Salakhutdinov and Parisotto designed a novel, spatially structured memory for 2D and 3D RL agents where navigation is a crucial part of the task.

– They demonstrated its ability to store information over long time lags, enabling it to surpass the performance of several previous memory-based agents.

– This is illustrated at right, where the agent observes a green or red torch and must find the corresponding green or red tower by navigating a 3D maze environment in Doom. The table shows the success rates of agents trained on a fixed map, then tested on six previously unseen maps.

Figure: ViZDoom (Kempka et al., 2016)

Bridges resources: AI

Future Directions

Scaling to increasingly challenging real-world applications is expected to yield great benefit; however, DRL is highly demanding computationally.


Reinforcement Learning for Self Driving Cars

Jeff Schneider, Carnegie Mellon University

Bridges resources: AI

Traditional autonomous vehicle pipelines are modular, with different subsystems tasked with perception, state estimation, prediction, motion planning, and control. There are strong limitations with this approach, the foremost being difficulty generalizing to novel scenes and the existence of many subsystem parameters to optimize.

– Reinforcement learning has been used in prior work for training agents to complete games and perform robotic manipulation, a paradigm that could transfer naturally to the task of learning to drive. This would provide numerous benefits, including decreased reliance on rule-based systems and the ability to rapidly transfer agent behavior to new settings.

– Schneider used Bridges-AI to compare systems that were trained end-to-end using deep reinforcement learning algorithms (model-free, continuous and discrete, on-policy and off-policy).

– They are investigating the dependence on increasing observation space complexity, how that relates to sample efficiency, and what the broader implications are for training autonomous vehicle agents using deep RL algorithms at scale.

Figure: A scene from the CARLA simulator showing what the reinforcement learning agent sees while learning to drive. Available at https://youtu.be/YmV6WskOBEs.


Urban Traffic Analysis

José Moura, Carnegie Mellon University

Bridges resources: GPU, AI

Vehicle counts at a city's intersections can be estimated from the CCTV camera videos recorded there. However, the task is severely challenged by drastic variations in vehicle scale caused by differing camera perspectives. Perspective variations make generalization from labeled cameras (source) to unlabeled ones (target) highly non-trivial. Most existing counting methods overfit to the data from the source camera, thus falling short of effective generalization.

– In this project, a unified deep neural network is used to automatically learn camera perspective and adapt the counting model to unlabeled target cameras with perspective alignment.

– The network has two main components:

1) a vehicle density & count estimation branch; and

2) a perspective alignment branch that maps the target perspective into the source perspective via a generative adversarial framework.

– By end-to-end training of the whole network, the feature extractor is able to extract features that are both discriminative for vehicle counting and reduce the geometric divergence between source and target.

– The geometry-aware learning framework embeds the camera perspective into the counting model and adapts it from source to target. Extensive experiments on both vehicle and highly occluded crowd datasets verify the efficacy of the proposed methods and show significant improvement compared to prior state-of-the-art methods.
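In density-based counting, which is the usual formulation behind a "density & count estimation" branch, the network predicts a per-pixel density map and the count is the sum of that map. The sketch below shows only this final step on a hand-written toy map; the values and the region helper are hypothetical, and the perspective-alignment branch is not modeled.

```python
def count_from_density(density):
    """Estimated vehicle count = sum (discrete integral) of the density map."""
    return sum(sum(row) for row in density)


def region_count(density, top, left, height, width):
    """Estimated count inside a rectangular region, e.g. one traffic lane."""
    return sum(sum(row[left:left + width]) for row in density[top:top + height])


# Toy 3 x 4 density map: each vehicle contributes a total mass of 1.0,
# spread over the pixels it covers (values made up for illustration).
density_map = [
    [0.0, 0.25, 0.25, 0.0],
    [0.0, 0.25, 0.25, 0.0],
    [0.5, 0.5, 0.0, 0.0],
]

print(count_from_density(density_map), region_count(density_map, 0, 1, 2, 2))
```

Because a vehicle far from the camera covers fewer pixels than a near one, the same unit of mass is spread over regions of very different size, which is one way to see why perspective matters for this formulation.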


Mouse Behavior Recognition with Deep Neural Networks

Vivek Kumar, The Jackson Laboratory

Bridges resources: AI

The Kumar lab used the Bridges-AI compute resource to perform pose inference on a large set of video recordings of multiple mouse strains in an open field experimental setup.

– The pose tracking inferences generated on Bridges-AI are essential to downstream gait analysis.

– The gait analysis allows them to characterize important phenotypic differences between strains that inform us on the progression of neuromuscular or behavioral disease.

– Their approach to pose inference relies on deep neural networks.

“Since we needed to perform inference on more than 2000 hours of video the only way to get results in a reasonable amount of time was to use a GPU cluster of the scale that Bridges-AI provides.” (Vivek Kumar, The Jackson Laboratory)


Severe Thunderstorm Prediction with Big Visual DataJames Z. Wang et al., Penn State

31

Applying machine learning to detect severe storm-causing

clouds

– Leveraging the vast historical archive of satellite imagery, radar

data, and weather report data from the NOAA to train statistical

models including deep neural networks on Bridges’ CPUs and

GPUs

– Achieved high accuracy in detection of cloud patterns

– Developed fundamental statistical methods for data analysis

– Increasing the prediction lead time using deep models and GPUsDetection of severe storm causing comma-shaped clouds

from satellite images

Detection and categorization of bow echoes

from weather radar data

1. Zheng, X., Ye, J., Chen, Y., Wistar, S., Li, J., Fernández, J.A.P., Steinberg, M.A., Wang,

J.Z.: Detecting Comma-Shaped Clouds for Severe Weather Forecasting Using

Shape and Motion. IEEE Trans. Geosci. Remote Sens. 57, 3788–3801 (2019).

https://doi.org/10.1109/TGRS.2018.2887206.

2. J. Ye, P. Wu, J. Z. Wang, J. Li, Fast Discrete Distribution Clustering Using

Wasserstein Barycenter With Sparse Support. IEEE Transactions on Signal

Processing 65, 2317-2332 (2017) doi: 10.1109/TSP.2017.2659647.

AI

RM

Future Directions

Achieving operational scale will require training many times on NOAA’s 10-year CONUS

radar dataset, where a single epoch takes 4 hours on Bridges-AI’s DGX-2.


Utilizing Very High Temporal and Spatial Resolution Multi-Sensor Data

James Z. Wang, Pennsylvania State University


Enabling computational meteorology through highly-

efficient training of deep neural networks

A Faster R-CNN model is used to detect severe weather

events from radar images.

– The most important factor for training the model with

high-resolution (800GB) radar images is GPU speed.

A framework, Targeted Meta-Learning, is being

developed to address biases in datasets for weather

prediction applications.

– Two nested deep learning models are trained

concurrently, placing additional demands on

computational speed for weather images.

– Targeted Meta-Learning is being expanded to other

applications where biases need to be properly addressed.

Deep severe weather detection neural network trained on

Bridges-AI. Top: The model processes real-time radar images to

find regions of possible severe weather events. Bottom:

Targeted Meta-Learning framework developed to address biases

in machine learning datasets.

RM

AI


Predictive Modeling for Climate Resilient Phenotypes in Mega-Size Plant Genomes

Charles Chen, Translational Genomics Laboratory, Oklahoma State University


Genomic prediction that estimates phenotypic

variation based on whole-genome

polymorphism:

– Factors affecting predictability are

comprehensiveness and the learning capacity of

prediction algorithms.

– Convolutional deep learning models for bacterial

antimicrobial-resistance phenotypes require high-

dimensional whole bacterial genomes (median

number of variables: 2.84 million base-pairs) to

produce intermediate tensors and eventually

classify the antibiotic resistance of a given strain.
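
As a rough illustration of how a whole genome could become a tensor for a convolutional model, the sketch below one-hot encodes a DNA string and estimates the size of the resulting input. The encoding scheme, the function names, and the 4-byte value size are assumptions made for illustration; the slide does not say how the genomes were actually encoded.

```python
BASES = "ACGT"  # assumed channel order; not stated in the source


def one_hot(sequence):
    """Encode a DNA string as a list of 4-element rows, one row per base."""
    index = {base: i for i, base in enumerate(BASES)}
    rows = []
    for base in sequence.upper():
        row = [0.0] * len(BASES)
        if base in index:  # ambiguous bases such as N stay all-zero
            row[index[base]] = 1.0
        rows.append(row)
    return rows


def input_megabytes(n_bases, bytes_per_value=4):
    """Size of the one-hot input for a genome of n_bases, in megabytes."""
    return n_bases * len(BASES) * bytes_per_value / 1e6


encoded = one_hot("ACGTN")
genome_mb = input_megabytes(2_840_000)  # median genome size quoted above
```

Under these assumptions the input itself is only tens of megabytes; it is the intermediate tensors produced by the convolutional layers that grow with the number of filters and the batch size.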

“With the new Volta V100 GPUs (32GB Graphics RAM), we are

able to fit tensors with millions of dimensions into the GPU-

RAM and have it easily train a model in under a day using

TensorFlow.” —Charles Chen, Oklahoma State University

Antibiotic resistance can be predicted at a level comparable to that of

traditional MIC (minimum inhibitory concentration) methods, suggesting a

possible paradigm shift in antibiotic resistance detection toward

predictive analytics.

AI

Future Directions

Scale this work on bacterial genomes (2.84M base pairs) to predicting

trees’ response to threat from fungi and micro-symbionts, requiring

analysis of 1000 genomes of 20–25Mbp each.


Exploring and Generating Data with Generative Adversarial Networks

Giulia Fanti, Carnegie Mellon University

Privacy-preserving dataset generation

– Fanti & Lin’s recent research aims to understand fundamentally

how Generative Adversarial Networks (GANs) internally represent

complex data structures and to harness these observations to use

GANs for privacy-preserving dataset generation

Generation of synthetic time series data

– Can effectively generate time series datasets that mimic the

patterns of real network data

– GANs are able to capture both short-term and long-term

correlations in data that competing methods cannot, and Fanti’s

GAN can capture low-frequency events that are difficult to learn

for traditional methods. These two attributes are critical for

producing high-fidelity synthetic data.


1. Z. Lin, A. Khetan, G. Fanti, and S. Oh, “PacGAN: The power of two samples

in generative adversarial networks,” arXiv:1712.04086, 2017.

2. K. Thekumparampil, A. Khetan, Z. Lin, and S. Oh, “Robustness of

conditional GANs to noisy labels,” NIPS 2018, 2018 (Spotlight Award).

Autocorrelation for Wikipedia web traffic dataset of time series, over 550 days.

The real data (orange) has a short-term weekly periodic autocorrelation, as well

as a yearly one. GANs can capture both trends almost exactly (black). In contrast,

competing methods such as hidden Markov models (HMMs) (green) and

recurrent neural networks (RNNs) (blue) do not capture both of these

correlations.
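
The autocorrelation plotted in this figure can be computed with a few lines of code. The sketch below uses a synthetic 550-day series with a weekly cycle as a stand-in for the Wikipedia traffic data; the series and the function name are illustrative assumptions, not the authors' code.

```python
import math


def autocorrelation(series, lag):
    """Sample autocorrelation of a series at a given non-negative lag."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    covariance = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return covariance / variance


# Synthetic stand-in: 550 days of traffic with a 7-day period.
series = [math.sin(2 * math.pi * t / 7) for t in range(550)]

weekly = autocorrelation(series, 7)     # close to 1: same weekday
midweek = autocorrelation(series, 3)    # negative: opposite phase
```

A generated dataset that preserves the short-term and long-term correlations should show peaks at the same lags as the real data when passed through the same function.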

CelebA samples generated from DCGAN (left) and PacDCGAN2 (right) show

PacDCGAN2 generates more diverse and sharper images.

AI

GPU

Future Directions

Modeling possible weaknesses in systems currently takes several days for

small-model reinforcement learning and a week using GANs. This will

require substantially greater training effort overall.


Multi-Modal and End-to-End Speech Recognition

Florian Metze, Carnegie Mellon University

Metze’s group developed a novel conversational-context aware

end-to-end speech recognizer based on a gated neural network

that incorporates conversational-context, word, and speech

embeddings.

– Unlike conventional speech recognition models, their model

does not model isolated utterances, but learns to model

entire conversations that span across multiple sentences and

is consequently better at recognizing long conversations,

such as meetings, negotiations, or medical conversations.

– Specifically, they use text-based external word and/or

sentence embeddings (e.g., fastText, BERT) within an end-to-end

framework, yielding significant improvement in word

error rate with better conversational-context representation.

– The models outperform standard end-to-end speech

recognition models on the Switchboard conversational

speech corpus.


AI

Future Directions

This is representative of the very rapidly evolving field of

natural language understanding, for which extremely complex

networks are being developed.


Deep Learning for Large-Scale Astronomical Surveys

Asad Khan, E.A. Huerta, et al., University of Illinois

Khan, Huerta, et al. demonstrate that knowledge from deep

learning algorithms, pre-trained with real-object images, can be

transferred to classify galaxies that overlap Sloan Digital Sky

Survey (SDSS) and Dark Energy Survey (DES) images, achieving

state-of-the-art accuracy around 99.6%.

– They used their neural network classifier to label over 10,000

unlabeled DES galaxies which do not overlap previous

surveys.

– They also used their neural network model as a feature

extractor for unsupervised clustering, finding that unlabeled

DES images can be grouped in two distinct galaxy classes

based on their morphology, providing a heuristic check that

the learning is successfully transferred to the classification of

unlabeled DES images.

– These newly labeled datasets can be combined with

unsupervised recursive training to create large-scale DES

galaxy catalogs in preparation for the Large Synoptic Survey

Telescope era.


To expose the neural network to a variety of potential scenarios for

classification, we augment original galaxy images with random

vertical and horizontal flips, random rotations, height and width

shifts and zooms.
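
The augmentation described in this caption can be sketched on a small 2D image. This is a simplified illustration, not the authors' pipeline: rotations are restricted to multiples of 90 degrees, shifts wrap around the edges, zooms are omitted, and the names `roll`, `augment`, and `max_shift` are hypothetical.

```python
import random


def roll(seq, k):
    """Cyclically shift a list by k positions (wrap-around)."""
    k %= len(seq)
    return seq[-k:] + seq[:-k] if k else seq[:]


def augment(image, rng, max_shift=2):
    """Return a randomly flipped, rotated, and shifted copy of a square image."""
    out = [row[:] for row in image]
    if rng.random() < 0.5:                      # random vertical flip
        out = out[::-1]
    if rng.random() < 0.5:                      # random horizontal flip
        out = [row[::-1] for row in out]
    for _ in range(rng.randrange(4)):           # 0 to 3 clockwise quarter turns
        out = [list(col) for col in zip(*out[::-1])]
    out = roll(out, rng.randint(-max_shift, max_shift))   # height shift
    dx = rng.randint(-max_shift, max_shift)                # width shift
    return [roll(row, dx) for row in out]


# A 4x4 stand-in "galaxy image" with distinct pixel values.
galaxy = [[4 * r + c for c in range(4)] for r in range(4)]
augmented = augment(galaxy, random.Random(0))
```

Each call with a different random state yields another variant of the same galaxy, which is how a limited labeled set is stretched into a larger training set.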

A. Khan, E. A. Huerta, S. Wang, R. Gruendl, E. Jennings, and H. Zheng, “Deep learning at scale for the

construction of galaxy catalogs in the Dark Energy Survey,” Phys. Lett. B, vol. 795, pp. 248–258, 2019.

http://www.sciencedirect.com/science/article/pii/S0370269319303879

GPU


Sampling Rare Events in Biomolecules with All-Atom Simulations Aided by Statistical Mechanics and

Deep Learning – Pratyush Tiwary, University of Maryland, College Park

The ability to rapidly learn from high-dimensional data to make reliable

predictions about the future is crucial in many contexts. This could be a fly

avoiding predators, or the retina processing gigabytes of data to guide human

actions. There are parallels between these and the efficient sampling of

biomolecules with hundreds of thousands of atoms.

– The Predictive Information Bottleneck framework is used for the first two problems

and reformulated for the sampling of biomolecules, especially when plagued with

rare events.

– Tiwary’s method uses a deep neural network to learn the minimally complex yet

most predictive aspects of a given biomolecular trajectory. This information is used

to perform iteratively biased simulations that enhance the sampling and directly

obtain associated thermodynamic and kinetic information. They demonstrate the

method on two test-pieces, studying processes slower than milliseconds, calculating

free energies, kinetics, and critical mutations.

– Their method implements the PIB framework through a unique linear encoder–

stochastic decoder model, using a deep neural network with built-in noise.

– Through extremely short and computationally cheap simulations, they obtained

thermodynamic and kinetic observables for slow biomolecular processes in

excellent agreement with other methods, experiments, and long unbiased MD.

– Having captured the most predictive degrees of freedom in the system, they could

also make, arguably for the first time, direct predictions of how protein sequence

can impact dissociation dynamics – namely, which mutations in the protein would

be most deleterious to the dissociation process.
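
The phrase "minimally complex yet most predictive" describes a trade-off that is usually written in the general information-bottleneck form shown below. The symbols are notation assumed here for illustration: X is the current molecular configuration, X_{\Delta t} the configuration a time \Delta t later, \chi the learned bottleneck variable, I mutual information, and \gamma \ge 0 a weight on complexity. The exact objective in the cited paper may differ in detail.

```latex
\mathcal{L} \;\equiv\; I\!\left(\chi,\, X_{\Delta t}\right) \;-\; \gamma\, I\!\left(X,\, \chi\right)
```

Maximizing the first term rewards a bottleneck that predicts the future state, while the second term penalizes a bottleneck that retains more detail about the present state than it needs.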


Critical residue analysis for benzene–lysozyme complex. The

plot on the left shows, for every residue, the maximal mutual

information between the predictive information bottleneck (PIB)

and either of the Ramachandran angles ϕ, ψ of that residue.

The top 10 residues are highlighted through markers and in the

right plot, illustrated relative to the ligand in a typical

intermediate position.

Y. Wang, J. M. L. Ribeiro, and P. Tiwary, “Past-future information bottleneck for sampling molecular reaction

coordinate simultaneously with thermodynamics and kinetics,” Nature Communications, vol. 10, no. 1, p. 3573, Aug.

2019. https://www.ncbi.nlm.nih.gov/pubmed/31395868

RM

GPU


Computational Design of Novel Materials for Energy Storage,

Light Conversion, and Structural Applications (1/2) – Shyue Ping Ong, UCSD

The Materials Virtual Lab has utilized the unique capabilities of Bridges to

discover completely novel phosphors for solid state lighting as well as

advance our understanding of high-entropy alloys for high-temperature

structural applications using an inter-disciplinary combination of high-

throughput computations and state-of-the-art machine learning

techniques.

A key accomplishment is the discovery of the first-known Eu2+-activated

full-visible-spectrum phosphor, Sr2AlSi2O6N:Eu2+. Thousands of potential

novel materials are generated using a data-mined ionic substitution

algorithm and then evaluated for stability and other properties using DFT

calculations. This completely novel material was successfully synthesized

by collaborators and integrated into a working LED device showing a

single-phase full-visible-spectrum phosphor with color quality superior to

some tricolor pc-LEDs.


S. Li et al., “Data-Driven Discovery of Full-Visible-Spectrum Phosphor,”

Chem. Mater., vol. 31, no. 16, pp. 6286–6294, Aug. 2019.

https://doi.org/10.1021/acs.chemmater.9b02505

“Data-Driven Discovery of Full-Visible-Spectrum Phosphor”:

cover article, Chemistry of Materials, August 2019

RM

LM


Computational Design of Novel Materials for Energy Storage,

Light Conversion, and Structural Applications (2/2) – Shyue Ping Ong, UCSD

In recent years, machine learning has emerged as a powerful

tool to develop interatomic potentials (IAPs) that retain

near-DFT accuracy, but run orders of magnitude faster than

DFT.

– Using Bridges, the Materials Virtual Lab developed a

machine-learned spectral neighbor analysis potential

(SNAP) for the refractory Nb-Mo-Ta-W high-entropy

alloy (HEA) system, which is of major interest for high-

temperature structural applications.

– Using Monte Carlo Molecular Dynamics (MCMD)

calculations, they established that there is a strong

thermodynamic driving force for Nb segregation to the

grain boundaries in the HEA, especially at room

temperature, and this segregation improves the fracture

strength of the HEA under tensile deformation.
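
The slide does not spell out the Monte Carlo step used in the MCMD calculations. A common choice, assumed here purely for illustration, is a Metropolis test on a trial swap of two atoms of different species, with the energy change evaluated by the interatomic potential. The function name and the example energy are hypothetical.

```python
import math
import random

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def accept_swap(delta_e, temperature, rng):
    """Metropolis criterion for a trial atom swap.

    delta_e is the energy change of the swap in eV (assumed to come from
    the interatomic potential); temperature is in kelvin.
    """
    if delta_e <= 0:
        return True  # energy-lowering swaps are always accepted
    return rng.random() < math.exp(-delta_e / (K_B * temperature))


def acceptance_rate(delta_e, temperature, trials=10000, seed=0):
    """Fraction of identical trial swaps accepted at a given temperature."""
    rng = random.Random(seed)
    accepted = sum(accept_swap(delta_e, temperature, rng) for _ in range(trials))
    return accepted / trials


rate_300 = acceptance_rate(0.05, 300)     # hypothetical 0.05 eV uphill swap
rate_1200 = acceptance_rate(0.05, 1200)
```

Under this rule, energetically unfavorable swaps are accepted far less often at 300K than at 1200K, so an energetically preferred arrangement such as Nb at the grain boundary is disturbed less at low temperature.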

[Figure panels: (a) Random; (b) Segregation@300K; (c) Segregation@1200K; (d) CNA, showing grain boundary (GB) and bulk regions; (e), (f) element distributions for Mo, W, Ta, Nb]

Grain boundary structure (Sigma 473 <110> twist) for the quaternary Nb-Mo-Ta-W

multi-principal element alloy (MPEA): (a) the initial random structure,

created by randomly distributing the atoms in the grain boundary

structure; (b), (c) Nb atoms segregated to the grain boundary area after Monte Carlo

Molecular Dynamics (MCMD) simulation at 300K and 1200K; (d) the common

neighbor analysis (CNA) algorithm in OVITO is used to identify the grain

boundary region and the bulk bcc region; (e), (f) evolution of the element

distribution over the MCMD steps.

RM

LM


Epigenomic Profiling of Cancer Cells

David Valle-Garcia, Harvard Medical School

In recent years, it has become clear that cancer cells undergo

significant epigenomic changes, affecting the way the DNA is

compacted and organized inside the nucleus without affecting its

sequence.

– Using Bridges, Valle-Garcia’s group is now expanding its

repertory of genome-wide analyses to include methylation

mapping on RNA.

– Studying m6Am, one of the most abundant mRNA

modifications, they demonstrated that m6Am is an

evolutionarily conserved mRNA modification mediated by

the Phosphorylated CTD Interacting Factor 1 (PCIF1).

– They identified the only human mRNA m6Am

methyltransferase and demonstrated a mechanism of gene

expression regulation through PCIF1-mediated m6Am

mRNA methylation.

– By understanding the epigenome of different types of cancer

cells, we expect to be able to improve current treatments

and generate novel therapeutic avenues.


RM

E. Sendinc et al., “PCIF1 Catalyzes m6Am mRNA Methylation to

Regulate Gene Expression,” Mol. Cell, vol. 75, no. 3,

pp. 620-630.e9, 2019.

http://www.sciencedirect.com/science/article/pii/S1097276519304022


The Generalizability and Replicability of Twitter Data for Population Research

Guangqing Chi, Pennsylvania State University

Twitter data have drawn multidisciplinary interest to

study population characteristics and social problems

that cannot be measured well by traditional surveys.

However, the use of Twitter data has been strongly

resisted because of concerns about the

representativeness of the population. It is critical to

evaluate the extent to which Twitter users represent the

population across different demographic groups.

– Chi’s group uses Hadoop/Spark on Bridges in

conjunction with tools in Microsoft Azure to

examine the geographic distribution of the Twitter

user population and its demographic

representativeness.

– The county-level population for 2014 is obtained

from the American Community Survey (ACS) dataset

(8.7 TB on Bridges’ Pylon filesystem).


RM

A representation index is defined and computed for each

county to assess how representative Twitter users in each

county are of its total population. In the Figure, Twitter users

are under-represented in blue counties and over-represented

in orange counties.
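
The slide does not give the formula for the representation index. One plausible form, assumed here only for illustration, divides a county's share of all Twitter users by its share of the total population, so that values above 1 indicate over-representation and values below 1 indicate under-representation. The county names and counts are made up.

```python
def representation_index(twitter_users, population):
    """Ratio of each county's Twitter-user share to its population share.

    Both arguments map a county name to a count. This definition is an
    assumption; the source does not state the formula that was used.
    """
    total_users = sum(twitter_users.values())
    total_population = sum(population.values())
    index = {}
    for county, residents in population.items():
        user_share = twitter_users.get(county, 0) / total_users
        population_share = residents / total_population
        index[county] = user_share / population_share
    return index


# Hypothetical counts for two equally sized counties.
users = {"County A": 300, "County B": 100}
residents = {"County A": 1000, "County B": 1000}
index = representation_index(users, residents)
```

With these made-up numbers, County A would be drawn in orange (index above 1) and County B in blue (index below 1).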


Understanding Vector-Borne Diseases

Noushin Ghaffari, Texas A&M University

Texas A&M AgriLife Research assembled the genomes of three

important cattle tick species on Bridges’ 12TB nodes.

– The first was R. annulatus in June 2018, followed by R. microplus

and H. longicornis in 2019.

– The latest species assembled, the longhorned tick (Haemaphysalis

longicornis), is a recently introduced invasive tick of the United States

that transmits several important diseases to both livestock and humans.

– In November 2017, this tick was discovered in mainland USA for

the first time. There has been significant international interest in

uncovering the underlying molecular genetic structure of this tick

to develop effective anti-tick control technologies.

– To that end, AgriLife Genomics and Bioinformatics sequenced and

assembled the high-quality DNA material provided by Dr. Felix Guerrero

in collaboration with USDA and New Zealand colleagues.

– Dr. Guerrero is leading the team annotating the assembled genome and

transferring the genome assembly and annotation to the scientific

community. There will be follow-up publications as these results get

used by the scientific community.


F. D. Guerrero et al., “The Pacific Biosciences de novo

assembled genome dataset from a parthenogenetic

New Zealand wild population of the longhorned tick,

Haemaphysalis longicornis Neumann, 1901,” Data Br.,

vol. 27, p. 104602, 2019.

Haemaphysalis longicornis. By Desmond W. Helmore - Manaaki Whenua – Landcare

Research, CC BY 4.0, https://commons.wikimedia.org/w/index.php?curid=72857507

A Texas Longhorn cow. By Ed Schipul from Houston, TX, US - Texas Longhorn,

CC BY-SA 2.0, https://commons.wikimedia.org/w/index.php?curid=3470112

LM


Identifying Susceptibility to Climate Change

Rachael Bay, UC Davis

Bridges LM nodes enabled assembling the

genomes of five species of migratory birds

with no previous genomic resources, including

the yellow warbler and willow flycatcher.

– This work led to the identification of genomic

regions connected to climate

change vulnerability.


LM

1. K. Ruegg et al., “Ecological genomics predicts climate

vulnerability in an endangered southwestern songbird,”

Ecol. Lett., vol. 21, no. 7, pp. 1085–1096, Jul. 2018.

2. R. A. Bay, E. B. Taylor, and D. Schluter, “Parallel

introgression and selection on introduced alleles in a

native species,” Mol. Ecol., vol. 28, no. 11, pp. 2802–2813,

Jun. 2019.

Candidate SNPs linked to temperature in the Willow flycatcher, focusing on the

association between Mean Temperature of the Warmest Quarter (BIO10) and

Climate_20 across geographic space, with population allele frequencies color-

coded from high frequency (red) to low (yellow). From [1].


MetaSUB Consortium

Chris Mason and Jonathan Foox, Weill Cornell Medical College

Chris Mason and Jonathan Foox (Cornell) continue to

use Bridges Large Memory nodes to assemble

hundreds of metagenomes from around the world as

part of an international collaboration (MetaSUB)

creating global geospatial metagenomics maps,

tracking antimicrobial resistance markers, and

identifying new biosynthetic gene clusters (BGCs).


The planetary distribution of MetaSUB viral clusters. Solid red circles

indicate the number of viral contigs recovered in each region (Section 2.5).

Black open circles indicate the number of viral species recovered in each

region. Blue lines indicate the number of viral clusters that are shared

between regions, thicker lines indicating more viral clusters in common.

From [1].

1. D. Danko et al., “Global Genetic Cartography of

Urban Metagenomes and Anti-Microbial

Resistance,” bioRxiv, p. 724526, Jan. 2019.

About MetaSUB

By developing and testing standards for the field and

optimizing methods for urban sample collection,

DNA/RNA isolation, taxa characterization, and data

visualization, the MetaSUB consortium is pioneering an

unprecedented study of urban mass-transit systems

and cities around the world. These data will benefit city

planners, public health officials, and designers, as well

as lead to the discovery of new species, biological systems, and

biosynthetic gene clusters (BGCs), thus enabling an era

of more quantified, responsive, and “smarter cities.”

– metasub.org


Co-Designing Programming Model Runtimes and Applications for Exascale

Karen Tomko and DK Panda, Ohio State University

Early access to Bridges-AI led to progress in two important

areas: optimizing communication and understanding the

intra-node topology and communication patterns for

dense-GPU systems.

– Specifically, the team has been optimizing communication

(inter-node and intra-node) in the MVAPICH2-GDR MPI

library for the NVIDIA DGX-2 system.

– Some of the enhancements have been released under

MVAPICH2-GDR 2.3.1. More enhancements will be

released in the upcoming MVAPICH2-GDR 2.3.2 release.

See figure for improvements to MPI_Allreduce.

– The team has also gained understanding of the intra-node topology and communication for DGX-2, which is

relevant to development of the OSU InfiniBand Network Analysis and Monitoring (INAM) tool.

– DK Panda and other team members have presented these results at the following events:

– Invited Tutorial, D. K. Panda, A. Awan and H. Subramoni, High-Performance Distributed Deep Learning: A Beginner's Guide, CCGrid '19. March 17, 2019.

– Keynote Talk, D. K. Panda, Scalable and Distributed DNN Training on Modern HPC Systems: Challenges and Solutions, High-Performance Machine Learning (HPML) Workshop, held in conjunction

with CCGrid '19 conference, May 14, 2019.

– Invited Talk, D. K. Panda, Designing Convergent HPC, Deep Learning and Big Data Analytics Software Stacks for Exascale Systems, IBM TJ Watson Research Center, March 25, 2019.

– Invited Talk, D. K. Panda and H. Subramoni, MVAPICH2-GDR: High-Performance and Scalable CUDA-Aware MPI Library for HPC and AI, NVIDIA GTC Conference, March 19, 2019.

– Invited Tutorial, D. K. Panda, A. Awan and H. Subramoni, High-Performance Distributed Deep Learning: A Beginner's Guide, NVIDIA GTC '19. March 18, 2019.

– Keynote Talk, D. K. Panda, How to Design Convergent HPC, Deep Learning and Big Data Analytics Software Stacks for Exascale Systems?, SCAsia '19 Conference, March 13, 2019.

– Invited Tutorial, D. K. Panda, A. Awan and H. Subramoni, High Performance Distributed Deep Learning: A Beginner’s Guide, PPoPP '19. February 17, 2019.

– Keynote Talk, D. K. Panda, How to Design Scalable HPC, Deep Learning and Cloud Middleware for Exascale Systems?, HPC Advisory Council Stanford Conference, February 14, 2019.
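
For readers unfamiliar with MPI_Allreduce, the operation tuned in this work, the sketch below simulates a sum allreduce in a single Python process using the ring algorithm (a reduce-scatter phase followed by an all-gather phase). The ring algorithm is one widely used approach and is assumed here for illustration; the slide does not say which algorithms MVAPICH2-GDR uses on the DGX-2, and no real MPI calls are made.

```python
def ring_allreduce(buffers):
    """Simulate a sum allreduce over len(buffers) ranks arranged in a ring.

    buffers[r] is the list of numbers held by rank r; all lists have equal
    length. Returns the per-rank buffers after the allreduce.
    """
    p = len(buffers)
    n = len(buffers[0])
    # Split every buffer into p contiguous chunks.
    bounds = [(i * n // p, (i + 1) * n // p) for i in range(p)]
    data = [list(b) for b in buffers]

    # Phase 1, reduce-scatter: after p - 1 steps, rank r holds the fully
    # summed chunk (r + 1) % p.
    for step in range(p - 1):
        messages = []
        for r in range(p):
            lo, hi = bounds[(r - step) % p]
            messages.append((lo, data[r][lo:hi]))  # snapshot before delivery
        for r, (lo, payload) in enumerate(messages):
            dst = (r + 1) % p
            for k, value in enumerate(payload):
                data[dst][lo + k] += value

    # Phase 2, all-gather: circulate the completed chunks around the ring.
    for step in range(p - 1):
        messages = []
        for r in range(p):
            lo, hi = bounds[(r + 1 - step) % p]
            messages.append((lo, data[r][lo:hi]))
        for r, (lo, payload) in enumerate(messages):
            dst = (r + 1) % p
            data[dst][lo:lo + len(payload)] = payload
    return data


# Four simulated ranks (for example, four GPUs), each holding 7 values.
inputs = [[10 * r + i for i in range(7)] for r in range(4)]
outputs = ring_allreduce(inputs)
```

Each rank sends and receives only one chunk per step, which is why the ring pattern suits large messages such as the gradient buffers exchanged during distributed deep learning training.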


AI


Numerical Modeling of Tsunami Generation, Propagation, and Coastal Impact

Stephan Grilli, University of Rhode Island

Tsunamis create major geohazards for highly populated coastal areas. Mitigating tsunami

risk is important for society and requires accurate source forecasting and modeling,

including non-seismic near-field sources such as Submarine Mass Failures (SMFs), and

performing run-up and inundation mapping using propagation models.

– Bridges enabled hundreds of individual simulations of the tsunami generation,

propagation, and coastal impact of SMFs located off the US East Coast. Effects of slide

deformation, bottom friction, and wave frequency dispersion were investigated. This

work is published in Pure and Applied Geophysics.

– The figure shows dispersive effects in FUNWAVE simulations of far-field tsunami

propagation and hazard, propagating in 1 arc-min Atlantic grid ATL:

(a) boundary of ATL grid with locations of wave gage stations 1–9 where time series of

surface elevation are compared;

(b) relative difference (color scale in %) of maximum surface elevations in non-dispersive

versus dispersive computations;

(c), (d) envelope of maximum surface elevation computed without/with dispersion.


RM

L. Schambach, S. T. Grilli, J. T. Kirby, and F. Shi, “Landslide Tsunami Hazard Along the Upper US

East Coast: Effects of Slide Deformation, Bottom Friction, and Frequency Dispersion,” Pure Appl.

Geophys., vol. 176, no. 7, pp. 3059–3098, 2019.

https://doi.org/10.1007/s00024-018-1978-7


A Density Functional Theory Study: Quantum 2D Layer Optoelectronics

Pratibha Dev, Howard University

The goal is to understand the effects at the interfaces of quantum materials,

to realize next-generation technologies based on quantum materials. As a

proof-of-principle structure, this study concentrates on semi-metal

(bismuth) thin films/nanowire arrays, capped by a 2D layered material such

as graphene.

– Dev et al. performed DFT calculations to study the atomic and

electronic structure of the interface between graphene and the Bi (111)

surface. They investigated many crystal approximants to the

experimentally undetermined interface and found that the most stable

is a moiré supercell (upper figure) consistent with a van der Waals

structure.

– Their calculations show that there is a charge transfer (electrons) from Bi

to graphene (lower figure) that is consistent with experimental reports.


RM

Presentations at the APS March Meeting 2019:

1. Ivan Naumov and Pratibha Dev, “Graphene-Bi (111) interface: atomic

structure and electronic properties”

2. Priyanka Manchanda and Pratibha Dev, “Phase transition in MoX2 (X= S,

Se, Te) monolayers with hydrogenation”


Computational Support for Research in ab initio Modeling of Amorphous Materials

David Drabold, Ohio University

Bridges enabled the following scientific discoveries and

computational achievements:

– A comprehensive report on the structure, electronic and transport properties of

copper-doped alumina. These materials consist of an insulating or

semiconducting host with transition metals added. The substructure of the

transition metals can be manipulated electrochemically, thus making it

possible to reliably exploit two resistance states. [1]

– The first models of a metallic glass Pd40Ni40P20 using our Force Enhanced

Atomic Refinement method. Besides producing the best available models of

the material, they discovered an interesting vibrational localized-delocalized

transition. [2]

– Working with the LIGO group at Stanford, they modeled Zr-doped tantala as a new

coating material for the interferometer mirrors. The coating is a limiting element for

increasing the sensitivity of LIGO to gravitational waves.

The figure illustrates the ability of models to closely track the annealing-

induced changes in the coatings. [3]


The measured pair distribution functions for two samples of

amorphous zirconia-doped tantala (ZrO2−Ta2O5) films (top) are

compared to those computed from atomic models (bottom).

1. K. N. Subedi, K. Prasai, M. N. Kozicki, and D. A. Drabold, “Structural origins of electronic

conduction in amorphous copper-doped alumina,” Phys. Rev. Mater., vol. 3, no. 6, p. 65605, Jun.

2019. https://link.aps.org/doi/10.1103/PhysRevMaterials.3.065605

2. B. Bhattarai, R. Thapa, and D. A. Drabold, “Ab initio inversion of structure and the lattice dynamics

of a metallic glass: The case of Pd40Ni40P20,” Model. Simul. Mater. Sci. Eng.,

vol. 27, no. 7, 2019. https://iopscience.iop.org/article/10.1088/1361-651X/ab2ebe

3. K. Prasai et al., “High Precision Detection of Change in Intermediate Range Order of Amorphous

Zirconia-Doped Tantala Thin Films Due to Annealing,” Phys. Rev. Lett., vol. 123, no. 4, p. 45501, Jul.

2019. https://link.aps.org/doi/10.1103/PhysRevLett.123.045501

RM


Topological Electronic States

Richard Martin, Carnegie Mellon University

Martin’s group is applying topological insulators for the creation of novel electronic storage devices and for modulating

conductivity in metals to possibly enable elevated temperature superconductivity.

– Bridges enables very efficient runs of Quantum Espresso

with Effective Screening Medium Method (ESM) and

Spin-orbit Coupling (SOC) modules.

– They found nontrivial properties of a charged capacitor, as

shown in the figure. When two-dimensional InP is injected

with 10 electrons, some sharp bands and many band

crossings appear.

– In addition, they find a Dirac cone at the Fermi level, indicated

by the red circle.

– After consideration of spin-orbit coupling (SOC), a band gap

opens at the band crossing point. After the band gap opening,

they find dissipationless edge states at the boundary, which

could be used to make spintronic devices and dissipationless

transistors.


RM

(a) Crystal structure of two-dimensional InP.

(b) Band structure of two-dimensional InP with the

injection of 10 electrons.


Mechanisms of Protein Unfolding and Translocation by Biological Nanomachines

George Stan, University of Cincinnati

Protein quality control is essential for maintaining cellular viability. Stan’s group

focuses on protein degradation mechanisms mediated by bacterial Caseinolytic

proteases (Clp) and the eukaryotic proteasome.

– These nanomachines comprise axially-stacked ring-shaped ATPases, which use

chemical energy through ATP hydrolysis to mechanically unfold and translocate

substrate proteins (SPs) through narrow central channels, and a barrel-like peptidase,

which cleaves the unfolded polypeptide chain.

– MD simulations at multiple scales, ranging from coarse-grained to atomistic

descriptions and including implicit or explicit solvent reveal direction-dependent

protein unfolding pathways that pinpoint the effect of SP topology and mechanical

stability and the role of Clp-SP interactions.

– Simulations probing conformational transitions and SP remodeling mechanisms of

ClpB and katanin ATPases that have spiral ring structure indicate greater

conformational flexibility compared with the planar ring structures and reveal the

coupling mechanism between the ATPase subunits.

– The minimum free energy pathway associated with the gating mechanisms of the ClpP

peptidase was determined using the on-the-fly string method with swarms-of-

trajectories. Results indicate that conformational transitions involve moderate coupling

of peptidase subunits which underlie non-concerted motions of flexible ClpP loops

that control the gating mechanism (see figure). A manuscript reporting these results

has been submitted for publication.

Minimum free energy pathway associated with the gating transition of the bacterial peptidase ClpP. (A) The converged one-dimensional free energy and intermediate configurations of gate-controlling loops are represented as a function of the string coordinate α that characterizes the progress along the transition pathway from the “closed” (α = 0) to the “open” (α = 1) pore states. Loop configurations in the cis (trans) ClpP rings are shown in red (blue), with Arginine 15 side chains indicated in gray. (B) Evolution of the effective pore diameter and (C) degree of asymmetry indicate partial opening of the pore prior to complete loop ordering and non-concerted motions of loops. Solid (dashed) curves correspond to all (Cα-only) atomic coordinates.
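A progress coordinate such as α can be assigned to the images of a discretized path in more than one way. The sketch below uses normalized cumulative arc length, which is one common convention; whether the authors used this exact definition is not stated in the slides. The function name `string_coordinate` and the example points are hypothetical.

```python
import math


def string_coordinate(images):
    """Assign each image on a discretized path a progress value alpha in [0, 1].

    `images` is a sequence of points (tuples of collective-variable values)
    ordered from the initial state to the final state. alpha is the cumulative
    Euclidean arc length up to each image divided by the total path length,
    so the first image gets 0.0 and the last gets 1.0.
    """
    cumulative = [0.0]
    for previous, current in zip(images, images[1:]):
        cumulative.append(cumulative[-1] + math.dist(previous, current))
    total = cumulative[-1]
    if total == 0.0:
        raise ValueError("path has zero length; alpha is undefined")
    return [length / total for length in cumulative]


# Hypothetical three-image path in a two-dimensional collective-variable space.
path = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
alphas = string_coordinate(path)
print(alphas)
```

With images spaced equally along the path, this reduces to α = i / (N − 1) for image index i out of N images.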

RM

GPU

Detailed Characterization of Liquid Jet Atomization
Mario Trujillo, University of Wisconsin-Madison

With projections for liquid fuels to remain a dominant source of energy through at least 2040, a need exists for understanding the liquid injection and atomization processes beyond current empirical means. This work aims to reveal the underlying causes of liquid jet breakup and provide a newer perspective on this relatively old problem by employing highly-resolved simulations.

– A main thrust of the simulations is to develop an understanding of the growth of large-scale modes, since these modes have been identified as being primarily responsible for atomization.

– A second objective is the characterization of the liquid breakup and the momentum coupling between the gas and liquid phases. This process controls the extent of air entrainment, which is important for droplet vaporization, mixing, and subsequent combustion.

– High-fidelity simulations via Volume-of-Fluid in OpenFOAM using Bridges (a) show that the effect of internal flow on atomization is more than simply the addition of turbulent kinetic energy: for some flows, the most substantial influence is from the organized flow structures developed inside the injector nozzle.

– Experimental validation was performed with X-ray radiography measurements from ANL (b, c).

RM

A. Agarwal and M. F. Trujillo, “The effect of nozzle internal flow on spray atomization,” Int. J. Engine Res., vol. 21, no. 1, pp. 55–72, Sep. 2019. https://doi.org/10.1177/1468087419875843

[Figure: (a) simulation result; (b, c) comparison of simulation with experimental X-ray radiography data, plotted against y (mm) at several axial positions x.]

Human Biomolecular Atlas Program (HuBMAP)

The HuBMAP Infrastructure and Engagement Component is supported by NIH project 3OT2OD026675-01S1.

“The Human BioMolecular Atlas Program (HuBMAP) aims to facilitate research on single cells within tissues by supporting data generation and technology development to explore the relationship between cellular organization and function, as well as variability in normal tissue organization at the level of individual cells.” (NIH)

RM

LM

GPU

AI

HuBMAP Consortium

Tissue Mapping Centers (TMCs)
– Michael Snyder, Stanford University: Small bowel and colon
– Long Cai, Caltech: Endothelium
– Mark Atkinson, University of Florida: Lymphatic system
– Jeffrey Spraggins, Vanderbilt University: Kidney, pancreas, bone
– Kun Zhang, UCSD: Kidney, urinary tract, lung

Transformative Technology Development (TTDs)
– Pehr Harbury, Stanford University: Next-generation genomic imaging
– Long Cai, Caltech: In situ transcriptome profiling in single cells
– Julia Laskin, Purdue University: Sub-cellular resolution mass spectrometry
– Peng Yin, Harvard University: High-throughput, highly multiplexed in situ proteomic imaging

Infrastructure and Engagement Component (IEC)
– Nick Nystrom, PSC; Jonathan Silverstein, Pitt; Phil Blood, PSC; Alex Ropelewski, PSC: Flexible Hybrid Cloud Infrastructure for Seamless Management of HuBMAP Resources

Tools Components (TCs)
– Ziv Bar-Joseph, Carnegie Mellon University
– Nils Gehlenborg, Harvard University

Mapping Components (MCs)
– Katy Börner, Indiana University Bloomington
– Rahul Satija, New York Genome Center

Rapid Technology Implementation (RTIs)
– Yousef Al-Kofahi, GE: Multi-Scale 3-D Image Analytics for High Dimensional Spatial Mapping of Normal Tissues
– Mike Angelo, Sean Bendall, Stanford: A robust platform for multiplexed, subcellular proteomic imaging in human tissue
– Neil Kelleher, Northwestern Univ.: Renewable and specific affinity reagents for mapping proteoforms in human tissues
– Evan Macosko, Broad Institute: Implementation of Slide-seq for high-resolution, whole-transcriptome human tissue maps

The Brain Image Library

Confocal Fluorescence Microscopy: multispectral, subcellular resolution, highly quantitative

Will contain whole-brain volumetric images of mouse, rat, and other mammals, targeted experiments highlighting connectivity between cells, spatial transcriptomic data, and metadata describing essential information about the experiments.

Supported by the National Institute of Mental Health of the NIH under award number R24MH114793 ($5M).

Alex Ropelewski (PSC), Marcel Bruchez (CMU Biology), Simon Watkins (Pitt Cell Biology & Center for Biologic Imaging)

Integrated with Bridges to support additional advanced analytics and development of AI/ML techniques.

A. M. Watson et al., Ribbon scanning confocal for high-speed high-resolution volume imaging of brain. PLoS ONE 12 (2017). https://doi.org/10.1371/journal.pone.0180486

brainimagelibrary.org