precipitation nowcasting leveraging deep learning and hpc ... · introduction nowcasting predict...

14
Precipitation Nowcasting Leveraging Deep Learning and HPC Systems to Optimize the Data Pipeline

Upload: others

Post on 17-Jun-2020

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Precipitation Nowcasting Leveraging Deep Learning and HPC ... · Introduction Nowcasting Predict precipitation locations and rates at a regional level over a short timeframe Traditional

Precipitation Nowcasting Leveraging Deep Learning and HPC Systems to Optimize the Data Pipeline

Page 2: Precipitation Nowcasting Leveraging Deep Learning and HPC ... · Introduction Nowcasting Predict precipitation locations and rates at a regional level over a short timeframe Traditional

AMS 2018 Copyright 2018 Cray Inc. 2

Agenda

● Introduction● Motivation● Dataset● Prediction Modeling● Data Pipelines● Results ● Q&A

Page 3: Precipitation Nowcasting Leveraging Deep Learning and HPC ... · Introduction Nowcasting Predict precipitation locations and rates at a regional level over a short timeframe Traditional

Introduction

● Nowcasting● Predict precipitation locations and rates at a

regional level over a short timeframe● Traditional Approach: Numerical Weather

Prediction● Requires an extended lead time between

newly acquired data and release of forecasts● Deep Learning

● Branch of Machine Learning based on Neural Networks

● Deep implies multiple layers of computation between inputs and output

● Pattern Matching

AMS 2018 Copyright 2018 Cray Inc. 3

Page 4: Precipitation Nowcasting Leveraging Deep Learning and HPC ... · Introduction Nowcasting Predict precipitation locations and rates at a regional level over a short timeframe Traditional

Motivation

● Why is short-term nowcasting important?● Provide reliable ride-to-work forecasts● Predict rapid formation of severe precipitation events: Flash Flood Warning!

● Why Deep Learning?● Traditional Nowcasting relies on NWP, slow to respond to new data● Deep Learning learns from past rainfall patterns● Trained models are computationally cheap to utilize

● Why HPC?● A lot of data! Training can utilize decades of observations● Models can be trained and inferred for small regions in parallel

AMS 2018 Copyright 2018 Cray Inc. 4

Page 5: Precipitation Nowcasting Leveraging Deep Learning and HPC ... · Introduction Nowcasting Predict precipitation locations and rates at a regional level over a short timeframe Traditional

Prototype Nowcasting System

● Small: 4 stations● KATX (Seattle), KTLX (Oklahoma City), KTLH (Tallahassee),

KBUF (Buffalo)● Variable sized dataset, as large as 7 years of historical

rainfall data● Total size (raw data): 4TB● Total size (processed): 684GB

● Examine Pipeline Performance and Bottlenecks● Explore Nowcasting Performance via Deep Learning

AMS 2018 Copyright 2018 Cray Inc. 5

Page 6: Precipitation Nowcasting Leveraging Deep Learning and HPC ... · Introduction Nowcasting Predict precipitation locations and rates at a regional level over a short timeframe Traditional

Dataset Processing

6

Data Collection• Historical Radar Data

(NETCDF) • Geographical Region • Days with over 0.1 inches

of precipitation, info from NOAA – NCDC

• Radar scans every 5-10 minutes throughout the day

Transformation• Raw radial data structure

converted to evenly spaced Cartesian grid (Tensors with float 32)

• Resolution scaling and clipping

• Configure dimensionality• Sequencing• 2 channels –

Reflectivity, Velocity• Uses Py-ART package

Sampling• Time-series • Inputs and

Labels• Random

samplingBigDL

Framework• Apache Spark on

Urika-XC/GX• Implemented in

Jupyter notebooks and Python

AMS 2018 Copyright 2018 Cray Inc.

Page 7: Precipitation Nowcasting Leveraging Deep Learning and HPC ... · Introduction Nowcasting Predict precipitation locations and rates at a regional level over a short timeframe Traditional

Prediction Modeling

● Convolutional Recurrent Neural Network● Convolutional Neural Network – Spatial Patterns● Recurrent Neural Network – Temporal Patterns● ConvLSTM – Convolutional Long Short-Term Memory Network

● Sequence to Sequence● Encoder Decoder● Use recent history to predict future changes

AMS 2018 Copyright 2018 Cray Inc. 7

Page 8: Precipitation Nowcasting Leveraging Deep Learning and HPC ... · Introduction Nowcasting Predict precipitation locations and rates at a regional level over a short timeframe Traditional

Pipeline: Data Processing

AMS 2018 Copyright 2018 Cray Inc. 8

Distributed File-System

Feed-Forward

Compute Gradients

BigDL

Iterate on Dataset

Save As NumPy Arrays

Convert to RDDs

Page 9: Precipitation Nowcasting Leveraging Deep Learning and HPC ... · Introduction Nowcasting Predict precipitation locations and rates at a regional level over a short timeframe Traditional

Pipeline: Distributed Training

AMS 2018 Copyright 2018 Cray Inc. 9

Distributed File-System

Hyper-Parameter

Optimization

Split Data By Region

KATX

KTLH

KBUF

KTLX…

Train Unique Networks for each region

Trained Networks

Page 10: Precipitation Nowcasting Leveraging Deep Learning and HPC ... · Introduction Nowcasting Predict precipitation locations and rates at a regional level over a short timeframe Traditional

Idealized Training Timeline

● Station: KATX● Dataset size:

● 118,342 Sequences● 101GB

● Parameters● Systems:

● Data processing: Cray Urika-GX – 1024 cores

● Training: Cray CS-Storm –8 Nvidia P100 GPUs

AMS 2018 Copyright 2018 Cray Inc. 10

Process Wall-Time Proportion

Download 13 hours 32%

Spark 4 hours 10%

Training 24 hours 58%

Inference 10 seconds 0%

Page 11: Precipitation Nowcasting Leveraging Deep Learning and HPC ... · Introduction Nowcasting Predict precipitation locations and rates at a regional level over a short timeframe Traditional

Scaling

● Tensorflow via Cray MPI Com. Plugin

● Nvidia Tesla P100 GPUs● Batchsize of 4 samples

per device● Throughput in

Samples/Second

AMS 2018 Copyright 2018 Cray Inc. 11

Device Count Throughput Scaling

Efficiency

1 25.8 1.0

2 51.6 1.0

4 102.7 .995

8 205.4 .995

16 410.5 .994

Page 12: Precipitation Nowcasting Leveraging Deep Learning and HPC ... · Introduction Nowcasting Predict precipitation locations and rates at a regional level over a short timeframe Traditional

Model Performance

AMS 2018 Copyright 2018 Cray Inc. 12

0.1

0.2

0.3

0.4

0.5

0.6

1 2 3 4 5 6

KTLH CSI and Average Error

0.1

0.2

0.3

0.4

0.5

0.6

1 2 3 4 5 6

KTLX CSI and Average Error

0.2

0.3

0.4

0.5

0.6

0.7

0.8

1 2 3 4 5 6

KBUF CSI and Average Error

CSI-ConvLSTM CSI-PersistenceMAE-ConvLSTM MAE-Persistence

0.1

0.3

0.5

0.7

1 2 3 4 5 6

KATX CSI and Average Error

CSI-ConvLSTM CSI-PersistenceMAE-ConvLSTM MAE-Persistence

Page 13: Precipitation Nowcasting Leveraging Deep Learning and HPC ... · Introduction Nowcasting Predict precipitation locations and rates at a regional level over a short timeframe Traditional

Effect of dataset size

● More data is correlated to higher performing prediction model

● Station: KATX (Seattle)

AMS 2018 Copyright 2018 Cray Inc. 13

0.44

0.442

0.444

0.446

0.448

0.45

0.452

0.454

0.456

0.458

0 1 2 3 4 5 6 7 8

Crit

ical

Suc

cess

Inde

xYears in Dataset

Average CSI for varying sized datasets

Years Size (GB) Sequences1 40 43,1713 109 118,3425 143 155,3377 217 235,301

Page 14: Precipitation Nowcasting Leveraging Deep Learning and HPC ... · Introduction Nowcasting Predict precipitation locations and rates at a regional level over a short timeframe Traditional

Sample Prediction + Q/A

AMS 2018 Copyright 2018 Cray Inc. 14