Post on 12-Jun-2020

SUMMIT BERLIN

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Building an Image Analysis auto-scaling hybrid HPC to research cancer

Amador Pahim
Quality Assurance Engineer
Definiens AG

Our vision is to improve patient lives by matching patients to the best therapies based on the most comprehensive digital profiling

Our proprietary technology finds structures, patterns and textures in the tumor tissue image to better understand the disease biology

Project Steps

• Project Initiation
• Receiving Inspection
• Region annotations
• Image Analysis
• Data processing
• QC
• Report and Delivery

Definiens Proprietary Software
Internal Grid System

800 cores
35 TB of RAM

• Tissue
• Blur
• Nucleus Detection
• Tumor
• Stroma
• Annotations
• Cell Segmentation
• Level Aggregation

Requirements

• Multiple data sets
• Different types of tasks
• Hybrid cloud support
• Auto scaling
• Job flow control
• Easy deployment

Architecture

• API
• Web UI
• Task Scheduler
• Task Dispatcher
• Executor (four instances shown)

Deployment

• API
• Web UI
• Task Scheduler
• Task Dispatcher
• Executor (four instances shown)

First Level of Parallelism: per input data

User Input

Tasks:

• Task1
    • Type: python
    • App: convert_format.py
• Task2
    • Type: spark
    • App: heatmaps_calculation.py
    • Upstream tasks: Task1

Input Data

• Slide1
• Slide2

Resulting Tasks Workflow

Task1/Slide1 → Task2/Slide1
Task1/Slide2 → Task2/Slide2
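The expansion from user input to task instances can be sketched in a few lines of Python. This is only an illustration, not the dispatcher's actual code: the function name `expand_workflow` and the dictionary layout are assumptions made for the example; only the task names, types, apps and slide names come from the slide.

```python
from itertools import product


def expand_workflow(tasks, input_data):
    """Create one task instance per (task, input item) pair.

    Returns a dict that maps each "Task/Slide" instance to the list of
    instances it waits for: the same input item in every upstream task.
    """
    workflow = {}
    for (name, spec), item in product(tasks.items(), input_data):
        upstream = [f"{up}/{item}" for up in spec.get("upstream", [])]
        workflow[f"{name}/{item}"] = upstream
    return workflow


# Hypothetical encoding of the user input shown on the slide.
tasks = {
    "Task1": {"type": "python", "app": "convert_format.py"},
    "Task2": {"type": "spark", "app": "heatmaps_calculation.py",
              "upstream": ["Task1"]},
}
workflow = expand_workflow(tasks, ["Slide1", "Slide2"])
for instance, waits_for in workflow.items():
    print(instance, "<-", waits_for)
```

In this sketch the two slides share no dependencies, so the Slide1 chain and the Slide2 chain can run side by side.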

Second Level of Parallelism: multiprocessing

Diagram: Task Dispatcher, Python Executor.

Task Dispatcher handling Task1/Slide1:

- Python Executor provisioning
- Task parameters
- Parallel execution of App run() methods
- Results report
- Executor teardown
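The "parallel execution of App run() methods" step might look roughly like the sketch below. Everything here is hypothetical: the slides do not show the executor's code, the `ConvertFormat` class and the idea of splitting work by region are invented for illustration, and a thread pool from the `multiprocessing` package stands in for worker processes so that the example runs anywhere. Swapping `ThreadPool` for `multiprocessing.Pool` (with a picklable function in place of the lambda) would give process-based parallelism.

```python
from multiprocessing.pool import ThreadPool


class ConvertFormat:
    """Hypothetical App: one instance per unit of work inside a slide."""

    def __init__(self, region):
        self.region = region

    def run(self):
        # Placeholder for the real work done on this region.
        return f"{self.region}: converted"


def execute_task(apps, workers=4):
    """Call run() on every App in parallel and collect the results."""
    with ThreadPool(processes=workers) as pool:
        # map() returns results in the same order as the input list.
        return pool.map(lambda app: app.run(), apps)


apps = [ConvertFormat(f"region{i}") for i in range(3)]
report = execute_task(apps)
print(report)
```

The returned list plays the role of the results report that goes back to the Task Dispatcher before the executor is torn down.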

Second Level of Parallelism: distributed processing

Diagram: Task Dispatcher, one Spark Driver, three Spark Workers.

Task Dispatcher handling Task2/Slide1:

- Spark Driver provisioning
- Task parameters
- Spark Workers provisioning
- Processing orchestration
- Results report
- Cluster teardown
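The order of these steps can be sketched without any Spark dependency. The code below is an assumption-laden illustration, not the talk's implementation: `run_spark_task`, its parameters and the log messages are made up, and no real cluster is created. The one design choice it shows is putting teardown in a `finally` block, so that a failing task does not leave a cluster behind; whether the real dispatcher does this is not stated on the slides.

```python
def run_spark_task(task_id, params, process, log, n_workers=3):
    """Walk through the lifecycle steps in order, recording each in log.

    process is a stand-in for the distributed computation itself.
    """
    log.append(f"{task_id}: Spark Driver provisioned")
    result = None
    try:
        log.append("task parameters sent to driver")
        log.append(f"{n_workers} Spark Workers provisioned")
        result = process(params)  # processing orchestration
        log.append("results reported")
    finally:
        # Runs on success and on failure alike.
        log.append("cluster torn down")
    return result


log = []
result = run_spark_task(
    "Task2/Slide1",
    {"app": "heatmaps_calculation.py"},
    process=lambda params: f"ran {params['app']}",
    log=log,
)
print(result)
print(log)
```

If `process` raises, the exception still propagates to the caller, but only after the teardown entry has been recorded.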

Data Access

• DSS - Data Streaming Service
• Serves tiles from multiple file formats
• Standard data access service for internal applications
• Can be executed as a container
• Supports multiple storage backends (S3 included)

[Diagram: an Executor and two DSS boxes]
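A service that hides several storage backends behind one tile interface could be structured as in the sketch below. None of these names come from DSS: `StorageBackend`, `InMemoryBackend`, `TileService` and the `mem://` scheme are hypothetical, and the in-memory backend merely stands in for a file system or S3 client. Only `urllib.parse.urlparse` is a real library call.

```python
from abc import ABC, abstractmethod
from urllib.parse import urlparse


class StorageBackend(ABC):
    """Hypothetical interface that every storage backend implements."""

    @abstractmethod
    def read(self, path: str) -> bytes:
        ...


class InMemoryBackend(StorageBackend):
    """Stand-in for a real backend such as a file system or S3."""

    def __init__(self, blobs):
        self.blobs = blobs

    def read(self, path):
        return self.blobs[path]


class TileService:
    """Pick a backend from the URL scheme and return the tile bytes."""

    def __init__(self, backends):
        self.backends = backends  # maps URL scheme -> StorageBackend

    def get_tile(self, url):
        parts = urlparse(url)
        backend = self.backends[parts.scheme]
        return backend.read(parts.netloc + parts.path)


service = TileService({
    "mem": InMemoryBackend({"local/slide1/0_0": b"tile-a"}),
    "s3": InMemoryBackend({"bucket/slide1/0_0": b"tile-b"}),
})
tile = service.get_tile("mem://local/slide1/0_0")
```

Callers only ever see URLs and bytes, which is one way an executor could stay unaware of where a slide is stored.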

User Interface – First Version

from jpc import InputData, Job, PythonTask

# Task definition: the app, its arguments, and the repository given by repository_url.
task1 = PythonTask(name='heatmaps',
                   app='Heatmaps.py',
                   app_args=['-p', '-r'],
                   repository_url='git@git.definiens.com:projects/12312.git')

# One inner list per input item (here, one per slide).
input_data = InputData([['dss://dss.definiens.com/projects/12312/slide1'],
                        ['dss://dss.definiens.com/projects/12312/slide2']])

job = Job(name='heatmaps_generation',
          tasks=[task1],
          input_data=input_data)

# Submit the job and keep the returned status.
job_status = job.submit()

Next Steps

• Data Access Framework
• Final User Interface: Portal Integration
• More executors to come:
    • Amazon SageMaker
    • Amazon EMR
    • AWS Lambda experiment
• Projects billing

Thank you!

Amador Pahim
apahim@definiens.com
