large scale image processing and high performance computing · nipype (python 3), combining tgv_qsm...

Post on 04-Apr-2021

5 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Large scale image processing and high

performance computing

Steffen BollmannResearch Fellow

School of Information Technology and Electrical Engineering

David Butler, Quang Tieng, Alan Hockings, Jake Carroll

Oren Civier, Aswin Narayanan, Markus Barth, Tom Johnstone

Jakub Kaczmarzyk, Martin Grignard

Update on CAI imaging

workstations

Show integration of different compute

systems at UQ

Feedback

On Developments

Goals of this seminar

2

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Large ecosystem of scientific software …

3

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Large ecosystem of scientific software …

4

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Most tools require Linux

Tools are not available in standard package systems

Compiling from source often a nightmare

Conflicting dependencies

Reinstalling tools on different platforms takes time

Differing results between software versions

… creating problems for researchers:

5

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Most tools require Linux

Tools are not available in standard package systems

Compiling from source often a nightmare

Conflicting dependencies

Reinstalling tools on different platforms takes time

Differing results between software versions

… creating problems for researchers:

6

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Most tools require Linux

Tools are not available in standard package systems

Compiling from source often a nightmare

Conflicting dependencies

Reinstalling tools on different platforms takes time

Differing results between software versions

… creating problems for researchers:

7

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Most tools require Linux

Tools are not available in standard package systems

Compiling from source often a nightmare

Conflicting dependencies

Reinstalling tools on different platforms takes time

Differing results between software versions

… creating problems for researchers:

8

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Most tools require Linux

Tools are not available in standard package systems

Compiling from source often a nightmare

Conflicting dependencies

Reinstalling tools on different platforms takes time

Differing results between software versions

… creating problems for researchers:

9

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Most tools require Linux

Tools are not available in standard package systems

Compiling from source often a nightmare

Conflicting dependencies

Reinstalling tools on different platforms takes time

Differing results between software versions

… creating problems for researchers:

10

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

• glibc 2.5 vs 2.18 deliver different floating-point results

• leads to significant differences in long pipelines

GLIBC 2.5 vs 2.18

11

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

How to help with this…

https://xkcd.com/1987/ 12

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

… but avoid …

https://xkcd.com/927/ 13

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

• Researcher wants to run an analysis

with Nipype (Python 3), combining

tgv_qsm (Python 2), FSL 6.0.3 (Linux)

and MINC 1.9.17

Let’s start with a use case

14

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

• Researcher wants to run an analysis with

Nipype (Python 3), combining tgv_qsm

(Python 2), FSL 6.0.3 (Linux) and MINC

1.9.17 (Prebuilt packages only available

for Ubuntu)

• Develop pipeline interactively on

Windows 10 notebook

• Test analysis on pilot data on a Linux

workstation running Ubuntu 18.04

• Analyse all data on a cluster running

ROCKS Centos

• Visualize results interactively on Windows

10 notebook and prepare for publication

• Share analysis pipeline with readers of

paper

Let’s start with a use case

15

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

FSL/MINC

do not run

on Windows Mixing Python

versions

causes trouble

Compiling MINC

fails due to

missing libraries

Software

setup not

reproducible

Outdated

libraries on

old Centos

cluster

What exists already and how can we combine efforts?

16

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Virtual Machines VS Containers

17

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

• e.g. itksnapApplication

• e.g. QT4Libraries

• e.g. Ubuntu 16.04Guest OS

• e.g. VirtualboxHypervisor

• e.g. Centos 6Host OS

• e.g. Dell Precision

Hardware

Storage: 10 GB

Startup: 15s

RAM: 4GB

Storage: 0.1 GB

Startup: 0.2 s

RAM: 0.1 GB

Where to Run analyses?

18

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

• very interactive and flexible

• not very powerful

PC/Laptop

• quite interactive and flexible

• quite powerful

CAI-WKS1...6 + CVL

• not interactive, nor flexible

• very powerful

Clusters: Awoonga, Wiener, Gadi ...

• does everything you want if you can pay for it...

Cloud

Goal: Run on optimal hardware for job at hand

19

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

• very interactive and flexible

• not very powerful

PC/Laptop

• quite interactive and flexible

• quite powerful

CAI-WKS1...6 + CVL

• not interactive, nor flexible

• very powerful

Clusters: Awoonga, Wiener, Gadi ...

• does everything you want if you can pay for it...

Cloud

1. Storage:

RDM

2. Tools:

NeuroDesk

Goal: Run on optimal hardware for job at hand

20

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

• very interactive and flexible

• not very powerful

PC/Laptop

• quite interactive and flexible

• quite powerful

CAI-WKS1...6 + CVL

• not interactive, nor flexible

• very powerful

Clusters: Awoonga, Wiener, Gadi ...

• does everything you want if you can pay for it...

Cloud

1. Storage:

RDM

2. Tools:

NeuroDesk

PC/Laptop:

21

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Use case

• very interactive and flexible

Access

• helpdesk@cai.uq.edu.au

RDM Storage

• R-drive

• UQ-InstGateway data.cai.uq.edu.au

Linux-Applications

• https://github.com/NeuroDesk/vnm/ (requires Docker)

4 cores

16 GB RAM

NeuroDesk

22

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

• Community project

• Started at

Organisation for

Human Brain

Mapping Hackathon

• Docker Linux, Mac, Windows

• SingularityScale to HPC

• Full Linux desktop interfaceInteractive

• Tools are installed on demandLightweight

• NeuroDebian, conda, NeuroDockerRe-use existing repositories

Design principles for NeuroDesk

23

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Architecture

24

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

CAID – Automated

Container building

Community developing recipes using conda,

neurodebian, neurodocker ….

Automated container building using github actions

25

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Architecture

26

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

NeuroDesk – Integrating

our containers on any

Linux OS

CAID – Automated

Container building

PowerUsers on Linux, HPC, CVL

Community developing recipes

Combining tools from different Containers using modules

27

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Architecture

28

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

VNM – Lightweight Linux

Desktop in Docker

container

NeuroDesk – Integrating

our containers on any

Linux OS

CAID – Automated

Container building

Community developing recipesUsers on Windows, Mac

VNM – Interface accessible from any browser ☺

29

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

VNM – Containers are installed when needed ☺

30

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

VNM – Reproducible/Scriptable via lmod module system ☺

31

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Goal: Run on optimal hardware for job at hand

32

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

• very interactive and flexible

• not very powerful

PC/Laptop

• quite interactive and flexible

• quite powerful

CAI-WKS1...6 + CVL

• not interactive, nor flexible

• very powerful

Clusters: Awoonga, Wiener, Gadi ...

• does everything you want if you can pay for it...

Cloud

1. Storage:

RDM

2. Tools:

NeuroDesk

CAI-WKS1..6:

33

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Use case

• quite interactive and flexible

Access

• helpdesk@cai.uq.edu.au

RDM Storage• /winmounts/uqusername/data.cai.uq.edu.au/CollectionName-Qxxxx

• /winmounts/uqusername/uq-research/CollectionName-Qxxxx

Applications

• VNM menu

• Module system

28 cores

192 GB RAM

NeuroDesk on CAI-WKS1 – Menu:

34

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

NeuroDesk on CAI-WKS1..6 – Module System:

35

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

NeuroDesk on CAI-WKS1..6 – Module System:

36

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Goal: Run on optimal hardware for job at hand

37

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

• very interactive and flexible

• not very powerful

PC/Laptop

• quite interactive and flexible

• quite powerful

CAI-WKS1...6 + CVL

• not interactive, nor flexible

• very powerful

Clusters: Awoonga, Wiener, Gadi ...

• does everything you want if you can pay for it...

Cloud

1. Storage:

RDM

2. Tools:

NeuroDesk

CVL@Wiener:

38

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Use case

• quite interactive and flexible

Access

• helpdesk@qbi.uq.edu.au

RDM Storage

• /afm02/Q[0,1,2,3]/Qxxxx

Applications

• VNM menu

• Module system

12 cores

120 GB RAM

1 GPU

NeuroDesk on CVL@Wiener – Menu:

39

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

NeuroDesk on CVL@Wiener – Module System:

40

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Goal: Run on optimal hardware for job at hand

41

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

• very interactive and flexible

• not very powerful

PC/Laptop

• quite interactive and flexible

• quite powerful

CAI-WKS1...6 + CVL

• not interactive, nor flexible

• very powerful

Clusters: Awoonga, Wiener, Gadi ...

• does everything you want if you can pay for it...

Cloud

1. Storage:

RDM

2. Tools:

NeuroDesk

Awoonga:

42

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Use case

• not interactive, nor flexible, but high performance

Access

• rcc-support@uq.edu.au

RDM Storage

• /QRISdata/Qxxxx

Applications

• https://github.com/NeuroDesk/neurodesk

• Module system

1920 cores

256 GB RAM

Installation:

Then logout and back in

NeuroDesk on Awoonga – Module System:

43

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Use:

NeuroDesk on Awoonga – Module System:

44

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Goal: Run on optimal hardware for job at hand

45

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

• very interactive and flexible

• not very powerful

PC/Laptop

• quite interactive and flexible

• quite powerful

CAI-WKS1...6 + CVL

• not interactive, nor flexible

• very powerful

Clusters: Awoonga, Wiener, Gadi ...

• does everything you want if you can pay for it...

Cloud

1. Storage:

RDM

2. Tools:

NeuroDesk

Cloud:

46

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Use case

• interactive, flexible, but expensive

Access

• Nectar, Amazon, Google, Microsoft, Oracle …

RDM Storage

• Nextcloud client

Applications

• https://github.com/NeuroDesk/vnm

Infinite cores

512 GB RAM

Start a compute instance on your provider of choice, then:

Neurodesk VNM in the cloud:

47

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Port forwarding (don’t change) user@server

Then open browser: http://localhost:6080/ (or any vnc client on localhost:5900)

• Researcher wants to run an analysis with

Nipype (Python 3), combining tgv_qsm

(Python 2), FSL 6.0.3 (Linux) and MINC

1.9.17 (Prebuilt packages only available

for Ubuntu)

• Develop pipeline interactively on

Windows 10 notebook

• Test analysis on pilot data on a Linux

workstation running Ubuntu 18.04

• Analyse all data on a cluster running

ROCKS Centos

• Visualize results interactively on Windows

10 notebook and prepare for publication

• Share analysis pipeline with readers of

paper

Our use case

48

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

FSL/MINC

now usable

on Windows Python versions

now separated in

containers

Compiling MINC

done in CI/CD

pipeline

Software

setup

reproducible

Outdated

libraries on old

Centos cluster

don’t matter

Patient data

can now

stay local

Acknowledgements

49

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

NEURODESK GITHUB CONTRIBUTORS

@aswinnarayanan Aswin Narayanan

@civier Oren Civier

@crnolan Christopher Nolan

@geetah05 Geeta Hariharaputran

@kaczmarj Jakub Kaczmarzyk

@MartinGrignard Martin Grignard

@meganEJcampbell MEJ Campbell

@stebo85 Steffen Bollmann

@thomshaw92 Thom Shaw

top related