large scale image processing and high performance computing · nipype (python 3), combining tgv_qsm...

49
Large scale image processing and high performance computing Steffen Bollmann Research Fellow School of Information Technology and Electrical Engineering David Butler, Quang Tieng, Alan Hockings, Jake Carroll Oren Civier, Aswin Narayanan, Markus Barth, Tom Johnstone Jakub Kaczmarzyk, Martin Grignard

Upload: others

Post on 04-Apr-2021

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

Large scale image processing and high

performance computing

Steffen BollmannResearch Fellow

School of Information Technology and Electrical Engineering

David Butler, Quang Tieng, Alan Hockings, Jake Carroll

Oren Civier, Aswin Narayanan, Markus Barth, Tom Johnstone

Jakub Kaczmarzyk, Martin Grignard

Page 2: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

Update on CAI imaging

workstations

Show integration of different compute

systems at UQ

Feedback

On Developments

Goals of this seminar

2

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Page 3: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

Large ecosystem of scientific software …

3

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Page 4: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

Large ecosystem of scientific software …

4

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Page 5: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

Most tools require Linux

Tools are not available in standard package systems

Compiling from source often a nightmare

Conflicting dependencies

Reinstalling tools on different platforms takes time

Differing results between software versions

… creating problems for researchers:

5

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Page 6: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

Most tools require Linux

Tools are not available in standard package systems

Compiling from source often a nightmare

Conflicting dependencies

Reinstalling tools on different platforms takes time

Differing results between software versions

… creating problems for researchers:

6

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Page 7: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

Most tools require Linux

Tools are not available in standard package systems

Compiling from source often a nightmare

Conflicting dependencies

Reinstalling tools on different platforms takes time

Differing results between software versions

… creating problems for researchers:

7

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Page 8: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

Most tools require Linux

Tools are not available in standard package systems

Compiling from source often a nightmare

Conflicting dependencies

Reinstalling tools on different platforms takes time

Differing results between software versions

… creating problems for researchers:

8

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Page 9: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

Most tools require Linux

Tools are not available in standard package systems

Compiling from source often a nightmare

Conflicting dependencies

Reinstalling tools on different platforms takes time

Differing results between software versions

… creating problems for researchers:

9

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Page 10: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

Most tools require Linux

Tools are not available in standard package systems

Compiling from source often a nightmare

Conflicting dependencies

Reinstalling tools on different platforms takes time

Differing results between software versions

… creating problems for researchers:

10

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Page 11: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

• glibc 2.5 vs 2.18 deliver different floating-point results

• leads to significant differences in long pipelines

GLIBC 2.5 vs 2.18

11

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Page 12: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

How to help with this…

https://xkcd.com/1987/ 12

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Page 13: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

… but avoid …

https://xkcd.com/927/ 13

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Page 14: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

• Researcher wants to run an analysis

with Nipype (Python 3), combining

tgv_qsm (Python 2), FSL 6.0.3 (Linux)

and MINC 1.9.17

Let’s start with a use case

14

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Page 15: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

• Researcher wants to run an analysis with

Nipype (Python 3), combining tgv_qsm

(Python 2), FSL 6.0.3 (Linux) and MINC

1.9.17 (Prebuilt packages only available

for Ubuntu)

• Develop pipeline interactively on

Windows 10 notebook

• Test analysis on pilot data on a Linux

workstation running Ubuntu 18.04

• Analyse all data on a cluster running

ROCKS Centos

• Visualize results interactively on Windows

10 notebook and prepare for publication

• Share analysis pipeline with readers of

paper

Let’s start with a use case

15

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

FSL/MINC

do not run

on Windows Mixing Python

versions

causes trouble

Compiling MINC

fails due to

missing libraries

Software

setup not

reproducible

Outdated

libraries on

old Centos

cluster

Page 16: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

What exists already and how can we combine efforts?

16

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Page 17: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

Virtual Machines VS Containers

17

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

• e.g. itksnapApplication

• e.g. QT4Libraries

• e.g. Ubuntu 16.04Guest OS

• e.g. VirtualboxHypervisor

• e.g. Centos 6Host OS

• e.g. Dell Precision

Hardware

Storage: 10 GB

Startup: 15s

RAM: 4GB

Storage: 0.1 GB

Startup: 0.2 s

RAM: 0.1 GB

Page 18: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

Where to Run analyses?

18

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

• very interactive and flexible

• not very powerful

PC/Laptop

• quite interactive and flexible

• quite powerful

CAI-WKS1...6 + CVL

• not interactive, nor flexible

• very powerful

Clusters: Awoonga, Wiener, Gadi ...

• does everything you want if you can pay for it...

Cloud

Page 19: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

Goal: Run on optimal hardware for job at hand

19

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

• very interactive and flexible

• not very powerful

PC/Laptop

• quite interactive and flexible

• quite powerful

CAI-WKS1...6 + CVL

• not interactive, nor flexible

• very powerful

Clusters: Awoonga, Wiener, Gadi ...

• does everything you want if you can pay for it...

Cloud

1. Storage:

RDM

2. Tools:

NeuroDesk

Page 20: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

Goal: Run on optimal hardware for job at hand

20

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

• very interactive and flexible

• not very powerful

PC/Laptop

• quite interactive and flexible

• quite powerful

CAI-WKS1...6 + CVL

• not interactive, nor flexible

• very powerful

Clusters: Awoonga, Wiener, Gadi ...

• does everything you want if you can pay for it...

Cloud

1. Storage:

RDM

2. Tools:

NeuroDesk

Page 21: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

PC/Laptop:

21

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Use case

• very interactive and flexible

Access

[email protected]

RDM Storage

• R-drive

• UQ-InstGateway data.cai.uq.edu.au

Linux-Applications

• https://github.com/NeuroDesk/vnm/ (requires Docker)

4 cores

16 GB RAM

Page 22: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

NeuroDesk

22

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

• Community project

• Started at

Organisation for

Human Brain

Mapping Hackathon

Page 23: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

• Docker Linux, Mac, Windows

• SingularityScale to HPC

• Full Linux desktop interfaceInteractive

• Tools are installed on demandLightweight

• NeuroDebian, conda, NeuroDockerRe-use existing repositories

Design principles for NeuroDesk

23

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Page 24: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

Architecture

24

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

CAID – Automated

Container building

Community developing recipes using conda,

neurodebian, neurodocker ….

Page 25: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

Automated container building using github actions

25

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Page 26: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

Architecture

26

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

NeuroDesk – Integrating

our containers on any

Linux OS

CAID – Automated

Container building

PowerUsers on Linux, HPC, CVL

Community developing recipes

Page 27: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

Combining tools from different Containers using modules

27

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Page 28: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

Architecture

28

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

VNM – Lightweight Linux

Desktop in Docker

container

NeuroDesk – Integrating

our containers on any

Linux OS

CAID – Automated

Container building

Community developing recipesUsers on Windows, Mac

Page 29: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

VNM – Interface accessible from any browser ☺

29

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Page 30: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

VNM – Containers are installed when needed ☺

30

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Page 31: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

VNM – Reproducible/Scriptable via lmod module system ☺

31

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Page 32: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

Goal: Run on optimal hardware for job at hand

32

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

• very interactive and flexible

• not very powerful

PC/Laptop

• quite interactive and flexible

• quite powerful

CAI-WKS1...6 + CVL

• not interactive, nor flexible

• very powerful

Clusters: Awoonga, Wiener, Gadi ...

• does everything you want if you can pay for it...

Cloud

1. Storage:

RDM

2. Tools:

NeuroDesk

Page 33: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

CAI-WKS1..6:

33

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Use case

• quite interactive and flexible

Access

[email protected]

RDM Storage• /winmounts/uqusername/data.cai.uq.edu.au/CollectionName-Qxxxx

• /winmounts/uqusername/uq-research/CollectionName-Qxxxx

Applications

• VNM menu

• Module system

28 cores

192 GB RAM

Page 34: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

NeuroDesk on CAI-WKS1 – Menu:

34

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Page 35: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

NeuroDesk on CAI-WKS1..6 – Module System:

35

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Page 36: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

NeuroDesk on CAI-WKS1..6 – Module System:

36

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Page 37: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

Goal: Run on optimal hardware for job at hand

37

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

• very interactive and flexible

• not very powerful

PC/Laptop

• quite interactive and flexible

• quite powerful

CAI-WKS1...6 + CVL

• not interactive, nor flexible

• very powerful

Clusters: Awoonga, Wiener, Gadi ...

• does everything you want if you can pay for it...

Cloud

1. Storage:

RDM

2. Tools:

NeuroDesk

Page 38: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

CVL@Wiener:

38

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Use case

• quite interactive and flexible

Access

[email protected]

RDM Storage

• /afm02/Q[0,1,2,3]/Qxxxx

Applications

• VNM menu

• Module system

12 cores

120 GB RAM

1 GPU

Page 39: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

NeuroDesk on CVL@Wiener – Menu:

39

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Page 40: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

NeuroDesk on CVL@Wiener – Module System:

40

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Page 41: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

Goal: Run on optimal hardware for job at hand

41

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

• very interactive and flexible

• not very powerful

PC/Laptop

• quite interactive and flexible

• quite powerful

CAI-WKS1...6 + CVL

• not interactive, nor flexible

• very powerful

Clusters: Awoonga, Wiener, Gadi ...

• does everything you want if you can pay for it...

Cloud

1. Storage:

RDM

2. Tools:

NeuroDesk

Page 42: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

Awoonga:

42

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Use case

• not interactive, nor flexible, but high performance

Access

[email protected]

RDM Storage

• /QRISdata/Qxxxx

Applications

• https://github.com/NeuroDesk/neurodesk

• Module system

1920 cores

256 GB RAM

Page 43: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

Installation:

Then logout and back in

NeuroDesk on Awoonga – Module System:

43

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Page 44: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

Use:

NeuroDesk on Awoonga – Module System:

44

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Page 45: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

Goal: Run on optimal hardware for job at hand

45

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

• very interactive and flexible

• not very powerful

PC/Laptop

• quite interactive and flexible

• quite powerful

CAI-WKS1...6 + CVL

• not interactive, nor flexible

• very powerful

Clusters: Awoonga, Wiener, Gadi ...

• does everything you want if you can pay for it...

Cloud

1. Storage:

RDM

2. Tools:

NeuroDesk

Page 46: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

Cloud:

46

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Use case

• interactive, flexible, but expensive

Access

• Nectar, Amazon, Google, Microsoft, Oracle …

RDM Storage

• Nextcloud client

Applications

• https://github.com/NeuroDesk/vnm

Infinite cores

512 GB RAM

Page 47: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

Start a compute instance on your provider of choice, then:

Neurodesk VNM in the cloud:

47

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

Port forwarding (don’t change) user@server

Then open browser: http://localhost:6080/ (or any vnc client on localhost:5900)

Page 48: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

• Researcher wants to run an analysis with

Nipype (Python 3), combining tgv_qsm

(Python 2), FSL 6.0.3 (Linux) and MINC

1.9.17 (Prebuilt packages only available

for Ubuntu)

• Develop pipeline interactively on

Windows 10 notebook

• Test analysis on pilot data on a Linux

workstation running Ubuntu 18.04

• Analyse all data on a cluster running

ROCKS Centos

• Visualize results interactively on Windows

10 notebook and prepare for publication

• Share analysis pipeline with readers of

paper

Our use case

48

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

FSL/MINC

now usable

on Windows Python versions

now separated in

containers

Compiling MINC

done in CI/CD

pipeline

Software

setup

reproducible

Outdated

libraries on old

Centos cluster

don’t matter

Patient data

can now

stay local

Page 49: Large scale image processing and high performance computing · Nipype (Python 3), combining tgv_qsm (Python 2), FSL 6.0.3 (Linux) and MINC 1.9.17 (Prebuilt packages only available

Acknowledgements

49

Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk

NEURODESK GITHUB CONTRIBUTORS

@aswinnarayanan Aswin Narayanan

@civier Oren Civier

@crnolan Christopher Nolan

@geetah05 Geeta Hariharaputran

@kaczmarj Jakub Kaczmarzyk

@MartinGrignard Martin Grignard

@meganEJcampbell MEJ Campbell

@stebo85 Steffen Bollmann

@thomshaw92 Thom Shaw