large scale image processing and high performance computing · nipype (python 3), combining tgv_qsm...
Post on 04-Apr-2021
5 Views
Preview:
TRANSCRIPT
Large scale image processing and high
performance computing
Steffen BollmannResearch Fellow
School of Information Technology and Electrical Engineering
David Butler, Quang Tieng, Alan Hockings, Jake Carroll
Oren Civier, Aswin Narayanan, Markus Barth, Tom Johnstone
Jakub Kaczmarzyk, Martin Grignard
Update on CAI imaging
workstations
Show integration of different compute
systems at UQ
Feedback
On Developments
Goals of this seminar
2
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
Large ecosystem of scientific software …
3
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
Large ecosystem of scientific software …
4
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
Most tools require Linux
Tools are not available in standard package systems
Compiling from source often a nightmare
Conflicting dependencies
Reinstalling tools on different platforms takes time
Differing results between software versions
… creating problems for researchers:
5
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
Most tools require Linux
Tools are not available in standard package systems
Compiling from source often a nightmare
Conflicting dependencies
Reinstalling tools on different platforms takes time
Differing results between software versions
… creating problems for researchers:
6
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
Most tools require Linux
Tools are not available in standard package systems
Compiling from source often a nightmare
Conflicting dependencies
Reinstalling tools on different platforms takes time
Differing results between software versions
… creating problems for researchers:
7
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
Most tools require Linux
Tools are not available in standard package systems
Compiling from source often a nightmare
Conflicting dependencies
Reinstalling tools on different platforms takes time
Differing results between software versions
… creating problems for researchers:
8
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
Most tools require Linux
Tools are not available in standard package systems
Compiling from source often a nightmare
Conflicting dependencies
Reinstalling tools on different platforms takes time
Differing results between software versions
… creating problems for researchers:
9
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
Most tools require Linux
Tools are not available in standard package systems
Compiling from source often a nightmare
Conflicting dependencies
Reinstalling tools on different platforms takes time
Differing results between software versions
… creating problems for researchers:
10
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
• glibc 2.5 vs 2.18 deliver different floating-point results
• leads to significant differences in long pipelines
GLIBC 2.5 vs 2.18
11
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
How to help with this…
https://xkcd.com/1987/ 12
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
… but avoid …
https://xkcd.com/927/ 13
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
• Researcher wants to run an analysis
with Nipype (Python 3), combining
tgv_qsm (Python 2), FSL 6.0.3 (Linux)
and MINC 1.9.17
Let’s start with a use case
14
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
• Researcher wants to run an analysis with
Nipype (Python 3), combining tgv_qsm
(Python 2), FSL 6.0.3 (Linux) and MINC
1.9.17 (Prebuilt packages only available
for Ubuntu)
• Develop pipeline interactively on
Windows 10 notebook
• Test analysis on pilot data on a Linux
workstation running Ubuntu 18.04
• Analyse all data on a cluster running
ROCKS Centos
• Visualize results interactively on Windows
10 notebook and prepare for publication
• Share analysis pipeline with readers of
paper
Let’s start with a use case
15
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
FSL/MINC
do not run
on Windows Mixing Python
versions
causes trouble
Compiling MINC
fails due to
missing libraries
Software
setup not
reproducible
Outdated
libraries on
old Centos
cluster
What exists already and how can we combine efforts?
16
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
Virtual Machines VS Containers
17
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
• e.g. itksnapApplication
• e.g. QT4Libraries
• e.g. Ubuntu 16.04Guest OS
• e.g. VirtualboxHypervisor
• e.g. Centos 6Host OS
• e.g. Dell Precision
Hardware
Storage: 10 GB
Startup: 15s
RAM: 4GB
Storage: 0.1 GB
Startup: 0.2 s
RAM: 0.1 GB
Where to Run analyses?
18
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
• very interactive and flexible
• not very powerful
PC/Laptop
• quite interactive and flexible
• quite powerful
CAI-WKS1...6 + CVL
• not interactive, nor flexible
• very powerful
Clusters: Awoonga, Wiener, Gadi ...
• does everything you want if you can pay for it...
Cloud
Goal: Run on optimal hardware for job at hand
19
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
• very interactive and flexible
• not very powerful
PC/Laptop
• quite interactive and flexible
• quite powerful
CAI-WKS1...6 + CVL
• not interactive, nor flexible
• very powerful
Clusters: Awoonga, Wiener, Gadi ...
• does everything you want if you can pay for it...
Cloud
1. Storage:
RDM
2. Tools:
NeuroDesk
Goal: Run on optimal hardware for job at hand
20
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
• very interactive and flexible
• not very powerful
PC/Laptop
• quite interactive and flexible
• quite powerful
CAI-WKS1...6 + CVL
• not interactive, nor flexible
• very powerful
Clusters: Awoonga, Wiener, Gadi ...
• does everything you want if you can pay for it...
Cloud
1. Storage:
RDM
2. Tools:
NeuroDesk
PC/Laptop:
21
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
Use case
• very interactive and flexible
Access
• helpdesk@cai.uq.edu.au
RDM Storage
• R-drive
• UQ-InstGateway data.cai.uq.edu.au
Linux-Applications
• https://github.com/NeuroDesk/vnm/ (requires Docker)
4 cores
16 GB RAM
NeuroDesk
22
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
• Community project
• Started at
Organisation for
Human Brain
Mapping Hackathon
• Docker Linux, Mac, Windows
• SingularityScale to HPC
• Full Linux desktop interfaceInteractive
• Tools are installed on demandLightweight
• NeuroDebian, conda, NeuroDockerRe-use existing repositories
Design principles for NeuroDesk
23
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
Architecture
24
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
CAID – Automated
Container building
Community developing recipes using conda,
neurodebian, neurodocker ….
Automated container building using github actions
25
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
Architecture
26
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
NeuroDesk – Integrating
our containers on any
Linux OS
CAID – Automated
Container building
PowerUsers on Linux, HPC, CVL
Community developing recipes
Combining tools from different Containers using modules
27
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
Architecture
28
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
VNM – Lightweight Linux
Desktop in Docker
container
NeuroDesk – Integrating
our containers on any
Linux OS
CAID – Automated
Container building
Community developing recipesUsers on Windows, Mac
VNM – Interface accessible from any browser ☺
29
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
VNM – Containers are installed when needed ☺
30
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
VNM – Reproducible/Scriptable via lmod module system ☺
31
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
Goal: Run on optimal hardware for job at hand
32
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
• very interactive and flexible
• not very powerful
PC/Laptop
• quite interactive and flexible
• quite powerful
CAI-WKS1...6 + CVL
• not interactive, nor flexible
• very powerful
Clusters: Awoonga, Wiener, Gadi ...
• does everything you want if you can pay for it...
Cloud
1. Storage:
RDM
2. Tools:
NeuroDesk
CAI-WKS1..6:
33
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
Use case
• quite interactive and flexible
Access
• helpdesk@cai.uq.edu.au
RDM Storage• /winmounts/uqusername/data.cai.uq.edu.au/CollectionName-Qxxxx
• /winmounts/uqusername/uq-research/CollectionName-Qxxxx
Applications
• VNM menu
• Module system
28 cores
192 GB RAM
NeuroDesk on CAI-WKS1 – Menu:
34
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
NeuroDesk on CAI-WKS1..6 – Module System:
35
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
NeuroDesk on CAI-WKS1..6 – Module System:
36
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
Goal: Run on optimal hardware for job at hand
37
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
• very interactive and flexible
• not very powerful
PC/Laptop
• quite interactive and flexible
• quite powerful
CAI-WKS1...6 + CVL
• not interactive, nor flexible
• very powerful
Clusters: Awoonga, Wiener, Gadi ...
• does everything you want if you can pay for it...
Cloud
1. Storage:
RDM
2. Tools:
NeuroDesk
CVL@Wiener:
38
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
Use case
• quite interactive and flexible
Access
• helpdesk@qbi.uq.edu.au
RDM Storage
• /afm02/Q[0,1,2,3]/Qxxxx
Applications
• VNM menu
• Module system
12 cores
120 GB RAM
1 GPU
NeuroDesk on CVL@Wiener – Menu:
39
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
NeuroDesk on CVL@Wiener – Module System:
40
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
Goal: Run on optimal hardware for job at hand
41
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
• very interactive and flexible
• not very powerful
PC/Laptop
• quite interactive and flexible
• quite powerful
CAI-WKS1...6 + CVL
• not interactive, nor flexible
• very powerful
Clusters: Awoonga, Wiener, Gadi ...
• does everything you want if you can pay for it...
Cloud
1. Storage:
RDM
2. Tools:
NeuroDesk
Awoonga:
42
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
Use case
• not interactive, nor flexible, but high performance
Access
• rcc-support@uq.edu.au
RDM Storage
• /QRISdata/Qxxxx
Applications
• https://github.com/NeuroDesk/neurodesk
• Module system
1920 cores
256 GB RAM
Installation:
Then logout and back in
NeuroDesk on Awoonga – Module System:
43
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
Use:
NeuroDesk on Awoonga – Module System:
44
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
Goal: Run on optimal hardware for job at hand
45
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
• very interactive and flexible
• not very powerful
PC/Laptop
• quite interactive and flexible
• quite powerful
CAI-WKS1...6 + CVL
• not interactive, nor flexible
• very powerful
Clusters: Awoonga, Wiener, Gadi ...
• does everything you want if you can pay for it...
Cloud
1. Storage:
RDM
2. Tools:
NeuroDesk
Cloud:
46
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
Use case
• interactive, flexible, but expensive
Access
• Nectar, Amazon, Google, Microsoft, Oracle …
RDM Storage
• Nextcloud client
Applications
• https://github.com/NeuroDesk/vnm
Infinite cores
512 GB RAM
Start a compute instance on your provider of choice, then:
Neurodesk VNM in the cloud:
47
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
Port forwarding (don’t change) user@server
Then open browser: http://localhost:6080/ (or any vnc client on localhost:5900)
• Researcher wants to run an analysis with
Nipype (Python 3), combining tgv_qsm
(Python 2), FSL 6.0.3 (Linux) and MINC
1.9.17 (Prebuilt packages only available
for Ubuntu)
• Develop pipeline interactively on
Windows 10 notebook
• Test analysis on pilot data on a Linux
workstation running Ubuntu 18.04
• Analyse all data on a cluster running
ROCKS Centos
• Visualize results interactively on Windows
10 notebook and prepare for publication
• Share analysis pipeline with readers of
paper
Our use case
48
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
FSL/MINC
now usable
on Windows Python versions
now separated in
containers
Compiling MINC
done in CI/CD
pipeline
Software
setup
reproducible
Outdated
libraries on old
Centos cluster
don’t matter
Patient data
can now
stay local
Acknowledgements
49
Steffen Bollmann | @sbollmann_MRI | github.com/NeuroDesk
NEURODESK GITHUB CONTRIBUTORS
@aswinnarayanan Aswin Narayanan
@civier Oren Civier
@crnolan Christopher Nolan
@geetah05 Geeta Hariharaputran
@kaczmarj Jakub Kaczmarzyk
@MartinGrignard Martin Grignard
@meganEJcampbell MEJ Campbell
@stebo85 Steffen Bollmann
@thomshaw92 Thom Shaw
top related