Visualization on BioHPC

Updated for 2016-07-20

[web] portal.biohpc.swmed.edu

[email] [email protected]

Outline

• What is Visualization
  - Scientific visualization
  - Workflow examples

• Visualization on BioHPC
  - Storage and management of images
      BioHPC cluster & Lamella cloud storage
      BioHPC OMERO (Open Microscopy Environment Remote Objects)
  - Image processing/analysis with HPC
      Write your own processing package
      Use existing software
  - Visualization (3D volume rendering)
      Concepts and examples of different types of volume rendering
      GPU rendering vs. CPU rendering
      ImageJ 3D viewer plugin, Amira, Imaris, Paraview, Chimera

What is Visualization

Visualization: a high-bandwidth connection between data on a computer and a human brain.

Visualization and Analysis of 3D Images: larger datasets demand not just visualization, but advanced visualization resources and techniques.

Scientific Visualization : from single image/view to multiple datasets

raw image produced by light-sheet microscopes

Finding collagen fibers

Fiber Alignment

* Example provided by Meghan Driscoll

Scientific Visualization : from multiple datasets to a single image/view

Red blood cell

https://www.alcf.anl.gov/user-guides/vis-paraview-red-blood-cell-tutorial

unstructured mesh

streamline (velocity)

xy-plane slice (based on velocity)

diseased cell

healthy cell

particles

Shneiderman's Mantra

Mantra

• overview
• zoom and filter
• details-on-demand

Ben Shneiderman, "The Thrill of Discovery: Information visualization for high-dimensional spaces". Invited talk at the Center for HCI Design, City University London, September 2007.

Workflow: Get Data/Images -> Data Storage -> Pre-processing -> Visualization -> Segmentation/Analysis

From image to understanding – workflow of whole brain image analysis

TissueVision TissueCyte 1000

7 by 14 mosaic, ~1.2 MB / tile

~120 MB / section, 150 sections

Processing steps: filtering, stitching, resampling, 3D rendering

Resampling: XY plane, factor = 4

3D stack: 16 GB -> 1 GB

Segmentation and analysis: find cells and fibers

* Images and workflow were provided by Denise Ramirez from WBMF
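
The sizes quoted in this workflow can be checked with a little arithmetic. The sketch below is only illustrative: it assumes decimal units (1 GB = 1000 MB) and that the resampling factor of 4 applies to both X and Y while Z is left alone. The function names are made up for this example.

```python
# Back-of-the-envelope data sizes for the whole-brain workflow.
# Assumptions (not stated on the slide): decimal units, and the
# resampling factor applies to both the X and Y axes.

TILES_X, TILES_Y = 7, 14      # mosaic layout per section
MB_PER_TILE = 1.2             # approximate tile size
SECTIONS = 150
XY_FACTOR = 4                 # resampling factor in the XY plane


def section_size_mb(tiles_x, tiles_y, mb_per_tile):
    """Size of one stitched section in MB."""
    return tiles_x * tiles_y * mb_per_tile


def stack_size_gb(section_mb, sections):
    """Size of the full 3D stack in GB (1 GB = 1000 MB)."""
    return section_mb * sections / 1000.0


def resampled_size_gb(stack_gb, xy_factor):
    """Size after downsampling X and Y by xy_factor (Z unchanged)."""
    return stack_gb / (xy_factor ** 2)


section_mb = section_size_mb(TILES_X, TILES_Y, MB_PER_TILE)
stack_gb = stack_size_gb(section_mb, SECTIONS)
small_gb = resampled_size_gb(stack_gb, XY_FACTOR)

print(f"per section: ~{section_mb:.0f} MB")
print(f"full stack:  ~{stack_gb:.1f} GB")
print(f"resampled:   ~{small_gb:.1f} GB")
```

The estimates land close to the rounded figures on the slide; the 16-fold reduction comes from 4 x 4 in the XY plane.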

From image to understanding – workflow of visualizing the native ultrastructure of motile cilia

Tecnai G2 F30, cryo-electron tomography, 66 tilt series

Image alignment and calculation of 3D tomogram

3D subtomogram averaging of axonemal repeats

Native 3D structure of axonemal repeat

* Images and workflow were provided by Kangkang Song from Nicastro Lab

From image to understanding – workflow of bleb detection and tracking

~18.2 KB / slice, 160 slices

Extract Surface

3D rendering

Preprocessing

MESPIM

2-photon Bessel Beam Light Sheet Microscope

~13.8 MB

Segmentation

Locating Protrusions

• Images were taken by Reto Fiolka, Kevin Dean, and Erik Welf from Gaudenz Lab
• Bleb detection and tracking were processed by Meghan Driscoll from Gaudenz Lab

Resources on BioHPC

1. Storage and management of images

2. Analyzing microscope images

3. Scientific Visualization (3D volume rendering)

G’MIK

EMAN2

eTomo

IMOD/PEET

BioHPC Cluster

Lamella Cloud Storage

Part I : Storage and management of images

BioHPC provides ample storage (~3.7 PB) and various methods to access the storage system.

Problem: how do you find a single 20 MB image among terabytes of data?

50 GB

BioHPC File Exchange

BioHPC Cluster

Compute nodes

Lamella Cloud Storage/Gateway

UTSW private cloud

1-way access

2-way access

Part I : Storage and management of images -- OMERO

OMERO (Open Microscopy Environment Remote Objects): a modern client-server software platform for visualizing, managing, and annotating scientific image data.

BioHPC OMERO Server: http://imagebank.biohpc.swmed.edu

Client

• Download and install the OMERO desktop client from the login page

• Use the OMERO client module on the BioHPC cluster:
    module add omero/5.1.0
    OMEROinsight_unix.sh

Part I : Storage and management of images -- Benefits of using BioHPC OMERO

• Keeps your images well organized.
• Allows you to import and archive your images, and to annotate and tag them.
• Allows you to collaborate with colleagues by creating user groups with different permission levels.
• Consists of a web application and cross-platform clients, as well as an ImageJ plugin, MATLAB bindings, etc.
• Can serve as a backup for important/expensive images.

Part I : Storage and management of images -- Using OMERO with ImageJ

Tutorial: http://help.openmicroscopy.org/imagej.html

Part I : Storage and management of images : Using OMERO with MATLAB

Example: open image & add comments

% create a connection to OMERO
[session, client] = connectToBioHPCOmero;

% read image by imageID
imageId = 3117;
targetImage = getImages(session, imageId);
% getPlane takes zero-based indices in the order (z, c, t):
% first z-section, first channel, first timepoint
binaryData = getPlane(session, targetImage, 0, 0, 0);

% plot
figure;
imshow(binaryData);

% write comments
comments = 'comment by MATLAB @ 2016';
newCommentAnnotation = writeCommentAnnotation(session, comments);
link1 = linkAnnotation(session, newCommentAnnotation, 'image', imageId);

% close connection to OMERO
unloadOmero();

Download the latest OMERO.matlab toolbox from: http://downloads.openmicroscopy.org/omero/5.2.4/

Download two additional MATLAB scripts from the portal and save them into the toolbox folder: connectToBioHPCOmero.m and GetAuthentication.m

Tutorial: https://www.openmicroscopy.org/site/support/omero5.2/developers/Matlab.html

Part II : Image processing/Analyzing with HPC

Even when the image data itself is small, processing it may require far more computational resources.

Example: 3D image --> IsoVolume --> Extract Surface

Stage            Grid Type       No. Cells   No. Points  Memory
3D image         Structured      7,001,280   7,362,900   14 MB
IsoVolume        Unstructured    3,167,094   1,736,075   200 MB
Extract Surface  Polygonal Mesh  1,576,122   1,074,887   65 MB
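
One way to read the table is as memory per cell. The sketch below uses only the numbers from the table and assumes 1 MB = 1,000,000 bytes; the dictionary layout and function name are just for illustration.

```python
# Memory per cell for each stage, from the table above.
# Assumption: 1 MB = 1,000,000 bytes.

stages = {
    "3D image":        {"cells": 7_001_280, "memory_mb": 14},
    "IsoVolume":       {"cells": 3_167_094, "memory_mb": 200},
    "Extract Surface": {"cells": 1_576_122, "memory_mb": 65},
}


def bytes_per_cell(cells, memory_mb):
    """Average memory footprint of one cell, in bytes."""
    return memory_mb * 1_000_000 / cells


for name, s in stages.items():
    per_cell = bytes_per_cell(s["cells"], s["memory_mb"])
    print(f"{name:16s} {per_cell:6.1f} bytes/cell")
```

The structured image needs about 2 bytes per cell, while the unstructured IsoVolume needs roughly 30 times more, because an unstructured grid has to store explicit point coordinates and cell connectivity.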

Part II : Image processing/Analyzing with HPC : parallelization

Upcoming training sessions:

• Parallel Programming in Matlab on BioHPC (MDCS and parallel toolbox), Sep. 21st

• Python bioinformatics on BioHPC, Oct. 19th

Part II : Image processing/Analyzing with HPC : MPI enabled software

Online Tutorials

eTomo, IMOD/PEET:
http://bio3d.colorado.edu/imod/doc/etomoTutorial.html
http://bio3d.colorado.edu/PEET/PEETmanual.html

RELION:
http://www2.mrc-lmb.cam.ac.uk/groups/scheres/relion13_tutorial.pdf

EMAN2:
http://blake.bcm.edu/emanwiki/EMAN2/Tutorials

Part II : Image processing/Analyzing with HPC -- PEET

PEET is used in conjunction with IMOD; eTomo is IMOD's GUI.

Steps:

Step 1: processchunks (breaks the job up into "chunks", each for a single CPU)
Step 2: salloc -> returns JobID
Step 3: srun with JobID, queuechunks (from the initial processchunks command)
Step 4: clean up allocation

Required module version: peet/1.11.0-alpha (will load imod/4.8.50-beta as well)

Part II : Image processing/Analyzing with HPC -- PEET

Queue    Max # CPUs   # CPUs / node
128 GB   16*16=256    16
256 GB   24*16=384    24
384 GB   16*2=32      16
super    16*16=256    16
GPU      16*2=32      16

Submit directly from the eTomo GUI.

Part II : Image processing/Analyzing with HPC -- RELION (Single-particle processing)

module add relion/intel/mvapich2/1.4
relion &

The built-in job scheduler interface of RELION is PBS (qsub), so an MPI job cannot be submitted directly to the BioHPC cluster by pressing the Run button.

SLURM options: --ntasks, --cpus-per-task, --mem-per-cpu

Part II : Image processing/Analyzing with HPC -- RELION

Submit with the SLURM job scheduler:

#!/bin/bash

#SBATCH --job-name=MPI_relion
#SBATCH --partition=super
#SBATCH --nodes=2
#SBATCH --ntasks=8
#SBATCH --time=0-80:00:00
#SBATCH --output=MPI_relion.%j.out

module add relion/intel/mvapich2/1.4

mpirun relion_refine_mpi --o Class3D_2classes/run5 --i all_particles.star --particle_diameter 268 --angpix 1.687 --ref run1_C1_sort_it025_class001.mrc --firstiter_cc --ini_high 30 --iter 25 --tau2_fudge 4 --flatten_solvent --zero_mask --ctf --ctf_corrected_ref --ctf_phase_flipped --sym C1 --K 2 --oversampling 1 --healpix_order 3 --offset_range 5 --offset_step 2 --norm --scale --j 4 --dont_combine_weights_via_disc --dont_centralize_image_reading --limit_tilt 30 --helix --htune --htake 0.25 --hsuper 4 --hinner 0 --houter 80 --hrise 5.10 --hturn -101.25

• Q: How big is your data?
  Choose the partition and number of nodes to fit your data:
    total tasks / number of nodes <= 32
    memory needed for each task * tasks on each node <= 128 GB / 256 GB / 384 GB

• Q: What is the maximum speed-up you could achieve?
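
The two sizing rules for choosing a partition and node count can be written as a small check. This is a minimal sketch, not a BioHPC tool: the function name is hypothetical, and the 20 GB per task used in the examples is an invented figure for illustration.

```python
# Check a planned MPI job against the two sizing rules above.
# The function name and the example memory figure are hypothetical.

MAX_TASKS_PER_NODE = 32  # limit quoted on the slide


def fits_on_nodes(total_tasks, nodes, mem_per_task_gb, node_mem_gb):
    """Return True if the job satisfies both per-node constraints."""
    # Ceiling division: the busiest node carries this many tasks.
    tasks_per_node = -(-total_tasks // nodes)
    within_task_limit = tasks_per_node <= MAX_TASKS_PER_NODE
    within_memory = tasks_per_node * mem_per_task_gb <= node_mem_gb
    return within_task_limit and within_memory


# The RELION script requests 8 tasks on 2 nodes (4 tasks per node).
# Assuming, for illustration, 20 GB per task:
print(fits_on_nodes(8, 2, 20, 128))   # 4 * 20 = 80 GB on a 128 GB node
print(fits_on_nodes(8, 1, 20, 128))   # 8 * 20 = 160 GB on a 128 GB node
print(fits_on_nodes(8, 1, 20, 256))   # 8 * 20 = 160 GB on a 256 GB node
```

If a job fails the memory check, the options are to spread it over more nodes or to move it to a partition with larger nodes.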

Part II : Image processing/Analyzing with HPC – EMAN2

Submit directly from the EMAN2 GUI:

module add eman2/2.12-PyDusa
e2projectmanager.py

Parallel options:
  mpi:<nproc>:/path/to/scratch
  thread:<nproc>

Only submits the job to the current node

Part II : Image processing/Analyzing with HPC – EMAN2

submit with SLURM job scheduler

#!/bin/bash
#SBATCH --job-name=e2refine2nodes
#SBATCH --partition=super
#SBATCH --nodes=2
#SBATCH --ntasks=64
#SBATCH --time=0-02:30:00
#SBATCH --output=e2refine2nodes.%j.out
#SBATCH --error=e2refine2nodes.%j.err

module add eman2/2.12-PyDusa

ulimit -l unlimited

e2refine2d.py --input=sets/all__ctf_flip_small.lst --ncls=80 --normproj --fastseed --iter=6 --nbasisfp=12 --naliref=12 --parallel=mpi:64:mpi_folder --center=xform.center --simalign=rotate_translate_flip --simaligncmp=ccc --simraligncmp=dot --simcmp=ccc --classkeep=0.85 --classiter=5 --classalign=rotate_translate_flip --classaligncmp=ccc --classraligncmp=ccc --classaverager=mean --classcmp=ccc --classnormproc=normalize.edgemean

Part II : Image processing/Analyzing with HPC – Performance Optimization

6.2x 3.5x 3.1x

• Not all portions of e2refine2d can be processed in parallel
• MPI has more overhead than multithreading

• Always use multithreading on a single node
• Use multiple nodes only when necessary
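
The point that not every portion of a program can run in parallel is usually quantified with Amdahl's law: with a parallel fraction p and n workers, the ideal speed-up is 1 / ((1 - p) + p / n). The sketch below is illustrative only; the parallel fractions are invented and are not measurements of e2refine2d, and communication overhead (the MPI bullet) is ignored.

```python
# Amdahl's law: upper bound on speed-up when only part of a job
# can run in parallel. The parallel fractions below are made-up
# illustrations, not measurements of e2refine2d.


def amdahl_speedup(parallel_fraction, workers):
    """Ideal speed-up with `workers` workers, ignoring overhead."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def max_speedup(parallel_fraction):
    """Limit of the speed-up as the number of workers grows."""
    return 1.0 / (1.0 - parallel_fraction)


for p in (0.5, 0.9, 0.95):
    row = ", ".join(
        f"{n} workers: {amdahl_speedup(p, n):.1f}x" for n in (16, 32, 64)
    )
    print(f"parallel fraction {p:.2f} -> {row} (limit {max_speedup(p):.0f}x)")
```

Doubling the worker count gives diminishing returns once the serial part dominates, which is one reason to add nodes only when a single node is not enough.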

Part III: Scientific Visualization (3D volume rendering)

Converting 3D datasets to 2D images with 3D effects (voxel -> pixel)

• Direct volume rendering (DVR): volume ray casting, texture-based volume rendering, splatting

• Projection view: maximum intensity projection (MIP)

• Threshold-based rendering: isosurface

http://scripts.crida.net/gh/wp-content/uploads/2012/08/ARC2002_Amira_Investigation.pdf

Texture-based volume rendering (VOLTEX)

Maximum intensity projection (PROJECTIONVIEW)

Isosurface rendering (ISOSURFACE)

DVR with shading effect (VOLREN)
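
Of the rendering types listed, maximum intensity projection is the simplest to write down: each output pixel is the brightest voxel along the viewing direction. The sketch below is a pure-Python illustration on an invented 3 x 2 x 2 volume, projecting along Z; real tools do this on the GPU or with array libraries.

```python
# Maximum intensity projection (MIP) of a small 3D volume along Z.
# Pure Python on nested lists, indexed as volume[z][y][x]; the tiny
# volume is invented for illustration.


def max_intensity_projection(volume):
    """Collapse volume[z][y][x] to image[y][x] by taking the max over z."""
    depth = len(volume)
    height = len(volume[0])
    width = len(volume[0][0])
    return [
        [max(volume[z][y][x] for z in range(depth)) for x in range(width)]
        for y in range(height)
    ]


volume = [
    [[0, 1], [2, 0]],   # z = 0
    [[5, 0], [1, 3]],   # z = 1
    [[2, 4], [0, 1]],   # z = 2
]

print(max_intensity_projection(volume))  # [[5, 4], [2, 3]]
```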

Part III: Scientific Visualization (3D volume rendering)

Example: volume ray casting

(1) Ray Casting (2) Sampling (3) Shading (4) Compositing

https://en.wikipedia.org/wiki/Volume_ray_casting

Involves a large amount of computation
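
To make the four steps concrete, here is a minimal sketch of what happens along a single ray. It assumes the samples have already been taken (steps 1 and 2) and uses an invented linear transfer function for shading, followed by front-to-back alpha compositing. A real renderer repeats this for every pixel, which is where the computational cost comes from.

```python
# One ray of a volume ray caster, following the four steps above.
# The transfer function and sample values are invented for illustration.


def transfer_function(intensity):
    """Shading step: map a scalar sample in [0, 1] to (gray, opacity)."""
    gray = intensity               # brighter samples render brighter
    opacity = 0.5 * intensity      # and are more opaque
    return gray, opacity


def composite_ray(samples, opacity_cutoff=0.99):
    """Front-to-back alpha compositing of the samples along one ray."""
    color = 0.0
    alpha = 0.0
    for intensity in samples:                  # samples ordered near -> far
        gray, opacity = transfer_function(intensity)
        color += (1.0 - alpha) * opacity * gray
        alpha += (1.0 - alpha) * opacity
        if alpha >= opacity_cutoff:            # early ray termination
            break
    return color, alpha


# (1) ray casting and (2) sampling would produce a list like this for
# each pixel; here it is hard-coded.
samples_along_ray = [0.0, 0.2, 1.0, 1.0, 0.4]
color, alpha = composite_ray(samples_along_ray)
print(f"pixel value {color:.3f}, accumulated opacity {alpha:.3f}")
```

Because each ray is independent of the others, this loop maps naturally onto the many cores of a GPU.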

Part III: Scientific Visualization (3D volume rendering)

How much faster is GPU rendering compared to CPU rendering?

BioHPC GPU resources

                      Nucleus044-049  Nucleus042-043  BioHPC Workstation
GPU card              Tesla K40       Tesla K20       Quadro K600
GPU memory            12 GB           5 GB            1 GB
GPU memory bandwidth  288 GB/s        208 GB/s        29.0 GB/s
CUDA cores            2880            2496            192

Benchmarking (with the same image quality)

http://blog.boxxtech.com/2014/10/02/gpu-rendering-vs-cpu-rendering-a-method-to-compare-render-times-with-empirical-benchmarks/

Rendering can be significantly faster on a high-end GPU: the GPU was 6.2 times faster than the CPU in this case.

many slower cores vs. few fast cores

Part III: Scientific Visualization (3D volume rendering)

BioHPC WebGPU: launch a graphical environment on the Nucleus cluster

Web-based visualization access: https://portal.biohpc.swmed.edu/terminal/webgui/

• WebGUI: reserve a CPU node

• WebGPU: reserve a GPU node; good for ImageJ, Amira, stand-alone Paraview, or compiling your CUDA code
    step 1: add the visualization software as a module
    step 2: vglrun <software name>

• WebWinDCV: reserve a GPU node running Windows; good for Imaris

• PVGPU: reserve a GPU node; client-server mode for Paraview

Part III: Scientific Visualization (3D volume rendering) -- Chimera

Define your own color map

Part III: Scientific Visualization (3D volume rendering) -- Chimera

Chimera Movie Making

Tutorials:
https://www.cgl.ucsf.edu/chimera/animations/animations.html
https://www.youtube.com/watch?v=hQxKYSUdiD8&list=PLHib7JgKNUUeTZONxd0h0WBiZzAJmXmva

Part III: Scientific Visualization (3D volume rendering) -- Big Data Visualization

How to interactively manage visually overwhelming amounts of data

Option A: Resampling the original dataset

Figure panels: original data; subsample rate 2 pixels; subsample rate 4 pixels; subsample rate 8 pixels

Option B: Parallel visualization, e.g., Paraview for interactive remote parallel visualization

- Allocate multiple nodes for visualization; each node/process handles one part of the image
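
Option A can be sketched in a few lines: keep every n-th sample along each axis. The example below is a pure-Python illustration on an invented 8 x 8 x 8 volume and assumes the subsample rate is applied to all three axes, which may differ from how the figure was produced.

```python
# Option A in miniature: keep every `rate`-th voxel along each axis.
# Pure Python on nested lists indexed as volume[z][y][x]; the 8x8x8
# test volume is invented for illustration.


def subsample(volume, rate):
    """Return the volume with every `rate`-th sample kept on each axis."""
    return [
        [row[::rate] for row in plane[::rate]]
        for plane in volume[::rate]
    ]


def voxel_count(volume):
    """Total number of voxels in volume[z][y][x]."""
    return len(volume) * len(volume[0]) * len(volume[0][0])


original = [[[x + y + z for x in range(8)] for y in range(8)] for z in range(8)]

for rate in (2, 4, 8):
    reduced = subsample(original, rate)
    print(f"rate {rate}: {voxel_count(original)} -> {voxel_count(reduced)} voxels")
```

Striding is the simplest form of resampling; averaging neighbouring voxels before dropping them reduces aliasing at the cost of extra computation.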

Part III: Scientific Visualization (3D volume rendering) -- Big Data Visualization

Client-Server mode of Paraview (remote visualization)

Client, Data Server, Render Server

Minimize Communication

Render Remotely

Send Images

The ParaView Guide, p. 191

Part III: Scientific Visualization (3D volume rendering) -- Big Data Visualization

Paraview with multiple processes (parallel visualization)

• Launch a PVGPU job from Web Visualization with a single GPU node
• https://portal.biohpc.swmed.edu/terminal/webgui/
• You will have both the Paraview server (pvserver, running 15 MPI processes) and the client running on the same GPU node

Last but not least

Summary

• The primary goal of visualization is insight
• Visualization system technology improves with advances in GPUs and HPC solutions
• There are challenges at every stage of visualization when operating on large data
• We will continue working on delivering better visualization and analytic tools
• For questions and comments, please email the ticket system: biohpc-[email protected]

Acknowledgement

Denise Ramirez, Kangkang Song, Meghan Driscoll, Reto Fiolka, Kevin Dean, Erik Welf, Gaudenz Danuser, Philippe Roudot, Liqiang Wang, David Trudgian