TRANSCRIPT
Visualization on BioHPC
Updated for 2016-07-20
[web] portal.biohpc.swmed.edu
[email] [email protected]
Outline
• What is Visualization
  - Scientific visualization
  - Workflow examples
• Visualization on BioHPC
  - Storage and management of images
    BioHPC cluster & Lamella cloud storage
    BioHPC OMERO (Open Microscopy Environment Remote Objects)
  - Image processing/analysis with HPC
    Write your own processing package
    Use existing software
  - Visualization (3D volume rendering)
    Concepts and examples of different types of volume rendering
    GPU rendering vs. CPU rendering
    ImageJ 3D Viewer plugin, Amira, Imaris, Paraview, Chimera
What is Visualization
Visualization: a high-bandwidth connection between data on a computer and a human brain.
Visualization and analysis of 3D images: larger datasets demand not just visualization, but advanced visualization resources and techniques.
Scientific Visualization : from single image/view to multiple datasets
raw image produced by light-sheet microscopes
Finding collagen fibers
Fiber Alignment
* Example provided by Meghan Driscoll
Scientific Visualization : from multiple datasets to a single image/view
Red blood cell
https://www.alcf.anl.gov/user-guides/vis-paraview-red-blood-cell-tutorial
unstructured mesh
streamline (velocity)
xy-plane slice (based on velocity)
diseased cell
healthy cell
particles
Shneiderman's Mantra
Mantra
• overview
• zoom and filter
• details-on-demand
Ben Shneiderman, The Thrill of Discovery: Information Visualization for High-Dimensional Spaces
Invited talk @ Centre for HCI Design, City University London
September 2007
Get Data/Images -> Data Storage -> Pre-processing -> Visualization -> Segmentation/Analysis
From image to understanding – workflow of whole brain image analysis
TissueVision TissueCyte 1000
7 x 14 mosaic, ~1.2 MB / tile
filtering -> stitching -> resampling -> 3D rendering
XY plane, factor = 4
~120 MB / section, 150 sections
3D stack: 16 GB -> 1 GB
Segmentation/analysis: find cells and fibers
* Images and workflow were provided by Denise Ramirez from WBMF
From image to understanding – workflow of visualizing the native ultrastructure of motile cilia
Tecnai G2 F30 Cryo-electron tomography 66 tile series
Image alignment and calculation of 3D tomogram
3D subtomogram averaging of axonemal repeats
Native 3D structure of axonemal repeat
* Images and workflow were provided by Kangkang Song from Nicastro Lab
From image to understanding – workflow of bleb detection and tracking
~18.2 KB / slice, 160 slices
Extract surface, 3D rendering
preprocessing
MESPIM, 2-photon Bessel Beam Light Sheet Microscope
~13.8 MB
Segmentation: locating protrusions
• Images were taken by Reto Fiolka, Kevin Dean, and Erik Welf from the Gaudenz Lab
• Bleb detection and tracking were processed by Meghan Driscoll from the Gaudenz Lab
1. Storage and management of images
2. Analyzing Microscope images
3. Scientific Visualization (3D volume rendering)
Resources on BioHPC
EMAN2
eTomo, IMOD/PEET
BioHPC Cluster
Lamella Cloud Storage
Part I : Storage and management of images
BioHPC provides ample storage (~3.7 PB) and various methods to access the storage system.
Problem: how do you find a single 20 MB image among terabytes of data?
BioHPC File Exchange (50 GB)
BioHPC Cluster
Compute nodes
Lamella Cloud Storage/Gateway
UTSW private cloud
1-way access / 2-way access
Part I : Storage and management of images -- OMERO
OMERO (Open Microscopy Environment Remote Objects): a modern client-server software platform for visualizing, managing, and annotating scientific image data.
BioHPC OMERO Server: http://imagebank.biohpc.swmed.edu
Client
• Download and install the OMERO desktop client from the login page
• Use the OMERO client module on the BioHPC cluster:
  module add omero/5.1.0
  OMEROinsight_unix.sh
Part I : Storage and management of images -- Benefits of using BioHPC OMERO
• Keeps your images well organized.
• Allows you to import and archive your images, and to annotate and tag them.
• Allows you to collaborate with colleagues by creating user groups with different permission levels.
• Consists of a web application and cross-platform clients, as well as an ImageJ plugin, MATLAB bindings, etc.
• Can serve as a backup of important/expensive images.
Part I : Storage and management of images -- Using OMERO with ImageJ
Tutorial:http://help.openmicroscopy.org/imagej.html
Part I : Storage and management of images : Using OMERO with MATLAB
Example: open an image & add comments

% create a connection to OMERO
[session, client] = connectToBioHPCOmero;

% read image by imageID
imageId = 3117;
targetImage = getImages(session, imageId);
binaryData = getPlane(session, targetImage, 0, 0, 0); % z=1, time=1, channel=1

% plot
figure;
imshow(binaryData);

% write comments
comments = 'comment by MATLAB @ 2016';
newCommentAnnotation = writeCommentAnnotation(session, comments);
link1 = linkAnnotation(session, newCommentAnnotation, 'image', imageId);

% close connection to OMERO
unloadOmero();

Download the latest OMERO.matlab toolbox from: http://downloads.openmicroscopy.org/omero/5.2.4/
Download two additional MATLAB scripts from the portal and save them into the toolbox folder: connectToBioHPCOmero.m and GetAuthentication.m
Tutorial: https://www.openmicroscopy.org/site/support/omero5.2/developers/Matlab.html
Part II : Image processing/Analyzing with HPC
Even if the image/data itself is small, processing it may require far more computational resources.
Example: 3D image --> IsoVolume --> Extract Surface
Stage            Grid Type       No. Cells   No. Points  Memory
3D image         Structured      7,001,280   7,362,900   14 MB
IsoVolume        Unstructured    3,167,094   1,736,075   200 MB
Extract Surface  Polygonal Mesh  1,576,122   1,074,887   65 MB
Part II : Image processing/Analyzing with HPC : parallelization
Upcoming training sessions:
• Parallel Programming in MATLAB on BioHPC (MDCS and the Parallel Computing Toolbox), Sep. 21st
• Python bioinformatics on BioHPC, Oct. 19th
Part II : Image processing/Analyzing with HPC : MPI enabled software
Online Tutorials
• eTomo, IMOD/PEET:
  http://bio3d.colorado.edu/imod/doc/etomoTutorial.html
  http://bio3d.colorado.edu/PEET/PEETmanual.html
• RELION:
  http://www2.mrc-lmb.cam.ac.uk/groups/scheres/relion13_tutorial.pdf
• EMAN2:
  http://blake.bcm.edu/emanwiki/EMAN2/Tutorials
Part II : Image processing/Analyzing with HPC -- PEET
PEET is used in conjunction with IMOD; eTomo is IMOD's GUI interface.
Steps:
Step 1: processchunks (breaks the job up into "chunks", each for a single CPU)
Step 2: salloc -> returns a JobID
Step 3: srun with the JobID, queuechunks (from the initial processchunks command)
Step 4: clean up the allocation
Required module version: peet/1.11.0-alpha (will load imod/4.8.50-beta as well)
Part II : Image processing/Analyzing with HPC -- PEET
Queue    Max # CPUs   # CPUs / node
128 GB   16*16=256    16
256 GB   24*16=384    24
384 GB   16*2=32      16
super    16*16=256    16
GPU      16*2=32      16
submit directly from eTomo GUI interface
Part II : Image processing/Analyzing with HPC -- RELION (Single-particle processing)
module add relion/intel/mvapich2/1.4
relion &
The built-in job scheduler of RELION is PBS (qsub); you cannot submit an MPI job directly to the BioHPC cluster by pressing the Run button. Use the SLURM options instead:
--ntasks
--cpus-per-task
--mem-per-cpu
Part II : Image processing/Analyzing with HPC -- RELION
#!/bin/bash
#SBATCH --job-name=MPI_relion
#SBATCH --partition=super
#SBATCH --nodes=2
#SBATCH --ntasks=8
#SBATCH --time=0-80:00:00
#SBATCH --output=MPI_relion.%j.out
module add relion/intel/mvapich2/1.4
mpirun relion_refine_mpi --o Class3D_2classes/run5 --i all_particles.star --particle_diameter 268 --angpix 1.687 --ref run1_C1_sort_it025_class001.mrc --firstiter_cc --ini_high 30 --iter 25 --tau2_fudge 4 --flatten_solvent --zero_mask --ctf --ctf_corrected_ref --ctf_phase_flipped --sym C1 --K 2 --oversampling 1 --healpix_order 3 --offset_range 5 --offset_step 2 --norm --scale --j 4 --dont_combine_weights_via_disc --dont_centralize_image_reading --limit_tilt 30 --helix --htune --htake 0.25 --hsuper 4 --hinner 0 --houter 80 --hrise 5.10 --hturn -101.25
• Q: How big is your data? Choose a partition and number of nodes to fit your data:
  total tasks / number of nodes <= 32
  memory needed per task * tasks per node <= 128 GB / 256 GB / 384 GB
• Q: What is the maximum speed-up you could achieve?
submit with SLURM job scheduler
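The two sizing rules above can be checked with a small sketch (`job_fits` is a hypothetical helper for illustration, not a BioHPC tool; the 20 GB per-task figure is an assumed example):

```python
def job_fits(total_tasks, nodes, mem_per_task_gb, node_mem_gb):
    """Check the two sizing rules: at most 32 tasks per node, and the
    per-node memory demand within the partition's limit."""
    tasks_per_node = total_tasks / nodes
    return tasks_per_node <= 32 and tasks_per_node * mem_per_task_gb <= node_mem_gb

# The RELION script above runs 8 tasks on 2 nodes. Assuming 20 GB per task:
# 4 tasks/node * 20 GB = 80 GB, which fits a 128 GB node.
print(job_fits(8, 2, 20, 128))   # True
print(job_fits(64, 1, 20, 128))  # False: 64 tasks on one node breaks both rules
```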
Part II : Image processing/Analyzing with HPC – EMAN2
submit directly from EMAN2 GUI interface
module add eman2/2.12-PyDusa
e2projectmanager.py
Parallelization options:
mpi:<nproc>:/path/to/scratch
thread:<nproc>
Only submits jobs to the current node.
Part II : Image processing/Analyzing with HPC – EMAN2
submit with SLURM job scheduler

#!/bin/bash
#SBATCH --job-name=e2refine2nodes
#SBATCH --partition=super
#SBATCH --nodes=2
#SBATCH --ntasks=64
#SBATCH --time=0-02:30:00
#SBATCH --output=e2refine2nodes.%j.out
#SBATCH --error=e2refine2nodes.%j.err
module add eman2/2.12-PyDusa
ulimit -l unlimited
e2refine2d.py --input=sets/all__ctf_flip_small.lst --ncls=80 --normproj --fastseed --iter=6 --nbasisfp=12 --naliref=12 --parallel=mpi:64:mpi_folder --center=xform.center --simalign=rotate_translate_flip --simaligncmp=ccc --simraligncmp=dot --simcmp=ccc --classkeep=0.85 --classiter=5 --classalign=rotate_translate_flip --classaligncmp=ccc --classraligncmp=ccc --classaverager=mean --classcmp=ccc --classnormproc=normalize.edgemean
Part II : Image processing/Analyzing with HPC – Performance Optimization
Benchmark speedups: 6.2x, 3.5x, 3.1x
• Not all portions of e2refine2d can be processed in parallel
• MPI has more overhead than multithreading
• Always use multithreading on one node
• Use multiple nodes only when necessary
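The diminishing returns above follow Amdahl's law; a quick sketch (the 0.9 parallel fraction is an illustrative assumption, not a measured value for e2refine2d):

```python
def amdahl_speedup(parallel_fraction, n_workers):
    """Upper bound on speedup when only part of the work parallelizes."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_workers)

# Even if 90% of the work parallelizes, 64 MPI workers cap out near 8.8x,
# so adding nodes yields diminishing returns.
print(round(amdahl_speedup(0.9, 64), 1))   # 8.8
```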
Part III: Scientific Visualization (3D volume rendering)
Converting 3D datasets (voxels) to 2D images (pixels) with 3D effects
• Direct volume rendering (DVR): volume ray casting, texture-based volume rendering, splatting
• Projection view: maximum intensity projection (MIP)
• Threshold-based rendering: isosurface
http://scripts.crida.net/gh/wp-content/uploads/2012/08/ARC2002_Amira_Investigation.pdf
Texture-based volume rendering (VOLTEX)
Maximum intensity projection (PROJECTIONVIEW)
Isosurface rendering (ISOSURFACE)
DVR with shading effect (VOLREN)
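As an illustration of the projection-view idea, a maximum intensity projection over a toy volume (plain Python, no microscopy libraries; real renderers project along an arbitrary view direction, not just z):

```python
def mip(volume):
    """Maximum intensity projection along z: each output pixel is the
    brightest voxel in its column (volume indexed as [z][y][x])."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    return [[max(volume[k][j][i] for k in range(nz)) for i in range(nx)]
            for j in range(ny)]

vol = [[[0, 1], [2, 3]],
       [[9, 0], [1, 8]]]   # two 2x2 z-slices
print(mip(vol))            # [[9, 1], [2, 8]]
```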
Part III: Scientific Visualization (3D volume rendering)
Example : volume ray casting:
(1) Ray Casting (2) Sampling (3) Shading (4) Compositing
https://en.wikipedia.org/wiki/Volume_ray_casting
Involves a large amount of computation
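The compositing step (4) is usually done front to back; a minimal sketch of the accumulation along one ray (scalar color instead of RGB for brevity):

```python
def composite_front_to_back(samples):
    """Accumulate (color, alpha) samples along a ray, front to back:
    C += (1 - A) * a * c ;  A += (1 - A) * a."""
    color, alpha = 0.0, 0.0
    for c, a in samples:
        color += (1.0 - alpha) * a * c
        alpha += (1.0 - alpha) * a
        if alpha >= 0.999:          # early ray termination saves work
            break
    return color, alpha

# A fully opaque front sample hides everything behind it:
print(composite_front_to_back([(1.0, 1.0), (0.5, 1.0)]))   # (1.0, 1.0)
```

The early-termination test is one reason ray casting maps well to GPUs: each ray is independent, so thousands run in parallel.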
Part III: Scientific Visualization (3D volume rendering)
How much faster is GPU rendering as compared to CPU rendering?
                      Nucleus044-049  Nucleus042-043  BioHPC Workstation
GPU card              Tesla K40       Tesla K20       Quadro K600
GPU memory            12 GB           5 GB            1 GB
GPU memory bandwidth  288 GB/s        208 GB/s        29.0 GB/s
CUDA cores            2880            2496            192
Benchmarking (with the same image quality)
BioHPC GPU resources
http://blog.boxxtech.com/2014/10/02/gpu-rendering-vs-cpu-rendering-a-method-to-compare-render-times-with-empirical-benchmarks/
Rendering can be significantly faster with a high-end GPU: in this benchmark, the GPU was 6.2 times faster than the CPU.
many slower cores vs. a few fast cores
Part III: Scientific Visualization (3D volume rendering)
BioHPC webGPU: launch a graphical environment on the Nucleus cluster
Web-based visualization access: https://portal.biohpc.swmed.edu/terminal/webgui/
• WebGUI: reserves a CPU node
• WebGPU: reserves a GPU node; good for ImageJ, Amira, stand-alone Paraview, or compiling your CUDA code
  step 1: add the visualization software as a module
  step 2: vglrun <software name>
• WebWinDCV: reserves a GPU node running Windows; good for Imaris
• PVGPU: reserves a GPU node; client-server mode for Paraview
Part III: Scientific Visualization (3D volume rendering)
Part III: Scientific Visualization (3D volume rendering) -- Chimera
Define your own color map
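A custom color map is just a transfer function from intensity to color; a minimal piecewise-linear sketch (the black-red-white stops are an arbitrary example, not a Chimera preset):

```python
def colormap(value, stops):
    """Piecewise-linear color map: `stops` is a sorted list of
    (position, (r, g, b)) control points; values outside are clamped."""
    if value <= stops[0][0]:
        return stops[0][1]
    for (p0, c0), (p1, c1) in zip(stops, stops[1:]):
        if value <= p1:
            t = (value - p0) / (p1 - p0)   # interpolate between the two stops
            return tuple(x + (y - x) * t for x, y in zip(c0, c1))
    return stops[-1][1]

# black -> red -> white:
stops = [(0.0, (0.0, 0.0, 0.0)), (0.5, (1.0, 0.0, 0.0)), (1.0, (1.0, 1.0, 1.0))]
print(colormap(0.25, stops))   # (0.5, 0.0, 0.0)
```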
Part III: Scientific Visualization (3D volume rendering) -- Chimera
Chimera Movie Making
Tutorial: https://www.cgl.ucsf.edu/chimera/animations/animations.html
https://www.youtube.com/watch?v=hQxKYSUdiD8&list=PLHib7JgKNUUeTZONxd0h0WBiZzAJmXmva
Part III: Scientific Visualization (3D volume rendering) -- Big Data Visualization
How do you interactively manage visually overwhelming amounts of data?
Option A: resampling the original dataset
Option B: parallel visualization, e.g., Paraview for interactive remote parallel visualization
- Allocate multiple nodes for visualization; each node/process handles one part of the image
Figure: original data vs. subsample rates of 2, 4, and 8 pixels
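Option A can be as simple as striding over the array; a toy sketch of subsampling one 2D slice (real tools apply this per z-slice, often with filtering to avoid aliasing):

```python
def subsample(image, rate):
    """Keep every `rate`-th pixel in each dimension (no filtering)."""
    return [row[::rate] for row in image[::rate]]

img = [[r * 8 + c for c in range(8)] for r in range(8)]   # 8x8 test image
small = subsample(img, 4)
print(len(small), len(small[0]))   # 2 2  -> data shrinks by rate^2 per slice
```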
Part III: Scientific Visualization (3D volume rendering) -- Big Data Visualization
Client-server mode of Paraview (remote visualization)
Client <-> Data Server <-> Render Server
• Minimize communication
• Render remotely
• Send images to the client
The ParaView Guide, p. 191
Part III: Scientific Visualization (3D volume rendering) -- Big Data Visualization
• Launch a PVGPU job from Web Visualization with a single GPU node: https://portal.biohpc.swmed.edu/terminal/webgui/
• You will have both the Paraview server (pvserver, running 15 MPI processes) and the client running on the same GPU node
Last but not least
• The primary goal of visualization is insight
• Visualization system technology improves with advances in GPUs and HPC solutions
• There are challenges at every stage of visualization when operating on large data
• We will keep working on delivering better visualization and analytic tools
• For questions and comments, please email the ticket system: biohpc-
Summary
Acknowledgement
Denise Ramirez, Kangkang Song, Meghan Driscoll, Reto Fiolka, Kevin Dean, Erik Welf, Gaudenz Danuser, Philippe Roudot, Liqiang Wang, David Trudgian