April 4-7, 2016 | Silicon Valley Peter Messmer, 4/4/2016 SCIENTIFIC VISUALIZATION IN HPC


  • April 4-7, 2016 | Silicon Valley

    Peter Messmer, 4/4/2016

    SCIENTIFIC VISUALIZATION IN HPC

  • 2

    "Yes," said Deep Thought, "I can do it."

    [Seven and a half million years later.... ]

    “The Answer to the Great Question... Of Life, the Universe and Everything... Is... Forty-two,” said Deep Thought, with infinite majesty and calm.

    - Douglas Adams, The Hitchhiker’s Guide to the Galaxy

    HIGH PERFORMANCE COMPUTING TODAY*

    *mostly

  • 3

    [Figure: accuracy vs. latency chart. Latency scale: Month, Week, Day, Hour, 5 min, 100 ms, 30 ms, 10 ms. Accuracy scale: Sit in it, Has Engine, Moves, Flies. Shown: HPC application.]

  • 4

    [Figure: accuracy vs. latency chart (same axes as slide 3). Shown: HPC application; Action Game.]

  • 5

    [Figure: accuracy vs. latency chart (same axes as slide 3). Shown: HPC application; Action Game; Flight Simulator; CG Movie.]

  • 6

    [Figure: accuracy vs. latency chart (same axes as slide 3). Shown: HPC application; Action Game; Flight Simulator; CG Movie; Parameter Space Exploration, Approximate models.]

  • 7

    [Figure: accuracy vs. latency chart (same axes as slide 3). Shown: HPC application; Action Game; Flight Simulator; CG Movie; Parameter Space Exploration, Approximate models; Explorative Science, Real-time systems (marked "Opportunity!").]

  • 8

    BONSAI WITH IN-SITU VIZ ON PIZ DAINT

    Presented at SC14, streaming from CSCS/Switzerland to New Orleans

    Compute & Vis on 1024 GPU nodes Live Streaming

    J. Bedorf, E. Gaburov, P. Messmer, S. Portegies Zwart

  • 11

    VISUALIZATION ≠ RENDERING*

    * but it’s a part of it

    Coordinate transformations

    Feature extraction

    Thresholding

    Isosurfaces, Isovolumes

    Streamlines

    Field Operators (Gradient, Curl, ..)

    Clip, Slice

    Binning, Resample

    Surface Rendering

    Volume Rendering

    Line Rendering

    Compositing

  • 12

    VISUALIZATION PIPELINE

    - Analysis: Data processing to extract meaningful quantities of interest

    - Filtering: Conversion of simulation data into data ready for rendering

    - Rendering: Conversion of shapes to pixels (Fragment processing)

    - Compositing: Combination of independently generated pixels into final frame

    Your typical scientific visualization system

    [Diagram: Simulation feeds Visualization, which consists of Analysis & Filtering, Rendering, Compositing, Delivery]
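
    The four stages listed on this slide can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the API of any visualization tool: the function names (analyze, filter_points, render, composite), the 1D "image" and the sample data are all hypothetical.

```python
# Toy sketch of analysis -> filtering -> rendering -> compositing.
# All names and data are hypothetical; no visualization library is used.

def analyze(samples):
    """Analysis: derive a quantity of interest (here: speed from velocity)."""
    return [(x, depth, (vx * vx + vy * vy) ** 0.5) for x, depth, vx, vy in samples]

def filter_points(points, threshold):
    """Filtering: keep only data worth drawing (here: speed above a threshold)."""
    return [p for p in points if p[2] > threshold]

def render(points, width):
    """Rendering: turn shapes into pixels. Each pixel stores (depth, value)."""
    image = [None] * width
    for x, depth, value in points:
        if 0 <= x < width and (image[x] is None or depth < image[x][0]):
            image[x] = (depth, value)
    return image

def composite(images):
    """Compositing: per pixel, keep the fragment closest to the camera."""
    final = []
    for fragments in zip(*images):
        visible = [f for f in fragments if f is not None]
        final.append(min(visible)[1] if visible else 0.0)
    return final

# Two "nodes", each holding part of the data: (pixel x, depth, vx, vy).
node_a = [(0, 2.0, 3.0, 4.0), (1, 1.0, 0.1, 0.0)]
node_b = [(0, 1.0, 6.0, 8.0), (2, 5.0, 0.0, 2.0)]

partial = [render(filter_points(analyze(n), 1.0), width=4) for n in (node_a, node_b)]
frame = composite(partial)
print(frame)
```

    Each node runs the first three stages on its own data; only the small per-node images meet in the compositing step.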

  • 13

    VISUALIZATION TOOLKIT (VTK)

    Focus on visualization, not (only) rendering

    Provides more complex operations on data (“filtering”)

    Visualization pipeline

    At the core of many high-level viz tools

    ParaView, VisIt, ..

    Developed by Kitware, open source

    http://www.vtk.org

    Venerable backbone of scientific visualization

    Tue - 15:00 : S6193 - Visualization Toolkit: Improving Rendering and Compute on GPU's

  • 14

    VTK-M

    Ongoing development (Sandia, Kitware, ORNL, ..)

    VTK type filters and more fine-grained “worklets”

    Platform portable (GPU, multicore CPU)

    Thrust, TBB backend

    http://m.vtk.org

    Visualization algorithms on modern architectures

    Wed - 15:00 : S6352 - Adapting the Visualization Toolkit for Many-Core Processors with the VTK-m Library

  • 15

    TYPICAL HPC ENVIRONMENT

    Workstation on scientist’s desk

    Remote HPC center

    Compute nodes not directly accessible

    Output from compute nodes:

    - File transfer

    - X forwarding

    - Remote rendering

    [Diagram: Workstation (GPU), Login Node, GPU Compute Nodes]

  • 16

    X-FORWARDING

    ssh -Y loginnode.edu

    ssh -Y computenode

    No extra process on compute node

    Rendering by workstation GPU

    X server on workstation needed

    Often prohibitively slow

    Be prepared for latencies

    [Diagram: Workstation (GPU), Login Node, GPU Compute Nodes]

  • 17

    REMOTE RENDERING

    Know your latencies

    [Diagram: Workstation (GPU), Login Node, GPU Compute Nodes]

    Use compute node’s GPU for rendering

    Capture renderings and ship pixel data to user

    Compression is key

    Requires running X server on compute node*

    * Requirements will change with EGL
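
    To see why compression is key, it helps to compute the raw pixel rate of an uncompressed stream. The sketch below is only arithmetic; the resolution, frame rate and 8-bit RGB pixel format are assumed example values, not figures from the talk.

```python
# Raw (uncompressed) bandwidth needed to ship rendered frames to the user.
# Resolution, frame rate and pixel format are assumed example values.

def raw_bandwidth_mbit_per_s(width, height, bytes_per_pixel, fps):
    """Return the uncompressed stream rate in megabits per second."""
    bytes_per_second = width * height * bytes_per_pixel * fps
    return bytes_per_second * 8 / 1e6

hd = raw_bandwidth_mbit_per_s(1920, 1080, 3, 30)
print(f"1080p, RGB, 30 fps: {hd:.0f} Mbit/s uncompressed")
```

    At roughly 1.5 Gbit/s for a single 1080p stream, shipping raw pixels over a wide-area link is rarely practical, which is why the remote rendering tools discussed in this talk encode the frames first.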

  • 18

    Remote visualization on Blue Waters

    Stellar combustion visualized on Blue Waters (26 TB dataset)

    Paul Woodward, U. Minnesota: HVR w/ OpenGL on Blue Waters

    Improvement in time to solution

    [Chart: time to solution (legend: data transfer, rendering). 6 GPUs in local viz cluster: 48 days. 128 GPUs in HPC center: 1 day.]

    • Limited resources in the local viz cluster

    • Long data transfer times

    48x speedup by using the Tesla GPUs in the HPC center

    Data courtesy of John Stone, UIUC

  • 19

    REMOTE RENDERING: VIDEO ENCODING

    Open source approaches: TurboVNC + VirtualGL

    Currently not leveraging HW H264 encoder

    Commercial tools (e.g. Nice DCV)

    Leverages HW encoder

    No application modification needed

    https://www.nice-software.com/products/dcv

    NvENC library to access HW H264 encoder (lossless on Maxwell)

    https://developer.nvidia.com/nvidia-video-codec-sdk

    Interactivity over large distances

    Tue - 13:00: S6253 - VMD: Petascale Molecular Visualization and Analysis with Remote Video Streaming

  • 20

    OPENGL: GPU ACCELERATED RENDERING

    • Primitives: points, lines, polygons

    • Properties: colors, lighting, textures, ..

    • View: camera position and perspective

    • Shaders: Rendering to screen/framebuffer

    • C-style functions, enums

    See e.g. “What Every CUDA Programmer Should Know About OpenGL”

    (http://www.nvidia.com/content/GTC/documents/1055_GTC09.pdf)

    Mon - 10:00 : S6817 - High-Performance, Low-Overhead Rendering with OpenGL and Vulkan

    Mon - 14:00 : H6139 - Hangout: Maximizing Performance of CUDA and OpenGL Applications

  • 21

    VIS TOOLS EMBRACE OPENGL ON EGL

    Prior to EGL: X server required for GPU accelerated OpenGL

    Full OpenGL on EGL announced at SC15

    With EGL: OpenGL without X

    Major enabler for GPU rendering in HPC, incl. Cray systems*

    Quick adoption by vis tool developers

    https://devblogs.nvidia.com/parallelforall/egl-eye-opengl-visualization-without-x-server/

    * Driver version 358.7 or newer required

    Streamlined GPU accelerated off-screen rendering


  • 22

    USE CASE: CEI ENSIGHT 10.1.6D

    “Customers post-process large collections of simulations offline.

    Even though rendering takes place offscreen, EnSight requires that an X server is running, and is open enough to allow users to access it. At some sites, it is unacceptable to configure an X server to have wide open access.

    By using EGL to make our GL context and pbuffer, we can remove the need to start an X server, and solve all of these system management problems.”

    Dave Bremer, CEI

    EGL for batch renderer


    Image by Astec Inc

  • 23

    PARAVIEW: DESIGNED FOR HPC

    Supports all major scientific data formats

    Extensive collection of “operators”: isosurface, streamlines, volume renderer, reductions, custom operators, ..

    Approachable user interface

    Supports remote, distributed memory vis

    Open source, free

    Extensible via plugins

    Satisfying scientific computing needs


  • 24

    MODERN OPENGL FOR HPC VIZ

    VTK now supports OpenGL 3.2

    Enables advanced shaders (AO, VXGI, ..)

    Some algorithms well suited for distributed memory rendering

    GPU hardware support

    Mandatory to access advanced rendering features

    Data courtesy Florida Intl University & TACC

  • 25

    OPENGL RENDERING POWERHOUSE

    OpenGL vs OpenSWR

    [Chart: OpenGL vs OpenSWR rendering performance; bigger is better]

  • 26

    OPENGL NOT LIMITED TO RENDERING TASKS

    CUDA->OpenGL typically one-way only

    EGL enables lighter weight access to OpenGL

    No X server needed

    Potential use of OpenGL for rasterization-like problems?

    Determine covered “pixels”

    3D ordering/occlusion via Z-buffer

    Interop goes both ways, esp with EGL
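
    As an illustration of what a "rasterization-like problem" looks like, the sketch below does on the CPU what the rasterizer and Z-buffer do in hardware: it determines which grid cells ("pixels") each axis-aligned rectangle covers and keeps, per cell, the nearest one. It is plain Python with hypothetical names and data and does not call OpenGL.

```python
# CPU sketch of coverage + Z-buffer occlusion, the two rasterizer features
# named on this slide. Hypothetical helper; not OpenGL code.

def rasterize(rects, width, height):
    """rects: list of (id, x0, y0, x1, y1, z) covering cells x0..x1-1, y0..y1-1.
    Returns a grid holding the id of the nearest rectangle per cell (or None)."""
    far = float("inf")
    zbuf = [[far] * width for _ in range(height)]
    ids = [[None] * width for _ in range(height)]
    for rid, x0, y0, x1, y1, z in rects:
        for y in range(max(y0, 0), min(y1, height)):
            for x in range(max(x0, 0), min(x1, width)):
                if z < zbuf[y][x]:          # depth test: smaller z is closer
                    zbuf[y][x] = z
                    ids[y][x] = rid
    return ids

grid = rasterize([("A", 0, 0, 3, 2, 5.0), ("B", 2, 0, 4, 2, 1.0)], width=4, height=2)
for row in grid:
    print(row)
```

    Any problem that can be phrased as "which cells does each primitive cover, and which primitive wins per cell" has this shape and is a candidate for the graphics pipeline.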

  • 27

    SCALABLE RENDERING AND COMPOSITING

    Large-scale (volume) data visualization

    Interactive visualization of TB of data

    Stand-alone or coupling into simulation

    HW Accelerated remote rendering

    Plugin for ParaView

    http://www.nvidia-arc.com/products/nvidia-index.html

    NVIDIA INDEX

    Mon - 10:00: S6590 - HPC Visualization Using NVIDIA IndeX™

  • 28

    RC coming out soon.. Email for enquiries

    SCALABLE VOLUME RENDERING IN PARAVIEW

    IndeX plugin addresses shortcomings in ParaView built-in volume renderer

    Beta version supports

    - 3D structured, scalar grids

    - 32-bit float, 16-bit/8-bit uint

    - Overlay of opaque ParaView geometries (e.g. streamlines)

    Free plugin, requires commercial IndeX license

    Plugin enables GPU accelerated volume rendering

    Tue - 16:00 : S6670 - Toward Bridging the Gap Between High Quality and High Performance for HPC Visualization

  • 29

    Advanced Rendering in scientific visualization

    Two lights, no shadows

    Two lights, hard shadows, 1 shadow ray per light

    Ambient occlusion + two lights, 144 AO rays/hit

    • Ray tracing offers ambient occlusion lighting, shadows, high quality transparent surfaces

    Better insight with visual cues

    Courtesy of John Stone, UIUC
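
    A rough ray count gives a feel for the cost of the three lighting setups named in the captions. The shadow and AO ray counts are taken from the captions; the resolution, one primary ray per pixel, and the assumption that every primary ray hits a surface are illustrative simplifications.

```python
# Rays per frame for the three lighting setups in the captions.
# Assumes one primary ray per pixel and that every primary ray hits geometry.

def rays_per_frame(width, height, shadow_rays_per_hit, ao_rays_per_hit):
    """Primary rays plus the secondary rays spawned at each hit point."""
    primary = width * height
    return primary * (1 + shadow_rays_per_hit + ao_rays_per_hit)

setups = {
    "two lights, no shadows": (0, 0),
    "two lights, hard shadows": (2, 0),      # 1 shadow ray per light
    "AO + two lights": (2, 144),             # 144 AO rays per hit
}
counts = {name: rays_per_frame(1920, 1080, s, ao) for name, (s, ao) in setups.items()}
for name, n in counts.items():
    print(f"{name}: {n:,} rays")
```

    Under these assumptions the ambient occlusion setup traces 147 times as many rays as the unshadowed one.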

  • 30

    OPTIX RAY TRACING FRAMEWORK

    • GPU accelerated ray-tracing framework

    • Build your own RT application

    • Generic Ray-Geometry interaction

    • Rays with arbitrary payloads

    • Multi-GPU support

    Tue - 14:00 : H6148 - Hangout: CUDA for HPC Simulation and Visualization

    Tue - 14:00 : H6150 - Hangout: OptiX Ray Tracing Library: Best practices and Use-Case Consultation

    Wed - 10:30 : S6320 - Opticks: Optical Photon Simulation for High Energy Physics with NVIDIA OptiX™

  • 31

    TELLING A BETTER STORY, VISUALLY

    Advanced rendering can help visual message, e.g. guiding the eye via depth of field

    Particularly useful for complex visualizations

    Interactive ray-tracing via NVIDIA Iray

    Post-processing of ParaView files

    Advanced rendering improves messaging

    Mon - 15:00 : H6142A - Hangout: Iray® Rendering for Developers

  • 32

    PARALLEL COMPOSITING WITH ICE-T

    • Each node renders fraction of image

    • Sort-last compositing

    • Widely used (ParaView, VisIt, ..)

    • Critical element for real-time viz

    • Up to 30 fps for 4k frames on 1024 nodes

    • Cray XC30, Piz Daint @ CSCS

    http://icet.sandia.gov

    Tue - 14:30 : S6808 - Image Compositing on GPU-Accelerated Supercomputers

    Modern networks remove compositing bottleneck

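
    Sort-last compositing can be sketched as follows: every node renders its own data into a full-size image with per-pixel depth, and the partial images are then merged by keeping the nearest fragment per pixel. The pairwise tree reduction below is one simple way to organize the merges; it is an illustrative sketch with hypothetical names and data, not the IceT API and not necessarily the strategy IceT uses.

```python
# Sort-last depth compositing as a pairwise tree reduction.
# Illustrative only; this is not the IceT API.

def merge(img_a, img_b):
    """Per pixel, keep the (depth, color) fragment nearest the viewer."""
    return [a if a[0] <= b[0] else b for a, b in zip(img_a, img_b)]

def tree_composite(images):
    """Combine N partial images in ceil(log2(N)) rounds of pairwise merges."""
    rounds = 0
    while len(images) > 1:
        merged = [merge(images[i], images[i + 1])
                  for i in range(0, len(images) - 1, 2)]
        if len(images) % 2:            # odd one out advances unchanged
            merged.append(images[-1])
        images = merged
        rounds += 1
    return images[0], rounds

INF = float("inf")
background = (INF, "bg")
# Four "nodes", each rendering its own data into a 3-pixel image.
partials = [
    [(4.0, "n0"), background, background],
    [(2.0, "n1"), (7.0, "n1"), background],
    [background, (3.0, "n2"), background],
    [(9.0, "n3"), background, (1.0, "n3")],
]
final, rounds = tree_composite(partials)
print([color for _, color in final], "in", rounds, "rounds")
```

    On a real machine each merge is a network exchange between nodes, which is why interconnect performance decides whether compositing becomes the bottleneck.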

  • 33

    HIGH FRAMERATE = MINIMAL IMPACT ON SIMULATION

    Real-time visualization only one use case

    Batch processing will not go away

    Acceptable time budget for visualization/analysis

    Up to the I/O time, ~ 2 %

    More diagnostics in the same time

    E.g. ParaView Cinema

    FPS matter, even in HPC
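
    A quick calculation shows why frame rate matters even in batch mode. The ~2 % budget is the figure quoted on this slide; the 24-hour run length and the two frame rates are assumed example values.

```python
# How many images fit into a fixed visualization budget?
# The 2 % budget is from the slide; run length and frame rates are assumed.

def frames_in_budget(run_seconds, budget_percent, frames_per_second):
    """Number of frames renderable within the visualization time budget."""
    return run_seconds * budget_percent * frames_per_second // 100

run_seconds = 24 * 3600              # assumed 24 h simulation run
slow = frames_in_budget(run_seconds, 2, 1)
fast = frames_in_budget(run_seconds, 2, 30)
print("1 fps :", slow, "frames")
print("30 fps:", fast, "frames")
```

    With the budget fixed, a faster renderer translates directly into more views, time steps or diagnostics per run.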

  • 34

    VISUALIZATION-ENABLED SUPERCOMPUTERS

    CSCS Piz Daint: Galaxy formation

    http://blogs.nvidia.com/blog/2014/11/19/gpu-in-situ-milky-way/

    NCSA Blue Waters: Molecular dynamics

    http://devblogs.nvidia.com/parallelforall/hpc-visualization-nvidia-tesla-gpus/

    ORNL Titan: Cosmology

    http://www.sdav-scidac.org/29-highlights/visualization/66-accelerated-cosmology-data-anal.html


  • 35

    COMPUTE+VIS SUPPORTS MULTIPLE WORKFLOWS

    LEGACY WORKFLOW

    Separate compute & vis system

    Communication via file system

    CO-PROCESSING

    Compute and visualization on same GPU

    Communication via host-device transfers or memcpy

    PARTITIONED SYSTEM

    Different nodes for different roles

    Communication via high-speed network

  • 36

    IN-SITU VIS: ADVANTAGES AND OPPORTUNITIES

    - Minimize IO traffic

    - Exploit data locality

    - Less pressure on file system

    - Less time wasted in I/O

    - Reduce latency to first result

    - Monitoring, early termination

    - Enable real-time/interactive visualization

    - Novel workflows, new applications

    [Diagram: Workstation, File System, GPU-accelerated Supercomputer]

    Tue - 9:00 : S6633 - Navigating the In-Situ Visualization Landscape
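
    The idea behind in-situ coupling can be sketched as a solver loop that hands its in-memory state to a visualization hook every few steps instead of writing files for later post-processing. The solver, the hook, and all parameter values below are hypothetical stand-ins, not code from any of the tools named in this talk.

```python
# Sketch of an in-situ coupling: the solver passes its in-memory state to a
# visualization hook every few steps. All names and values are hypothetical.

def step(state):
    """Stand-in solver: one explicit diffusion step on a periodic 1D field."""
    n = len(state)
    return [state[i] + 0.25 * (state[(i - 1) % n] - 2 * state[i] + state[(i + 1) % n])
            for i in range(n)]

def in_situ_hook(step_index, state):
    """Stand-in for analysis + rendering; returns a diagnostic for monitoring."""
    peak = max(state)
    print(f"step {step_index:3d}  peak = {peak:.3f}")
    return peak

def run_coupled(state, max_steps, vis_every, stop_below):
    for i in range(1, max_steps + 1):
        state = step(state)
        if i % vis_every == 0:
            # Data stays in memory: no file system round trip.
            if in_situ_hook(i, state) < stop_below:
                return i, state      # monitoring enables early termination
    return max_steps, state

steps_done, field = run_coupled([0.0] * 7 + [1.0] + [0.0] * 8,
                                max_steps=200, vis_every=10, stop_below=0.1)
```

    The first diagnostic appears after ten steps rather than after the whole run, and the run stops itself once the monitored quantity drops below the threshold.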

  • 37

    COSMO WITH IN-SITU VISUALIZATION

    COSMO-1 model, operational weather model in Switzerland

    6 nodes Cray CS-Storm system

    8 K80 GPUs/node

    96 GPU sockets total

    ~ 20s per 0.7s

    NVIDIA IndeX for visualization

    Tue - 13:30: S6628 - Co-Designing GPU-Based Systems and Tools for Numerical Weather Predictions

    Live, Interactive Weather Simulation

  • 38

    IN-SITU VISUALIZATION ON TITAN

    “When running PyFR at scale, it generates very large data sets that need analyzing for acoustics. The traditional post hoc method is simply not fit for purpose – in situ visualization and processing are critical. We see a potential for 50x speed ups with in situ, which significantly accelerates our scientific discovery”

    - Dr. Peter Vincent, Imperial College

    First prototype of ParaView in-situ visualization capabilities in PyFR (CFD) simulations, predicting jet engine acoustics

    Both compute and visualization running on Titan GPUs and streaming to a remote location

    Thu - 10:30 : S6329 - Petascale Computational Fluid Dynamics with Python on GPUs

    Tue - 15:00 : S6193 - Visualization Toolkit: Improving Rendering and Compute on GPU's

  • 39

    VISUALIZATION IN HPC

    Leverage graphics capabilities on heterogeneous nodes

    GPUs offer features for visualization, rendering, remote viz

    Modern networks enable parallel compositing at massive scale

    Graphics capabilities may help for graphics-like algorithms

    Supported by popular visualization tools

    Fast rendering relevant even for batch processing

    In-situ visualization for monitoring, steering, and other novel workflows

  • April 4-7, 2016 | Silicon Valley

    THANK YOU

    JOIN THE NVIDIA DEVELOPER PROGRAM AT developer.nvidia.com/join