Cray Vision: Fusion of Supercomputing and Big (Fast) Data
Copyright 2015 Cray Inc.
Modeling The World
Data-Intensive Processing
Math Models
Simulation and modeling of the natural world via
mathematical equations.
Data Models
Analysis of large datasets for knowledge discovery, insight, and prediction.
Feeding scientific, sensor, and internet data into simulations
Analytic processing of simulation output
Compute Store Analyze
[Chart: HPC Evolution. Source: Henri Calandra, Total EP. Performance of the N=1 TOP 500 system (1 TF to 1 EF) plotted against year (1990 to 2020), annotated with seismic algorithms: Kirchhoff beam, Post SDM, PreSTM, Paraxial WE approximation, Full WE Approximation, Acoustic & Anisotropic, Elastic & Visco Elastic, RTM-FWI, L2RTM.]
Cray system + new algorithms:
“Instead of thousands of years, we
can now process a full FWI survey in
a matter of weeks or days, depending
on the amount of data and complexity
of the rocks in the subsurface.”
Steve Derenthal,
in The Lamp, 2012-2.
Processing: Algorithmic Complexity Increasing
Petroleum Geo-Services (PGS) Selects High End Cray XC Series Supercomputer for Seismic Processing
• One of the largest ever commercial supercomputers
• 5 PetaFlop XC40 Supercomputer Performance
• Seismic Processing and Imaging focus
– Subsurface maps and 3-D models
• PGS win based on Cray’s:
– Competitive advantage over other O&G service suppliers
• Performance - Increased processing capacity
• Throughput - Faster turn-around on seismic jobs
– Compute efficiency & reliability
– Supportive partnership
• Integrated Cray configuration includes:
– High performance XC40 configuration
– Integrated Sonexion 2000 storage system
PGS researchers’ codes scaled and performed beyond the competition
“Abel”
#12 - June 2015 “Top 500” (#1 Commercial System)
The Evolution of RTM: One size does not fit all
As a Seismic Migration Application
• More Physics and Features
• (RTM(VTI,TTI), L2RTM, eRTM)
• Implementation Issues/Choices
• Possible strong migration artifacts
• High computational cost (W~N^4)
• Imaging condition
• Implementation Schemes (Explicit FD,
(pseudo)-spectral)
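The W~N^4 cost estimate above is usually explained as follows: a time-domain scheme on an N x N x N grid touches N^3 points per time step, and a stability limit ties the time step to the grid spacing, so the number of steps also grows with N. The sketch below only illustrates that arithmetic. The function names and the `steps_per_point` constant are hypothetical and do not come from the slides.

```python
def rtm_work(n, steps_per_point=1.0):
    """Relative work of one time-domain propagation on an n x n x n grid.

    Assumption (not from the slides): the stable time step shrinks with the
    grid spacing, so the number of time steps grows linearly with n.
    """
    grid_points = n ** 3                  # 3-D volume: n^3 cells
    time_steps = steps_per_point * n      # hypothetical proportionality
    return grid_points * time_steps       # overall growth ~ n^4


def cost_ratio(n_old, n_new):
    """How much more work the finer grid needs than the coarser one."""
    return rtm_work(n_new) / rtm_work(n_old)


if __name__ == "__main__":
    # Doubling the resolution along each axis multiplies the work by 2^4.
    print(cost_ratio(500, 1000))
```

Under these assumptions, doubling the resolution costs sixteen times the work, which is why the choice of implementation scheme matters so much.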
As Part of a Critical Workflow
• Preconditioned Data, Model Building, Post-image Processing
• Integrated with complementary migration schemes (e.g. Kirchhoff)
• Wide range of Tradeoffs
• disk/snapshots for source wavefield
construction,
• in-memory processing
• Partial imaging, de/ re-migration
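The disk/snapshot versus in-memory tradeoff listed above can be made concrete with a rough size estimate for the stored source wavefield. This is a sketch under stated assumptions: the grid size, step count, snapshot interval, and node memory are hypothetical, and samples are assumed to be 4-byte single-precision values.

```python
def snapshot_bytes(nx, ny, nz, nt, every=1, bytes_per_sample=4):
    """Bytes needed to keep source-wavefield snapshots for one shot.

    `every` is the snapshot interval in time steps (hypothetical parameter):
    saving every k-th step cuts storage by roughly a factor of k.
    """
    n_snapshots = (nt + every - 1) // every      # ceiling division
    return nx * ny * nz * bytes_per_sample * n_snapshots


def fits_in_memory(required_bytes, node_memory_bytes):
    """True if the snapshots can stay in memory instead of going to disk."""
    return required_bytes <= node_memory_bytes


if __name__ == "__main__":
    GIB = 2 ** 30
    need = snapshot_bytes(512, 512, 512, nt=4000, every=10)
    print(need / GIB, "GiB needed")
    print("in-memory on a 128 GiB node:", fits_in_memory(need, 128 * GIB))
```

With these illustrative numbers the snapshots need 200 GiB, so a 128 GiB node would have to spill to disk, save fewer snapshots, or recompute part of the wavefield.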
Execution Efficiency Operational Efficiency
Geo R&D
Dev Systems Facilities
Algorithms Performance Technology
Productivity/PEI/T Processes & Standards
WLM/Utilization Storage/FS/IO
Power/Cooling (Remote) Access
• IT technologies are rapidly changing
– How to future proof RTM?
• RTM is only one part of a complex processing sequence
– How to reduce the implementation complexity?
– How to build a robust RTM implementation?
The Need for Deeper Storage Hierarchies
Disk capacity growing faster than bandwidth (and much faster than IOPS)
Buy enough bandwidth, get too much capacity
Buy just enough capacity, get too little bandwidth
And the gap between CPU and disk keeps getting wider
Disk is the new tape. Flash is the new disk.
New technologies coming to bridge memory-Flash gap: PCM, ReRAM, STT-MRAM, 3D XPoint
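The "buy enough bandwidth, get too much capacity" mismatch above can be illustrated by sizing a disk tier twice, once by capacity and once by bandwidth. This is a minimal sketch. The per-drive figures (8 TB, 200 MB/s) and the targets are hypothetical round numbers, not values from the slides.

```python
def drives_needed(capacity_tb, bandwidth_mbs, drive_tb, drive_mbs):
    """Return (drives to meet capacity, drives to meet bandwidth).

    All arguments are integers; ceiling division avoids float rounding.
    """
    by_capacity = -(-capacity_tb // drive_tb)
    by_bandwidth = -(-bandwidth_mbs // drive_mbs)
    return by_capacity, by_bandwidth


if __name__ == "__main__":
    # Hypothetical target: 1000 TB usable and 500 GB/s aggregate bandwidth.
    by_cap, by_bw = drives_needed(1000, 500_000, drive_tb=8, drive_mbs=200)
    print("drives for capacity :", by_cap)
    print("drives for bandwidth:", by_bw)
    print("capacity bought when sizing for bandwidth (TB):", by_bw * 8)
```

With these assumed figures, sizing for bandwidth takes twenty times as many drives as sizing for capacity. A faster tier between memory and disk is meant to close that gap.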
Today: CPU (on-chip caches) -> Memory (DRAM) -> Storage (HDD)
Near Future: CPU (on-chip caches) -> Near Memory (HBM/HMC) -> Far Memory (DRAM) -> Near Storage (NVDIMM) -> Mid Storage (SSD) -> Far Storage (HDD)
(In the original figure each hierarchy is split into On Node and Off Node tiers.)
After DataWarp I/O Accelerator Before I/O Accelerator
Example: Cray DataWarp™ Flash Storage IO Acceleration System for Cray XC40
• >11 GB/s per blade to 12.8 TB of storage
• 5x the performance of disk at same cost
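As a rough illustration of what the per-blade figures above imply, the sketch below estimates how long it takes to stream a full blade's capacity at the quoted bandwidth. It assumes decimal units (1 TB = 1000 GB) and treats 11 GB/s as sustained. The slide quotes ">11 GB/s", so the result is an upper bound on the time. The function name is hypothetical.

```python
def stream_time_seconds(capacity_tb, bandwidth_gbs):
    """Seconds to read or write `capacity_tb` at `bandwidth_gbs`.

    Assumes decimal units (1 TB = 1000 GB) and a sustained transfer rate.
    """
    return capacity_tb * 1000.0 / bandwidth_gbs


if __name__ == "__main__":
    seconds = stream_time_seconds(12.8, 11.0)
    print(f"{seconds:.0f} s (~{seconds / 60:.1f} min) to stream one blade")
```

Under these assumptions a blade's full 12.8 TB can be filled or drained in under twenty minutes.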
Would like to code for future machines in a portable way
• Spatial and Temporal Portability
• Separation of labor
• Programmer exposes parallelism and locality
• Compiler, tools, and runtime map onto specific hardware
• Optimized libraries for various platforms (e.g. GPUs) and apps (e.g. O&G)
Future Processors
• Move to more threading on the node
– All-MPI won’t deliver maximum performance
• Vectorize low-level loops
– 8-30x performance improvement on array operations
• Avoid scalar code
– On “accelerated” nodes, creates traffic between accelerator and host, or runs 3-4x slower than on a serial-optimized core
– Inherently slower and less power-efficient
• Pay a lot more attention to locality within node
– Think about data placement and movement
– Consider “sub-optimal” algorithms that limit data motion
Identify Parallelism -> Express Parallelism -> Express Data Locality -> Optimize (Repeat until constraints reached)
Libraries: “Drop-in” Acceleration
Languages: Maximum Flexibility
Directives: Easily Accelerate Applications
Accelerate Application Performance and Leverage New Technologies Faster
Critical Workflows
Low attack surface where multiple vectors can attach
• Many user input fields
• Mixed communication protocols
• Multiple interfaces
• Modular SW/Application functionalities that interface with each other
Balanced Systems in many dimensions for diverse workloads
Extensive Monitoring for optimization and learning
As Part of a Critical Workflow
• Preconditioned Data, Model Building, Post-image Processing
• Integrated with complementary migration schemes (e.g. Kirchhoff)
• Wide range of Tradeoffs
• disk/snapshots for source wavefield construction,
• in-memory processing
• Partial imaging, de/ re-migration
The Evolution of RTM: Part of a critical workflow
• Rapid expansion in apps, libraries and tools
– Especially in new, data-intensive communities
• Proliferation of tools
– Difficult to install, with long list of dependencies
– Difficult to port
• Specific requirements for certain libraries, compilers, and scripting tools
Building a robust, well-tested stack with the exact combination of dependencies can be tedious and challenging
• Easy to Build – Snap running system to start
• Easy to Maintain – Handles conflicts, dependencies, versions, etc.
• High Performance – No overhead, unlike hypervisors, etc.
See us in Booth 1952, to catch up on Containers in HPC
Software
• Integrated Development Environment
• Multi-ISV Ecosystem
• Open Linux OS and Optimized Libraries
• Full range of deployment options
Platforms
• Designed for HPC
• High throughput, low latency, intelligent interconnect
• Many-core optimized architectures
• Parallelism in every design aspect
Optimization
• Power now a major design constraint
• Systems integrated & tested before shipping
• Power Utilization
• Cooling
• Flexible Workload Mgmt.
• Efficient Storage Mgmt.
Expertise
• HPC at large scale
• Systems and application performance analysis and tuning
• Architecting, planning and design
• Worldwide, World-class Support
Cray Addresses O&G HPC Stakeholder Needs
Legal Disclaimer
Information in this document is provided in connection with Cray Inc. products. No license, express or implied, to any intellectual property rights is granted by this document.
Cray Inc. may make changes to specifications and product descriptions at any time, without notice.
All products, dates and figures specified are preliminary based on current expectations, and are subject to change without notice.
Cray hardware and software products may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized errata are available on request.
Cray uses codenames internally to identify products that are in development and not yet publicly announced for release. Customers and other third parties are not authorized by Cray Inc. to use codenames in advertising, promotion or marketing and any use of Cray Inc. internal codenames is at the sole risk of the user.
Performance tests and ratings are measured using specific systems and/or components and reflect the approximate performance of Cray Inc. products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance.
The following are trademarks of Cray Inc. and are registered in the United States and other countries: CRAY and design, SONEXION, URIKA and YARCDATA. The following are trademarks of Cray Inc.: ACE, APPRENTICE2, CHAPEL, CLUSTER CONNECT, CRAYPAT, CRAYPORT, ECOPHLEX, LIBSCI, NODEKARE, THREADSTORM. The following system family marks are trademarks of Cray Inc.: CS, CX, XC, XE, XK, XMT and XT. The registered trademark LINUX is used pursuant to a sublicense from LMI, the exclusive licensee of Linus Torvalds, owner of the mark on a worldwide basis.
Other names and brands may be claimed as the property of others. Other product and service names mentioned herein are the trademarks of their respective owners.