SCENT
HPC over Cloud
July 16th, 2014
2014 SCENT HPC Summer School @ GIST
SCENT (Super Computing CENTer)
GIST (Gwangju Institute of Science & Technology)
Dr. JongWon Kim
Interplay between Theory, Simulation, and Experiment
Software-Defined Infrastructure
• T-Shaped vs I-Shaped Development
• Software-defined X (Infrastructure, Interfaces…)
Open Innovation Platform
Agile & Economic Service Realization over Software-Defined Infrastructure
Open Data + Open API + Open Resource (Compute + Storage + Networking)
Open API & Platforms
(Big Data & Cloud)
Templates
Software-Defined Infrastructure: Unified, Programmable & Virtualized Resources
Compute
Storage
Networking X
Zero-touch Configuration
Flexible Control
(forwarding, …)
Instant Visibility
Collective Analysis
DevOps
Configuration/Control/Visibility Challenges &
Open APIs via Inter-connected Functions
Let’s Create Smart Things and Realize Smart Services
Architect Your Smart Things with API Tools
Converged (C/N/S) SmartX Box with Programmable & Virtualized Resources
Build Open APIs with Functions
Industrial Internet, Smart Planet, Internet of Everything
• Amazon AWS
• Microsoft Azure
• Google Cloud
• VMware Hybrid
• OpenStack (RH, IBM, HP, Dell, …)
• Network Service Provider Cloud (AT&T, NTT, …)
Cloud DC Traffic
Cisco Global Cloud Index
Cloud: Rapid Expansion & Competition
Futuristic Multilayer-integrated & Convergent Networks (Cloud WAN Fabric + Service-aware Edge)
[Figure labels: Cloud WAN Fabric (IP+Optical Integration); Cloud Data Centers (Cloud DC); Service-aware Edge (MiddleBox, …); Wireless + Mobile]
IP??, More Switching + Simpler Routing?
Last Modified 11/02/2013
HPC over Cloud: Target & Benefits
Economic (Easy & Improved Sharing), Green (Energy-efficient)
[Figure labels: Super Computing; Cloud Computing; HPC over Cloud]
HPC over Cloud Benefits
• Higher efficiency
• Workload bursting
• Pay-per-use
• Scalability
• Simplified self-service
• Sophisticated migration policies
• Interfacing with external components
• Accelerated collaboration
National HPC Infrastructure
Infrastructure (Resource), Platform (Tool), Software (Service)
[Figure labels: General Computing #1; General Computing #2; Technical Computing; Integrated Resource Pools; Meta-Operation; Unified HPC Boxes; Target Problems; Dedicated computing; Dedicated Resources; HPC over Cloud]
F3 Racing Team = SC Center
Vision for National HPC Services (Human Resources instead of Hardware Resources)
[Figure: two stacks, each with Software / Platform / Infrastructure (Hardware)]
Institute HPC Centers
National HPC Cloud Centers
(Shared Infra/Platform)
HPC over Cloud
HPC SW Services (Licenses)
CPU/GPU-intensive
Memory-intensive
Data-intensive
Virtualized HPC Clusters
Dedicated HPC Centers
Building Shared National HPC Resources (every 6 years)
HPC Cloud Centers #1, #2, #3, #4, #5, #6
National / Regional / Enterprise
Coordinated and Operated by National HPC Center
CPU/GPU-intensive
Memory-intensive
BW-intensive
Dynamic On-demand Provisioning of Virtualized HPC
Virtualized HPC with HPC Boxes
Pools of Unified HPC Boxes
Scale-in/out
HPC over Cloud: HPC, HSC, …
http://www.cloudscaling.com/blog/cloud-computing/grid-cloud-hpc-whats-the-diff/
• High Performance Computing: Single program
• High Scalability Computing: Throughput focus; Can be distributed; e.g. Physics Simulation
HPC over Cloud Examples: MegaRun
• Cycle Computing MegaRun: Schrödinger’s quantum chemistry software on 156,000 cores across 8 AWS regions (to sort through 205,000 compounds for solar panel). (Nov. 2013)
• $33,000 / 18 hours vs $68 million / 18 hours; $132,000 / 10 months (existing internal 300-core cluster)
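As a rough reading of the MegaRun figures above, the run's aggregate compute and unit cost can be estimated with simple arithmetic. This is only a back-of-the-envelope sketch: it assumes all 156,000 cores were busy for the full 18 hours, which the slide does not state, so the core-hour figure is an upper bound and the per-core-hour cost a lower bound.

```python
# Back-of-the-envelope figures for the MegaRun numbers quoted on the slide.
CORES = 156_000          # cores across 8 AWS regions (from the slide)
HOURS = 18               # wall-clock duration (from the slide)
CLOUD_COST_USD = 33_000  # cost of the cloud run (from the slide)

# Assumption (not from the slide): every core is busy for the whole run.
core_hours = CORES * HOURS
cost_per_core_hour = CLOUD_COST_USD / core_hours

print(f"core-hours (upper bound): {core_hours:,}")
print(f"cost per core-hour (lower bound): ${cost_per_core_hour:.4f}")
```

Under that assumption the run amounts to about 2.8 million core-hours at roughly one cent per core-hour.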
HPC over Cloud Examples: CERN / Rackspace OpenLab
• OpenLab: Hybrid Cloud Model based on OpenStack (open-source cloud OS)
COMPUTE, NETWORKING, STORAGE
Rackspace Public Cloud
CERN Private Cloud
Rackspace Private Cloud @ CERN
Demonstration of federated identity and aggregated services between a Rackspace Private Cloud at CERN and at least one other cloud.
DevOps Automation
Development + Operation
OpenStack (Puppet, Chef, …)
DevOps for Software-Defined Infrastructure
Software Lifecycle and CI (Continuous Integration) / CD (Continuous Deployment)
Software Lifecycle: Development, Testing (Staging) / QA, Production / Deployment
Master Software Coding (for Cloud OS Kernel + Service Frameworks and Tools) and Execute Continuous Integration for Agile and Economic Service Realization
A Computer System
System: a set of interacting or interdependent components forming an integrated whole
CoreLink Cache Coherent Network based on AMBA (Advanced Microcontroller Bus Architecture) 5 CHI (Coherent Hub Interface)
On a single die (i.e., chip)
Computers in Data Centers
Rack Scale Computing
Cloud Datacenter as a BIG Computer
• Luiz André Barroso, Urs Hölzle, “The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines”
• Google Omega & Apache Mesos
Containers vs VMs: Packaging workload tasks and Scaling …
Docker: Open-Source Lightweight Container
LXC (Linux Container)
Open-Source Container Management: Kubernetes
Docker & Open-Source Container Management Kubernetes
• Kubernetes (koo-ber-nay'-tace): supported by Microsoft, IBM, Red Hat, CoreOS, SaltStack and Mesosphere; builds on top of Docker to construct a clustered container scheduling service.
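To make the idea of a clustered container scheduling service more concrete, the toy sketch below places containers on cluster nodes according to free CPU and memory. It is not Kubernetes' actual algorithm or API; the `Node`, `Container`, and `schedule` names, the resource figures, and the "most free CPU" placement rule are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    cpu_free: float                 # free CPU cores on this node
    mem_free: int                   # free memory in MiB
    containers: list = field(default_factory=list)


@dataclass
class Container:
    name: str
    cpu: float                      # requested CPU cores
    mem: int                        # requested memory in MiB


def schedule(container, nodes):
    """Place the container on the fitting node with the most free CPU.

    Returns the chosen node, or None when no node has enough resources.
    """
    candidates = [n for n in nodes
                  if n.cpu_free >= container.cpu and n.mem_free >= container.mem]
    if not candidates:
        return None
    best = max(candidates, key=lambda n: n.cpu_free)
    best.cpu_free -= container.cpu
    best.mem_free -= container.mem
    best.containers.append(container.name)
    return best


# Hypothetical two-node cluster and three container requests.
nodes = [Node("node-a", 4.0, 8192), Node("node-b", 2.0, 4096)]
requests = [Container("web", 3.0, 1024),
            Container("db", 2.0, 4096),
            Container("batch", 1.5, 2048)]

placements = {}
for c in requests:
    node = schedule(c, nodes)
    placements[c.name] = node.name if node else None

print(placements)
```

A request that fits nowhere is left unplaced (`None`) in this sketch; a real scheduler would typically keep it pending until resources free up.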
Open-source Cluster Management Software
Apache Hadoop YARN
Apache Mesos
Cloud & PaaS (Platform as a Service)
• Model-driven PaaS: domain-specific languages
• PaaS (Products; Frameworks): aPaaS (App Server aaS) and language runtimes / DBaaS
• Mini-PaaS (Foundational PaaS, IaaS+): App. Containers + α
DZone Cloud Report 2014
Big Data & Software-Defined Infrastructure
Big Data: Data Types and Frameworks
• Structured (relational and legacy)
• Unstructured data
– Hadoop / MapReduce
– NoSQL
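The slide lists Hadoop/MapReduce as a framework for unstructured data. As a rough illustration of the map, shuffle, and reduce phases, here is a minimal in-memory word count in plain Python. It does not use Hadoop or any Hadoop API; the function names and sample records are hypothetical.

```python
from collections import defaultdict


def map_phase(record):
    """Emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Group emitted values by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Sum the counts collected for one word."""
    return key, sum(values)


# Hypothetical unstructured input records.
records = ["HPC over Cloud", "Cloud data centers", "HPC boxes"]

mapped = [pair for record in records for pair in map_phase(record)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())

print(counts)
```

In a real framework the map and reduce calls would run in parallel across many machines; this sketch only shows the data flow on one.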
OpenStack: Cloud OS for a Big Cloud Computer
OF@TEIN Testbed with SmartX Boxes: Cloud Data Center in a Box
[Figure: three-layer testbed architecture]
• Service layer (experiment layer): Virtual Playgrounds for Experiments A, B, …, Z, each made up of VMs connected over L2/L3 on the OF@TEIN Underlay Network
• Virtual resource layer: VMs (#1, #2, #3, …) with vCPU and memory on each SmartX Box, provided through OpenStack Nova (vCPU, vMemory), Cinder (vStorage), and Neutron (vSwitch)
• Physical resource layer: SmartX Boxes #1, #2, …, #K, each with Storage (SSD/HDD), Memory, CPU, and NIC, running a kernel OS with KVM (Hypervisor) and OpenStack
DevOps Templates for Virtual Playgrounds + OpenStack-based Functions APIs + SDN-Coordinated vNetworking
OF@TEIN SmartX Box: Converged Resources - Workload - Diversified Functions
Zero-touch Configuration
Flexible Control
Instant Visibility
Auto Scaling
Continuous Integration