VIRT1983BU #VMworld #VIRT1983BU Making the Complicated Simple: Cycle Harvesting from the Virtual Desktop Infrastructure Estate for Financial Modeling and Simulation VMworld 2017 Content: Not for publication or distribution

Page 1

VIRT1983BU

#VMworld #VIRT1983BU

Making the Complicated Simple: Cycle Harvesting from the Virtual Desktop Infrastructure Estate for Financial Modeling and Simulation

Page 2

Disclaimer

• This presentation may contain product features that are currently under development.

• This overview of new technology represents no commitment from VMware to deliver these features in any generally available product.

• Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.

• Technical feasibility and market demand will affect final delivery.

• Pricing and packaging for any new technologies or features discussed or presented have not been determined.

CONFIDENTIAL 2

Page 3

Wayne Longcore, VP IT Client Services, Jackson National Life

Josh Simons, Chief Technologist for HPC, VMware

VIRT1983BU

#VMworld #VIRT1983BU

Making the Complicated Simple: Cycle Harvesting from the Virtual Desktop Infrastructure Estate for Financial Modeling and Simulation

Page 4

Agenda

1 Virtualized HPC

2 JNL Environment

3 Cycle Harvesting at JNL

4 Performance

5 Benefits

6 Futures

Page 5

Virtualized HPC

Science, Research, Engineering, & Financial Applications on vSphere

Page 6

Our Goal and Approach

• Increase agility and decrease time to discovery for researchers, scientists, and engineers

• Provide IT with the ability to efficiently provision, allocate, manage and ensure compliance of research compute infrastructure across an increasingly broad range of technical and business requirements

• By leveraging VMware’s proven, enterprise-class virtualization and cloud technologies to meet the performance requirements of research computing and HPC workloads, and

• Bringing novel capabilities to bear that are not available in traditional HPC environments

Page 7

HPC Workloads

• Scientific or technical workloads

• Often floating-point intensive

• Often storage intensive

• Often parallel

• Mechanical Design/Drafting

• Chemical Engineering

• Economics/Financial

• Weather

• Electronic Design Automation (EDA)

• Geosciences

• Defense

• Computer-Aided Engineering (CAE)

• Bioscience

• Government Lab

• University/Academic

[Diagram: HPC Cluster running MPI (Message Passing Interface) Jobs and Throughput Jobs]

Page 8

Virtual Machine Benefits for HPC

[Diagram: App, OS, and VM stacked on a hypervisor on hardware]

Virtual Machines offer:

• Heterogeneity

• Multi-tenant data security

• Fault isolation

• Reproducibility

• Fault resiliency

• Dynamic load balancing

• Performance

Page 9

Virtualized HPC Performance

Representative throughput and MPI examples (performance ratios – higher is better)

[Chart: Monte Carlo simulation; y-axis: Performance Ratio, 0 to 1; series: Run 1, Run 2, Run 3, Run 4]

[Chart: Science & engineering applications; y-axis: Performance Ratios, 0 to 1; applications: FLUENT, GROMACS, LAMMPS, LS-DYNA, NAMD, OpenFOAM]

Page 10

Introducing vSphere Scale-Out for Big Data and HPC Workloads

New package that provides all the core features required for scale-out workloads at an attractive price point

Features

• Hypervisor, vMotion, vShield Endpoint, Storage vMotion, Storage APIs, Big Data Extensions, Distributed Switch, I/O Controls & SR-IOV, Host Profiles / Auto Deploy and more

Packaging

• Sold in packs of 8 CPUs at a cost-effective price point

Licensing

• EULA enforced for use with Big Data/HPC workloads only

Page 11

Cycle Harvesting

• Long history of harvesting spare cycles in HPC and more broadly

• SETI@home

• Performance

– On average, about 980 TFLOPs

– 104,000 active users

– 156,000 active hosts

Page 12

Cycle Harvesting

• Berkeley Open Infrastructure for Network Computing (BOINC)

– 100 projects

– On average, about 110 PetaFLOPs

– 203,000 active users

– 1,109,000 active hosts

• HTCondor – University of Wisconsin

– Well-established HPC distributed resource manager and opportunistic scheduler

– Support for VMware virtual machines

• These approaches require adding software infrastructure and complexity to achieve cycle harvesting

• Is there another way for VMware customers?

Stats: http://boincstats.com

BOINC: https://boinc.berkeley.edu

Page 13

Jackson National Life Environment

Making the complicated awesomely simple

Page 14

Architectural Imperatives


• Measure everything based on user experience

• Stay current (Win10 & Office 365 Pro Plus, etc.)

• Focus on user productivity (uptime & responsiveness)

• Shared Nothing Architecture

– Reduce fault domains to fewer than 30 users

– Increase throughput via parallelization

• High core count for peak usage

– Subpar performance is unacceptable even 1% of the day

– Do not believe in expensive Overstuffed Rackmounts with limited cores and overcommitted buses

• Extremely limited memory overcommit, even in a failover situation

• Spread departments and functions across data centers

• “Re-leveling Scripts” based on sizing factors that match our chargeback factors

– Charge based on size, not based on use (like a laptop)

Page 15

User Specific Virtual Desktops

Low Cost – High Performance

• Use low cost components with no bottlenecks to memory and disk

• $255/year AVG Per VDI for hardware, including all storage & GPUs (a laptop per year?)

• Differentiated service levels – Treat each Virtual Desktop like a different laptop

2017 Model Configurations

Model                  Cores  RAM    GPU  OS          Disk        IOPS
Bronze Desktop         2      5 GB   -    Windows 10  100 GB      10,000-15,000
Silver Desktop         2      8 GB   -    Windows 10  100 GB      10,000-15,000
Gold Desktop           4      8 GB   -    Windows 10  128 GB      10,000-15,000
Platinum Desktop       6      10 GB  Yes  Windows 10  128-256 GB  10,000-15,000
Diamond Desktop        8      12 GB  Yes  Windows 10  100+ GB     40,000-70,000
Concierge Desktop I    8      16 GB  Yes  Windows 10  100+ GB     40,000-200,000
Concierge Desktop II   14     32 GB  Yes  Windows 10  100+ GB     40,000-200,000
Concierge Desktop III  14     64 GB  Yes  Windows 10  100+ GB     40,000-200,000
Concierge Desktop IV   28     96 GB  Yes  Windows 10  100+ GB     40,000-526,000

Page 16

Blade Design

Hostname: 6001a

Blade: 2 CPU, each from 8 to 14 cores

SSD 1: 1.6 TB – single VMFS partition

NFS VMDK – exposed only via NFS mount

SSD 2: 1.6 TB – single VMFS partition

vSphere Replication from SSD #1 on host 6001c

NFS VMDK – exposed only via NFS mount

vSphere Replication to SSD #2 on host 6001c

Page 17

VDI Design – vCPU Allocation

Typical Blade: 2 CPU, 8 – 14 cores each

[Diagram: VM vCPUs laid out across the cores of the two CPUs]

NFS VM: 2 vCPU

VDI VMs: Various sizes, up to 5:1 CPU oversubscription
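
As a rough illustration of the 5:1 ceiling, the sketch below computes an upper bound on VDI vCPUs per blade. It assumes the ratio means total VDI vCPUs divided by physical cores; the function name is hypothetical, and the slide does not say whether the 2-vCPU NFS VM counts toward the ratio.

```python
def max_vdi_vcpus(sockets: int, cores_per_socket: int, ratio: float = 5.0) -> int:
    """Upper bound on VDI vCPUs for one blade at a given oversubscription ratio.

    Assumption: ratio = total VDI vCPUs / physical cores (hyperthreads ignored).
    """
    physical_cores = sockets * cores_per_socket
    return int(physical_cores * ratio)


# Smallest and largest blades described on the slide (2 CPUs, 8 to 14 cores each).
small_blade = max_vdi_vcpus(sockets=2, cores_per_socket=8)
large_blade = max_vdi_vcpus(sockets=2, cores_per_socket=14)
print(small_blade, large_blade)
```

Under this reading, a 16-core blade tops out at 80 VDI vCPUs and a 28-core blade at 140.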

Page 18

Data Center Design: Active – Active

vSphere 6.0 Update 2, transitioning to vSphere 6.5

Data Center A: 16 Blades, 13 Enclosures + 16 Rack Mounts

Data Center C: Identical to Data Center A

Multiple 10GB Fiber Links

Page 19

Environmental Summary

444 Hosts being cycle harvested

10,468 Cores

71TB of RAM being used by cycle harvesters in 155TB physical RAM

280TB of SSD being used out of 2007TB physical local SSD

Total capacity of 71.6 Million IOs Per Second (4K RW 50%)

Total throughput of 4.022Tb/s to disk

6768 VDI desktops

888 Cycle harvesting desktop VMs
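
A few per-host ratios follow directly from these totals. The sketch below only divides the figures listed on this slide; the variable names are illustrative, and the averages hide the mix of 8-core to 14-core CPUs across the estate.

```python
# Figures copied from the Environmental Summary slide.
hosts = 444
cores = 10_468
harvester_vms = 888
vdi_desktops = 6_768
ram_used_tb, ram_physical_tb = 71, 155
ssd_used_tb, ssd_physical_tb = 280, 2007

harvesters_per_host = harvester_vms / hosts      # harvester VMs per host
avg_cores_per_host = cores / hosts               # average physical cores per host
desktops_per_host = vdi_desktops / hosts         # average VDI desktops per host
ram_fraction = ram_used_tb / ram_physical_tb     # share of RAM used by harvesters
ssd_fraction = ssd_used_tb / ssd_physical_tb     # share of local SSD in use

print(f"harvester VMs per host: {harvesters_per_host:.1f}")
print(f"average cores per host: {avg_cores_per_host:.1f}")
print(f"VDI desktops per host:  {desktops_per_host:.1f}")
print(f"RAM used by harvesters: {ram_fraction:.0%}")
print(f"SSD in use:             {ssd_fraction:.0%}")
```

The result of exactly two harvester VMs per host is consistent with one harvester VM per CPU socket on a two-socket blade.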

Page 20

Cycle Harvesting

Harvesting unused VDI compute cycles for Risk Analysis

Page 21

Cycle Harvesting Virtual Machine Overlay

Typical Blade: 2 CPU, 8 – 14 cores each

[Diagram: VM vCPUs laid out across the cores of the two CPUs]

NFS VM: 2 vCPU

VDI VMs: Various sizes, up to 5:1 CPU oversubscription

Cycle Harvesting VMs (two shown): each with vCPUs equal to # cores in a CPU

Page 22

Cycle Harvesting VM Configuration

Virtual CPUs: Equal to number of cores per socket

Virtual RAM: 7 GB RAM per core

Virtual Disk: 30 GB per core of scratch, 60 GB for boot

Drive C:/ – Boot Disk: Windows 10, MG Alfa, etc.; NFS mount (eventually will be VMDK on local SSD’s VMFS)

Drive D:/ – Scratch Disk: 10 – 30 GB; VMDK - sits directly on local SSD’s VMFS

MSFT DFS for job data
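
The per-core sizing rules can be sketched as a small helper. This is an illustration of the slide's figures only (vCPUs equal to cores per socket, 7 GB RAM per core, 30 GB scratch per core, 60 GB boot); the class and function names are hypothetical and are not part of any VMware API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HarvesterVM:
    vcpus: int
    ram_gb: int
    scratch_gb: int
    boot_gb: int


def size_harvester_vm(cores_per_socket: int,
                      ram_gb_per_core: int = 7,
                      scratch_gb_per_core: int = 30,
                      boot_gb: int = 60) -> HarvesterVM:
    """Size one cycle harvesting VM from the core count of a single CPU socket."""
    return HarvesterVM(
        vcpus=cores_per_socket,
        ram_gb=ram_gb_per_core * cores_per_socket,
        scratch_gb=scratch_gb_per_core * cores_per_socket,
        boot_gb=boot_gb,
    )


# Example: the harvester VM for one 14-core socket.
vm = size_harvester_vm(14)
print(vm)
```

For a 14-core socket this gives 14 vCPUs, 98 GB of RAM, and 420 GB of scratch. Applied to all 10,468 cores in the Environmental Summary, 7 GB per core comes to 73,276 GB, which is close to the 71 TB of harvester RAM reported there if a TB is read as 1024 GB.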

Page 23

Cycle Harvesting Virtual Disk Overlay

Hostname: 6001a

Blade: 2 CPU, each from 8 to 14 cores

SSD 1: 1.6 TB – single VMFS partition

NFS VMDK – exposed only via NFS mount

Cycle harvesting VM 1 VMDK – not replicated

Cycle harvesting VM 2 VMDK – not replicated

SSD 2: 1.6 TB – single VMFS partition

vSphere Replication from SSD #1 on host 6001c

NFS VMDK – exposed only via NFS mount

vSphere Replication to SSD #2 on host 6001c

Page 24

Cycle Harvesting Job Submission

[Diagram: Windows HPC Server submitting jobs to HPC Job Slots on the cores of the Cycle Harvesting VMs]

NFS VM: 2 vCPU

VDI VMs: Shares set to 1000 per vCPU

Cycle Harvesting VMs: Shares set to 100 per vCPU
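
The share settings can be read through a proportional-share model: when CPU is contended, vSphere divides it among competing VMs in proportion to their shares. The sketch below is a simplified illustration only. It assumes every vCPU is busy at once and ignores reservations, limits, and per-VM demand; the function name and the example vCPU counts are hypothetical.

```python
def contended_cpu_split(vdi_vcpus: int, harvester_vcpus: int,
                        vdi_shares_per_vcpu: int = 1000,
                        harvester_shares_per_vcpu: int = 100):
    """Return (vdi_fraction, harvester_fraction) of CPU under full contention."""
    vdi_shares = vdi_vcpus * vdi_shares_per_vcpu
    harvester_shares = harvester_vcpus * harvester_shares_per_vcpu
    total = vdi_shares + harvester_shares
    return vdi_shares / total, harvester_shares / total


# Hypothetical 28-core host: 28 busy VDI vCPUs competing with 28 harvester vCPUs.
vdi_fraction, harvester_fraction = contended_cpu_split(28, 28)
print(f"VDI {vdi_fraction:.1%}, harvesters {harvester_fraction:.1%}")
```

With equal vCPU counts, the 10:1 share ratio leaves the harvesters about one eleventh of the contended CPU. Shares only matter under contention, so harvesters can still use whatever the desktops leave idle.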

Page 25

Performance Analysis

Representative performance details from a single host

Page 26

VDI VM Activity

[Chart: y-axis: VDI VM CPU %USED]

Page 27

VDI and Harvester CPU Utilization

Page 28

All VMs on a 28-core Host

Peak utilization = 100% * (28 cores) * (1.25 HT factor) = 3500%
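
The peak-utilization ceiling is simple arithmetic and can be reproduced as below. The 1.25 hyperthreading factor is the slide's figure; the function name is hypothetical.

```python
def peak_utilization_pct(cores: int, ht_factor: float = 1.25) -> float:
    """Ceiling for summed per-VM CPU utilization on a host, in percent."""
    return 100.0 * cores * ht_factor


peak = peak_utilization_pct(28)
print(peak)  # 3500.0
```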

Page 29

Harvesters and VDI VMs

Page 30

Harvesters and VDI VMs – Detail

Page 31

A Further Note on Licensing

• vSphere Desktop, the vSphere edition that underpins VMware Horizon, is restricted by EULA to run desktop VMs

• Jackson harvester VMs satisfy this requirement by

– Running a desktop OS (Windows 10)

– Running the Horizon View Agent

– Being tied to users within Jackson’s Named User limit

Page 32

Benefits

Page 33

Primary Benefits

• Cost savings due to avoided cloud costs

• No new skills required to administer solution

• Cycle harvesting is transparent to both VDI and HPC users

• Increased virtual infrastructure reliability

• Better VDI experience for desktop users

• Enables end-users (actuaries) to think beyond just batch scheduling

Page 34

Futures

Page 35

Future Directions

• Harvesting GPU compute cycles

– Testing OpenCL-based approaches

• 1M stream processors online by Q2 2018

• 60K Xeon core-equivalent capacity

Page 36

Extreme Performance Series – Las Vegas

• SER2724BU Performance Best Practices

• SER2723BU Benchmarking 101

• SER2343BU vSphere Compute & Memory Schedulers

• SER1504BU vCenter Performance Deep Dive

• SER2734BU Byte Addressable Non-Volatile Memory in vSphere

• SER2849BU Predictive DRS – Performance & Best Practices

• SER1494BU Encrypted vMotion Architecture, Performance, & Futures

• STO1515BU vSAN Performance Troubleshooting

• VIRT1445BU Fast Virtualized Hadoop and Spark on All-Flash Disks

• VIRT1397BU Optimize & Increase Performance Using VMware NSX

• VIRT2550BU Reducing Latency in Enterprise Applications with VMware NSX

• VIRT1052BU Monster VM Database Performance

• VIRT1983BU Cycle Harvesting from the VDI Estate for Financial Modeling

• VIRT1997BU Machine Learning and Deep Learning on VMware vSphere

• FUT2020BU Wringing Max Perf from vSphere for Extremely Demanding Workloads

• FUT2761BU Sharing High Performance Interconnects across Multiple VMs

Page 37

Extreme Performance Series – Hands-on Labs

Don’t miss these popular Extreme Performance labs:

• HOL-1804-01-SDC: vSphere 6.5 Performance Diagnostics & Benchmarking

– Each module dives deep into vSphere performance best practices, diagnostics, and optimizations using various interfaces and benchmarking tools.

• HOL-1804-02-CHG: vSphere Challenge Lab

– Each module places you in a different fictional scenario to fix common vSphere operational and performance problems.

Page 38

Performance Survey

The VMware Performance Engineering team is always looking for feedback about your experience with the performance of our products, our various tools, interfaces and where we can improve.

Scan this QR code to access a short survey and provide us direct feedback.

Alternatively: www.vmware.com/go/perf

Thank you!
