Deep Dive on Amazon EC2 Instances - January 2017 AWS Online Tech Talks
TRANSCRIPT
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Adam Boeglin, HPC Solutions Architect
Choosing the Right EC2 Instance and Applicable Use Cases
Understanding the factors that go into choosing an EC2 instance
Defining system performance and how it is characterized for different workloads
A look into our current generation instances and their features
How Amazon EC2 instances deliver performance while providing flexibility and agility
How to make the most of your EC2 instance experience through the lens of several instance types
What to Expect from the Session
[Diagram: host server running a hypervisor with Guest 1, Guest 2, …, Guest n virtual machines]
Amazon EC2 Instances
In the past
First launched in August 2006: the M1 instance
“One size fits all”
Amazon EC2 Instances History
[Timeline, 2006-2016: instance type launches in roughly chronological order - m1.small; m1.large, m1.xlarge; c1.medium, c1.xlarge; m2.xlarge, m2.2xlarge, m2.4xlarge; cc1.4xlarge; t1.micro; cg1.4xlarge; cc2.8xlarge; m1.medium; hi1.4xlarge; m3.xlarge, m3.2xlarge; hs1.8xlarge; cr1.8xlarge; c3.large through c3.8xlarge; g2.2xlarge; i2.xlarge through i2.8xlarge; m3.medium, m3.large; r3.large through r3.8xlarge; t2.micro, t2.small, t2.medium; c4.large through c4.8xlarge; d2.xlarge through d2.8xlarge; g2.8xlarge; t2.large; m4.large through m4.10xlarge; x1.32xlarge; t2.nano; m4.16xlarge; p2.xlarge, p2.8xlarge, p2.16xlarge; x1.16xlarge; r4.large through r4.16xlarge; t2.xlarge, t2.2xlarge]
Anatomy of an instance name - c4.large: instance family (“c”), instance generation (“4”), instance size (“large”)
Choices and Flexibility
Choice of processor, memory, storage options, accelerated graphics, and burstable performance
Hiring a Server
Servers are hired to do jobs. Performance is measured differently depending on the job.
Performance Factors
Resource           Performance factors                           Key indicators
CPU                Sockets, number of cores, clock frequency,    CPU utilization, run queue length
                   bursting capability
Memory             Memory capacity                               Free memory, anonymous paging, thread swapping
Network interface  Max bandwidth, packet rate                    Receive and transmit throughput vs. max bandwidth
Disks              I/O operations per second (IOPS), throughput  Wait queue length, device utilization, device errors
Acceleration       FPGA or GPU offloading from CPU               Parallelism and code design
Broad Set of Compute Instance Types
General purpose: M4
Compute optimized: C4, C3, C5
Memory optimized: X1, R4, R3
Storage and I/O optimized: I3, I2, D2
GPU or FPGA enabled: G2, P2, F1
Resource Utilization
For a given level of performance, how efficiently are resources being used?
Something at 100% utilization can’t accept any more work.
Low utilization can indicate that more resources are being purchased than needed.
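To make these points concrete, here is a minimal sketch (hypothetical helper name, plain Python) that summarizes a series of utilization samples into the signals discussed above: average load, peak, and remaining headroom before the resource saturates at 100%.

```python
def summarize_utilization(samples):
    """Summarize CPU utilization percentages (0-100) sampled over an interval.

    Returns (average, peak, headroom), where headroom is how much more load
    the resource could absorb at its worst moment before hitting 100%.
    """
    avg = sum(samples) / len(samples)
    peak = max(samples)
    headroom = 100 - peak
    return avg, peak, headroom

print(summarize_utilization([35, 40, 55, 60, 45]))  # (47.0, 60, 40)
print(summarize_utilization([100, 100]))            # (100.0, 100, 0) - saturated
```

Zero headroom at peak is the signal that the instance can't accept more work; a large steady headroom suggests a smaller instance would do.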
Example: Web Application - MediaWiki installed on Apache with 140 pages of content; load increased in intervals over time
[Charts: memory, disk, network, and CPU stats for the example web application under increasing load]
Give back instances as easily as you can acquire new ones. Find an ideal instance type and workload combination. EC2 instance pages provide “use case” guidance. With EBS, storage and instance size don’t need to be coupled.
Instance Selection = Performance Tuning
“Launching new instances and running tests in parallel is easy…[when choosing an instance] there is no substitute for measuring the performance of your full application.”
- EC2 Documentation
How not to choose an EC2 instance
Brute-force testing, ignoring metrics, favoring old-generation instances, guessing based on what you already have
Instance sizing
c4.8xlarge ≈ 2 × c4.4xlarge ≈ 4 × c4.2xlarge ≈ 8 × c4.xlarge
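The equivalence above can be sanity-checked against the vCPU and memory figures from the C4 table later in this talk. The sketch below (plain Python, specs hard-coded from that table) also shows why the relationship is approximate rather than exact: memory scales linearly, but c4.8xlarge carries 36 vCPUs where two c4.4xlarge give only 32.

```python
# Specs hard-coded from the C4 table in this talk: (vCPU, memory in GiB).
C4_SPECS = {
    "c4.xlarge":  (4, 7.5),
    "c4.2xlarge": (8, 15),
    "c4.4xlarge": (16, 30),
    "c4.8xlarge": (36, 60),
}

def aggregate(instance_type, count):
    """Total vCPUs and memory across `count` instances of one type."""
    vcpu, mem = C4_SPECS[instance_type]
    return count * vcpu, count * mem

print(aggregate("c4.8xlarge", 1))  # (36, 60)
print(aggregate("c4.4xlarge", 2))  # (32, 60) - same memory, 4 fewer vCPUs
print(aggregate("c4.xlarge", 8))   # (32, 60.0)
```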
Choosing the right size
Understand your unit of work: web request, database/table, batch process
What are that unit’s requirements? CPU threads, memory constraints, disk & network
What are its availability requirements?
Utilization & Auto Scaling Granularity
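One way to think about granularity: smaller instances let Auto Scaling track demand in finer steps. A toy calculation (assumed demand figure, simple ceiling division) comparing step sizes:

```python
def instances_needed(required_vcpus, vcpus_per_instance):
    """Whole instances needed to cover demand, plus over-provisioned vCPUs."""
    count = -(-required_vcpus // vcpus_per_instance)  # ceiling division
    waste = count * vcpus_per_instance - required_vcpus
    return count, waste

# Hypothetical demand of 52 vCPUs worth of work:
print(instances_needed(52, 40))  # m4.10xlarge steps: (2, 28) - 28 idle vCPUs
print(instances_needed(52, 4))   # m4.xlarge steps:   (13, 0) - exact fit
```

Coarse steps over-provision on average; fine steps track utilization more closely, at the cost of managing more instances.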
General Purpose Instances
Review: M4 Instances - General Purpose Instance Family
Balance of compute, memory, and network resources
2.3 GHz Intel Xeon® E5-2686 v4 (Broadwell) or 2.4 GHz Intel Xeon® E5-2676 v3 (Haswell) processors

Model         vCPU  Memory (GiB)  Storage   EBS Bandwidth (Mbps)
m4.large         2             8  EBS Only                   450
m4.xlarge        4            16  EBS Only                   750
m4.2xlarge       8            32  EBS Only                 1,000
m4.4xlarge      16            64  EBS Only                 2,000
m4.10xlarge     40           160  EBS Only                 4,000
m4.16xlarge     64           256  EBS Only                10,000

Databases, data processing, caching, SAP, SharePoint, and other enterprise applications
Review: T2 Instances
Lowest-cost EC2 instance at $0.0065 per hour
Burstable performance; fixed allocation enforced with CPU credits

Model        vCPU  Baseline  CPU Credits/Hour  Memory (GiB)  Storage
t2.nano         1        5%                 3           0.5  EBS Only
t2.micro        1       10%                 6             1  EBS Only
t2.small        1       20%                12             2  EBS Only
t2.medium       2     40%**                24             4  EBS Only
t2.large        2     60%**                36             8  EBS Only
t2.xlarge       4     90%**                54            16  EBS Only
t2.2xlarge      8    135%**                81            32  EBS Only

General purpose, web serving, developer environments, small databases
How Credits Work
A CPU credit provides the performance of a full CPU core for one minute
An instance earns CPU credits at a steady rate An instance consumes credits when active Credits expire (leak) after 24 hours
[Chart: credit balance over time, showing baseline rate and burst rate]
Tip: Monitor CPU credit balance
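The earn/spend mechanics above can be sketched as a per-minute simulation. This is a simplification: single core, no 24-hour credit expiry, and throttling modeled only as a floor at zero.

```python
def simulate_credits(baseline_frac, usage_frac, minutes, start_balance=0.0):
    """Approximate T2 credit balance after `minutes` of constant CPU usage.

    baseline_frac: earn rate (e.g. 0.10 for t2.micro's 10% baseline)
    usage_frac:    CPU usage as a fraction of one full core
    One credit = one full core-minute.
    """
    balance = start_balance
    for _ in range(minutes):
        balance += baseline_frac     # credits earned this minute
        balance -= usage_frac        # credits consumed this minute
        balance = max(balance, 0.0)  # below zero, the CPU is throttled instead
    return balance

# Idle t2.micro for an hour earns its tabled 6 credits/hour:
print(simulate_credits(0.10, 0.0, 60))                      # ~6.0
# Bursting at 100% for 30 minutes from a 30-credit balance:
print(simulate_credits(0.10, 1.0, 30, start_balance=30.0))  # ~3.0
```

Once the balance hits zero, sustained usage is pinned to the baseline rate, which is exactly why monitoring the CloudWatch credit-balance metric matters.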
Compute Optimized Instances
Review: C4 Instances - “Compute”
Custom Intel E5-2666 v3 at 2.9 GHz, Turbo to 3.5 GHz
P-state and C-state controls

Model       vCPU  Memory (GiB)  EBS (Mbps)
c4.large       2          3.75         500
c4.xlarge      4           7.5         750
c4.2xlarge     8            15       1,000
c4.4xlarge    16            30       2,000
c4.8xlarge    36            60       4,000

Batch & HPC workloads, game servers, ad serving, & high-traffic web servers
C5 Instance Preview
• Next-generation “Skylake” Intel® Xeon® processor family
• AVX-512 instruction set
• Up to 72 vCPUs in a single instance
• 144 GB of RAM
• Coming early 2017
Memory Optimized Instances
Review: R4 Instances - “RAM”
Intel Xeon E5-2686 v4 (Broadwell) processors @ 2.3 GHz, DDR4 memory

Model        vCPU  Memory (GiB)  Network Performance
r4.large        2         15.25  Up to 10 Gigabit
r4.xlarge       4          30.5  Up to 10 Gigabit
r4.2xlarge      8            61  Up to 10 Gigabit
r4.4xlarge     16           122  Up to 10 Gigabit
r4.8xlarge     32           244  10 Gigabit
r4.16xlarge    64           488  20 Gigabit

In-memory databases, big data processing, HPC workloads
Review: X1 Instances
Largest-memory instance with 2 TB of DRAM; quad-socket Intel E7 processors with 128 vCPUs

Model         vCPU  Memory (GiB)  Local Storage     Network
x1.16xlarge     64           976  1 x 1,920 GB SSD  10 Gbps
x1.32xlarge    128         1,952  2 x 1,920 GB SSD  20 Gbps

In-memory databases, big data processing, HPC workloads
Storage Optimized Instances
Review: D2 Instances - “Dense Storage”
Intel Xeon E5-2676 v3 (Haswell) @ 2.4 GHz
Up to 3.5 GB/s read and 3.1 GB/s write throughput with a 2 MiB block size

Model       vCPU  Memory (GiB)  Storage    Read Throughput (2 MiB blocks)
d2.xlarge      4          30.5   3 x 2 TB    438 MB/s
d2.2xlarge     8            61   6 x 2 TB    875 MB/s
d2.4xlarge    16           122  12 x 2 TB  1,750 MB/s
d2.8xlarge    36           244  24 x 2 TB  3,500 MB/s

MapReduce & Hadoop, distributed file systems, log and data processing
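A quick consistency check on the read-throughput column (hypothetical helper; assumes throughput scales linearly with disk count at roughly 146 MB/s per disk, as the d2.8xlarge row implies):

```python
def read_throughput_mbs(disks, per_disk=3500 / 24):
    """Estimated read throughput in MB/s, scaling linearly with disk count.

    per_disk is derived from the d2.8xlarge row: 3,500 MB/s over 24 disks.
    """
    return disks * per_disk

# d2.xlarge's 3 disks land at ~437.5 MB/s, matching the tabled 438 MB/s.
print(read_throughput_mbs(3))
```

Linear scaling with spindle count is why the larger D2 sizes pay off for sequential, throughput-bound workloads.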
Review: I2 Instances - “IOPS”
16 vCPU: 3.2 TB SSD; 32 vCPU: 6.4 TB SSD
365K random read IOPS on the 32 vCPU instance

Model       vCPU  Memory (GiB)  Storage         Read IOPS  Write IOPS
i2.xlarge      4          30.5  1 x 800 GB SSD     35,000      35,000
i2.2xlarge     8            61  2 x 800 GB SSD     75,000      75,000
i2.4xlarge    16           122  4 x 800 GB SSD    175,000     155,000
i2.8xlarge    32           244  8 x 800 GB SSD    365,000     315,000

NoSQL databases, clustered databases, online transaction processing (OLTP)
Coming Soon: I3 Instance
• Up to 64 vCPUs: Intel® Xeon® E5-2686 v4 “Broadwell” @ 2.3 GHz
• 488 GB of memory
• 15.2 TB of NVMe-based SSDs
• 3.3 million random IOPS @ 4 KB
• 16 GB/s sequential throughput
• Early 2017
EBS Performance
Instance size matters: match your volume size and type to your instance. Use EBS optimization if EBS performance is important.
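Note the units when matching volumes to instances: the instance tables in this talk quote EBS bandwidth in megabits per second (Mbps), while EBS volume throughput is usually quoted in megabytes per second (MB/s). A small conversion sketch (assuming decimal megabits and 8 bits per byte):

```python
def mbps_to_mb_per_s(mbps):
    """Convert dedicated EBS bandwidth in megabits/s to megabytes/s."""
    return mbps / 8

# m4.large offers 450 Mbps of EBS bandwidth:
print(mbps_to_mb_per_s(450))    # 56.25 MB/s
# m4.16xlarge offers 10,000 Mbps:
print(mbps_to_mb_per_s(10000))  # 1250.0 MB/s
```

If the aggregate throughput of the attached volumes exceeds this figure, the instance, not the volumes, becomes the bottleneck.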
Accelerated Instances
Review: G2 Instances - “GPU”
• Up to 4 NVIDIA GRID K520 GPUs in a single instance, each with 1,536 CUDA cores and 4 GB of video memory
• High-performance platform for graphics applications using DirectX or OpenGL

Instance Size  GPUs  vCPUs  Memory (GiB)  SSD Storage
g2.2xlarge        1      8            15   1 x 60 GB
g2.8xlarge        4     32            60  2 x 120 GB

Video creation services, 3D visualizations, streaming graphics, server-side graphics workloads
Review: P2 GPU Instances - “Parallel”
• Up to 16 K80 GPUs in a single instance
• Supports CUDA 7.5 and above, OpenCL 1.2, and the GPU Compute APIs
• Including peer-to-peer PCIe GPU interconnect

Model        GPUs  GPU Peer-to-Peer  vCPUs  Memory (GiB)  GPU Cores  GPU Memory  Network Bandwidth*
p2.xlarge       1  -                     4            61      2,496      12 GiB  High
p2.8xlarge      8  Y                    32           488     19,968      96 GiB  10 Gbps
p2.16xlarge    16  Y                    64           732     39,936     192 GiB  20 Gbps

*In a placement group

Deep learning, HPC simulations, and batch rendering
Review: F1 Instances - “FPGA”
• Up to 8 Xilinx Virtex UltraScale Plus VU9P FPGAs in a single instance, with four high-speed DDR4 channels per FPGA
• Largest size includes high-performance FPGA interconnects via PCIe Gen3 (FPGA Direct) and a bidirectional ring (FPGA Link)
• Designed for hardware-accelerated applications including financial computing, genomics, accelerated search, and image processing

Instance Size  FPGAs  FPGA Link  FPGA Direct  vCPUs  Memory (GiB)  NVMe Instance Storage  Network Bandwidth*
f1.2xlarge         1  -          -                8           122  1 x 480 GB              5 Gbps
f1.16xlarge        8  Y          Y               64           976  4 x 960 GB             30 Gbps

*In a placement group
Next steps
Visit the Amazon EC2 documentation. Launch an instance and try your app!
Thank you!