ece 571 { advanced microprocessor-based design lecture...

46
ECE 571 – Advanced Microprocessor-Based Design Lecture 23 Vince Weaver http://web.eece.maine.edu/ vweaver [email protected] 26 November 2019

Upload: others

Post on 25-Jun-2020

2 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

ECE 571 – AdvancedMicroprocessor-Based Design

Lecture 23

Vince Weaver

http://web.eece.maine.edu/ vweaver

[email protected]

26 November 2019

Page 2: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

Announcements

• Project Status – remember to include related work

• Related: never pay for an academic paper

◦ No one involved with the paper ever sees that money

◦ Also the UMaine library pays a lot of money to

subscribe to the papers, so access them through the

library website

• Homeworks – will be graded

1

Page 3: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

Virtualization

• Running multiple copies of operating system on one

machine

• Often designed to be transparent – operating systems

do not realize they are not running on bare metal

2

Page 4: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

Virtualization – Why do it?

• Server consolidation – take many lightly loaded servers,

combine on one machine (reduce maintenance costs)

• Security/Sandboxing – hackers can hack an OS but not

whole machine

• Reliability – one OS image goes down, doesn’t take rest

with it

• Virtual memory images can be easily copied and

brought up on various systems (hypervisor hides hardware

differences)

3

Page 5: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

Virtualization – Downsides

4

Page 6: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

Virtualization Types

• Slower/overhead

• Security – if someone breaks out and into the VM can

compromise them all

• Security – timing attacks

• Reliability – one machine dying can take out many OS

images

Different levels of abstraction.

• Simulation – perfectly emulate hardware/CPU – slow

5

Page 7: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

• Full-virtualization – fully simulate hardware, OS generally

does not realize is being virtualized

• Paravirtualization – full hardware not simulated, virtual

I/O interfaces provided

guest oses have to be aware of this (but less hardware

emulation slowdown)

• Containers – operating system userspace separation

(processes see independent OS setup, filesystem, etc)

but still talking to one OS image via syscalls

6

Page 8: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

Terms

• Guest

• Host

• VM (virtual machine)

• Hypervisor

7

Page 9: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

Are you running on real hardware?

• VM (some power machines, ps3, never run on raw

hardware)

• Nested VM

• SMM mode (system maintenance mode)

8

Page 10: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

Simulation

• Simulation

9

Page 11: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

Full Virtualization

• Virtualize the CPU, some sort of simulation of hardware

• Trap on access to hardware and simulate (with Qemu or

similar)

• KVM

• VMware

10

Page 12: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

Popek and Goldberg virtualizationrequirements

Formal requirements for virtualizable third generation

architectures, Communications of the ACM, 1974.

• equivalence (fidelity): a program running under a VM

should behave identical to running on bare metal monitor

(VMM) should

• resource control (safety): the VM must control all

resources

11

Page 13: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

• efficiency (performance): most instructions must execute

without intervention

12

Page 14: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

Hardware Virtualization Extensions (CPU)

• IBM System/370 in 1972

• x86 chips by default were not, leak too much info.

13

Page 15: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

Intel VT-x and AMD-V

• See A Comparison of Software and Hardware Techniques

for x86 Virtualization by Adams and Agesen, ASPLOS

2006.

• VMware managed full virt on 32-bit x86 using dynamic

binary instrumentation (to handle trapping privileged

instructions) and segmentation (to handle memory)

• De-privledging: any attempt to read privileged info traps

and can be intercepted

• Shadow structures: need copies of things that can’t be

14

Page 16: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

intercepted at CPU level, like page-tables. Need to trap

on access to these. True vs hidden page faults.

• x86 issues (assume protected mode)

◦ visible privileged state (see privileged mode when read

CS register; CPL (privilege level) lower 2 bits)

◦ Lack of traps when privileged instructions run at user-

level.

◦ popf (pop flags) changes both ALU and system flags

(IF, enable interrupts). When run non-privileged

ignores this, doesn’t trap.

• x86-64 mostly removed segmentation (At least at first)

15

Page 17: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

so old ways wouldn’t work

• Intel VT-x and AMD-V

◦ Adds virtual machine code block

◦ Intel: extended page tables (nested page tables)

◦ VMCS shadowing: allow nested VMs

16

Page 18: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

Second Level Address Translation (SLAT)

• HW alternate to SW managed shadow page tables

• Virtual memory host serves as Phys memory of guest, so

need to translate pages twice on every page fault, etc

17

Page 19: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

Other things

• KSM – kernel same-page mapping, de-duplicate

(consolidate) identical pages in system to save RAM

• Balloon driver, to balance RAM across VMs only as

needed

• GPU virtualization

• IOMMU

• Interrupt virtualization

18

Page 20: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

KVM

• Requires CPU with hardware virtualization extensions

• Kernel acts as hypervisor

• /dev/kvm interface

◦ Set up VM and add memory, provide firmware

◦ Set up I/O traps and handlers

◦ Map video display

• Hardware and I/O emulation often handled by Qemu

19

Page 21: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

Paravirtualization

• Hypervisor creates a special API that the guest OS uses

(operating system must be modified)

• Can be faster (talk directly to hypervisor, no need to

emulate hardware)

• Xen – uses stripped down Linux as hypervisor?

• Need specially compiled kernel that knows about

hypervisor interfaces

20

Page 22: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

Containers

• ;Login article

• Look like you have own copy of OS, but just walled

off more thoroughly than normal Unix process. More

lightweight than VM

21

Page 23: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

Traditional HPC

AB

↓C

22

Page 24: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

Cloud-based HPC

AB

↓C

23

Page 25: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

Cloud Tradeoffs

Pros

• No AC bill

• No electricity bill

• No need to spend $$$

on infrastructure

Cons

• Unexpected outages

• Data held hostage

• Infrastructure not

designed for HPC

24

Page 26: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

Measuring Performance in the Cloud

First let’s just measure runtime

This is difficult because in virtualized environments

�o1 Time Loses All Meaning¤O1

25

Page 27: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

Simplified Model of Time Measurement

Hardware

Operating System

Application

Time

26

Page 28: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

Then the VM gets involved

Hardware

Time

Application

Operating System

VM Layer

27

Page 29: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

Then you have multiple VMs

Hardware

Time

VM Layer

App. ? ?

OS1 OS2 OS2OS1

28

Page 30: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

So What Can We Do?

Hope we have exclusive access and measure wall-clock time.

29

Page 31: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

Measuring Time Externally

• Ideally have local hardware access, root, and hooks into

the VM system

• Otherwise, you can sit there with a watch

• Danciu et al. send UDP packet to remote server

• Most of these are not possible in a true “cloud” setup

30

Page 32: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

Measuring Time From Within Guest

• Use gettimeofday() or clock gettime()

• This might be the only interface we have

• How bad can it be?

31

Page 33: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

Cloud Performance Measurement

With High Performance Computing moving to the cloud,

virtualization-aware performance measurement tools are a

necessity.

32

Page 34: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

Performance API (PAPI)

• Widely-used, Cross-platform, Open-Source Performance

Measurement Library

⇒ Linux, AIX, FreeBSD, Solaris

⇒ x86, Power, ARM, MIPS

⇒ BlueGene P/Q, Cray

• Use directly or via high-level tools (TAU, Perfsuite,

Vampir, Scalasca, HPCToolkit)

33

Page 35: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

PAPI-V

Virtualization-aware PAPI, or “PAPI-V” extends PAPI to

be useful in cloud environments.

• Report virtual system info

• Provide enhanced timing info

• Virtualization-related components

• Virtualized Counters

34

Page 36: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

Virtual System Info

• Virtualization vendor obtained via CPUID, reported in

hw info.virtual vendor string

• Supported by KVM, Xen, VMware, etc.

• Info for user, helps with bug reports

35

Page 37: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

The Timing Problem

• Time is an important component of most performance

measurements

• The concept of “time” gets fluid once virtualization is

involved

• Ideally you want wallclock time; this is hard to get

within a VM guest

36

Page 38: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

PAPI Timing Interface

On Linux the timing functions use the POSIX timer

interface

• PAPI get real usec();

⇒clock gettime(CLOCK REALTIME);

• PAPI get virtual usec();

⇒clock gettime(CLOCK THREAD CPUTIME ID);

37

Page 39: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

Timing Behavior on Bare Metal

0 2 4 6 8 10Other CPU-hogging Apps Running

0

500000

1000000

Tim

e (

us)

Time to run MMM, Actual Core2 Hardware

PAPI_get_real_usec()

PAPI_get_virt_usec()

38

Page 40: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

Timing Behavior on Virtualized System

0 2 4 6 8 10Other CPU-hogging VMs Running

0

500000

1000000

Tim

e (

us)

Time to run MMM, Same Core2, KVM Guest

PAPI_get_real_usec()

PAPI_get_virt_usec()

39

Page 41: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

Stealtime

What is needed is a way for accounting for time the VM

is scheduled out.

• Since 2.6.11 Linux can provide this stealtime information

• It is system wide, not per-process, which makes auto-

adjusting PAPI timing measurements problematic

• PAPI 5.0 provides a stealtime component

40

Page 42: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

Timing Adjusted with Stealtime

0 2 4 6 8 10Other CPU-hogging jobs Running

0

50000

100000

Tim

e (

us)

Time to run MMM, Core2, KVM Guest

PAPI_get_real_usec()

PAPI_get_virt_usec()

System Stealtime

PAPI_get_virt_usec() adjusted for stealtime

41

Page 43: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

Network Components

PAPI also has components for measuring Network I/O.

• Generic network component

• Infiniband component

• Myrinet component

42

Page 44: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

Infiniband DirectPath Comparison

43

Page 45: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

VMware Component

PAPI supports a component that provides access to

VMware-specific interfaces

• pseudo-performance counters – extra timing info via

rdpmc

• VMware guest SDK (ESX only) – provides various other

performance related measurements, including stealtime

44

Page 46: ECE 571 { Advanced Microprocessor-Based Design Lecture 23web.eece.maine.edu/~vweaver/classes/ece571_2019f/ece571... · 2019-11-26 · Intel VT-x and AMD-V See A Comparison of Software

Virtualized Performance Counters

The VM host can virtualize performance counter access by

trapping access to the MSRs, and saving/restoring values

when suspending/resuming VMs.

• KVM supports this as of Linux 3.2 with a sufficiently

recent version of the QEMU/KVM tool (with some

limitations)

• Xen supports this as of Linux 3.5

• VMware support is underway

45