o el2, and beyond! - linux conferences and linux events...
TRANSCRIPT
![Page 1: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/1.jpg)
connect.linaro.org
LEADING COLLABORATION
IN THE ARM ECOSYSTEM
To EL2, and Beyond!Optimizing the Design and Implementation of KVM/ARM
Christoffer Dall <[email protected]>Shih-Wei Li <[email protected]>
![Page 2: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/2.jpg)
–Popek and Golberg [Formal requirements for virtualizable third generation architectures ’74]
“Efficient, isolated duplicate of the real machine”
“…a statistically dominant subset of the virtual processor’s instructions be executed directly by the real processor, with no
software intervention by the VMM.”
![Page 3: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/3.jpg)
IBM 360/91Columbia University Computer Center machine room in
February or March 1969
![Page 4: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/4.jpg)
PDP-10KL10 CPU and MH10 memory cabinets
Originally installed 1985 at Sikorsky Aircraft
![Page 5: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/5.jpg)
Gigabyte R270-T6196 Cores
Dual Cavium ThunderX
![Page 6: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/6.jpg)
Hardware
OS Kernel
App AppApp
Hardware
Hypervisor
VM
Kernel
App App
VM
Kernel
App App
Native Virtual Machines
Virtualization
Privileged
Non-privileged
Privileged
Non-privileged
![Page 7: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/7.jpg)
Non-virtualizable architectures
![Page 8: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/8.jpg)
ARM Hardware Virtualization Support
VT-x!=
Virtualization Extensions
![Page 9: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/9.jpg)
x86 Virtualization Support
Root (Hypervisor) Non-Root (VM)
VM Exit
VMCS
VM Entry
![Page 10: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/10.jpg)
ARM Virtualization Extensions
Kernel
UserEL0
EL1
HypervisorEL2
![Page 11: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/11.jpg)
EL2
• Separate CPU mode designed to run hypervisors
• Not designed to run full operating systems
• Reduced virtual memory support compared to EL1
• Limited support for interacting with userspace in EL0
![Page 12: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/12.jpg)
ARM VE and Hypervisors
Xen
Dom0
Linux
App App
DomU
Linux
App AppEL0
EL1
EL2?
![Page 13: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/13.jpg)
KVM/ARM
• KVM is integreated with Linux
• Linux is a full operating system designed to run in EL1
• KVM cannot run VMs without EL2
![Page 14: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/14.jpg)
KVM/ARM Split-ModeHost
Linux
AppApp
VM
Kernel
AppApp
KVM
KVM lowvisor
EL0
EL1
EL21. Hypercall
2. Return3. Hypercall4. Return
switchstate
![Page 15: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/15.jpg)
What if we could do this?
Host
Linux
AppApp
VM
Kernel
AppApp
KVM
EL0
EL1
EL21. Hypercall 2. Return
![Page 16: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/16.jpg)
ARMv8.1 VHE
• Virtualization Host Extensions
• Supports running unmodified OSes in EL2 without using EL1
Linux
EL0
EL1
EL2
AppApp
![Page 17: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/17.jpg)
VHE #1: Backwards Compatible
• HCR_EL2.E2H complete enables and disables VHE
• When disabled, completely backwards compatible with ARMv8.0
• Example: Xen disables VHE
![Page 18: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/18.jpg)
VHE #2: Expands Functionality of EL2
• Expanded EL2 functionality
• Inherits all EL1 MMU features
• New virtual EL2 timer
• A corresponding EL2 system register for each EL1 system register
![Page 19: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/19.jpg)
VHE #3: Support Userspace in EL0
• TGE: Trap General Exceptions
• Routes all exceptions to EL2
• VHE no longer disables stage 1 MMU in EL0Linux
EL0
EL1
EL2
AppApp
Exceptions
![Page 20: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/20.jpg)
VHE #4: EL2&0 Translation Regime
• Same page table format as EL1
• Used in EL0 with TGE bit set
![Page 21: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/21.jpg)
• Linux is written to run in EL1
• EL<x> is controlled by EL<x> system registers
• VHE runs Linux in EL2
• Unmodified!
VHE #5: System Register Redirection
Linux
EL0
EL1
EL2
AppApp
EL1 Registers
EL2 Registers
Linux
![Page 22: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/22.jpg)
• Linux is written to run in EL1
• VHE runs Linux in EL2
• Unmodified!
VHE #5: System Register Redirection
Linux
EL0
EL1
EL2
AppApp
EL1 Registers
EL2 Registers
Linux
![Page 23: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/23.jpg)
VHE: System Register Redirection
mrs x0, ESR_EL1
![Page 24: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/24.jpg)
VHE #5: System Register Redirection
ESR_EL1
mrs x0, ESR_EL1
VHE Disabled
ESR_EL2
![Page 25: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/25.jpg)
VHE #5: System Register Redirection
ESR_EL1
mrs x0, ESR_EL1
ESR_EL2
VHE Enabled
![Page 26: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/26.jpg)
VHE #5: System Register Redirection
ESR_EL1
mrs x0, ESR_EL12VM
Kernel
AppAppEL0
EL1
EL2
Host
AppApp
Linux KVM
![Page 27: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/27.jpg)
VHE #5: More System Register Redirection
• Some registers change bit position to be similar between EL1 and EL2
• Example:
• VHE: CNTKCTL_EL1 redirects to CNTHTCL_EL2
• But they have different layouts
• VHE: EL2 register changes layout to EL1 register (with extra bits)
![Page 28: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/28.jpg)
Legacy KVM/ARM without VHE
HypervisorLinux
EL2
EL1KVM
Lowvisor
Trap
Run VM
![Page 29: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/29.jpg)
KVM/ARM with VHE
HypervisorLinux
EL2 KVM world switch
Function Call
Run VM
![Page 30: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/30.jpg)
No VHE hardware
• How do we measure VHE performance?
• None available at start of this work
• Still no publicly available hardware
![Page 31: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/31.jpg)
Linux in EL2Modify Linux to:
1. Access EL2 registers
2. Use EL2 virtual memory system
3. Support user space applications in EL0
EL0
EL1
EL2 Linux
Userspace
KVM
![Page 32: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/32.jpg)
System Registers Accesses
• Lots of:#ifndef CONFIG_EL2_KERNEL msr tcr_el1, x0 #else msr tcr_el2, x0 #endif
![Page 33: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/33.jpg)
EL1 VA Space (39 bits)
Userspace
0x7f ffffffff
0x0
Kernel
0xffffffff ffffffff
0xffffff80 00000000
TTBR0_EL1 TTBR1_EL1
![Page 34: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/34.jpg)
EL2 VA Space (39 bits)0x7f ffffffff
0x0
TTBR0_EL2
Where do we put the kernel and
userspace?
![Page 35: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/35.jpg)
EL2 Split VA Space
Kernel
0x7f ffffffff
0x0
TTBR0_EL2
Userspace
• Problem A: address space compression
• Problem B: Page table formats
• Problem C: requires TLB invalidation
0x3f ffffffff0x40 00000000
*Only problems on non-VHE hardware!
![Page 36: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/36.jpg)
Sharing Page Tables in EL0 and EL2
• Same page table between user and kernel
• Different page table format in EL0 and EL2
Descriptor bit EL0 EL2AP[2] R/W R/WAP[1] User access RES1
UXN/XN UXN XNPXN PXN RES0
![Page 37: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/37.jpg)
Descriptor bit EL0 EL2AP[2] R/W R/WAP[1] User access RES1
UXN/XN UXN XNPXN PXN RES0
The AP[1] bit and Linux in EL2• AP[1] controls if userspace can access the page
• Must be set to 0 for kernel mappings
• RES1 in EL2
![Page 38: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/38.jpg)
RES1 definition
ARMv8.0 hardware must treat non-register RES1 bits as:
“reads-as-written with no effect on the behaviour of the CPU”
![Page 39: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/39.jpg)
Descriptor bit EL0 EL2AP[2] R/W R/WAP[1] User access RES1
UXN/XN UXN XNPXN PXN RES0
UXN/XN and PXN for Linux in EL2
• PXN has no effect outside EL1
• UXN/XN means ‘execute never’ in both modes
• Cannot separate user and kernel executable
![Page 40: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/40.jpg)
No ASID Support in EL2
• Address Space Identifiers (ASID)
• Avoids TLB aliasing by tagging accesses with per-context ID
• No ASID support in EL2
• Must invalidate EL2 TLB on host process context switch
![Page 41: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/41.jpg)
Routing Exceptions to EL2
Kernel
UserEL0
EL1 Exceptions from kernel
Exceptions from userspace
Kernel
UserEL0
EL2 Exceptions from kernel
Exceptions from userspaceEL1
Linux in EL1 Linux in EL2
![Page 42: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/42.jpg)
Routing Exceptions to EL2
• HCR_EL2.TGE traps general exceptions to EL2
• Does NOT work, because TGE without VHE disables MMU in userspace
![Page 43: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/43.jpg)
Routing Exceptions to EL2
• Forward exceptions with software using a small shim
Kernel
UserEL0
EL2
EL1 shim
![Page 44: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/44.jpg)
Linux in EL2 on non-VHE hardware
The bad (and the ugly)
• Less secure than Linux in EL1
• Relies on strictly correct implementation of RES1 page table bits
• Potentially worse performance for host workloads
The Good
• Good prototyping tool!
• Closely emulates performance of VHE for running VMs
![Page 45: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/45.jpg)
Experimental Setup
• AMD Seattle B0 ARM Server
• 64-bit ARMv8-A
• 2.0 GHz AMD A1100 CPU
• 8-way SMP
• 16 GB RAM
• 10 GB Ethernet (passthrough)
*Measurements obtained using Linux in EL2.
![Page 46: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/46.jpg)
VHE Performance at First Glance
CPU Clock Cycles non-VHE VHE*
Hypercall 3.181 3.045
*Measurements obtained using Linux in EL2.
![Page 47: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/47.jpg)
The KVM Run Loop
vcpu_load
vcpu_put
vcpu run loop
while (1) { prepare(); run_vcpu(); handle_exit(); }
![Page 48: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/48.jpg)
KVM/ARM Optimization
• Move logic out of the run loop and into vcpu_load and vcpu_put
• Only possible with VHE (or Linux in EL2)
vcpu_load
vcpu_put
vcpu run loop
![Page 49: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/49.jpg)
ARM Generic Timers• Also known as “Architected
Timers”
• Timer hardware directly programmable by guest
• Expired timers generate physical interrupts for the hypervisor
![Page 50: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/50.jpg)
KVM/ARM TimersVCPU entry• Programs timer with guest state
VCPU is running• When the timer fires it causes an exit to the hypervisor
VCPU exit• Reads guest timer state to memory• Disables hardware timer• In software: If timer is expired, inject virtual interrupt
![Page 51: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/51.jpg)
Optimized KVM/ARM TimersVCPU load
• Programs timer with guest state
VCPU is running• When the timer fires it causes an exit to the hypervisor
KVM is running• When the time fires, the timer ISR injects virtual interrupts to the guest.
VCPU put• Reads guest timer state to memory• Disables hardware timer
![Page 52: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/52.jpg)
EL1 System RegistersVM
Kernel
AppAppEL0
EL1
EL2
Host
AppApp
Linux KVM
• Defer saving/restoring EL1 system register state to vcpu_load and vcpu_put
![Page 53: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/53.jpg)
Virtualization FeaturesVM
Kernel
AppAppEL0
EL1
EL2
Host
AppApp
Linux KVM
• Legacy KVM/ARM design enabled/disabled virtualization features on every transition
• Virtual/Physical interrupts
• Stage 2 memory translation KVM Lowvisor
Disable traps
Enable traps
![Page 54: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/54.jpg)
Virtualization FeaturesVM
Kernel
AppAppEL0
EL1
EL2
Host
AppApp
Linux KVM
Optimized version:
• Leave virtualization features enabled
• Host EL2 never uses stage 2 translations and always has full hardware access.
![Page 55: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/55.jpg)
Rewrite the World-Switch
• Rewrite the world switch code
• Very simple VHE function
• Complicated non-VHE function
kvm_arch_vcpu_ioctl_run { ... while (1) { ... if (has_vhe()) /* static key */ ret = kvm_vcpu_vhe_run(vcpu); else ret = kvm_call_hyp(__kvm_vcpu_run, vcpu); ... } ... }
![Page 56: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/56.jpg)
Experimental Setup
• AMD Seattle B0 ARM Server
• 64-bit ARMv8-A
• 2.0 GHz AMD A1100 CPU
• 8-way SMP
• 16 GB RAM
• 10 GB Ethernet (passthrough)
*Measurements obtained using Linux in EL2.
• Dell r320 x86 Server
• 64-bit Intel
• 2.1 GHz Xeon E5-2450
• 8-way SMP
• 16 GB RAM
• 10 GB Ethernet (passthrough)
![Page 57: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/57.jpg)
Microbenchmark Results
CPU Clock Cycles non-VHE VHE OPT * x86
Hypercall 3.181 752 1.437
I/O Kernel 3.992 1.604 2.565
I/O User 6.665 7.630 6.732
Virtual IPI 14.155 2.526 3.102
*Measurements obtained using Linux in EL2.
![Page 58: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/58.jpg)
Application Workloads
Application Description
Kernbench Kernel compile
Hackbench Scheduler stress
Netperf Network performance
Apache Web server stress
Memcached Key-Value store
![Page 59: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/59.jpg)
Application Workloads
0.00
0.50
1.00
1.50
2.00
KernbenchHackbench
TCP_STREAM
TCP_MAERTSTCP_RR
ApacheMemcached
non-VHE VHE OPT* x86
*Measurements obtained using Linux in EL2. See BKK16 talk.
Normalized overhead (lower is better)
![Page 60: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/60.jpg)
Conclusions• Optimize and redesign KVM/ARM for VHE
• Significant improvement in microbenchmark results
• Significant improvement in application benchmark results
• Similar (or better) performance characteristics compared to x86
• Published in USENIX ATC’17:https://www.usenix.org/system/files/conference/atc17/atc17-dall.pdf
![Page 61: o EL2, and Beyond! - Linux Conferences and Linux Events ...events17.linuxfoundation.org/sites/events/files/slides/To...Gigabyte R270-T61 es Dual Cavium ThunderX re el p p p re isor](https://reader036.vdocument.in/reader036/viewer/2022071603/613dced12809574f586e3245/html5/thumbnails/61.jpg)
Code• Timer optimization patches (v4):
https://lists.cs.columbia.edu/pipermail/kvmarm/2017-October/027836.html
• Core optimization patches:https://lists.cs.columbia.edu/pipermail/kvmarm/2017-October/027523.html
• Linux in EL2 (not for upstream, not supported, don’t come crying…): https://github.com/chazy/el2linux
• Target is v4.16
• Reviews are welcome