ARM-KVM: Weather Report
![Page 1: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/1.jpg)
ARM-KVM: Weather Report
Korea Linux Forum
Mario Smarduch
Senior Virtualization Architect, Samsung Open Source Group
![Page 2: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/2.jpg)
ARM-KVM This Year
• Key contributors: Linaro, ARM
• Access to documentation & specialized HW is an issue
• ARM64 subtree – 12+ hw vendors
• Some of the new features added since last year:
  • QEMU/Guest – cache-coherency resolved
  • GICv2m – interrupt controller (GICv3 spec not public)
  • Device Pass-through
  • Virtual Platforms with kernel platform selection
  • 16K page size support
  • Guest Debug Support
![Page 3: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/3.jpg)
What is KVM?
![Page 4: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/4.jpg)
Where is KVM in the Cloud?
• Host Kernel, KVM Module, QEMU, and Guest working together
  • Kernel – KVM reuses kernel MMU, synchronization, scheduling, timers, interrupts, etc.
  • Kernel matures – KVM reuses the improvements
  • KVM – runs the vCPU loop, traps/fixes/resumes the guest, emulates
  • QEMU/kvmtool – platform emulation, guest management, I/O
  • Guest – kernel, disk image, I/O – unaware of virtual platforms
![Page 5: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/5.jpg)
vCPU Scheduling
![Page 6: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/6.jpg)
vCPU Scheduling
• Physical CPU – can be in host or guest mode
  • Guest mode uses HW Extension support
• Guest CPU – it's a thread in Guest mode, aka vCPU
• Transitions
  • Host > Guest is a VM Enter – save host, load guest context
  • Guest > Host is a VM Exit – save guest, restore host, resolve exit, and later VM Enter
• vCPUs are threads so you can:
  • Use taskset, chrt, numactl, ps
  • Use KVM to leverage kernel scheduler code for preempt notifiers and vCPU scheduling
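Because a vCPU is an ordinary host thread, the pinning that taskset performs can also be done from a program. The following is a minimal sketch, assuming a Linux host; `pin_thread` is a hypothetical helper name, and how the vCPU thread ID is discovered is left out.

```python
import os

def pin_thread(tid, cpus):
    """Restrict a host thread (for example a vCPU thread) to a set of CPUs.

    tid  -- Linux thread ID; 0 means the calling thread
    cpus -- iterable of host CPU numbers
    Returns the affinity set the kernel reports afterwards.
    """
    # os.sched_setaffinity wraps the sched_setaffinity() system call,
    # the same call taskset relies on.
    os.sched_setaffinity(tid, set(cpus))
    return os.sched_getaffinity(tid)
```

Calling `pin_thread(0, [2])` would confine the calling thread to host CPU 2, provided that CPU is in the thread's allowed set.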
![Page 7: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/7.jpg)
NFV Example
LTE Network Element - Isolation
![Page 8: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/8.jpg)
Guest Memory Management
![Page 9: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/9.jpg)
Guest Memory Management
• QEMU backs guest memory with an mmap() region
  • Register QEMU VA/GPA range with KVM
• Guest access – 2nd stage fault
  • KVM – (1) GPA > QEMU VA > get a page > update 2nd stage, QEMU
  • Guest resolves stage 1
• 4 tables
  • QEMU process, Kernel, 1st, 2nd stage tables
• KVM leverages kernel MMU code
  • paging, mmu notifiers, page allocation, and topology (flat, numa)
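The GPA > QEMU VA step of the fault path can be modelled as a lookup over the registered ranges. This is an illustrative sketch only: the field names mirror `struct kvm_userspace_memory_region` from the KVM API, while `MemSlot`, `gpa_to_hva`, and the example addresses are assumptions made for the illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MemSlot:
    guest_phys_addr: int  # start of the range in guest physical space
    memory_size: int      # length of the range in bytes
    userspace_addr: int   # start of the backing mmap() region in QEMU's VA space

def gpa_to_hva(slots, gpa):
    """Return the QEMU virtual address backing a guest physical address."""
    for s in slots:
        if s.guest_phys_addr <= gpa < s.guest_phys_addr + s.memory_size:
            return s.userspace_addr + (gpa - s.guest_phys_addr)
    return None  # no slot covers it: not RAM, e.g. an emulated MMIO region

# Hypothetical layout: 256 MiB of guest RAM at GPA 0x4000_0000.
slots = [MemSlot(0x4000_0000, 0x1000_0000, 0x7F00_0000_0000)]
```

A `None` result corresponds to the case where the access cannot be satisfied from guest RAM and has to be handled as device emulation instead.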
![Page 10: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/10.jpg)
I/O
• Virtio – dominant in cloud
  • QEMU/Guest map the same memory
  • Tx, Rx, Ctrl – virtqueues used
  • QEMU translates GPA to/from QEMU VA
• QEMU MT – vCPUs + IO Thread(s)
  • IO thread – frontend – virtqueue & backend host OS transport
![Page 11: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/11.jpg)
KVM vCPU Loop
![Page 12: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/12.jpg)
KVM in the Cloud
![Page 13: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/13.jpg)
KVM in the Cloud
• IaaS Admin – Compute node provides access to
  • Create private/public networks
  • Install Images, create block storage
  • Backend/mgmt network access
• QEMU/KVM on Compute node
  • Cloud Controller interfaces with Libvirt
  • Libvirt launches guest, QEMU, and Image
    • Virtio: attached network, storage
  • Libvirt uses QMP for QEMU mgmt
    • halt, mem balloon – inflate/deflate
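QMP is a JSON protocol, so the management operations listed above reduce to small JSON messages. The sketch below only builds the messages; it does not open a socket. `qmp_command` and `pause_and_balloon` are hypothetical helper names, and "halt" is mapped here to the QMP `stop` command as an assumption.

```python
import json

def qmp_command(name, **arguments):
    """Serialize one QMP command as a JSON string."""
    msg = {"execute": name}
    if arguments:
        msg["arguments"] = arguments
    return json.dumps(msg)

def pause_and_balloon(target_bytes):
    """Messages a client might send: negotiate, pause, resize balloon, resume."""
    return [
        qmp_command("qmp_capabilities"),             # leave capability negotiation mode
        qmp_command("stop"),                         # pause the vCPUs
        qmp_command("balloon", value=target_bytes),  # new guest memory target in bytes
        qmp_command("cont"),                         # resume the vCPUs
    ]
```

Lowering the balloon target inflates the balloon inside the guest, and raising it deflates the balloon again.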
![Page 14: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/14.jpg)
ARM64 Memory Refresh
[Figure panels: Register Set, Basic Procedure Call, Exception Model]
![Page 15: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/15.jpg)
ARM64 Memory Refresh
[Figure panels: Bit Width and Address Size, Exceptions]
![Page 16: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/16.jpg)
ARM and x86
![Page 17: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/17.jpg)
Guest/QEMU Coherency
![Page 18: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/18.jpg)
Guest/QEMU Coherency
• Blocked progress for some areas
• Strict guest device attributes prevail
  • Dealing with normal memory
  • Devices break emulation
  • Driver observes device, QEMU memory attr.
  • Incoherent view
• LCD
  • Guest updates not observed by QEMU
![Page 19: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/19.jpg)
An Issue with Coherency
• Flash emulation broke
  • Reads from memory
  • Writes mmio unlock/write/lock
  • QEMU/Guest coherency issue
  • Several attempts to resolve include using fake guest attributes, modifying QEMU MMU
  • KVM Forum solution – expose devices as DMA cacheable
![Page 20: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/20.jpg)
Interrupts High Level
[Diagram: Host, QEMU, Guest. QEMU device emulation injects interrupts. Emulated IO interrupt controller, per INTID: CPU target reg, Level/Trig, Dis/Ena, Grp 0/1 S/NS. CPU interface with HW extensions: Int Ack, EOIR, RPR, PMR. VFIO.]
![Page 21: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/21.jpg)
Interrupts
• ARM GICv2 interrupt IDs: 16 SGIs, 16 PPIs, up to 988 SPIs (IDs 32-1019)
  • Interrupt space limited, no MSI support
• MSI/MSI-X
  • MSI up to 32 interrupts/function – address/data
  • MSI-X table up to 2048 entries, address/data per entry
  • Edge triggered – re-enable delivery on device
  • Interrupt source identified easily
  • Messages instead of hw lines
  • Devices can target many CPUs & vectors, e.g. 8 CPUs, 128 Int IDs
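The GICv2 interrupt ID space can be written down as a small classifier. This sketch follows the GICv2 architecture's ID ranges: IDs 0-15 are SGIs, 16-31 are PPIs, 32-1019 are SPIs, and 1020-1023 are reserved for special purposes, which leaves 988 usable SPI IDs. The function name is an assumption for illustration.

```python
def gicv2_intid_class(intid):
    """Classify a GICv2 interrupt ID."""
    if not 0 <= intid <= 1023:
        raise ValueError(f"INTID {intid} is outside the GICv2 ID space")
    if intid < 16:
        return "SGI"      # software generated, used for inter-processor interrupts
    if intid < 32:
        return "PPI"      # private peripheral, banked per CPU
    if intid < 1020:
        return "SPI"      # shared peripheral, routed by the distributor
    return "special"      # 1020-1023, e.g. the spurious interrupt ID

# Number of SPI IDs available to all devices in the system.
spi_ids = sum(1 for i in range(1024) if gicv2_intid_class(i) == "SPI")
```

With every device sharing this one fixed pool of SPI lines, the contrast with per-function MSI and MSI-X vector counts is easy to see.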
![Page 22: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/22.jpg)
GICv2m
• MSI/MSI-X – using SPIs
  • Up to 32 clusters, 8 CPUs/cluster
  • Affinity Routing enabled to target CPUs
  • Generate MSI/MSI-X peripheral writes – using SPIs
  • GICD_{SET|CLR}SPI_NSR
  • Few other regs to program
![Page 23: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/23.jpg)
GICv2m
• MSI/MSI-X – using LPIs
  • Interrupt Translation Services
  • Huge LPI space of 57K+ interrupt IDs
  • GITS_TRANSLATER – dev id + LPI id – generate INTID
  • Device can target many CPUs & vectors, e.g. 16 CPUs, 128 Int IDs each
  • ITS – Guest programs peripherals directly
  • ITS translates from virt interrupt id to phys interrupt id
  • KVM injects virt interrupt
  • For Guest support must emulate Redistributor, ITS, & Distributor
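Two points from this slide lend themselves to a small model: the size of the LPI space and the translation lookup. In GICv3, LPI INTIDs start at 8192, so with 16 bits of INTID (an assumption about the implementation) the LPI count comes out at 57344, in line with the 57K+ figure. `ToyITS` is a deliberately simplified, hypothetical stand-in: a real ITS uses device tables, interrupt translation tables, and collections rather than one dictionary.

```python
LPI_BASE = 8192  # first LPI INTID in GICv3

def lpi_count(id_bits):
    """Number of LPI INTIDs available with the given INTID width."""
    return (1 << id_bits) - LPI_BASE

class ToyITS:
    """Maps (device id, event id) to (LPI INTID, target CPU)."""

    def __init__(self):
        self._table = {}

    def map(self, device_id, event_id, intid, target_cpu):
        if intid < LPI_BASE:
            raise ValueError("LPI INTIDs start at 8192")
        self._table[(device_id, event_id)] = (intid, target_cpu)

    def translate(self, device_id, event_id):
        """Model a device write: look up the interrupt it should raise."""
        return self._table.get((device_id, event_id))
```

For example, after `its.map(0x10, 3, 8200, target_cpu=5)`, a write from device 0x10 carrying event 3 translates to INTID 8200 on CPU 5, and an unmapped pair translates to nothing.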
![Page 24: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/24.jpg)
GICv2m
![Page 25: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/25.jpg)
Device Pass-Through
• Device pass-through using PCI
  • PCI pass-through – '-device vfio-pci,host=xx.xx.xx'
• QEMU
  • Reads device PCI Config from kernel, i.e. xx.xx.xx
  • QEMU picks a B/D/F and programs it
  • Guest enumerates – accesses PCI Config
  • Maps memory – BARs, i.e. 2nd stage
  • IOMMU – guest memory
  • Sets up interrupts
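The B/D/F that identifies the host device can be validated before it is handed to QEMU. This is a sketch under stated assumptions: it accepts the usual `[domain:]bus:device.function` hexadecimal notation, the helper names are hypothetical, and the address `01:00.0` used in the usage note is an example, not taken from the slides.

```python
def parse_bdf(bdf):
    """Split '[domain:]bus:device.function' into (bus, device, function)."""
    bus_dev, sep, fn = bdf.rpartition(".")
    parts = bus_dev.split(":")
    if not sep or len(parts) < 2:
        raise ValueError(f"not a PCI address: {bdf!r}")
    bus, dev, func = int(parts[-2], 16), int(parts[-1], 16), int(fn, 16)
    # PCI allows 256 buses, 32 devices per bus, 8 functions per device.
    if not (0 <= bus <= 0xFF and 0 <= dev <= 0x1F and 0 <= func <= 0x7):
        raise ValueError(f"field out of range in {bdf!r}")
    return bus, dev, func

def vfio_pci_option(bdf):
    """Build the QEMU arguments that assign one host PCI function."""
    parse_bdf(bdf)  # reject malformed addresses early
    return ["-device", f"vfio-pci,host={bdf}"]
```

`vfio_pci_option("01:00.0")` yields `["-device", "vfio-pci,host=01:00.0"]`, which can be appended to the rest of the QEMU command line.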
![Page 26: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/26.jpg)
Device Pass-Through
• Device pass-through using device tree
  • -device vfio-<device name>
  • QEMU enhancements
  • Add device handler – handle the -device option
  • Gather device info – create node
  • Add to Guest device tree
  • Guest parses and accesses device
  • From node – i.e. mmio regions, irq, ..
  • SMMU map guest
  • Setup interrupt pass-through
![Page 27: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/27.jpg)
Virt Machine Model
• -M virt
  • Kernel builds against "Dummy Virtual Machine" – ARCH_VIRT
  • Supports arm32/arm64 guests
• Instantiates an FDT, no need to pass a dtb file
• Defines physical map for
  • Flash – bios
  • GICv2, GICv2m, GICv3, UART, RTC
  • Platform bus device pass-through
  • Builds ACPI tables, i.e. hw discovery
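A command line for the virt machine can be assembled as below. This is a sketch, assuming the `qemu-system-aarch64` binary and a kernel image path supplied by the caller; `virt_machine_args` and its defaults are hypothetical. Note that no dtb argument appears, since the machine generates its own device tree.

```python
def virt_machine_args(kernel, mem_mb=1024, cpus=2, use_kvm=True):
    """Return an argv list that boots a kernel on QEMU's virt machine."""
    args = [
        "qemu-system-aarch64",
        "-M", "virt",            # the virtual platform described above
        "-m", str(mem_mb),
        "-smp", str(cpus),
        "-nographic",
        "-kernel", kernel,
    ]
    if use_kvm:
        args += ["-enable-kvm", "-cpu", "host"]   # run vCPUs through KVM
    else:
        args += ["-cpu", "cortex-a57"]            # emulated 64-bit CPU model
    return args
```

The list can be passed to `subprocess.run` unchanged, and further `-device` options can be appended to it.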
![Page 28: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/28.jpg)
Virt Machine Model
• virtio_mmio: for virtio transport, enable virtio-mmio in the kernel
  • The backend is agnostic to transport
  • The guest finds the mmio transport
![Page 29: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/29.jpg)
Virt Machine Model
• Boot loaders for arm32/arm64
  • Tiny boot loader support
  • Will boot an Image, Image.gz, zImage, uImage
  • Quick boot
  • Few devices emulated – low mmio exits
![Page 30: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/30.jpg)
Several Page Sizes
• 4K, 64K page sizes
  • Huge page – 2MB, 512MB
• Now 16K page size added
  • Huge page – 32MB
• More flexibility in the future
  • 4K guest on 64K host – without huge pages
  • Or 16K page guest
  • Good for TLBs
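The huge page sizes quoted here follow from the translation table geometry: a table occupies one page and holds 8-byte descriptors, so one entry in the level above the last covers a full table's worth of pages. The sketch below reproduces that arithmetic; `block_size` is a hypothetical name.

```python
def block_size(page_size):
    """Bytes mapped by one block entry one level above the page level."""
    entries_per_table = page_size // 8   # 8-byte descriptors fill one page
    return page_size * entries_per_table

MIB = 1 << 20
sizes = {p: block_size(p * 1024) // MIB for p in (4, 16, 64)}  # page KiB -> block MiB
```

Evaluating `sizes` gives 2 MiB for 4K pages, 32 MiB for 16K pages, and 512 MiB for 64K pages, matching the figures on the slide.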
![Page 31: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/31.jpg)
Several Page Sizes
• Live migration & dirty page logging
  • 64K hosts are a good option due to less memory copy
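One effect of page size on dirty logging that is easy to quantify is the bookkeeping itself: a dirty log keeps one bit per page, so larger pages mean a smaller bitmap and fewer distinct pages to track. The sketch below shows only that arithmetic and makes no claim about total migration traffic; both function names are hypothetical.

```python
def dirty_bitmap_bytes(mem_bytes, page_size):
    """Size of a one-bit-per-page dirty bitmap, rounded up to whole bytes."""
    pages = -(-mem_bytes // page_size)   # ceiling division
    return -(-pages // 8)

def pages_touched(addresses, page_size):
    """Number of distinct pages covered by a set of written addresses."""
    return len({addr // page_size for addr in addresses})
```

For 1 GiB of guest memory the bitmap is 32 KiB with 4K pages and 2 KiB with 64K pages, and three writes at offsets 0, 4096, and 8192 dirty three 4K pages but only one 64K page.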
![Page 32: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/32.jpg)
Guest Debug Support
[Diagram: gdb (vmlinux) on the host, QEMU, KVM (set hw bkpt, set wp, SS), and the guest at EL1. EL1 regs: HW BKPT, HW VALUE, WP BKPT, WP VALUE.]
![Page 33: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/33.jpg)
Guest Debug Support
• QEMU has a gdb server (connect -gdb tcp::… , -S stop cpu)
  • gdb <vmlinux> > connect remote:…
• Hyp debug support extensions to trap on debug events
• Arm64 provides a variety of self-hosted debug regs
  • Paired HW control and value registers
    • Control – VA, CONTEXTID/VMID match
    • Value reg – VA, VMID, CONTEXTID
  • Paired watchpoint control and value regs
    • Control – on load/store, byte selects
    • Value – VA
  • Single Stepping – PSTATE, debug control reg.
![Page 34: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/34.jpg)
Guest Debug Support
• Complex integration into QEMU gdb server infrastructure
  • Accept SS, bkpt, watchpoint commands
  • Take debug exit on bkpt and return state to QEMU
  • Handle concurrent guest/host QEMU debug
![Page 35: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/35.jpg)
Questions?
![Page 36: ARM-KVM: Weather Report](https://reader031.vdocument.in/reader031/viewer/2022021502/5882bcfa1a28abb2478b527f/html5/thumbnails/36.jpg)
Thank You!
Mario Smarduch
Senior Virtualization Architect, Samsung Open Source Group