os virtualization (1)

Upload: mickaelmarin

Post on 02-Jun-2018

217 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/10/2019 Os Virtualization (1)

    1/107

    1Ivan Boule

    Virtualization

    System Virtualizationand

    OS Virtual Machines

  • 8/10/2019 Os Virtualization (1)

    2/107

    2Ivan Boule

    Virtualization

    Plan

    History

    Virtualization Usages

    Virtualization Taxonomy

    Process Level Virtualization

    Transparent Hardware Emulation

    Transparent Hardware Virtualization

    Paravirtualization

    Hardware-Assisted Virtualization

    Virtualization and Embedded Systems

    Evolution of Virtualization

  • 8/10/2019 Os Virtualization (1)

    3/107

    3Ivan Boule

    Virtualization

    History of Virtual Machines

    VM introduced in the sixties on IBM/370 seriesCo-Designed VM: IBM AS/400

    High level ISA including I/Os

    Proprietary CISC PowerPCApplication VMs

    Sun Java, Microsoft Common Language Infrastructure

    OS VMsVMware (Windows/Linux on Intel)

    Connectix (Windows/PC emulation on Mac OS)

  • 8/10/2019 Os Virtualization (1)

    4/1074Ivan Boule

    Virtualization

    Plan

    History

    Virtualization Usages

    Virtualization Taxonomy

    Process Level Virtualization

    Transparent Hardware Emulation

    Transparent Hardware Virtualization

    Paravirtualization

    Hardware-Assisted Virtualization

    Virtualization and Embedded Systems

    Evolution of Virtualization

  • 8/10/2019 Os Virtualization (1)

    5/1075Ivan Boule

    Virtualization

    Goals of System Virtualization

    Reduction of Total Cost of Ownership (TCO)

    Increase utilisation of server resources

    Reduction of Total Cost of Functioning

    Energy consumption

    Cooling

    Occupied Space

    Hardware ConsolidationReduction of Build Of Material (BOM) for high-volume

    low-end products

  • 8/10/2019 Os Virtualization (1)

    6/107

    Virtualization in high-throughput networkequipments

    6

  • 8/10/2019 Os Virtualization (1)

    7/1077

    Virtualization in Multimedia devices

    Reduction of Build Of Material (BOM) for high-volume low-end

    products

    No need for a General Purpose Processor

    ~ 20 to 25 % BOM reduction

    Run Linux together with OS supporting

    Codecs on a single TI DSP

    Leverage Linux environment

    Reuse existing DSP software

    Vi t li ti

  • 8/10/2019 Os Virtualization (1)

    8/1078Ivan Boule

    Virtualization

    Usages of Virtual Machines

    Server virtualization

    Web sites hosting

    OS fault recovery

    OS kernel development

    Test machine = development host

    OS/kernel education & training

    Keep backward compatibility of legacy software

    Hardware no more available

    Run applications not supported by host OS

    Virtualization

  • 8/10/2019 Os Virtualization (1)

    9/1079Ivan Boule

    Virtualization

    Recovery Servers

    Virtualization

  • 8/10/2019 Os Virtualization (1)

    10/10710Ivan Boule

    Virtualization

    Multi-Core CPU Issues (1)

    CPU power gain

    No more achieved through Frequency/Speed increase

    But obtained with higher density & multi-core chips

    Many RTOS designed with mono-processor assumption

    Adding multi-processor support is complex & costly

    Scaling requires time, at best...

    Legacy RT applications also designed for mono-processor

    Adaptation to multi-pro even more difficult than RTOS

    Virtualization

  • 8/10/2019 Os Virtualization (1)

    11/107

    11Ivan Boule

    Virtualization

    Multi-Core CPU Issues (2)

    OS virtualization allows to run simultaneously on

    a multi-cores CPU multiple [instances of] mono-

    processor OS's

    Each OS instance is run in a [mono-processor]

    Virtual Machine assigned to a single CPU core

    No need to change legacy software

    Scalability managed at virtualization level

    Virtualization

  • 8/10/2019 Os Virtualization (1)

    12/107

    12Ivan Boule

    Virtualization

    Plan

    History

    Virtualization Usages

    Virtualization Taxonomy

    Process Level Virtualization

    Transparent Hardware Emulation

    Transparent Hardware Virtualization

    Paravirtualization

    Hardware-Assisted Virtualization

    Virtualization and Embedded Systems

    Evolution of Virtualization

    Virtualization

  • 8/10/2019 Os Virtualization (1)

    13/107

    13Ivan Boule

    Virtualization

    System Virtualization Principles

    Run multiple OS's on the same machine

    By design, an OS assumes to have full control

    over all physical resources of the machine

    Manage sharing/partitioning of machine

    resources between Guest OS's

    CPU

    Physical memory & MMU

    I/O devices

  • 8/10/2019 Os Virtualization (1)

    14/107

    14

    Machine Interfaces

    Hardware

    OS

    App App App

    System Calls User ISA

    System ISAISA

    Hardware

    OS

    App App App

    System Calls User ISA

    System ISAABI

    ISA = Instruction Set Architecture

    System level interfaceAll CPU instructions, memory architecture, I/O

    ABI = Application Binary InterfaceProcess level interface

    User-level non privileged ISA instructions + OS systems calls

    Virtualization

  • 8/10/2019 Os Virtualization (1)

    15/107

    15Ivan Boule

    Virtualization Taxonomy

    Process level virtualization

    Emulation of Operating System ABI

    Virtual Servers

    System level virtualization

    Standalone / Hosted Virtualization

    Machine Emulation / Machine Virtualization

    Virtualization

  • 8/10/2019 Os Virtualization (1)

    16/107

    16Ivan Boule

    Hosted versus Standalone

    Virtualization

    Hosted Virtualization

    Hosted VM Monitor (VMM) runs on top of native OS

    VMware WKS, Microsoft VirtualPC, QEMU, UML

    Standalone Virtualization

    VMM directly runs on bare hardware

    VMware ESX, IBM/VM, Xen, VLX, KVM

    OS run in a VM is named a Guest OS

    Virtualization

  • 8/10/2019 Os Virtualization (1)

    17/107

    17Ivan Boule

    Hosted Virtualization

    Hardware

    Native OS

    VMM

    Guest OS

    Applications

    VMM

    Guest OS

    Applications

    VMM

    Guest OS

    Applications

    Virtualization

  • 8/10/2019 Os Virtualization (1)

    18/107

    18Ivan Boule

    Example : VMware Workstation

    Hosted VM

    Unmodified OSes

    Specific devicedrivers

    X86 only

    Guest OS executed

    in user mode

    Virtualization

  • 8/10/2019 Os Virtualization (1)

    19/107

    19Ivan Boule

    Standalone Virtualization

    Hardware

    VMM

    Guest OS

    Applications

    Guest OS

    Applications

    Guest OS

    Applications

    Virtualization

    VM ESX

  • 8/10/2019 Os Virtualization (1)

    20/107

    20Ivan Boule

    VMware ESX

    Standalone VM

    Supports unmodified OS binaries

    Configuration with appropriate device drivers

    X86 only

    OS hosted

    in user mode

    Virtualization

  • 8/10/2019 Os Virtualization (1)

    21/107

    21Ivan Boule

    Plan

    History

    Virtualization Usages

    Virtualization Taxonomy

    Process Level Virtualization

    Transparent Hardware Emulation

    Transparent Hardware Virtualization

    Paravirtualization

    Hardware-Assisted Virtualization

    Virtualization and Embedded Systems

    Evolution of Virtualization

    Virtualization

  • 8/10/2019 Os Virtualization (1)

    22/107

    22Ivan Boule

    Process level ABI Emulation

    Goal: execute binary applications of a given

    system Xon the ABI of another system Y

    Emulate system XABI on top of system YABIEmulation done by application-level code

    System Ymust provide services equivalent to

    those of system X(file system, sockets, etc.)

    Virtualization

  • 8/10/2019 Os Virtualization (1)

    23/107

    23Ivan Boule

    Process Level (ABI) Emulators

    Wine - Windows Emulator on Unix/LinuxWindows API in userland

    Adobe Potosop! "oo#le Pi$asa! %%%

    &'#winUnix emulation on Windows

    POSI( librar'

    )as sell * man' Unix $ommands"+U de,elopment tool $ain #$$! #db.

    X Window, GNOME, Apache, sshd, ...

    Virtualization

  • 8/10/2019 Os Virtualization (1)

    24/107

    24Ivan Boule

    Virtual Ser,ers

    Single OS kernel / Multiple resource instancesIsolated kernel execution environments

    Root file system

    IP tablesProcess for signals

    Solaris 10 Containers

    Linux-VServer

    FreeBSD Jail

    Virtualization

  • 8/10/2019 Os Virtualization (1)

    25/107

    25Ivan Boule

    Virtual Ser,ers

    Kernel Code

    P7

    P5

    P6

    P3 P8 P9

    /roots/vm1

    10.16.0.0/16

    P5 P6

    /roots/vm2

    10.17.0.0/16

    P7

    /roots/vm3

    10.18.0.0/16

    P3

    P8 P9

    /

    74.125.0.0/16

    P1 P2

    P1

    P2

    Virtualization

  • 8/10/2019 Os Virtualization (1)

    26/107

    26Ivan Boule

    Virtual Servers

    Pro's

    CPU independent

    Lightweight

    Low memory footprint

    Low CPU overhead

    Scalable

    &ons

    +o OS etero#eneit' no "POS/01OS $ombination.Sin#le OS binar' instan$e $ommon point o2 2ailure.

    Intrusi,e3 must modi2' OS 4 2ollow OS updates

    Virtualization

    Plan

  • 8/10/2019 Os Virtualization (1)

    27/107

    27Ivan Boule

    Plan

    History

    Virtualization Usages

    Virtualization Taxonomy

    Process Level Virtualization

    Transparent Hardware Emulation

    Transparent Hardware Virtualization

    Paravirtualization

    Hardware-Assisted Virtualization

    Virtualization and Embedded Systems

    Evolution of Virtualization

    Virtualization

  • 8/10/2019 Os Virtualization (1)

    28/107

    28Ivan Boule

    Transparent Hardware Emulation

    0un unmodi2ied OS binaries

    Includes emulation of physical devices

    Cross ISA EmulationQEMU

    Same ISA Emulation

    VirtualBox (Intel x86)

    Virtualization

  • 8/10/2019 Os Virtualization (1)

    29/107

    29Ivan Boule

    Transparent Hardware Emulation

    Emulate machine X on top of machine Y

    Interpretation

    1 instruction of X executed by N instructions of Y

    Huge slow down method (1/1000 if X Y)

    Dynamic Binary Translation

    Convert blocs of X instructions in Y instructions

    Application-level emulator runs on a native OS

    One VM running a single Guest OS

    Virtualization

    567U A it t

  • 8/10/2019 Os Virtualization (1)

    30/107

    30Ivan Boule

    3

    567U Ar$ite$ture

    Solaris (Native OS)

    QEMU

    PowerPC

    RT-OS

    Real-Time

    Applications

    QEMU

    PC-x86

    Linux

    LinuxApplications

    Solaris ABI

    QEMU

    PC-x86

    Windows

    Windows

    Applications

    Sun - Sparc

    PowerPC ISA x86 ISA x86 ISA

    Solaris

    Process

  • 8/10/2019 Os Virtualization (1)

    31/107

    Virtualization

    Plan

  • 8/10/2019 Os Virtualization (1)

    32/107

    32Ivan Boule

    Plan

    History

    Virtualization Usages

    Virtualization Taxonomy

    Process Level Virtualization

    Transparent Hardware Emulation

    Transparent Hardware Virtualization

    Paravirtualization

    Hardware-Assisted Virtualization

    Virtualization and Embedded Systems

    Evolution of Virtualization

    Virtualization

  • 8/10/2019 Os Virtualization (1)

    33/107

    33Ivan Boule

    Transparent Hardware Virtualization

    Share machine resources among multiple VMs

    Execute native/unmodified OS binary images

    Provide in each VM a complete simulation of

    hardware

    Full CPU instruction set

    Interrupts, exceptions

    Memory access and MMU

    I/O devices

    Virtualization

  • 8/10/2019 Os Virtualization (1)

    34/107

    34Ivan Boule

    Full CPU Virtualization

    Present same functional CPU to all Guest OSes

    VMM manages a CPU context for each VM

    saved copy of CPU registers

    representation of software-emulated CPU context

    VMM shares physical CPUs among all VMs

    VMM includes a VM scheduler

    round-robin

    priority-based

    Virtualization

  • 8/10/2019 Os Virtualization (1)

    35/107

    35Ivan Boule

    Full CPU Virtualization

    Relationships between a VMM and VMs similar to

    relationships between native OS and applications

    Guarantee mutual isolation between all VMs

    Protect VMM from all VMs

    Directly execute native binary images of Guest

    OS's in non-privileged mode

    VMM emulates access to protected resources

    performed by Guest OSs

    Virtualization

    CPU Virtualization

  • 8/10/2019 Os Virtualization (1)

    36/107

    36Ivan Boule

    CPU Virtualization

    Run each Guest OS in non-privileged modeEx: ring compression on Intel x86

    Applications

    Guest OS

    Applications

    Guest OS

    Applications

    Guest OS

    VM0 VM1 VM2

    Virtual Machine Monitor (VMM)Ring 0

    Ring 1

    Ring 3

    Virtualization

  • 8/10/2019 Os Virtualization (1)

    37/107

    37Ivan Boule

    Hardware-Sensitive Instructions

    Interact with protected hardware resourcesPrivileged Instructions

    Critical Instructions

    Cannot be directly executed by Guest OS's

    Must be detected and faked by VMM

    Dynamic Binary Translation of kernel code

    Done once, saved in Translation Cache

    Example: Vmware

    Virtualization

    Privileged Instructions Virtualization

  • 8/10/2019 Os Virtualization (1)

    38/107

    38Ivan Boule

    Privileged Instructions Virtualization

    Only allowed in supervisor mode

    Ex: cli/stito mask/unmask interrupts on Intel x86

    When executed in non-privileged mode

    CPU automatically detects a privilege violation

    Triggers a privilege-violation exception

    Caught by VMM which fakes the expected effect

    of the privileged instruction

    Ex: cli/sti

    VMM does not mask/unmask CPU interrupts

    records interrupt mask status in context of VM

    Virtualization

    Critical Instr ctions Virt ali ation (1)

  • 8/10/2019 Os Virtualization (1)

    39/107

    39Ivan Boule

    Critical Instructions Virtualization (1)Hardware-sensitive instructions

    Ex: Intel IA-32pushf/popf

    pushf /* save EFLAG reg. to stack */

    cli /* mask interrupts => clear EFLAG.IF */

    popf /* restore EFLAG reg. => unmask interrupts */

    When executed in non-privileged mode

    The cliinstruction triggers an exception caught by VMM=> VMM record interrupts masked for current VM

    But no exception for popf=> VMM not aware of Guest

    OS action (unmask interrupts)

    Virtualization

    C iti l I t ti Vi t li ti (2)

  • 8/10/2019 Os Virtualization (1)

    40/107

    40Ivan Boule

    Critical Instructions Virtualization (2)

    Must be detected and emulated by VMMVMM dynamically analyses Guest OS binary code

    to find critical instructions

    VMM replaces critical instructions by a trap instruction to enter the VMM

    VMM emulates expected effect of critical

    instruction, if any.

    Ex: pushf/popf combined with cli/stiinstructions

    Virtualization

    Full Memory Virtualization

  • 8/10/2019 Os Virtualization (1)

    41/107

    41Ivan Boule

    Full Memory Virtualization

    CPU include a Memory Management Unit (MMU)Isolated memory addressing spaces

    Independant of underlying physical memory layout

    Run mutually protected applications in parallelVirtual Memory managed by OS kernel

    Provides a virtual address space to each process

    4 GB on most 32-bit architectures (Intel x86, PowerPC)

    Manages virtual page physical case mappings

    Manages swap space to extend physical memory

    Virtualization

    77U 4 Virtual Address Spa$e

  • 8/10/2019 Os Virtualization (1)

    42/107

    42Ivan Boule

    1ranslationLoo8aside

    )u22er

    77U 4 Virtual Address Spa$e

    Virtual Address Spa$es

    P'si$al7emor'

    77U

    case N

    pte4GB

    page

    case 0

    Virtualization

    Intel x9: 77U

  • 8/10/2019 Os Virtualization (1)

    43/107

    43Ivan Boule

    Intel x9: 77U

    ;< >>

    DirectoryIndex

    Page Table

    Index

    PageOffset

    cr/st

    1023

    cr/st

    0

    1023

    10 bits 10 bits 12 bitsVirtual

    Address

    Directory Page

    CR3

    Directory Address

    Physical Memory

    32-bit word

    cr/st =control & status

    Page Table Entry (PTE)cr/st

    4KB page

    Translation Lookaside Buffer (TLB) = cache for PTEs

    0

    Virtualization

    Memory Virtualization

  • 8/10/2019 Os Virtualization (1)

    44/107

    44Ivan Boule

    Memory Virtualization

    Machine Physical Memory

    Physical memory available on the machine

    Guest OS Physical Memory

    Part of machine memory assigned to a VM by VMM

    Guest Physical Memory can be > Machine Memory

    VMM uses swap space

    Guest OS Virtual Memory

    Guest OS manages virtual address spaces of its

    processes

    Virtualization

    Memory Virtualization

  • 8/10/2019 Os Virtualization (1)

    45/107

    45Ivan Boule

    Memory Virtualization

    Guest OS manages Guest Physical PagesManages MMU with its own page entries

    Translates Virtual Addresses into Guest Physical

    Addresses (GPA)VMM transparently manages Machine Physical

    Pages

    Guest Physical AddressMachine Physical Address

    VMM dynamically translates Guest Physical Pages

    into Machine Physical Pages

    Virtualization

    Memory Virtualization

  • 8/10/2019 Os Virtualization (1)

    46/107

    46Ivan Boule

    Memory Virtualization

    2000

    6000

    1000

    4000

    7000

    1000

    5000

    1000

    8000

    10003000

    unmapped virtual page

    0

    3000

    7000

    unmapped Guest page

    mapped virtual page

    machine

    memory

    Process

    virtual

    space

    VM 2

    Guest

    Physical

    Memorymapped Guest page

    VM 1

    P1.1 P1.2 P2.1

    Machine physical page

    Virtualization

    Memory Virtualization

  • 8/10/2019 Os Virtualization (1)

    47/107

    47Ivan Boule

    Memory Virtualization

    VMM maintains Shadow Page Tables

    Copies of Guest OS translation tables

    VMM catches updates operations of translation

    tables performed by a Guest OS

    Write-protect all guest OS page tables

    Emulates operation in shadow page table

    Updates effective MMU page table entry, if needed

    Virtualization

    Memory Virtualization

  • 8/10/2019 Os Virtualization (1)

    48/107

    48Ivan Boule

    Memory Virtualization

    PTE entries can be tagged with a context ID

    Avoids to flush TLB when switching current address

    space upon scheduling of a new process

    usually PTE tag = OS process identifier

    Processes of different Guest OSes can be

    assigned the same Process ID

    => VMM must flush TLB when switching VMs

    Virtualization

    Memory Virtualization

  • 8/10/2019 Os Virtualization (1)

    49/107

    49Ivan Boule

    Memory Virtualization

    VMM must respect Guest OS virtual page faults

    Not map virtual pages unmapped by Guest OS

    When Guest OS unmaps a virtual page:

    VMM must delete the associated real-page/physical page

    mapping, if any.

    Conversely, VMM can transparently:

    Introduce & resolve real-page faults for Guest OSes

    Share physical pages between Guest OS's

    Pages with same content's (e.g. zero-ed pages)

    Virtualization

    Memory Virtualization

  • 8/10/2019 Os Virtualization (1)

    50/107

    50Ivan Boule

    Memory Virtualization

    VMM can swap real pages of a VMon swap space managed by VMM

    VMM can dynamically distribute physical memory

    among VM'sNeeds a specific support in Guest OS (Linux module)

    VMM asks Guest OS to release memory

    Guest OS self-allocates [real] pages=> no more available for normal [kernel] allocation service

    VMM assigns same amount of physical pages to other VM's

    Virtualization

    Plan

  • 8/10/2019 Os Virtualization (1)

    51/107

    51Ivan Boule

    History

    Virtualization Usages

    Virtualization Taxonomy

    Process Level Virtualization

    Transparent Hardware Emulation

    Transparent Hardware Virtualization

    Paravirtualization

    Hardware-Assisted Virtualization

    Virtualization and Embedded Systems

    Evolution of Virtualization

    Virtualization

    Paravirtualization

  • 8/10/2019 Os Virtualization (1)

    52/107

    52Ivan Boule

    Paravirtualization

    OS adaptation to avoid binary translationoverhead

    Requires access to OS source code

    Include drivers of virtual devices

    Examples

    Xen

    User Mode Linux (UML)

    Virtualization

    Paravirtualization Principles

  • 8/10/2019 Os Virtualization (1)

    53/107

    53Ivan Boule

    Still run each Guest OS in non-privileged mode

    But with minimal virtualization overhead

    => Modified Guest OS kernel

    Remove Hardware-Sensitive InstructionsUse fast VMM system calls instead, if needed

    Minimise usage of Privileged Instructions

    Only affect Machine/CPU dependant part of OSOS portage on new architecture with same CPU

    Without system ISA

    Virtualization

    Paravirtualization (2)

  • 8/10/2019 Os Virtualization (1)

    54/107

    54Ivan Boule

    ( )

    Guest OS only use Virtual I/O Devices

    Front-end driver in Guest OS

    Back-end driver in VMM

    Data transfer through asynchronous I/O rings

    Avoid extra I/O data copies of Full Virtualization

    VMM multiplex VM Virtual Devices on physical

    devicesVirtual Ethernet

    Virtual Disks

    Virtualization

    Virtual I/O ?e,i$es

  • 8/10/2019 Os Virtualization (1)

    55/107

    55Ivan Boule

    Ethernet NIC

    Vdisk

    (be)

    Applications

    Guest OS

    Vdisk

    (fe)Veth

    (fe)

    Veth

    (be)

    Applications

    Guest OS

    Vdisk

    (fe)Veth

    (fe)

    Applications

    Guest OS

    Vdisk

    (fe)Veth

    (fe)

    NIC

    driver

    Disk

    driver

    Veth

    (be)

    Veth

    (be)

    Vdisk

    (be)

    Vdisk

    (be)

    Net Bridging

    VMM

  • 8/10/2019 Os Virtualization (1)

    56/107

    Virtualization

    Plan

  • 8/10/2019 Os Virtualization (1)

    57/107

    57Ivan Boule

    History

    Virtualization UsagesVirtualization Taxonomy

    Process Level Virtualization

    Transparent Hardware Emulation

    Transparent Hardware Virtualization

    Paravirtualization

    Hardware-Assisted Virtualization

    Virtualization and Embedded Systems

    Evolution of Virtualization

    Virtualization

    Hardware Assisted Virtualization

  • 8/10/2019 Os Virtualization (1)

    58/107

    58Ivan Boule

    Hardware-Assisted Virtualization

    Support o2 Virtualization in Hardware

    Run unmodified OS binaries

    Wit minimal ,irtualization o,eread

    Simpli2' V77 de,elopment

    6xamples

    @V7 Intel-V1! A7?-V.V7ware Intel-V1.

    Virtualization

    Hardware Assisted Virtualization

  • 8/10/2019 Os Virtualization (1)

    59/107

    59Ivan Boule

    CPU virtualizationAMD-V

    Intel VT-x (x86), Intel VT-i (Itanium) architectures

    ARM Cortex-A15MMU virtualization

    Intel Extended Page Tables (EPT)

    A7? Nested Pa#e Tables +P1.

    Virtualization

    Hardware Assisted Virtualization

  • 8/10/2019 Os Virtualization (1)

    60/107

    60Ivan Boule

    Hardware Assisted Virtualization

    DMA virtualizationIO-MMU

    I/O Device virtualization

    Self-Virtualizing devices

    SingleRoot I/OVirtualization and Sharing

    Specification (SR-IOV)

    Extensions to PCIe (PCI Express) Busstandard

    Virtualization

    Intel V1-x Ar$ite$ture

  • 8/10/2019 Os Virtualization (1)

    61/107

    61Ivan Boule

    e $ e$ u e

    Support unmodi2ied "uest OS wit no need 2orpara,irtualization and/or binar' $ode translation

    Simpli2' V77 tas8s 4 impro,e V77 per2orman$es

    7inimize V77 memor' 2ootprintSuppress sadowin# o2 "uest OS pa#e tables

    6nable "uest OS to dire$tl' mana#e I/O de,i$es

    Witout per2orman$e lost

    Wile en2or$in# V7 isolation and mutual prote$tion

    Virtualization

    Intel V1-x Ar$ite$ture O,er,iew

  • 8/10/2019 Os Virtualization (1)

    62/107

    62Ivan Boule

    VMM

    VMX root mode

    VMX non-root mode

    Intel-VT Hardware

    VM Exit VM Enter

    Guest OS

    kernel

    Applicationsring 3

    ring 0 Guest OS

    kernel

    Applicationsring 3

    ring 0Guest OS

    kernel

    Applicationsring 3

    ring 0

    VM 1 VM 3VM 2

    rings 0 - 3

    Virtualization

    Intel VT-x CPU Virtualization

  • 8/10/2019 Os Virtualization (1)

    63/107

    63Ivan Boule

    Virtual Ma$ine eXtension V7(.

    1wo new meta-modes o2 &PU operationV7( root mode

    )ea,iour similar to IA-=< witout V1

    Intended 2or V77 exe$ution

    V7( non-root mode

    Alternati,e IA-=< exe$ution en,ironment

    &ontrolled b' a V77

    ?esi#ned to run un$an#ed "uest OS in a V7

    )ot modes support rin#s ;-= pri,ile#e le,els

    Allow V77 to use se,eral pri,ile#e le,els

    Virtualization

    Intel VT-x CPU Virtualization

  • 8/10/2019 Os Virtualization (1)

    64/107

    64Ivan Boule

    1wo additional &PU mode transitions

    rom V7( root-mode to V7( non-root mode

    +amed VM Enter

    rom V7( non-root mode to V7( root mode

    +amed VM Exit

    V7 entries 4 V7 exits use a new data stru$ture

    Virtual Ma$ine Control Stru$ture V7&S. per V7

    0e2eren$ed wit a memor' p'si$al addressormat and la'out idden

    +ew V1-x instru$tions to a$$ess a V7&S

    Virtualization

    Intel VT-x CPU Virtualization

  • 8/10/2019 Os Virtualization (1)

    65/107

    65Ivan Boule

    "uest State Area

    Sa,ed ,alue o2 re#isters loaded b' V7 6xitse%#%! Se#ment 0e#isters! &0=! I?10.

    Hidden &PU state e%#%!CPU Interruptibility State.

    Host State Area

    V7 &ontrol ields

    Interrupt Virtualization

    6x$eptions bitmaps

    I/O bitmaps7odel Spe$i2i$ 0e#ister 0/W bitmaps

    6xe$ution ri#ts 2or &PU Pri,ile#ed Instru$tions

    Virtualization

    Intel VT-x Interrupt Virtualization

  • 8/10/2019 Os Virtualization (1)

    66/107

    66Ivan Boule

    V7&S 6xternal Interrupt 6xitin#All external interrupts $ause VM Exits

    "uest OS $annot mas8 external interrupts

    wen exe$utin# Interrupt 7as8in# instru$tionsV7&S Interrupt Window 6xitin#

    VM Exito$$urs wene,er "uest OS read' toser,e external interrupts

    Used b' V77 to $ontrol V7 interrupts

    Virtualization

    Intel VT-x MMU Virtualization

  • 8/10/2019 Os Virtualization (1)

    67/107

    67Ivan Boule

    Extended Pa#e Tables 6P1.Se$ond le,el o2 Pa#e 1ables in 77U

    1ranslate "uest OS P'si$al Address into7a$ine P'si$al Address

    &ontrolled b' V77

    Virtual Pro$essor IDenti2ier VPI?.

    Used to ta# 1L) entries

    A,oid to 2lus 1L) upon V7 swit$

    Virtualization

    Virtual 7emor' Virtualization

  • 8/10/2019 Os Virtualization (1)

    68/107

    68Ivan Boule

    '

    Process 1 Process 2 Process 1 Process 2VM 1 VM 2

    Guest OS

    Virtual Memory

    Guest OS

    Physical Memory

    Machine

    Physical Memory

    Virtualization

    Intel VT-x Extended Page Tables

  • 8/10/2019 Os Virtualization (1)

    69/107

    69Ivan Boule

    V77 $ontrols 6xtended Pa#e 1ables

    6P1 used in V7( non-root operation

    A$ti,ated on VM Enter

    ?esa$ti,ated on VM exit

    6P1P re#ister points to 6xtended Pa#e 1ables

    Instan$iated b' V77

    Sa,ed in V7&S

    Loaded 2rom V7&S onVM entry

    Virtualization

    Intel VT-x Extended Page Tables

  • 8/10/2019 Os Virtualization (1)

    70/107

    70Ivan Boule

    Translation Lookaside Buffer (TLB)

    Guest PTEs (Guest VA Guest PA)

    Extended PTEs (Guest PA Host PA)

    Guest

    Page Table

    EPT

    Page Table

    Guest CR3

    Guest VA Guest PA Machine PA

    EPTR Base Pointer

    Virtualization

    1L) lus Issue

  • 8/10/2019 Os Virtualization (1)

    71/107

    71Ivan Boule

    1ranslationLoo8aside

    )u22er

    Virtual Address Spa$es

    P'si$al7emor'

    77U

    case N

    pte

    4GB

    case 0

    Virtualization

    Intel VT-x Virtual Processor Identifier

  • 8/10/2019 Os Virtualization (1)

    72/107

    72Ivan Boule

    >:-bit VPI? used to ta# 1L) entries

    6nabled b' V77 in V7&S

    UniBue VPI? is assi#ned b' V77 to ea$ V7

    VPI? ; reser,ed 2or V77

    &urrent VPI? is ;x;;;; wen

    Outside V7( operation

    In V7( root mode operation

    In V7( non-root mode i2 VPI? disabled in V7&S

    VPI? loaded 2rom V7&S on VM Enter

    Virtualization

    ?7A Virtualization

  • 8/10/2019 Os Virtualization (1)

    73/107

    73Ivan Boule

    6nable "uest OS to mana#e I/O de,i$es

    I/O de,i$es assi#ned b' V77 to "uest OSes

    1ransparent mode

    Use nati,e de,i$e dri,er o2 "uest OS

    Unaware o2 p'si$al memor' Virtualization

    6n2or$e isolation between "uest Oses

    "uest OS onl' ,iew ardware ressour$esassi#ned b' V77 memor'! de,i$es.

    Virtualization

    ?7A Prin$iples

  • 8/10/2019 Os Virtualization (1)

    74/107

    74Ivan Boule

    CPUBus

    Controller

    Memory

    Device 1 Device 2 Device 3

    DMA

    Request

    System

    Bus

    I/O Bus (PCI)

    Virtualization

    ?7A Virtualization

  • 8/10/2019 Os Virtualization (1)

    75/107

    75Ivan Boule

    Device1

    Applications

    Guest OS

    Applications

    Guest OS

    Applications

    Guest OS

    Device2

    VM 1 VM 3VM 2

    Isolation

    Domain

    VMM

    Machine Physical

    Memory

    Virtualization

    ?7A Virtualization Issue

  • 8/10/2019 Os Virtualization (1)

    76/107

    76Ivan Boule

    "uest OS dri,er setup I/O re#isters o2

    de,i$e wit "uest P'si$al Address o2I/O bu22ers

    "uest P'si$al Address must be

    translated into its $orrespondin#7a$ine P'si$al Address wen used 2or?7A operations b' de,i$e

    "PA 1ranslation $annot be done b' V77V77 $annot $at$ de,i$e-spe$i2i$ dri,eroperations to setup I/O bu22ers addresses

  • 8/10/2019 Os Virtualization (1)

    77/107

    Virtualization

    Intel V1-d ?7A 1ranslation

  • 8/10/2019 Os Virtualization (1)

    78/107

    78Ivan Boule

    V1-d ardware treats address spe$i2ied in

    ?7A reBuest as D7AVirtual Address ?VA.

    ?VA C "PA o2 te V7 to wi$ te I/Ode,i$e is assi#ned

    V1-d translates te ?VA into its$orrespondin# 7a$ine P'si$al Address

    Support o2 multiple Prote$tion ?omains

    ?VA to 7PA translation table per Prote$tion?omain

    7ust identi2' te de,i$e issuin# a ?7A reBuest

    Virtualization

    V1-d P&I 6xpress +ort )rid#e

  • 8/10/2019 Os Virtualization (1)

    79/107

    79Ivan Boule

    CPUMemory

    Device 1 Device 2 Device 3

    System

    Bus

    PCI Express Bus

    North Bridge

    VT-d

    PCIe root ports

    Virtualization

    P&I ?7A 0eBuester Identi2i$ation

  • 8/10/2019 Os Virtualization (1)

    80/107

    80Ivan Boule

    7appin# between P&I ?e,i$e and Prote$tion

    ?omains>:-bit P&I ?7A 0eBuester Identi2ier

    Assi#ned b' P&I $on2i#uration so2tware

    )us D indexes )us &ontext 1able in 0oot

    &ontext 1able?e,i$e D! un$tion D. indexes ?e,i$eProte$tion ?omain in )us &ontext 1able

    15 8 7 3 2 0

    PCI Bus # Device # Function #

    Virtualization

    ?e,i$e / Prote$tion ?omain 7appin#

    (Dev 0 Func 0)

  • 8/10/2019 Os Virtualization (1)

    81/107

    81Ivan Boule

    Bus 255

    Bus N

    Bus 0Context Table of Bus 0

    (Dev 0, Func 0)

    (Dev 0, Func 1)

    (Dev 31, Func 7)

    Root Context TableContext Table of Bus N

    (Dev 0, Func 0)

    (Dev 0, Func 1)

    (Dev 31, Func 7)

    Protection Domain 0

    Protection Domain 1

    VDA MPATranslation

    Tables

    VDA MPA

    Translation

    Tables

    Virtualization

    Virtual ?7A Address 1ranslation

  • 8/10/2019 Os Virtualization (1)

    82/107

    82Ivan Boule

    V?A E 7PA V1-d Pa#e 1ables similar toIA-=< pro$essor Pa#e 1ables

    F@) or lar#er pa#e size #ranularit'

    0ead/Write permissions

    Prote$tion ?omains mana#ed b' V77

    Initialized at V7 $reation time

    Wit same translations o2 te V7 6xtendedPa#e 1able

    Virtualization

    ?e,i$e Virtualization

  • 8/10/2019 Os Virtualization (1)

    83/107

    83Ivan Boule

    Sare I/O de,i$e amon# multiples V7sWit no per2orman$e lost

    Wile en2or$in# V7 isolation and prote$tion

    7o,e de,i$e ,irtualization 2rom te V77to te de,i$e itsel2

    0eBuires support 2rom te de,i$e

    6xample o2 6ternet $ontrollers

    Virtualization

    6ternet ?e,i$e Virtualization

    VM 1 VM 2

  • 8/10/2019 Os Virtualization (1)

    84/107

    84Ivan Boule

    VM 1 VM 2

    VMM

    GuestOS 1 GuestOS 2

    pNIC Driver

    vNIC Driver vNIC Driver

    Virtual

    Function

    Virtual

    Function

    LAN

    Virtualization

    Intel Sin#le Root I/OVirtualization

  • 8/10/2019 Os Virtualization (1)

    85/107

    85Ivan Boule

    S0-IOV $apable P&I ?e,i$e $an bepartitionned into multiple Virtual un$tions

    S0-IOV ?e,i$e appears in P&I $on2i#urationspa$e as multiple P&I Virtual un$tions

    6a$ ?e,i$e Virtual un$tion in$ludes

    P&I $on2i#uration re#isters

    ?7A streams

    Interrupts

    0eBuires V1-d 2or ?7A ,irtualization

    Virtualization

    Intel S0-IOV

    V77 i l P&I d i

  • 8/10/2019 Os Virtualization (1)

    86/107

    86Ivan Boule

    V77 mana#es p'si$al P&I de,i$e

    &reate a P&I Virtual un$tion 2or ea$ V7

    In$lude it into V7 P&I $on2i#uration spa$e tobe probed b' V7 "uestOS 8ernel

    7ap it to Prote$tion ?omain o2 V7

    Pro#rams te sarin# o2 p'si$al de,i$esressour$es between Vs

    P&I ?e,i$e Virtual un$tions dire$tl'mana#ed b' spe$i2i$ V-Aware "uestOSdri,ers 8ind o2 Para-Virtualization.

    Virtualization

    Intel S0-IOVVM 1 VM 2 VM n

  • 8/10/2019 Os Virtualization (1)

    87/107

    87Ivan Boule

    MAC/PHY

    Layer 2 Filtering (Mac Address)

    OS vNIC

    Driver

    Packet

    Queue 1

    VF1

    Packet

    Queue 2

    VF2

    Packet

    Queue n

    VFn

    OS vNIC

    Driver

    VMM

    OS vNIC

    Driver

    Intel Niantic

    10GbE Port

    LAN

    Virtualization

    Intel S0-IOV - 6ternet example

  • 8/10/2019 Os Virtualization (1)

    88/107

    88

    Ivan Boule

    Intel @awela >"). / +ianti$ >;"). 6ternet +I&s

    7ultiple 0(/1( pa$8et Bueues per port

    Virtual De,i$e Ma$ine Queues

    > 0( paBuet Bueue per V

    ilters multiple uni$ast 6ternet Addresses

    La'er-< paBuet 2ilterin# based on 6ternet?estination Address

    ?upli$ate )road$ast / 7ulti$ast pa$8ets 2or all Vs

    Load balan$in# between 1( paBuets sent b' Vs

    Virtualization

    PlanHistory

  • 8/10/2019 Os Virtualization (1)

    89/107

    89

    Ivan Boule

    Virtualization Usages

    Virtualization Taxonomy

    Process Level Virtualization

    Transparent Hardware EmulationTransparent Hardware Virtualization

    Paravirtualization

    Hardware-Assisted Virtualization

    Virtualization and Embedded Systems

    Evolution of Virtualization

    Virtualization

    Old Embedded Systems

  • 8/10/2019 Os Virtualization (1)

    90/107

    90

    Ivan Boule

    Relatively simple architectureSingle-purpose devices

    Dominated by hardware constraints

    Memory, battery charge

    Dedicated functionalities, with moderated

    software size and complexity

    Real-time constraints

    Virtualization

    Old Embedded Systems (2)

  • 8/10/2019 Os Virtualization (1)

    91/107

    91Ivan Boule

    Closed environment ( black boxes )Fixed hardware configuration

    Full software provided by device vendor

    No dynamic loading of applications

    Software updates rareful

    Virtualization

    Embedded Systems Now (1)

    Take on features of general purpose OS's

  • 8/10/2019 Os Virtualization (1)

    92/107

    92Ivan Boule

    Take on features of general-purpose OS s

    Growing functionalities

    => growing complexity and size

    Run applications originally developed for PC's

    Sophisticated Human Machine Interfaces (HMI)

    Safari Web browser on iPhones

    Dynamic loading of applications

    Iphone

    Google Android

    Virtualization

    Embedded Systems Now (2)

    Dynamically load device's owner specific applications

  • 8/10/2019 Os Virtualization (1)

    93/107

    93Ivan Boule

    Dynamically load device s owner specific applications

    Games

    Applications developped by engineers with no expertise

    in embedded systems

    Java applicationsNeed for exchanges with external world

    USB, Bluetooth, Wi-Fi

    TCP/IPNeed for open API's, and openness in general

    Need for high-level systems (Linux, Windows)

    Virtualization

    Embedded Systems Challenges

    Still Real-Time systems (part of it)

  • 8/10/2019 Os Virtualization (1)

    94/107

    94Ivan Boule

    Still Real Time systems (part of it)

    Baseband stack of mobile phones

    Still hardware constraints

    Battery

    Memory (to minimize device's cost)

    Also used in mission/life critical situations

    Weapons

    Cars

    High requirements on reliability and security

    Mobile HandsetsRun Android/Linux

    applications on baseband

  • 8/10/2019 Os Virtualization (1)

    95/107

    95

    OS Virtualization La'er

    7odem

    01OS

    Wireless Sta$8s3

    "S7/"P0S

    6d#e/U71S

    1elepon'Ser,i$es

    Appli$ations

    Android/Linux

    H7I

    PI7

    Internet

    "ames

    7ultimedia

    Pone HW Sin#le &PU.

    ppprocessor

    Re-use existing legacymodem software stack withits RTOS (no changes)

    Support of Linux at a

    minimal development costOperating System

    independence for futureevolutions

    Security & Protection

    through OS isolation

    HMI: Human-Machine-InterfacePIM: Personal Information Mngt.

    Virtualization

    Virtualization in Embedded Systems

    Support for heterogeneous OS's environments

  • 8/10/2019 Os Virtualization (1)

    96/107

    96Ivan Boule

    Real-time OS

    Legacy software

    Dedicated applications whose real-time constraints

    cannot be achieved by General-Purpose systemsLicence issues ( GPL contamination )

    General Purpose OS

    Openness

    HMI

    Virtualization

    Virtualization in Embedded Systems

    Concurrent execution of RTOS and GP OS on

  • 8/10/2019 Os Virtualization (1)

    97/107

    97Ivan Boule

    Concurrent execution of RTOS and GP-OS on

    the same CPU

    Reduces cost (Bill Of Material)

    Requires the underlying VMM to provideMemory isolation between OS's

    CPU scheduling among OS's, with higher priority to

    the RTOS

    Device partitionning

    Communication mechanism between OS's

    Virtualization

    Virtualization in Embedded Systems

    Leverage multi-cores support with virtual

  • 8/10/2019 Os Virtualization (1)

    98/107

    98Ivan Boule

    ag s s pp a

    machine abstraction

    1 core per OS => no need for CPU scheduling

    2 low-performance cores consume less power

    than a single high performance CPU=> simplify power management

    New model of software distribution, shipping

    application with its own OSNo OS configuration/version incoherency

    Virtualization

    Security Through Virtualization

  • 8/10/2019 Os Virtualization (1)

    99/107

    99Ivan Boule

    Notion of Trusted Computing Base (TCB)Part of the system that provides security foundations

    Should only include hardware and VMM

    May also include RTOS, for performance/legacyreasons

    Run GP OS in an isolated Virtual Machine

    Avoid damaged GP OS to compromise the secure

    parts (data, services) of the system

    Virtualization

    Embedded + Virtualization

    Challenges (1)

  • 8/10/2019 Os Virtualization (1)

    100/107

    100Ivan Boule

    g ( )

    ull isolation o2 V7s does not 2it$ooperation reBuirements between OSs

    622i$ient $ommuni$ation me$anismsbetween V7s

    "lobal s$edulin#! wit interlea,edpriorities

    "lobal 6ner#' 7ana#ement

    Virtualization

    Embedded + VirtualizationChallenges (2)

  • 8/10/2019 Os Virtualization (1)

    101/107

    101Ivan Boule

    622i$ient $ommuni$ation me$anisms betweenV7s

    Virtual Ethernet device not adapted

    Need VMM-controlled shared memory transfers

    Example: Video streaming on a Smartphone

    Video data received via the baseband managed by RTOS

    Video data displayed by a Media Player running on GPOS

    Avoid copy of video data transfered between the 2 OS's !

    Virtualization

    Task Scheduling Issues

    Standard server-oriented Virtualization model

  • 8/10/2019 Os Virtualization (1)

    102/107

    102Ivan Boule

    The VMM schedules VM's on the CPU

    The OS on each VM runs its own scheduler

    Interleaved priorities in Embedded Systems

    Baseband task of RTOS with a high priority

    But GPOS Media-Player must have a higher

    priority than some low-priority tasks of RTOS

    Enable a VM to yield the CPUUse a RT task as a proxy of GP OS application, and

    make it yield the CPU

    Virtualization

    Multi-Users Devices

    Mobile phone has 3 types of users, each with

  • 8/10/2019 Os Virtualization (1)

    103/107

    103Ivan Boule

    specific private data to protect from the othersThe person owning the device, with address book,

    emails, documents, etc.

    Different wireless providers, for example private

    and professionnal: network access properlyauthenticated, ensure correct billing !

    Third-party service providers, for instance

    multimedia providers.

    Owner and third-parties must be granted

    secure financial transactions

    Virtualization

    Virtualization in Hardware

    Only way to build a real TCB

  • 8/10/2019 Os Virtualization (1)

    104/107

    104Ivan Boule

    Only way to build a real TCB

    Without penalizing performances

    Should include support for

    Memory Partitionning

    Physical Memory / Machine Memory mapping

    Coupled with multi-cores

    Device Partitioning

    Interrupt routing

    I/O DMA coupled with memory partitioning &

    Physical Memory / Machine Memory mapping

    Virtualization

    PlanHistory

    Virtualization Usages

  • 8/10/2019 Os Virtualization (1)

    105/107

    105Ivan Boule

    Virtualization Usages

    Virtualization Taxonomy

    Process Level Virtualization

    Transparent Hardware EmulationTransparent Hardware Virtualization

    Paravirtualization

    Hardware-Assisted VirtualizationVirtualization and Embedded Systems

    Evolution of Virtualization

    Virtualization

    Evolutions of Virtualization

  • 8/10/2019 Os Virtualization (1)

    106/107

    106Ivan Boule

    VMM shipped with hardwareBy CPU/Motherboard constructor

    With extensions of Machine constructor

    Plays BIOS role

    VMM still a boot option ?

    VMM available as free software ?

    VMM available as Open Source ?

    Virtualization

    Evolutions of Virtualization

    Public & Open VMM API

  • 8/10/2019 Os Virtualization (1)

    107/107

    107Ivan Boule

    Public & Open VMM API

    Ease port of OS on top of VMM

    Device Virtualization

    Included in VMM APIAllow generic device drivers in OS's

    Increase OS portability