checkpoint/restore in the palacios virtual machine monitor

Checkpoint/Restore in the Palacios Virtual Machine Monitor EECS 441 – Resource Virtualization Steven Jaconette, Eugenia Gabrielova, Nicoara Talpes Instructor: Peter Dinda

Post on 05-Jan-2016

TRANSCRIPT

Page 1: Checkpoint/Restore in the Palacios Virtual Machine Monitor

Checkpoint/Restore in the Palacios Virtual Machine Monitor

EECS 441 – Resource Virtualization Steven Jaconette, Eugenia Gabrielova, Nicoara Talpes

Instructor: Peter Dinda

Page 2: Checkpoint/Restore in the Palacios Virtual Machine Monitor

Agenda

• Background

• Motivation

• Design

• Implementation

• Future work

Page 3: Checkpoint/Restore in the Palacios Virtual Machine Monitor

Virtual Machine Monitors

• Virtual Machine:

o Software emulation or virtualization of a machine

• Virtual Machine Monitor (VMM):

o Allows multiple OSes to access physical machine resources

• Each guest OS believes it is running directly on hardware

Page 4: Checkpoint/Restore in the Palacios Virtual Machine Monitor

Palacios VMM

• Virtual Machine Monitor developed at Northwestern

• Targeted at the Red Storm supercomputer at Sandia National Laboratories

• Linked into a host OS; allows both 64-bit and 32-bit guests

• Provides guests with the functionality of an Intel/AMD processor, memory, interrupts, and hardware devices

Page 5: Checkpoint/Restore in the Palacios Virtual Machine Monitor

Palacios VMM Structure

Page 6: Checkpoint/Restore in the Palacios Virtual Machine Monitor

Checkpoint / Restore

• Checkpoint: suspend a running OS instance and copy it somewhere else (kernel, disk)

• Restore: copy the OS instance to its destination

• Used as part of OS migration

• Useful when you know a machine will fail and want to move memory to a different place fast

Page 7: Checkpoint/Restore in the Palacios Virtual Machine Monitor

Checkpoint / Restore

Page 8: Checkpoint/Restore in the Palacios Virtual Machine Monitor

Checkpoint / Restore

Page 9: Checkpoint/Restore in the Palacios Virtual Machine Monitor

Motivation

• Palacios cannot currently put guests to sleep and later restore them with memory intact

• This functionality is the first step toward live-migration of guests

• Both checkpointing and live-migration have important applications in supercomputing

Page 10: Checkpoint/Restore in the Palacios Virtual Machine Monitor

Checkpoint / Restore in Other Systems

• Used in live OS migration

• VMware Virtual Center: quiesces the VM after the pre-copy state

• Xen Virtual Machine Monitor, same procedure as VMware:

o OS instance suspends itself and is moved to the destination host; the suspended copy of the VM state is then resumed

Page 11: Checkpoint/Restore in the Palacios Virtual Machine Monitor

Guest State in Palacios

• Structures that make up a VM: VMCB, registers, pointers from guest info

• Devices

• Interrupts

• Static information

• Pointers

Page 12: Checkpoint/Restore in the Palacios Virtual Machine Monitor

Guest State

struct guest_info {
  ullong_t rip;

  uint_t cpl;

  addr_t mem_size; // Probably in bytes for now....
  v3_shdw_map_t mem_map;

  struct vm_time time_state;

  v3_paging_mode_t shdw_pg_mode;
  struct shadow_page_state shdw_pg_state;
  addr_t direct_map_pt;
  // nested_paging_t nested_page_state;

  // This structure is how we get interrupts for the guest
  struct v3_intr_state intr_state;

  v3_io_map_t io_map;

  struct v3_msr_map msr_map;

  // device_map
  struct vmm_dev_mgr dev_mgr;

  struct v3_host_events host_event_hooks;

  v3_vm_cpu_mode_t cpu_mode;
  v3_vm_mem_mode_t mem_mode;

  struct v3_gprs vm_regs;
  struct v3_ctrl_regs ctrl_regs;
  struct v3_dbg_regs dbg_regs;
  struct v3_segments segments;

  v3_vm_operating_mode_t run_state;
  void * vmm_data;

  uint_t enable_profiler;
  struct v3_profiler profiler;

  void * decoder_state;

  v3_msr_t guest_efer;

  /* Do we need these ? */
  v3_msr_t guest_star;
  v3_msr_t guest_lstar;
  v3_msr_t guest_cstar;
  v3_msr_t guest_syscall_mask;
  v3_msr_t guest_gs_base;
};

Page 13: Checkpoint/Restore in the Palacios Virtual Machine Monitor

Design 1: Serialization

• "Flatten" guest state information at checkpoint

• Not all guest information should be checkpointed:

o Devices, interrupts

o Static data from XML files

• Restore from saved guest state information

o Similar to configuring virtual machine at boot
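To make the "flatten" idea concrete, here is a minimal sketch in C. It assumes a heavily reduced, hypothetical stand-in for the guest state (`struct mini_guest`), not the real `struct guest_info`, and the function names are invented for illustration. Scalar fields are copied field by field, and the pointer to guest memory is replaced by the bytes it refers to, since a raw address would be meaningless after restore.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced stand-in for struct guest_info. */
struct mini_guest {
    uint64_t rip;       /* instruction pointer */
    uint32_t cpl;       /* current privilege level */
    uint64_t mem_size;  /* guest memory size in bytes */
    uint8_t *mem;       /* guest memory; the pointer itself is not saved */
};

/* Size of the fixed-width header written before the memory contents. */
static size_t guest_hdr_size(void)
{
    return sizeof(uint64_t) + sizeof(uint32_t) + sizeof(uint64_t);
}

/* Flatten g into buf. Returns bytes written, or 0 if buf is too small. */
size_t guest_serialize(const struct mini_guest *g, uint8_t *buf, size_t cap)
{
    size_t off = 0;
    if (cap < guest_hdr_size() || cap - guest_hdr_size() < g->mem_size)
        return 0;
    memcpy(buf + off, &g->rip, sizeof g->rip);           off += sizeof g->rip;
    memcpy(buf + off, &g->cpl, sizeof g->cpl);           off += sizeof g->cpl;
    memcpy(buf + off, &g->mem_size, sizeof g->mem_size); off += sizeof g->mem_size;
    memcpy(buf + off, g->mem, (size_t)g->mem_size);      off += (size_t)g->mem_size;
    return off;
}

/* Rebuild g from buf; guest memory is copied into caller-supplied mem_out.
 * Returns 0 on success, -1 if the image is truncated or mem_out too small. */
int guest_deserialize(struct mini_guest *g, const uint8_t *buf, size_t len,
                      uint8_t *mem_out, size_t mem_cap)
{
    size_t off = 0;
    if (len < guest_hdr_size())
        return -1;
    memcpy(&g->rip, buf + off, sizeof g->rip);           off += sizeof g->rip;
    memcpy(&g->cpl, buf + off, sizeof g->cpl);           off += sizeof g->cpl;
    memcpy(&g->mem_size, buf + off, sizeof g->mem_size); off += sizeof g->mem_size;
    if (g->mem_size > mem_cap || len - off < g->mem_size)
        return -1;
    memcpy(mem_out, buf + off, (size_t)g->mem_size);
    g->mem = mem_out;  /* pointer is re-established, not restored */
    return 0;
}
```

In this sketch the restore side sets `mem` afresh, which mirrors the slide's remark that restoring resembles configuring the virtual machine at boot.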

Page 14: Checkpoint/Restore in the Palacios Virtual Machine Monitor

Design 2: Per-Guest Heap with Pointer Tagging

• For each guest's heap: checkpoint the heap and restore it to the address space

• Starting address for heap could be different after copy

• Make sure pointers are not pointing to the wrong memory addresses by fixing them up

• During copy, record the start of the heap, track the pointers to addresses in the heap, and save them as offsets

• Problem: mallocs in external libraries and void pointers
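The pointer fix-up described above can be sketched as follows. This is an illustration under assumptions, not the Palacios code: the heap is a fixed-size byte region, the "tags" are a hypothetical list of byte offsets at which pointer fields live, and `uintptr_t` is assumed to be the same size as a pointer. At checkpoint each tagged pointer is rewritten as an offset from the heap start; at restore each offset is turned back into a pointer relative to the new heap start.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HEAP_SIZE 256
#define NO_PTR ((uintptr_t)-1)  /* offset value that stands for NULL */

_Static_assert(sizeof(uintptr_t) == sizeof(void *),
               "an offset must fit in a pointer slot");

struct node {
    int value;
    struct node *next;  /* tagged: points into the same heap, or NULL */
};

/* Copy the heap into image, storing each tagged pointer as an offset. */
void heap_checkpoint(const uint8_t *heap, uint8_t *image,
                     const size_t *tagged, size_t ntagged)
{
    memcpy(image, heap, HEAP_SIZE);
    for (size_t i = 0; i < ntagged; i++) {
        void *p;
        memcpy(&p, heap + tagged[i], sizeof p);
        uintptr_t off = p ? (uintptr_t)((uint8_t *)p - heap) : NO_PTR;
        memcpy(image + tagged[i], &off, sizeof off);
    }
}

/* Copy image into new_heap, turning each stored offset back into a pointer. */
void heap_restore(uint8_t *new_heap, const uint8_t *image,
                  const size_t *tagged, size_t ntagged)
{
    memcpy(new_heap, image, HEAP_SIZE);
    for (size_t i = 0; i < ntagged; i++) {
        uintptr_t off;
        memcpy(&off, image + tagged[i], sizeof off);
        void *p = (off == NO_PTR) ? NULL : (void *)(new_heap + off);
        memcpy(new_heap + tagged[i], &p, sizeof p);
    }
}

/* Build a two-node list in one heap, restore it at a different address,
 * and check that the link now points into the new heap. Returns 0 if so. */
int demo_pointer_fixup(void)
{
    uint8_t *heap_a = calloc(1, HEAP_SIZE);
    uint8_t *heap_b = calloc(1, HEAP_SIZE);
    uint8_t *image = malloc(HEAP_SIZE);
    int ok = 0;

    if (heap_a && heap_b && image) {
        struct node *n0 = (struct node *)heap_a;
        struct node *n1 = (struct node *)(heap_a + 64);
        n0->value = 10; n0->next = n1;
        n1->value = 20; n1->next = NULL;
        size_t tagged[] = { offsetof(struct node, next),
                            64 + offsetof(struct node, next) };

        heap_checkpoint(heap_a, image, tagged, 2);
        heap_restore(heap_b, image, tagged, 2);

        struct node *r0 = (struct node *)heap_b;
        ok = r0->value == 10
          && (uint8_t *)r0->next == heap_b + 64
          && r0->next->value == 20
          && r0->next->next == NULL;
    }
    free(heap_a); free(heap_b); free(image);
    return ok ? 0 : -1;
}
```

The sketch also hints at the problem noted on the slide: every pointer must be known in advance through the `tagged` list, which is not possible for allocations made inside external libraries or for untyped `void *` fields.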

Page 15: Checkpoint/Restore in the Palacios Virtual Machine Monitor

Per-Guest Heap w/ Pointer Tagging

Page 16: Checkpoint/Restore in the Palacios Virtual Machine Monitor

Per-Guest Heap with Pointer Tagging

Page 17: Checkpoint/Restore in the Palacios Virtual Machine Monitor

Design 3: Per-Guest Heap with User Space Mapping

• Create a "per-guest" heap for each VM, as before.

• Do not tag/fix pointers.

• Map the heap to a well-known address in user space.

o Mark the pages as "system" to prevent modification

o On a checkpoint, copy from this address

o For a restore, copy back to it

• Change between VMs through process context switches.
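The reason no pointer fix-up is needed in this design can be shown with a small simulation. It is only an analogy under assumptions: a static array stands in for the heap mapped at a well-known user-space address (its address never changes within the process), and all names are hypothetical. Because the data is always copied back to the same address, the raw pointers stored inside it are valid again after a restore.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HEAP_NODES 8

struct node {
    int value;
    struct node *next;  /* raw pointer into guest_heap, never rewritten */
};

/* Stand-in for the per-guest heap at a well-known address. */
static struct node guest_heap[HEAP_NODES];

/* Checkpoint: copy the heap bytes out, pointers included, unchanged. */
void heap_checkpoint(void *save_area)
{
    memcpy(save_area, guest_heap, sizeof guest_heap);
}

/* Restore: copy the bytes back to the very same address. */
void heap_restore(const void *save_area)
{
    memcpy(guest_heap, save_area, sizeof guest_heap);
}

/* Returns 0 if the linked structure survives a checkpoint/restore cycle. */
int demo_fixed_address(void)
{
    static unsigned char save_area[sizeof guest_heap];

    guest_heap[0].value = 1; guest_heap[0].next = &guest_heap[3];
    guest_heap[3].value = 2; guest_heap[3].next = NULL;

    heap_checkpoint(save_area);
    memset(guest_heap, 0, sizeof guest_heap);  /* simulate losing the guest */
    heap_restore(save_area);

    if (guest_heap[0].next != &guest_heap[3]) return -1;
    if (guest_heap[0].next->value != 2) return -1;
    if (guest_heap[3].next != NULL) return -1;
    return 0;
}
```

Compared with pointer tagging, the checkpoint and restore steps reduce to two plain copies; the cost is that only one guest's heap can occupy the well-known address at a time, which is why the slide switches between VMs through process context switches.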

Page 18: Checkpoint/Restore in the Palacios Virtual Machine Monitor

Per-Guest Heap in User Space

Page 19: Checkpoint/Restore in the Palacios Virtual Machine Monitor

Implementation

• In order to create a per-guest heap, we must allocate a chunk of memory to represent the heap

• The host OS provides Palacios with malloc/free functions

o Currently these are the Kitten kernel memory allocator functions

• In order to allocate out of our chunk, we needed to define new allocation functions
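One simple way to allocate out of a pre-allocated chunk is a bump allocator, sketched below. The slides do not say which allocation strategy was used, so this is an assumed technique for illustration, and `struct guest_heap`, `guest_heap_init`, `guest_malloc`, and `guest_heap_reset` are hypothetical names. Requests are rounded up to 16 bytes and served from the next free position; returned pointers are 16-byte aligned only if the chunk itself is.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bookkeeping for one guest's heap, carved from a chunk the host provided. */
struct guest_heap {
    uint8_t *base;  /* start of the chunk */
    size_t size;    /* total bytes in the chunk */
    size_t used;    /* bytes handed out so far */
};

void guest_heap_init(struct guest_heap *h, void *chunk, size_t size)
{
    h->base = chunk;
    h->size = size;
    h->used = 0;
}

/* Hand out the next n bytes (rounded up to a multiple of 16).
 * Returns NULL when the chunk is exhausted or the size overflows. */
void *guest_malloc(struct guest_heap *h, size_t n)
{
    size_t rounded = (n + 15) & ~(size_t)15;
    if (rounded < n || rounded > h->size - h->used)
        return NULL;
    void *p = h->base + h->used;
    h->used += rounded;
    return p;
}

/* A bump allocator cannot free single blocks; the whole heap is reset. */
void guest_heap_reset(struct guest_heap *h)
{
    h->used = 0;
}
```

Keeping every allocation for a guest inside one contiguous chunk is what makes the later checkpoint a single copy of `used` bytes starting at `base`.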

Page 20: Checkpoint/Restore in the Palacios Virtual Machine Monitor

Implementation

• Checkpoint / Restore

o Checkpoint: find the next available location in user space, then copy the relevant info

o Queue of previously checkpointed guest data locations

o Restore: get the checkpoint address from the queue, copy back to the guest heap
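The checkpoint queue on this slide might look roughly like the following sketch. It rests on assumptions the slides do not confirm: a small fixed array of slots stands in for the user-space save locations, the queue is first-in first-out, and `struct ckpt_queue`, `checkpoint`, and `restore` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CKPT 4     /* number of save locations available */
#define HEAP_BYTES 32  /* size of the guest heap being saved */

/* Ring buffer of saved heap images; slots[] stands in for user space. */
struct ckpt_queue {
    unsigned char slots[MAX_CKPT][HEAP_BYTES];
    size_t head;   /* index of the oldest saved checkpoint */
    size_t count;  /* number of checkpoints currently queued */
};

/* Checkpoint: take the next free slot and copy the heap into it.
 * Returns 0 on success, -1 if no slot is available. */
int checkpoint(struct ckpt_queue *q, const void *heap)
{
    if (q->count == MAX_CKPT)
        return -1;
    size_t slot = (q->head + q->count) % MAX_CKPT;
    memcpy(q->slots[slot], heap, HEAP_BYTES);
    q->count++;
    return 0;
}

/* Restore: take the oldest queued checkpoint and copy it back to the heap.
 * Returns 0 on success, -1 if the queue is empty. */
int restore(struct ckpt_queue *q, void *heap)
{
    if (q->count == 0)
        return -1;
    memcpy(heap, q->slots[q->head], HEAP_BYTES);
    q->head = (q->head + 1) % MAX_CKPT;
    q->count--;
    return 0;
}
```

A zero-initialized `struct ckpt_queue` is an empty queue, so no separate init function is needed in this sketch.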

Page 21: Checkpoint/Restore in the Palacios Virtual Machine Monitor

Future Work

• More coding is needed to test this design

• Has the potential to greatly simplify checkpoint/restore of virtual machines

• What's next:

o Live-migration of guests

o User space per-guest heaps in a different host OS

Page 22: Checkpoint/Restore in the Palacios Virtual Machine Monitor

Questions?