Our work on virtualization
Chen Haogang, Wang Xiaolin
{hchen, wxl}@pku.edu.cn
Institute of Network and Information Systems
School of Electrical Engineering and Computer Science
Peking University
2008.11
http://ncis.pku.edu.cn
Agenda
- Work at PKU
- Remote paging for VM
- Transparent paravirtualization
- Virtual resource sharing
- Cache management in Multi-Core and Virtualization environment
REMOCA: Hypervisor Remote Disk Cache
Motivation: improve paging performance for memory-intensive or I/O-intensive workloads by utilizing free memory on another physical machine.
Solution: the remote memory acts as a storage cache between a VM's virtual memory and its virtual disk devices. In most cases, network latency is 1 to 2 orders of magnitude lower than disk latency.
REMOCA: The design of REMOCA
- Local module: a ghost buffer; REMOCA is an exclusive cache
- Remote module: the memory service
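The exclusive-cache behavior can be illustrated with a minimal sketch of the remote module's read/evict path. This is a toy stand-in, not REMOCA's implementation: `remote_cache`, `remote_fetch`, and `remote_put` are hypothetical names, and the linear-scan table replaces whatever indexing the real memory service uses.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE   4096
#define REMOTE_SLOTS 64
#define EMPTY        UINT64_MAX

/* Toy stand-in for the remote memory service: a table of cached blocks. */
struct remote_cache {
    uint64_t tag[REMOTE_SLOTS];           /* disk block number, EMPTY if free */
    char     data[REMOTE_SLOTS][BLOCK_SIZE];
};

void remote_init(struct remote_cache *rc)
{
    for (int i = 0; i < REMOTE_SLOTS; i++)
        rc->tag[i] = EMPTY;
}

/* Look up a block; on a hit, copy it out and invalidate the slot so the
 * cache stays exclusive of the VM's own memory. */
bool remote_fetch(struct remote_cache *rc, uint64_t blk, char *out)
{
    for (int i = 0; i < REMOTE_SLOTS; i++) {
        if (rc->tag[i] == blk) {
            memcpy(out, rc->data[i], BLOCK_SIZE);
            rc->tag[i] = EMPTY;   /* exclusivity: the VM now holds the block */
            return true;          /* hit: a network round trip replaces a disk read */
        }
    }
    return false;                 /* miss: the caller falls back to the disk */
}

/* When the VM evicts a block, push it to the remote cache. */
void remote_put(struct remote_cache *rc, uint64_t blk, const char *data)
{
    for (int i = 0; i < REMOTE_SLOTS; i++) {
        if (rc->tag[i] == EMPTY) {
            rc->tag[i] = blk;
            memcpy(rc->data[i], data, BLOCK_SIZE);
            return;
        }
    }
    /* Cache full: a real implementation would evict a victim block. */
}
```

Exclusivity is the key design choice: since a fetched block immediately lives in the VM's memory, keeping a second remote copy would waste the donated memory.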
Summary of REMOCA
REMOCA can efficiently alleviate the impact of thrashing behavior, and also significantly improves performance for real-world I/O-intensive applications.
Future work:
- Cluster-wide memory balancing
- Predicting the miss ratio before allocating
[Evaluation figure: remote cache size 768 MB]
Agenda
- Work at PKU
- Remote paging for VM
- Transparent paravirtualization
- Virtual resource sharing
- Cache management in Multi-Core and Virtualization environment
Transparent paravirtualization
Some limitations of current hardware-assisted virtualization:
- Too many VM exits incur significant overhead.
- Most VM exits are related to page faults or I/O operations.
Reason and count of VM exits

KVM:
                page fault   (%)     I/O         (%)     total
Kernel Compile  7459199      83.36   768433      8.59    8948235
SpecJBB         329286       2.99    6560472     59.57   11012466
SpecCPU         34215800     18.69   102468771   55.98   183041637
SpecWeb         5668822      15.50   14921351    40.79   36584589

Xen (HVM):
                page fault   (%)     I/O         (%)     total
Kernel Compile  11219395     81.23   1247207     9.03    13812089
SpecJBB         352520       4.22    5000529     59.89   8350197
SpecCPU         113793259    50.41   56215828    24.90   225727994
SpecWeb         11835408     31.46   12428086    33.04   37618375
Top 10 trap instructions with high VM-exit frequency in KVM-54

Rank   Kernel Compile   SpecJBB       SpecCPU       SpecWeb
1      26.21% (pf)      32.48% (io)   12.54% (pf)   7.65% (io)
2      26.07% (pf)      32.47% (io)   9.02% (io)    7.46% (io)
3      3.71% (pf)       32.47% (io)   8.47% (io)    7.46% (io)
4      2.44% (rd cr)    0.31% (io)    8.17% (ot)    7.46% (io)
5      2.44% (rd cr)    0.31% (pf)    8.01% (io)    6.08% (rd cr)
6      2.16% (pf)       0.31% (ot)    7.44% (io)    5.77% (io)
7      1.91% (pf)       0.31% (io)    7.44% (io)    5.71% (ot)
8      1.14% (pf)       0.31% (io)    7.44% (io)    5.67% (hlt)
9      0.99% (io)       0.31% (io)    4.54% (io)    5.53% (clts)
10     0.99% (io)       0.18% (ot)    2.07% (ot)    5.50% (rd cr)
Total  68.06%           99.46%        75.14%        64.30%
(io: I/O operation, pf: page fault, rd cr: read control registers, clts and hlt: x86 instructions, ot: others)
Hot instruction detection and translation
How to reduce VM exits?
- Paravirtualization: Xen and KVM apply paravirtualization to improve performance, but the need to modify guest source code limits its applicability.
- Transparent paravirtualization:
  - Detecting hot instructions: an efficient mechanism catches 97% of VM exits with just the top 64 instructions
  - Replacing hot instructions: new, possibly complex assistant mechanisms must be introduced into the VMM to make the replacement safe and feasible
  - Implanting the replaced instructions into the guest OS: adaptive code implantation
- Implementation in KVM
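The detection step can be sketched as a per-site exit counter: tally VM exits by the guest instruction pointer and pick the hottest sites for translation. This is a toy model, not KVM's code; `record_exit`, `hottest`, and the fixed-size table are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_SITES 256

struct exit_site {
    uint64_t rip;    /* guest instruction pointer that caused the exit */
    uint64_t count;  /* number of exits observed at this site */
};

static struct exit_site sites[MAX_SITES];
static size_t nsites;

/* Called from the VM-exit handler with the faulting guest RIP. */
void record_exit(uint64_t rip)
{
    for (size_t i = 0; i < nsites; i++) {
        if (sites[i].rip == rip) {
            sites[i].count++;
            return;
        }
    }
    if (nsites < MAX_SITES)
        sites[nsites++] = (struct exit_site){ rip, 1 };
}

/* Return the RIP with the most exits so far (0 if none recorded).
 * A real detector would keep the top 64 sites, since per the slides
 * those already cover ~97% of exits, and translate only those. */
uint64_t hottest(void)
{
    uint64_t best_rip = 0, best = 0;
    for (size_t i = 0; i < nsites; i++) {
        if (sites[i].count > best) {
            best = sites[i].count;
            best_rip = sites[i].rip;
        }
    }
    return best_rip;
}
```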
Implementation in KVM
[Architecture diagram: the guest OS traps to KVM in the host OS via VM exits; the HIK talks to KVM through IOCTL and, via the page directory and page table pages, implants functions into an Implanted Function Space in the guest.]

Hot Instruction Killer (HIK):
- Analyze hot instructions
- Trace call stacks
- Generate code fragments and new functions

TMP Engine:
- Turn TMP on/off
- Detect hot instructions
- Adaptive code implantation
Transparent Memory Paravirtualization
A new memory virtualization mechanism:
- Translates the guest OS page table so that guest virtual addresses map directly to host physical addresses.
- The translated guest page table, called the direct page table, is registered directly with the hardware MMU. A process using a direct page table is called a para-virtualized process.
- The guest OS is still given an independent view of its own physical address space for memory management: when the guest OS accesses the direct page table, it expects guest physical addresses rather than the host physical addresses actually stored there.
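The two views of a PTE can be modeled with a toy PFN translation. Everything here is hypothetical illustration, not the slides' implementation: the `p2m`/`m2p` arrays stand in for the VMM's guest-to-host mappings, and `recover_guest_pte` mimics what the recovery information makes possible when the guest reads the direct page table.

```c
#include <stdint.h>

#define NPAGES     16
#define PFN_SHIFT  12        /* 4 KB pages */
#define FLAGS_MASK 0xfffULL  /* low PTE bits: present, writable, etc. */

static uint64_t p2m[NPAGES]; /* guest PFN -> host PFN */
static uint64_t m2p[NPAGES]; /* host PFN -> guest PFN (recovery info) */

/* What the hardware MMU walks: a PTE holding a host PFN. */
uint64_t make_direct_pte(uint64_t guest_pte)
{
    uint64_t gfn = guest_pte >> PFN_SHIFT;
    return (p2m[gfn] << PFN_SHIFT) | (guest_pte & FLAGS_MASK);
}

/* What the guest must see when it reads the direct page table: the
 * VMM recovers the original guest PFN from the host PFN. */
uint64_t recover_guest_pte(uint64_t direct_pte)
{
    uint64_t hfn = direct_pte >> PFN_SHIFT;
    return (m2p[hfn] << PFN_SHIFT) | (direct_pte & FLAGS_MASK);
}
```

The round trip shows why guest accesses to the direct page table must be intercepted: the entry in hardware holds a host PFN, which is meaningless to the guest's memory management until it is translated back.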
Transparent Memory Paravirtualization
[Figure: the direct page table structure of the new memory virtualization mechanism. A page directory points to page table pages whose entries are para-virtualized PTEs (P-PTEs); a recovery table maps each P-PTE back to the original PTE.]
Future work
- TMP evaluation: impact on cache hits; comparison with EPT/NPT and shadow page tables; comparison with KVM para-MMU and Xen para-MMU
- Transparent MMU extension: Linux and Windows; emulate all guest OS page faults
- Beyond TMP: transparent para-I/O; other hot instructions
Limitations of Transparent Paravirtualization
- Security vs. performance
Agenda
- Work at PKU
- Remote paging for VM
- Transparent paravirtualization
- Virtual resource sharing
- Cache management in Multi-Core and Virtualization environment
Virtual resource sharing
Motivation
In a homogeneous environment, how do we achieve a high degree of resource sharing while preserving isolation?
Example: network classroom @ PKU Zhongzhi
- Teaching Windows, MS Office, or VC++ programming
- About 30 students per class
- Homogeneous OS, software, data, and application instances
[Diagram: multiple student terminals connected to a pair of servers]
Virtual resource sharing: limitations of current solutions
- Terminal server: poor isolation; we prefer to run a single OS per student
- VM live clone: cannot provide data persistence
- Content-based sharing: high scanning overhead; Difference Engine (OSDI '08) is unable to share during OS startup or application startup
Goal
- Fast startup of VMs and applications
- Accurate resource sharing
- Low management overhead

Virtual resource sharing
Solution: a bottom-up approach
- Start from disk sharing: map identical disk blocks to a single storage location
- Manage a shared disk cache within the VMM
- Replace disk reads with page remapping, enabling fast application startup
Challenges
- How to discover identical disk blocks? (CoW disk / CAS)
- How to handle sharable application data, especially the "zero pages"?
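Discovering identical blocks by content addressing can be sketched as: fingerprint each block and map equal fingerprints to one storage location. A minimal sketch with hypothetical names; the FNV-1a hash used here is only illustrative, since a real CAS store would use a cryptographic hash (plus a byte-wise compare) to rule out collisions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 4096

/* FNV-1a over the block contents; two identical blocks always get the
 * same fingerprint and can share one storage location. */
uint64_t block_fingerprint(const unsigned char *blk, size_t len)
{
    uint64_t h = 1469598103934665603ULL;   /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= blk[i];
        h *= 1099511628211ULL;             /* FNV prime */
    }
    return h;
}

/* Zero pages are extremely common during OS and application startup,
 * so they are worth a dedicated fast path: all of them can share a
 * single location without any hashing. */
bool is_zero_block(const unsigned char *blk, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (blk[i])
            return false;
    return true;
}
```

Once two virtual disk blocks resolve to the same fingerprint, a shared disk cache in the VMM needs only one copy, and guest reads can be served by remapping a page instead of issuing I/O.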
Agenda
- Work at PKU
- Remote paging for VM
- Transparent paravirtualization
- Virtual resource sharing
- Cache management in Multi-Core and Virtualization environment
Motivation
Current VMMs cannot make efficient use of the cache hierarchy on a multi-core platform.
Objectives
- Explore new compiling and profiling techniques to analyze and predict the memory access behavior of a program
- Implement cache-aware memory allocation and CPU scheduling in the VMM
- Dynamic memory balancing among VMs
Cache management in Multi-Core
Lower-level cache partitioning
- Avoid cache contention among concurrent VMs
- Uses the page-coloring technique: restrict the number of cache sets that a VM can use
- Transparent to the guest OS
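Page coloring works because in a physically indexed cache, the cache-set index bits above the page offset are determined by the physical page number: pages with the same "color" compete for the same sets. A minimal sketch; the cache geometry below (4 MB, 16-way, 64-byte lines, 4 KB pages) is an illustrative assumption, not a figure from the slides.

```c
#include <stdint.h>

#define PAGE_SHIFT 12                       /* 4 KB pages */
#define CACHE_SIZE (4u << 20)               /* 4 MB shared cache (assumed) */
#define WAYS       16
#define LINE       64

#define NSETS         (CACHE_SIZE / (WAYS * LINE))     /* 4096 sets */
#define SETS_PER_PAGE ((1u << PAGE_SHIFT) / LINE)      /* 64 sets per page */
#define NCOLORS       (NSETS / SETS_PER_PAGE)          /* 64 colors */

/* Two physical pages with the same color map to the same cache sets. */
unsigned page_color(uint64_t paddr)
{
    return (unsigned)((paddr >> PAGE_SHIFT) % NCOLORS);
}

/* Partitioning: a VM is only given pages whose color falls inside its
 * assigned band [base, base + span), so its cache footprint is bounded
 * without any guest OS involvement. */
int color_allowed(uint64_t paddr, unsigned base, unsigned span)
{
    unsigned c = page_color(paddr);
    return c >= base && c < base + span;
}
```

The VMM enforces this purely in its memory allocator, which is what makes the technique transparent to the guest OS.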
Cache management in Multi-Core
Challenges
- Predicting the performance impact on the application before partitioning
- Online profiling and dynamic re-partitioning
- Reducing page migration overhead
- Cooperating with VM scheduling, especially CPU allocation and migration
- New micro-architectures; example: Intel Nehalem has a 256 KB dedicated L2 per core and a shared L3
Cache management in Multi-Core