
VERIFYING THE MICROSOFT HYPER-V HYPERVISOR WITH VCC

Stephan Tobies

joint work with the colleagues from Microsoft and Verisoft XT

HYPERVISOR VERIFICATION PROJECT (2007-2010)

• European Microsoft Innovation Center

• German Research Center for Artificial Intelligence

• Microsoft Research

• Microsoft’s Windows Div.

• Saarland University

co-funded by the German Ministry of Education and Research http://www.verisoftxt.de


MICROSOFT HYPER-V HYPERVISOR

• Thin layer of software between hardware and OS

• Turns an x64 multi-processor machine with virtualization extensions into a set of virtual multi-processor x64 machines

– Without virtualization extensions, but

– With an additional level of address translation

– With additional instructions (Hypercalls)

• Ships since March 2009

HYPERVISOR VERIFICATION: WHY?

• Industrial software with tractable code size (ca. 100 000 lines of C, 5 000 lines of x64 asm)

• Prototypical code

• Correctness is important

• Testing and debugging are difficult

• Complex implementation, simple specification

• Self-contained

HYPERVISOR VERIFICATION: GOALS

• Functional correctness

– Correct virtualization

– Requires: memory safety, race freedom, etc.

• Code level verification for full blown C

– Not just abstract algorithms

• Integrate into existing development process

– Keep verification artifacts close to the code

– Annotations should be comprehensible and maintainable by a suitably trained developer

HYPERVISOR VERIFICATION: CHALLENGES

• Fixed code base

– No code changes just to simplify verification

• Concurrency

– Lock protected and lock-free

– Guests and hardware (TLB, APIC) also execute concurrently

• Assembly code and realistic compiler model

– Need to model ABI

HYPERVISOR VERIFICATION: CHALLENGES

• Realistic x64 hardware model

– Caches, TLBs, store buffers

– Weak memory model

• Explicit management of virtual memory

– HV runs in translated mode

– and maintains its own page tables

HV CORRECTNESS

A guest operating system cannot distinguish (with some exceptions) whether a machine instruction is executed

a) directly on the hardware, or

b) through the HV

[Figure: an OS running directly on hardware, shown next to several OSes that each run on virtual hardware provided by the hypervisor on top of the hardware]

SIMULATION PROOFS

• Typically

– Add external, existentially quantified variable representing the simulated state

– Good for abstract programs (e.g., transition relations)

• Here

– State updates scattered throughout the codebase

– Keep updates of simulated state close to updates in code

– Represent abstract state as ghost code

• Explicit updates to ghost state provide existential witnesses
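As a rough illustration of keeping the simulated state next to the code, the sketch below pairs a concrete update with an explicit ghost update and checks a coupling invariant around it. All names (`struct vcpu`, `struct ghost_model`, `coupled`, `vcpu_tick`) are hypothetical and not taken from the Hyper-V sources. The run-time `assert` calls are only a stand-in: in VCC the ghost state is erased at compile time and the invariant is proved statically.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical abstract (ghost) model: the tick count as the guest sees it. */
struct ghost_model {
    uint64_t ticks;
};

/* Hypothetical implementation state: the count is stored as two halves. */
struct vcpu {
    uint32_t ticks_lo;
    uint32_t ticks_hi;
    struct ghost_model ghost;   /* ghost data kept beside the real fields */
};

/* Coupling invariant: the ghost model equals the abstraction of the
   implementation state. */
static int coupled(const struct vcpu *v)
{
    uint64_t impl = ((uint64_t)v->ticks_hi << 32) | v->ticks_lo;
    return v->ghost.ticks == impl;
}

static void vcpu_tick(struct vcpu *v)
{
    assert(coupled(v));

    /* Concrete update, as it would appear in the code base. */
    v->ticks_lo++;
    if (v->ticks_lo == 0)
        v->ticks_hi++;

    /* Ghost update placed right next to it: this is the explicit witness
       for the new simulated state. */
    v->ghost.ticks++;

    assert(coupled(v));
}
```

The point of the design is locality: whoever changes the concrete update sees the ghost update in the same function and has to keep the two in step.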

HARDWARE MODEL

Virtual Hardware ≈ Hardware ‐ Virtualization extension + Hypercalls + Hypervisor MSRs + Synic (extended APIC) + Shared Memory

[Figure: real hardware (several x64 processors sharing memory) shown next to virtual hardware (several HV x64 virtual processors sharing memory, extended with hypercalls)]

Realistic model of the x64 architecture including caches, TLB, APIC

SIMULATION RELATION

• Simulation Relation

– Single state coupling invariant links abstract and implementation state

– Two state invariant expresses legal transitions of top level model

[Figure: the hypervisor implementation (hypervisor, x64 processors, memory) is linked by the coupling invariant to the top level model (ghost data), whose transitions are constrained by the 2-state invariant. The top level model consists of a root partition and child partitions, each with HV x64 virtual processors, memory, and hypercalls.]

CHALLENGES FOR VERIFICATION OF CONCURRENT C

1. Memory model that is adequate and efficient to reason about

2. Modular reasoning about concurrent code

3. Invariants for (large and complex) C data structures

4. Huge verification conditions to be proven automatically

5. “Live” specifications that evolve with the code

MICROSOFT VCC – VERIFICATION OF CONCURRENT C

• Source Language

– ANSI C +

– Design-by-Contract Annotations +

– Ghost state +

– Theories +

– Metadata Annotations

• Program Logic

– Dijkstra’s weakest preconditions

• Automatic Verification

– verification condition generation (VCG)

– automatic theorem proving (SMT)

VCC TOOLS: SIMPLIFIED VIEW

1. Annotated C program, passed to the VCC tool:

struct C { int z; invariant(z >= 0) }
void F(C* this, int a)
  requires(invariant(this) && a > 0)
  ensures(invariant(this))
{ this->z = 100 / a; }

2. VCC generates a Boogie PL program, passed to the Boogie tool:

assume(select(M,this,z) >= 0 && a > 0);
assert(a != 0);
M := store(M,this,z,100/a);
assert(select(M,this,z) >= 0)

3. Boogie generates a verification condition, passed to Z3, which returns the verdict:

(select(M,this,z) ≥ 0 ⋀ a > 0) ⇒ (a ≠ 0 ⋀ select(store(M, this, z, 100/a), this, z) ≥ 0)
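To make the shape of the verification condition concrete, the following sketch evaluates it as an ordinary boolean function over a toy heap. The names `struct heap`, `select_z`, `store_z`, `vc_holds` and `vc_holds_on_samples` are hypothetical stand-ins for the map `M` and the `select`/`store` operations. The sketch only samples a range of values for `a`, whereas Z3 establishes the formula for all values.

```c
#include <assert.h>

/* Toy heap: one int field z per object index (stand-in for the map M). */
#define HEAP_SIZE 4
struct heap {
    int z[HEAP_SIZE];
};

static int select_z(const struct heap *m, int obj)
{
    return m->z[obj];
}

/* store returns a new heap; the argument m is a copy and the caller's
   heap is left untouched. */
static struct heap store_z(struct heap m, int obj, int value)
{
    m.z[obj] = value;
    return m;
}

/* (select(M,this,z) >= 0 && a > 0)
     ==> (a != 0 && select(store(M,this,z,100/a),this,z) >= 0) */
static int vc_holds(struct heap m, int self, int a)
{
    int pre = select_z(&m, self) >= 0 && a > 0;
    if (!pre)
        return 1;               /* implication is trivially true */
    if (a == 0)
        return 0;               /* division check would fail */
    struct heap m2 = store_z(m, self, 100 / a);
    return select_z(&m2, self) >= 0;
}

/* Check the condition for a sample of inputs. */
static int vc_holds_on_samples(void)
{
    struct heap m = { { 0, 5, 0, 0 } };
    for (int a = -50; a <= 200; a++)
        if (!vc_holds(m, 1, a))
            return 0;
    return 1;
}
```

The precondition `a > 0` is what discharges both obligations: it rules out division by zero and keeps `100 / a` non-negative, so the invariant `z >= 0` is re-established after the store.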

VCC: TAKE TYPES SERIOUSLY

• pointers = pairs of memory address and type

• maintain the set of currently valid pointers

– check validity at every access

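struct A {
  int x;
  int y;
};

struct B {
  struct A a;
  int z;
};

[Figure: memory layout of a struct B at address 42 with its typed pointers: ⟨42, B⟩ for the struct itself, ⟨42, A⟩ for the field a, ⟨42, int⟩ for x, ⟨46, int⟩ for y, and ⟨50, int⟩ for z]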
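The typed pointers of a `struct B` can be enumerated with `offsetof`, as in the sketch below. The `struct typed_ptr` type and both helper functions are hypothetical illustrations, not VCC's internal representation. With a 4-byte `int` and no padding, a base address of 42 yields the addresses 42, 46 and 50 for the three `int` fields.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct A { int x; int y; };
struct B { struct A a; int z; };

/* A typed pointer: an address paired with a type name. */
struct typed_ptr {
    size_t addr;
    const char *type;
};

/* Fill out[0..4] with the typed pointers of a struct B placed at base. */
static void typed_pointers_of_B(size_t base, struct typed_ptr out[5])
{
    size_t a = base + offsetof(struct B, a);

    out[0] = (struct typed_ptr){ base, "B" };
    out[1] = (struct typed_ptr){ a, "A" };
    out[2] = (struct typed_ptr){ a + offsetof(struct A, x), "int" };
    out[3] = (struct typed_ptr){ a + offsetof(struct A, y), "int" };
    out[4] = (struct typed_ptr){ base + offsetof(struct B, z), "int" };
}

/* Two typed pointers are equal only if both address and type agree. */
static int typed_ptr_eq(struct typed_ptr p, struct typed_ptr q)
{
    return p.addr == q.addr && strcmp(p.type, q.type) == 0;
}
```

Three distinct typed pointers share the address of the struct itself, which is why the address alone is not enough to identify an object in this memory model.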

OBJECT INVARIANTS

Invariants are predicates describing consistent states of a struct or union

struct S { int a, b; invariant(b > a) }

For safety reasoning, we basically want to prove a bunch of invariants. But

• When does an invariant hold? (e.g. it can’t hold on initialization)

– invariants hold when objects are closed

– invariants might hold when objects are open (aka mutable)

• What can an invariant reference? (an invariant can span multiple objects)

– Anything as long as there are no dependency loops

• What stops object update from breaking an invariant of another object?

– invariants must be admissible

INVARIANTS, OWNERSHIP, FRAMING (INFLUENCED BY SPEC#)

• Each closed object has a unique owner object; open objects are owned by a thread

• A thread can open (unwrap) / close (wrap) objects it owns

– Unwrapping/wrapping transfers ownership to/from the thread

– Unwrapping/wrapping assumes/asserts the invariant

• Writing to the root of an ownership tree gives permission to write anything in its domain

[Figure: ownership hierarchy. Legend: open object, modification allowed; closed object, invariant holds; system invariant; hierarchical opening and closing using unwrap/wrap]
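The unwrap/modify/wrap discipline can be mimicked at run time, as in the sketch below. It reuses the `struct S` example with the invariant `b > a`; the `closed` and `owner` fields and the functions `unwrap_S`, `wrap_S` and `shift_S` are hypothetical. The sketch tracks a single object only and leaves out the transfer of owned child objects. In VCC the corresponding state is ghost and the checks are proved statically rather than executed.

```c
#include <assert.h>

struct S {
    int a, b;            /* invariant: b > a */
    int closed;          /* hypothetical ghost flag */
    const void *owner;   /* hypothetical ghost owner (a thread or an object) */
};

static int invariant_S(const struct S *s)
{
    return s->b > s->a;
}

/* unwrap: only the owner may open a closed object. The invariant is known
   to hold at this point; the assert is a run-time stand-in for "assume". */
static void unwrap_S(struct S *s, const void *thread)
{
    assert(s->closed && s->owner == thread);
    assert(invariant_S(s));
    s->closed = 0;
}

/* wrap: the invariant must have been re-established before closing. */
static void wrap_S(struct S *s, const void *thread)
{
    assert(!s->closed && s->owner == thread);
    assert(invariant_S(s));
    s->closed = 1;
}

/* Fields are written only while the object is open. */
static void shift_S(struct S *s, const void *thread, int delta)
{
    unwrap_S(s, thread);
    s->a += delta;       /* the invariant may be broken here ... */
    s->b += delta;       /* ... and is restored here */
    wrap_S(s, thread);
}
```

Between `unwrap_S` and `wrap_S` the object may pass through states that violate `b > a`; nobody else is allowed to rely on the invariant during that window because the object is open.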

GOALS OF CONCURRENCY VERIFICATION

modular verification of

1. lock protected access to concurrent data

– ownership transfer to and from current thread

– properties of locks known in context of use

2. lock free access to volatile data, e.g., implementing concurrency primitives

– atomic access to volatile data

– control possible updates of threads to volatile data

CONCURRENT ACCESS: VOLATILE

• invariants are the only place to guarantee properties of volatile data

• invariants hold only for closed objects:

– allow modification of closed objects

– but only to fields marked with volatile

• two-state invariants, for two consecutive states of the machine, constrain how volatile data evolves
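A two-state invariant relates the state before an update to the state after it. The sketch below uses a hypothetical monotonic counter (`struct counter`, `two_state_inv`, `try_update`) and checks the relation dynamically for each proposed write. It is single-threaded and not actually atomic; VCC instead proves statically that every atomic update in the code satisfies the relation.

```c
#include <assert.h>

struct counter {
    volatile unsigned value;   /* may change while the object stays closed */
};

/* Two-state invariant over consecutive states: the value never decreases. */
static int two_state_inv(unsigned old_value, unsigned new_value)
{
    return new_value >= old_value;
}

/* Stand-in for an atomic update: performs the write and returns 1 if the
   transition is legal, otherwise leaves the counter alone and returns 0. */
static int try_update(struct counter *c, unsigned new_value)
{
    unsigned old_value = c->value;

    if (!two_state_inv(old_value, new_value))
        return 0;
    c->value = new_value;
    return 1;
}
```

Because every update respects the relation, a thread that has read the value 5 can rely on later reads returning at least 5, even though it does not own the counter.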

KEEPING OBJECTS CLOSED: CLAIMS (INFLUENCED BY SEPARATION-LOGIC PERMISSIONS)

Example: acquiring a spinlock

– requires that the spinlock stays closed during the function call, not just at call time

But how can a thread assert that an object is closed if it doesn’t own it?

– use the invariant of an object that you do own!

A claim

• is a first class object, which references other objects (e.g. locks)

• can be owned and used by arbitrary threads or objects

• can state a property (e.g. lock stays closed while closed handles exist)

• is just syntactic sugar

Example cont’d: Pass pointers to claims as ghost arguments to functions. They

• serve as stronger preconditions, which can constrain volatile shared state

• hold until the claim is destroyed
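One way to picture a claim is as a counted handle that keeps the referenced object closed for as long as the handle is alive. The sketch below models that idea at run time. Every name in it (`claim_count`, `claim_create`, `claim_destroy`, `spinlock_try_open`, `spinlock_try_acquire`) is hypothetical, the counting scheme is an assumption made for illustration, and the code is single-threaded; a real lock would use an atomic exchange, and VCC would treat the claim as a ghost argument.

```c
#include <assert.h>

struct spinlock {
    volatile int locked;   /* the real field */
    int closed;            /* hypothetical ghost flag: invariant in force */
    int claim_count;       /* hypothetical ghost count of live claims */
};

struct claim {
    struct spinlock *lock;
    int live;
};

/* Creating a claim on a closed lock records one more live claim. */
static struct claim claim_create(struct spinlock *l)
{
    assert(l->closed);
    l->claim_count++;
    struct claim c = { l, 1 };
    return c;
}

static void claim_destroy(struct claim *c)
{
    assert(c->live);
    c->lock->claim_count--;
    c->live = 0;
}

/* The lock can only be opened once no claims are left. */
static int spinlock_try_open(struct spinlock *l)
{
    if (l->claim_count > 0)
        return 0;
    l->closed = 0;
    return 1;
}

/* The claim is the stronger precondition: while it is live, the lock is
   guaranteed to stay closed for the whole call. */
static int spinlock_try_acquire(struct spinlock *l, const struct claim *c)
{
    assert(c->live && c->lock == l);
    assert(l->closed);
    if (l->locked)
        return 0;
    l->locked = 1;
    return 1;
}
```

The caller of `spinlock_try_acquire` does not own the lock; it owns the claim, and the claim is what lets it conclude that the lock is still closed.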

METHODOLOGY IS NOT ENOUGH

Real code is large and complex! Most verification attempts fail!

A practically useful verification tool supports

• concise specification of relevant properties

– methodology and annotation language

• meaningful feedback for failed proof attempts

– error model mapped back to source code

– profile of proof search when hitting resource constraints

– live monitoring of prover work for long running proofs

• acceptable turnaround times for verify&fix-cycles

PARTITION OWNERSHIP

[Figure: partition ownership hierarchy. Labels: 33 groups; the root of the ownership tree; rundowns; locks; claims; fields; sequential access; concurrent access]

VCC WORKFLOW

[Figure: workflow diagram. Annotate C code; compile with regular C compiler (executable); verify with VCC. Outcome "Error": analyze counterexample with Model Viewer. Outcome "Timeout": analyze Z3 log with Z3 Axiom Profiler. Outcome "Verified". Further steps: monitor proof search with Z3 Inspector; fix code or specs with VCC VS plugin. Annotations on the diagram: "most of the time" and "need fast turnaround".]

PERFORMANCE, PERFORMANCE, PERFORMANCE

Experience from the Hyper-V verification

• successful verifications:

– typical: 0.5–500s, average 25s

– current max: 2 000s

– all time max: 50 000s (down to 1 000s with Z3v2)

• acceptable time for interactive work: < 30s

• failing proof attempts often take much longer than the finally successful verification

VCC PERFORMANCE TRENDS NOV 08 – MAR 09

[Figure: performance trend chart with a logarithmic axis from 0.1 to 1000, annotated with the events: attempt to improve Boogie/Z3 interaction; modification in invariant checking; switch to Boogie2; switch to Z3 v2; Z3 v2 update]

HV VERIFICATION ENGINEERING STATUS

• Annotation approx. >> 1 spec line per 1 code line

• Verifying approx. 3 functions per day; done in a year

• Verified as of today approx.

– 4 500 lines of assembler (e.g., most hardware access)

– 30 000 lines of C (e.g., all primitives)

– invariants of global data structures in place

• Uncovered some concurrency and design bugs that

– were not discovered in testing and

– are virtually impossible to trace back from field failure data

VCC TOOLS ARE AVAILABLE

• Source Language

– ANSI C + Design-by-Contract + Meta-Information

– Compiler: http://vcc.codeplex.com

• Program Logic

– Verification condition generation (VCG) via Boogie

– Boogie: http://boogie.codeplex.com

• Automatic Verification

– Automatic theorem proving (SMT) via Z3

– Z3: http://research.microsoft.com/projects/z3/download.html

And everyone in the project: Artem Alekhin, Eyad Alkassar, Mike Barnett, Nikolaj Bjørner, Sebastian Bogan, Sascha Böhme, Matko Botinčan, Vladimir Boyarinov, Ernie Cohen, Markus Dahlweid, Ulan Degenbaev, Lieven Desmet, Sebastian Fillinger, Mark Hillebrand, Tom In der Rieden, Bruno Langenstein, K. Rustan M. Leino, Wolfgang Manousek, Stefan Maus, Michał Moskal, Leonardo de Moura, Andreas Nonnengart, Steven Obua, Wolfgang Paul, Hristo Pentchev, Elena Petrova, Norbert Schirmer, Sabine Schmaltz, Wolfram Schulte, Peter-Michael Seidel, Andrey Shadrin, Stephan Tobies, Alexandra Tsyban, Sergey Tverdyshev, Herman Venter, and Burkhart Wolff.
