VERIFYING THE MICROSOFT HYPER-V HYPERVISOR WITH VCC
Stephan Tobies
joint work with the colleagues from Microsoft and Verisoft XT
HYPERVISOR VERIFICATION PROJECT (2007-2010)
• European Microsoft Innovation Center
• German Research Center for Artificial Intelligence
• Microsoft Research
• Microsoft’s Windows Div.
• Saarland University
co-funded by the German Ministry of Education and Research http://www.verisoftxt.de
MICROSOFT HYPER-V HYPERVISOR
• Thin layer of software between hardware and OS
• Turns an x64 multi-processor machine with virtualization extensions into a set of virtual multi-processor x64 machines
– Without virtualization extensions, but
– With an additional level of address translation
– With additional instructions (Hypercalls)
• Ships since March 2009
HYPERVISOR VERIFICATION: WHY?
• Industrial software with tractable code size (ca. 100 000 lines of C, 5 000 lines of x64 asm)
• Prototypical code
• Correctness is important
• Testing and debugging are difficult
• Complex implementation, simple specification
• Self-contained
HYPERVISOR VERIFICATION: GOALS
• Functional correctness
– Correct virtualization
– Requires: memory safety, race freedom, etc.
• Code level verification for full blown C
– Not just abstract algorithms
• Integrate into existing development process
– Keep verification artifacts close to the code
– Annotations should be comprehensible and maintainable by a suitably trained developer
HYPERVISOR VERIFICATION: CHALLENGES
• Fixed code base
– No code changes just to simplify verification
• Concurrency
– Lock protected and lock-free
– Guests and hardware (TLB, APIC) also execute concurrently
• Assembly code and realistic compiler model
– Need to model ABI
• Realistic x64 hardware model
– Caches, TLBs, store buffers
– Weak memory model
• Explicit management of virtual memory
– HV runs in translated mode
– and maintains its own page tables
HV CORRECTNESS
A guest operating system cannot distinguish (with some exceptions) whether a machine instruction is executed
a) directly on the hardware, or
b) through the HV
[Figure: an OS running directly on hardware, next to several OSes that each run on virtual hardware provided by the hypervisor on top of the hardware]
SIMULATION PROOFS
• Typically
– Add external, existentially quantified variable representing the simulated state
– Good for abstract programs (e.g., transition relations)
• Here
– State updates scattered throughout the codebase
– Keep updates of simulated state close to updates in code
– Represent abstract state as ghost code
• Explicit updates to ghost state provide existential witnesses
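The ghost-state idea above can be imitated in plain C with run-time checks. The following is a minimal sketch, not VCC syntax and not code from the hypervisor: the names ring_t, ghost_len and coupling_holds are hypothetical, and the ghost field is an ordinary field here, whereas a verifier would erase it from the compiled program.

```c
#include <assert.h>
#include <stdbool.h>

#define RING_CAP 8u

/* Implementation state: a ring buffer with free-running head/tail counters. */
typedef struct {
    unsigned head;            /* number of elements popped so far */
    unsigned tail;            /* number of elements pushed so far */
    int      slots[RING_CAP];
    unsigned ghost_len;       /* ghost (specification-only) abstract length */
} ring_t;

/* Coupling invariant: abstract length agrees with the concrete counters. */
static bool coupling_holds(const ring_t *r)
{
    return r->ghost_len <= RING_CAP && r->tail - r->head == r->ghost_len;
}

static void ring_init(ring_t *r)
{
    r->head = 0;
    r->tail = 0;
    r->ghost_len = 0;
}

static bool ring_push(ring_t *r, int v)
{
    assert(coupling_holds(r));
    if (r->tail - r->head == RING_CAP)
        return false;                     /* buffer full, no state change */
    r->slots[r->tail % RING_CAP] = v;
    r->tail++;
    r->ghost_len++;                       /* ghost update next to the concrete update */
    assert(coupling_holds(r));
    return true;
}

static bool ring_pop(ring_t *r, int *out)
{
    assert(coupling_holds(r));
    if (r->tail == r->head)
        return false;                     /* buffer empty, no state change */
    *out = r->slots[r->head % RING_CAP];
    r->head++;
    r->ghost_len--;                       /* ghost update next to the concrete update */
    assert(coupling_holds(r));
    return true;
}
```

Each explicit assignment to ghost_len plays the role of the existential witness: instead of asking a prover to find an abstract state that matches, the code states which one it is.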
HARDWARE MODEL
Virtual Hardware ≈ Hardware − Virtualization extension + Hypercalls + Hypervisor MSRs + SynIC (extended APIC) + Shared Memory
• Real hardware: realistic model of the x64 architecture including caches, TLB, APIC
[Figure: real hardware (several x64 processors sharing memory) next to virtual hardware (several HV x64 processors sharing memory, extended with hypercalls)]
SIMULATION RELATION
• Simulation Relation
– Single state coupling invariant links abstract and implementation state
– Two state invariant expresses legal transitions of top level model
[Figure: the hypervisor implementation (hypervisor running on x64 processors and memory) is linked by the coupling invariant to the top level model (ghost data). The top level model consists of a root partition and child partitions, each with HV x64 processors, memory and hypercalls, and is constrained by the 2-state invariant.]
CHALLENGES FOR VERIFICATION OF CONCURRENT C
1. Memory model that is adequate and efficient to reason about
2. Modular reasoning about concurrent code
3. Invariants for (large and complex) C data structures
4. Huge verification conditions to be proven automatically
5. “Live” specifications that evolve with the code
MICROSOFT VCC – VERIFICATION OF CONCURRENT C
• Source Language
– ANSI C +
– Design-by-Contract Annotations +
– Ghost state +
– Theories +
– Metadata Annotations
• Program Logic
– Dijkstra’s weakest preconditions
• Automatic Verification
– verification condition generation (VCG)
– automatic theorem proving (SMT)
VCC TOOLS: SIMPLIFIED VIEW
1. Annotated C program, passed to the VCC tool:
struct C {
  int z;
  invariant(z >= 0)
};
void F(struct C *this, int a)
  requires(invariant(this) && a > 0)
  ensures(invariant(this))
{
  this->z = 100 / a;
}
2. VCC generates a Boogie PL program, passed to the Boogie tool:
assume(select(M,this,z) >= 0 && a > 0);
assert(a != 0);
M := store(M,this,z,100/a);
assert(select(M,this,z) >= 0);
3. Boogie generates a verification condition, passed to Z3, which generates the verdict:
(select(M,this,z) ≥ 0 ⋀ a > 0) ⇒ (a ≠ 0 ⋀ select(store(M, this, z, 100/a), this, z) ≥ 0)
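To make the link between the contract and the generated assume/assert sequence concrete, here is a rough run-time analogue in standard C. It is only a sketch: invariant_C is a hypothetical helper, the parameter is renamed to self, and run-time assertions check a single execution, whereas VCC proves the condition for all inputs.

```c
#include <assert.h>

struct C {
    int z;                                /* intended invariant: z >= 0 */
};

/* Hypothetical helper that spells out the object invariant as a predicate. */
static int invariant_C(const struct C *c)
{
    return c->z >= 0;
}

static void F(struct C *self, int a)
{
    assert(invariant_C(self) && a > 0);   /* requires: assumed on entry */
    assert(a != 0);                       /* the division below is defined */
    self->z = 100 / a;                    /* corresponds to the store on M */
    assert(invariant_C(self));            /* ensures: checked on exit */
}
```

The first assertion corresponds to the Boogie assume, the remaining two to the Boogie asserts that end up on the right-hand side of the implication.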
VCC: TAKE TYPES SERIOUSLY
• pointers = pairs of memory address and type
• maintain the set of currently valid pointers
– check validity at every access
struct A {
int x;
int y;
};
struct B {
struct A a;
int z;
};
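[Figure: memory layout of a struct B at address 42. The typed pointers are ⟨42, B⟩ for the whole object, ⟨42, A⟩ for its field a, ⟨42, int⟩ for a.x, ⟨46, int⟩ for a.y, and ⟨50, int⟩ for z.]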
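The pointer pairs listed above share addresses but differ in type. A small sketch in standard C illustrates why the address alone does not identify an object; the typed_ptr struct, the type_tag enum and the helper functions are hypothetical names for illustration, not VCC's internal representation.

```c
#include <assert.h>
#include <stdint.h>

struct A { int x; int y; };
struct B { struct A a; int z; };

/* Hypothetical tags naming the types that occur in the example. */
typedef enum { T_INT, T_A, T_B } type_tag;

/* A typed pointer: a pair of memory address and type. */
typedef struct {
    uintptr_t addr;
    type_tag  type;
} typed_ptr;

/* Two typed pointers are equal only if address and type both match. */
static int typed_ptr_eq(typed_ptr p, typed_ptr q)
{
    return p.addr == q.addr && p.type == q.type;
}

static typed_ptr whole_B(struct B *b)
{
    return (typed_ptr){ (uintptr_t)b, T_B };
}

static typed_ptr field_a(struct B *b)
{
    return (typed_ptr){ (uintptr_t)&b->a, T_A };
}

static typed_ptr field_x(struct B *b)
{
    return (typed_ptr){ (uintptr_t)&b->a.x, T_INT };
}
```

Because the first member of a struct sits at offset zero, whole_B, field_a and field_x all report the same address, yet no two of them compare equal as typed pointers.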
OBJECT INVARIANTS
Invariants are predicates describing consistent states of a struct or union
struct S { int a, b; invariant(b > a) }
For safety reasoning, we basically want to prove a bunch of invariants. But
• When does an invariant hold? (e.g., it can’t hold on initialization)
– invariants hold when objects are closed
– invariants may or may not hold when objects are open (aka mutable)
• What can an invariant reference? (an invariant can span multiple objects)
– Anything as long as there are no dependency loops
• What stops an object update from breaking an invariant of another object?
– invariants must be admissible
INVARIANTS, OWNERSHIP, FRAMING (INFLUENCED BY SPEC#)
• Each closed object has a unique owner object; open objects are owned by a thread
• A thread can open (unwrap) / close (wrap) objects it owns
– Unwrapping/wrapping transfers ownership to/from the thread
– Unwrapping/wrapping assumes/asserts the invariant
• Writing to the root of an ownership tree gives permission to write anything in its domain
[Figure: ownership tree with open objects (modification allowed) and closed objects (invariant holds); system invariant: hierarchical opening and closing using unwrap/wrap]
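The unwrap/wrap discipline can be mimicked at run time. This is a minimal sketch under stated assumptions: the closed flag, inv_S, wrap_S, unwrap_S and shift_S are hypothetical names, ownership by a thread is not modelled, and the checks are dynamic rather than proved.

```c
#include <assert.h>
#include <stdbool.h>

struct S {
    int  a, b;       /* invariant: b > a */
    bool closed;     /* ghost-like flag: true while the invariant must hold */
};

static bool inv_S(const struct S *s)
{
    return s->b > s->a;
}

/* Opening an object: it must be closed, so its invariant can be relied on. */
static void unwrap_S(struct S *s)
{
    assert(s->closed && inv_S(s));
    s->closed = false;
}

/* Closing an object: the invariant has to be re-established first. */
static void wrap_S(struct S *s)
{
    assert(!s->closed);
    assert(inv_S(s));
    s->closed = true;
}

/* Moves both fields by d; between the two writes the invariant may be broken,
   which is harmless because the object is open at that point. */
static void shift_S(struct S *s, int d)
{
    unwrap_S(s);
    s->a += d;
    s->b += d;
    wrap_S(s);
}
```

A freshly initialized object starts out open and is closed by the first wrap_S, which matches the observation that an invariant cannot be expected to hold on initialization.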
GOALS OF CONCURRENCY VERIFICATION
Modular verification of
1. lock protected access to concurrent data
– ownership transfer to and from current thread
– properties of locks known in context of use
2. lock free access to volatile data, e.g., implementing concurrency primitives
– atomic access to volatile data
– control possible updates of threads to volatile data
CONCURRENT ACCESS: VOLATILE
• invariants are the only place to guarantee properties of volatile data
• invariants hold only for closed objects
– so allow modification of closed objects
– but only to fields marked with volatile
• two-state invariants, over two consecutive states of the machine, constrain how volatile data evolves
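A two-state invariant relates the value of volatile data before and after one step. The sketch below models this for a counter that may only grow. It is a single-threaded illustration with hypothetical names (struct counter, counter_step_ok, counter_try_set); real concurrent code would additionally need atomic operations, and VCC would check the step relation statically instead of at run time.

```c
#include <assert.h>
#include <stdbool.h>

struct counter {
    volatile unsigned value;   /* may be updated while the object stays closed */
};

/* Two-state invariant: across any single step the value never decreases. */
static bool counter_step_ok(unsigned old_value, unsigned new_value)
{
    return new_value >= old_value;
}

/* Performs one update only if it is a legal transition; returns whether
   the update was applied. */
static bool counter_try_set(struct counter *c, unsigned new_value)
{
    unsigned old_value = c->value;
    if (!counter_step_ok(old_value, new_value))
        return false;          /* illegal transition, state left unchanged */
    c->value = new_value;
    return true;
}
```

Any reader that observes value twice may then conclude that the second observation is at least as large as the first, without knowing which thread performed the updates in between.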
KEEPING OBJECTS CLOSED: CLAIMS (INFLUENCED BY SEPARATION-LOGIC PERMISSIONS)
Example: acquiring a spinlock
– requires that the spinlock stays closed during the function call, not just at call time
But how can a thread assert that an object is closed if it doesn’t own it?
– use the invariant of an object that you do own!
A claim
• is a first class object, which references other objects (e.g., locks)
• can be owned and used by arbitrary threads or objects
• can state a property (e.g., lock stays closed while closed handles exist)
• is just syntactic sugar
Example cont’d: Pass pointers to claims as ghost arguments to functions. They
• serve as stronger preconditions, which can constrain volatile shared state
• hold until the claim is destroyed
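The role of a claim can be pictured with a run-time model in which a lock counts the claims that reference it and refuses to be opened while that count is non-zero. This is only a sketch with hypothetical names (struct lock, struct claim, claim_create, claim_destroy, lock_try_open, lock_try_acquire); it is single-threaded, and in VCC claims are ghost objects with no run-time cost.

```c
#include <assert.h>
#include <stdbool.h>

struct lock {
    bool         closed;        /* ghost-like flag: invariant currently holds */
    unsigned     claim_count;   /* number of live claims on this lock */
    volatile int held;          /* 0 = free, 1 = taken */
};

struct claim {
    struct lock *target;        /* the object this claim keeps closed */
    bool         alive;
};

/* A claim can only be taken on a closed object. */
static struct claim claim_create(struct lock *l)
{
    assert(l->closed);
    l->claim_count++;
    return (struct claim){ l, true };
}

static void claim_destroy(struct claim *c)
{
    assert(c->alive);
    c->target->claim_count--;
    c->alive = false;
}

/* Opening is refused while any claim exists. */
static bool lock_try_open(struct lock *l)
{
    if (l->claim_count > 0)
        return false;
    l->closed = false;
    return true;
}

/* The claim argument acts as a stronger precondition: it guarantees that
   the lock is closed for the whole call, not just at call time. */
static bool lock_try_acquire(struct lock *l, const struct claim *c)
{
    assert(c->alive && c->target == l);
    assert(l->closed);          /* follows from the live claim */
    if (l->held)
        return false;
    l->held = 1;
    return true;
}
```

A caller that does not own the lock can still pass a claim it owns, which is how the precondition on the spinlock is discharged in the example from the slide.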
METHODOLOGY IS NOT ENOUGH
Real code is large and complex! Most verification attempts fail!
A practically useful verification tool supports
• concise specification of relevant properties
– methodology and annotation language
• meaningful feedback for failed proof attempts
– error model mapped back to source code
– profile of proof search when hitting resource constraints
– live monitoring of prover work for long running proofs
• acceptable turnaround times for verify&fix-cycles
PARTITION OWNERSHIP
[Figure: ownership hierarchy of a partition with 33 groups. Labels: the root of the ownership tree; rundowns; locks; claims; fields; sequential access vs. concurrent access]
VCC WORKFLOW
• Annotate C code
• Compile with regular C compiler → Executable
• Verify with VCC (monitor proof search with Z3 Inspector)
– Verified
– Error: analyze counterexample with Model Viewer
– Timeout: analyze Z3 log with Z3 Axiom Profiler
• Fix code or specs with VCC VS plugin, then verify again
Most of the time is spent in the verify and fix loop: need fast turnaround
PERFORMANCE, PERFORMANCE, PERFORMANCE
Experience from the Hyper-V verification
• successful verifications:
– typical: 0.5–500s, average 25s
– current max: 2 000s
– all time max: 50 000s (down to 1 000s with Z3v2)
• acceptable time for interactive work: < 30s
• failing proof attempts often take much longer than the finally successful verification
VCC PERFORMANCE TRENDS NOV 08 – MAR 09
[Figure: verification times from Nov 08 to Mar 09 on a logarithmic axis from 0.1 to 1000, annotated with: attempt to improve Boogie/Z3 interaction; modification in invariant checking; switch to Boogie2; switch to Z3 v2; Z3 v2 update]
HV VERIFICATION ENGINEERING STATUS
• Annotation approx >> 1 spec line per 1 code line
• Verifying approx 3 functions per day; done in a year
• Verified as of today approx
– 4 500 lines of assembler (e.g., most hardware access)
– 30 000 lines of C (e.g., all primitives)
– invariants of global data structures in place
• Uncovered some concurrency and design bugs that
– were not discovered in testing and
– are virtually impossible to trace back from field failure data
VCC TOOLS ARE AVAILABLE
• Source Language
– ANSI C + Design-by-Contract + Meta-Information
– Compiler: http://vcc.codeplex.com
• Program Logic
– Verification condition generation (VCG) via Boogie
– Boogie: http://boogie.codeplex.com
• Automatic Verification
– Automatic theorem proving (SMT) via Z3
– Z3: http://research.microsoft.com/projects/z3/download.html
And everyone in the project: Artem Alekhin, Eyad Alkassar, Mike Barnett, Nikolaj Bjørner, Sebastian Bogan, Sascha Böhme, Matko Botinčan, Vladimir Boyarinov, Ernie Cohen, Markus Dahlweid, Ulan Degenbaev, Lieven Desmet, Sebastian Fillinger, Mark Hillebrand, Tom In der Rieden, Bruno Langenstein, K. Rustan M. Leino, Wolfgang Manousek, Stefan Maus, Michał Moskal, Leonardo de Moura, Andreas Nonnengart, Steven Obua, Wolfgang Paul, Hristo Pentchev, Elena Petrova, Norbert Schirmer, Sabine Schmaltz, Wolfram Schulte, Peter-Michael Seidel, Andrey Shadrin, Stephan Tobies, Alexandra Tsyban, Sergey Tverdyshev, Herman Venter, and Burkhart Wolff.