Safe Programming of Asynchronous Interaction: Can we do it for real?
Shaz Qadeer
Research in Software Engineering
Microsoft Research
Asynchronous interaction
• Collection of state machines communicating asynchronously via message buffers
– distributed algorithms
– cloud infrastructure, services, and applications
– event-driven JavaScript/AJAX programs
– device drivers
– …
Challenging characteristics
• Decomposition of a logical task into pieces
• Temporally overlapped execution of tasks
• Failure tolerance is important
• Coordination via protocols
Safety-critical is so 20th century
• Software should just “work”
– as cloud computing becomes common
– as devices get embedded into everyday life
• First-order concerns
– software reliability
– programming, testing, and debugging productivity
– cost of achieving reliability and productivity
• Need programming techniques to improve reliability and productivity
Outline
• Formal design of USB device driver stack in Windows 8
• Challenges (or inspiration) for the future
• Domain-specific language, compiler, and verifier for protocol programming
What is USB?
• Universal Serial Bus
• Primary mechanism for connecting peripherals to PCs
– 2 billion USB devices sold every year (as of 2008)
– voted most important PC innovation of all time (PC Magazine)
• Timeline: USB 1.0 (1996), USB 2.0 (2000), USB 3.0 (2008)
USB device driver stack in Win8
[Figure: the USB device driver stack, drawn between "OS, drivers" and "Hardware". It contains one HSM, several PSMs, and several DSMs.]
Design methodology (Aull-Gupta)
[Figure: design methodology. A state machine drawn in Visio is converted by scripts into two forms: state table, transitions, and state entry functions in C (run by a state machine engine in C, together with operations in C), and state table, transitions, and state entry functions in Zing (run by a state machine engine in Zing). Documented operations, rules, and assumptions are programmed as operations, rules, and assumptions in Zing.]
Assumptions/Guarantees
• Upon calling TimerStart(), machine could receive TimerFired event
– S1, S2, and S3 need to handle TimerFired
• Upon receiving TimerFired, machine will not receive TimerFired again
– S4 does not need to handle TimerFired
[Figure: states S1 to S4. S1 calls TimerStart(); transitions are labeled X, Y, and TimerFired.]
Timer state machine
[Figure: Timer state machine. States and their entry functions:
– WaitingForCommand: EmptyFunction()
– StartingTimer: UsbTimerStart()
– WaitingForTimerToExpire: EmptyFunction()
– StoppingTimer: UsbTimerStop()
– SignallingTimerCompletion: SignalTimerCompletion()
– WaitingForTimerToFlushOnStop: EmptyFunction()
Transitions are labeled with the events StartTimer, StopTimer, TimerFired, OperationSuccess, and OperationFailure.]
Zing error trace
Check failed
*******************************************************************************
Send(chan='Microsoft.Zing.Application+___EVENT_CHAN(12)', data='___StartTimer')
Receive(chan='Microsoft.Zing.Application+___EVENT_CHAN(12', data='___StartTimer')
AttributeEvent: Handled Event ___StartTimer, Old State: ___WaitingForCommand, New State: ___StartingTimer
Send(chan='Microsoft.Zing.Application+___EVENT_CHAN(12)', data='___TimerFired')
Send(chan='Microsoft.Zing.Application+___EVENT_CHAN(12)', data='___StopTimer')
AttributeEvent: Handled Event ___OperationSuccess, Old State: ___StartingTimer, New State: ___WaitingForTimerToExpire
Receive(chan='Microsoft.Zing.Application+___EVENT_CHAN(12', data='___TimerFired')
AttributeEvent: Handled Event ___TimerFired, Old State: ___WaitingForTimerToExpire, New State: ___SignallingTimerCompletion
AttributeEvent: Handled Event ___OperationSuccess, Old State: ___SignallingTimerCompletion, New State: ___WaitingForCommand
Receive(chan='Microsoft.Zing.Application+___EVENT_CHAN(12', data='___StopTimer')
AttributeEvent: HSM-1: Unhandled Event ___StopTimer, State ___WaitingForCommand]
Error in state:
Zing Assertion failed:
Expression: false
Comment: Unhandled Event
Depth on error 208
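The failure in the trace can be replayed with a small table-driven machine. This is only a sketch in Python, not the Zing model: the table contains just the four transitions that the trace itself reports, OperationSuccess is treated as an ordinary event for simplicity, and the names `TRANSITIONS`, `UnhandledEvent`, and `replay` are hypothetical.

```python
# Transition table limited to the transitions that appear in the error trace;
# the real Timer machine has more states and transitions.
TRANSITIONS = {
    ("WaitingForCommand", "StartTimer"): "StartingTimer",
    ("StartingTimer", "OperationSuccess"): "WaitingForTimerToExpire",
    ("WaitingForTimerToExpire", "TimerFired"): "SignallingTimerCompletion",
    ("SignallingTimerCompletion", "OperationSuccess"): "WaitingForCommand",
}


class UnhandledEvent(Exception):
    """Raised when the current state has no transition for an event."""


def replay(events, state="WaitingForCommand"):
    """Feed events to the table one by one and return the visited states."""
    visited = [state]
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise UnhandledEvent(f"Unhandled Event {event}, State {state}")
        state = TRANSITIONS[key]
        visited.append(state)
    return visited


# Order in which the machine processes events in the trace: StopTimer was sent
# while the timer was running, but it is received only after the machine has
# already returned to WaitingForCommand.
TRACE = ["StartTimer", "OperationSuccess", "TimerFired",
         "OperationSuccess", "StopTimer"]

try:
    replay(TRACE)
except UnhandledEvent as err:
    print(err)  # Unhandled Event StopTimer, State WaitingForCommand
```

The point of the replay is that no single step looks wrong; the error only shows up for one particular interleaving of TimerFired and StopTimer, which is the kind of schedule a model checker enumerates and a test rarely hits.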
Impact
• Unprecedented use of formal design in Windows
• Model is the Source
• Over 200 rules to catch regression bugs even before C code is compiled
• Over 300 bugs found and fixed
– unhandled messages, property violations
State machine   # states   # transitions   # bugs
HSM                  196             361       90
PSM 3.0              295             752       12
PSM 2.0              457            1386       97
DSM                 1919            4238      120
Benefits
• Model verification complements testing
– validates states that are hard to reach with testing
– debugging is significantly easier
• Explicit specification of contracts
– solid design
– better documentation and maintenance
Difficulties faced by programmers
• Visio inadequate container for state diagrams
• Semantics of modeling language embedded inside scripts
• No automation for managing properties, models, and lemmas
From modeling to programming
• State machine models are programs in a domain-specific language (DSL)
• Develop a modern programming environment for a DSL inspired by state machines
– Simple syntax/semantics for programs and properties
– Code generator and runtime library for execution
– Verifier for property checking
Ping Pong
machine Ping receives pong {
  var x: Pong

  state
    ( start, x := new Pong(y = this); raise unit )
    ( ping1, send(x, ping); return )

  transition
    ( start, unit, ping1 )
    ( ping1, pong, ping1 )
}

machine Pong receives ping {
  var y: Ping

  state
    ( start, return )
    ( pong1, send(y, pong); raise unit )

  transition
    ( start, ping, pong1 )
    ( pong1, unit, start )
}
[Animation: step-by-step execution of the Ping and Pong machines. Ping's first state runs "x := new Pong; raise unit" and moves on unit to a second state that runs "send(x, ping); return" and loops on pong. Pong's first state runs "return" and moves on ping to a second state that runs "send(that, pong); raise unit", going back on unit. Successive frames show a ping and then a pong event in flight between the two machines.]
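The execution shown in the animation can be approximated with a few lines of Python. This is a sketch under assumptions, not the DSL's runtime: the `Machine` class, the `run` scheduler, and the round-robin delivery order are all hypothetical, and `raise` is modeled as an event that the same machine handles immediately.

```python
from collections import deque

# Transition tables copied from the two machine declarations.
PING_T = {("start", "unit"): "ping1", ("ping1", "pong"): "ping1"}
PONG_T = {("start", "ping"): "pong1", ("pong1", "unit"): "start"}


class Machine:
    def __init__(self, name, table):
        self.name, self.table = name, table
        self.state = "start"
        self.inbox = deque()  # input buffer
        self.peer = None      # x in Ping, y in Pong

    def entry(self):
        """Run the entry statement of the current state.

        Returns the raised event, or None if the statement ends with return.
        """
        if (self.name, self.state) == ("Ping", "start"):
            return "unit"                    # raise unit (Pong is created in run)
        if (self.name, self.state) == ("Ping", "ping1"):
            self.peer.inbox.append("ping")   # send(x, ping); return
            return None
        if (self.name, self.state) == ("Pong", "pong1"):
            self.peer.inbox.append("pong")   # send(y, pong); raise unit
            return "unit"
        return None                          # Pong.start: return

    def handle(self, event):
        """Follow transitions until an entry statement returns without raising."""
        while event is not None:
            # A KeyError here would correspond to an unhandled event.
            self.state = self.table[(self.state, event)]
            event = self.entry()


def run(rounds):
    ping, pong = Machine("Ping", PING_T), Machine("Pong", PONG_T)
    ping.peer, pong.peer = pong, ping        # x := new Pong(y = this)
    delivered = []
    ping.handle(ping.entry())                # run Ping's start entry statement
    for _ in range(rounds):
        for machine in (pong, ping):
            if machine.inbox:
                event = machine.inbox.popleft()
                delivered.append((machine.name, event))
                machine.handle(event)
    return delivered, ping.state, pong.state


print(run(2))
```

After any number of rounds Ping rests in ping1 and Pong in start, with exactly one message in flight, which matches the loop the animation steps through.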
Unhandled events
• Suppose state s only provides the transitions (s, e1, s1) and (s, e2, s2)
• Retrieving e3 from input queue results in UnhandledEventException
• Absence of UnhandledEventException must be verified
Deferred events
• State (s, Stmt, {e1, e2})
• s is in the middle of critical processing waiting for e
• Presence of e1 and e2 in the buffer does not cause UnhandledEventException
• e1 and e2 are skipped over while retrieving e
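One way to read the two slides on unhandled and deferred events is as a rule for how a machine retrieves the next event from its buffer. The following Python sketch illustrates that reading; the function `dequeue` and its parameters are hypothetical names, and the behavior when only deferred events are buffered (keep waiting) is an assumption.

```python
class UnhandledEventException(Exception):
    pass


def dequeue(buffer, handled, deferred):
    """Remove and return the first buffered event that is not deferred.

    buffer   -- list of events, oldest first (modified in place)
    handled  -- events for which the current state has a transition
    deferred -- events the current state defers
    Returns None if every buffered event is deferred (the machine waits).
    Raises UnhandledEventException if the first non-deferred event has no
    transition in the current state.
    """
    for index, event in enumerate(buffer):
        if event in deferred:
            continue  # skipped over; it stays in the buffer for a later state
        if event not in handled:
            raise UnhandledEventException(event)
        del buffer[index]
        return event
    return None


# State s waits for e and defers e1 and e2.
buffer = ["e1", "e2", "e"]
print(dequeue(buffer, handled={"e"}, deferred={"e1", "e2"}))  # e
print(buffer)                                                 # ['e1', 'e2']
```

Deferred events keep their relative order, so a later state that does handle e1 sees it before e2.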
Sub-state machines
• Statement “call s” pushes state s on the machine stack
– s will handle a sub-protocol
• Sub-computation inherits deferred events from the caller
• Caller given a chance to handle UnhandledEventException
Memory management
• When is it safe to free up the memory for a state machine?
• Reference counting: Increment, Decrement
• A machine is freed only when
– its reference count is zero
– it is quiescent
• Accessing a freed machine causes IllegalAccessException whose absence must be verified
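The freeing rule can be sketched as follows. This Python illustration is not the runtime library's implementation: the class `MachineHandle` and its methods are hypothetical, and "quiescent" is assumed here to mean that the machine is not executing and its input buffer is empty.

```python
class IllegalAccessException(Exception):
    pass


class MachineHandle:
    def __init__(self):
        self.ref_count = 1   # the creator holds the first reference
        self.inbox = []      # input buffer
        self.running = False
        self.freed = False

    def _check(self):
        if self.freed:
            raise IllegalAccessException("machine already freed")

    def _try_free(self):
        # Assumption: quiescent = not executing and nothing left to process.
        quiescent = not self.running and not self.inbox
        if self.ref_count == 0 and quiescent:
            self.freed = True

    def increment(self):
        self._check()
        self.ref_count += 1

    def decrement(self):
        self._check()
        self.ref_count -= 1
        self._try_free()

    def send(self, event):
        self._check()
        self.inbox.append(event)

    def process_one(self):
        """Handle one buffered event, then re-check whether to free."""
        self._check()
        self.running = True
        self.inbox.pop(0)
        self.running = False
        self._try_free()


m = MachineHandle()
m.send("e")
m.decrement()    # count is zero, but an event is pending: not freed yet
m.process_one()  # now quiescent with count zero: freed
print(m.freed)   # True
```

Both conditions matter: dropping the last reference while an event is still buffered must not free the machine, and a quiescent machine that is still referenced may yet receive events.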
Runtime library
• Provides support for
– machine creation and deletion
– input buffer management
– execution of transitions and entry functions
• Reactive event-driven computation piggybacked on external threads
– locking for coordination among multiple external threads executing within the runtime
Verification
• How do we verify the absence of UnhandledEventException and IllegalAccessException?
• How do we verify program-specific properties?
• How do we specify interfaces?
Automata
Automata are used to model implementation and specification.
An automaton is a tuple $(S, \Sigma, \delta, i)$:
– $S$: set of states
– $\Sigma$: alphabet, e.g. { A, B }
– $\delta$: transitions
– $i$: initial state
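Automata
Parallel composition is the synchronous product (trace intersection).
Shared transition:
$$\frac{s \xrightarrow{\alpha} s' \qquad t \xrightarrow{\alpha} t'}{(s, t) \xrightarrow{\alpha} (s', t')}$$
Local transition:
$$\frac{s \xrightarrow{\alpha} s' \qquad \alpha \notin \Sigma_T}{(s, t) \xrightarrow{\alpha} (s', t)} \qquad \frac{\alpha \notin \Sigma_S \qquad t \xrightarrow{\alpha} t'}{(s, t) \xrightarrow{\alpha} (s, t')}$$
[Figure: parallel composition of automata $S$ and $T$ into $S \,||\, T$, with labels AB and AC and transitions on A, B, and C.]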
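The shared and local transition rules translate directly into a reachability computation. The sketch below is in Python and assumes deterministic automata encoded as dictionaries; the function `compose` and the example automata are hypothetical. The alphabets {A, B} and {A, C} echo the labels on the slide, but the transitions are made up.

```python
from collections import deque


def compose(s, t):
    """Synchronous product of two deterministic automata.

    Each automaton is a dict with keys 'alphabet' (set of events), 'init'
    (initial state), and 'delta' (dict mapping (state, event) to next state).
    Only product states reachable from the initial pair are built.
    """
    alphabet = s["alphabet"] | t["alphabet"]
    init = (s["init"], t["init"])
    delta, seen, work = {}, {init}, deque([init])
    while work:
        p, q = work.popleft()
        for a in sorted(alphabet):
            # A component moves only on events in its own alphabet;
            # otherwise it stays put (local transition of the other side).
            p2 = s["delta"].get((p, a)) if a in s["alphabet"] else p
            q2 = t["delta"].get((q, a)) if a in t["alphabet"] else q
            if p2 is None or q2 is None:
                continue  # a component that knows `a` cannot take it now
            delta[((p, q), a)] = (p2, q2)
            if (p2, q2) not in seen:
                seen.add((p2, q2))
                work.append((p2, q2))
    return {"alphabet": alphabet, "init": init, "delta": delta}


S = {"alphabet": {"A", "B"}, "init": "s0",
     "delta": {("s0", "A"): "s1", ("s1", "B"): "s0"}}
T = {"alphabet": {"A", "C"}, "init": "t0",
     "delta": {("t0", "A"): "t1", ("t1", "C"): "t0"}}

product = compose(S, T)
for (state, event), target in sorted(product["delta"].items()):
    print(state, event, target)
```

In this example A is shared, so it fires only when both components are ready for it, while B and C interleave freely.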
Properties
Specifications are monitors that define the set of allowed traces. An implementation is correct if it refines the specifications. Refinement is trace inclusion.
[Figure: two automata over the events A and B, related by refinement (≼).]
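Refinement as trace inclusion can be checked on finite automata by exploring the implementation together with the set of specification states reachable on the same trace. The Python sketch below assumes that both automata share one alphabet and that every state is accepting (so the trace set is prefix-closed); the function `refines` and the example automata are hypothetical.

```python
from collections import deque


def refines(impl, spec):
    """Check traces(impl) ⊆ traces(spec).

    Each automaton is a dict with 'init' and 'delta', where delta maps
    (state, event) to a set of successor states (nondeterminism allowed).
    Returns None if impl refines spec, otherwise a shortest trace of impl
    that spec does not allow.
    """
    start = (impl["init"], frozenset([spec["init"]]))
    seen, work = {start}, deque([(start, ())])
    while work:
        (p, qs), trace = work.popleft()
        for (src, a), targets in impl["delta"].items():
            if src != p:
                continue
            # Spec states reachable after the same event (subset construction).
            qs2 = frozenset(q2 for q in qs
                            for q2 in spec["delta"].get((q, a), ()))
            if not qs2:
                return trace + (a,)  # impl can do `a` here, spec cannot
            for p2 in targets:
                nxt = (p2, qs2)
                if nxt not in seen:
                    seen.add(nxt)
                    work.append((nxt, trace + (a,)))
    return None


# Spec: A and B must alternate, starting with A.
SPEC = {"init": "s0", "delta": {("s0", "A"): {"s1"}, ("s1", "B"): {"s0"}}}
GOOD = {"init": "i0", "delta": {("i0", "A"): {"i1"}, ("i1", "B"): {"i0"}}}
BAD = {"init": "i0", "delta": {("i0", "A"): {"i1"}, ("i1", "B"): {"i0"},
                               ("i1", "A"): {"i1"}}}

print(refines(GOOD, SPEC))  # None
print(refines(BAD, SPEC))   # ('A', 'A')
```

The returned trace plays the role of the error trace a model checker reports when a property is violated.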
Semantic gap
• How do we connect a program to a finite collection of automata communicating via rendezvous over a finite alphabet?
• Challenges
– dynamic creation of machines
– asynchronous message passing
– unbounded input buffers
Solution
• Dynamic machine creation
– finite verification scenario
• Asynchronous message passing
– separate events for sending and receiving
– events tagged by sender and receiver machine ids
• Unbounded input buffers
– compositional verification
– finite-state buffer abstractions
[Figure: implementations (machines and channels). Automata for Ping, Ping Buffer, Pong, and Pong Buffer, with transitions labeled Send A, Receive A, Send B, and Receive B.]
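To make the last two bullets concrete, here is one possible finite-state abstraction of an unbounded buffer for a single kind of event, written in Python. The slides do not say which abstraction is used, so this saturating counter is an assumption, and the function `buffer_abstraction` and the tuple encoding of tagged events are hypothetical.

```python
def buffer_abstraction(event, sender, receiver, bound):
    """Finite over-approximation of an unbounded buffer for one event.

    States 0 .. bound-1 count buffered copies exactly; state `bound` means
    "bound or more". Send and Receive are separate events, each tagged with
    the sender and receiver machine ids. Requires bound >= 1.
    """
    send = ("Send", event, sender, receiver)
    recv = ("Receive", event, sender, receiver)
    delta = {}
    for n in range(bound + 1):
        # Sending always succeeds; the counter saturates at `bound`.
        delta[(n, send)] = {min(n + 1, bound)}
        if n == 0:
            continue  # nothing buffered: Receive is not possible
        if n < bound:
            delta[(n, recv)] = {n - 1}
        else:
            # "bound or more" copies: after one Receive there may be
            # exactly bound-1 left, or still bound or more.
            delta[(n, recv)] = {bound - 1, bound}
    return {"alphabet": {send, recv}, "init": 0, "delta": delta}


ping_buffer = buffer_abstraction("ping", sender="Ping", receiver="Pong", bound=2)
for key in sorted(ping_buffer["delta"]):
    print(key, "->", sorted(ping_buffer["delta"][key]))
```

Every trace of the real unbounded buffer is also a trace of this automaton, so a property proved against the abstraction carries over; the price is that the abstraction may allow more Receive events than were ever sent once it saturates.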
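Compositional verification
We are given a set of specification automata and a set of implementation automata.
We want to prove that the composition of the implementation automata refines the composition of the specification automata (difficult).
Compositional verification tells us how to do this through smaller checks, each relating a subset of the implementation automata to a subset of the specification automata.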
Hierarchical compositional rule
Simple hierarchical case
[Figure: implementations (machines and channels) checked against a specification. All automata have transitions labeled Send A, Receive A, Send B, and Receive B.]
Decomposing by weakening
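[Figure: a spec S over the events A and B, and Weaken(S, A), the result of weakening S by A.]
S = Weaken(S, A) || Weaken(S, B)
Given a spec S and a set of implementation machines I:
If, for every event E in the alphabet of S, there is a subset of I whose composition refines Weaken(S, E), then the composition of I refines S.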
Circular compositional rule
[Figure: circular compositional rule. Automata with transitions labeled Send A, Receive A, Send B, and Receive B; an automaton with only Send B transitions is related to Pong by "refines".]
Review
• A domain-specific language for programming protocol aspects of asynchronous computations
– operational semantics
– compiler/runtime for device driver domain
– verification
Work in progress
• Deliver working prototype to Windows and third-party driver developers
• Other applications
– cloud infrastructure, services, and applications
– networking software
– asynchronous web programming
– …
Opportunity
• Transform protocol design and implementation across a variety of application domains
• Target the greatest threat to software reliability in the era of pervasive devices and pervasive distributed computing