© 2010 Andreas Haeberlen
Accountability
Most of the content is borrowed from Andreas Haeberlen’s SOSP’07 and OSDI’10 talks.

Motivation
- Cheating is a serious problem in itself
  - Multi-billion-dollar industry
- A more general problem:
  - Alice relies on software that runs on a third-party machine
  - Examples: Competitive system (auction), federated system...
  - How does Alice know whether the software is running as intended?
[Figure: Alice and Bob connected by a network; label: Software]

Dealing with faults is difficult in practice
- How to detect faults?
- How to identify the faulty nodes?
- How to convince others that a node is (not) faulty?
[Figure labels: Incorrect message, Responsible admin]

Learning from the 'offline' world
- Relies on accountability
- Example: Banks

  Requirement             Solution
  Commitment              Signed receipts
  Tamper-evident record   Double-entry bookkeeping
  Inspections             Audits

- Can be used to detect, identify and convince
  - Recall: Fault-tolerance work focused on tolerance
- Goal: A general+practical system for accountability

Outline
- Introduction
- What is accountability?
- PeerReview
- Accountable VM

Ideal accountability
- Fault := Node deviates from expected behavior
- Recall that our goal is to
  - detect faults
  - identify the faulty nodes
  - convince others that a node is (or is not) faulty
- Can we build a system:
  Whenever a node is faulty in any way, the system generates a proof of misbehavior against that node

Can we detect all faults?
- Problem: Faults that affect only a node's internal state
- Focus on observable faults:
  - Log information
  - This allows us to detect faults without introducing any trusted components
[Figure: nodes A and C, message X, internal state shown as bit strings]

Can we always get a proof?
[Figure: nodes A, B, and C. A: "I sent X!" B: "I never received X!"]
- Three possible causes:
  - A never sent X
  - B refuses to accept X
  - X was lost by the network
- Cannot get misbehavior proof!
- Generalize to verifiable evidence:
  - a proof of misbehavior, or
  - a challenge that the node cannot answer
- What if, after a long time, no response has arrived?
  - Does not prove the fault, but we can suspect the node
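
The difference between a proof, an unanswered challenge, and no evidence can be sketched as a small classifier. This is only an illustration, not PeerReview's interface: the `Indication` names, the `classify` function, and the `timeout` parameter are assumptions made for this example.

```python
from enum import Enum
from typing import Optional


class Indication(Enum):
    TRUSTED = "no evidence against the node"
    SUSPECTED = "challenge unanswered for a long time"
    EXPOSED = "proof of misbehavior exists"


def classify(has_proof: bool,
             challenge_sent_at: Optional[float],
             now: float,
             timeout: float) -> Indication:
    """Map the available evidence about one node to an indication.

    challenge_sent_at is None when no challenge is outstanding
    (none was issued, or the node has answered it).
    """
    if has_proof:
        return Indication.EXPOSED
    if challenge_sent_at is not None and now - challenge_sent_at >= timeout:
        return Indication.SUSPECTED
    return Indication.TRUSTED


# B is challenged at t=0 to acknowledge X and stays silent.
early = classify(False, challenge_sent_at=0.0, now=5.0, timeout=30.0)
late = classify(False, challenge_sent_at=0.0, now=60.0, timeout=30.0)
# Once B answers, the challenge is no longer outstanding.
answered = classify(False, challenge_sent_at=None, now=61.0, timeout=30.0)
```

In this sketch suspicion ends as soon as the node answers the challenge, while a proof of misbehavior overrides everything else.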

Practical accountability
- We propose the following definition of a distributed system with accountability:
  Whenever a fault is observed by a correct node, the system eventually generates verifiable evidence against a faulty node
- This is useful:
  Any (!) fault that affects a correct node is eventually detected and linked to a faulty node

PeerReview
- Adds accountability to a given system
- Implemented as a library
- Provides secure record, commitment, auditing, etc.
- Assumptions:
  1. System can be modeled as collection of deterministic state machines
  2. Nodes have reference implementations of the state machines
  3. Correct nodes can eventually communicate
  4. Nodes can sign messages

PeerReview at a high level
- All nodes keep a log of their inputs & outputs
  - Including all messages
- Each node has a set of witnesses, who audit its log periodically
- If the witnesses detect misbehavior, they
  - generate evidence
  - make the evidence available to other nodes
- Other nodes check evidence, report fault
[Figure: nodes A and B with their logs, message M, and A's witnesses C, D, E]

PeerReview detects tampering
- What if a node modifies its log entries?
- Log entries form a hash chain
  - Inspired by secure histories [Maniatis02]
- Signed hash is included with every message
  ⇒ Node commits to its current state
  ⇒ Changes are evident
[Figure: A sends a message to B and B replies with an ACK, each carrying Hash(log); B's log entries Send(X), Recv(Y), Send(Z), Recv(M) are linked by hashes H0 to H4]
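
A hash-chained log can be sketched in a few lines. This is a minimal illustration under stated assumptions: SHA-256 as the hash, a simple `seq|kind|content` encoding, and the names `entry_hash`, `Log`, and `verify_chain` are all choices made for this example. Signing of the top hash is omitted; the sketch treats the hash a peer received earlier as the commitment.

```python
import hashlib

GENESIS = b"\x00" * 32  # stands in for H0


def entry_hash(prev_hash: bytes, seq: int, kind: str, content: str) -> bytes:
    """H_i = Hash(H_{i-1} || seq || kind || content)."""
    data = prev_hash + f"{seq}|{kind}|{content}".encode()
    return hashlib.sha256(data).digest()


class Log:
    def __init__(self):
        self.entries = []      # tuples (seq, kind, content, hash)
        self.top = GENESIS

    def append(self, kind: str, content: str) -> bytes:
        seq = len(self.entries) + 1
        self.top = entry_hash(self.top, seq, kind, content)
        self.entries.append((seq, kind, content, self.top))
        return self.top        # would be signed and attached to the message


def verify_chain(entries, committed_top: bytes) -> bool:
    """Recompute the chain and compare it with a previously received hash."""
    h = GENESIS
    for seq, kind, content, stored in entries:
        h = entry_hash(h, seq, kind, content)
        if h != stored:
            return False       # entry does not match its recorded hash
    return h == committed_top  # chain must end at the committed hash


log = Log()
for kind, content in [("SEND", "X"), ("RECV", "Y"), ("SEND", "Z"), ("RECV", "M")]:
    commitment = log.append(kind, content)

# In-place modification of entry 2 breaks the recorded hashes.
tampered = list(log.entries)
tampered[1] = (2, "RECV", "Y-modified", tampered[1][3])

# Rewriting the entry and recomputing every hash yields a different top hash.
forged = Log()
for kind, content in [("SEND", "X"), ("RECV", "Y-modified"), ("SEND", "Z"), ("RECV", "M")]:
    forged.append(kind, content)

print(verify_chain(log.entries, commitment))     # True
print(verify_chain(tampered, commitment))        # False
print(verify_chain(forged.entries, commitment))  # False
```

The last case shows why the commitment matters: a node can always recompute a consistent chain, but not one that ends at a hash it has already handed out.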

PeerReview detects inconsistencies
- What if a node
  - keeps multiple logs?
  - forks its log?
- Check whether the signed hashes form a single hash chain
[Figure: a log with hashes H0 to H4 that forks after H2 into "View #1" (H3, H4) and "View #2" (H3', H4'); entries include Create X, Read X, and Read Z, with results OK and Not found]
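
The single-chain check can be sketched as follows. This is a toy under stated assumptions: signatures are omitted, an authenticator is reduced to a `(seq, hash)` pair, the log entries are invented for illustration, and `chain` and `mismatched` are names chosen for this example.

```python
import hashlib


def chain(entries):
    """Return the chained hashes H1..Hn for a list of entry strings."""
    h = b"\x00" * 32  # stands in for H0
    hashes = []
    for entry in entries:
        h = hashlib.sha256(h + entry.encode()).digest()
        hashes.append(h)
    return hashes


def mismatched(presented_entries, authenticators):
    """Return sequence numbers of authenticators that the presented log
    cannot explain. Each authenticator is a (seq, hash) pair, seq from 1."""
    hashes = chain(presented_entries)
    bad = []
    for seq, h in authenticators:
        if seq > len(hashes) or hashes[seq - 1] != h:
            bad.append(seq)
    return bad


# Two views that share entries 1-2 and diverge from entry 3 on.
view1 = ["Create X", "Create Z", "Read X: OK", "Read Z: OK"]
view2 = ["Create X", "Create Z", "Read X: Not found", "Read Z: OK"]

# Authenticators handed out to different peers (signatures omitted).
seen_by_peer1 = [(3, chain(view1)[2]), (4, chain(view1)[3])]
seen_by_peer2 = [(3, chain(view2)[2]), (4, chain(view2)[3])]

# Witnesses pool the authenticators and check them against the one log
# the node presents for audit.
print(mismatched(view1, seen_by_peer1))                  # []
print(mismatched(view1, seen_by_peer1 + seen_by_peer2))  # [3, 4]
```

Whichever view the node presents, the authenticators it issued for the other view fail the check, so the pair of conflicting signed hashes can serve as evidence.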

PeerReview detects faults
- How to recognize faults in a log?
- Assumption:
  - Node can be modeled as a deterministic state machine
- To audit a node:
  - Replay inputs to a trusted copy of the state machine
  - Check outputs against the log
[Figure: inputs from the log are replayed to a state machine (Module A, Module B); its outputs are compared (=?) with the logged network outputs, and a mismatch (≠) is flagged]
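
The audit step can be sketched as a replay loop. This is a minimal illustration, assuming the log is a list of `("in", value)` and `("out", value)` pairs and that each input's outputs are logged before the next input; `kv_step` is a hypothetical reference state machine (a tiny key-value store), not one of the systems from the talk.

```python
def kv_step(state, request):
    """Hypothetical reference state machine: a tiny key-value store.
    Returns (new_state, outputs) and never mutates its argument."""
    op, key, *rest = request
    if op == "put":
        new_state = dict(state)
        new_state[key] = rest[0]
        return new_state, ["ok"]
    return state, [state.get(key, "not found")]


def audit(log, step, initial_state):
    """Replay logged inputs through `step` and compare outputs with the log.

    Returns the index of the first divergent entry, len(log) if the log
    ends before showing all expected outputs, or None if it checks out.
    """
    state = initial_state
    pending = []  # outputs the reference produced that the log has yet to show
    for i, (kind, value) in enumerate(log):
        if kind == "in":
            if pending:
                return i  # log skipped an output the reference produced
            state, outputs = step(state, value)
            pending = list(outputs)
        elif not pending or pending.pop(0) != value:
            return i      # logged output differs from the reference
    return len(log) if pending else None


good_log = [("in", ("put", "x", "1")), ("out", "ok"),
            ("in", ("get", "x")), ("out", "1")]
bad_log = good_log[:3] + [("out", "0")]  # stale answer, e.g. a lost update

print(audit(good_log, kv_step, {}))  # None
print(audit(bad_log, kv_step, {}))   # 3
```

Determinism is what makes this work: given the same initial state and inputs, the reference copy has exactly one correct output sequence to compare against.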

Recall: Working Process
- All nodes keep a log of their inputs & outputs
  - Including all messages
- Each node has a set of witnesses, who audit its log periodically
- If the witnesses detect misbehavior, they
  - generate evidence
  - make the evidence available to other nodes
- Other nodes check evidence, report fault
[Figure: nodes A and B with their logs, message M, and A's witnesses C, D, E]

PeerReview
- Accountability is an approach to handling faults in decentralized systems
  - detects faults
  - identifies the faulty nodes
  - produces evidence
- PeerReview: A system that enforces accountability
  - Offers provable guarantees and is widely applicable

PeerReview is widely applicable
- App #1: NFS server in the Linux kernel
  - Many small, latency-sensitive requests
  - Tampering with files
  - Lost updates
- App #2: Overlay multicast
  - Transfers large volume of data
  - Freeloading
  - Tampering with content
- App #3: P2P email
  - Complex, large, decentralized
  - Denial of service
  - Attacks on DHT routing

How much does PeerReview cost?
- Dominant cost depends on number of witnesses W
  - O(W^2) component
[Figure: stacked bar chart of average traffic (Kbps/node, axis 0 to 100) against number of witnesses (Baseline, 1 to 5), with W dedicated witnesses; components: Baseline traffic, Signatures and ACKs, Checking logs]

What is the problem with PeerReview?

Scenario: Multiplayer game
- Alice decides to play a game of Counterstrike with Bob and Charlie
[Figure: Alice, Bob, and Charlie connected by a network; speech bubble: "I'd like to play a game"]

Could Bob be cheating?
- In Counterstrike, ammunition is local state
- Bob can manipulate counter and prevent it from decrementing
- Such cheats (and many others) do exist, and are being used
[Figure: Alice, Bob, and Charlie connected by a network; an ammo counter with the values 35, 36, 37]

Goal: Accountability
- We want Alice to be able to
  - Detect when the remote machine is faulty
  - Obtain evidence of the fault that would convince a third party
- Challenges:
  - Neither Alice nor Bob may understand how the software works
  - Binary only - no specification of the correct behavior
[Figure: Alice and Bob connected by a network; label: Software]

- Bob runs Alice's software image in an AVM
- AVM maintains a log of network in-/outputs
- Alice can check this log with a reference image
  - AVM correct: Reference image can produce same network outputs when started in same state and given same inputs
  - AVM faulty: Otherwise
[Figure: Alice and Bob connected by a network; on Bob's side the virtual machine image runs in an Accountable Virtual Machine (AVM) on top of an Accountable Virtual Machine Monitor (AVMM); label: Log]
[Callouts: "What if Bob manipulates the log?" "Alice must trust her own reference image" "How can Alice find this execution, if it exists?"]

Tamper-evident logging
- Message log is tamper-evident [PeerReview]
  - Log is structured as a hash chain
  - Messages contain signed authenticators
- Result: Alice can either...
  - ... detect that the log has been tampered with, or
  - ... get a complete log with all the observable messages
Log excerpt (newest entry first):
474: SEND(Alice, Firing)
473: SEND(Charlie, Got ammo)
472: RECV(Alice, Got medipack)
471: SEND(Charlie, Moving left)
...
[Figure: AVM running on the AVMM; messages "Firing" and "Moving right"]

Auditing and replay
[Figure: Alice and Bob connected by a network, each running an AVM on an AVMM; labels: Modification, Evidence]
Log excerpt (newest entry first):
373: SEND(Alice, Firing)
372: SEND(Alice, Firing)
371: SEND(Alice, Firing)
370: SEND(Alice, Firing)
369: SEND(Alice, Firing)
368: Mouse button clicked
367: SEND(Alice, Got medipack)
366: Mouse moved left
...
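
The ammunition cheat gives a concrete picture of divergence during replay. The sketch below is a toy under stated assumptions: `reference_game` is a hypothetical stand-in for Alice's reference image, the log is a list of `(seq, kind, payload)` tuples with invented contents, and only the sequence of SEND messages is compared, whereas the AVMM described in the talk replays an entire virtual machine image.

```python
def reference_game(inputs, ammo):
    """Hypothetical reference image: fires only while ammunition remains."""
    sent = []
    for event in inputs:
        if event == "Mouse button clicked" and ammo > 0:
            ammo -= 1
            sent.append("Firing")
    return sent


def audit_log(log, ammo):
    """Replay the logged inputs and compare SEND messages with the log.

    Returns (True, None) if they match, (False, seq) for the first logged
    SEND the replay cannot reproduce, or (False, None) if the log lacks a
    message the reference would have sent.
    """
    inputs = [payload for _, kind, payload in log if kind == "INPUT"]
    logged = [(seq, payload) for seq, kind, payload in log if kind == "SEND"]
    replayed = reference_game(inputs, ammo)
    for i, (seq, payload) in enumerate(logged):
        if i >= len(replayed) or replayed[i] != payload:
            return False, seq
    if len(replayed) > len(logged):
        return False, None
    return True, None


click = "Mouse button clicked"
honest_log = [(366, "INPUT", click), (367, "SEND", "Firing"),
              (368, "INPUT", click), (369, "SEND", "Firing"),
              (370, "INPUT", click)]                 # out of ammo: no SEND
cheat_log = honest_log + [(371, "SEND", "Firing")]   # counter never decremented

print(audit_log(honest_log, ammo=2))  # (True, None)
print(audit_log(cheat_log, ammo=2))   # (False, 371)
```

The logged prefix up to the divergent entry, together with the reference image, is what Alice could hand to a third party as evidence.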

AVM properties
- Strong accountability
  - Detects faults
  - Produces evidence
  - No false positives
- Works for arbitrary, unmodified binaries
  - Nondeterministic events can be captured by AVM Monitor
  [Callout: "If it runs in a VM, it will work"]
- Alice does not have to trust Bob, the AVMM, or any software that runs on Bob's machine
  - If Bob tampers with the log, Alice can detect this
  - If Bob's AVM is faulty, ANY log Bob could produce would inevitably cause a divergence during replay

Methodology
- A prototype AVMM
  - Based on logging/replay engine in VMware Workstation 6.5.1
  - Extended with tamper-evident logging and auditing
- Evaluation: Cheat detection in games
  - Setup models competition / LAN party
  - Three players playing Counterstrike 1.6
  - Nehalem machines (i7 860)
  - Windows XP SP3

AVMs can detect real cheats
- If the cheat needs to be installed in the AVM to be effective, AVM can trivially detect it
  - Reason: Event timing + control flow change
  - Examined 26 real cheats from the Internet; all detectable
[Figure: two event traces with event timing (for replay), AVM on AVMM]
Trace labeled "Bob's log" (newest entry first):
98: RECV(Alice, Missed)       BC=53  EIP=0xb382
97: SEND(Alice, Fire@(3,9))   BC=52  EIP=0x3633
96: Mouse button clicked      BC=47  EIP=0xc490
95: Interrupt received        BC=44  EIP=0x6771
94: RECV(Alice, Jumping)      BC=37  EIP=0x570f
...
Second trace (only entries 97 and 98 are labeled on the slide):
98: RECV(Alice, Hit)          BC=59  EIP=0x861e
97: SEND(Alice, Fire@(2,7))   BC=54  EIP=0x2d16
                              BC=49  EIP=0xc43e
                              BC=44  EIP=0x6771
                              BC=37  EIP=0x570f
...

Cost of auditing
- When auditing a player after a one-hour game,
  - How big is the log we have to download? 148 MB
  - How much time is needed for replay? ~ 1 hour
[Figure: bar chart of average log growth (MB/minute, axis 0 to 12) for VMware and AVMM; annotations: "~8 MB per minute", "2.47 MB per minute (compressed)", "Added by accountability"]
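
The 148 MB figure follows from the compressed growth rate over a one-hour game. A short worked check; reading "~8 MB per minute" as the uncompressed growth rate is an assumption, and the variable names are chosen for this example.

```python
compressed_mb_per_min = 2.47   # from the slide
raw_mb_per_min = 8             # "~8 MB per minute" on the slide (assumed uncompressed)
game_minutes = 60              # one-hour game

compressed_total = compressed_mb_per_min * game_minutes
raw_total = raw_mb_per_min * game_minutes

print(f"compressed log: {compressed_total:.0f} MB")  # compressed log: 148 MB
print(f"uncompressed log: about {raw_total} MB")     # uncompressed log: about 480 MB
```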

Online auditing
- Idea: Stream logs to auditors during the game
  - Result: Detection within seconds after fault occurs
  - Replay can utilize unused cores; frame rate penalty is low
[Figure: bar chart of average frame rate (axis 0 to 200) for Alice, Bob, and Charlie with no online auditing, one audit per player, and two audits per player; diagram labels: Game, Logging, Replay, Replay]

Extensions
- Play and replay:
  - NetReview
  - TimingReview
- Problems
  - Privacy concerns
  - Efficiency
  - Deployment
Questions?