
Anonymity and Probabilistic Model Checking

CS 259

John Mitchell

2008

Course schedule

Lectures
• Prob. model checking, other tools (with examples)

Homework 2
• Posted last week, due Tues Feb 12
• Simple exercises using the probabilistic tool

Projects
• Presentation 2: Feb 19, 21
  – Describe the tool or approach and the properties you will check
• Presentation 3: Mar 4 – 13 (two or three meetings)
  – Final results: turn in slides and tool input

Dining Cryptographers

A clever idea for making a message public in a perfectly untraceable manner
• David Chaum. “The dining cryptographers problem: unconditional sender and recipient untraceability.” Journal of Cryptology, 1988.

Guarantees information-theoretic anonymity for message senders
• This is an unusually strong form of security: it defeats an adversary with unlimited computational power

Impractical: requires a huge amount of randomness
• In a group of size N, need N random bits to send 1 bit

Three-Person DC Protocol

Three cryptographers are having dinner. Either the NSA is paying for the dinner, or one of them is paying but wishes to remain anonymous.

1. Each diner flips a coin and shows it to his left neighbor.
   • Every diner sees two coins: his own and his right neighbor’s.

2. Each diner announces whether the two coins are the same. If he is the payer, he lies (says the opposite).

3. Odd number of “same” ⇒ the NSA is paying; even number of “same” ⇒ one of the diners is paying.

• But a non-payer cannot tell which of the other two is paying!
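A minimal Python sketch (not from the original slides) that plays one round of the three-person protocol and checks the parity rule:

import random

def dining_cryptographers(payer=None):
    """One round of the 3-person protocol. payer is 0, 1, 2, or None (NSA pays)."""
    # coins[i] is flipped by diner i and shown to his left neighbor, so diner i
    # sees coins[i] (his own) and coins[(i + 1) % 3] (his right neighbor's).
    coins = [random.randint(0, 1) for _ in range(3)]
    announcements = []
    for i in range(3):
        same = coins[i] == coins[(i + 1) % 3]
        if i == payer:
            same = not same                      # the payer lies
        announcements.append("same" if same else "different")
    return announcements

# Odd number of "same" => NSA pays; even number of "same" => a diner pays.
for payer in (None, 0, 1, 2):
    ann = dining_cryptographers(payer)
    print(payer, ann, "NSA pays" if ann.count("same") % 2 == 1 else "a diner pays")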


Non-Payer’s View: Same Coins

[Diagram: the non-payer’s own two coins are the same; the other two diners announce “same” and “different”. Depending on the hidden coin between them, the payer could be either one.]

Without knowing the coin toss between the other two, the non-payer cannot tell which of them is lying.

Non-Payer’s View: Different Coins

[Diagram: the non-payer’s own two coins are different; the other two diners both announce “same”. Depending on the hidden coin between them, the payer could be either one.]

Without knowing the coin toss between the other two, the non-payer cannot tell which of them is lying.

Superposed Sending

This idea generalizes to any group of size N

For each bit of the message, every user generates 1 random bit and sends it to 1 neighbor
• Every user learns 2 bits (his own and his neighbor’s)

Each user announces (own bit XOR neighbor’s bit)
Sender announces (own bit XOR neighbor’s bit XOR message bit)

XOR of all announcements = message bit
• Every randomly generated bit occurs in this sum twice (and is canceled by the XOR); the message bit occurs once
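The XOR-cancellation argument is easy to check with a short sketch (illustrative Python, not the course’s tool input):

import random
from functools import reduce

def superposed_sending(n_users, message_bit, sender):
    # bits[i] is user i's own random bit; user i also learns bits[i-1],
    # which neighbor i-1 sent to him (indices wrap around the ring).
    bits = [random.randint(0, 1) for _ in range(n_users)]
    announcements = []
    for i in range(n_users):
        a = bits[i] ^ bits[i - 1]      # own bit XOR neighbor's bit
        if i == sender:
            a ^= message_bit           # the sender also XORs in the message bit
        announcements.append(a)
    # Every random bit appears in exactly two announcements and cancels,
    # so the XOR of all announcements equals the message bit.
    return reduce(lambda x, y: x ^ y, announcements)

assert all(superposed_sending(10, b, sender=3) == b for b in (0, 1))
print("message bit recovered correctly")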

DC-Based Anonymity is Impractical

Requires secure pairwise channels between group members
• Otherwise, random bits cannot be shared

Requires massive communication overhead and large amounts of randomness

A DC-net (a group of dining cryptographers) is robust even if some members cooperate
• Guarantees perfect anonymity for the other members

A great protocol to analyze
• Difficult to reason about each member’s knowledge

Definitions of Anonymity

“Anonymity is the state of being not identifiable within a set of subjects.”
• There is no such thing as absolute anonymity

Unlinkability of action and identity
• E.g., the sender and his email are no more related within the system than they are related in a priori knowledge

Unobservability
• Any item of interest (message, event, action) is indistinguishable from any other item of interest

“Anonymity is bullshit” – Joan Feigenbaum

Anonymity and Knowledge

Anonymity deals with hiding information
• User’s identity is hidden
• Relationships between users are hidden
• User cannot be identified within a set of suspects

The natural way to express anonymity is to state what the adversary should not know
• Good application for a logic of knowledge
• Not supported by conventional formalisms for security (process calculi, I/O automata, …)

To determine whether anonymity holds, we need some representation of knowledge

k-Anonymity

Basic idea
• Someone robbed the bank
• Detectives know that it is one of k people

Advantage
• Does not involve probability

Disadvantages
• Does not involve probability
• Depends on absence of additional information

Data Anonymity

Problem: de-identifying data does not necessarily make it anonymous. It can often be re-identified by linking datasets:

• Medical data: ethnicity, visit date, diagnosis, procedure, medication, total bill
• Voter lists: name, address, date registered, party, date last voted
• Shared attributes that link the two: ZIP, birth date, sex

SOURCE: LATANYA SWEENEY

Date of birth, gender + 5-digit ZIP uniquely identifies 87.1% of U.S. population

ZIP 60623 has 112,167 people; 11%, not 0%, are uniquely identified (insufficient number of people over 55 living there).

SOURCE: LATANYA SWEENEY

[Map: each dot = one ZIP code.]

Anonymity via Random Routing

Hide the message source by routing it randomly
• Popular technique: Crowds, Freenet, Onion Routing

Routers don’t know for sure whether the apparent source of a message is the true sender or another router
• Only secure against local attackers!

Onion Routing

[Diagram: Alice reaches Bob through a randomly chosen sequence of routers R1 → R2 → R3 → R4; the remaining routers R are not on the path.]

Sender chooses a random sequence of routers
• Some routers are honest, some hostile
• Sender controls the length of the path
• Similar to a MIX cascade

Goal: hostile routers shouldn’t learn that Alice is talking to Bob

[Reed, Syverson, Goldschlag ’97]

The Onion

[Diagram: the layered onion Alice builds for the path R1 → R2 → R3 → R4 → Bob.]

What R1 receives: {R2,k1}pk(R1), { … }k1
What R2 receives: {R3,k2}pk(R2), { … }k2
What R3 receives: {R4,k3}pk(R3), { … }k3
What R4 receives: {B,k4}pk(R4), { … }k4
Innermost payload: {M}pk(B)

• Routing info for each link is encrypted with that router’s public key
• Each router learns only the identity of the next router
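A toy, non-cryptographic illustration of the layering (Python; enc merely tags data with the key needed to open it, standing in for real public-key and symmetric encryption):

def enc(key, payload):          # toy "encryption": tag payload with the key
    return ("enc", key, payload)

def dec(key, box):
    tag, k, payload = box
    assert tag == "enc" and k == key, "wrong key"
    return payload

def build_onion(hops, recipient_pk, message):
    """hops = [(pk_Ri, k_i, next_hop_name), ...] listed from R1 onward."""
    onion = enc(recipient_pk, message)                 # innermost layer: {M}pk(B)
    for pk, k, next_hop in reversed(hops):
        onion = (enc(pk, (next_hop, k)), enc(k, onion))  # ({next,ki}pk(Ri), {...}ki)
    return onion

def peel(onion, router_key):
    """What router Ri does: open the header, learn only (next hop, k_i),
    then pass the k_i-encrypted remainder onward."""
    header, rest = onion
    next_hop, k = dec(router_key, header)
    return next_hop, dec(k, rest)

# Example path Alice -> R1 -> R2 -> Bob (keys are just strings in this toy)
onion = build_onion([("pk_R1", "k1", "R2"), ("pk_R2", "k2", "Bob")], "pk_B", "M")
hop1, onion = peel(onion, "pk_R1")    # R1 learns only that the next hop is R2
hop2, core = peel(onion, "pk_R2")     # R2 learns only that the next hop is Bob
print(hop1, hop2, dec("pk_B", core))  # -> R2 Bob M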

Crowds System

[Diagram: sender C0 forwards through crowd members (e.g., C1 → C2 → C3 → C4) before the message reaches the recipient; at each hop the message is forwarded to another crowd member with probability pf or sent to the recipient with probability 1−pf.]

Routers form a random path when establishing a connection
• In onion routing, the random path is chosen in advance by the sender

After receiving a message, an honest router flips a biased coin
• With probability pf it randomly selects the next router and forwards the message
• With probability 1−pf it sends the message directly to the recipient

Messages are encrypted with shared symmetric keys

[Reiter, Rubin ’98]
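A small Python sketch of the path-construction rule above (illustrative; the slides later analyze the corresponding PRISM model):

import random

def build_crowds_path(n_members, p_f, sender=0):
    """Sequence of crowd members the request passes through before delivery."""
    path = [sender]
    current = random.randrange(n_members)   # the sender always forwards once
    path.append(current)
    while random.random() < p_f:            # forward again with probability p_f
        current = random.randrange(n_members)
        path.append(current)
    return path                             # the last member submits to the recipient

print(build_crowds_path(n_members=10, p_f=0.8))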

Probabilistic Notions of Anonymity

Beyond suspicion
• The observed source of the message is no more likely to be the true sender than anybody else

Probable innocence
• The probability that the observed source of the message is the true sender is less than 50%
• Guaranteed by Crowds if there are sufficiently many honest routers: N_good + N_bad ≥ p_f/(p_f − 0.5) × (N_bad + 1)

Possible innocence
• Non-trivial probability that the observed source of the message is not the true sender
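A quick arithmetic check of the bound as reconstructed above (function name and sample parameters are mine):

from math import ceil

def min_crowd_size(p_f, n_bad):
    """Smallest N_good + N_bad satisfying N >= p_f/(p_f - 0.5) * (N_bad + 1)."""
    assert p_f > 0.5, "forwarding probability must exceed 1/2"
    return ceil(p_f / (p_f - 0.5) * (n_bad + 1))

print(min_crowd_size(p_f=0.75, n_bad=2))   # -> 9 crowd members in total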

A Couple of Issues

Is probable innocence enough?
• … 1% 1% 1% 49% 1% 1% 1% …
  (probable innocence technically holds, yet one member is observed with probability 49% while everyone else is at 1%)

Multiple-paths vulnerability
• Can the attacker relate multiple paths from the same sender?
  – E.g., browsing the same website at the same time of day
• Each new path gives the attacker a new observation
• Can’t keep paths static, since members join and leave

Maybe OK for “plausible deniability”

Probabilistic Model Checking

[Diagram: a probabilistic state machine; from one state, transitions with probabilities 0.3, 0.5, and 0.2 lead toward a “bad state”.]

Participants are finite-state machines
• Same as in Murφ

State transitions are probabilistic
• Transitions in Murφ are nondeterministic

Standard intruder model
• Same as in Murφ: model cryptography with abstract data types

Murφ question:
• Is a bad state reachable?

Probabilistic model checking question:
• What is the probability of reaching a bad state?

Discrete-Time Markov Chains

A discrete-time Markov chain is a tuple (S, s0, T, L) where:
• S is a finite set of states
• s0 ∈ S is the initial state
• T : S × S → [0,1] is the transition relation, with Σ_{s'∈S} T(s,s') = 1 for every s ∈ S
• L is a labeling function

Markov Chain: Simple Example

[Diagram: five-state chain with initial state s0 = A; A → B with probability 0.8, A → C with 0.2; B → D with 0.9, B → C with 0.1; C → D with 0.5, C → E with 0.5; D and E loop to themselves with probability 1.0.]

Probabilities of outgoing transitions sum up to 1.0 for every state

• Probability of reaching E from s0 is 0.2×0.5 + 0.8×0.1×0.5 = 0.14
• The chain has infinite paths if the state graph has loops
  – Need to solve a system of linear equations to compute probabilities
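For this chain the 0.14 value can be reproduced programmatically; a short Python sketch using the general fixed-point approach (the transition table mirrors the diagram reconstruction above):

# Transition probabilities of the example chain (A = s0; D and E are absorbing).
T = {
    "A": {"B": 0.8, "C": 0.2},
    "B": {"D": 0.9, "C": 0.1},
    "C": {"D": 0.5, "E": 0.5},
    "D": {"D": 1.0},
    "E": {"E": 1.0},
}

def prob_reach(target, n_iter=100):
    """Probability of eventually reaching `target` from each state, by iterating
    x_s = sum_{s'} T(s,s') * x_{s'} with x_target fixed to 1 (for chains with
    loops one would solve the linear system instead)."""
    x = {s: 1.0 if s == target else 0.0 for s in T}
    for _ in range(n_iter):
        x = {s: 1.0 if s == target
                 else sum(p * x[t] for t, p in T[s].items())
             for s in T}
    return x

print(prob_reach("E")["A"])   # 0.2*0.5 + 0.8*0.1*0.5 = 0.14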

PRISM [Kwiatkowska et al., U. of Birmingham]

A probabilistic model checker
The system is specified as a Markov chain
• Parties are finite-state machines with local variables
• State transitions are associated with probabilities
  – Can also have nondeterminism (Markov decision processes)
• All parameters must be finite

The correctness condition is specified as a PCTL formula
PRISM computes probabilities for each reachable state
  – Enumerates reachable states
  – Solves a system of linear equations to find the probabilities

PRISM Syntax

[Same example Markov chain as above.]

module Simple
  state: [1..5] init 1;
  [] state=1 -> 0.8: (state'=2) + 0.2: (state'=3);
  [] state=2 -> 0.1: (state'=3) + 0.9: (state'=4);
  [] state=3 -> 0.5: (state'=4) + 0.5: (state'=5);
endmodule

The last command reads: IF state=3 THEN with probability 50% assign 4 to state, with probability 50% assign 5 to state.

Modeling Crowds with PRISM

Model probabilistic path construction
Each state of the model corresponds to a particular stage of path construction
• 1 router chosen, 2 routers chosen, …

Three probabilistic transitions
• An honest router chooses the next router with probability pf, terminates the path with probability 1−pf
• The next router is chosen probabilistically from N candidates
• The chosen router is hostile with a certain probability

Run the path construction protocol several times and look at the accumulated observations of the intruder

PRISM: Path Construction in Crowds

module crowds
  . . .
  // N = total # of routers, C = # of corrupt routers
  // badC = C/N, goodC = 1-badC
  // Next router is corrupt with a certain probability:
  [] (!good & !bad) -> goodC: (good'=true) & (revealAppSender'=true)
                     + badC: (badObserve'=true);
  // Forward with probability PF, else deliver:
  [] (good & !deliver) -> PF: (pIndex'=pIndex+1) & (forward'=true)
                        + notPF: (deliver'=true);
  . . .
endmodule

PRISM: Intruder Model

module crowds
  . . .
  // Record the apparent sender and deliver
  [] (badObserve & appSender=0) -> (observe0'=observe0+1) & (deliver'=true);
  . . .
  // Record the apparent sender and deliver
  [] (badObserve & appSender=15) -> (observe15'=observe15+1) & (deliver'=true);
  . . .
endmodule

• For each observed path, bad routers record the apparent sender
• Bad routers collaborate, so treat them as a single attacker
• No cryptography, only probabilistic inference

Probabilistic Computation Tree Logic

Used for reasoning about probabilistic temporal properties of probabilistic finite state spaces

Can express properties of the form “under any scheduling of processes, the probability that event E occurs is at least p”
• By contrast, Murφ can express only properties of the form “does event E ever occur?”

PCTL Logic [Hansson, Jonsson ‘94]

State formulas
• First-order propositions over a single state
  φ ::= True | a | φ ∧ φ | ¬φ | P>p [ψ]

Path formulas
• Properties of chains of states
  ψ ::= X φ | φ U≤k φ | φ U φ

PCTL Syntax

Reading the grammar:
• a — a predicate over state variables (just like a Murφ invariant)
• P>p [ψ] — the path formula ψ holds with probability > p
• “State formula holds for every state in the chain”
• φ1 U φ2 — the first state formula holds for every state in the chain until the second becomes true

PCTL: State Formulas

A state formula is a first-order state predicate
• Just like in non-probabilistic logic

[Diagram: a four-state chain labeled with values of x and y (s0 has x=1, y=1; the others x=1, y=2; x=2, y=0; x=3, y=0) and transition probabilities 0.2, 0.8, 0.5, 0.5, 1.0, 1.0; each state is marked True or False for the formula below.]

φ = (y>1) ∨ (x=1)

PCTL: Path Formulas

A path formula is a temporal property of a chain of states
• φ1 U φ2 = “φ1 is true until φ2 becomes and stays true”

[Diagram: the same four-state chain as above.]

ψ = (y>0) U (x>y) holds for this chain

PCTL: Probabilistic State Formulas

Specify that a certain predicate or path formula holds with probability no less than some bound

[Diagram: the same four-state chain; each state is marked True or False for the formula below.]

φ = P>0.5 [(y>0) U (x=2)]

Intruder Model Redux

module crowds
  . . .
  // Record the apparent sender and deliver
  [] (badObserve & appSender=0) -> (observe0'=observe0+1) & (deliver'=true);
  . . .
  // Record the apparent sender and deliver
  [] (badObserve & appSender=15) -> (observe15'=observe15+1) & (deliver'=true);
  . . .
endmodule

Every time a hostile crowd member receives a message from some honest member, he records his observation (increases the count for that honest member)

Negation of Probable Innocence

launch -> [true U (observe0 > observe1) & done] > 0.5
…
launch -> [true U (observe0 > observe9) & done] > 0.5
(one such formula for each of the other crowd members #1 … #9)

“The probability of reaching a state in which hostile crowd members have completed their observations and have observed the true sender (crowd member #0) more often than any of the other crowd members (#1 … #9) is greater than 0.5”

Analyzing Multiple Paths with PRISM

Use PRISM to automatically compute interesting probabilities for chosen finite configurations

Let Ki = how many times crowd member i was recorded as the apparent sender

“Positive”: P(K0 > 1)
• Observing the true sender more than once

“False positive”: P(Ki≠0 > 1)
• Observing a wrong crowd member more than once

“Confidence”: P(Ki≠0 ≤ 1 | K0 > 1)
• Observing only the true sender more than once
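PRISM computes these probabilities exactly; as an informal cross-check, the same quantities can be estimated by Monte Carlo simulation. A hedged Python sketch (the parameters, e.g. 5 honest and 1 corrupt member, are illustrative, chosen to match the 1/6-hostile setting used later):

import random

def observe_paths(n_honest, n_bad, p_f, n_paths):
    """Count how many times each honest member is recorded as the apparent
    sender, i.e. forwards the message directly to a corrupt member."""
    n = n_honest + n_bad               # members 0..n_honest-1 honest, rest corrupt
    counts = [0] * n_honest
    for _ in range(n_paths):
        prev = 0                       # member 0 is the true sender
        while True:
            nxt = random.randrange(n)  # forward to a uniformly chosen member
            if nxt >= n_honest:        # corrupt member: records its predecessor
                counts[prev] += 1
                break
            prev = nxt
            if random.random() >= p_f: # delivered to the recipient: path ends
                break
    return counts

def estimate(n_honest=5, n_bad=1, p_f=0.8, n_paths=6, trials=20000):
    pos = false_pos = conf = 0
    for _ in range(trials):
        k = observe_paths(n_honest, n_bad, p_f, n_paths)
        if k[0] > 1:                              # true sender seen twice or more
            pos += 1
            if all(ki <= 1 for ki in k[1:]):      # ... and nobody else was
                conf += 1
        if any(ki > 1 for ki in k[1:]):           # some wrong member seen twice
            false_pos += 1
    print("P(K0>1) ~", pos / trials)
    print("P(Ki>1 for some i!=0) ~", false_pos / trials)
    print("P(only K0>1 | K0>1) ~", conf / pos if pos else float("nan"))

estimate()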

Size of State Space

[Chart: number of states (up to roughly 15,000,000) as a function of the number of honest routers (3–6) and the number of paths (5–20).]

All hostile routers are treated as a single router, selected with probability 1/6

Sender Detection (Multiple Paths)

[Chart: probability of observing the true sender (0%–50%), plotted against the number of honest routers (3–6) and the number of paths (6–24).]

All configurations satisfy probable innocence

Probability of observing the true sender increases with the number of paths observed
… but decreases as the crowd size grows

Is this an attack?
• Reiter & Rubin: absolutely not
• But…
  – Can’t avoid building new paths
  – Hard to prevent the attacker from correlating same-sender paths

(1/6 of the routers are hostile)

Attacker’s Confidence

[Chart: attacker’s confidence (80%–100%), plotted against the number of honest routers (3–6) and the number of paths (6–24).]

“Confidence” = probability of detecting only the true sender

Confidence grows with crowd size
Maybe this is not so strange
• The true sender appears in every path, others only with small probability
• Once the attacker sees somebody twice, he knows it’s the true sender

Is this an attack?
• Large crowds: lower probability of catching senders, but higher confidence that the caught user is the true sender
• But what about deniability?

(1/6 of the routers are hostile)

Probabilistic Contract Signing

Slides borrowed from Vitaly Shmatikov

Probabilistic Fair Exchange

Two parties exchange items of value
• Signed commitments (contract signing)
• Signed receipt for an email message (certified email)
• Digital cash for digital goods (e-commerce)

Important if the parties don’t trust each other
• Need assurance that if one does not get what it wants, the other doesn’t get what it wants either

Fairness is hard to achieve
• Gradual release of verifiable commitments
• Convertible, verifiable signature commitments
• Probabilistic notions of fairness

Properties of Fair Exchange Protocols

Fairness
• At each step, the parties have approximately equal probabilities of obtaining what they want

Optimism
• If both parties are honest, the exchange succeeds without involving a judge or trusted third party

Timeliness
• If something goes wrong, the honest party does not have to wait long to find out whether the exchange succeeded

Rabin’s Beacon

A “beacon” is a trusted party that publicly broadcasts a randomly chosen number between 1 and N every day
• Michael Rabin. “Transaction protection by beacons.” Journal of Computer and System Sciences, Dec 1983.

[Diagram: a calendar (Jan 27 – Feb 1, …) with the random value broadcast on each day.]

Contract

CONTRACT(A, B, future date D, contract terms)
• The exchange of commitments must be concluded by date D

Rabin’s Contract Signing Protocol

A → B: sigA ”I am committed if 1 is broadcast on day D”
B → A: sigB ”I am committed if 1 is broadcast on day D”
…
A → B: sigA ”I am committed if i is broadcast on day D”
B → A: sigB ”I am committed if i is broadcast on day D”
…
A → B: sigA ”I am committed if N is broadcast on day D”
B → A: sigB ”I am committed if N is broadcast on day D”

2N messages are exchanged if both parties are honest

Probabilistic Fairness

Suppose B stops after receiving A’s i-th message
• B has sigA ”committed if 1 is broadcast”, sigA ”committed if 2 is broadcast”, … sigA ”committed if i is broadcast”
• A has sigB ”committed if 1 is broadcast”, … sigB ”committed if i−1 is broadcast”

… and the beacon broadcasts number b on day D
• If b < i, then both A and B are committed
• If b > i, then neither A nor B is committed
• If b = i, then only A is committed — this happens only with probability 1/N
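The 1/N window of unfairness is easy to confirm by simulation; a minimal Python sketch (N and i are arbitrary illustrative values):

import random
from collections import Counter

def rabin_outcome(N, i):
    """B stops after receiving A's i-th message: B holds A's commitments for
    values 1..i, A holds B's commitments for values 1..i-1. The beacon then
    broadcasts b, uniform on 1..N. Returns (A committed, B committed)."""
    b = random.randint(1, N)
    a_committed = b <= i        # B holds sigA "committed if b" for b <= i
    b_committed = b <= i - 1    # A holds sigB "committed if b" for b <= i-1
    return a_committed, b_committed

# The (True, False) outcome -- only A committed -- occurs with probability ~1/N.
print(Counter(rabin_outcome(N=100, i=40) for _ in range(100_000)))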

Properties of Rabin’s Protocol

Fair
• The difference between A’s probability of obtaining B’s commitment and B’s probability of obtaining A’s commitment is at most 1/N
  – But the communication overhead is 2N messages

Not optimistic
• Needs input from the third party in every transaction
  – The same input serves all transactions on a given day and is sent out as a one-way broadcast. Maybe this is not so bad!

Not timely
• If one of the parties stops communicating, the other does not learn the outcome until day D

BGMR Probabilistic Contract Signing

[Ben-Or, Goldreich, Micali, Rivest ’85-90]

Doesn’t need beacon input in every transaction
Uses sigA ”I am committed with probability pA” instead of sigA ”I am committed if i is broadcast on day D”

Each party decides how much to increase the probability at each step
• A receives sigB ”I am committed with probability pB” from B
• Sets pA = min(1, λ·pB), where λ > 1 is a parameter chosen by A
• Sends sigA ”I am committed with probability pA” to B
• … the algorithm for B is symmetric

CONTRACT(A, B, future date D, contract terms)
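A sketch of the escalation loop, assuming the dropped parameter is a multiplicative factor λ > 1 (the starting value and the two factors below are illustrative choices, not values from the protocol):

def bgmr_exchange(lam_a=2.0, lam_b=1.2, p0=0.1):
    """Each party raises the probability in its next commitment to
    min(1, lambda * <probability it just received>)."""
    transcript = []
    p_a, p_b = p0, 0.0
    while p_b < 1.0:
        transcript.append(("A -> B", round(p_a, 3)))   # sigA "committed with prob p_a"
        p_b = min(1.0, lam_b * p_a)
        transcript.append(("B -> A", round(p_b, 3)))   # sigB "committed with prob p_b"
        p_a = min(1.0, lam_a * p_b)
    return transcript

for direction, p in bgmr_exchange():
    print(direction, p)    # probabilities escalate until both reach 1.00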

BGMR Message Flow

A → B: sigA ”I am committed with probability 0.10”
B → A: sigB ”I am committed with probability 0.12”
A → B: sigA ”I am committed with probability 0.20”
B → A: sigB ”I am committed with probability 0.23”
…
A → B: sigA ”I am committed with probability 1.00”
B → A: sigB ”I am committed with probability 1.00”

Conflict Resolution

[Diagram: either party can submit the signed messages it holds to the judge: B submits sigA ”I am committed with probability pA1” (and later sigA ”… with probability pA2”), A submits sigB ”I am committed with probability pB1”. The judge waits until date D, picks a random value from [0,1], and announces the same verdict (“Binding” or “Canceled”) to both parties.]

If the judge’s random value is ≤ the probability in the submitted evidence (e.g., pB1), the contract is binding, else the contract is canceled.

Judge

Waits until date D to decide
Announces the verdict to both parties
Tosses the coin once for each contract

Remembers previous coin tosses
• Constant memory: uses pseudo-random functions with a secret input to produce repeatable coin tosses for each contract

Does not remember previous verdicts
• The same coin toss combined with different evidence (a signed message with a different probability value) may result in a different verdict
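One way to realize “constant memory via pseudo-random functions” is to derive the coin from a keyed hash of the contract. A sketch under my own choice of primitive (HMAC-SHA256); BGMR do not prescribe this particular construction:

import hmac, hashlib

SECRET = b"judge's long-term secret key"   # never revealed

def coin_for_contract(contract_id: str) -> float:
    """Deterministic 'coin toss' in [0, 1): the same contract always yields the
    same value, so the judge need not remember previous tosses."""
    digest = hmac.new(SECRET, contract_id.encode(), hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def verdict(contract_id: str, p_in_evidence: float) -> str:
    # Same coin, different evidence (a different probability value) can give a
    # different verdict, exactly as described above.
    return "Binding" if coin_for_contract(contract_id) <= p_in_evidence else "Canceled"

print(verdict("CONTRACT(A,B,D,terms)", 0.12))
print(verdict("CONTRACT(A,B,D,terms)", 0.95))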

Privilege and Fairness

Privilege
• A party is privileged if it has the evidence to cause the judge to declare the contract binding
• Intuition: the contract binds either both parties or neither; what matters is the ability to make the contract binding

Fairness
• At any step where Prob(B is privileged) > ν, Prob(A is not privileged | B is privileged) < ε
• Intuition: at each step, the parties should have comparable probabilities of causing the judge to declare the contract binding (privilege must be symmetric)

Properties of BGMR Protocol

Fair
• Privilege is almost symmetric at each step: if Prob(B is privileged) > pA0, then Prob(A is not privileged | B is privileged) < 1 − 1/λ

Optimistic
• Two honest parties don’t need to invoke the judge

Not timely
• The judge waits until day D to toss the coin
• What if the judge tosses the coin and announces the verdict as soon as he is invoked?

Formal Model

The protocol should ensure fairness given any possible behavior by a dishonest participant
• Contact the judge although communication hasn’t stopped
• Contact the judge more than once
• Delay messages from the judge to the honest participant

Need nondeterminism
• To model the dishonest participant’s choice of actions

Need probability
• To model the judge’s coin tosses

The model is a Markov decision process

Constructing the Model

Discretize the probability space of coin tosses
• The coin takes any of N values with equal probability

Fix each party’s “probability step”
• The rate of increase in the probability value contained in the party’s messages determines how many messages are exchanged
• Together these choices define the state space

A state is unfair if privilege is asymmetric
• Difference in evidence, not difference in commitments

Compute the probability of reaching an unfair state for different values of the parties’ probability steps
• Use PRISM

Attack Strategy

Dishonest B’s probability of driving the protocol to an unfair state is maximized by this strategy:

1. Contact the judge as soon as the first message from A arrives
2. The judge tries to send a verdict to A (the verdict is probably negative, since A’s message contains a low probability value)
3. B delays the judge’s verdicts sent to A
4. B contacts the judge again with each new message from A, until a positive verdict is obtained

This strategy only works in the timely protocol
• In the original protocol, the coin is not tossed and the verdict is not announced until day D

Conflict between optimism and timeliness

Analysis Results

For a higher probability of winning, dishonest B must exchange more messages with honest A

[Chart: probability of reaching a state where B is privileged and A is not, plotted against the increase in B’s probability value at each step (a lower increase means more messages must be exchanged).]

Attacker’s Tradeoff

Linear tradeoff for dishonest B between the probability of winning and the ability to delay the judge’s messages to A

Without complete control of the communication network, B may settle for a lower probability of winning

[Chart: probability of reaching a state where B is privileged and A is not, plotted against the expected number of messages before an unfair state is reached.]

Summary

Probabilistic contract signing is a good testbed for probabilistic model checking techniques
• Standard formal analysis techniques are not applicable
• Combination of nondeterminism and probability
• Good for quantifying tradeoffs

Probabilistic contract signing is subtle
• Unfairness as asymmetric privilege
• Optimism cannot be combined with timeliness, at least not in the obvious way