statistical measurement of information leakage file–we define i(q,w) = i(q;y) = hvhgv •maximum...

28
Statistical Measurement of Information Leakage Tom Chothia Univ. of Birmingham Joint work with Kostas Chatzikokolakis and Apratim Guha, and Vitaly Smirnov

Upload: duongdan

Post on 28-Aug-2019

224 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Statistical Measurement of Information Leakage file–We define I(Q,W) = I(Q;Y) = hvhgv •Maximum rate is the Channel Capacity: C(W) = Max Q I(Q,W) the most information you can send

Statistical Measurement ofInformation Leakage

Tom ChothiaUniv. of Birmingham

Joint work with Kostas Chatzikokolakisand Apratim Guha, and Vitaly Smirnov

Page 2: Statistical Measurement of Information Leakage file–We define I(Q,W) = I(Q;Y) = hvhgv •Maximum rate is the Channel Capacity: C(W) = Max Q I(Q,W) the most information you can send

Overview

• Estimate information leakage statistically fromtrail runs of a real system.– Automatic tool to calculate information leakage.

• We work out bounds on the possible error.

• For actuate results you need more samplesthan the (no. of inputs) x (no. of observations).

• Example: leakage from e-Passport

Page 3: Statistical Measurement of Information Leakage file–We define I(Q,W) = I(Q;Y) = hvhgv •Maximum rate is the Channel Capacity: C(W) = Max Q I(Q,W) the most information you can send

Information Theory

• A Channel, has inputs X, outputs Y, and a probabilitytransition matrix W(x|y).

• Information sent across the channel = I(X;Y),– We define I(Q,W) = I(Q;Y) = hvhgv

• Maximum rate is the Channel Capacity:

C(W) = MaxQ I(Q,W)

the most information you can send across a channel nomatter how data is sent.

Page 4: Statistical Measurement of Information Leakage file–We define I(Q,W) = I(Q;Y) = hvhgv •Maximum rate is the Channel Capacity: C(W) = Max Q I(Q,W) the most information you can send

Information Leakage = Capacity(System)

• Think of the whole system as a channel.– The guilty user is the input to the “channel”.– The observable actions are the outputs from the

“channel”.

• Capacity tells us what we can learn about theusers from the observable actions.

• Similar approach can be taken for InformationFlow e.g. Millen, Clark et al. etc.

Page 5: Statistical Measurement of Information Leakage file–We define I(Q,W) = I(Q;Y) = hvhgv •Maximum rate is the Channel Capacity: C(W) = Max Q I(Q,W) the most information you can send

Applying this to Real Systems

How do we apply information theoreticmeasures of information leakage to realsystems?

• Source code may be too complex to analysedirectly or model / behave probabilistically.

• Leak may be caused by the implementation:– Time based attack on RSA, (Paul Kocher)– Bandwidth attack on Tor (Murdoch & Danezis)

Page 6: Statistical Measurement of Information Leakage file–We define I(Q,W) = I(Q;Y) = hvhgv •Maximum rate is the Channel Capacity: C(W) = Max Q I(Q,W) the most information you can send

Method of AnalysingInformation Leakage

• To analyse a system we define the inputs andoutputs.– Some abstraction might be needed to make the number

of observations manageable

• We run tests of the system for each input.

• From these tests we estimate a probability matrix.

• We estimate capacity, from the matrix using theBlahut-Arimoto algorithm.

Page 7: Statistical Measurement of Information Leakage file–We define I(Q,W) = I(Q;Y) = hvhgv •Maximum rate is the Channel Capacity: C(W) = Max Q I(Q,W) the most information you can send

Terms

W : the matrix of the true system.

Wn : a matrix estimated from n samples.

Q : the input dist. that maximise M.I.

Qm(Wn) : the B.A. algorithm applied to Wn.

C = I(Q,W) : the true system capacity.

Cn,m = I(Qm(Wn),Wn) : estimate of capacity ??

ˆ

ˆ ˆ

ˆ ˆ ˆ ˆ

Page 8: Statistical Measurement of Information Leakage file–We define I(Q,W) = I(Q;Y) = hvhgv •Maximum rate is the Channel Capacity: C(W) = Max Q I(Q,W) the most information you can send

Convergence

Theorem: Cn,m almost surely convergencesto C as n,m

i.e., for any pe and error e there exists n‘ &m’ such that for n > n’ and m > m’:

p(| C - Cn,m | > e ) < peˆ

ˆ

Page 9: Statistical Measurement of Information Leakage file–We define I(Q,W) = I(Q;Y) = hvhgv •Maximum rate is the Channel Capacity: C(W) = Max Q I(Q,W) the most information you can send

The Distribution of Anonymity

We can get bounds on the error by ask whatdistribution Cn,m comes from.

Adapting a statistical method from Rao:

• We find the Taylor expansion of the Cn,m• We drop the terms smaller than sampleSize-2

• We then calculate the mean and variance.• We find the distribution using the central limit

theory.

ˆ

ˆ

Page 10: Statistical Measurement of Information Leakage file–We define I(Q,W) = I(Q;Y) = hvhgv •Maximum rate is the Channel Capacity: C(W) = Max Q I(Q,W) the most information you can send

For Non-Zero MutualInformation

When the true value is not 0, anestimation of capacity is drawn from anormal distribution with:

Mean: I(Qm(Wn),W) + (I-1)(J-1) + O(n-2) 2nVariance: ...

ˆ ˆ

Page 11: Statistical Measurement of Information Leakage file–We define I(Q,W) = I(Q;Y) = hvhgv •Maximum rate is the Channel Capacity: C(W) = Max Q I(Q,W) the most information you can send

Variance

1. ∑x Q(x).(∑y W(y|x). (log( Q(x).W(y|x) ))2

n ∑x’ Q(x’)W(y|x’) - (∑y W(y|x). log( Q(x).W(y|x) ))2

∑x’ Q(x’)W(y|x’)+O(n-2)

Page 12: Statistical Measurement of Information Leakage file–We define I(Q,W) = I(Q;Y) = hvhgv •Maximum rate is the Channel Capacity: C(W) = Max Q I(Q,W) the most information you can send

Results for I = 0

When the true value is 0, an estimation ofcapacity (or mutual information) isdrawn from the distribution:

2n.I ~ χηι2((noOfInputs-1)(noOfOutputs-1))

Mean: (noOfInputs-1)(noOfOutputs-1)/2Variance: (noOfInputs-1)(noOfOutputs-1)/2n2

Page 13: Statistical Measurement of Information Leakage file–We define I(Q,W) = I(Q;Y) = hvhgv •Maximum rate is the Channel Capacity: C(W) = Max Q I(Q,W) the most information you can send

Upper Bound on the Variance

• In both cases var(C(W)) < I.J / n

• Rule of thumb:– if I.J >> n the variance will be low and the

results actuate.

– If you can get this many samples thenstatically analysis is useful, otherwise not.

Page 14: Statistical Measurement of Information Leakage file–We define I(Q,W) = I(Q;Y) = hvhgv •Maximum rate is the Channel Capacity: C(W) = Max Q I(Q,W) the most information you can send

To Analyse a System.

• We define the inputs (I) and outputs (J).

• Run n tests of the system with n >> I.J

• Estimate the matrix and find C = I(Qm(Wn),Wn)

• Point Estimate is:Max ( 0, I(Qm(Wn),Wn) – (I-1)(J-1)/2n )

ˆˆ ˆ ˆ

ˆ ˆ ˆ

Page 15: Statistical Measurement of Information Leakage file–We define I(Q,W) = I(Q;Y) = hvhgv •Maximum rate is the Channel Capacity: C(W) = Max Q I(Q,W) the most information you can send

Using the Distributions

Observed Capacity Estimate

Prob

C

Page 16: Statistical Measurement of Information Leakage file–We define I(Q,W) = I(Q;Y) = hvhgv •Maximum rate is the Channel Capacity: C(W) = Max Q I(Q,W) the most information you can send

Using the Distributions

Observed Capacity Estimate

Prob

C

95% confidence

Page 17: Statistical Measurement of Information Leakage file–We define I(Q,W) = I(Q;Y) = hvhgv •Maximum rate is the Channel Capacity: C(W) = Max Q I(Q,W) the most information you can send

Using the Distributions

Observed Capacity Estimate

Prob

C

95% confidence

Page 18: Statistical Measurement of Information Leakage file–We define I(Q,W) = I(Q;Y) = hvhgv •Maximum rate is the Channel Capacity: C(W) = Max Q I(Q,W) the most information you can send

Using the Distributions

Observed Capacity Estimate

Prob

C

Lower bound

Upper bound

Page 19: Statistical Measurement of Information Leakage file–We define I(Q,W) = I(Q;Y) = hvhgv •Maximum rate is the Channel Capacity: C(W) = Max Q I(Q,W) the most information you can send

Using the Distributions

Observed Capacity Estimate

Prob

C

Page 20: Statistical Measurement of Information Leakage file–We define I(Q,W) = I(Q;Y) = hvhgv •Maximum rate is the Channel Capacity: C(W) = Max Q I(Q,W) the most information you can send

Using the Distributions

Observed Capacity Estimate

Prob

C

Page 21: Statistical Measurement of Information Leakage file–We define I(Q,W) = I(Q;Y) = hvhgv •Maximum rate is the Channel Capacity: C(W) = Max Q I(Q,W) the most information you can send

Using the Distributions

Observed Capacity Estimate

Prob

C

Lower bound

Upper bound

Page 22: Statistical Measurement of Information Leakage file–We define I(Q,W) = I(Q;Y) = hvhgv •Maximum rate is the Channel Capacity: C(W) = Max Q I(Q,W) the most information you can send
Page 23: Statistical Measurement of Information Leakage file–We define I(Q,W) = I(Q;Y) = hvhgv •Maximum rate is the Channel Capacity: C(W) = Max Q I(Q,W) the most information you can send

Basic Access Control

Reader Passport --- GET CHALLENGE → Pick random NP ←----- NP --------- Pick random NR,KR --- {NR,NP,KR}Ke,MACKm({NR,NP,KR}Ke) → Check MAC, Decrypt, Check NP Pick random KP ← {NP,NR,KP}Ke,MACKm({NP,NR, KP}Ke)-- Check MAC, Decrypt, Check NR

Page 24: Statistical Measurement of Information Leakage file–We define I(Q,W) = I(Q;Y) = hvhgv •Maximum rate is the Channel Capacity: C(W) = Max Q I(Q,W) the most information you can send

Attack Part 1

Attacker eavesdrops on Alice using her passport

Reader Alice --- GET CHALLENGE → Pick random NP ←----- NP --------- Pick random NR,KR --- M = {NR,NP,KR}KAeMACKAm({NR,NP,KR}KAe) →

Attack records message M.

Page 25: Statistical Measurement of Information Leakage file–We define I(Q,W) = I(Q;Y) = hvhgv •Maximum rate is the Channel Capacity: C(W) = Max Q I(Q,W) the most information you can send

Attack Part 2

Attacker ???? --- GET CHALLENGE → Pick random NP2 ←----- NP2 ---------

--- {NR,NP,KR}KAe,MACKAm({NR,NP,KR}KAe) →

Check MAC, Decrypt, Check NPIf

Page 26: Statistical Measurement of Information Leakage file–We define I(Q,W) = I(Q;Y) = hvhgv •Maximum rate is the Channel Capacity: C(W) = Max Q I(Q,W) the most information you can send

A Time-Based Attack?

• UK,German, Russian, Irish passportreturns error 6900 in both situation.

• But what about a timing attack?

• Using capacity measure:– Input is “same passport” or “different

passport”.– Output: the response time in seconds

to 4 d.p.

Page 27: Statistical Measurement of Information Leakage file–We define I(Q,W) = I(Q;Y) = hvhgv •Maximum rate is the Channel Capacity: C(W) = Max Q I(Q,W) the most information you can send

Information Leakage of the e-Passports

We find:

Therefore we can trace a passport withwill a high probability. E.g. > 95%

Page 28: Statistical Measurement of Information Leakage file–We define I(Q,W) = I(Q;Y) = hvhgv •Maximum rate is the Channel Capacity: C(W) = Max Q I(Q,W) the most information you can send

Conclusion

• We can measure information theoreticinformation leakage from sampleddata alone.

• Point and click analyse of systems.

• Example: a new traceability attackagainst e-Passports