Protecting Location Privacy Through Path Confusion [1]

Quantifying Location Privacy: Reza Shokri, George Theodorakopoulos, Jean-Yves Le Boudec, and Jean-Pierre Hubaux. Presented by: Solomon Njorombe


TRANSCRIPT

Page 1

Quantifying Location Privacy
Reza Shokri, George Theodorakopoulos, Jean-Yves Le Boudec, and Jean-Pierre Hubaux
Presented by: Solomon Njorombe

Page 2

Abstract

• Advances in personal communication raise new security and privacy issues
• Many Location-Privacy Protection Mechanisms (LPPMs) have been proposed, but with no systematic quantification and with incomplete assumptions
• The paper provides a framework for analyzing LPPMs: it formalizes the information and attacks available to the adversary, and the performance of those attacks
• Adversary inference attacks are evaluated on accuracy, certainty, and correctness
• The authors implement the Location-Privacy Meter tool
• They assess popular metrics (entropy and k-anonymity) and find low correlation with the adversary's success

Page 3

Introduction

Page 4

Introduction

• Smartphones carry location sensors: GPS and triangulation
• Convenient, but they leave traces of your whereabouts
• From these traces one can infer habits, interests, relationships, and secrets
• Increased computing power, data-mining algorithms, and parallel database analysis amplify the threat to privacy
• Users have the right to control the information they share: minimal information, or sharing only with trusted entities

Page 5

Introduction: Motivation

Aim: advance the quantification of LPPM performance. Why?

• The literature lacks a unified, generic, formal framework, hence divergent contributions and confusion over which LPPM is more effective
• Humans are bad estimators of risk
• A meaningful way to compare LPPMs is needed
• The literature is not yet mature in this respect

Page 6

Introduction: Contributions

1. A generic model that formalizes adversarial attacks: tracking and localization on anonymous traces are defined as statistical inference problems
2. Statistical methods to evaluate the performance of such inference attacks, with the expected estimation error as the right metric
3. The Location-Privacy Meter tool
4. Evidence of the inappropriateness of existing metrics

Page 7

Framework

Page 8

Framework

• Location privacy is modeled as a tuple ⟨𝒰, 𝒜, LPPM, 𝒪, ADV, METRIC⟩
• 𝒰: set of mobile users
• 𝒜: actual traces of the users
• LPPM: Location-Privacy Preserving Mechanism; acts on 𝒜 and produces 𝒪
• 𝒪: traces observed by the adversary
• ADV: adversary; tries to infer a having observed o, relying on knowledge of the LPPM and of the users' mobility models
• METRIC: metric for the performance and success of ADV; it implies the users' location privacy

Page 9

Framework: Mobile Users

• 𝒰: set of N mobile users moving within an area partitioned into M regions
• 𝒯: set of time instants at which users can be observed; time is discrete
• The spatiotemporal position of users is modeled through events and traces
• Event: a triplet ⟨u, r, t⟩, with user u, region r, and time instant t
• Trace of user u: a T-size vector of events au = (au(1), au(2), …, au(T)) (a representation sketch follows)
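To make the event-and-trace notation concrete, here is a minimal Python sketch; the class and function names are illustrative, not from the paper or its tool.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class Event:
    """One spatiotemporal observation <u, r, t>."""
    user: int    # u: index of the user
    region: int  # r: one of the M regions partitioning the area
    time: int    # t: discrete time instant in 1..T

# A trace is a T-size vector of events belonging to one user.
Trace = List[Event]

def trace_of(user: int, regions: List[int]) -> Trace:
    """Build user u's trace from the regions visited at t = 1..T."""
    return [Event(user, r, t) for t, r in enumerate(regions, start=1)]
```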

Page 10

Framework: Mobile Users

• 𝒜u: set of all traces that may belong to user u
• Actual trace of u: the single true trace of u for the period t = 1…T, namely (au(1), au(2), …, au(T))
• Actual events: the events au(1), au(2), … of the actual trace of user u
• 𝒜: set of all possible traces of all users

Page 11

Framework: Location-Privacy Preserving Mechanisms (LPPM)

• LPPM: a mechanism that modifies and distorts actual traces before they are exposed
• Different implementations exist:
• offline (e.g., applied to a database) vs. online (on the fly)
• centralized (a central anonymity server) vs. distributed (running on users' phones)
• The LPPM receives the N actual traces and modifies them in 2 steps:
• Obfuscation: each location in an event is replaced with a location pseudonym
• Anonymization: the user part of each trace is replaced with a user pseudonym

Page 12

Framework: Location-Privacy Preserving Mechanisms (LPPM)

• Obfuscated event: ⟨u, r′, t⟩, where r′ is a location pseudonym
• Obfuscated trace: the vector of a user's obfuscated events
• 𝒪u: set of all possible obfuscated traces of user u

Page 13

Framework: Location-Privacy Preserving Mechanisms (LPPM)

• Obfuscation mechanism: a function that maps a trace into a random variable taking values in the set of obfuscated traces, with probability density function f
• Methods used by LPPMs to reduce the accuracy and/or precision of the events' spatiotemporal information:
• perturbation
• adding dummy regions
• reducing precision (merging regions)
• location hiding

Page 14

Framework: Location-Privacy Preserving Mechanisms (LPPM)

• Anonymization mechanism: a function randomly chosen from the functions mapping users to user pseudonyms
• It is drawn according to a probability distribution
• Here: a random permutation over the N! possible permutations of the users (sketched below)
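As an illustration of the anonymization step, a minimal sketch, assuming the LPPM draws the permutation uniformly at random; the function name is hypothetical.

```python
import random

def anonymize(obfuscated_traces):
    """Replace the user part of each trace with a pseudonym, using a
    permutation drawn uniformly from the N! possible ones."""
    n = len(obfuscated_traces)
    pseudonyms = list(range(1, n + 1))
    random.shuffle(pseudonyms)  # uniform over all N! permutations
    # The pseudonym -> trace view is what the adversary observes;
    # the permutation itself stays secret.
    return {pseudonyms[u]: trace for u, trace in enumerate(obfuscated_traces)}
```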

Page 15

Framework: Location-Privacy Preserving Mechanisms (LPPM)

[Diagram: the set of actual traces is first obfuscated (set of obfuscated traces), then anonymized (set of anonymized traces), yielding the set of observed traces; the observed traces are an instantiation of the corresponding random variables.]

Page 16

Framework: Location-Privacy Preserving Mechanisms (LPPM)

• The LPPM is summarized by the probability distribution giving the probability of mapping a set of actual traces a into a set of observed traces o
• The adversary's aim is to reconstruct a when given o
• The set of all observable traces of user u

Page 17

Framework: Adversary

• Knows the anonymization and obfuscation probability distribution functions
• Has access to training traces and to users' public information
• From this information, constructs a mobility profile Pu for each user
• Given the LPPM (i.e., its obfuscation and anonymization functions), the users' profiles {(u, Pu)}, and the observed traces {o1, o2, …, oN}, the attacker runs an inference attack, formulating its objective as a subset of users, regions, and time instants

Page 18

Framework: Adversary

• Presence/absence disclosure attacks: infer the user-region relationship over time
1. Tracking attacks: ADV tries to recover the full/partial sequence of a user's trace
2. Localization attacks: ADV targets a single event in a user's trace
• Meeting disclosure attacks: ADV is interested in the proximity between 2 users (a meeting at a given time)
• The paper's algorithms implement the general attack: try to recover the traces of all users

Page 19

Framework: Evaluation

• Traces are probabilistically generated:
• actual traces: probabilistic, driven by the users' mobility profiles
• observed traces: probabilistic, driven by the LPPM
• The attack output can be:
• a probability distribution over the possible outcomes
• the most probable outcome
• the expected outcome under the distribution of possible outcomes
• any function of the actual trace

Page 20

Framework: Evaluation

• φ: a function encoding the attacker's objective; applied to the actual traces a, it yields xc = φ(a), the correct answer to the attack
• X: the set of values φ can take for a given attack (M regions, N users, or M^T traces of one user)
• But the attacker cannot obtain the exact xc; the task is inherently probabilistic
• Best hope: extract all information about it from the observed traces

Page 21

Framework: Evaluation

• The extracted information takes the form Pr(x|o), where x ranges over all values derivable from the observation o
• Uncertainty: the ambiguity of Pr(x|o) with respect to finding a unique answer (maximal under a uniform distribution)
• Inaccuracy: the difference between Pr(x|o) and the adversary's estimate P̂r(x|o); the adversary must estimate, as it does not have infinite resources
• But uncertainty and inaccuracy do not quantify the user's privacy; correctness does

Page 22

Framework: Evaluation

• Correctness: the distance between the result of the attack and the true answer

[Diagram: the three dimensions of an attack's performance: accuracy, certainty, correctness.]

• Accuracy and certainty may not be equivalent to correctness; consider a situation with insufficient training traces
• Only correctness really matters

Page 23

Framework: Evaluation

• Accuracy: quantified with a confidence interval and a confidence level. The adversary estimates P̂r(X = x|o) for some x; the estimate lies within some confidence interval at confidence level 1 − α. Computing the exact distribution would be prohibitively costly.
• Certainty: quantified through entropy (a concentrated vs. a uniform distribution); higher entropy means lower certainty (a normalized-entropy sketch follows)
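A small sketch of the certainty side: the entropy of the adversary's posterior Pr(x|o). The normalization by log|X| is a common convention and an assumption here, not something the slides specify.

```python
import math

def normalized_entropy(posterior):
    """Entropy of Pr(x|o), scaled to [0, 1]: 0 means a fully
    concentrated distribution (high certainty), 1 means uniform
    (maximal entropy, lowest certainty)."""
    h = -sum(p * math.log(p) for p in posterior if p > 0)
    return h / math.log(len(posterior)) if len(posterior) > 1 else 0.0
```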

Page 24

Framework: Evaluation

• Correctness: quantified as the expected distance between the true value xc and the estimate
• If there is a distance ||·|| between members of X, the expected estimation error is LP = Σ_{x∈X} P̂r(x|o) · ||x − xc|| (computed in the sketch below)
• If the distance is 0 iff x = xc and 1 otherwise, the incorrectness would be 1 − P̂r(xc|o)
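The expected estimation error can be written directly from the slide's definition; a minimal sketch, where the choice of distance function is whatever the attack's objective calls for.

```python
def expected_estimation_error(posterior, x_correct, distance):
    """LP = sum over x of P-hat(x|o) * ||x - xc||.

    `posterior` maps each candidate answer x to the adversary's
    estimate of Pr(x|o); `distance` is the metric on X."""
    return sum(p * distance(x, x_correct) for x, p in posterior.items())

# With the 0/1 distance, incorrectness reduces to 1 - P-hat(xc|o):
zero_one = lambda x, y: 0.0 if x == y else 1.0
```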

Page 25

Framework: Evaluation

• So correctness is the metric that determines the user's privacy
• The adversary does not know xc and cannot observe this parameter
• Accuracy, certainty, and correctness are, however, largely independent of one another

Page 26

Location Privacy Meter

Page 27

Location-Privacy Preserving Mechanisms

• Two obfuscation mechanisms are implemented:
1. Precision reduction (merging regions): drop the low-order bits of the region identifier, e.g., drop µx and µy bits of the x and y coordinates
2. Location hiding: events are independently eliminated; a location is replaced with Ø with probability λh, the location-hiding level
• To import an LPPM into the tool, specify its probability functions by importing the anonymization function and the obfuscation function (a sketch of the two obfuscation mechanisms follows)
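A minimal sketch of the two obfuscation mechanisms, assuming regions are identified by (x, y) grid coordinates; the encoding details are assumptions, not the tool's actual interface.

```python
import random

def reduce_precision(x, y, mu_x, mu_y):
    """Merge regions by dropping the mu_x / mu_y low-order bits of the
    x and y coordinates of the region identifier."""
    return (x >> mu_x, y >> mu_y)

def hide_location(region, lambda_h):
    """Independently eliminate events: replace the location with None
    (standing in for the empty location) with probability lambda_h."""
    return None if random.random() < lambda_h else region

def obfuscate(trace, mu_x, mu_y, lambda_h):
    """Apply both mechanisms to a trace of (x, y) region coordinates."""
    return [hide_location(reduce_precision(x, y, mu_x, mu_y), lambda_h)
            for (x, y) in trace]
```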

Page 28

Knowledge of the Adversary

Page 29

Knowledge of the Adversary

• The adversary collects information about user mobility, which can take the form of events, transitions, or full/partial traces
• This knowledge can be encoded as:
• traces, or
• a transition-count matrix TC: an M × M matrix whose entry ij is the number of i-to-j transitions the user made that are not encoded in the traces
• The adversary also takes user mobility constraints into account

Page 30

Knowledge of the Adversary

• ADV models user mobility as a Markov chain, where Pu is the user's transition matrix: the probability that the user will move from rj to ri in the next time slot
• Objective: construct Pu starting from the prior mobility information, with the bigger goal of:
• estimating the underlying Markov chain
• completing the training trace TT towards the estimated trace ET
• Relies on the convergence of Gibbs sampling (a simplified counting sketch follows)
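The Gibbs-sampling machinery aside, the final step of profile construction can be sketched simply: once a training trace is complete, Pu is estimated by counting transitions. This count-and-normalize version (with Laplace smoothing, an assumption) omits the paper's sampling of missing events.

```python
import numpy as np

def estimate_profile(training_trace, num_regions):
    """Estimate the user's Markov transition matrix Pu from a completed
    training trace (a sequence of region indices 0..M-1)."""
    counts = np.ones((num_regions, num_regions))  # Laplace smoothing
    for r_from, r_to in zip(training_trace, training_trace[1:]):
        counts[r_from, r_to] += 1
    # Row-normalize: each row is the distribution of the next region
    # given the current one.
    return counts / counts.sum(axis=1, keepdims=True)
```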

Page 31

Tracking Attack

• ADV tries to reconstruct partial/complete actual traces

Maximum-Likelihood Tracking Attack
• Objective: find the jointly most likely traces for all users, given the observed traces
• This must be done within a space of N!·M^T elements, so a brute-force approach is not practical

Page 32

Tracking Attack: Maximum-Likelihood Tracking Attack

• Proceeds in two steps:
• Deanonymization:
• One cannot simply assign each pseudonym its most probable trace, since multiple users might be assigned the same trace
• Compute the likelihood of every trace-user pair
• Build an edge-weighted bipartite graph (set of users on one side, set of traces on the other), where each edge weight is the user-trace likelihood
• Find the maximum-weight assignment, using the Hungarian algorithm (sketched below)
• De-obfuscation (next slide)
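A minimal sketch of the deanonymization step, using SciPy's implementation of the optimal-assignment (Hungarian-style) algorithm; the log-likelihood matrix is assumed to have been computed already.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def deanonymize(log_likelihood):
    """log_likelihood[u, i]: log-likelihood that observed trace i was
    produced by user u. Returns the jointly most likely one-to-one
    user -> trace assignment (maximum-weight bipartite matching)."""
    users, traces = linear_sum_assignment(log_likelihood, maximize=True)
    return dict(zip(users.tolist(), traces.tolist()))

# Hypothetical likelihoods for 3 users and 3 observed traces:
ll = np.log([[0.7, 0.2, 0.1],
             [0.1, 0.1, 0.8],
             [0.3, 0.6, 0.1]])
print(deanonymize(ll))  # {0: 0, 1: 2, 2: 1}
```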

Page 33

Tracking Attack: Maximum-Likelihood Tracking Attack

• De-obfuscation:
• Uses the Viterbi algorithm, which maximizes the joint probability of the most likely trace
• Recursively compute the maximum probabilities up to time T
• But the interest is in the trace itself, so the algorithm backtracks to recover it
• Much like finding the shortest path in an edge-weighted directed graph whose vertex set is R × T (see the sketch below)
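A compact sketch of the Viterbi recursion over the R × T trellis. Here the user's profile Pu plays the role of the transition matrix, and the LPPM's obfuscation probabilities give the per-time emission terms; both inputs are assumed precomputed.

```python
import numpy as np

def viterbi(prior, P, emission):
    """prior: distribution over the M regions at t = 1; P[i, j]:
    transition probability from region i to j; emission[t, j]:
    probability of the observed event at time t given region j.
    Returns the jointly most likely sequence of regions."""
    T, M = emission.shape
    log_delta = np.log(prior) + np.log(emission[0])
    back = np.zeros((T, M), dtype=int)
    for t in range(1, T):
        scores = log_delta[:, None] + np.log(P)   # best way into each j
        back[t] = scores.argmax(axis=0)
        log_delta = scores.max(axis=0) + np.log(emission[t])
    path = [int(log_delta.argmax())]              # best final region...
    for t in range(T - 1, 0, -1):                 # ...then backtrack
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```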

Page 34

Tracking Attack: Distribution Tracking Attack

• Computes the distribution of traces for each user, rather than only the most likely trace
• Uses the Metropolis-Hastings (MH) algorithm: draw samples that are identically distributed according to the desired posterior distribution
• MH performs a random walk over the possible values (the candidate traces); a generic sketch follows
• Can answer a wide range of U-R-T questions, but is very computationally intensive
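A generic Metropolis-Hastings random-walk sketch, not the paper's specific proposal distribution: candidate traces are produced by a symmetric local change and accepted with the usual ratio, so the retained samples follow the desired posterior.

```python
import math
import random

def metropolis_hastings(initial, propose, log_post, num_samples, burn_in=1000):
    """propose: symmetric local modification of a candidate trace;
    log_post: unnormalized log-posterior of a trace given the
    observations. Returns samples drawn from the posterior."""
    state, lp = initial, log_post(initial)
    samples = []
    for step in range(burn_in + num_samples):
        cand = propose(state)
        lp_cand = log_post(cand)
        # Accept with probability min(1, post(cand) / post(state)).
        if lp_cand >= lp or random.random() < math.exp(lp_cand - lp):
            state, lp = cand, lp_cand
        if step >= burn_in:
            samples.append(state)
    return samples
```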

Page 35

Localization Attack

• Find the location of user u at time t
• Output: a distribution over the possible regions, from which the most probable is selected
• The attacker first needs an estimate of the user's observed trace (via the maximum-weight assignment)
• The distribution can then be computed with the Forward-Backward algorithm (sketched below)
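A sketch of the Forward-Backward computation, with the same assumed inputs as the Viterbi sketch above; row t of the result is exactly the distribution the localization attack selects from.

```python
import numpy as np

def forward_backward(prior, P, emission):
    """Posterior Pr(region at time t | all observations) for every t,
    returned as a T x M matrix with rows summing to 1."""
    T, M = emission.shape
    alpha = np.zeros((T, M))
    beta = np.ones((T, M))
    alpha[0] = prior * emission[0]
    alpha[0] /= alpha[0].sum()                    # scale for stability
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ P) * emission[t]
        alpha[t] /= alpha[t].sum()
    for t in range(T - 2, -1, -1):
        beta[t] = P @ (emission[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()
    posterior = alpha * beta
    return posterior / posterior.sum(axis=1, keepdims=True)
```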

Page 36

Meeting Disclosure Attack

• Objective 1: specify a pair of users (u and v), a region r, and a time t; the answer is computed as the product of the distributions for the two events, each established through a localization attack (see the sketch below)
• Objective 2: specify just a pair of users: how often would they have met, and in which region; also answered using localization attacks
• Objective 3: specify a location and a time; the expected number of users present, again through localization attacks

Page 37

Using The Tool: Evaluation of LPPMs

Page 38

Using The Tool: Evaluation of LPPMs

• Goals:
1. Use the Location-Privacy Meter to quantify the effectiveness of LPPMs
2. Evaluate the effectiveness of entropy and k-anonymity at quantifying location privacy
• Location samples: N = 20 users, 5-minute intervals for 8 hours (T = 96), Bay Area, M = 40 regions (5 × 8 grid)
• Privacy mechanisms:
• precision reduction
• anonymization by random permutation (unique pseudonyms 1…N)

Page 39

Using The Tool: Evaluation of LPPMs

• To consider the strongest adversary, the knowledge constructor (KC) is fed the users' actual traces
• U-R-T attack scenarios:
• LO-ATT (localization attack): for user u and time t, what is the user's location at time t?
• MD-ATT (meeting disclosure attack): in how many of the T time instants are two given people in the same region?
• AP-ATT (aggregate presence attack): for a region r and time t, what is the expected number of users present at t?
• Metric: the adversary's incorrectness

Page 40

Using The Tool: Evaluation of LPPMs

[Plot: location privacy LP_LO-ATT(u, t) for all users u and times t, for various LPPM(µx, µy, λh) configurations; measured as the adversary's incorrectness.]

Page 41

Using The Tool: Evaluation of LPPMs

[Plot: location privacy LP_MD-ATT(u, v) for all pairs of users u, v, for various LPPM(µx, µy, λh) configurations; measured as the incorrectness of the number of meetings.]

Page 42

Using The Tool: Evaluation of LPPMs

[Plot: location privacy LP_AP-ATT(r, t) for all regions r and times t, for various LPPM(µx, µy, λh) configurations; measured as the incorrectness of the number of users in a region.]

Page 43

Using The Tool: Evaluation of LPPMs

[Scatter plot. X-axis: users' privacy; Y-axis: normalized entropy. Three mechanisms: *** LPPM(2, 3, 0.9), strong; .... LPPM(1, 2, 0.5), medium; ooo LPPM(1, 0, 0.0), weak.]

Page 44

Using The Tool: Evaluation of LPPMs

[Scatter plot. X-axis: users' privacy; Y-axis: normalized k-anonymity. Three mechanisms: *** LPPM(2, 3, 0.9), strong; .... LPPM(1, 2, 0.5), medium; ooo LPPM(1, 0, 0.0), weak.]

Page 45

Conclusion

Page 46

Conclusion

• A unified formal framework to describe and evaluate a variety of location-privacy preserving mechanisms with respect to various inference attacks
• LPPM evaluation is modeled as an estimation problem, and the expected estimation error is provided as the metric
• The Location-Privacy Meter tool was designed to evaluate and compare location-privacy preserving mechanisms

Page 47

Questions

Page 48

Framework

𝒰: set of mobile users
ℛ: set of regions that partition the whole area
𝒯: time period under consideration
𝒜: set of all possible traces
𝒪: set of all observable traces
𝒰′: set of user pseudonyms
ℛ′: set of location pseudonyms
N: number of users
M: number of regions
T: number of considered time instants
N′: number of user pseudonyms
M′: number of location pseudonyms
f: obfuscation function
σ: anonymization function
au: actual trace of user u
ou: obfuscated trace of user u
oi: observed trace of the user with pseudonym i
𝒜u: set of all possible (actual) traces of user u
𝒪u: set of all possible obfuscated traces of user u
𝒪i: set of all observable traces of the user with pseudonym i
Pu: profile of user u
φ: attacker's objective
X: set of values that φ can take