Protecting Location Privacy Through Path Confusion [1]
Baik Hoh, Marco Gruteser. CS898 presentation by Jason Tomlinson.
Quantifying Location Privacy
Reza Shokri, George Theodorakopoulos, Jean-Yves Le Boudec, and Jean-Pierre Hubaux
Presented by: Solomon Njorombe
Abstract
• Progress in personal communication devices raises serious privacy concerns
• Many Location-Privacy Protection Mechanisms (LPPMs) have been proposed
• No systematic quantification so far, and existing work rests on incomplete assumptions
• A framework for analyzing LPPMs
  • Captures the information and attacks available to the adversary
  • Formalizes attack performance
• Adversary inference attacks (accuracy, certainty, correctness)
• Implements the Location-Privacy Meter
• Assesses popular metrics (entropy and k-anonymity)
  • Both correlate poorly with the adversary's success
Introduction
• Smartphones carry location sensors: GPS, triangulation
  • Convenient, but they leave traces of your whereabouts
  • Traces allow inference of habits, interests, relationships, secrets
• Increased computing power
  • Data-mining algorithms, parallel database analysis
  • A threat to privacy
• Users have the right to control the information they share
  • Share minimal information, or share only with trusted entities
Introduction: Motivation
Aim: advance the quantification of LPPM performance
• Why?
  • No unified, generic formal framework exists, hence divergent contributions and confusion over which LPPM is more effective
  • Humans are bad estimators of risk
  • A meaningful way to compare LPPMs is needed
  • The literature is not yet mature on this topic
Introduction: Contributions
1. Generic model to formalize adversarial attacks
  • Tracking and localization on anonymous traces are defined as statistical inference problems
2. Statistical methods to evaluate the performance of such inference attacks
  • Expected estimation error is the right metric
3. Location-Privacy Meter
4. Inappropriateness of existing metrics
Framework
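• The location-privacy framework is a tuple $\langle \mathcal{U}, \mathcal{A}, \text{LPPM}, \mathcal{O}, \text{ADV}, \text{METRIC} \rangle$
  • $\mathcal{U}$: set of mobile users
  • $\mathcal{A}$: actual traces of the users
  • LPPM: Location-Privacy Preserving Mechanism; acts on $a \in \mathcal{A}$ and produces $o \in \mathcal{O}$
  • $\mathcal{O}$: traces observed by the adversary
  • ADV: adversary; tries to infer a having observed o, relying on knowledge of the LPPM and of the users' mobility model
  • METRIC: measures the performance and success of ADV, which in turn implies the users' location privacy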
Framework: Mobile Users
• $\mathcal{U} = \{u_1, \ldots, u_N\}$: set of N mobile users within an area partitioned into M regions $\mathcal{R} = \{r_1, \ldots, r_M\}$
• $\mathcal{T} = \{1, \ldots, T\}$: set of discrete time instants at which users can be observed
• The spatiotemporal position of users is modeled through events and traces
  • Event: tuple $\langle u, r, t \rangle$ where $u \in \mathcal{U}$, $r \in \mathcal{R}$, $t \in \mathcal{T}$
  • Trace of user u: a T-size vector of events
• $\mathcal{A}_u$: set of all traces that may belong to user u
• Actual trace of u: the only true trace of u for the period t = 1…T, $a_u = (a_u(1), a_u(2), \ldots, a_u(T))$
• Actual events: the events of the actual trace of user u, $a_u(1), a_u(2), \ldots, a_u(T)$
• $\mathcal{A} = \mathcal{A}_{u_1} \times \cdots \times \mathcal{A}_{u_N}$: set of all possible traces of all users
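The event and trace definitions above can be sketched as a small data model. This is only an illustration: the names `Event` and `make_trace`, and the choice of integers for regions and time instants, are assumptions and do not come from the paper or the slides.

```python
from typing import List, NamedTuple


class Event(NamedTuple):
    """One event <u, r, t>: user u is in region r at time instant t."""
    user: str    # u, a member of the user set
    region: int  # r, here an index in 1..M (hypothetical encoding)
    time: int    # t, here an index in 1..T


def make_trace(user: str, regions: List[int]) -> List[Event]:
    """Build a T-size trace from one region per time instant (t = 1..T)."""
    return [Event(user, r, t) for t, r in enumerate(regions, start=1)]


# A trace of length T = 4 for a hypothetical user "u1".
trace = make_trace("u1", [3, 3, 7, 8])
```

Keeping the user inside every event mirrors the tuple form on the slide, so the same structure still works once the user field is replaced by a pseudonym.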
Framework: Location-Privacy Preserving Mechanisms (LPPM)
• LPPM: a mechanism that modifies and distorts actual traces before they are exposed
• Different implementations
  • Offline (e.g. from a database) vs. online (on the fly)
  • Centralized (central anonymity server) vs. distributed (users' phones)
• Receives the N actual traces and modifies them in 2 steps
  • Obfuscation: the location of each event is replaced with a location pseudonym
  • Anonymization: the user part of each trace is replaced with a user pseudonym
• Obfuscated event: $\langle u, r', t \rangle$ where $u \in \mathcal{U}$, $r' \in \mathcal{R}'$, $t \in \mathcal{T}$
• Obfuscated trace: $o_u = (o_u(1), o_u(2), \ldots, o_u(T))$
• $\mathcal{O}_u$: set of all possible obfuscated traces of user u
• Obfuscation mechanism: a function that maps a trace $a_u \in \mathcal{A}_u$ into a random variable $O_u$ taking values from the set $\mathcal{O}_u$
  • Probability density function: $f_{a_u}(o_u) = \Pr\{O_u = o_u \mid A_u = a_u\}$
• Methods LPPMs use to reduce the accuracy and/or precision of the events' spatiotemporal information
  • Perturbation
  • Adding dummy regions
  • Reducing precision (merging regions)
  • Location hiding
• Anonymization mechanism: a function $\sigma$ chosen at random from the functions mapping $\mathcal{U}$ to $\mathcal{U}'$
• Drawn according to the probability function $g(\sigma) = \Pr\{\Sigma = \sigma\}$
• We consider a random permutation, chosen among the N! possible permutations
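A minimal sketch of anonymization by random permutation, as described above. The function name `anonymize`, the dictionary layout, and the example users are hypothetical; only the idea (a permutation drawn uniformly from the N! possibilities, with pseudonyms 1 to N) comes from the slides.

```python
import random
from typing import Dict, List


def anonymize(traces: Dict[str, List[int]], rng: random.Random) -> Dict[int, List[int]]:
    """Replace each user identity with a pseudonym in 1..N.

    The pseudonyms are a uniformly random permutation of 1..N, so every
    one of the N! user-to-pseudonym assignments is equally likely.
    """
    users = sorted(traces)
    pseudonyms = list(range(1, len(users) + 1))
    rng.shuffle(pseudonyms)  # uniform shuffle over all permutations
    return {pseudonyms[i]: traces[user] for i, user in enumerate(users)}


# Three hypothetical users, each with a two-event trace of region indices.
observed = anonymize(
    {"alice": [1, 2], "bob": [3, 3], "carol": [2, 1]},
    random.Random(0),
)
```

The traces themselves are untouched here; in the framework they would first pass through the obfuscation step.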
[Figure: the LPPM pipeline, from the set of actual traces, through the set of obfuscated traces, to the set of anonymized traces; each set is an instantiation of random variables]
• The LPPM is summarized by the probability distribution that gives the probability of mapping a set of actual traces $a \in \mathcal{A}$ into a set of observed traces $o \in \mathcal{O}$
• The adversary's aim is to reconstruct a when given o
• $\mathcal{O}_{\sigma(u)}$: set of all observable traces of user u
Framework : Adversary
• Knows the anonymization and obfuscation probability distribution functions f and g
• Has access to training traces and to users' public information
• Based on this information, constructs a mobility profile Pu for each user
• Given the LPPM (i.e. f and g), the users' profiles {(u, Pu)}, and the observed traces {o1, o2, …, oN}, the attacker runs an inference attack, formulating objectives as questions over subsets of users, regions, and time (U-R-T)
• Presence/absence disclosure attacks: infer the relationship between users and regions over time
  1. Tracking attacks: ADV tries to find the full or partial sequence of a user's track
  2. Localization attacks: ADV targets a single event in a user's trace
• Meeting disclosure attacks
  • ADV is interested in the proximity between 2 users (e.g. whether they meet at a given time)
• The paper's algorithms implement the general attack
  • General attack: try to recover the traces of all users
Framework : Evaluation
• Traces are probabilistically generated
  • Actual traces: probabilistic over the user mobility profile
  • Observed traces: probabilistic over the LPPM
• Attack output can be
  • A probability distribution over the possible outcomes
  • The most probable outcome
  • The expected outcome under the distribution of possible outcomes
  • Any function of the actual trace
• $\phi(\cdot)$: function describing the attacker's objective
  • If its argument is the actual trace a, then $x_c = \phi(a)$ is the correct answer to the attack
• $\mathcal{X}$: set of values $\phi(\cdot)$ can take for a given attack (e.g. M regions, N users, $M^T$ traces of one user)
• The attacker cannot obtain the exact $x_c$; the task is inherently probabilistic
  • Best hope: extract all the information about it that the observed traces contain
• The extracted information takes the form Pr(x|o)
  • x ranges over all possible values derivable from the observed o
• Uncertainty: the ambiguity of Pr(x|o) with respect to finding a unique answer (maximal under the uniform distribution)
• Inaccuracy: the difference between Pr(x|o) and its estimate $\widehat{\Pr}(x|o)$
  • $\widehat{\Pr}(x|o)$: an estimate, since ADV does not have infinite resources
• Uncertainty and inaccuracy do not quantify the user's privacy; correctness does
• Correctness: the distance between the result of the attack and the real answer
[Figure: accuracy, certainty, and correctness]
• Accuracy and certainty may not be equivalent to correctness
  • Consider a situation with insufficient traces
  • Only correctness really matters
• Accuracy: quantified with a confidence interval and a confidence level
  • Computing Pr(x|o) exactly is prohibitively costly, so ADV computes an estimate $\widehat{\Pr}(x|o)$ for some x
  • The true value lies within some confidence interval around the estimate, at a given confidence level
• Certainty: quantified through entropy (concentrated vs. uniform distribution); higher entropy means lower certainty
  $\hat{H}(x) = \sum_x \widehat{\Pr}(x|o) \log \frac{1}{\widehat{\Pr}(x|o)}$
• Correctness: quantified as the expected distance between the true answer $x_c$ and the adversary's estimate
  • If there is a distance $\|\cdot\|$ between members of $\mathcal{X}$, the expected estimation error is
    $\sum_x \widehat{\Pr}(x|o) \, \|x - x_c\|$
  • If the distance is 0 iff $x = x_c$ and 1 otherwise, the incorrectness is
    $1 - \widehat{\Pr}(x_c|o)$
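The certainty and correctness metrics can be computed directly from the adversary's posterior. The sketch below assumes the posterior is given as a dictionary from candidate answers to probabilities; the function names and the example numbers are illustrative only.

```python
import math
from typing import Callable, Dict, Hashable


def entropy(posterior: Dict[Hashable, float]) -> float:
    """Uncertainty of the adversary: entropy (in bits) of the posterior."""
    return -sum(p * math.log2(p) for p in posterior.values() if p > 0)


def expected_error(
    posterior: Dict[Hashable, float],
    true_x: Hashable,
    distance: Callable[[Hashable, Hashable], float],
) -> float:
    """Incorrectness: expected distance between the estimate and the true answer."""
    return sum(p * distance(x, true_x) for x, p in posterior.items())


def zero_one(x: Hashable, y: Hashable) -> float:
    """Distance that is 0 iff x == y and 1 otherwise."""
    return 0.0 if x == y else 1.0


# A concentrated (fairly certain) posterior over three regions.
posterior = {1: 0.7, 2: 0.2, 3: 0.1}
error_if_truth_is_1 = expected_error(posterior, 1, zero_one)
error_if_truth_is_2 = expected_error(posterior, 2, zero_one)
```

The entropy of `posterior` is the same in both cases, yet the incorrectness differs depending on the true answer, which is why entropy alone does not capture the user's privacy.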
• So correctness is the metric that determines the user's privacy
• The adversary does not know $x_c$ and cannot observe this parameter
• Accuracy, certainty, and correctness are largely independent of one another
Location Privacy Meter
Location-Privacy Preserving Mechanisms
• Implemented 2 obfuscation mechanisms
  1. Precision reducing (merging regions)
    • Drop the low-order bits of the region identifier
    • E.g. µx and µy are the numbers of bits dropped from the x and y coordinates
  2. Location hiding
    • Events are independently eliminated: each location is replaced with Ø with probability λh, the location hiding level
• To import an LPPM into the tool, specify its probability function by importing
  • The anonymization function
  • The obfuscation function
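The two obfuscation mechanisms can be sketched as follows. This is not the tool's code: representing a region as an (x, y) pair of grid coordinates, the function names, and the use of `None` for Ø are assumptions made for illustration.

```python
import random
from typing import List, Optional, Tuple

Cell = Tuple[int, int]  # hypothetical region identifier: (x, y) grid coordinates


def reduce_precision(cell: Cell, mu_x: int, mu_y: int) -> Cell:
    """Drop the mu_x and mu_y low-order bits of x and y, merging nearby regions."""
    x, y = cell
    return (x >> mu_x, y >> mu_y)


def obfuscate(
    trace: List[Cell], mu_x: int, mu_y: int, lambda_h: float, rng: random.Random
) -> List[Optional[Cell]]:
    """Apply location hiding, then precision reduction, to every event."""
    observed: List[Optional[Cell]] = []
    for cell in trace:
        if rng.random() < lambda_h:
            observed.append(None)  # location hidden (Ø) with probability lambda_h
        else:
            observed.append(reduce_precision(cell, mu_x, mu_y))
    return observed


coarse = obfuscate([(5, 6), (4, 7)], mu_x=1, mu_y=2, lambda_h=0.0, rng=random.Random(1))
hidden = obfuscate([(5, 6), (4, 7)], mu_x=1, mu_y=2, lambda_h=1.0, rng=random.Random(1))
```

Each event is hidden independently of the others, matching the description of location hiding on the slide.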
Knowledge of the Adversary
• The adversary collects information about user mobility
  • This can take the form of events, transitions, or full/partial traces
• This can be encoded as
  • Traces, or
  • A matrix of transition counts TC
    • TC is an M × M matrix whose entry ij is the number of i-to-j transitions the user made that are not encoded in the traces
• The adversary also considers the user's mobility constraints
• ADV models user mobility as a Markov chain
  • Pu: the user's transition matrix for their Markov chain
  • Entry ij of Pu: the probability that the user moves from ri to rj in the next time slot
• Objective: construct Pu starting from the prior mobility information
• With the bigger goal of
  • Estimating the underlying Markov chain
  • Filling in the training trace TT to obtain the estimated trace ET
• Uses the convergence of Gibbs sampling
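To make the idea of a mobility profile concrete, the sketch below builds a transition-count matrix from a complete training trace and normalizes it into a transition matrix. This is a deliberate simplification: it uses plain additive smoothing (the parameter `alpha` is an assumption) and does not implement the Gibbs sampling that the paper uses to handle incomplete traces.

```python
from typing import List


def transition_counts(trace: List[int], m: int) -> List[List[int]]:
    """Count i-to-j transitions between consecutive events (regions are 0..m-1)."""
    counts = [[0] * m for _ in range(m)]
    for i, j in zip(trace, trace[1:]):
        counts[i][j] += 1
    return counts


def estimate_profile(counts: List[List[int]], alpha: float = 1.0) -> List[List[float]]:
    """Row-normalize the counts into a transition matrix, with additive smoothing."""
    m = len(counts)
    profile = []
    for row in counts:
        total = sum(row) + alpha * m
        profile.append([(c + alpha) / total for c in row])
    return profile


counts = transition_counts([0, 0, 1, 2, 1, 0], m=3)
profile = estimate_profile(counts)
```

Smoothing keeps every transition probability positive, so a move never seen in training is treated as unlikely rather than impossible.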
Tracking Attack
• ADV tries to reconstruct partial or complete actual traces
Maximum Likelihood Tracking Attack
• Objective: find the jointly most likely traces for all users, given the observed traces
• The search space has $N! \, M^T$ elements, so a brute-force approach is not practical
Tracking Attack : Maximum Likelihood Tracking Attack
• Proceeds in two steps: deanonymization, then de-obfuscation
• Deanonymization
  • Cannot simply assign each user their most probable trace, since multiple users may get the same trace
  • Compute the likelihood of every trace-user pair
  • Create an edge-weighted bipartite graph between the set of users and the set of traces
  • The edge weight is the user-trace likelihood
  • Find the maximum weight assignment using the Hungarian algorithm
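The deanonymization step is a maximum weight assignment problem. The sketch below solves a tiny instance by trying every permutation; it is a stand-in that shows the objective, not the Hungarian algorithm itself, and it is only feasible for very small N. The log-likelihood values are made up for illustration.

```python
from itertools import permutations
from typing import List, Tuple


def best_assignment(log_lik: List[List[float]]) -> Tuple[Tuple[int, ...], float]:
    """Return the user-to-trace assignment with the highest total log-likelihood.

    log_lik[u][i] is the log-likelihood that observed trace i belongs to user u.
    Exhaustive search over all N! assignments, for illustration only.
    """
    n = len(log_lik)
    best_perm: Tuple[int, ...] = tuple(range(n))
    best_score = float("-inf")
    for perm in permutations(range(n)):
        score = sum(log_lik[u][perm[u]] for u in range(n))
        if score > best_score:
            best_perm, best_score = perm, score
    return best_perm, best_score


# Users 0 and 1 both prefer trace 0, so picking each user's best trace would collide.
log_lik = [
    [-1.0, -1.2, -9.0],
    [-0.5, -4.0, -9.0],
    [-8.0, -7.0, -0.3],
]
assignment, score = best_assignment(log_lik)
```

Here `assignment[u]` is the index of the trace given to user u. The Hungarian algorithm finds the same optimum in polynomial time, which is what makes the attack practical for realistic N.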
• De-obfuscation
  • Uses the Viterbi algorithm, which maximizes the joint probability of the most likely trace
  • Recursively computes the values up to time T (the maximum probability)
  • But the interest is in the trace itself
  • Similar to finding the shortest path in an edge-weighted directed graph whose vertices are the set R × T
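A minimal Viterbi sketch for the de-obfuscation step, treating the user's mobility profile as the transition matrix and the obfuscation function as an emission matrix. The two-region model and all probabilities below are invented for illustration; `emit[r][o]` is assumed to be the probability of observing o when the actual region is r.

```python
import math
from typing import List


def _log(p: float) -> float:
    """Logarithm that maps probability 0 to minus infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(
    prior: List[float],
    trans: List[List[float]],
    emit: List[List[float]],
    observations: List[int],
) -> List[int]:
    """Return the most likely sequence of actual regions for the observations."""
    m = len(prior)
    score = [_log(prior[r]) + _log(emit[r][observations[0]]) for r in range(m)]
    back: List[List[int]] = []  # back[t][r]: best predecessor of r at step t + 1
    for obs in observations[1:]:
        prev, score, pointers = score, [], []
        for r in range(m):
            best = max(range(m), key=lambda q: prev[q] + _log(trans[q][r]))
            pointers.append(best)
            score.append(prev[best] + _log(trans[best][r]) + _log(emit[r][obs]))
        back.append(pointers)
    # The recursion gives the maximum probability; backtracking recovers the trace.
    path = [max(range(m), key=lambda r: score[r])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


prior = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]  # users tend to stay in the same region
emit = [[0.8, 0.2], [0.2, 0.8]]   # the observation is usually the true region
smoothed_out = viterbi(prior, trans, emit, [0, 0, 1, 0])
kept_move = viterbi(prior, trans, emit, [0, 0, 1, 1])
```

In the first call the single deviating observation is explained away as obfuscation noise, while in the second the repeated observation of region 1 is enough to infer a real move.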
Tracking Attack : Distribution Tracking Attack
• Computes the distribution of traces for each user rather than the most likely trace
• Uses the Metropolis-Hastings (MH) algorithm
  • Draws samples that are distributed according to the desired distribution
  • MH performs a random walk over the possible values of (a, σ)
• Can answer a wide range of U-R-T questions, but is very computationally intensive
Localization Attack
• Find the location of user u at time t
• Output: a distribution over the possible regions, from which the attacker selects the most probable
• The attacker needs an estimate of the user's observed trace (from the maximum weight assignment)
• Can be computed using the Forward-Backward algorithm
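A minimal Forward-Backward sketch for the localization attack: given one user's observed trace, it returns the posterior distribution over regions at every time instant. As in any hidden Markov model, `trans` plays the role of the mobility profile and `emit` the role of the obfuscation function; the two-region numbers are invented for illustration.

```python
from typing import List


def forward_backward(
    prior: List[float],
    trans: List[List[float]],
    emit: List[List[float]],
    observations: List[int],
) -> List[List[float]]:
    """Return, for each time instant, the posterior over the actual region.

    Assumes the observation sequence has non-zero probability under the model.
    """
    m, n = len(prior), len(observations)
    # Forward pass: alpha[t][r] = Pr(observations up to t, region r at t).
    alpha = [[prior[r] * emit[r][observations[0]] for r in range(m)]]
    for obs in observations[1:]:
        prev = alpha[-1]
        alpha.append(
            [emit[r][obs] * sum(prev[q] * trans[q][r] for q in range(m)) for r in range(m)]
        )
    # Backward pass: beta[t][r] = Pr(observations after t | region r at t).
    beta = [[1.0] * m for _ in range(n)]
    for t in range(n - 2, -1, -1):
        nxt = observations[t + 1]
        beta[t] = [
            sum(trans[r][q] * emit[q][nxt] * beta[t + 1][q] for q in range(m))
            for r in range(m)
        ]
    # Combine and normalize at each time instant.
    posteriors = []
    for a, b in zip(alpha, beta):
        weights = [x * y for x, y in zip(a, b)]
        total = sum(weights)
        posteriors.append([w / total for w in weights])
    return posteriors


prior = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
emit = [[0.8, 0.2], [0.2, 0.8]]
posteriors = forward_backward(prior, trans, emit, [0, 0, 1, 0])
single = forward_backward(prior, trans, emit, [0])
```

The attacker's answer to "where was u at time t" is then the region with the largest entry in `posteriors[t]`.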
Meeting Disclosure Attack
• Objective 1: specify a pair of users (u and v), a region r, and a time t
  • Computed as the product of the distributions for both events
  • These are established through localization attacks
• Objective 2: just a pair of users; how often they would have met, and in which region
  • Answered using localization attacks
• Objective 3: a location and a time; the expected number of users present
  • Through localization attacks again
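All three objectives reduce to simple arithmetic on the per-user posteriors produced by localization attacks. The sketch below assumes each posterior is a list of probabilities indexed by region and, like the product on the slide, treats the two users' locations as independent given the observations; the function names and numbers are illustrative.

```python
from typing import List


def meeting_probability(post_u: List[float], post_v: List[float], region: int) -> float:
    """Objective 1: probability that u and v are both in `region` at a given time."""
    return post_u[region] * post_v[region]


def expected_meetings(posts_u: List[List[float]], posts_v: List[List[float]]) -> float:
    """Objective 2: expected number of time instants at which u and v share a region."""
    return sum(
        pu[r] * pv[r]
        for pu, pv in zip(posts_u, posts_v)
        for r in range(len(pu))
    )


def expected_presence(posts_at_t: List[List[float]], region: int) -> float:
    """Objective 3: expected number of users present in `region` at a given time."""
    return sum(post[region] for post in posts_at_t)


# Posteriors over two regions for three hypothetical users at one time instant.
post_u, post_v, post_w = [0.8, 0.2], [0.5, 0.5], [0.1, 0.9]
p_meet = meeting_probability(post_u, post_v, region=0)
n_meet = expected_meetings([post_u], [post_v])
n_present = expected_presence([post_u, post_v, post_w], region=0)
```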
Using The Tool: Evaluation of LPPMs
• Goals
  1. Use the Location-Privacy Meter to quantify the effectiveness of LPPMs
  2. Evaluate how well entropy and k-anonymity quantify location privacy
• Location samples: N = 20 users, 5-minute intervals for 8 hours (T = 96), Bay Area, M = 40 regions (5 by 8 grid)
• Privacy mechanism
  • Precision reducing
  • Anonymization using a random permutation (unique pseudonyms 1 to N)
• To consider the strongest adversary
  • Feed the knowledge constructor (KC) with the users' actual traces
• U-R-T attack scenarios
  • LO-ATT (Localization Attack): for user u and time t, what is the user's location at t?
  • MD-ATT (Meeting Disclosure Attack): in how many time instants in T are two users in the same region?
  • AP-ATT (Aggregate Presence Attack): for a region r and time t, what is the expected number of users present in r at t?
• Metric: adversary incorrectness
• LP_LO-ATT(u, t) for all users u and times t
  • LPPM(µx, µy, λh)
  • Incorrectness of the number of users
• LP_MD-ATT(u, v) for all pairs of users u, v
  • LPPM(µx, µy, λh)
  • Incorrectness of the number of meetings
• LP_AP-ATT(r, t) for all regions r and times t
  • LPPM(µx, µy, λh)
  • Incorrectness of the number of users in a region
• Plot 1: normalized entropy against users' privacy
  • X-axis: users' privacy
  • Y-axis: normalized entropy
• Plot 2: normalized k-anonymity against users' privacy
  • X-axis: users' privacy
  • Y-axis: normalized k-anonymity
• Legend for both plots
  • *** : LPPM(2, 3, 0.9), strong mechanism
  • ... : LPPM(1, 2, 0.5), medium mechanism
  • ooo : LPPM(1, 0, 0.0), weak mechanism
Conclusion
• A unified formal framework to describe and evaluate a variety of location-privacy preserving mechanisms with respect to various inference attacks
• LPPM evaluation is modelled as an estimation problem, and the expected estimation error metric is provided
• The Location-Privacy Meter tool was designed to evaluate and compare location-privacy preserving mechanisms
Questions
Framework
$\mathcal{U}$: Set of mobile users
$\mathcal{R}$: Set of regions that partition the whole area
$\mathcal{T}$: Time period under consideration
$\mathcal{A}$: Set of all possible traces
$\mathcal{O}$: Set of all observable traces
$\mathcal{U}'$: Set of user pseudonyms
$\mathcal{R}'$: Set of location pseudonyms
$N$: Number of users
$M$: Number of regions
$T$: Number of considered time instants
$N'$: Number of user pseudonyms
$M'$: Number of location pseudonyms
$f$: Obfuscation function
$g$: Anonymization function
$a_u$: Actual trace of user u
$o_u$: Obfuscated trace of user u
$o_i$: Observed trace of user with pseudonym i
$\mathcal{A}_u$: Set of all possible (actual) traces of user u
$\mathcal{O}_u$: Set of all possible obfuscated traces of user u
$\mathcal{O}_{\sigma(u)}$: Set of all observable traces of user u
Pu: Profile of user u
$\phi(\cdot)$: Attacker's objective
$\mathcal{X}$: Set of values that $\phi(\cdot)$ can take