Privacy Vulnerability of Published Anonymous Mobility Traces

Chris Y. T. Ma, David K. Y. Yau, Nung Kwan Yip (Purdue University)

Nageswara S. V. Rao (Oak Ridge National Laboratory)

Motivation: Collecting mobility traces

Mobile network applications
◦ traffic monitoring, road surface sensing, radiation and chemical detection

Mobility traces are collected and published to assist the design, analysis, and evaluation of mobile networks
◦ E.g., CRAWDAD

Motivation: Privacy vulnerability

Measures are carried out to protect the privacy of the participants
◦ Traces are identified using a random but consistent and unique identifier that is not correlated with the real ID
◦ Spatial and temporal granularities are reduced

<11:32:12, Chris Ma, (41.89840,-87.61999)>

<11:30~11:35, ID-271, (41.89~41.90,-87.62~-87.61)>
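As a rough illustration of the two measures, the following Python sketch turns a record like the first example into the form of the second. The function names, the 5-minute window, and the 0.01-degree cell are assumptions chosen to match the example records; they are not the procedure of any particular trace publisher.

```python
import math
import secrets


def make_pseudonym(real_id, table):
    """Return a random but consistent ID for real_id (hypothetical scheme)."""
    if real_id not in table:
        table[real_id] = "ID-" + secrets.token_hex(4)
    return table[real_id]


def coarsen_time(hhmmss, window_min=5):
    """Map a timestamp 'HH:MM:SS' to the window 'HH:MM~HH:MM' containing it."""
    hours, minutes, _ = (int(part) for part in hhmmss.split(":"))
    start = hours * 60 + (minutes // window_min) * window_min
    end = start + window_min

    def fmt(total_minutes):
        return f"{(total_minutes // 60) % 24:02d}:{total_minutes % 60:02d}"

    return f"{fmt(start)}~{fmt(end)}"


def coarsen_coord(value, cell=0.01):
    """Map a coordinate to the cell 'lo~hi' containing it.

    The two-decimal formatting assumes the default cell size of 0.01.
    """
    lo = math.floor(value / cell) * cell
    return f"{lo:.2f}~{lo + cell:.2f}"


def anonymize(record, table):
    """record is (time, real_id, (lat, lon)); returns the published form."""
    time, real_id, (lat, lon) = record
    return (
        coarsen_time(time),
        make_pseudonym(real_id, table),
        (coarsen_coord(lat), coarsen_coord(lon)),
    )


pseudonyms = {}
published = anonymize(("11:32:12", "Chris Ma", (41.89840, -87.61999)), pseudonyms)
print(published)
```

The pseudonym table has to be kept by the publisher so that every record of the same participant receives the same identifier; that consistency is what later lets an adversary link a whole trace once a few points are matched.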

These measures are not enough!
◦ Participants can be openly observed
◦ Participants may leak their location information (snapshots of time and location pairs, termed side information) through web blogs, status updates in social networks, tweets, casual conversations, etc.

An adversary who tries to identify the complete trace (movement history) of one or more participants may succeed with high probability


Our contributions

Comprehensive study of attack strategies
◦ Various ways for side information collection
◦ Analytically proved the optimality of attack strategy
◦ Quantitative simulation results

Privacy implications of characteristics of real traces and synthetic traces
◦ Synthetic nodes are more sparsely placed
  More easily identified but more difficult to meet with

Agenda

Problem formulation
Analytical derivation
Experimental analysis
Conclusion

Problem formulation - trace sampling and publication

<t, R.B., (x,y)> <t’, IDi, (x’,y’)>

Problem formulation

An adversary tries to identify the complete movement history of the participant(s)
◦ collects side information and compares it with the published traces

Possible attack scenarios
◦ Adversary infers the location of a victim indirectly (passive adversary)
◦ Adversary observes the movement of the victims physically (active adversary)

Passive adversary - infers snapshots of victim

Special case: reference times are sampling times

General case: reference times are not sampling times
◦ Infers the possible location of the node at reference times using a general mobility model (preference of the nodes, physical constraints)
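In the general case the adversary needs a location estimate at a reference time that falls between two published samples. The sketch below uses plain linear interpolation as a stand-in for the mobility model. That substitution is an assumption made for illustration only: the model described here also accounts for node preferences and physical constraints, which interpolation ignores. The function name and the trace layout are hypothetical.

```python
from bisect import bisect_left


def estimate_position(trace, t):
    """Estimate (x, y) at time t from a trace given as sorted (time, x, y) samples.

    Linear interpolation between the two neighbouring samples; this is a
    simplifying assumption, not the mobility model of the talk.
    """
    times = [sample[0] for sample in trace]
    if not times[0] <= t <= times[-1]:
        raise ValueError("reference time lies outside the trace")
    i = bisect_left(times, t)
    if times[i] == t:
        # Special case: the reference time is a sampling time.
        return trace[i][1], trace[i][2]
    (t0, x0, y0), (t1, x1, y1) = trace[i - 1], trace[i]
    w = (t - t0) / (t1 - t0)
    return x0 + w * (x1 - x0), y0 + w * (y1 - y0)


trace = [(0, 0.0, 0.0), (10, 10.0, 0.0), (20, 10.0, 20.0)]
print(estimate_position(trace, 5))
print(estimate_position(trace, 15))
```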

Attack approaches of passive adversary

Use of a Bayesian approach to determine the trace that gives the best match with the inferred location information

[Figure: published traces compared against noisy side information]

Attack approaches of passive adversary

For the special case (reference time = sampling time), with the assumption that noise is i.i.d.,

For the general case, with the assumptions that noise is i.i.d. and movement is Markovian,
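As a hedged sketch of what the special case could look like, suppose candidate trace $i$ is at position $x_i(t_k)$ at reference time $t_k$, the side information is $s_k = x_v(t_k) + n_k$ for the victim $v$, and the noise terms $n_k$ are i.i.d. with density $f$. These symbols and the prior $P(i)$ are assumptions introduced here for illustration, not notation taken from the talk. Bayes' rule then gives:

```latex
\[
P(\text{trace } i \mid s_1,\dots,s_K)
  = \frac{P(i)\,\prod_{k=1}^{K} f\bigl(s_k - x_i(t_k)\bigr)}
         {\sum_{j} P(j)\,\prod_{k=1}^{K} f\bigl(s_k - x_j(t_k)\bigr)}
\]
```

With a uniform prior, the best match is the trace that maximizes the product in the numerator. The general case would additionally have to average over the possible locations of each trace at the reference times, which is where the Markovian movement assumption enters; that step is not sketched here.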

Attack approaches of passive adversary

Maximum Likelihood Estimator (MLE) approach

Minimum Square (MSQ) approach

Basic (BAS) approach

Weighted Exponential (EXP) approach

• When noise is Gaussian, MLE and MSQ are equivalent
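The equivalence of MLE and MSQ under Gaussian noise can be checked with a small Python sketch. It assumes isotropic two-dimensional Gaussian noise with standard deviation `sigma`, side information taken at sampling times, and a toy data layout (`traces` maps a trace ID to a dict of time to position). All of these are illustrative assumptions. BAS and EXP are left out because their definitions are not given here.

```python
import math


def msq_score(trace, side_info):
    """Sum of squared distances between the side information and the trace."""
    return sum(
        (trace[t][0] - x) ** 2 + (trace[t][1] - y) ** 2
        for t, (x, y) in side_info.items()
    )


def gaussian_log_likelihood(trace, side_info, sigma):
    """Log-likelihood of the side information under isotropic Gaussian noise."""
    n = len(side_info)
    return (-msq_score(trace, side_info) / (2 * sigma ** 2)
            - n * math.log(2 * math.pi * sigma ** 2))


def identify_msq(traces, side_info):
    """Pick the trace with the smallest squared error."""
    return min(traces, key=lambda tid: msq_score(traces[tid], side_info))


def identify_mle(traces, side_info, sigma):
    """Pick the trace with the largest Gaussian log-likelihood."""
    return max(
        traces,
        key=lambda tid: gaussian_log_likelihood(traces[tid], side_info, sigma),
    )


traces = {
    "ID-1": {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (2.0, 0.0)},
    "ID-2": {0: (0.0, 0.0), 1: (0.0, 1.0), 2: (0.0, 2.0)},
    "ID-3": {0: (3.0, 3.0), 1: (3.0, 2.0), 2: (3.0, 1.0)},
}
side_info = {1: (0.9, 0.2), 2: (2.1, -0.1)}  # noisy snapshots of the victim

print(identify_msq(traces, side_info))
print(identify_mle(traces, side_info, sigma=0.5))
```

The log-likelihood is a decreasing linear function of the squared error, and the remaining term is the same for every trace, so both rules rank the candidates identically whatever value of `sigma` is used. MSQ does not need to know `sigma` at all.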

[Figure: two plots over distance]

Active adversary - observes victims physically

Adversary is one of the participants

Adversary stays at a (popular) position

Adversary travels between popular locations

Problem formulation

Why the two different cases?
◦ Active
  Needs to consider how to collect the side information physically as time evolves
  Adversary tries to identify as many victims as possible - plot of k-anonymity as a function of time
◦ Passive
  Snapshots of victim are inferred (not collected) and less accurate in general
  Adversary tries to identify one victim only - plot of correctness as a function of pieces of side information

Attack strategy of active adversary

Algorithm of the attack (in action)

real ID   trace IDs
1         A, B, C
2         A, B, C
3         A, B, C

real ID   trace IDs
1         A, B
2         A, B
3         A, B, C
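One way to read the candidate sets above is as a repeated intersection: each time the adversary sees a victim in a cell, only the published traces that were in that cell at that time remain candidates for that victim. The Python sketch below follows that reading. It is an assumption about the mechanics, and the function names, the cell labels, and the toy traces are hypothetical.

```python
def refine_candidates(candidates, observations, published):
    """Narrow each victim's candidate trace IDs using physical observations.

    candidates:   real ID -> set of trace IDs still consistent with that ID
    observations: list of (real_id, time, cell) seen by the adversary
    published:    trace ID -> {time: cell}
    """
    refined = {real_id: set(ids) for real_id, ids in candidates.items()}
    for real_id, time, cell in observations:
        present = {tid for tid, trace in published.items()
                   if trace.get(time) == cell}
        refined[real_id] &= present
    return refined


def k_anonymity(candidates):
    """Size of each victim's candidate set; 1 means the trace is identified."""
    return {real_id: len(ids) for real_id, ids in candidates.items()}


published = {
    "A": {0: "c1", 1: "c2"},
    "B": {0: "c1", 1: "c3"},
    "C": {0: "c4", 1: "c2"},
}
candidates = {rid: {"A", "B", "C"} for rid in (1, 2, 3)}

# Time 0: the adversary is in cell c1 and sees participants 1 and 2 there.
candidates = refine_candidates(candidates, [(1, 0, "c1"), (2, 0, "c1")], published)
print(k_anonymity(candidates))

# Time 1: participant 1 is seen in cell c2, which only traces A and C visit.
candidates = refine_candidates(candidates, [(1, 1, "c2")], published)
print(k_anonymity(candidates))
```

The first call reproduces the second table above (participants 1 and 2 narrowed to A and B, participant 3 unchanged). This sketch does not propagate eliminations between victims, so participant 3 is not automatically assigned C.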

Experimental analysis

Basic information
◦ Real traces
  536 San Francisco taxicabs
  2348 Shanghai Grid buses
◦ Synthetic traces
  Using map size and average speed computed from taxi cab traces
  Random waypoint (with different maximum trip lengths)
  Random walk
◦ Spatial granularity = 1 km
◦ Temporal granularity = 1 minute
(unless stated otherwise)
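For readers unfamiliar with the synthetic model, here is a minimal Python sketch of a random waypoint generator with a cap on trip length. The parameter names and the values in the usage line are placeholders, not the map size or average speed computed from the taxi cab traces, and the way the next destination is drawn (uniform direction and distance, resampled until it falls inside the map) is one assumption among several reasonable ones.

```python
import math
import random


def random_waypoint(steps, size_km, speed_km_per_step, max_trip_km, seed=0):
    """Generate `steps` positions of one node on a size_km x size_km map."""
    rng = random.Random(seed)
    # Random initial location, with no preferred places.
    x, y = rng.uniform(0, size_km), rng.uniform(0, size_km)
    target = (x, y)
    trace = [(x, y)]
    while len(trace) < steps:
        if (x, y) == target:
            # Arrived: draw a new destination at most max_trip_km away.
            while True:
                angle = rng.uniform(0, 2 * math.pi)
                dist = rng.uniform(0, max_trip_km)
                tx = x + dist * math.cos(angle)
                ty = y + dist * math.sin(angle)
                if 0 <= tx <= size_km and 0 <= ty <= size_km:
                    target = (tx, ty)
                    break
        tx, ty = target
        remaining = math.hypot(tx - x, ty - y)
        if remaining <= speed_km_per_step:
            x, y = tx, ty  # reach the destination in this step
        else:
            x += speed_km_per_step * (tx - x) / remaining
            y += speed_km_per_step * (ty - y) / remaining
        trace.append((x, y))
    return trace


trace = random_waypoint(steps=60, size_km=10.0, speed_km_per_step=0.5,
                        max_trip_km=3.0, seed=1)
print(len(trace), trace[0], trace[-1])
```

A smaller `max_trip_km` keeps each node closer to its random starting point.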

Characteristics of the traces: distance between traces

Real traces are closer to each other on average
◦ Bus traces have a broader range

For synthetic traces, the shorter the trip length, the further away they are from each other in general

Significant observations

• Lack of preferred locations and random initial location of the synthetic traces
  – Nodes are more sparsely distributed in the network
• Implications:
  – For adversary in general
    • Can easily identify the trace of a synthetic node since no other traces share a similar path
  – For active adversary
    • May take longer time to meet with each synthetic node

Attack performance: passive adversary (special case)

Special case - side information inferred at sampling times of traces

Correct assumption of noise (Gaussian)

Cab traces
Observations
◦ MLE and MSQ perform equally well
◦ BAS gives the fewest wrong conclusions initially

Random waypoint traces
Most efficient attack
◦ traces have very different paths

Incorrect assumption of noise
◦ Assumption: Uniform
◦ Actual: Gaussian

Cab traces
Observations
◦ MLE performs much worse

Attack performance: passive adversary (general case)

General case – side information at times different from trace sampling times

Worst case scenario – all times are different

Infer the location of the victim using the mobility model

Gaussian noise (no noise as best performance bound)

Cab traces

Summary: passive adversary

For passive adversary
◦ MLE and MSQ give the best performance among the four approaches in terms of the fraction of correct conclusions
◦ Since MLE relies on knowledge of the type of noise and its magnitude, MSQ is the preferred, more robust attack approach

Attack performance: active adversary as one of the mobile nodes

Higher attack efficiency for real traces
◦ Mobile nodes are more likely to visit the same set of locations at the same time
◦ Synthetic nodes are more sparsely distributed in the network

1 time step = 1 minute

Attack performance: active adversary who stays at one of the cells

[Figure panels: cabs, buses, random waypoint, random walk]

Observations
◦ Comparing real traces and synthetic traces
  Attacks on real traces are more efficient - k-anonymity drops more quickly
◦ Popular cells in real traces and random waypoint traces are more aggregated together
◦ Being at a popular cell does not necessarily result in higher attack efficiency

Attack performance: active adversary who moves among popular cells

The ability to move among popular cells improves attack efficiency
◦ Improvement is more significant if node movements are more localized
◦ Visiting more cells does not necessarily improve efficiency

Conclusion

Study how privacy leaks through trace publication
◦ Under different adversary strategies to collect side information
◦ Using different mobile traces with different characteristics

Experimentally show that the adversary is able to identify the trace of a victim from the published set with high probability
