DATA LEAKAGE DETECTION
Panagiotis Papadimitriou and Hector Garcia-Molina
Stanford University
[Figure: Facebook profiles (Jeremy, Sarah, Mark, Kathryn; e.g., Name: Sarah, Sex: Female), agents App. U1 and App. U2, and other sources, e.g., Mark's friend.]
Leakage Problem
A data distributor, e.g., Facebook, owns a set T of private data items, e.g., FB profiles. The distributor gives to supposedly trusted agents U1, …, Un, e.g., Facebook Apps, the sets R1, …, Rn ⊆ T. A leaker obtains data from agents or from other sources and publishes set S ⊆ T. An agent who provides the leaker with data is guilty.
• Given the leaked set S, what is the probability that Ui is guilty?
• How can the distributor allocate data items to agents so that he can detect guilty agents?
[Figure: leak outcomes under the two guilt models, "Independently" and "All OR nothing", with probabilities (1-p)^2, (1-p)p, p(1-p), and p^2.]
Guilt Models
p: Posterior probability that a leaked profile comes from other sources (other than the agents).
Pr(Gi|S): Probability that agent Ui is guilty, given the leaked set of profiles S.
Models’ Assumptions
• Agents leak each of their data items independently.
• Agents leak all their data items OR nothing.
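The poster does not spell out a formula for Pr(Gi|S). As a rough illustration of the independent-leak assumption, the sketch below assumes a product form: each leaked item held by Ui came from some agent with probability 1 - p, and all agents holding that item are equally likely to be the source. The function name `guilt_probability` and the example allocations are hypothetical.

```python
from typing import Dict, Set


def guilt_probability(
    agent: str,
    leaked: Set[str],
    allocations: Dict[str, Set[str]],
    p: float,
) -> float:
    """Estimate Pr(G_agent | S) under the independent-leak assumption.

    Assumed form (not stated on the poster):
        1 - prod over t in (S & R_i) of (1 - (1 - p) / holders(t))
    where holders(t) is the number of agents that received item t.
    """
    prob_innocent = 1.0
    for item in leaked & allocations[agent]:
        # Count how many agents were given this item.
        holders = sum(1 for r in allocations.values() if item in r)
        # Probability that this agent did NOT leak this particular item.
        prob_innocent *= 1.0 - (1.0 - p) / holders
    return 1.0 - prob_innocent


# Hypothetical example: two agents share item "b".
allocations = {"U1": {"a", "b"}, "U2": {"b", "c"}}
leaked = {"a", "b"}
print(guilt_probability("U1", leaked, allocations, p=0.5))
print(guilt_probability("U2", leaked, allocations, p=0.5))
```

In this example U1 holds both leaked items, one of them exclusively, so it comes out more suspicious than U2, who only holds the shared item.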
[Figure: plots of Pr(G1|S) and Pr(G2|S).]
Data Allocation Problem
Agents’ Requests
• Sample, e.g., any 100 Stanford profiles.
• Explicit, e.g., all people who added an application.
Objective
Allocate data to agents so that if Ui leaks his set Ri, then Pr(Gi|S) >> Pr(Gj|S) for i ≠ j.
Example
• 4 agents U1, U2, U3 and U4.
• Each agent requests a sample of (any) 2 profiles.
[Figure: three allocations of profiles to agents U1, U2, U3 and U4, labeled Good, Poor and Optimal.]
Good: Agents U1 and U2 are not suspects if U3 or U4 leak data.
Poor: All agents have the same guilt prob. in case of leakage.
Optimal: Agent Ui who leaks its data has the highest guilt prob.
minimize (over R1, ..., Rn)
$$\sum_{i} \frac{1}{|R_i|} \sum_{j \neq i} |R_i \cap R_j| \tag{1}$$
OR
minimize (over R1, ..., Rn)
$$\max_{i \neq j} \frac{|R_i \cap R_j|}{|R_i|} \tag{2}$$
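To make objectives (1) and (2) concrete, the sketch below evaluates both for a given allocation, reading (1) as the sum over agents of their overlaps with every other agent, normalized by |Ri|, and (2) as the largest normalized pairwise overlap. The function names and the three sample allocations are hypothetical; they are not the allocations drawn on the poster.

```python
from typing import List, Set


def sum_objective(allocations: List[Set[str]]) -> float:
    """Objective (1): sum_i (1/|R_i|) * sum_{j != i} |R_i & R_j|."""
    total = 0.0
    for i, ri in enumerate(allocations):
        overlap = sum(len(ri & rj) for j, rj in enumerate(allocations) if j != i)
        total += overlap / len(ri)
    return total


def max_objective(allocations: List[Set[str]]) -> float:
    """Objective (2): max over i != j of |R_i & R_j| / |R_i|."""
    return max(
        len(ri & rj) / len(ri)
        for i, ri in enumerate(allocations)
        for j, rj in enumerate(allocations)
        if i != j
    )


# Hypothetical allocations for 4 agents requesting 2 profiles each.
identical = [{"a", "b"}] * 4                          # everyone gets the same pair
paired = [{"a", "b"}, {"a", "b"}, {"c", "d"}, {"c", "d"}]
disjoint = [{"a", "b"}, {"c", "d"}, {"e", "f"}, {"g", "h"}]

for name, alloc in [("identical", identical), ("paired", paired), ("disjoint", disjoint)]:
    print(name, sum_objective(alloc), max_objective(alloc))
```

Both objectives drop to zero when the agents' sets are disjoint, which is only possible when the distributor has enough items to go around.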
Allocation Strategies
Sample Requests
• s-random: Allocates at random.
• s-overlap: Minimizes sum of overlaps |Ri ∩ Rj|.
• s-sum: Minimizes (1).
• s-max: Minimizes (2).
Explicit Requests
• no fake: Allocates exactly the requested data items.
• e-random, e-optimal: In addition to the requested real items, they allocate B fake items. e-random allocates them at random. e-optimal minimizes (1) and (2).
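As a rough illustration of the difference between random and overlap-aware allocation for sample requests, the sketch below contrasts a random allocator with a simple greedy heuristic in the spirit of s-overlap. The greedy rule (give each agent the item currently held by the fewest agents, serving agents round-robin) is an assumption made for this example, not necessarily the poster's algorithm, and all function names are hypothetical.

```python
import random
from typing import Dict, List, Set


def s_random(items: List[str], sample_sizes: List[int], rng: random.Random) -> List[Set[str]]:
    """Give each agent a uniformly random sample of the requested size."""
    return [set(rng.sample(items, m)) for m in sample_sizes]


def s_overlap_greedy(items: List[str], sample_sizes: List[int], rng: random.Random) -> List[Set[str]]:
    """Greedy heuristic: always hand out the least-shared item (assumed rule)."""
    counts: Dict[str, int] = {t: 0 for t in items}    # agents holding each item
    allocations: List[Set[str]] = [set() for _ in sample_sizes]
    remaining = list(sample_sizes)
    while any(remaining):
        for i, need in enumerate(remaining):          # serve agents round-robin
            if need == 0:
                continue
            candidates = [t for t in items if t not in allocations[i]]
            least = min(counts[t] for t in candidates)
            # Break ties at random among the least-shared candidates.
            pick = rng.choice([t for t in candidates if counts[t] == least])
            allocations[i].add(pick)
            counts[pick] += 1
            remaining[i] -= 1
    return allocations


def total_overlap(allocations: List[Set[str]]) -> int:
    """Sum of |R_i & R_j| over unordered pairs of agents."""
    n = len(allocations)
    return sum(len(allocations[i] & allocations[j]) for i in range(n) for j in range(i + 1, n))


rng = random.Random(0)
items = [f"profile{k}" for k in range(8)]
sizes = [2, 2, 2, 2]                                  # 4 agents, 2 profiles each
print("s-random overlap:", total_overlap(s_random(items, sizes, rng)))
print("greedy overlap:  ", total_overlap(s_overlap_greedy(items, sizes, rng)))
```

With 8 items and 4 agents requesting 2 each, the greedy heuristic always produces disjoint sets, whereas the random allocator may or may not, depending on the draw.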
[Figure: plots for Sample Requests and Explicit Requests of min Δ = min over i ≠ j of ( Pr(Gi | S = Ri) − Pr(Gj | S = Ri) ).]