Towards a Theory of Data Entanglement

James Aspnes, Joan Feigenbaum, Aleksandr Yampolskiy, and Sheng Zhong

(Yale University)

Outline

Motivation

Dagster and Tangler

Our model

Notions of entanglement

Possibility and impossibility results

Conclusion

Goal: Protect Remotely Stored Data from the Server

Question: Suppose you store your data on a remote server. How do you ensure that it is not corrupted by the server?

Answer: Have your data entangled with some VIPs’ data so that corruption of your data implies corruption of theirs.

Previous Work: Dagster [SW01]

[Figure: a new document is encrypted and combined with c randomly chosen blocks from the pool of blocks.]

Analysis:

Deleting a typical document causes the loss of O(c) documents.

Previous Work: Tangler [WM01]

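[Figure: the point (0, new document) and 2 randomly chosen blocks from the pool of n blocks are used to interpolate a degree-2 polynomial F(); the new blocks (x1, F(x1)) and (x2, F(x2)) are added to the pool.]

Analysis:

Deleting a typical document causes the loss of O((log n) / n) documents.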
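The interpolation step used by Tangler can be sketched as follows. This is a minimal illustration, not Tangler's actual implementation: the prime modulus `P`, the helper names `lagrange_eval` and `entangle`, the toy pool, and the way fresh x-coordinates are picked are all assumptions made for the example.

```python
import random

P = 2**61 - 1  # assumed prime modulus (a Mersenne prime); not from the slides


def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial through `points`, working mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def entangle(document, pool, rng):
    """Entangle `document` (an int < P) with 2 randomly chosen pool blocks.

    Returns the two old blocks used and the two new blocks (x, F(x)).
    """
    old = rng.sample(pool, 2)
    points = [(0, document)] + old  # three points fix a degree-2 polynomial F
    used_x = {x for x, _ in pool} | {0}
    new = []
    while len(new) < 2:
        x = rng.randrange(1, P)
        if x not in used_x:
            used_x.add(x)
            new.append((x, lagrange_eval(points, x)))
    return old, new


rng = random.Random(0)
pool = [(x, rng.randrange(P)) for x in range(1, 11)]  # toy pool of n = 10 blocks
doc = 123456789
old, new = entangle(doc, pool, rng)
pool.extend(new)

# Any 3 of the 4 blocks determine F, and F(0) is the document.
recovered = lagrange_eval([old[0], new[0], new[1]], 0)
```

Because any three of the four blocks suffice, losing a single block does not destroy the document, which is consistent with the weak dependency bound in the analysis.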

Our Model: Basic Framework

Initialization: Keys are distributed to participants.

Entanglement: Users’ data are combined into a common store.

Tampering: Adversary tampers with the store before it is stored on server.

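[Figure: the initializer I distributes keys k1, k2, …, kn to the users and kE to the encoding algorithm E; E combines documents d1, d2, …, dn into a store, which passes through the tamperer to the storage server.]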

Our Model: Basic Framework (cont.)

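Recovery: Users attempt to recover their data.

If user i's recovery algorithm Ri returns the original document di, we say that user i recovers her data.

[Figure: users holding keys k1, k2, …, kn run their recovery algorithms on the store held by the storage server.]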

Our Model: Classification

Question: What can the adversary do to the data store?

Answer: He can…

tamper with the store;

tamper with the store and distribute a new recovery algorithm to all users (upgrade attack);

encrypt the store and distribute his recovery algorithm only to a few select buddies (superencryption attack).

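Our Model: Classification (cont.)

Classification based on recovery algorithm:

Standard recovery algorithm: all users use the standard recovery algorithm.

Public recovery algorithm: all users use the adversary's recovery algorithm.

Private recovery algorithm: only some users use the adversary's recovery algorithm.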

Our Model: Classification (cont.)

Classification based on corrupting algorithm:

Destructive adversary that reduces entropy of the data store.

Arbitrary adversary.

Altogether, we have 6 (= 3 × 2) adversary classes.

Our Definitions

Fix an encoding scheme, an adversary, and recovery algorithms Ri.

The recovery vector summarizes which documents are recovered: entry i records whether user i recovers di.
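As a small illustration of the recovery vector, the sketch below assumes the convention that entry i is 1 when user i's recovery algorithm returns the original document di and 0 otherwise. The function name `recovery_vector`, the dictionary-based store, and the `lookup` recovery algorithm are hypothetical and involve no real entanglement.

```python
def recovery_vector(documents, recovery_algorithms, keys, store):
    """Return a list r where r[i] is 1 if user i gets back documents[i], else 0."""
    return [
        1 if recover(key, store) == doc else 0
        for doc, recover, key in zip(documents, recovery_algorithms, keys)
    ]


def lookup(key, store):
    """Toy recovery algorithm: read the entry stored under the user's key."""
    return store.get(key)


# Toy store: a dict from key to document (assumed for illustration).
documents = ["d1", "d2", "d3"]
keys = ["k1", "k2", "k3"]
store = dict(zip(keys, documents))

# Simulated tampering: the adversary overwrites the entry for user 2.
store["k2"] = "garbage"

r = recovery_vector(documents, [lookup] * 3, keys, store)
```

Here only the second user fails to recover, which is exactly the kind of targeted outcome that the entanglement notions are meant to rule out.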

Our Definitions (cont.)

Data dependency: di depends on dj if, with high probability, recovery of di implies recovery of dj.

[Figure: dependency graph on documents d1, d2, d3, d4, with an edge showing that d1 depends on d2.]


All-or-nothing integrity (AONI): every document depends on every other document:

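[Figure: dependency graph on documents d1, d2, d3, d4 in which every document depends on every other document.]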


Symmetric recovery: adversary cannot bias which documents are recovered

Possibility of AONI in Standard-Recovery Model

All users use the standard recovery algorithm: for all i, Ri = R.

When combining data, mark data store using an unforgeable Message Authentication Code (MAC).

Standard recovery algorithm checks MAC: If MAC is valid, recover data. If MAC is invalid, refuse to recover data.
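The MAC check can be sketched with Python's standard `hmac` module. This is a minimal sketch under stated assumptions: HMAC-SHA256 stands in for the unforgeable MAC, the MAC key is assumed to be available to the standard recovery algorithm, and `seal`, `standard_recover`, and `extract` are hypothetical names (`extract` is a placeholder for the scheme-specific step that pulls one user's document out of a valid store).

```python
import hashlib
import hmac

TAG_LEN = hashlib.sha256().digest_size  # 32 bytes


def seal(mac_key: bytes, store: bytes) -> bytes:
    """Append an HMAC-SHA256 tag to the combined store."""
    return store + hmac.new(mac_key, store, hashlib.sha256).digest()


def standard_recover(mac_key: bytes, sealed: bytes, extract):
    """Standard recovery R: check the MAC first, then recover or refuse.

    Returns None when recovery is refused.
    """
    store, tag = sealed[:-TAG_LEN], sealed[-TAG_LEN:]
    expected = hmac.new(mac_key, store, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None  # MAC invalid: every user refuses to recover
    return extract(store)


def first_field(store: bytes) -> bytes:
    """Toy extraction step: the first '|'-separated field."""
    return store.split(b"|")[0]


key = b"shared-mac-key"  # assumed key, for illustration only
sealed = seal(key, b"alice-data|bob-data")
ok = standard_recover(key, sealed, first_field)

# Flip one bit of the store: the tag no longer matches.
tampered = bytes([sealed[0] ^ 1]) + sealed[1:]
bad = standard_recover(key, tampered, first_field)
```

Because every user runs the same check on the same store, either all of them proceed to recovery or all of them refuse, which is the all-or-nothing behavior claimed for this model.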

Impossibility of AONI in Public and Private-Recovery Models

If any users use the adversary’s recovery algorithm (for some i, Ri ≠ R), AONI cannot be achieved.

The adversary modifies the data store so that the old recovery algorithm does not work, and distributes a new recovery algorithm that flips a coin to decide whether to recover data or not.

Impossibility of AONI in Public and Private-Recovery Models (cont.)

With high probability, not all coin flips will have same result.

With high probability, some data are recovered while others are not.
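To make "with high probability" concrete, the sketch below computes the chance that the coin flips disagree, assuming each of the n users flips an independent fair coin (the slides do not specify the bias). The function name `prob_partial_recovery` is hypothetical.

```python
from fractions import Fraction


def prob_partial_recovery(n: int) -> Fraction:
    """Probability that n independent fair coin flips do not all agree."""
    all_agree = 2 * Fraction(1, 2) ** n  # all heads or all tails
    return 1 - all_agree


for n in (2, 10, 30):
    print(n, float(prob_partial_recovery(n)))
```

Under these assumptions the probability is 1 - 2^(1-n), so it approaches 1 exponentially fast in the number of users.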


Possibility of Symmetric Recovery in Public-Recovery Model

All users use the adversary’s recovery algorithm.

We can prevent targeted destruction of documents.

Documents d1, …, dn must appear i.i.d.

Encoding scheme must be symmetric.

Possibility of AONI for Destructive Adversaries

We can achieve AONI in all recovery models if the tamperer destroys entropy.

When combining data, interpolate a polynomial using points (ki, di). Store = polynomial.

AONI is achieved if sufficient entropy is removed.

Many stores are mapped to a single corrupted store.

With high probability, users cannot recover every data item.
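The polynomial store can be sketched as follows. This is a toy illustration under stated assumptions, not the construction's actual parameters: the small prime `P`, the coefficient-list representation, and the names `entangle`, `recover`, and `destructive_tamper` are all hypothetical, and the tamperer shown (zeroing the leading coefficient) is just one simple way of mapping many stores to a single corrupted store.

```python
P = 101  # assumed small prime field, for illustration only


def poly_mul(a, b):
    """Multiply two coefficient lists (lowest degree first) mod P."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % P
    return out


def entangle(points):
    """Return coefficients of the polynomial through all (k_i, d_i) mod P."""
    coeffs = [0] * len(points)
    for i, (ki, di) in enumerate(points):
        basis, denom = [1], 1
        for j, (kj, _) in enumerate(points):
            if i != j:
                basis = poly_mul(basis, [-kj % P, 1])  # multiply by (x - k_j)
                denom = denom * (ki - kj) % P
        scale = di * pow(denom, -1, P) % P  # modular inverse (Python 3.8+)
        coeffs = [(c + scale * b) % P for c, b in zip(coeffs, basis)]
    return coeffs


def recover(store, key):
    """Evaluate the stored polynomial at the user's key (Horner's rule)."""
    value = 0
    for c in reversed(store):
        value = (value * key + c) % P
    return value


def destructive_tamper(store):
    """Toy entropy-destroying tamperer: zero the leading coefficient."""
    return store[:-1] + [0]


keys, docs = [1, 2, 3], [42, 7, 99]
store = entangle(list(zip(keys, docs)))
intact = [recover(store, k) == d for k, d in zip(keys, docs)]
damaged = [recover(destructive_tamper(store), k) == d for k, d in zip(keys, docs)]
```

In this toy run every document is recovered from the intact store and none from the tampered one: zeroing the leading coefficient shifts the value at each nonzero key, so the users fail together rather than one at a time.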

Summary of Results

                    Destructive Tamperer    Arbitrary Tamperer

Standard Recovery   all-or-nothing          all-or-nothing

Public Recovery     all-or-nothing          symmetric recovery

Private Recovery    all-or-nothing          -

Future Work

We have considered a single-round model. Allowing multiple rounds of storage/retrieval will be more realistic.

What if data entanglement is combined with other techniques like replication? Will that help to defend data against untrusted server(s)?
