Using Declarative Invariants for Protecting File-System Integrity
By Kuei (Jack) Sun, Daniel Fryer, Ashvin Goel and Angela Demke Brown, University of Toronto


Post on 11-Jan-2016


Page 1: Using Declarative Invariants for Protecting File-System Integrity

By Kuei (Jack) Sun, Daniel Fryer, Ashvin Goel and Angela Demke Brown
University of Toronto

Page 2: Motivation

File systems have bugs
• Cause corruption and/or data loss
• Existing reliability techniques, e.g., journaling, RAID, don’t help

Existing recovery solutions
• Restore from backup, but slow, risks loss of data
• Offline checker (e.g., fsck), but too slow

Can we do better?

Page 3: Possible Alternatives?

Eliminate bugs
• Static analysis
  - Does not scale
  - Bugs may be input dependent

Tolerate bugs
• N-version programming (e.g., EnvyFS)
  - High overheads (performance and storage)
  - Can only check features common to all versions
• Micro-reboot of file system (e.g., Membrane)
  - Requires detectable failures
  - Many corruption bugs are fail-silent

Page 4: Our Approach

Verify correctness at runtime
• Benefit: Make silent failures detectable

What are we checking?
• Same thing fsck checks

When are we checking for consistency?
• When file system claims to be consistent
  - Leverage transactions provided for crash recovery
  - I.e., check at commit time

How are we doing the checks?
• Convert global fsck checks into local checks on transactions

Page 5: No Really… How?

File systems have consistency properties
• E.g., all in-use data blocks are marked in the block allocation bitmap
• This is a global property that fsck checks

For each property, we derive an invariant
• Invariant must hold in any transaction to preserve the corresponding property
• E.g., when block N is allocated, bit N must be set in the allocation bitmap, in the same transaction
• Invariants operate on changes in transactions, requiring local checks

Page 6: Recon Data Flow

[Figure: Recon data flow. The Write Cache (modified block) and the Read Cache (original block) feed the Logical Difference Engine, which produces Change Records; the Invariant Checker applies the Invariants to them and flags any Violation. A callout reads "The focus of work…".]

Change records encode the updates in a transaction

Page 7: Change Record

Example: write 8 bytes of data into an empty file with inode #1

Change Record [type, id, field, old, new] | Description
[inode, 1, 3, 0, 7] | For inode #1, set direct block pointer (field 3 in inode) from 0 to 7, i.e., allocate block 7 to inode #1
[b_freemap, 7, _, 0, 1] | Set bit 7 on (from 0 to 1)
[inode, 1, 2, 0, 8] | For inode #1, set size (field 2 of inode) from 0 to 8

[Figure callouts: Block 7 allocated; Direct block; Bit set]
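As a rough illustration of how change records like the three above could be produced, the sketch below diffs an old and a new version of an inode and of the block bitmap. It is not Recon's logical difference engine: the function names, the dictionary and list representations, and the use of `None` for the `_` placeholder are all assumptions made for this example. Only the field numbers (2 for size, 3 for the direct block pointer) come from the slide.

```python
from typing import Dict, List, Optional, Tuple

# (type, id, field, old, new); None plays the role of "_" on the slide.
ChangeRecord = Tuple[str, int, Optional[int], int, int]

SIZE, DIRECT_PTR = 2, 3  # field numbers taken from the slide's example


def diff_inode(inode_no: int, old: Dict[int, int], new: Dict[int, int]) -> List[ChangeRecord]:
    """Emit one record per inode field whose value differs (hypothetical helper)."""
    return [("inode", inode_no, field, old[field], new[field])
            for field in sorted(new) if old[field] != new[field]]


def diff_bitmap(old: List[int], new: List[int]) -> List[ChangeRecord]:
    """Emit one record per bitmap bit that flipped (hypothetical helper)."""
    return [("b_freemap", bit, None, o, n)
            for bit, (o, n) in enumerate(zip(old, new)) if o != n]


# Writing 8 bytes into empty inode #1: size becomes 8, block 7 is allocated.
records = (diff_inode(1, {SIZE: 0, DIRECT_PTR: 0}, {SIZE: 8, DIRECT_PTR: 7})
           + diff_bitmap([0] * 8, [0] * 7 + [1]))
```

The resulting list holds the same three records as the table, although in a different order, since the sketch sorts inode fields by number.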

Page 8: How to Express Invariants?

Recon: C (imperative)
SQCK: SQL (query)?
Recon: Datalog (logic)
• Clean and concise
• Works well with recursive structures
• Easy to reason about

Page 9: Datalog Invariant Checking

Datalog facts: change(type, id, field, old, new)

change(inode, 1, 3, 0, 7).
change(b_freemap, 7, _, 0, 1).
change(inode, 1, 2, 0, 8).

Change records are trivially converted to Datalog facts

R1_violation(IN, BN) :- block_allocated(IN, BN),
    not(change(b_freemap, BN, _, _, 1)).

R2_violation(BN) :- change(b_freemap, BN, _, _, 1),
    not(block_allocated(_, BN)).
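To make the two rules concrete, here is a minimal Python sketch that evaluates R1 and R2 over a list of five-field change facts. The slides do not define `block_allocated`, so the sketch assumes it holds when an inode's direct block pointer (field 3 in the earlier example) changes from 0 to a block number. That definition, the `Change` tuple, and the function names are illustrative assumptions, not Recon's code.

```python
from typing import List, NamedTuple, Optional, Set, Tuple


class Change(NamedTuple):
    type: str
    id: int
    field: Optional[int]  # None stands for "_" on the slides
    old: int
    new: int


DIRECT_PTR = 3  # assumption: the only block pointer field in this toy model


def block_allocated(changes: List[Change]) -> Set[Tuple[int, int]]:
    """Assumed definition: (inode, block) pairs whose pointer went from 0 to a block."""
    return {(c.id, c.new) for c in changes
            if c.type == "inode" and c.field == DIRECT_PTR and c.old == 0 and c.new != 0}


def bits_set(changes: List[Change]) -> Set[int]:
    """Bitmap bits that this transaction sets to 1."""
    return {c.id for c in changes if c.type == "b_freemap" and c.new == 1}


def r1_violations(changes: List[Change]) -> Set[Tuple[int, int]]:
    """R1: a block was allocated but its bit was not set in the same transaction."""
    marked = bits_set(changes)
    return {(ino, blk) for ino, blk in block_allocated(changes) if blk not in marked}


def r2_violations(changes: List[Change]) -> Set[int]:
    """R2: a bit was set but no inode allocated that block in the same transaction."""
    allocated = {blk for _, blk in block_allocated(changes)}
    return {blk for blk in bits_set(changes) if blk not in allocated}


good_txn = [Change("inode", 1, 3, 0, 7),
            Change("b_freemap", 7, None, 0, 1),
            Change("inode", 1, 2, 0, 8)]
```

For `good_txn` both functions return an empty set. Dropping the `b_freemap` fact makes `r1_violations` report the pair (1, 7), and dropping the pointer fact makes `r2_violations` report block 7.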

Page 10: Invariant Checking

On each transaction
• Add facts from change records into Datalog knowledge base
• Check all invariants on Datalog facts

Problem
• Set of facts grows over time
• Facts need to persist across reboots
  - Slows invariant checking
  - Introduces more consistency problems

Insight
• After commit, all facts in the transaction are incorporated in file system

Page 11: Querying File System State

• FS state is available in Recon caches
• We provide Datalog primitives to access caches
• Can discard all facts after transaction commit

[Figure: components labeled Disk, Read Cache, Write Cache, Change Record, Fact, Primitives, Datalog Interpreter, Invariants, Violation?]
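The idea of keeping facts only for the lifetime of one transaction, while primitives answer questions about older state from the caches, can be sketched as follows. This is a toy model under stated assumptions: the class, its method names, the bitmap-only state, the sample invariant, and the choice to drop pending writes when an invariant fails are all invented for illustration and are not taken from Recon.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Fact = Tuple[str, int, Optional[int], int, int]  # (type, id, field, old, new)


class ReconSketch:
    """Toy model: facts live for one transaction, state lives in the caches."""

    def __init__(self, nblocks: int) -> None:
        self.read_cache: List[int] = [0] * nblocks   # committed bitmap
        self.write_cache: Dict[int, int] = {}        # pending bit updates
        self.facts: List[Fact] = []                  # current transaction only

    def bit(self, n: int) -> int:
        """Primitive: current value of bit n, write cache first, then read cache."""
        return self.write_cache.get(n, self.read_cache[n])

    def change_bit(self, n: int, new: int) -> None:
        """Record a bitmap update as a change fact and stage it in the write cache."""
        self.facts.append(("b_freemap", n, None, self.bit(n), new))
        self.write_cache[n] = new

    def commit(self, invariants: Sequence[Callable[["ReconSketch"], bool]] = ()) -> List[str]:
        """Check invariants, fold pending writes into the read cache, discard facts."""
        violations = [inv.__name__ for inv in invariants if not inv(self)]
        if not violations:
            for n, value in self.write_cache.items():
                self.read_cache[n] = value
        self.write_cache.clear()   # assumption: a failed transaction is dropped
        self.facts.clear()         # facts never outlive the transaction
        return violations


def no_double_allocation(r: ReconSketch) -> bool:
    """Hypothetical invariant: a bit set to 1 must have been 0 before."""
    return all(old == 0 for (t, _, _, old, new) in r.facts
               if t == "b_freemap" and new == 1)
```

After a first transaction sets bit 7 and commits, `facts` is empty again, yet `bit(7)` still returns 1 because the committed state is served from the read cache. The fact base therefore stays bounded by the size of one transaction.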

Page 12: Using Primitives

Example: Directory Cycle Detection

path(X, P) :- dir_get_parent(X, P).
path(X, A) :- dir_get_parent(X, P), path(P, A).
cycle(X) :- path(X, X).

[Figure: directory chain /, a, b, c (i.e., /a/b/c), with a callout labeled "Primitive"]

No cycle for this tree!
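The three rules amount to computing the transitive closure of the parent relation and looking for a directory that is its own ancestor. The sketch below does this with a naive fixpoint loop in Python. The dictionary standing in for `dir_get_parent` and the choice to give the root no parent entry are assumptions made for this example.

```python
from typing import Dict, Set, Tuple


def path_pairs(parent: Dict[str, str]) -> Set[Tuple[str, str]]:
    """All (X, A) such that A is an ancestor of X, mirroring the two path rules."""
    pairs = set(parent.items())            # path(X, P) :- dir_get_parent(X, P).
    changed = True
    while changed:                         # repeat until no new pair is derived
        changed = False
        for x, p in parent.items():        # path(X, A) :- dir_get_parent(X, P),
            for q, a in list(pairs):       #               path(P, A).
                if q == p and (x, a) not in pairs:
                    pairs.add((x, a))
                    changed = True
    return pairs


def cycles(parent: Dict[str, str]) -> Set[str]:
    """cycle(X) :- path(X, X)."""
    return {x for x, a in path_pairs(parent) if x == a}


# The tree /a/b/c from the slide; the root is given no parent (assumption).
tree = {"a": "/", "b": "a", "c": "b"}
no_cycle = cycles(tree)
```

For this tree `no_cycle` is the empty set, which matches the slide. A Datalog engine would evaluate the same rules with a more efficient strategy; the loop here only shows what is being derived.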

Page 13: Current Status

Implemented for a simple test file system
• TestFS implemented at user level
• Designed to be a simplified version of Ext3
  - All TestFS invariants are applicable to Ext3

TestFS has 12 Datalog invariants
  - Ext3 has 33 invariants in C
  - Invariants are independent

Total number of lines of invariant code is 38

Page 14: Future Work

Datalog invariants for Ext3, Btrfs file systems
• Currently, Ext3/Btrfs Recon is implemented in OS
• We plan to implement it in a hypervisor to provide strong fault model
• Don’t need to port Datalog to kernel!

Customize Datalog interpreter
• Optimize for file-system specific operations

Page 15: Conclusion

The Recon framework allows detecting arbitrary metadata corruption through runtime consistency checking

When a transaction commits, Recon checks invariants to ensure file system consistency

Invariants can be expressed in Datalog clearly and concisely

Page 16: Questions?

Page 17: Evaluation

Workload
• ~203K commands
  - e.g., mkdir, rmdir, rm, touch, cd, write to file

       | Original   | With Checking | Overhead
User   | 17.8±0.2s  | 36.4±0.1s     | 2.04x
System | 22.5±0.1s  | 23.1±0.1s     | 1.03x
Sleep  | 545.4±9.1s | 604.9±9.0s    | 1.11x
Total  | 585.8±9.2s | 664.4±9.1s    | 1.13x

Page 18: Directory Cycle Detection

Example: move /a into /a/b/c

[Figure: the directory tree /, a, b, c before and after the move; legend: child entry, parent entry]

Page 19: Directory Cycle Detection

Change records for move /a into /a/b/c

Datalog Fact [type, id, field, old, new] | Description
change(dir_block, 0, 1, ‘a’, φ). | Remove entry ‘a’ from root node (inode #0)
change(dir_block, 1, 0, ‘..’, φ). | Remove entry ‘..’ from inode #1
change(dir_block, 3, 1, φ, ‘a’). | Add entry ‘a’ to inode #3 (inode for directory ‘c’)
change(dir_block, 1, 3, φ, ‘..’). | Add entry ‘..’ to inode #1 and set it to inode #3 (directory ‘c’ is its parent)

Page 20: Invariant Checking

path(IN, PIN) :- dir_get_parent(IN, PIN).
path(IN, AIN) :- dir_get_parent(IN, PIN), path(PIN, AIN).
cycle(IN) :- path(IN, IN).

Evaluating cycle(3), where parent abbreviates dir_get_parent:
• path(3, 3).
  - parent(3, 3).
  - parent(3, ?), path(?, 3).
  - parent(3, 2), path(2, 3).
    - parent(3, 2), parent(2, 3).
    - parent(3, 2), parent(2, ?), path(?, 3).
    - parent(3, 2), parent(2, 1), path(1, 3).
      - parent(3, 2), parent(2, 1), parent(1, 3).

We have a match, a.k.a. violation!

[Figure: directories a (inode 1), b (inode 2) and c (inode 3)]

Page 21: Primitives

Problem:
• The set of change records that we have is insufficient.
• From the transaction alone, we cannot deduce the parent of ‘c’ and ‘b’. We know the parent of ‘a’ is ‘c’.

change(dir_block, 3, 1, φ, ‘a’).
change(dir_block, 1, 3, φ, ‘..’).

Solution:
• Primitives are predicates written in the C language that are able to query the read and write caches in Recon

[Figure: directories a (inode 1), b (inode 2) and c (inode 3)]
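A small Python sketch can show why the change records alone are not enough and how a primitive that consults both caches closes the gap. The real primitives are written in C; everything here (the function names, the dictionaries standing in for the caches, `None` for φ, and the reading of each fact as directory inode, target inode, old name, new name) is an assumption based on the rows of the change-record table for the move of /a into /a/b/c.

```python
from typing import Dict, List, Optional, Tuple

PHI = None  # stands in for the slides' φ (no entry)
Fact = Tuple[str, int, int, Optional[str], Optional[str]]


def pending_parents(changes: List[Fact]) -> Dict[int, Optional[int]]:
    """Write-cache view of '..' entries touched by this transaction (hypothetical)."""
    pending: Dict[int, Optional[int]] = {}
    for type_, dir_ino, target_ino, _old, new in changes:
        if type_ == "dir_block" and new == "..":
            pending[dir_ino] = target_ino          # '..' added: new parent
    for type_, dir_ino, _target, old, _new in changes:
        if type_ == "dir_block" and old == "..":
            pending.setdefault(dir_ino, PHI)       # '..' removed and not re-added
    return pending


def dir_get_parent(ino: int, write_cache: Dict[int, Optional[int]],
                   read_cache: Dict[int, int]) -> Optional[int]:
    """Primitive: the transaction's pending value wins, else the committed one."""
    if ino in write_cache:
        return write_cache[ino]
    return read_cache.get(ino)


def on_cycle(ino: int, write_cache: Dict[int, Optional[int]],
             read_cache: Dict[int, int]) -> bool:
    """Follow parents from ino and report whether the walk returns to ino."""
    seen = set()
    cur = dir_get_parent(ino, write_cache, read_cache)
    while cur is not None and cur not in seen:
        if cur == ino:
            return True
        seen.add(cur)
        cur = dir_get_parent(cur, write_cache, read_cache)
    return False


# Committed state before the move: a(1) under /(0), b(2) under a, c(3) under b.
read_cache = {1: 0, 2: 1, 3: 2}
move = [("dir_block", 0, 1, "a", PHI), ("dir_block", 1, 0, "..", PHI),
        ("dir_block", 3, 1, PHI, "a"), ("dir_block", 1, 3, PHI, "..")]
write_cache = pending_parents(move)
cycle_found = on_cycle(3, write_cache, read_cache)
```

With both caches available, the walk from inode 3 goes to 2, then 1, then back to 3, so `cycle_found` is true. With an empty read cache the parent of inode 3 is unknown and the same walk finds nothing, which is the insufficiency the slide describes.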