
Page 1: CrashMonkey: A Framework to Automatically Test File …vijay/papers/hotstorage17-crashmonkey-slides.pdf•Test creates 10 1KB files in a 10MB ext4 file system •Majority of time spent

CrashMonkey: A Framework to Systematically Test File-System Crash Consistency

Ashlie Martinez

Vijay Chidambaram

University of Texas at Austin


Crash Consistency

• File-system updates change multiple blocks on storage

• Data blocks, inodes, and superblock may all need updating

• Changes need to happen atomically

• Need to ensure the file system remains consistent if the system crashes

• Ensures that data is not lost or corrupted

• File data is correct

• Links to directories and files unaffected

• All free data blocks are accounted for

• Techniques: journaling, copy-on-write

• Crash consistency is complex and hard to implement


Testing Crash Consistency

• Randomly power cycling a VM or machine

• Random crashes unlikely to reveal bugs

• Restarting machine or VM after crash is slow

• Killing user space file-system process

• Requires special file-system design

• Ad-hoc

• Despite its importance, no standardized or systematic tests


What Really Needs to Be Tested?

• Current tests write data to disk each time

• Crashing while writing data is not the goal

• True goal is to generate the disk states that a crash could cause


CrashMonkey

Framework to test crash consistency

Works by constructing crash states for given workload

Does not require reboot of OS/VM

File-system agnostic

Modular, extensible

Currently tests 100,000 crash states in ~10min


Outline

• Overview

• How Consistency is Tested Today

• Linux Writes

• CrashMonkey

• Preliminary Results

• Future Plans

• Conclusion


How Consistency Is Tested Today

• Power cycle a machine or VM

• Crash machine/VM while data is being written to disk

• Reboot machine and check file system

• Random and slow

• Run file system in user space

• ZFS test strategy

• Kill file system user process during write operations

• Requires that the file system be able to run in user space

[Figure: "Write to foo.txt" is interrupted by a crash, followed by "Rebooting – Please Wait..."]


Linux Storage Stack

• VFS: Provides consistent interface across file systems

• Page Cache: Holds recently used files and data

• File System: Ext, NTFS, etc.

• Generic Block Layer: Interface between file systems and device drivers

• Block Device Driver: Device-specific driver

• Disk Cache: Caches data on block device

• Block Device: Persistent storage device


Linux Writes – Write Flags

• Metadata attached to operations sent to device driver

• Change how the OS and device driver order operations

• Both IO scheduler and disk cache reorder requests

• sync – denotes process waiting for this write

• Orders writes issued with sync in that process

• flush – all data in the device cache should be persisted

• If request has data, data may not be persisted at return

• Forced Unit Access (FUA) – return when data is persisted

• Often paired with flush so all data including request is durable


Linux Writes

• Data written to disk in epochs

• each terminated by flush and/or FUA operations

• Reordering within epochs

• Operating system adheres to FUA, flush, and sync flags

• Block device adheres to FUA and flush flags
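The epoch rule on this slide can be sketched in C++. This is only an illustration, not CrashMonkey's code: the `Write` struct, the flag constants, and `split_into_epochs` are hypothetical names, and the flag values are made up for the example. The sketch assumes that any operation carrying a flush or FUA flag closes the current epoch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical flag bits for this sketch (not the kernel's actual values).
constexpr std::uint32_t kSync  = 1u << 0;
constexpr std::uint32_t kMeta  = 1u << 1;
constexpr std::uint32_t kFlush = 1u << 2;
constexpr std::uint32_t kFua   = 1u << 3;

struct Write {
  std::string name;     // label of the operation, e.g. "A"
  std::uint32_t flags;  // bitwise OR of the flag bits above
};

using Epoch = std::vector<Write>;

// Groups writes into epochs; a flush or FUA operation closes the current epoch.
std::vector<Epoch> split_into_epochs(const std::vector<Write>& writes) {
  std::vector<Epoch> epochs;
  Epoch current;
  for (const Write& w : writes) {
    current.push_back(w);
    if (w.flags & (kFlush | kFua)) {
      epochs.push_back(current);
      current.clear();
    }
  }
  // Keep a trailing epoch that was never closed by a flush or FUA.
  if (!current.empty()) epochs.push_back(current);
  return epochs;
}

// The eight operations A..H shown in the figure for this slide.
std::vector<Write> example_writes() {
  return {{"A", 0},     {"B", kMeta}, {"C", kSync}, {"D", kFlush},
          {"E", kSync}, {"F", kSync}, {"G", kSync}, {"H", kFua | kFlush}};
}
```

Calling `split_into_epochs(example_writes())` groups A through D into the first epoch and E through H into the second, matching the figure.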

[Figure: writes grouped into two epochs]

• Epoch 1: A: write; B: write, meta; C: write, sync; D: flush

• Epoch 2: E: write, sync; F: write, sync; G: write, sync; H: FUA, flush

Linux Writes – Example

echo "Hello World!" > foo.txt

[Figure, built up over four slides: the operating system issues three epochs and the block device persists them one epoch at a time. Within epoch 1 the block device reorders the two data writes.]

Operating System

• epoch 1: Data 1, Data 2, flush

• epoch 2: Journal: inode, flush

• epoch 3: Journal: commit, flush

Block Device

• epoch 1: Data 2, Data 1, flush

• epoch 2: Journal: inode, flush

• epoch 3: Journal: commit, flush


Goals for CrashMonkey

• Fast

• Ability to intelligently and systematically direct tests toward interesting crash states

• File-system agnostic

• Works out of the box without the need for recompiling the kernel

• Easily extendable and customizable


CrashMonkey: Architecture

[Figure: CrashMonkey architecture, split into user space and kernel]

User

• User Workload: User-provided file-system operations

• Test Harness

Kernel

• File System

• Generic Block Layer

• Device Wrapper: Records information about user workload

• Custom RAM Block Device: Provides fast writable snapshot capability

Output

• Crash State 1, Crash State 2: Generated potential crash states

Constructing Crash States

touch foo.txt

echo "foo bar baz" > foo.txt

[Figure, built up over three slides: epochs recorded for the workload, and one crash state constructed from them]

Recorded epochs

• epoch 1: Journal: inode, flush

• epoch 2: Data 1, Data 2, Data 3, flush

• epoch 3: Journal: inode, flush

Steps

• Randomly choose n epochs to permute (n = 2 here)

• Copy epochs [1, n – 1]

• Permute and possibly drop operations from epoch n

Resulting crash state

• epoch 1: Journal: inode, flush

• epoch 2: Data 3, Data 1
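The three steps on this slide (choose n, copy epochs [1, n – 1], permute and possibly drop operations from epoch n) can be sketched as follows. This is a minimal illustration under stated assumptions, not CrashMonkey's permuter: `Op`, `Epoch`, `make_crash_state`, and `example_epochs` are hypothetical names, operations are plain strings, and the sketch assumes the crash happens before the closing flush of epoch n, so that flush is excluded from the crash state.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <string>
#include <vector>

using Op = std::string;         // hypothetical: one recorded operation
using Epoch = std::vector<Op>;  // operations of one epoch, ending in "flush"

// Builds one crash state from the recorded epochs (which must be non-empty).
std::vector<Op> make_crash_state(const std::vector<Epoch>& epochs,
                                 std::mt19937& rng) {
  // Step 1: randomly choose n in [1, number of epochs].
  std::uniform_int_distribution<std::size_t> pick_n(1, epochs.size());
  const std::size_t n = pick_n(rng);

  // Step 2: copy epochs [1, n - 1] unchanged.
  std::vector<Op> state;
  for (std::size_t i = 0; i + 1 < n; ++i)
    state.insert(state.end(), epochs[i].begin(), epochs[i].end());

  // Step 3: permute epoch n and possibly drop operations.
  // Assumption: the crash precedes the closing flush, so it is left out.
  Epoch last = epochs[n - 1];
  if (!last.empty() && last.back() == "flush") last.pop_back();
  std::shuffle(last.begin(), last.end(), rng);
  std::uniform_int_distribution<std::size_t> pick_keep(0, last.size());
  last.resize(pick_keep(rng));  // keep a random-length prefix of the shuffle
  state.insert(state.end(), last.begin(), last.end());
  return state;
}

// The three epochs recorded for the workload on this slide.
std::vector<Epoch> example_epochs() {
  return {{"Journal: inode", "flush"},
          {"Data 1", "Data 2", "Data 3", "flush"},
          {"Journal: inode", "flush"}};
}
```

With n = 2, one possible result is epoch 1 copied in full followed by "Data 3" and "Data 1", which is the crash state shown on the slide. Shuffling and then truncating is only one way to combine reordering with dropping; other selection strategies would fit the same interface.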

CrashMonkey In Action

[Figure, built up over nine slides: the user workload, test harness, and device wrapper operate on a base disk, then on a writable snapshot, then on a crash state. The captions of the successive slides are listed below.]

• Workload Setup: setup operations such as mkdir test write metadata to the base disk

• Snapshot Device: the base disk becomes a writable snapshot

• Profile Workload: the workload (echo "bar baz" > foo.txt) runs on the writable snapshot, producing data and metadata writes

• Export Data: the recorded data and metadata writes are exported

• Restore Snapshot: the snapshot is restored to serve as the starting point of a crash state

• Reorder Data

• Write Reordered Data to Snapshot

• Check File-System Consistency


Testing Consistency

• Different types of consistency

• File system is inconsistent and unfixable

• File system is consistent but contains garbage data

• File system has leaked inodes but is recoverable

• File system is consistent and data is good

• Currently run fsck on all disk states

• Check only certain parts of file system for consistency

• Users can define checks for data consistency
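One way to map a checker run onto the four outcomes listed above is sketched below. The mapping is an assumption made for illustration, not CrashMonkey's logic: `Outcome` and `classify` are hypothetical names, and the only external facts used are the exit-status bits documented in the fsck(8) manual page (1 for errors corrected, 4 for errors left uncorrected, 8 and above for failures of the checker itself). The `data_check_passed` argument stands in for a user-defined data consistency check.

```cpp
#include <cassert>

enum class Outcome {
  kInconsistentUnfixable,  // file system is inconsistent and unfixable
  kRecoverable,            // e.g. leaked inodes that the checker repaired
  kConsistentGarbageData,  // file system is consistent, data is wrong
  kConsistentGoodData,     // file system is consistent and data is good
  kCheckerFailed           // the checker itself could not run properly
};

// Exit-status bits documented in fsck(8).
constexpr int kFsckCorrected = 1;    // file-system errors corrected
constexpr int kFsckUncorrected = 4;  // file-system errors left uncorrected
constexpr int kFsckOperational = 8;  // this bit and higher: checker failure

// Hypothetical classification of one crash state.
Outcome classify(int fsck_status, bool data_check_passed) {
  if (fsck_status >= kFsckOperational) return Outcome::kCheckerFailed;
  if (fsck_status & kFsckUncorrected) return Outcome::kInconsistentUnfixable;
  if (fsck_status & kFsckCorrected) return Outcome::kRecoverable;
  return data_check_passed ? Outcome::kConsistentGoodData
                           : Outcome::kConsistentGarbageData;
}
```

In this sketch the data check only matters when the checker reports a clean file system; a real harness might also run the data check after a successful repair.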


Customizing CrashMonkey

• Customize algorithm to construct crash states

• Customize workload:

• Setup

• Data writes

• Data consistency tests

class BaseTestCase {
 public:
  virtual int setup();
  virtual int run();
  virtual int check_test();
};

class Permuter {
 public:
  virtual void init_data(vector);
  virtual bool gen_one_state(vector);
};
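A user-defined workload built on the `BaseTestCase` interface might look like the sketch below. Everything beyond the three method names is an assumption: the interface is redeclared here with pure virtual methods so the example compiles on its own, `EchoFooTest` and `kData` are hypothetical, and returning 0 for success is a convention chosen for the example. The three methods correspond to the slide's setup, data writes, and data consistency tests.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>
#include <utility>

// Stand-in for the interface on the slide (pure virtual here so the
// sketch is self-contained).
class BaseTestCase {
 public:
  virtual ~BaseTestCase() = default;
  virtual int setup() = 0;
  virtual int run() = 0;
  virtual int check_test() = 0;
};

// Hypothetical payload written and later verified by the test.
const std::string kData = "foo bar baz";

// Hypothetical test: write a short string, then verify it reads back.
class EchoFooTest : public BaseTestCase {
 public:
  explicit EchoFooTest(std::string path) : path_(std::move(path)) {}

  // Setup: start from a clean slate (a missing file is not an error).
  int setup() override {
    std::remove(path_.c_str());
    return 0;
  }

  // Data writes: the operations whose crash behavior is under test.
  int run() override {
    std::ofstream out(path_);
    out << kData;
    out.flush();
    return out.good() ? 0 : -1;
  }

  // Data consistency test: returns 0 only if the expected content is present.
  int check_test() override {
    std::ifstream in(path_);
    std::string got;
    std::getline(in, got);
    return got == kData ? 0 : -1;
  }

 private:
  std::string path_;
};
```

In a crash-consistency harness, `check_test()` would presumably be run against each generated crash state rather than against the live file, so a strict equality check like this one is stricter than a real test would want.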


Results So Far

• Testing 100,000 unique disk states takes ~10 minutes

• Test creates ten 1 KB files in a 10 MB ext4 file system

• Majority of time spent running fsck

• Profiling the workload takes ~1 minute

• Happens only once per user-defined test

• Want operations to write to disk naturally

• sync() adds extra operations to those recorded

• Must wait for writeback delay

• Decrease delay through /proc file


The Path Ahead

• Identify interesting crash states

• Focus on states which have reordered metadata

• Huge search space from which to select crash states

• Avoid testing equivalent crash states

• Avoid generating write sequences that are equivalent

• Generate write sequences then check for equivalence

• Parallelize tests

• Each crash state is independent of the others

• Optimize test harness to run faster

• Check only parts of file system for consistency
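The "generate write sequences then check for equivalence" idea can be illustrated with one possible definition of equivalence. The definition is an assumption, not something the slides specify: here two write sequences count as equivalent if replaying them leaves every block with the same final contents. `BlockWrite`, `DiskImage`, `replay`, and `equivalent` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

struct BlockWrite {
  std::uint64_t block;  // hypothetical: target block number
  std::string data;     // payload written to that block
};

using DiskImage = std::map<std::uint64_t, std::string>;

// Replays writes in order; the last write to a block wins.
DiskImage replay(const std::vector<BlockWrite>& writes) {
  DiskImage image;
  for (const BlockWrite& w : writes) image[w.block] = w.data;
  return image;
}

// Assumed notion of equivalence: identical final contents for every block.
bool equivalent(const std::vector<BlockWrite>& a,
                const std::vector<BlockWrite>& b) {
  return replay(a) == replay(b);
}
```

Under this definition, reordering writes to different blocks yields an equivalent sequence, while reordering two writes to the same block does not. A harness could skip the consistency check for any sequence whose replayed image has already been tested.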


Conclusion

• Crash consistency is very important

• Crash consistency is hard and complex to implement

• Crash consistency is currently not well tested despite its importance

• CrashMonkey seeks to alleviate these problems

• Efficient, systematic, file-system agnostic

• Work in progress

• Code available at https://github.com/utsaslab/crashmonkey


Thank You!

Questions?


Related Work

• ALICE and BOB [Pillai et al. OSDI’14]

• Very narrow scope – explore how file systems crash

• No attempt to explore or test crash consistency

• Database Replay Framework [Zheng et al. OSDI’14]

• Specifically targets databases

• Works only on SCSI drives

• Not open source

• Does not allow user defined tests


Custom RAM Block Device

[Figure, built up over six slides: a user process writes through the kernel to a RAM block device that supports a writable snapshot. The captions of the successive slides are listed below.]

• File system write: the RAM block device holds "Metadata: inode" and "File data"

• Snapshot: a writable snapshot holds the same "Metadata: inode" and "File data"

• Read original data

• Overwrite file data: the writable snapshot holds "New file data" while the RAM block device still holds "File data"

• Write new data: "Metadata: inode 2" and "File 2 data" are written

• Restore: the writable snapshot again holds the original "Metadata: inode" and "File data"
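The snapshot behavior shown on these slides (read original data, overwrite, restore) resembles a copy-on-write overlay, which can be sketched in user space as below. This is an analogy under stated assumptions, not the kernel module's implementation: `SnapshotDevice` is a hypothetical class, blocks are modeled as strings, and the device has a fixed number of blocks.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory device with one copy-on-write snapshot.
class SnapshotDevice {
 public:
  explicit SnapshotDevice(std::vector<std::string> base)
      : base_(std::move(base)) {}

  // Reads prefer the snapshot's copy and fall back to the original data.
  // Throws std::out_of_range if `block` is not a valid block index.
  const std::string& read(std::size_t block) const {
    auto it = overlay_.find(block);
    return it != overlay_.end() ? it->second : base_.at(block);
  }

  // Writes land only in the snapshot; the original data stays untouched.
  void write(std::size_t block, std::string data) {
    (void)base_.at(block);  // bounds check only
    overlay_[block] = std::move(data);
  }

  // Restoring discards every write made since the snapshot was taken.
  void restore() { overlay_.clear(); }

 private:
  std::vector<std::string> base_;               // original block contents
  std::map<std::size_t, std::string> overlay_;  // blocks modified in snapshot
};
```

Because `restore()` only clears the overlay, resetting the device between crash states costs time proportional to the number of modified blocks rather than the device size, which is one plausible reason a writable snapshot is described as fast.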