Efficient Systematic Testing for Dynamically Updatable Software

Post on 23-Feb-2016

Christopher M. Hayden, Eric A. Hardisty, Michael Hicks, Jeffrey S. Foster

University of Maryland, College Park

Dynamic Software Updating (DSU)

Performing updates to software at runtime has clear benefits:
- Increased software availability
- No need to terminate active connections or computation

… but can we trust updated software? It is critical to ensure that updates are safe.

Our Contributions

Verification of DSU through testing:
- Testing procedure
- Test minimization algorithm

Experiments:
- Effectiveness of minimization
- Empirical study of update safety / safety checks

DSU Safety

DSU creates the opportunity for new sources of bugs:
- Faulty state transformation
- Unsafe update timing

Safety checks restrict when updates may be applied:
- Activeness Safety
- Con-freeness Safety

Activeness Safety (AS)

- AS prevents updates to active code.
- In this example, no patch updating main or foo is allowed:

    main() {
      foo();
      …
      baz();
    }

    foo() {
      …
      bar();
    }
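The activeness check can be sketched as a scan of the call stack. This is a minimal illustration, not the check used by any particular DSU system: the function name `as_allows_update` and the representation of the stack and the patch as arrays of function names are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if `name` appears in `names[0..n)`. */
static bool contains(const char *const *names, size_t n, const char *name) {
    for (size_t i = 0; i < n; i++) {
        if (strcmp(names[i], name) == 0) return true;
    }
    return false;
}

/* Hypothetical activeness check: the patch may be applied only if none of
 * the functions it updates currently has a frame on the call stack. */
bool as_allows_update(const char *const *stack, size_t depth,
                      const char *const *patched, size_t n_patched) {
    assert(stack != NULL || depth == 0);
    for (size_t i = 0; i < depth; i++) {
        if (contains(patched, n_patched, stack[i])) return false;
    }
    return true;
}
```

With the stack `{main, foo}` from the example, a patch that updates `foo` is rejected, while a patch that updates only `bar` and `baz` is accepted.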

Con-freeness Safety (CFS)

- CFS (Stoyle et al. '05) allows updates to active code only when type safety can be ensured.
- In this example, no patch updating the signature of baz or bar is allowed:

    main() {
      foo();
      …
      baz();
    }

    foo() {
      …
      bar();
    }
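A much simplified model of the CFS restriction in this example: an update is rejected if old code in an active frame will still call a function whose signature the patch changes. The real analysis is static and type-based; the `struct frame` layout, the notion of "pending calls", and the name `cfs_allows_update` are assumptions made only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One active frame: its function and the calls its remaining old code
 * will still make (hypothetical representation). */
struct frame {
    const char *fn;
    const char *const *pending_calls;
    size_t n_pending;
};

/* Returns true if old code in some active frame will still call `name`. */
static bool old_code_calls(const struct frame *stack, size_t depth,
                           const char *name) {
    for (size_t i = 0; i < depth; i++) {
        for (size_t j = 0; j < stack[i].n_pending; j++) {
            if (strcmp(stack[i].pending_calls[j], name) == 0) return true;
        }
    }
    return false;
}

/* Rejects the update if any function whose signature changes is still
 * going to be called from old, active code. */
bool cfs_allows_update(const struct frame *stack, size_t depth,
                       const char *const *sig_changed, size_t n_changed) {
    assert(stack != NULL || depth == 0);
    for (size_t k = 0; k < n_changed; k++) {
        if (old_code_calls(stack, depth, sig_changed[k])) return false;
    }
    return true;
}
```

In the example, `main` still has to call `baz` and `foo` still has to call `bar`, so a signature change to either is rejected. Unlike AS, this model does not object to a patch merely because `main` or `foo` is active.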

DSU Testing

Safety checks offer limited guarantees:
- CFS and AS ensure type-safe execution.
- AS ensures that you never return to old code following an update.
- Neither of these properties ensures safe update timing.

We propose testing to verify the correctness of allowed update points:
- Use the existing suite of application system tests.
- Ensure that updating anywhere during the execution of those tests results in an execution that passes the test.

Testing Procedure

Approach:
1. Instrument the application to trace update points.
2. Execute the system test and gather an initial trace.
3. For each update point in the initial trace, perform an update test: force an update at that point while executing the system test.

[Figure: an execution trace running from the trace start, with potential update points marked along it.]

[Figure, animation builds of the same slide: the initial trace passes the system test (✔); each update test, which forces an update at one traced point, either passes (✔) or fails (✘).]
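The testing procedure can be sketched as a small driver. Everything here is illustrative: the `system_test_fn` callback, the `NO_UPDATE` sentinel and the toy test are hypothetical stand-ins for an instrumented application and its system test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NO_UPDATE ((size_t)-1)

/* Hypothetical system test: runs the program, forcing an update at update
 * point `force_at` (or never, for NO_UPDATE), stores the number of update
 * points reached in *n_points, and returns whether the test passed. */
typedef bool (*system_test_fn)(size_t force_at, size_t *n_points);

/* Runs one update test per update point of the initial trace.
 * Returns the number of failing update tests, or -1 if the initial
 * (update-free) run itself fails. */
int run_update_tests(system_test_fn test, bool *failed, size_t cap) {
    assert(test != NULL);
    size_t n_points = 0;
    if (!test(NO_UPDATE, &n_points)) return -1;  /* gather initial trace */
    int failures = 0;
    for (size_t i = 0; i < n_points; i++) {
        size_t unused = 0;
        bool pass = test(i, &unused);            /* force update at point i */
        if (failed != NULL && i < cap) failed[i] = !pass;
        if (!pass) failures++;
    }
    return failures;
}

/* Toy stand-in for a real system test: the run reaches three update
 * points, and an update forced at point 1 makes the test fail. */
bool toy_test(size_t force_at, size_t *n_points) {
    *n_points = 3;
    return force_at != 1;
}
```

Running `run_update_tests(toy_test, failed, 3)` reports one failure, at point 1, which mirrors the pass/fail/pass outcome in the figure.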

Update Test Minimization

- Program traces may have thousands or millions of update points.
- Many update tests have the same behavior for a given patch, so we can eliminate redundant tests.

Version 0:

    void main() {
      foo();
      bar();
      baz();
    }

Patch A: baz() {…}
  All update points yield the same behavior.

Patch B: foo() {…} bar() {…} baz() {…}
  All update points yield distinct behavior.

Minimization Algorithm

- Execution events are traced if they have the potential to conflict with a patch.
- An event conflicts with a patch p if applying p before the event might produce a different result than applying p after the event.
- Examples: function calls, global variable accesses.

Steps:
1. Trace the execution of a test T on P0.
2. Iterate through the trace, noting the last update point each time we reach a conflicting trace element.
3. Run only the identified update tests Tnp.
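The steps above can be sketched as a single pass over the trace. The event encoding, the `conflict_fn` callback and the two toy patches (modeled on a version whose main calls foo, bar and baz in turn) are assumptions for illustration; a real conflict test would be derived from the patch contents.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum event_kind { EV_UPDATE_POINT, EV_CALL, EV_GLOBAL_ACCESS };
enum { FN_FOO, FN_BAR, FN_BAZ };

struct event {
    enum event_kind kind;
    int id;  /* function or global id; unused for update points */
};

/* Hypothetical conflict test: might this event behave differently
 * depending on whether the patch was applied before or after it? */
typedef bool (*conflict_fn)(const struct event *ev);

/* Stores in selected[] the trace indices of the update points worth
 * testing and returns how many there are. */
size_t minimize(const struct event *trace, size_t n, conflict_fn conflicts,
                size_t *selected, size_t cap) {
    assert(trace != NULL || n == 0);
    size_t count = 0;
    size_t last_point = 0;
    bool have_point = false;  /* an update point seen but not yet selected */
    for (size_t i = 0; i < n; i++) {
        if (trace[i].kind == EV_UPDATE_POINT) {
            last_point = i;
            have_point = true;
        } else if (have_point && conflicts(&trace[i])) {
            if (count < cap) selected[count] = last_point;
            count++;
            have_point = false;  /* avoid selecting the same point twice */
        }
    }
    return count;
}

/* Toy patch that changes only baz. */
bool patch_a_conflicts(const struct event *ev) {
    return ev->kind == EV_CALL && ev->id == FN_BAZ;
}

/* Toy patch that changes foo, bar and baz. */
bool patch_b_conflicts(const struct event *ev) {
    return ev->kind == EV_CALL &&
           (ev->id == FN_FOO || ev->id == FN_BAR || ev->id == FN_BAZ);
}
```

For a trace with an update point before each of the three calls, the patch to baz alone leaves one update test (the point just before the call to baz), while the patch to all three functions keeps all three.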

Experimental Results

Experimental Setup

- Built the testing infrastructure on top of the Ginseng DSU system (Neamtiu et al.):
  - Modified to support tracing and updating at pre-selected update points
  - Inserted explicit update points before each function call to approximate more liberal systems
- Disabled safety checking (CFS) for the experiments
- Tested 3 years of patches to OpenSSH and vsftpd (only OpenSSH is reported in this talk)

Program Modifications

    foo() {
      while (1) {        // main loop
        update();
        extract {
          ...            // main loop body
        }
      }
      extract {
        ...              // after main loop
      }
    }

Steps:
1. Identify long-running loops.
2. Add a manually selected update point.
3. Perform loop body extraction.
4. Perform continuation extraction.
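As a rough picture of what the program looks like after these modifications, the sketch below turns the loop body and the code after the loop into ordinary functions. The names `update_point`, `loop_body`, `after_loop` and the `loop_state` struct are hypothetical, and the update point is a stub that only counts how often it is reached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* State shared between the loop, its extracted body and the continuation. */
struct loop_state {
    int iterations_left;
    int work_done;
    int update_points_hit;
};

/* Stand-in for the DSU runtime's update point: here it only counts. */
static void update_point(struct loop_state *s) { s->update_points_hit++; }

/* Extracted loop body; returns false when the loop should exit. */
static bool loop_body(struct loop_state *s) {
    if (s->iterations_left == 0) return false;
    s->iterations_left--;
    s->work_done++;
    return true;
}

/* Extracted continuation: the code that used to follow the loop. */
static void after_loop(struct loop_state *s) { s->work_done += 100; }

void foo(struct loop_state *s) {
    assert(s != NULL);
    while (1) {              /* long-running main loop */
        update_point(s);     /* manually selected update point */
        if (!loop_body(s)) break;
    }
    after_loop(s);
}
```

Because the body and the continuation are now separate functions, a patch applied at the update point can plausibly take effect on the next call to them even though foo itself never returns.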


Experiments: Update Test Suite

How many update tests must be run to test real-world updates to real-world applications?

How effective is minimization at eliminating redundant tests?

Update Test Suite Size: OpenSSH

       Δ to next version    Reduction
#      Sig   Fun   Type     All Points                    Activeness-Safe Points
0        3    98      5     580,871 → 31,791 (95%)        35,314 → 3,027 (91%)
1        0     6      0     705,322 → 1,795 (~100%)       587,578 → 1,717 (~100%)
2        5   238     11     638,720 → 63,011 (90%)        20,902 → 2,353 (89%)
3        0    18      0     772,198 → 4,324 (99%)         638,803 → 3,775 (99%)
4       13   172     10     773,086 → 27,399 (96%)        21,343 → 1,564 (93%)
5        0    24      1     878,235 → 17,398 (98%)        111,950 → 1,723 (98%)
6        6   257     10     879,668 → 47,092 (95%)        44,278 → 2,139 (95%)
7        4   179     12     918,717 → 89,601 (90%)        100,854 → 4,141 (96%)
8        0    72      3     973,364 → 34,293 (96%)        61,724 → 2,070 (97%)
9       10   157      7     933,514 → 52,356 (94%)        61,051 → 2,891 (95%)
Total                       8,053,695 → 369,060 (95%)     1,683,797 → 25,400 (98%)
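The reduction percentages in the table follow from the before and after counts. A small helper (the name `reduction_percent` is an assumption for this example) makes the arithmetic explicit:

```c
#include <assert.h>

/* Percentage of update tests eliminated by minimization, given the
 * number of tests before and after. */
double reduction_percent(long before, long after) {
    assert(before > 0 && after >= 0 && after <= before);
    return 100.0 * (double)(before - after) / (double)before;
}
```

Applied to the Total row, `reduction_percent(8053695, 369060)` is roughly 95.4 and `reduction_percent(1683797, 25400)` is roughly 98.5, consistent with the 95% and 98% reported.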


Empirical Study of Update Safety

How many failures occur when applying updates arbitrarily?

How many failures occur when applying updates subject only to the AS and CFS safety checks?

Safety: OpenSSH

         Δ to next version    All Points               CFS Points             AS Points
Update   Sig   Fun   Type     Failed      Total        Failed   Total         Failed   Total
0          3    98      5        19,715     580,871         0      68,044         0      35,314
1          0     6      0             0     705,322         0     705,322         0     587,578
2          5   238     11       306,965     638,720     1,688      75,307         4      20,902
3          0    18      0             0     772,198         0     772,198         0     638,803
4         13   172     10       565,681     773,086       609     110,633       380      21,343
5          0    24      1        10,703     878,235         0     130,000         0     111,950
6          6   257     10       163,333     879,668    44,461      96,183       110      44,278
7          4   179     12        11,380     918,717         1      80,070         1     100,854
8          0    72      3             3     973,364         0     261,885         0      61,724
9         10   157      7       357,919     933,514        24     121,337         0      61,051
Total                         1,435,699   8,053,695    46,783   2,420,979       495   1,683,797
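To compare the three columns, the Total row can be turned into failure rates. The struct and function names below are assumptions for this example; the numbers in the usage note come from the Total row.

```c
#include <assert.h>

/* Failed and total update tests for one column of the safety table. */
struct safety_totals {
    long failed;
    long total;
};

/* Percentage of update tests that failed. */
double failure_percent(struct safety_totals t) {
    assert(t.total > 0 && t.failed >= 0 && t.failed <= t.total);
    return 100.0 * (double)t.failed / (double)t.total;
}
```

With the totals above this gives roughly 17.8% failures when updating at all points, about 1.9% at CFS points, and about 0.03% at AS points: both checks cut the failure rate sharply, but neither reaches zero.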

Unsafe Timing: Version Inconsistency (vsftpd)

Version 0:

    void handle_upload_common() {
      ret = do_file_recv();
    }

    int do_file_recv() {
      …                    // receive file
      if (ret == SUCCESS)
        write(226, "OK.");
      return ret;
    }

Version 1 (patch):

    void handle_upload_common() {
      ret = do_file_recv();
      if (ret == SUCCESS)
        write(226, "OK.");
    }

    int do_file_recv() {
      …                    // receive file
      return ret;
    }
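One way this inconsistency can show up is simulated below. The sketch assumes that the update lands inside the old handle_upload_common just before it calls do_file_recv, and that calls are dispatched through a function pointer so they reach the newly installed version; both are modeling assumptions, as are all helper names. The reply is counted rather than written to a socket.

```c
#include <assert.h>
#include <stdbool.h>

static int replies_sent;        /* number of "226 OK." replies produced */
static bool update_requested;   /* whether to patch at the update point */

/* Version 0 sends the reply itself; version 1 leaves it to its caller. */
static int do_file_recv_v0(void) { replies_sent++; return 0; }
static int do_file_recv_v1(void) { return 0; }

/* Calls go through this pointer, standing in for the DSU runtime. */
static int (*do_file_recv)(void) = do_file_recv_v0;

/* Version 0 caller with a hypothetical update point before the call.
 * Being old code, it never sends the reply on its own. */
static void handle_upload_common_v0(void) {
    if (update_requested) do_file_recv = do_file_recv_v1;  /* patch here */
    (void)do_file_recv();
}

/* Simulates one upload and returns how many replies the client receives. */
int run_upload(bool update_mid_call) {
    replies_sent = 0;
    update_requested = update_mid_call;
    do_file_recv = do_file_recv_v0;
    handle_upload_common_v0();
    assert(replies_sent <= 1);  /* each path sends at most one reply */
    return replies_sent;
}
```

Without the update the client receives exactly one reply. With the update applied at that point, old caller code runs with the new callee and no reply is sent at all, even though each version is correct on its own.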

Unsafe Timing: Version Inconsistency

Version 0:

    void foo() { bar(); … baz(); }

    void bar() { … }

    void baz() { dig(); … }

Version 1 (patch):

    void foo() { bar(); … baz(); }

    void bar() { dig(); … }

    void baz() { … }

Manually Selected Update Points

                Δ to next version                            Safety
#    # Tests    Sig   Fun   Type    Reduction                Failed   Total
0       75        3    98      5    566 → 566 (0%)                0     566
1       75        0     6      0    630 → 592 (6%)                0     630
2       76        5   238     11    568 → 568 (0%)                0     568
3       91        0    18      0    783 → 770 (2%)                0     783
4       91       13   172     10    782 → 782 (0%)                0     782
5      104        0    24      1    860 → 841 (2%)                0     860
6      104        6   257     10    859 → 859 (0%)                0     859
7      104        4   179     12    850 → 850 (0%)                0     850
8      105        0    72      3    868 → 823 (5%)                0     868
9      104       10   157      7    833 → 833 (0%)                0     833
Total                               7,599 → 7,484 (2%)            0   7,599

Summary

- We have argued that verification is necessary to prevent unsafe updates.
  - Provided empirical evidence that AS/CFS cannot prevent all unsafe updates.
- We have presented an approach for testing dynamic updates.
- We have presented and evaluated a minimization strategy to make update testing more practical.

Discussion Questions

- Given that AS cannot ensure correctness (both in theory and in practice), should DSU implementations continue to rely on it?
- What standards for verification should be required of DSU system benchmarks?
- Are there other assumptions of DSU that are appropriate for empirical evaluation?
