Efficient Systematic Testing for Dynamically Updatable Software

Christopher M. Hayden, Eric A. Hardisty, Michael Hicks, Jeffrey S. Foster

University of Maryland, College Park

Dynamic Software Updating (DSU)

Performing updates to software at runtime has clear benefits:
    Increased software availability
    No need to terminate active connections or computation

… but can we trust updated software?
    Critical to ensure updates are safe

Our Contributions

Verification of DSU through testing:
    Testing procedure
    Test minimization algorithm

Experiments:
    Effectiveness of minimization
    Empirical study of update safety and safety checks

DSU Safety

DSU creates the opportunity for new sources of bugs:
    Faulty state transformation
    Unsafe update timing

Safety checks restrict when updates may be applied:
    Activeness Safety
    Con-freeness Safety

Activeness Safety (AS)

AS prevents updates to active code
In this example, no patch updating main or foo is allowed:

    main() { foo(); … baz(); }
    foo() { … bar(); }

Con-freeness Safety (CFS)

CFS (Stoyle et al. '05) allows updates to active code only when type safety can be ensured
In this example, no patch updating the signature of baz or bar is allowed:

    main() { foo(); … baz(); }
    foo() { … bar(); }
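As a rough illustration of how the two checks differ, the following C sketch treats each check as a set-disjointness test. It is a simplification under stated assumptions, not the implementation of any DSU system: the names `as_allows` and `cfs_allows`, the representation of the stack and of the pending calls as arrays of function names, and the reduction of con-freeness to "no pending call targets a function whose signature changes" are all hypothetical choices made for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if name appears in names[0..n). */
static bool contains(const char *const *names, size_t n, const char *name) {
    for (size_t i = 0; i < n; i++) {
        if (strcmp(names[i], name) == 0) return true;
    }
    return false;
}

/* True if the two sets of names share no element. */
static bool disjoint(const char *const *a, size_t na,
                     const char *const *b, size_t nb) {
    for (size_t i = 0; i < na; i++) {
        if (contains(b, nb, a[i])) return false;
    }
    return true;
}

/* Activeness safety (sketch): allow the update only if the patch
 * changes no function that is currently on the call stack. */
bool as_allows(const char *const *stack, size_t n_stack,
               const char *const *changed, size_t n_changed) {
    return disjoint(stack, n_stack, changed, n_changed);
}

/* Con-freeness safety (simplified, assumed model): allow the update only
 * if no function that active old code may still call after the update
 * point has its signature changed by the patch. */
bool cfs_allows(const char *const *pending_calls, size_t n_pending,
                const char *const *sig_changed, size_t n_sig_changed) {
    return disjoint(pending_calls, n_pending, sig_changed, n_sig_changed);
}
```

With the stack {main, foo} and pending calls {bar, baz} from the examples above, a patch that changes only the body of foo is rejected by `as_allows` but accepted by `cfs_allows`, while a patch that changes the signature of baz is rejected by `cfs_allows`.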

DSU Testing

Safety checks offer limited guarantees:
    CFS and AS ensure type-safe execution
    AS ensures that you never return to old code following an update
    Neither of these properties ensures safe update timing

We propose testing to verify the correctness of allowed update points:
    Use the existing suite of application system tests
    Ensure that updating anywhere during the execution of those tests results in an execution that passes the test

Testing Procedure

Approach:
    Instrument the application to trace update points
    Execute the system test and gather an initial trace
    For each update point in the initial trace, perform an update test: force an update at that point while executing the system test

[Figure: the initial trace, from trace start, with its potential update points marked. The initial run passes (✔); each update test forces an update at one of those points and either passes (✔) or fails (✘).]
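The procedure above can be sketched as a small C driver. This is only an illustration under assumptions: the callback type `system_test_fn`, the convention that `-1` means "no forced update", and the toy `demo_system_test` (three update points, with a failure when the update is forced at the second one) are hypothetical stand-ins for an instrumented application and its real system tests.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical interface to one run of an instrumented system test.
 * force_at:    index of the update point at which to force the update,
 *              or -1 to run without updating
 * points_seen: out-parameter, number of update points reached in the run
 * returns:     true if the system test passed */
typedef bool (*system_test_fn)(long force_at, size_t *points_seen);

typedef struct {
    size_t total;   /* update tests executed */
    size_t failed;  /* update tests that failed */
} update_test_report;

/* Gather the initial trace, then run one update test per update point.
 * Returns false if the initial run (without any update) already fails. */
bool run_update_tests(system_test_fn test, update_test_report *out) {
    size_t n_points = 0;
    out->total = 0;
    out->failed = 0;
    if (!test(-1, &n_points)) return false;
    for (size_t i = 0; i < n_points; i++) {
        size_t ignored = 0;
        out->total++;
        if (!test((long)i, &ignored)) out->failed++;
    }
    return true;
}

/* Toy stand-in for a real system test: the trace has three update
 * points, and updating at point 1 makes the test fail. */
bool demo_system_test(long force_at, size_t *points_seen) {
    *points_seen = 3;
    return force_at != 1;
}
```

Calling `run_update_tests(demo_system_test, &report)` on the toy test runs three update tests and records one failure.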

Update Test Minimization

Program traces may have thousands or millions of update points
Many update tests have the same behavior for a given patch, so we can eliminate redundant tests

Version 0:

    void main() {
      foo();
      bar();
      baz();
    }

Patch A: baz() {…}
    All update points yield the same behavior

Patch B: foo() {…} bar() {…} baz() {…}
    All update points yield distinct behavior

Minimization Algorithm

Execution events are traced if they have the potential to conflict with a patch:
    An event conflicts with a patch p if applying p before the event might produce a different result than applying p after the event
    Examples: function calls, global variable accesses

Trace the execution of a test T on P0

Iterate through the trace, noting the last update point each time we reach a conflicting trace element

Run only the identified update tests Tnp
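The selection step can be sketched in C as a single pass over the trace. The trace representation (`trace_elem` with a precomputed `conflicts` flag for the patch under test) and the function name `minimize_update_points` are assumptions made for this example; computing the conflict flag itself is outside the sketch.

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum { UPDATE_POINT, EVENT } elem_kind;

typedef struct {
    elem_kind kind;
    bool conflicts;  /* for EVENT: may this event conflict with the patch? */
} trace_elem;

/* Scan the trace and record, for each conflicting event, the last update
 * point seen before it (each point at most once). Update points are
 * numbered 0, 1, 2, ... in trace order. Writes at most cap indices into
 * selected[] and returns how many were written. */
size_t minimize_update_points(const trace_elem *trace, size_t len,
                              size_t *selected, size_t cap) {
    size_t count = 0;
    size_t next_point = 0;       /* update points seen so far */
    size_t last_point = 0;       /* most recent update point */
    size_t last_selected = 0;    /* most recently selected update point */
    bool have_point = false;
    bool have_selected = false;

    for (size_t i = 0; i < len; i++) {
        if (trace[i].kind == UPDATE_POINT) {
            last_point = next_point++;
            have_point = true;
        } else if (trace[i].conflicts && have_point &&
                   !(have_selected && last_selected == last_point)) {
            if (count < cap) selected[count++] = last_point;
            last_selected = last_point;
            have_selected = true;
        }
    }
    return count;
}
```

For a trace with an update point before each of the calls to foo, bar, and baz, a patch that conflicts only with the call to baz selects just the last update point, while a patch that conflicts with all three calls selects all three points.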

Experimental Results

Experimental Setup

Built the testing infrastructure on top of the Ginseng DSU system (Neamtiu et al.):
    Modified to support tracing and updating at pre-selected update points
    Inserted explicit update points before each function call to approximate more liberal systems
    Disabled safety checking (CFS) for experiments

Tested 3 years of patches to OpenSSH and vsftpd (only OpenSSH is reported in this talk)

Program Modifications

    Identify long-running loops
    Add a manually selected update point
    Perform loop body extraction
    Perform continuation extraction

    foo() {
      while (1) {            // main loop
        update();
        extract {
          ...                // main loop body
        }
      }
      extract {
        ...                  // after main loop
      }
    }

Experiments: Update Test Suite

How many update tests must be run to test real-world updates to real-world applications?

How effective is minimization at eliminating redundant tests?

Update Test Suite Size: OpenSSH

Sig, Fun, Type: Δ to next version. All Points and Activeness-Safe Points: reduction in the number of update tests.

#      Sig  Fun  Type   All Points                    Activeness-Safe Points
0        3   98     5   580,871 → 31,791 (95%)        35,314 → 3,027 (91%)
1        0    6     0   705,322 → 1,795 (~100%)       587,578 → 1,717 (~100%)
2        5  238    11   638,720 → 63,011 (90%)        20,902 → 2,353 (89%)
3        0   18     0   772,198 → 4,324 (99%)         638,803 → 3,775 (99%)
4       13  172    10   773,086 → 27,399 (96%)        21,343 → 1,564 (93%)
5        0   24     1   878,235 → 17,398 (98%)        111,950 → 1,723 (98%)
6        6  257    10   879,668 → 47,092 (95%)        44,278 → 2,139 (95%)
7        4  179    12   918,717 → 89,601 (90%)        100,854 → 4,141 (96%)
8        0   72     3   973,364 → 34,293 (96%)        61,724 → 2,070 (97%)
9       10  157     7   933,514 → 52,356 (94%)        61,051 → 2,891 (95%)
Total                   8,053,695 → 369,060 (95%)     1,683,797 → 25,400 (98%)
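The reduction percentages in the table appear to be the fraction of update tests eliminated, rounded to the nearest percent. A minimal C sketch of that arithmetic, with the hypothetical helper name `reduction_percent`:

```c
/* Percentage of update tests eliminated when a suite shrinks from
 * `before` tests to `after` tests. Returns 0 for an empty suite. */
double reduction_percent(unsigned long before, unsigned long after) {
    if (before == 0) return 0.0;
    return 100.0 * (1.0 - (double)after / (double)before);
}
```

For update 0, `reduction_percent(580871, 31791)` is roughly 94.5, which matches the 95% shown; for the All Points total, `reduction_percent(8053695, 369060)` is roughly 95.4.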

Empirical Study of Update Safety

How many failures occur when applying updates arbitrarily?

How many failures occur when applying updates subject only to the AS and CFS safety checks?

Safety: OpenSSH

Sig, Fun, Type: Δ to next version.

                             All Points              CFS Points            AS Points
Update  Sig  Fun  Type   Failed     Total        Failed   Total        Failed  Total
0         3   98     5   19,715     580,871      0        68,044       0       35,314
1         0    6     0   0          705,322      0        705,322      0       587,578
2         5  238    11   306,965    638,720      1,688    75,307       4       20,902
3         0   18     0   0          772,198      0        772,198      0       638,803
4        13  172    10   565,681    773,086      609      110,633      380     21,343
5         0   24     1   10,703     878,235      0        130,000      0       111,950
6         6  257    10   163,333    879,668      44,461   96,183       110     44,278
7         4  179    12   11,380     918,717      1        80,070       1       100,854
8         0   72     3   3          973,364      0        261,885      0       61,724
9        10  157     7   357,919    933,514      24       121,337      0       61,051
Total                    1,435,699  8,053,695    46,783   2,420,979    495     1,683,797

Unsafe Timing: Version Inconsistency (vsftpd)

Version 0:

    void handle_upload_common() {
      ret = do_file_recv();
    }

    int do_file_recv() {
      … // receive file
      if (ret == SUCCESS) write(226, "OK.");
      return ret;
    }

Version 1 (patch):

    void handle_upload_common() {
      ret = do_file_recv();
      if (ret == SUCCESS) write(226, "OK.");
    }

    int do_file_recv() {
      … // receive file
      return ret;
    }

Unsafe Timing: Version Inconsistency

Version 0:

    void foo() { bar(); … baz(); }
    void bar() { … }
    void baz() { dig(); … }

Version 1 (patch):

    void foo() { bar(); … baz(); }
    void bar() { dig(); … }
    void baz() { … }
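The foo/bar/baz inconsistency can be simulated with a small C sketch that dispatches calls through a per-version table of function pointers. Everything here is a hypothetical model for illustration (the `version` struct, the `update_timing` enum, and `run_foo`); a real DSU system redirects calls by other means, and the elided bodies from the slide are left empty.

```c
static int dig_calls;                   /* how often dig() ran in one run */

static void dig(void)    { dig_calls++; }
static void bar_v0(void) { }            /* Version 0: bar does not call dig */
static void baz_v0(void) { dig(); }     /* Version 0: baz calls dig */
static void bar_v1(void) { dig(); }     /* Version 1: bar calls dig */
static void baz_v1(void) { }            /* Version 1: baz does not call dig */

typedef struct {
    void (*bar)(void);
    void (*baz)(void);
} version;

static const version V0 = { bar_v0, baz_v0 };
static const version V1 = { bar_v1, baz_v1 };

typedef enum {
    NO_UPDATE,
    UPDATE_BEFORE_BAR,
    UPDATE_BETWEEN_BAR_AND_BAZ
} update_timing;

/* Run foo() once, applying the patch at the given time, and return the
 * number of dig() calls. foo itself is identical in both versions. */
int run_foo(update_timing when) {
    const version *cur = &V0;
    dig_calls = 0;
    if (when == UPDATE_BEFORE_BAR) cur = &V1;
    cur->bar();
    if (when == UPDATE_BETWEEN_BAR_AND_BAZ) cur = &V1;
    cur->baz();
    return dig_calls;
}
```

In this model, running entirely in Version 0 or updating before bar() calls dig() exactly once, while updating between bar() and baz() runs the old bar and the new baz, so dig() is never called.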

Manually Selected Update Points

Sig, Fun, Type: Δ to next version.

                                                               Safety
Update  # Tests  Sig  Fun  Type   Reduction             Failed  Total
0       75         3   98     5   566 → 566 (0%)        0       566
1       75         0    6     0   630 → 592 (6%)        0       630
2       76         5  238    11   568 → 568 (0%)        0       568
3       91         0   18     0   783 → 770 (2%)        0       783
4       91        13  172    10   782 → 782 (0%)        0       782
5       104        0   24     1   860 → 841 (2%)        0       860
6       104        6  257    10   859 → 859 (0%)        0       859
7       104        4  179    12   850 → 850 (0%)        0       850
8       105        0   72     3   868 → 823 (5%)        0       868
9       104       10  157     7   833 → 833 (0%)        0       833
Total                             7,599 → 7,484 (2%)    0       7,599

Summary

We have argued that verification is necessary to prevent unsafe updates
    Provided empirical evidence that AS/CFS cannot prevent all unsafe updates

We have presented an approach for testing dynamic updates

We have presented and evaluated a minimization strategy to make update testing more practical

Discussion Questions

Given that AS cannot ensure correctness (both in theory and in practice), should DSU implementations continue to rely on it?

What standards for verification should be required of DSU system benchmarks?

Are there other assumptions of DSU that are appropriate for empirical evaluation?
