recrash: making crashes reproducible by preserving object states (ecoop 2008)

70
ReCrash Making crashes reproducible by preserving object states Shay Artzi, Sunghun Kim*, Michael D. Ernst MIT * now at HKUST

Upload: sung-kim

Post on 25-Jun-2015

748 views

Category:

Technology


4 download

DESCRIPTION

It's really fun to work with Shay and Mike. Thank you!

TRANSCRIPT

Page 1: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

ReCrash  Making  crashes  reproducible  by  preserving  object  states  

Shay  Artzi,  Sunghun  Kim*,  Michael  D.  Ernst  MIT   * now at HKUST

Page 2: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Eclipse  bug  30280:  2  days  to  reproduce,  4  minutes  to  fix  

2003-­‐01-­‐27    08:01   User:    Eclipse  crashed…    I  have  no  idea  why...    Here  is  the  stack  trace.  

2003-­‐01-­‐27    08:26   Developer:    What  build  are  you  using?    Do  you  have  a  testcase  to  reproduce?  

2003-­‐01-­‐27    08:39   Developer:    Which  JDK  are  you  using?  2003-­‐01-­‐28    13:06   User:    I’m  running  Eclipse  2.1,  …  

I  was  not  able  to  reproduce  the  crash.  2003-­‐01-­‐29    04:33   Developer:    Reproduced.  2003-­‐01-­‐29    04:37   Developer:    Fixed.  

Page 3: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Reproducing  crashes  

•  If  a  crash  can’t  be  reproduced:  – Hard  to  fix  – Hard  to  validate  a  soluYon  

•  Reproducing  a  crash  is  hard!  – Nondeterminism  – ConfiguraYon  and  system  informaYon  – Steps  to  reproduce  may  be  complex  or  long  –  In-­‐field  detecYon  – Users  rarely  provide  reproducible  bug  reports  

Page 4: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Approach  1:    Postmortem  analysis  

Examples:    stack  trace,  core  dump    Problems:  •  Fault  (bug)  may  be  far  from  failure  (excepYon)  

–  Faulty  method  may  not  be  in  stack  trace  •  Too  much  informaYon  

–  Core  dump:    big;  hard  to  interpret  •  Not  enough  informaYon  

–  Shows  effects  (final  values),  not  causes  – Need  iniYal  values  to  reproduce  the  failure  

Page 5: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Approach  2:    Record  &  replay  

•  Logging:    record  interacYons  with  environment  •  Replay:    use  log  to  reproduce  the  execuYon  •  Checkpoint:    replay  skips  part  of  the  execuYon  

Problems:  •  UnrealisYc  overhead  •  Invasive  changes  to  HW/OS/applicaYon  

Page 6: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Record  &  replay  for  OO  programs  

•  Object-­‐oriented  style  uses  only  nearby  state  – Unit  tesYng  depends  on  this  property  

•  ReCrash  reproduces  this  nearby  state  – Does  not  replay  an  execuYon  – StaYc  and  dynamic  analyses  reduce  the  size  

•  Lightweight:    efficient,  no  harness,  usable  in-­‐field  

•  Not  guaranteed  to  reproduce  every  failure  

Page 7: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

ReCrash  technique  

Goal:    Convert  a  crash  into  a  set  of  unit  tests  1.  Monitoring:    maintain  a  shadow  stack  

– Contains  a  copy  of  each  method  argument  – On  program  crash,  write  the  shadow  stack  to  a  file  

2.  Test  generaYon:    create  many  unit  tests  – For  each  stack  frame,  create  one  unit  test:  

•  Invoke  the  method  using  arguments  from  the  shadow  stack  

•  If  the  test  does  not  reproduce  the  crash,  discard  the  test  

Page 8: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Maintaining  the  shadow  stack  

•  On  method  entry:  – Push  a  new  shadow  stack  frame  – Copy  the  actual  arguments  to  the  shadow  stack  

•  On  non-­‐excepYonal  method  exit:  – Pop  the  shadow  stack  frame  

•  On  program  failure  (top-­‐level  excepYon):  – Write  the  shadow  stack  to  a  file  

•  Serializes  all  state  referenced  by  the  shadow  stack  

Page 9: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Create  one  JUnit  test  per  stack  frame      public void test_resolveType() { AllocExpr rec = (AllocExpr) shadowStack.getArg(0); BlockScope arg = (BlockScope) shadowStack.getArg(1); rec.resolveType(arg); }

 We  expect  the  method  to  fail  as  it  did  at  run  Yme  

Test case for Eclipse bug 30280

Read arguments from the saved shadow stack

Invoke the method from the stack frame

Page 10: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

EvaluaYng  unit  tests  

•  Run  each  generated  unit  test  •  Discard  the  test  if  it  does  not  reproduce  the  run-­‐Yme  excepYon  

Page 11: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

How  a  developer  uses  the  tests  

•  In  a  debugger,  step  through  execuYon  and  examine  fields  

•  Experiment  by  modifying  the  tests  •  Verify  a  fix  •  Create  a  regression  test  

– Replace  deserialized  objects  by  real  objects  or  mock  objects  

– More  readable  and  robust  

Page 12: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Why  create  mulYple  tests?  

•  Not  all  tests  may  reproduce  the  failure  – Due  to  state  not  captured  on  the  shadow  stack  

•  Sockets,  files,  nondeterminism,  distant  program  state  •  Does  capture  all  values  that  are  passed  as  arguments  

•  Some  tests  may  not  be  useful  for  debugging  

Page 13: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Not  every  test  is  useful  

Stack  trace:  NullPointerException at Class1.toString at Class2.myMethod ...

Tests:  void test_toString() { Class1 receiver = null; receiver.toString(); } void test_myMethod() { Class2 receiver = (Class2) shadowStack.getArg(0); receiver.myMethod(); }

Page 14: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Other  features  of  ReCrash  

•  Non-­‐crashing  failures  – Add  a  ReCrash  annotaYon  

•  Caught  excepYons  that  lead  to  later  failures  •  Adding  extra  informaYon  to  test  cases  

– Version  number,  configuraYon  informaYon  

•  Reducing  the  serialized  stack  – Size,  privacy  

Page 15: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Cost  of  monitoring  

Key  cost:    copying  arguments  to  shadow  stack  Tradeoff:    less  informaYon  in  shadow  stack  ⇒  

lower  chance  of  reproducing  failures  1.  Depth  of  copy  

– Deep,  reference,  or  a  hybrid  2.  Save  less  informaYon  about  each  argument  

– Focus  on  important  fields  3.  Monitor  fewer  methods  

–  Ignore  methods  not  likely  to  crash  or  to  be  useful  

Page 16: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Original  program  execuYon  

Real  stack  

Page 17: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Original  program  execuYon  

Real  stack  

R:   17

Page 18: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Original  program  execuYon  

Real  stack  

R:   17

Page 19: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Original  program  execuYon  

Real  stack  

R:   18

Page 20: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Original  program  execuYon  

Real  stack  

R:  

R:  

A1:  

18

Page 21: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Original  program  execuYon  

Real  stack  

R:  

R:  

A1:  

R:  A1:  A2:  

18

Page 22: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

1.    Depth  of  copying  

Page 23: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Deep  copy  

Real  stack   Shadow  stack  

Page 24: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Deep  copy  

Real  stack  

R:   17 R:   17

Shadow  stack  

Page 25: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Deep  copy  

Real  stack  

R:   18 R:   17

Shadow  stack  

Page 26: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Deep  copy  

Real  stack  

R:  

R:  

A1:  

18 R:  

R:  

A1:  

17

18

Shadow  stack  

Page 27: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Deep  copy  

R:  

R:  

A1:  

R:  A1:  A2:  

18 R:  

R:  

A1:  

R:  A1:  A2:  

17

18

18

Real  stack   Shadow  stack  

Page 28: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Deep  copy  

R:  

R:  

A1:  

R:  A1:  A2:  

18 R:  

R:  

A1:  

R:  A1:  A2:  

17

18

18

Real  stack  

Multiple copies ⇒ quadratic cost Unusable in practice

Shadow  stack  

Page 29: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Reference  copy  

Real  stack   Shadow  stack  

Page 30: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Reference  copy  

Real  stack  

R:   17 R:  

Shadow  stack  

Page 31: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Reference  copy  

Real  stack  

R:   18 R:  

Shadow  stack  

Page 32: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Reference  copy  

Real  stack  

R:  

R:  

A1:  

18 R:  

R:  

A1:  

Shadow  stack  

Page 33: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Reference  copy  

Real  stack  

R:  

R:  

A1:  

R:  A1:  A2:  

18 R:  

R:  

A1:  

R:  A1:  A2:  

Shadow  stack  

Page 34: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Depth-­‐1  copy  

Real  stack  

R:  

R:  

A1:  

R:  A1:  A2:  

18 R:  

R:  

A1:  

R:  A1:  A2:  

17

18

18

Shadow  stack  

Page 35: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

2.    Ignoring  some  fields  

Page 36: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Depth-­‐1  copy  

Real  stack  

R:  

R:  

A1:  

R:  A1:  A2:  

18 R:  

R:  

A1:  

R:  A1:  A2:  

17

18

18

Shadow  stack  

Page 37: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Used  fields  

Real  stack  

R:  

R:  

A1:  

R:  A1:  A2:  

18 R:  

R:  

A1:  

R:  A1:  A2:  

17

18

18

Shadow  stack  

Analysis results

Page 38: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Depth1  +  used  fields  (=  Depth2  –  unused  fields)  

Real  stack  

R:  

R:  

A1:  

R:  A1:  A2:  

18 R:  

R:  

A1:  

R:  A1:  A2:  

17

18

18

Shadow  stack  

Page 39: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Pure  methods  

Real  stack  

R:  

R:  

A1:  

R:  A1:  A2:  

18 R:  

R:  

A1:  

R:  A1:  A2:  

17

18

18

Shadow  stack  

Analysis results

Page 40: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Pure  methods  

Real  stack  

R:  

R:  

A1:  

R:  A1:  A2:  

18 R:  

R:  

A1:  

A2:  

17

18

A1:  R:  

Shadow  stack  

Page 41: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Immutable  objects  

Real  stack  

R:  

R:  

A1:  

R:  A1:  A2:  

18 R:  

R:  

A1:  

A2:  

17

18

A1:  R:  

Shadow  stack  

Analysis results

Page 42: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Immutable  objects  

Real  stack  

R:  

R:  

A1:  

R:  A1:  A2:  

18 R:  

R:  

A1:  

R:  A1:  A2:  

17

18

Shadow  stack  

Page 43: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

3.    Ignoring  some  methods  

Page 44: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Ignored  methods  

Real  stack  

R:  

R:  

A1:  

R:  A1:  A2:  

18 R:  

R:  A1:  A2:  

17

R:  

A1:  18

Shadow  stack  

Analysis results

Page 45: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Ignored  methods  

Real  stack  

R:  

R:  

A1:  

R:  A1:  A2:  

R:  

R:  A1:  A2:  

17 18

Shadow  stack  

Page 46: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Methods  that  are  unlikely  to  be  useful  

•  Trivial  methods  •  Private  methods  •  Library  methods  •  Methods  that  are  unlikely  to  crash  

Page 47: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Second  chance  mode  

Idea:    monitor  only  methods  that  are  likely  to  crash  •  IniYally,  monitor  no  methods  •  Aner  a  crash,  add  monitoring  for  methods  in  the  stack  trace  –  Can  update  all  clients,  not  just  the  one  that  crashed  

•  Tradeoffs:  + Very  low  overhead  (no  overhead  unYl  a  crash)  –  Requires  a  failure  to  occur  twice  

Page 48: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Experimental  study  

1.  Can  ReCrash  reproduce  failures?  2.  Are  the  ReCrash-­‐generated  tests  useful?  3.  How  large  are  the  test  cases?  4.  What  is  the  overhead  of  running  ReCrash?  

Page 49: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Subject  programs  

InvesYgated  11  real  crashes  from:  – BST:    .2  KLOC  – SVNKit:    22  KLOC  – Eclipse  compiler:    83  KLOC  –  Javac-­‐jsr308:    86  KLOC  

 

Page 50: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Q1:    Can  ReCrash  reproduce  failures?  Program   Failure   Candidate  

tests  Reproducible  tests  

reference  copy  

depth  1  +  used-­‐fields  

deep  copy  

BST   Class  cast   3   3   3   3  

Class  cast   3   3   3   3  

Unsupported   3   3   3   3  

SVNKit   Index  bounds   3   3   3   3  

Null  pointer   2   2   2   2  

Null  pointer   2   2   2   2  

Eclipsec   Null  pointer   13   0   1   8  

Javac-­‐jsr308  

Null  pointer   17   5   5   5  

Illegal  arg   23   11   11   11  

Null  pointer   8   1   1   1  

Index  bounds   28   11   11   11  

Page 51: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Q1:    Can  ReCrash  reproduce  failures?  Program   Failure   Candidate  

tests  Reproducible  tests  

reference  copy  

depth  1  +  used-­‐fields  

deep  copy  

BST   Class  cast   3   3   3   3  

Class  cast   3   3   3   3  

Unsupported   3   3   3   3  

SVNKit   Index  bounds   3   3   3   3  

Null  pointer   2   2   2   2  

Null  pointer   2   2   2   2  

Eclipsec   Null  pointer   13   0   1   8  

Javac-­‐jsr308  

Null  pointer   17   5   5   5  

Illegal  arg   23   11   11   11  

Null  pointer   8   1   1   1  

Index  bounds   28   11   11   11  

Page 52: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Q2:    Are  the  ReCrash  tests  useful?  

•  Developers  found  the  tests  useful  – Developer  1:    “You  don’t  have  to  wait  for  the  crash  to  occur  again”;  also  liked  mulYple  tests  

– Developer  2:    “Using  ReCrash,  I  was  able  to  jump  (almost  directly)  to  the  necessary  breakpoint”  

•   Developers  found  the  stack  trace  insufficient  – Unable  to  reproduce  – The  failure  may  be  far  removed  from  the  fault  

Page 53: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Q3:    How  large  are  the  test  cases?  

•  The  JUnit  test  suite  uses  the  shadow  stack  •  Serializes  all  reachable  parts  of  the  heap  

Program   Average  shadow  stack  

size  (KB)  BST   12  SVNKit   34  Eclipsec   62  Javac-­‐jsr308   422  

Page 54: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Q4:    Time  overhead  of  ReCrash  

0 0.5 1 1.5 2

Eclipsec

SVNKit

OriginalReferenceDepth  1  +  used-­‐fieldsSecond  chance

Overhead  of  instrumented  program  in  the  field  

Page 55: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Q4:    Memory  overhead  of  ReCrash  

Absolute  memory  overhead:    .2M  –  4.7  M  

Page 56: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

GeneraYng  unit  tests  from  system  runs  

•  Test  factoring  [Saff  2005,  Elbaum  2006]  – Developer  selects  a  porYon  of  the  program  – System  logs  interacYons  with  the  environment  – Unit  test  replays  execuYon  in  a  test  harness  

•  Contract-­‐driven  development  [Leitner  2007]  – Reference  copying,  intended  for  durable  tests  

•  Backward-­‐in-­‐Yme  debuggers  [Lienhard  2008]  – Heavier-­‐weight  logging  and  checkpoints  

Page 57: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Future  work  

•  Capture  more  state  – Concurrency,  Yming,  external  resources  

•  Other  implementaYon  tradeoffs  – Copy-­‐on-­‐write  – ExisYng  VM  hooks  – Logging/debugging  techniques  – These  are  probably  orthogonal  to  ReCrash  

Page 58: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

ReCrash  converts  failures  into  tests  

•  ReCrash  effecYvely  reproduces  failures  •  Replicates  program  states  •  Generates  mulYple  unit  tests  

•  The  unit  tests  are  useful  •  Low  overhead  

•  Records  only  relevant  parts  of  an  execuYon  •  4  program  analyses;  second  chance  mode  •  Can  deploy  instrumented  programs  in  the  field  

•  Download:    hup://pag.csail.mit.edu/ReCrash/      

Page 59: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

ReCrash  converts  failures  into  tests  

•  ReCrash  effecYvely  reproduces  failures  •  Replicates  program  states  •  Generates  mulYple  unit  tests  

•  The  unit  tests  are  useful  •  Low  overhead  

•  Records  only  relevant  parts  of  an  execuYon  •  4  program  analyses;  second  chance  mode  •  Can  deploy  instrumented  programs  in  the  field  

•  Download:    hup://pag.csail.mit.edu/ReCrash/      

Page 60: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)
Page 61: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)
Page 62: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Maintaining  the  shadow  stack  

Real  stack  

R:  

R:  

A1:  

18 R:  

R:  

A1:  

17

18

Shadow  stack  

Page 63: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Maintaining  the  shadow  stack  

R:  

R:  

A1:  

R:  A1:  A2:  

18 R:  

R:  

A1:  

17

18

Real  stack   Shadow  stack  

On method entry

Page 64: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Maintaining  the  shadow  stack  

R:  

R:  

A1:  

R:  A1:  A2:  

18 R:  

R:  

A1:  

R:  A1:  A2:  

17

18

Real  stack   Shadow  stack  

On method entry: 1.  Push a new shadow stack frame

Page 65: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Maintaining  the  shadow  stack  

R:  

R:  

A1:  

R:  A1:  A2:  

18 R:  

R:  

A1:  

R:  A1:  A2:  

17

18

18

Real  stack   Shadow  stack  

On method entry: 1.  Push a new shadow stack frame 2.  Copy the actual arguments to the shadow

stack

Page 66: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Maintaining  the  shadow  stack  

R:  

R:  

A1:  

18 R:  

R:  

A1:  

R:  A1:  A2:  

17

18

18

Real  stack   Shadow  stack  

On method exit

Page 67: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Maintaining  the  shadow  stack  

R:  

R:  

A1:  

18 R:  

R:  

A1:  

17

18

Real  stack   Shadow  stack  

On method exit: 1.  Pop shadow stack frame

Page 68: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Maintaining  the  shadow  stack  

R:  

R:  

A1:  

18 R:  

R:  

A1:  

17

18

Real  stack   Shadow  stack  

On program failure (top-level exception):

Page 69: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Shadow  stack  

Maintaining  the  shadow  stack  

R:  

R:  

A1:  

18 R:  

R:  

A1:  

17

18

Real  stack  

On program failure (top-level exception): 1.  Write the shadow stack to a file

Page 70: ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)

Shadow  stack  

Maintaining  the  shadow  stack  

R:  

R:  

A1:  

18 R:  

R:  

A1:  

17

18

Real  stack  

On program failure (top-level exception): 1.  Write the shadow stack to a file

Serializes all referenced state