Presentation of Failure-Oblivious Computing vs. Rx
OS Seminar, winter 2005, by Lauge Wullf and Jacob Munk-Stander
January 4th, 2006
Agenda
• Introduction
• Failure-Oblivious Computing
• Rx: Treating Bugs As Allergies
Introduction
• Problem
  – Reliability (deterministic and non-deterministic)
• Cause
  – Software defects account for up to 40% of system failures
  – Memory- and concurrency-related bugs cause more than 60% of system vulnerabilities
• Effect
  – Expensive
Introduction
• Solutions
  – Safe languages, e.g. ML, Java or C#
  – Rebooting/restarting
    • Whole program restart, micro-rebooting, etc.
  – Checkpointing and recovery
    • Checkpoint, roll back on failure, re-execute
  – Application-specific
    • Multi-process model, exception handling, etc.
  – Non-conventional approaches
    • E.g. failure-oblivious computing
Failure-Oblivious Computing
• An instance of acceptability-oriented computing:
  – A flawed system must ensure that it respects basic acceptability properties, e.g.:
    • System must never accelerate the vehicle beyond a specific velocity
    • System should continue to execute even if it has a memory error
• Makes the program oblivious to invalid memory accesses
  – Invalid reads return manufactured values
  – Invalid writes are discarded
• Thus, no termination of processes or exceptions
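The read and write rules above can be sketched in plain C. This is a minimal illustration only: the `checked_buf` type and the `fo_read`/`fo_write` helpers are hypothetical names, whereas a failure-oblivious compiler would insert equivalent checks around ordinary pointer dereferences. The manufactured value is simply 0 here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical buffer that carries its own bounds (an assumption for
   this sketch; the real system tracks bounds inside the compiler). */
typedef struct {
    unsigned char *data;
    size_t len;
} checked_buf;

/* Invalid writes are discarded. Returns 1 if the write was performed,
   0 if it was out of bounds and therefore dropped. */
int fo_write(checked_buf *b, size_t i, unsigned char v) {
    assert(b != NULL && b->data != NULL);
    if (i >= b->len) {
        return 0;            /* discard instead of corrupting memory */
    }
    b->data[i] = v;
    return 1;
}

/* Invalid reads return a manufactured value (0 in this sketch). */
unsigned char fo_read(const checked_buf *b, size_t i) {
    assert(b != NULL && b->data != NULL);
    if (i >= b->len) {
        return 0;            /* manufactured value, no crash */
    }
    return b->data[i];
}
```

Neither helper terminates the process or raises an error on an invalid access, which is the property the slide describes.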
Failure-Oblivious Computing, cont.
• Behavior
  – Standard Compilation
    • memory corruption, potential crash
  – Safe Compilation
    • process terminates without potentially contaminating global data
  – Failure-Oblivious Compilation
    • process continues execution along a speculative, unsafe execution path
Failure-Oblivious Computing, cont.
• Example, Pine 4.44
  – Index uses From field of messages
  – Quotes certain characters
  – Bug when quoting certain values
    • Maximum length is miscalculated, thus a too-small buffer is allocated for the quoted value
  – Standard and Safe: Pine crashes on start
  – FOC: Pine operates “normally”
Failure-Oblivious Computing, cont.
• Example, bug-server (fictional)
  – FOC uses malloc/free to monitor memory access
  – Memory deallocation takes up much time, so bug-server2.0 uses memory pools:
    • pool *new_pool()
      creates a new pool for memory allocation
    • void *pool_alloc(pool *p, size_t size)
      allocates size bytes from the pool p
    • void free_pool(pool *p)
      frees all memory allocated to pool p
  – Pools internally use malloc to create new or extend pools, free to free pools
  – A security exploit is released, affects only 2.0, why?
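One plausible reading of the question can be sketched as follows. If bounds are tracked per malloc call, every object handed out by a pool lives inside the same malloc'ed block, so an overflow from one pool object into its neighbour never leaves the tracked data unit. The pool layout below is an assumption made for illustration (fixed capacity, no pool extension, no alignment handling); only the three function names come from the slide.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical pool layout: one malloc'ed block that a malloc-based
   checker would see as a single data unit. */
typedef struct pool {
    unsigned char *block;
    size_t cap;
    size_t used;
} pool;

pool *new_pool(void) {
    pool *p = malloc(sizeof *p);
    if (p == NULL) return NULL;
    p->cap = 4096;                 /* arbitrary size for this sketch */
    p->used = 0;
    p->block = malloc(p->cap);
    if (p->block == NULL) { free(p); return NULL; }
    return p;
}

/* Bump allocation: consecutive requests are adjacent in the block. */
void *pool_alloc(pool *p, size_t size) {
    assert(p != NULL);
    if (size > p->cap - p->used) return NULL;   /* no extension here */
    void *r = p->block + p->used;
    p->used += size;
    return r;
}

void free_pool(pool *p) {
    if (p == NULL) return;
    free(p->block);
    free(p);
}
```

With two 8-byte allocations `a` and `b` from the same pool, writing `a[8]` lands in `b[0]` yet stays inside the block returned by malloc, so a checker that only knows about malloc'ed units has nothing to flag.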
Failure-Oblivious Computing, cont.
• Extension to gcc
• Implemented using checking code and continuation code
• Checking code evaluates whether a memory access is valid or not
• Continuation code executes when an invalid memory access occurs
  – Discards erroneous writes
  – Manufactures a sequence of results for erroneous reads, [0, 1, 2, 0, 1, 3, 0, 1, 4, …]
Failure-Oblivious Computing, cont.
• Checking code
  – based on Jones and Kelly’s scheme
  – enhanced by Ruwase and Lam
• Jones and Kelly’s scheme
  – A table maps locations to data units
  – A data unit is e.g. a struct, array, variable
  – The table tracks intended data units and is used to distinguish in-bounds from out-of-bounds pointers
Failure-Oblivious Computing, cont.
• Base Case – always in-bounds
  – Base pointer is the address of an array, struct or variable
  – Intended data unit is the corresponding data unit of the base pointer
• Pointer Arithmetic
  – Starting pointer + offset
  – In-bounds if and only if starting pointer and derived pointer point to the same data unit
  – Intended data unit is the same for both
  – Does not work with “reverse” pointer arithmetic?
• Pointer Variables
  – In-bounds if and only if it was assigned an in-bounds pointer
  – Intended data unit is the same as that of the pointer from which it was assigned
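The pointer-arithmetic rule can be illustrated with a toy table of data units. This is a sketch under stated assumptions: a fixed-size array with linear search stands in for the real table, the names `register_unit`, `lookup_unit` and `arith_in_bounds` are hypothetical, and addresses are handled as integers so that the example never forms an invalid C pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry per data unit (array, struct or variable). */
typedef struct {
    uintptr_t base;
    size_t size;
} data_unit;

#define MAX_UNITS 16
static data_unit units[MAX_UNITS];
static size_t n_units = 0;

/* Records a data unit in the table. Returns 1 on success, 0 if full. */
int register_unit(const void *base, size_t size) {
    assert(base != NULL && size > 0);
    if (n_units == MAX_UNITS) return 0;
    units[n_units].base = (uintptr_t)base;
    units[n_units].size = size;
    n_units++;
    return 1;
}

/* Maps an address to the data unit containing it, or NULL if none. */
const data_unit *lookup_unit(uintptr_t addr) {
    for (size_t i = 0; i < n_units; i++) {
        if (addr >= units[i].base && addr - units[i].base < units[i].size) {
            return &units[i];
        }
    }
    return NULL;
}

/* start + offset (in bytes) is in-bounds if and only if it falls in
   the same data unit as start. */
int arith_in_bounds(const void *start, ptrdiff_t offset) {
    const data_unit *u = lookup_unit((uintptr_t)start);
    if (u == NULL) return 0;
    uintptr_t derived = (uintptr_t)start + (uintptr_t)offset;
    return lookup_unit(derived) == u;
}
```

Note that a derived address landing in a different registered unit is still reported as out-of-bounds, because the intended data unit is fixed by the starting pointer.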
Failure-Oblivious Computing, cont.
• Valid out-of-bounds pointer
  – Points to the next byte after intended data unit
  – Obtained by padding each data item with an extra byte
• Illegal out-of-bounds pointers have the value ILLEGAL (-2)
• Used to support valid out-of-bounds pointers in terminating loops when using pointer arithmetic
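The loop idiom that this case has to keep working is the ordinary one-past-the-end comparison in C. The function below is a generic illustration (the name `sum_ints` is hypothetical): the `end` pointer is formed and compared but never dereferenced.

```c
#include <assert.h>
#include <stddef.h>

/* Sums n integers using a one-past-the-end pointer as loop sentinel. */
int sum_ints(const int *arr, size_t n) {
    assert(arr != NULL);
    const int *end = arr + n;   /* valid to form and compare, not to read */
    int total = 0;
    for (const int *p = arr; p != end; ++p) {
        total += *p;
    }
    return total;
}
```

A checker that rejected `arr + n` outright would break this common pattern, which is why the next byte after a data unit is treated as a valid out-of-bounds location.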
Failure-Oblivious Computing, cont.
• Dereferencing a pointer checks the table:
  – in-bounds pointer returns referent value
  – out-of-bounds pointer causes program to halt with error
• Does not support pointer arithmetic that obtains a pointer to a location past the end of the intended data unit, which is then used to calculate an in-bounds pointer
Failure-Oblivious Computing, cont.
• Ruwase and Lam’s enhancement
  – Out-of-bounds pointers are set to point to an out-of-bounds (OOB) object
  – OOB object:
    • Start address of intended data unit
    • Offset from this address
  – Can track out-of-bounds pointers to their intended data unit
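The idea of remembering the intended data unit together with an offset can be sketched with a small record type. This is a simplification under stated assumptions: here every pointer carries the record, while the slide describes creating an OOB object only for out-of-bounds pointers, and the `size` field and all `tp_*` names are hypothetical additions for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tracked pointer: start address of the intended data
   unit plus an offset that may lie outside [0, size). */
typedef struct {
    char *base;
    size_t size;        /* assumed here; the real scheme gets it from the table */
    ptrdiff_t offset;
} tracked_ptr;

tracked_ptr tp_make(char *base, size_t size) {
    assert(base != NULL);
    tracked_ptr p = { base, size, 0 };
    return p;
}

/* Pointer arithmetic only updates the offset, so going out of bounds
   and coming back is tracked without forming an invalid address. */
tracked_ptr tp_add(tracked_ptr p, ptrdiff_t delta) {
    p.offset += delta;
    return p;
}

int tp_in_bounds(tracked_ptr p) {
    return p.offset >= 0 && (size_t)p.offset < p.size;
}

/* Returns the real address if in-bounds, NULL otherwise. */
char *tp_deref(tracked_ptr p) {
    return tp_in_bounds(p) ? p.base + p.offset : NULL;
}
```

For an 8-byte buffer, adding 12 yields an out-of-bounds pointer, and subtracting 10 from that result yields a usable pointer to byte 2 again, which is the case the original table scheme could not handle.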
Failure-Oblivious Computing, cont.
• Pros
  – Global state is not corrupted
  – Local data accessed in loops
    • Individual iteration failures can be handled
  – Servers without state
    • No propagation of errors beyond a single request
  – Interactive programs
    • Programs do not crash
    • Can show meaningful results
    • Tolerable slow-down
Failure-Oblivious Computing, cont.
• Cons:
  – “safe compiler for C”
    • What if this introduces bugs?
    • Only C?
    • Programs must be recompiled
  – Always in use, not only when needed
  – Manufactured reads can lead to wrong execution path, i.e. not for correctness-critical applications
    • Only tested in the case of Midnight Commander
Failure-Oblivious Computing, cont.
• Cons, cont.
  – “The key question is how (or even if) the incorrect or unexpected result may propagate through the remaining computation to affect the overall results of the program”
    • How to determine this is not answered
    • Vaguely mentions that FOC is less appropriate for such cases
    • Global change, thus might only be suited for isolated functionality, i.e. local
Failure-Oblivious Computing, cont.
• Cons, cont.
  – Patch management
    • Rather have a fixed system than one which seems to run fine, but might not
  – “Lucky” cases:
    • Pine – different method used elsewhere
    • Sendmail – length-check catches error
    • Midnight Commander – dangling link minimizes error
    • Mutt – server returns “does not exist”
Failure-Oblivious Computing, cont.
• Performance
  – Programs that would crash earlier continue execution
  – Slowdown by a factor of 1.03 to 8.1 compared with the original program