
Richard Mancusi - CSCI 297

Static Analysis and Modeling

Tools that allow further checking of software systems


Dawson Engler, Benjamin Chelf, Andy Chou, and Seth Hallem. Checking System Rules Using System-Specific, Programmer-Written Compiler Extensions. OSDI 2000

Madanlal Musuvathi, David Y.W. Park, Andy Chou, Dawson R. Engler, and David L. Dill. CMC: A Pragmatic Approach to Model Checking Real Code. OSDI 2002.


Issues

Conventional programming tools find (simple) static errors; they are not useful for semantic errors.

Brute-force testing methodologies are neither effective nor thorough for larger, more complex software systems.

The effort needed to identify issues increases (exponentially?) as time goes on.


More Issues

We really are not good at programming.

The psychology of the “master” programmer

Etc. (There are as many excuses for the incorrect as there are programmers.)

Software cannot be “verified”. The best we can hope for are sophisticated checks to unfold (more of) the errors in our code.


Meta Compilation

System implementers understand the semantics of the system better.

Compilers are better enforcers of rules that map well to the source code.

Therefore: MC involves integrating user-provided, system-specific (semantic) rules into the compilation process.


MC Extensions

Uses “Metal”, a language for expressing a broad class of customized, static, bug-finding analyses.

xgcc, the analysis engine, searches all execution paths and applies the extensions

[Figure: C code and extensions feed into xg++ (local analysis), which produces the analysis results]


Example

sm free_checker {

    state decl { any_ptr } p;

    start:   { free(p) } ==> p.freed ;

    p.freed: { *p }      ==> p.stop, { err("using %s after free!", mc_identifier(p)); }
           | { free(p) } ==> p.stop, { err("double free of %s!", mc_identifier(p)); } ;

}

From: Seth Hallem, Benjamin Chelf, Yichen Xie, and Dawson Engler. A System and Language for Building System-Specific Static Analyses. PLDI 2002.
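
The state machine above can be mimicked in a few lines of Python. This is only a sketch under stated assumptions: a real Metal extension runs inside the compiler over every execution path, whereas the hypothetical `check_trace` below walks one already-linearized list of `(operation, pointer_name)` events, a format invented here for illustration.

```python
def check_trace(events):
    """Run a free_checker-like state machine over a single path.

    `events` is a list of (op, name) pairs where op is "free" or "deref".
    The event format is an assumption made for this sketch.
    """
    state = {}   # pointer name -> "freed" or "stop"; absent means "start"
    errors = []
    for op, name in events:
        current = state.get(name, "start")
        if current == "start":
            if op == "free":
                state[name] = "freed"
        elif current == "freed":
            if op == "deref":
                errors.append(f"using {name} after free!")
                state[name] = "stop"
            elif op == "free":
                errors.append(f"double free of {name}!")
                state[name] = "stop"
        # In the "stop" state the pointer is no longer tracked.
    return errors
```

As in the Metal version, a pointer moves to `stop` after its first report, so one bug yields one message rather than a cascade.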


Rule Templates

“Never/always do X”: Always initialize variables

“Do X rather than Y”: Avoid globally disabling interrupts

“Always do X before/after Y”: Release locks after using them

“In situation X, do (do not) Y”: Protect all shared variables with locks

“In situation X do Y rather than Z”: To save an instruction in a bit mask, use XOR instead of assignment


Memory Management

Checks against null pointers

Unreclaimed memory checks

“Double free” checks

Use-after-deallocation checks


Global Checks Extension

The authors suggest useful checks performed on the whole code input:

Kernel code should not call blocking functions when holding a spin lock. (42/4)

Library modules should not call blocking functions until after the reference count is set properly. (53/2)
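
The first global rule needs whole-program information, because a function may block only through something it calls. A minimal sketch of that idea, assuming a hand-written call graph and a per-function event list (both hypothetical formats, as are all function names in the usage below): first compute which functions can transitively reach a blocking primitive, then flag calls to them made while a spin lock is held.

```python
def may_block(call_graph, blocking_primitives):
    """Return every function that can transitively reach a blocking primitive.

    `call_graph` maps a function name to the names it calls (assumed format).
    """
    blocking = set(blocking_primitives)
    changed = True
    while changed:                      # iterate to a fixed point
        changed = False
        for caller, callees in call_graph.items():
            if caller not in blocking and any(c in blocking for c in callees):
                blocking.add(caller)
                changed = True
    return blocking


def check_spin_lock(body, blocking):
    """Flag calls to possibly-blocking functions made while a spin lock is held.

    `body` is a list of (op, arg) events for one path through one function.
    """
    held = 0
    errors = []
    for op, arg in body:
        if op == "spin_lock":
            held += 1
        elif op == "spin_unlock":
            held = max(0, held - 1)
        elif op == "call" and held > 0 and arg in blocking:
            errors.append(f"{arg} may block while holding spin lock")
    return errors
```

For example, with `call_graph = {"read_config": ["sleep_primitive"], "helper": ["read_config"], "fast": []}`, the call `may_block(call_graph, {"sleep_primitive"})` marks `helper` as blocking even though it never calls the primitive directly.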


Other uses

Detection of race conditions and deadlocks:

Dawson Engler and Ken Ashcraft. RacerX: Effective, Static Detection of Race Conditions and Deadlocks. In Proceedings of the Symposium on Operating Systems Principles, pages 237-253, October 2003.


Transitioning…

Any questions?

(yawns?)


“Conventional” Model Checking

Modeling software is difficult at best, requiring an abstract definition of the software system.

Abstraction tends to minimize details of implementation.

Time-consuming, manual process.

Memory intensive, usually exhausting system resources.


CMC – “C Model Checker”

Integrates with the code implementation

Process state includes global and local variables, heap, stack, and registers, as well as shared memory

Optimizations to mitigate the “state explosion problem”

Non-deterministic modeling supported

Modeling work can be reused on successive systems


CMC Steps

Correctness properties

Environment specification

Identify initialization code and event handlers

Initial state generated using init functions

State generation

Correctness checks during model execution


State Space Explosion

Key to prolonging model execution

State caching to avoid re-exploring states already visited

Hash compaction (store a small signature to represent each state)

Trade-off: accept missing a few errors in exchange for a reduced state space

Down-scaled model parameters

Heuristics to remove uninteresting states
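
State caching and hash compaction can be sketched together as a breadth-first search that remembers only a short signature of each visited state. This is an illustrative toy, not CMC's implementation: `explore`, `signature`, and the `sig_bytes` parameter are names invented here, and the state is assumed to be any value with a stable `repr`.

```python
import hashlib
from collections import deque


def signature(state, sig_bytes=4):
    """Compact a state into a few bytes (hash compaction)."""
    return hashlib.blake2b(repr(state).encode(), digest_size=sig_bytes).digest()


def explore(initial, successors, invariant, sig_bytes=4, max_states=10_000):
    """Breadth-first search that caches signatures instead of full states.

    Returns (violating_state_or_None, number_of_states_checked).
    """
    seen = {signature(initial, sig_bytes)}
    queue = deque([initial])
    checked = 0
    while queue and checked < max_states:
        state = queue.popleft()
        checked += 1
        if not invariant(state):
            return state, checked
        for nxt in successors(state):
            sig = signature(nxt, sig_bytes)
            if sig not in seen:          # state caching: skip revisits
                seen.add(sig)
                queue.append(nxt)
    return None, checked
```

Two distinct states can share a signature, in which case the second is silently skipped; that is the "miss a few errors" side of the trade-off, and a larger `sig_bytes` makes it less likely at the cost of memory.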


The AODV Model

Use of interrupt-driven event handlers fits well into the CMC modeling paradigm

Three different implementations of the routing protocol modeled

34 distinct errors discovered, including one specification bug

(Mostly) shared modeling code


AODV Correctness Properties

General assertions (segmentation faults, memory leaks, dangling pointers)

Routing tables contain no loops

Routing table entries: (a) one per node, (b) no route to self, (c) valid hop count

Messages have valid hop counts (can’t be infinity), and reserved fields are zeroed.
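
The loop-freedom property can be written as a small global check over all nodes' tables. The sketch below assumes a simplified representation invented for illustration: `tables[node][dest]` holds only the next hop, with no hop counts or sequence numbers, and `find_routing_loop` is a hypothetical name.

```python
def find_routing_loop(tables, dest):
    """Return the nodes forming a loop on the way to `dest`, or None.

    `tables` maps node -> {destination: next_hop} (assumed format).
    """
    for start in tables:
        path = [start]
        node = start
        while node != dest:
            nxt = tables.get(node, {}).get(dest)
            if nxt is None:
                break                    # no route from here; not a loop
            if nxt in path:
                return path[path.index(nxt):] + [nxt]
            path.append(nxt)
            node = nxt
    return None
```

A model checker would evaluate a check like this after every transition, so a loop that exists only briefly between two messages is still caught.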


AODV Environment

Uses an unordered message queue

Message loss modeled with random queue deletions

Alternate wrapper function provided to send network packets

Stubs for 22 kernel functions and the user-space socket buffer library
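
One way to picture the network model: from any set of in-flight messages, the environment may deliver any one of them (the queue is unordered) or delete any one of them (loss). The sketch below enumerates those moves so a search could try each; `environment_moves` and the tuple-of-strings message format are assumptions made for illustration, not the stubs used in the study.

```python
def environment_moves(pending):
    """Enumerate every move the network model may make from `pending`.

    `pending` is a tuple of in-flight messages. Any message may be
    delivered next, and any message may instead be dropped.
    """
    moves = []
    for i, msg in enumerate(pending):
        rest = pending[:i] + pending[i + 1:]
        moves.append(("deliver", msg, rest))
        moves.append(("drop", msg, rest))
    return moves
```

With two messages in flight there are four possible moves, which hints at how quickly the environment alone multiplies the number of states to explore.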


AODV: Initialization and Event Handling

The initialization code is clearly identified

Every signal handler mapped to a CMC “transition”


Example

int c;
mutex_t m;

void Odd() {
    lock(m);
    if ((c % 2) == 1)
        printf("odd: %d\n", c++);
    unlock(m);
}

void Even() {
    lock(m);
    if ((c % 2) == 0)
        printf("even: %d\n", c++);
    unlock(m);
}

int main()
{
    c = 0;
    init_mutex(m);
    schedule(Odd);
    schedule(Even);

    wait(5);
}
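
What a model checker adds to this example is that it tries every schedule instead of whichever one the machine happens to pick. A minimal sketch, assuming each handler runs exactly once and, because its whole body sits under the mutex, executes as one atomic step (`explore_schedules` is a name invented here):

```python
from itertools import permutations


def odd(c, out):
    if c % 2 == 1:
        out.append(f"odd: {c}")   # printf prints c, then c++ increments it
        c += 1
    return c


def even(c, out):
    if c % 2 == 0:
        out.append(f"even: {c}")
        c += 1
    return c


def explore_schedules(handlers):
    """Run the handlers in every possible order, starting from c = 0."""
    results = {}
    for order in permutations(handlers):
        c, out = 0, []
        for handler in order:
            c = handler(c, out)
        results[tuple(h.__name__ for h in order)] = (c, out)
    return results
```

Calling `explore_schedules([odd, even])` shows that the two orders end in different states: running `odd` first prints only `even: 0` and leaves `c` at 1, while running `even` first prints both lines and leaves `c` at 2.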


Conclusions

Static analysis tools are available that provide rule-based checking of code.

Modeling can be used to identify more bugs under controlled executions with programs that “fit” the framework well.

“Finding bugs is easy, given the right approach”

The search for better means to “validate” software should continue; more lessons to come
