2014-06-26 - a guide to undefined behavior in c and c++

A Guide to Undefined Behavior in C and C++ Stanley SIT@Synology 2014-06-26

Page 1: 2014-06-26 - A guide to undefined behavior in c and c++

A Guide to Undefined Behavior in C and C++

Stanley SIT@Synology

2014-06-26

Page 2: 2014-06-26 - A guide to undefined behavior in c and c++

Great Articles

● A Guide to Undefined Behavior in C and C++
  ○ http://blog.regehr.org/archives/213
● What Every C Programmer Should Know About Undefined Behavior
  ○ http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html
    ■ by Chris Lattner, primary author of the LLVM project

Page 3: 2014-06-26 - A guide to undefined behavior in c and c++

Let’s have a quiz 1. What’s the bug inside?

void *safe_alloc(size_t n, size_t m) {
    size_t total_size = n * m;
    if (n > 0 && SIZE_MAX/n < m)
        return 0;
    return malloc(total_size);
}

Page 4: 2014-06-26 - A guide to undefined behavior in c and c++

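Let’s have a quiz 1. What’s the bug inside?

void *safe_alloc(size_t n, size_t m) {
    // The product is computed before the overflow check.
    // For signed operands, overflow is undefined, so the compiler may assume
    // n * m never overflows and remove the check that follows.
    // Note: size_t is unsigned, so here n * m wraps modulo SIZE_MAX + 1,
    // which is defined behavior; the check is only safe from removal
    // because of that.
    size_t total_size = n * m;
    if (n > 0 && SIZE_MAX/n < m)
        return 0;
    return malloc(total_size);
}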

Page 5: 2014-06-26 - A guide to undefined behavior in c and c++

Revised version

void *safe_alloc(size_t n, size_t m) {
    size_t total_size = 0;
    if (n > 0 && SIZE_MAX/n < m)
        return 0;
    total_size = n * m;
    return malloc(total_size);
}
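The check-before-multiply ordering can be sketched as a complete translation unit. This is a minimal sketch, not the author's code: the name `safe_alloc_checked` and the internal sanity assertion are assumptions added for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical name; mirrors the revised safe_alloc from the slide. */
void *safe_alloc_checked(size_t n, size_t m) {
    /* Reject the request if n * m would not fit in size_t. */
    if (n > 0 && SIZE_MAX / n < m)
        return NULL;
    size_t total_size = n * m;
    /* Sanity check (illustrative): dividing the product by n gives m back,
       so no wraparound occurred. */
    assert(n == 0 || total_size / n == m);
    return malloc(total_size);
}
```

A caller treats a NULL result as "request too large or out of memory" and frees any non-NULL result with free().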

Page 6: 2014-06-26 - A guide to undefined behavior in c and c++

Let’s have a quiz 2. What will main() return?

int a;

int assign_a (int val) {
    a = val;
    return val;
}

int main (void) {
    assign_a (0) + assign_a (1);
    return a;
}

http://blog.regehr.org/archives/161

Page 7: 2014-06-26 - A guide to undefined behavior in c and c++

Let’s have a quiz 2. What will main() return?

int a;

int assign_a (int val) {
    a = val;
    return val;
}

int main (void) {
    // order of evaluation of the subexpressions in C is unspecified
    // main() may either return 0, or return 1
    assign_a (0) + assign_a (1);
    return a;
}

Page 8: 2014-06-26 - A guide to undefined behavior in c and c++

Revised version

int a;

int assign_a (int val) {
    a = val;
    return val;
}

int main (void) {
    int x = assign_a (0);
    int y = assign_a (1);
    x+y;
    return a;
}

http://blog.regehr.org/archives/161
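The effect of splitting the calls into separate statements can be sketched as follows. The function name `ordered_result` is hypothetical; the rest follows the slide. Each declaration ends a full expression, so the first call completes before the second begins.

```c
#include <assert.h>

static int a;

static int assign_a(int val) {
    a = val;
    return val;
}

/* Hypothetical wrapper: the two calls are sequenced by separate statements,
   so assign_a(1) always runs last and a ends up as 1. */
int ordered_result(void) {
    int x = assign_a(0);
    int y = assign_a(1);
    assert(x + y == 1);
    return a;
}
```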

Page 9: 2014-06-26 - A guide to undefined behavior in c and c++

“undefined behavior”

● Anything at all can happen
● The Standard imposes no requirements
  ○ may fail to compile
  ○ may crash
  ○ may silently generate incorrect results
  ○ may fortunately do exactly what the programmer intended

Page 10: 2014-06-26 - A guide to undefined behavior in c and c++

List of undefined behavior

● Use of an uninitialized variable
  ○ int a;
  ○ if (a > 0) {}
● Signed integer overflow
  ○ "INT_MAX+1" is not guaranteed to be INT_MIN.
● Oversized Shift Amounts
  ○ Shifting a uint32_t by 32
  ○ 1<<32
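One way to stay clear of the overflow and shift cases listed above is to test the operands before performing the operation. The helpers below are a minimal sketch; the names `checked_add` and `shl32`, and the choice to return 0 for an oversized shift, are assumptions made for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: stores a + b in *out and returns true only when
   the sum fits in int; the addition is never performed otherwise. */
bool checked_add(int a, int b, int *out) {
    assert(out != NULL);
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}

/* Hypothetical helper: left shift of a 32-bit value that yields 0 for
   shift counts of 32 or more instead of invoking undefined behavior. */
uint32_t shl32(uint32_t value, unsigned count) {
    return count < 32 ? (uint32_t)((uint64_t)value << count) : 0;
}
```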

Page 11: 2014-06-26 - A guide to undefined behavior in c and c++

List of undefined behavior

● Dereferences of Wild Pointers and Out of Bounds Array Accesses
  ○ int *a = (int *)rand();
  ○ *a = 1;
● Dereferencing a NULL Pointer
  ○ int *a = NULL;
  ○ *a = 1;
    ■ contrary to popular belief, it is not defined to trap

Page 12: 2014-06-26 - A guide to undefined behavior in c and c++

List of undefined behavior

● Violating Type Rules
  ○ cast an int* to a float* and dereference it
● Divide by zero
● …
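For the type-rule case, copying the bytes with memcpy is a well-defined alternative to dereferencing a cast pointer. The sketch below assumes a 32-bit float; the name `float_bits` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the object representation of f.
   memcpy copies bytes, so no float object is read through a uint32_t lvalue. */
uint32_t float_bits(float f) {
    uint32_t bits;
    assert(sizeof f == sizeof bits);  /* assumption: float is 32 bits wide */
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

On a platform with IEEE 754 single-precision floats, float_bits(1.0f) is 0x3f800000.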

Page 13: 2014-06-26 - A guide to undefined behavior in c and c++

Why Is Undefined Behavior Good?

● the only good thing!
  ○ it simplifies the compiler’s job
  ○ can generate very efficient code

Page 14: 2014-06-26 - A guide to undefined behavior in c and c++

Why Is Undefined Behavior Good?

● Avoids overhead
  ○ initialization
  ○ array bounds checking
● Enables loop optimization
● The compiler doesn't have to deal with differences between CPUs
● Enables advanced optimization techniques
  ○ "Type-Based Alias Analysis" (TBAA)
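A small sketch of where TBAA helps: because accessing an object through an incompatible type is undefined, the compiler may assume a store through a float* never changes a size_t object. The function name `zero_floats` and its signature are assumptions for illustration; whether a given compiler actually hoists the load depends on its optimizer and flags.

```c
#include <assert.h>
#include <stddef.h>

/* Stores through values (float*) cannot legally modify *count (size_t),
   so the compiler is allowed to read *count once before the loop instead
   of reloading it on every iteration. */
void zero_floats(float *values, const size_t *count) {
    assert(values != NULL && count != NULL);
    for (size_t i = 0; i < *count; ++i)
        values[i] = 0.0f;
}
```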

Page 15: 2014-06-26 - A guide to undefined behavior in c and c++

Why Is Undefined Behavior Bad?

● Application developers may not be aware that
  ○ their code triggers undefined behavior
  ○ a modern compiler optimizer contains many optimizations
  ○ different compilers often have substantially different optimizers

Page 16: 2014-06-26 - A guide to undefined behavior in c and c++

Bug in Linux Kernel

void contains_null_check(int *P) {
    int dead = *P;
    if (P == 0)
        return;
    *P = 4;
}

Compiler Optimizations:
● Dead Code Elimination
● Redundant Null Check Elimination

If the two optimizations run in a different order on this code snippet ...

Page 17: 2014-06-26 - A guide to undefined behavior in c and c++

If the compiler runs Dead Code Elimination first

Dead Code Elimination:

void contains_null_check_after_DCE(int *P) {
    int dead = *P;  // deleted by the optimizer.
    if (P == 0)
        return;
    *P = 4;
}

Redundant Null Check Elimination:

void contains_null_check_after_DCE_and_RNCE(int *P) {
    // Null check is kept.
    if (P == 0)
        return;
    *P = 4;
}

Page 18: 2014-06-26 - A guide to undefined behavior in c and c++

If the compiler runs Redundant Null Check Elimination first

Redundant Null Check Elimination:

void contains_null_check_after_RNCE(int *P) {
    int dead = *P;
    if (false)  // P was dereferenced by this point, so it can't be null
        return;
    *P = 4;
}

Dead Code Elimination:

void contains_null_check_after_RNCE_and_DCE(int *P) {
    // int dead = *P;  // deleted by the optimizer
    // if (false)      // deleted by the optimizer
    //     return;
    *P = 4;
}
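A corrected variant tests the pointer before any dereference, so neither optimization order can justify dropping the check. This is a sketch only; the names `contains_null_check_fixed` and `contains_null_check_demo` are hypothetical and do not come from the kernel source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical corrected variant: the null check precedes every
   dereference of P. */
void contains_null_check_fixed(int *P) {
    if (P == NULL)
        return;
    int dead = *P;  /* safe: P is known to be non-null here */
    (void)dead;
    *P = 4;
}

/* Small self-check exercising both paths. */
void contains_null_check_demo(void) {
    int value = 0;
    contains_null_check_fixed(NULL);   /* returns early, no dereference */
    contains_null_check_fixed(&value);
    assert(value == 4);
}
```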

Page 19: 2014-06-26 - A guide to undefined behavior in c and c++

If performance is not your only goal

● undefined behavior is often scary

Page 20: 2014-06-26 - A guide to undefined behavior in c and c++

CVE-2009-1897 bug in Linux Kernel

● kernel/git/torvalds/linux.git
  ○ tun subsystem in the Linux kernel 2.6.30 and 2.6.30.1
  ○ drivers/net/tun.c
● null check removed by the optimizer
  ○ http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2009-1897
  ○ Fun with NULL pointers, part 1
    ■ http://lwn.net/Articles/342330/

Page 21: 2014-06-26 - A guide to undefined behavior in c and c++

Does a large codebase contain undefined behavior?

● no good way to determine
● some useful tools can help find bugs

primary author of the LLVM project

Page 22: 2014-06-26 - A guide to undefined behavior in c and c++

Does a large codebase contain undefined behavior?

● Enable and pay attention to compiler warnings, preferably using multiple compilers
● Use static analyzers (like Clang’s, Coverity, etc.) to get even more warnings
● Use compiler-supported dynamic checks
  ○ gcc’s -ftrapv flag generates code to trap signed integer overflows

Page 23: 2014-06-26 - A guide to undefined behavior in c and c++

Does a large codebase contain undefined behavior?

● Use tools like Valgrind to get additional dynamic checks
● When functions are “type 2” in Regehr's categorization (defined for some inputs, undefined for others), document their preconditions and postconditions
● Use assertions to verify that functions’ preconditions and postconditions actually hold
● Particularly in C++, use high-quality data structure libraries
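Documenting and asserting preconditions and postconditions might look like the sketch below. The function `index_of_max` is a hypothetical example chosen for illustration: it is undefined for a NULL pointer or an empty array, so those conditions are stated and checked.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example function.
   Precondition:  data != NULL and len > 0.
   Postcondition: the returned index is < len and refers to the first
                  occurrence of the largest element. */
size_t index_of_max(const int *data, size_t len) {
    assert(data != NULL);
    assert(len > 0);
    size_t best = 0;
    for (size_t i = 1; i < len; ++i)
        if (data[i] > data[best])
            best = i;
    assert(best < len);
    return best;
}
```

The assertions are compiled out when NDEBUG is defined, so they document and test the contract in debug builds without adding cost to release builds.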

Page 24: 2014-06-26 - A guide to undefined behavior in c and c++

Does a large codebase contain undefined behavior?

● Clang has an experimental -fcatch-undefined-behavior mode
  ○ later Clang releases replaced it with -fsanitize=undefined

$ clang t.c
$ ./a.out
$ clang t.c -fcatch-undefined-behavior
$ ./a.out
Illegal instruction

Page 25: 2014-06-26 - A guide to undefined behavior in c and c++

Reference

● The C FAQ
  ○ http://c-faq.com/ansi/undef.html
● Undefined behavior
  ○ http://en.wikipedia.org/wiki/Undefined_behavior

Page 26: 2014-06-26 - A guide to undefined behavior in c and c++

Q&A