
Common Errors in C

David Chisnall

February 15, 2011


The C Preprocessor

• Runs before parsing

• Allows some metaprogramming

Author's Note
Comment
The C preprocessor was originally a separate program, which helped keep the compiler small (important in the '70s!) and provided some features that weren't in the language.

Preprocessor Macros Are Not Functions

• The preprocessor performs token substitution

• It is completely unaware of the language semantics

    #define ANSWER 42
    test(ANSWER); // replaced by test(42)

    #define CALL(x) (x)()
    CALL(y()); // replaced by (y())()

Author's Note
Comment
Macros look like functions, but they behave differently. Common style guidelines recommend using all-caps names for macros so they are obviously different from functions.

Example Failure

    #define MIN(x, y) ((x<y) ? x : y)

• Looks correct?

• What happens if the arguments have side effects?

Author's Note
Comment
If this were a function, that would be fine, but remember that the preprocessor only works by pasting tokens.

Side Effects Evaluated Twice

    int min = MIN(a(), b());
    // Expands to:
    int min = ((a()<b()) ? a() : b());

• Whichever function has the lower value is called twice

• If it has side effects, this is very bad

• If it doesn’t, it’s still overhead

Author's Note
Comment
The preprocessor provides a very convenient way of hiding bugs if it is not used correctly.
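
A minimal sketch of the double evaluation described above. The names next_value, calls_to_next and count_calls_in_min are hypothetical, introduced only to make the extra call visible, and the macro arguments are parenthesised here, which the slide version does not do.

```c
/* The naive macro from the slide, with its arguments parenthesised. */
#define MIN(x, y) (((x) < (y)) ? (x) : (y))

/* Hypothetical counter, used only to observe how often the function runs. */
static int calls_to_next = 0;

/* Hypothetical function with a side effect: each call bumps the counter. */
static int next_value(void)
{
    calls_to_next++;
    return calls_to_next * 10;
}

/* Stores the value MIN produced in *result and returns the number of
 * times next_value() ran during that single use of the macro. */
int count_calls_in_min(int *result)
{
    calls_to_next = 0;
    *result = MIN(next_value(), 1000);
    return calls_to_next;
}
```

The comparison uses the first call's return value (10), but the value stored in *result comes from a second call (20), so the macro both repeats the side effect and yields a value that was never compared.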

Solution: Use Functions?

    int min(int x, int y)
    {
        if (x<y)
        {
            return x;
        }
        return y;
    }
    ...
    min(a(), b());

• Looks safe - functions are called, then the results are passed as arguments.

• What happens if a() and b() return floats?

Author's Note
Comment
This version is fine for integers, but it will break if you pass it floating point values. They will be implicitly converted to integers before the call. For example, passing 0.1 and 0.4 will pass 0 as both arguments and return 0, which is wrong.
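
A small sketch of that truncation. The names min_int, min_double and lossy_min are hypothetical; lossy_min exists only to show the implicit conversion that happens when double values reach an int parameter.

```c
/* Integer-only minimum, equivalent to the function on the slide. */
int min_int(int x, int y)
{
    return (x < y) ? x : y;
}

/* A separate function is needed for each type you want to support. */
double min_double(double x, double y)
{
    return (x < y) ? x : y;
}

/* Both doubles are silently converted to int at the call,
 * so any fractional part is discarded before the comparison. */
double lossy_min(double a, double b)
{
    return min_int(a, b);
}
```

With 0.1 and 0.4 as arguments, lossy_min returns 0.0 while min_double returns 0.1.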

Solution 2: More Complex Macro

    #define MIN(x, y) ({ \
        __typeof__(x) _x = (x); \
        __typeof__(y) _y = (y); \
        _x < _y ? _x : _y; \
    })

• Works correctly

• But uses two GCC-specific extensions

• It is not possible to portably solve this problem in C99!

Author's Note
Comment
This uses two GCC extensions. The brackets-and-braces syntax for providing a block as an expression, which evaluates to the result of the last statement is one. The __typeof__() operator is another. It works, but it will break with Microsoft's C compiler, for example.
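
A sketch showing that this form evaluates each argument once, assuming a compiler that supports both extensions (GCC or Clang). The helper names next_value, calls, calls_in_safe_min and min_of_doubles are hypothetical.

```c
/* Requires GCC or Clang: statement expressions and __typeof__. */
#define MIN(x, y) ({            \
    __typeof__(x) _x = (x);     \
    __typeof__(y) _y = (y);     \
    _x < _y ? _x : _y;          \
})

/* Hypothetical counter and side-effecting function, for observation only. */
static int calls = 0;

static int next_value(void)
{
    calls++;
    return calls * 10;
}

/* Stores the macro result in *result and returns how many times
 * next_value() was called during one use of MIN. */
int calls_in_safe_min(int *result)
{
    calls = 0;
    *result = MIN(next_value(), 1000);
    return calls;
}

/* The same macro also works for doubles, with no truncation. */
double min_of_doubles(double a, double b)
{
    return MIN(a, b);
}
```

Each argument is copied into a local variable first, so the side effect happens once and the compared value is the returned value.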

Type Promotion

• Arithmetic results in C depend on the types of the operands

• This may not do what you want!

    int64_t multiply_extend(int32_t a, int32_t b)
    {
        // Wrong!
        int64_t result = a*b;
        // Correct:
        result = (int64_t)a*b;
        return result;
    }

Author's Note
Comment
Multiplying two 32-bit values will give a result that definitely fits in 64 bits, but will do 32-bit arithmetic, so you'll get an overflow in some cases. If you want a 64-bit result, you must convert to a 64-bit type before you do the multiplication, not afterwards.
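
A worked case, keeping only the correct form. 100000 * 100000 is 10000000000, which is larger than the biggest 32-bit signed value (2147483647), so the multiplication has to be carried out in 64 bits.

```c
#include <stdint.h>

int64_t multiply_extend(int32_t a, int32_t b)
{
    /* Widen one operand first; the other is then converted to match,
     * so the multiplication itself is performed on 64-bit values. */
    return (int64_t)a * b;
}
```

The wrong form is deliberately left out of the sketch: overflowing a signed 32-bit multiplication is undefined behaviour, so its result cannot be relied on even for demonstration.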

The Rules (Simplified)

• Arithmetic operations on the same type evaluate to that type

• Arithmetic operations on different types use the one with the wider range

• The actual rules are more complex than this!

• If you’re confused, explicitly cast both arguments.

• Overly verbose code is better than buggy code

    double half = 1/2; // Evaluates to 0!

Author's Note
Comment
Never be afraid in C code to write redundant brackets or extra explicit casts, or even comparisons to 0. It's always better to have code that is longer than it needs to be than code that is broken. The compiler will optimise away redundant expressions, but won't optimise away buggy code.
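
A short sketch of the division example with and without explicit types. The function names are hypothetical.

```c
/* Both operands are int, so the division is an integer division.
 * The result (0) is converted to double only afterwards. */
double half_wrong(void)
{
    return 1 / 2;
}

/* One double operand is enough to make the division floating point. */
double half_right(void)
{
    return 1.0 / 2;
}

/* Casting both operands is redundant but leaves no room for doubt. */
double ratio(int numerator, int denominator)
{
    return (double)numerator / (double)denominator;
}
```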

Assignment Instead of Comparison

    if (a = b)

• What does this do?

• Assigns b to a

• Tests if a is zero

• Enters the if block if it is non-zero

Author's Note
Comment
This is probably the worst bit of C. It is designed to make it possible to write dense code, but dense code is usually buggy code, so that's not a very good excuse for a terrible language feature.

Avoiding Accidental Assignment

• Put the rvalue on the left side of comparisons (e.g. 0 == a, not a == 0)

• Explicitly add 0 == when you want to compare against 0.

• Turn on compiler warnings...

• ...and pay attention to them!

Author's Note
Comment
Lots of idioms in C are designed specifically to avoid this kind of bug. A good compiler will force you to add extra brackets to silence a warning when you do this, to make sure you really meant it, but only at some warning levels.
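
A sketch of both idioms, with hypothetical function names. With the constant on the left, mistyping `=` for `==` gives `0 = value`, which does not compile. GCC reports `if (a = b)` under -Wparentheses, which -Wall enables.

```c
#include <stdbool.h>

/* Constant on the left: `0 = value` would be a compile error. */
bool is_zero(int value)
{
    return 0 == value;
}

/* When the assignment is intentional, the extra brackets and the
 * explicit comparison make that clear to readers and to the compiler. */
int copy_and_test(int *destination, int source)
{
    if ((*destination = source) != 0)
    {
        return 1;
    }
    return 0;
}
```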

Forgetting How Switch Statements Work

    switch (expression)
    {
        case 1:
            doSomething();
        case 2:
            doSomethingElse();
        default:
            giveUp();
    }

• Looks right?

• giveUp() is called in all cases

• doSomethingElse() is called even for values of 1

• Sometimes you want this, but usually you don’t

Author's Note
Comment
This fall-through behaviour is very useful sometimes, since it lets you have cascading sets of conditions. It's most useful where you have a few groups of conditions, so you have a few empty cases that just fall through to the next one.

Forgetting How Switch Statements Work

    switch (expression)
    {
        case 1:
            doSomething();
            break;
        case 2:
            doSomethingElse();
            break;
        default:
            giveUp();
    }

• This version is right

• Note for C# programmers: C# switch statement works differently!
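
A sketch that records which branches run in each version. The flag names and the two functions are hypothetical stand-ins for doSomething(), doSomethingElse() and giveUp().

```c
/* One bit per handler, so the return value shows everything that ran. */
enum { DID_SOMETHING = 1, DID_SOMETHING_ELSE = 2, GAVE_UP = 4 };

/* No break statements: execution falls through to every later case. */
unsigned run_without_breaks(int expression)
{
    unsigned ran = 0;
    switch (expression)
    {
        case 1:
            ran |= DID_SOMETHING;
            /* falls through */
        case 2:
            ran |= DID_SOMETHING_ELSE;
            /* falls through */
        default:
            ran |= GAVE_UP;
    }
    return ran;
}

/* With break statements: exactly one handler runs per value. */
unsigned run_with_breaks(int expression)
{
    unsigned ran = 0;
    switch (expression)
    {
        case 1:
            ran |= DID_SOMETHING;
            break;
        case 2:
            ran |= DID_SOMETHING_ELSE;
            break;
        default:
            ran |= GAVE_UP;
    }
    return ran;
}
```

For an input of 1, the first version sets all three flags and the second sets only DID_SOMETHING.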


Comparing Strings with ==

    if ("test" == a)

• Why is this wrong?

• C does not have real strings

• It uses pointers to characters / character arrays instead

• This compares two pointers - they are only the same if they point to the same string, not two identical strings in different places in memory

Author's Note
Comment
This is a particularly nasty bug, because sometimes the pointer comparison will work. If you are comparing constant strings, the compiler or linker may have made two declarations of a constant string into the same one, so it will work, but then a different compiler (or optimisation level) may not, so you get strange bugs.

Correct String Comparison

    if ((NULL != a) && (strcmp("test", a) == 0))

• NULL test is required because strcmp() expects valid pointers

• The strcmp() function returns an ordering, so you can compare against zero to sort strings
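
One way to package the check is a small helper. The name strings_equal is hypothetical, and treating a NULL argument as "not equal" is an assumption made for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Compares the characters, not the addresses.
 * A NULL argument is treated as not equal to anything. */
bool strings_equal(const char *a, const char *b)
{
    if (NULL == a || NULL == b)
    {
        return false;
    }
    return strcmp(a, b) == 0;
}
```

The helper returns true for two separate buffers holding the same characters, which is exactly the case where a pointer comparison with == would fail.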


Arrays Are Pointers - Mostly

    int example(void)
    {
        char *a = "foo";
        char b[] = "foo";
        ...

• Are these two the same?

• "foo" is a global constant string

• a is a pointer to that string

• b is a copy of that string on the stack

Author's Note
Comment
This particular distinction is confusing in conjunction with the string comparison mistake. If you define two char* variables and point them at constant strings with the same value, the compiler will probably make them both point to the same string, so you might think that you can always compare strings for equality that way.
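
A sketch of two observable differences between the declarations, using hypothetical function names. sizeof sees a pointer in one case and a four-byte array in the other, and only the array copy may be modified; writing through a pointer to a string literal is undefined behaviour, so the sketch does not attempt it.

```c
#include <stddef.h>
#include <string.h>

/* a is a pointer, so sizeof reports the size of a pointer. */
size_t pointer_size(void)
{
    const char *a = "foo";
    return sizeof a;
}

/* b is an array of 'f', 'o', 'o' and the terminator: 4 bytes. */
size_t array_size(void)
{
    char b[] = "foo";
    return sizeof b;
}

/* b is a private copy on the stack, so changing it is allowed. */
int modified_copy_matches(void)
{
    char b[] = "foo";
    b[0] = 'g';
    return strcmp(b, "goo") == 0;
}
```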

Printf and Scanf Problems

    int a = 12;
    printf("Current value: %f, old value: %d\n", a);
    scanf("%d", a);

• Three bugs here, what are they?

• a is passed as the first argument after the format string, but the format specifier tells printf to expect a floating point value.

• One of the arguments to printf is missing

• scanf reads values, so expects pointer arguments

• These functions expect variable arguments, so the compiler needs special logic to check them.

• Most compilers will check arguments to these functions, so don’t ignore the warnings!

Author's Note
Comment
Variadic functions (ones that take a variable number of parameters) completely bypass the type checking. GCC has a few special cases: printf, scanf, and null-terminated argument list. In general, these functions only work because the caller and callee agree on the argument types without the compiler's assistance. If they don't agree, then it's a problem.
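
A corrected sketch of the same calls. It uses snprintf and sscanf instead of printf and scanf so that the result can be inspected without a terminal; the function names and the second value (old) are hypothetical.

```c
#include <stdio.h>

/* Both values are ints, so both specifiers are %d,
 * and there is one argument for each specifier. */
int format_values(char *out, size_t size, int current, int old)
{
    return snprintf(out, size, "Current value: %d, old value: %d\n",
                    current, old);
}

/* The scanf family writes through its arguments, so it needs a pointer.
 * With a plain variable a, the call would pass &a. */
int parse_int(const char *text, int *value)
{
    return sscanf(text, "%d", value) == 1;
}
```

Checking the return value of sscanf shows whether a conversion actually took place, which the original scanf call also ignored.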

More Variadic Function Problems

    expects_null_terminated(a, b, c, d, e, 0);

• What’s wrong here?

• Hint: It will work correctly on most platforms.

• 0 is passed as an int

• On some platforms, int is 4 bytes, pointers are 8 bytes

• The callee will read 8 bytes, test if they’re NULL and treat them as a pointer if they’re not.

• Bigger problem in C++ because standard C++ has no proper NULL (fixed in C++0X)

Author's Note
Comment
When designing C++, someone thought it was a good idea to use a literal 0 instead of NULL for NULL pointers. This was an ugly hack to work around the braindead type system in C++, and this is one of the cases where it's problematic. The same problem in C only appears if you don't use NULL. Most C++ compilers define NULL as some kind of magic value that can be cast to any pointer type.
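
A sketch of a callee and a correct call. count_strings is a hypothetical function in the style of expects_null_terminated; the point is that the sentinel is passed as a pointer, with an explicit cast, so it has pointer width on every platform.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical variadic function: counts its string arguments
 * up to, but not including, the terminating null pointer. */
int count_strings(const char *first, ...)
{
    int count = 0;
    va_list args;
    va_start(args, first);
    /* Each va_arg call reads one pointer-sized argument. */
    for (const char *s = first; s != NULL; s = va_arg(args, char *))
    {
        count++;
    }
    va_end(args);
    return count;
}
```

A call such as count_strings("a", "b", "c", (char *)NULL) is safe, whereas ending the list with a bare 0 passes an int and leaves the callee reading bytes that were never written.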

Not Understanding Strings

• Most languages have a proper string type

• C uses character (byte) arrays

• No embedded length, end is identified by a 0 byte

Author's Note
Comment
C is a low-level language, and the designers regarded strings as a high-level concept. They were also American, so thought 7-bit ASCII should be enough for everyone. If you're actually using C for doing a lot of string processing, you'll probably use something like libicu to handle unicode strings.

Common Errors With Strings

    int len = strlen(str);
    char buffer[len];
    strncpy(buffer, str, len);

• Allocates enough space for the characters in the string

• But not enough for the NULL terminator

Author's Note
Comment
This code looks correct, but isn't. It copies the text of the string, but you end up with an unterminated string. The next time something tries treating it as a C string, it will go off the end. If you're lucky, it will crash. If you're unlucky, it will corrupt some random bit of memory.

Don’t Forget the Terminator

    int len = strlen(str);
    char buffer[len + 1];
    memcpy(buffer, str, len);
    buffer[len] = '\0';

Author's Note
Comment
C code often has lots of random +1s in string handling code, because it needs the NULL terminator in every string. This is usually a sign that you're doing the wrong thing - if you're doing a lot of string processing in C, use a library that provides a better interface for strings.
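
The same pattern as a self-contained function. This is a sketch: the name copy_string is hypothetical, and the buffer is heap-allocated instead of placed on the stack so that it can be returned to the caller.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of str, or NULL if allocation fails.
 * The caller is responsible for calling free() on the result. */
char *copy_string(const char *str)
{
    size_t len = strlen(str);
    /* One extra byte for the terminator. */
    char *buffer = malloc(len + 1);
    if (NULL == buffer)
    {
        return NULL;
    }
    memcpy(buffer, str, len);
    buffer[len] = '\0';
    return buffer;
}
```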

Unsafe String Functions

    strcpy(buffer, input);

• What happens if buffer is smaller than input?

• Memory corruption!


Slightly Less Unsafe String Functions

    strncpy(buffer, input, length);

• What happens if buffer is smaller than input?

• Silent truncation, buffer is not terminated


Safe String Functions

    size_t len = 128;
    char buffer[len];
    if (strlcpy(buffer, input, len) >= len)
    {
        // Handle case where truncation occurred
    }

• Output string is always terminated.

• Return value from strlcpy() is the length of input

• Easy to use safely

• Not in glibc, so unavailable on GNU/Linux

Author's Note
Comment
According to Ulrich Drepper, the GNU libc maintainer, this function is 'inefficient C crap' and so was rejected from inclusion in the GNU C library. This is one of the main reasons why you see more security holes on GNU/Linux systems than *BSD systems.
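
Where strlcpy() is not available, a similar truncation check can be sketched with the standard snprintf(). This is an alternative technique, not strlcpy() itself, and the name copy_checked is hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>

/* Copies src into dst, always terminating dst when size is non-zero.
 * Returns true if the whole string fitted, false if it was truncated. */
bool copy_checked(char *dst, size_t size, const char *src)
{
    /* snprintf returns the length the full output would have had. */
    int needed = snprintf(dst, size, "%s", src);
    return needed >= 0 && (size_t)needed < size;
}
```

As with strlcpy(), the destination stays terminated and the return value tells the caller that truncation happened, so the failure cannot pass silently.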

Using gets

    // Should be big enough
    char *buffer = malloc(1024);
    // Ooops, this function has no way of knowing
    // how big the buffer is
    gets(buffer);

• It is impossible to use gets() correctly

• It is deprecated in C99

• It is removed in C1X

• If someone suggests using it, ignore anything else they say about anything.

Author's Note
Comment
I've seen gets() in a few C tutorials. It's generally a sign that the person writing the code has no idea what they're doing, so if you see this in a tutorial then immediately look for a better tutorial. As a more general rule, beware of any function that takes a pointer to use as output and does not also take the size of the buffer as an argument.
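
A sketch of the usual replacement, fgets(), which takes the buffer size as an argument. The wrapper name read_line is hypothetical, and stripping the trailing newline is a choice made here to mimic what gets() returned.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Reads at most size - 1 characters of one line into buffer.
 * Returns false at end of input or on a read error. */
bool read_line(FILE *stream, char *buffer, int size)
{
    if (NULL == fgets(buffer, size, stream))
    {
        return false;
    }
    /* fgets keeps the newline; cut the string at the first one, if any. */
    buffer[strcspn(buffer, "\n")] = '\0';
    return true;
}
```

A line longer than the buffer is returned in pieces across successive calls, so the buffer is never overrun.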

Questions?