taming undefined behavior in llvm - sigplsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf ·...

118
Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations Lab (Advisor: Chung-Kil Hur)

Upload: others

Post on 06-Mar-2020

12 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

Taming Undefined Behavior

in LLVM

SIGPL 2017 Summer

Juneyoung Lee

Software Foundations Lab (Advisor: Chung-Kil Hur)

Page 2: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 452

Undefined

Behavior

(UB)?

UB

in LLVM IR

Problem of

UB in IR

& Solution

Software Foundations Lab

Page 3: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

What isUndefined Behavior?

3Software Foundations Lab

Page 4: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 454

Undefined

Behavior?

Software Foundations Lab

Page 5: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 454

Undefined

Behavior?

Software Foundations Lab

Page 6: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 454

Undefined

Behavior?

Software Foundations Lab

Page 7: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 455Software Foundations Lab

ISO/IEC 9899:2011

Programming languages – C

Page 8: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 455

int a[4]; *(a + 4) = 123;

Software Foundations Lab

ISO/IEC 9899:2011

Programming languages – C

Page 9: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 455

int a[4]; *(a + 4) = 123; Undefined Behavior!

Software Foundations Lab

ISO/IEC 9899:2011

Programming languages – C

Page 10: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

C ≠ Assembly

• C abstract machine!

6

Memory

-

int a[4]; *(a + 4) = 123;

Software Foundations Lab

Page 11: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

C ≠ Assembly

• C abstract machine!

7

l

𝑎 → (𝑙, 0)

Memory

int a[4]; *(a + 4) = 123;

Software Foundations Lab

Page 12: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

int a[4]; *(a + 4) = 123;

C ≠ Assembly

• C abstract machine!

8

Memory

l

𝑎 → (𝑙, 0)

?

Software Foundations Lab

Page 13: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

UB & Compiler

9

State

1

State

2 UB

C

State

1

State

2

State

3 …

Asm

Compile

Software Foundations Lab

Page 14: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45Software Foundations Lab

UB & Compiler

10

State

1

State

2 UB

C

State

1

State

2

State

3’ …

Compile’

Asm(opt.)

Page 15: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45Software Foundations Lab

UB & Compiler

10Optimizer can change UB into anything!

State

1

State

2 UB

C

State

1

State

2

State

3’ …

Compile’

Asm(opt.)

Page 16: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Eliminating Redundant Load

11

a[0] = 10;b[i] = 20;output(a[0]);

a[0] = 10;b[i] = 20;output(10);

AsmC

int a[4];int b[4];int i;

UB and Optimization

Software Foundations Lab

Page 17: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Eliminating Redundant Load

11

a[0] = 10;b[i] = 20;output(a[0]);

a[0] = 10;b[i] = 20;output(10);

AsmC

int a[4];int b[4];int i; 3

UB and Optimization

Software Foundations Lab

Page 18: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Eliminating Redundant Load

11

a[0] = 10;b[i] = 20;output(a[0]);

a[0] = 10;b[i] = 20;output(10);

AsmC

int a[4];int b[4];int i; 3

UB and Optimization

Software Foundations Lab

4

Page 19: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Eliminating Redundant Load

11

a[0] = 10;b[i] = 20;output(a[0]);

a[0] = 10;b[i] = 20;output(10);

AsmC

int a[4];int b[4];int i; 3

UB and Optimization

UB

Software Foundations Lab

4

Page 20: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Register Promotion

12

a = t;b[i] = 20;output(a);

eax = t;b[i] = 20;output(eax);

AsmC

int a;int b[4];int i;

UB and Optimization

Software Foundations Lab

Page 21: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Register Promotion

12

a = t;b[i] = 20;output(a);

eax = t;b[i] = 20;output(eax);

AsmC

int a;int b[4];int i; 4

UB and Optimization

Software Foundations Lab

Page 22: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Register Promotion

12

a = t;b[i] = 20;output(a);

eax = t;b[i] = 20;output(eax);

AsmC

int a;int b[4];int i; 4

UB and Optimization

UB

Software Foundations Lab

Page 23: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Undefined Behavior in C

• Many operations are defined to produce UB

- To support optimizations

- > 200 cases in total!

• Let me introduce two famous cases:

1. Pointer overflow

2. Signed integer overflow

13Software Foundations Lab

Page 24: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 4514

Pointer OverflowUB and Optimization

Page 25: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 4514

Pointer OverflowUB and Optimization

int a[4]; .. a + 4 ..

int a[4]; .. a + 5 ..

UB OK

Page 26: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 4514

Pointer OverflowUB and Optimization

int a[4]; .. a + 4 ..

int a[4]; .. a + 5 ..

UB OK

This guarantees that “p + i” never overflows 2^32!

Page 27: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

int* p;int a;int b;

Pointer Overflow

output(p + a > p + b); output(a > b);

AsmC

15

UB and Optimization

Software Foundations Lab

Page 28: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

int* p;int a;int b;

Pointer Overflow

output(p + a > p + b); output(a > b);

AsmC

0x100 00x100 0

0xFFFFFF00

15

UB and Optimization

Software Foundations Lab

Page 29: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

int* p;int a;int b;

Pointer Overflow

output(p + a > p + b); output(a > b);

AsmC

0x0

(Overflow!)

0x100 00x100 0

0xFFFFFF00

15

UB and Optimization

Software Foundations Lab

Page 30: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

int* p;int a;int b;

Pointer Overflow

output(p + a > p + b); output(a > b);

AsmC

false0x0

(Overflow!)

0x100 00x100 0

0xFFFFFF00

15

UB and Optimization

Software Foundations Lab

Page 31: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

int* p;int a;int b;

Pointer Overflow

output(p + a > p + b); output(a > b);

AsmC

false true0x0

(Overflow!)

0x100 00x100 0

0xFFFFFF00

15

UB and Optimization

Software Foundations Lab

Page 32: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

int* p;int a;int b;

Pointer Overflow

output(p + a > p + b); output(a > b);

AsmC

false true0x0

(Overflow!)

0x100 00x100 0

0xFFFFFF00

15

UB

UB and Optimization

Software Foundations Lab

Page 33: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 4516

Signed Integer OverflowUB and Optimization

Page 34: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 4516

Signed Integer OverflowUB and Optimization

int t = 2147..47.. t + 1 ..

UB int t = -2147..48.. (-t) ..

UB

Page 35: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 4516

Signed Integer OverflowUB and Optimization

int t = 2147..47.. t + 1 ..

UB

1. It is known that SIO is ‘dangerous’.

2. It gives much more optimization opportunities!

int t = -2147..48.. (-t) ..

UB

Page 36: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Signed Integer Overflow is Dangerous!

17

• It’s on “CWE/SANs Top 25 Software Errors”[1].

• Sanitize your application with..

- Runtime Test: IOC[2] / Static Analysis Tool: cppcheck

• Use unsigned int[1] http://cwe.mitre.org/top25/

[2] Will, Peng, John, and Vikram Adve., Understanding Integer Overflow in C/C++, ICSE’12

The 2011 CWE/SANS Top 25 Most Dangerous Software Errors isa list of the most widespread and critical errors

that can lead to serious vulnerabilities in software.

Page 37: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Signed Integer Overflow

int i;for(i=0; i<=N; i++)p[i] = 10;

int64 i;for(i=0; i<=N; i++)p[i] = 10;

C

18

Asm

int* p;

UB and Optimization

Software Foundations Lab

Page 38: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Signed Integer Overflow

int i;for(i=0; i<=N; i++)p[i] = 10;

int64 i;for(i=0; i<=N; i++)p[i] = 10;

C

18

Asm

int* p;

UB and Optimization

INT32_MAX INT32_MAX

Software Foundations Lab

Page 39: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Signed Integer Overflow

int i;for(i=0; i<=N; i++)p[i] = 10;

int64 i;for(i=0; i<=N; i++)p[i] = 10;

C

18

Asm

int* p;

UB and Optimization

INT32_MAX

INT32_MIN

INT32_MAX

Software Foundations Lab

Page 40: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Signed Integer Overflow

int i;for(i=0; i<=N; i++)p[i] = 10;

int64 i;for(i=0; i<=N; i++)p[i] = 10;

C

18

Asm

int* p;

UB and Optimization

INT32_MAX

INT32_MIN

INT32_MAX

INT32_MAX+1

Software Foundations Lab

Page 41: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Signed Integer Overflow

int i;for(i=0; i<=N; i++)p[i] = 10;

int64 i;for(i=0; i<=N; i++)p[i] = 10;

C

18

Asm

int* p;

UB and Optimization

INT32_MAX

INT32_MIN

INT32_MAX

INT32_MAX+1

UB

Software Foundations Lab

Page 42: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Signed Integer Overflow (cont.)

19

UB and Optimization

Software Foundations Lab

Page 43: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Signed Integer Overflow (cont.)

19

UB and Optimization

Software Foundations Lab

t = -x / -y

C

t = x / y

Asm

INT32_MIN 2 INT32_MIN 2

Page 44: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Signed Integer Overflow (cont.)

19

UB and Optimization

Software Foundations Lab

t = -x / -y

C

t = x / y

Asm

INT32_MIN 2 INT32_MIN 2

t = (x*c) / c’

C

t = x * (c/c’)

Asm

-1 INT32_MIN -1 INT32_MIN

Page 45: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Signed Integer Overflow (cont.)

19

UB and Optimization

Software Foundations Lab

t = -x / -y

C

t = x / y

Asm

INT32_MIN 2 INT32_MIN 2

t = (x*c) / c’

C

t = x * (c/c’)

Asm

-1 INT32_MIN -1 INT32_MIN

Page 46: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Summary

• UB is the result of erroneous operation.

• A well-written program should not have UB.

• UB helps compiler to do more optimization.

20Software Foundations Lab

Page 47: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Undefined Behavior

in LLVM IR

21Software Foundations Lab

Page 48: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

IR in Compiler

22

C

Asm

IR Optimization!

Software Foundations Lab

Page 49: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

UB & Optimization

23

State

1

State

2 UB

IR

State

1

State

2

State

3 …

IR

Optimize

Software Foundations Lab

Page 50: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

UB & Optimization

24

State 1 UB

C

State 1 State 2

AsmState 3 State 4

State 1 State 2

IR

UB

State 1 State 2

IR

UB State 3

Software Foundations Lab

Page 51: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 4525

Undefined

Behavior in

LLVM IR?

Software Foundations Lab

Page 52: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 4525

Undefined

Behavior in

LLVM IR?

UB in C

UB in IR!

Software Foundations Lab

Page 53: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

int* pint aint b

Peephole Optimization

output(p + a > p + b) output(a > b)

IR IR

26

UB in C ≠ UB in IR

Software Foundations Lab

Page 54: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

int* pint aint b

Peephole Optimization

output(p + a > p + b) output(a > b)

IR IR

0x100 00x100 0

0xFFFFFF00

26

UB in C ≠ UB in IR

Software Foundations Lab

Page 55: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

int* pint aint b

Peephole Optimization

output(p + a > p + b) output(a > b)

IR IR

0x0

(Overflow!)

0x100 00x100 0

0xFFFFFF00

26

UB in C ≠ UB in IR

Software Foundations Lab

Page 56: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

int* pint aint b

Peephole Optimization

output(p + a > p + b) output(a > b)

IR IR

false0x0

(Overflow!)

0x100 00x100 0

0xFFFFFF00

26

UB in C ≠ UB in IR

Software Foundations Lab

Page 57: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

int* pint aint b

Peephole Optimization

output(p + a > p + b) output(a > b)

IR IR

false true0x0

(Overflow!)

0x100 00x100 0

0xFFFFFF00

26

UB in C ≠ UB in IR

Software Foundations Lab

Page 58: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

int* pint aint b

Peephole Optimization

output(p + a > p + b) output(a > b)

IR IR

false true0x0

(Overflow!)

0x100 00x100 0

0xFFFFFF00

26

UB

UB in C ≠ UB in IR

Software Foundations Lab

Page 59: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

int* pint aint b

C’s UB Model:

Pointer Arithmetic Overflow is

Undefined Behavior

Peephole Optimization

output(p + a > p + b) output(a > b)

IR IR

false true0x0

(Overflow!)

0x100 00x100 0

0xFFFFFF00

26

UB

UB in C ≠ UB in IR

Software Foundations Lab

Page 60: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45Software Foundations Lab

Loop Invariant Code Motion

27

...for(i=0; i<n; ++i){a[i] = p + 0x100

}

q = p + 0x100for(i=0; i<n; ++i){

a[i] = q}

IR IR

C’s UB Model:

Pointer Arithmetic Overflow is

Undefined Behavior

UB in C ≠ UB in IR

Page 61: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45Software Foundations Lab

Loop Invariant Code Motion

27

...for(i=0; i<n; ++i){a[i] = p + 0x100

}

q = p + 0x100for(i=0; i<n; ++i){

a[i] = q}

IR IR 0

0xFFFFFF00

0xFFFFFF00

0

C’s UB Model:

Pointer Arithmetic Overflow is

Undefined Behavior

UB in C ≠ UB in IR

Page 62: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45Software Foundations Lab

Loop Invariant Code Motion

27

...for(i=0; i<n; ++i){a[i] = p + 0x100

}

q = p + 0x100for(i=0; i<n; ++i){

a[i] = q}

IR IR 0

0xFFFFFF00

0xFFFFFF00

0

Overflow!

C’s UB Model:

Pointer Arithmetic Overflow is

Undefined Behavior

UB in C ≠ UB in IR

Page 63: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45Software Foundations Lab

Loop Invariant Code Motion

27

...for(i=0; i<n; ++i){a[i] = p + 0x100

}

q = p + 0x100for(i=0; i<n; ++i){

a[i] = q}

IR IR 0

0xFFFFFF00

0xFFFFFF00

0

Overflow!

C’s UB Model:

Pointer Arithmetic Overflow is

Undefined Behavior

UB

UB in C ≠ UB in IR

Page 64: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

C’s UB Model:

Pointer Arithmetic Overflow is

Undefined Behavior

Poison Value: A Deferred UB

28

...for(i=0; i<n; ++i){a[i] = p + 0x100

}

q = p + 0x100for(i=0; i<n; ++i){a[i] = q

}

IR IR 0

0xFFFFFF00

0

0xFFFFFF00

Overflow! UB

UB in C ≠ UB in IR

Software Foundations Lab

Page 65: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

C’s UB Model:

Pointer Arithmetic Overflow is

Undefined Behavior

LLVM’s UB Model:

Pointer Arithmetic Overflow is

A Poison “Value”

Poison Value: A Deferred UB

28

...for(i=0; i<n; ++i){a[i] = p + 0x100

}

q = p + 0x100for(i=0; i<n; ++i){a[i] = q

}

IR IR 0

0xFFFFFF00

0

0xFFFFFF00

Overflow!poison UB

UB in C ≠ UB in IR

Software Foundations Lab

Page 66: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

C’s UB Model:

Pointer Arithmetic Overflow is

Undefined Behavior

LLVM’s UB Model:

Pointer Arithmetic Overflow is

A Poison “Value”

Poison Value: A Deferred UB

28

...for(i=0; i<n; ++i){a[i] = p + 0x100

}

q = p + 0x100for(i=0; i<n; ++i){a[i] = q

}

IR IR 0

0xFFFFFF00

0

0xFFFFFF00

Overflow!poison UB

UB in C ≠ UB in IR

Software Foundations Lab

Page 67: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

LLVM’s UB Model:

Pointer Arithmetic Overflow is

A Poison “Value”

Poison Value: A Deferred UB

output(p + a > p + b) output(a > b)

IR IR 0xFFFFFF00

0x100 0 0x100 0

29

UB

0x0

(Overflow!)

UB in C ≠ UB in IR

Software Foundations Lab

Page 68: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

LLVM’s UB Model:

Pointer Arithmetic Overflow is

A Poison “Value”

Poison Value: A Deferred UB

output(p + a > p + b) output(a > b)

IR IR 0xFFFFFF00

0x100 0 0x100 0

29

UB

0x0

(Overflow!)Poison

UB in C ≠ UB in IR

Software Foundations Lab

Page 69: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

LLVM’s UB Model:

Pointer Arithmetic Overflow is

A Poison “Value”

Poison Value: A Deferred UB

output(p + a > p + b) output(a > b)

IR IR 0xFFFFFF00

0x100 0 0x100 0

29

UB

0x0

(Overflow!)Poison

UB in C ≠ UB in IR

Software Foundations Lab

Page 70: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

LLVM’s UB Model:

Pointer Arithmetic Overflow is

A Poison “Value”

Poison Value: A Deferred UB

output(p + a > p + b) output(a > b)

IR IR 0xFFFFFF00

0x100 0 0x100 0

29

UB

UB

0x0

(Overflow!)Poison

UB in C ≠ UB in IR

Software Foundations Lab

Page 71: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

UB in IR is only for C?

• Example: Java

- Type checker:“Function args are either null or dereferenceable.”

- Put ‘dereferenceable_or_null’ tag to them!

- It’s UB for them to have invalid pointers

30Software Foundations Lab

Page 72: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Summary

• C’s UB ≠ LLVM IR’s UB.

• The notion of ‘deferred UB’ helps further opt.

• UB works for well-typed languages, too

31Software Foundations Lab

Page 73: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Problem of UB in LLVM IR

& Solution

32Software Foundations Lab

Page 74: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

Taming Undefined Behavior

in LLVM

Nuno P. Lopes

PLDI 2017 Barcelona

Seoul National Univ.

Juneyoung Lee

Yoonseung Kim

Youngju Song

Chung-Kil Hur

Azul Systems Sanjoy Das

Google

John RegehrUniversity of Utah

David Majnemer

Microsoft Research

Software Foundations Lab

Page 75: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Problem of Poison

34

p a p b

+ +

>

output

0xFFFFFF00 0x100

Software Foundations Lab

Page 76: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Problem of Poison

34

p a p b

+ +

>

output

0xFFFFFF00 0x100

poison

Software Foundations Lab

Page 77: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Problem of Poison

34

p a p b

+ +

>

output

0xFFFFFF00 0x100

poison

poisonPropagate

Software Foundations Lab

Page 78: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Problem of Poison

34

p a p b

+ +

>

output

0xFFFFFF00 0x100

poison

poisonPropagate

Raise UB

UB

Software Foundations Lab

Page 79: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Problem of Poison

34

p a p b

+ +

>

output

0xFFFFFF00 0x100

poison

poisonPropagate

Raise UB

“Poison is

Sometimes

Too Poisonous”

UB

Software Foundations Lab

Page 80: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

LLVM’s UB Model:

Branching on poison is

???

35

if (x == y) {

.. use x ..

}

if (x == y) {

.. use y ..

}

Global Value Numbering (GVN)Problems with LLVM’s UB

Software Foundations Lab

Page 81: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

LLVM’s UB Model:

Branching on poison is

???

35

if (x == y) {

.. use x ..

}

if (x == y) {

.. use y ..

}

0 poison 0 poison

Global Value Numbering (GVN)Problems with LLVM’s UB

Software Foundations Lab

Page 82: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

LLVM’s UB Model:

Branching on poison is

???

35

if (x == y) {

.. use x ..

}

if (x == y) {

.. use y ..

}

0 poison 0 poison

poisonpoison

Global Value Numbering (GVN)Problems with LLVM’s UB

Software Foundations Lab

Page 83: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

LLVM’s UB Model:

Branching on poison is

???

35

if (x == y) {

.. use x ..

}

if (x == y) {

.. use y ..

}

0 poison 0 poison

poisonpoison

Global Value Numbering (GVN)Problems with LLVM’s UB

Software Foundations Lab

Page 84: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

LLVM’s UB Model:

Branching on poison is

???

35

if (x == y) {

.. use x ..

}

if (x == y) {

.. use y ..

}

0 poison 0 poison

poisonpoison

0 poison

Global Value Numbering (GVN)Problems with LLVM’s UB

Software Foundations Lab

Page 85: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

LLVM’s UB Model:

Branching on poison is

???

35

if (x == y) {

.. use x ..

}

if (x == y) {

.. use y ..

}

0 poison 0 poison

poisonpoison

0 poison

Global Value Numbering (GVN)Problems with LLVM’s UB

UB

Software Foundations Lab

Page 86: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

LLVM’s UB Model:

Branching on poison is

???

LLVM’s UB Model:

Branching on poison is

Undefined Behavior

35

if (x == y) {

.. use x ..

}

if (x == y) {

.. use y ..

}

0 poison 0 poison

poisonpoison

0 poison

Global Value Numbering (GVN)Problems with LLVM’s UB

UB

Software Foundations Lab

Page 87: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

LLVM’s UB Model:

Branching on poison is

???

LLVM’s UB Model:

Branching on poison is

Undefined Behavior

35

if (x == y) {

.. use x ..

}

if (x == y) {

.. use y ..

}

0 poison 0 poison

poisonpoison

0 poison

Global Value Numbering (GVN)Problems with LLVM’s UB

UB UB

Software Foundations Lab

Page 88: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Loop Unswitching (LU)

36

while (n > 0) {if (cond)A

elseB

}

if (cond)while (n > 0){ A }

elsewhile (n > 0){ B }

LLVM’s UB Model:

Branching on poison is

Undefined Behavior

Problems with LLVM’s UB

Software Foundations Lab

Page 89: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Loop Unswitching (LU)

36

while (n > 0) {if (cond)A

elseB

}

if (cond)while (n > 0){ A }

elsewhile (n > 0){ B }

0 poison

poison 0

LLVM’s UB Model:

Branching on poison is

Undefined Behavior

Problems with LLVM’s UB

Software Foundations Lab

Page 90: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Loop Unswitching (LU)

36

while (n > 0) {if (cond)A

elseB

}

if (cond)while (n > 0){ A }

elsewhile (n > 0){ B }

0 poison

poison 0

LLVM’s UB Model:

Branching on poison is

Undefined Behavior

Problems with LLVM’s UB

UB

Software Foundations Lab

Page 91: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Loop Unswitching (LU)

36

while (n > 0) {if (cond)A

elseB

}

if (cond)while (n > 0){ A }

elsewhile (n > 0){ B }

0 poison

poison 0

LLVM’s UB Model:

Branching on poison is

Undefined Behavior

Problems with LLVM’s UB

UB

Software Foundations Lab

Page 92: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Inconsistency in LLVM

• GVN + LU is inconsistent.

• We found a miscompilation bug in LLVMdue to the inconsistency (LLVM Bugzilla 31652).

- It is being discussed in the community

- No solution has been found yet

37Software Foundations Lab

Page 93: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Overview

38

Existing Approaches

Defined values

Undef. values

Poison values

Can’t Control

Poison

GVN + LU

More

Defined

Complex

Inconsistent

UB

Software Foundations Lab

Page 94: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Overview

38

𝒇𝒓𝒆𝒆𝒛𝒆

Existing Approaches Our Approach

Defined values

Undef. values

Poison values

Defined values

Poison values

Can’t Control

Poison

GVN + LU

More

Defined

Complex

Inconsistent

Simpler

UB UB

Software Foundations Lab

Page 95: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Overview

38

𝒇𝒓𝒆𝒆𝒛𝒆

Existing Approaches Our Approach

Defined values

Undef. values

Poison values

Defined values

Poison values

Can’t Control

Poison

GVN + LU

More

Defined

Can Control

Poison

Complex

Inconsistent

Simpler

UB UB

Software Foundations Lab

Page 96: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Overview

38

𝒇𝒓𝒆𝒆𝒛𝒆

Existing Approaches Our Approach

Defined values

Undef. values

Poison values

Defined values

Poison values

Can’t Control

Poison

GVN + LU

More

Defined

Can Control

Poison

Complex

Inconsistent

Simpler

Consistent

UB UB

Software Foundations Lab

Page 97: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Key Idea: “Freeze”

• Introduce a new instruction

• Semantics:

39

y = freeze x

When x is a defined value:

When x is a poison value:

freeze x

freeze x

0

1

2

. . .

x

Nondet. Choice of

A Defined Value

Software Foundations Lab

Page 98: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Our UB Model:

Branching on poison is

Undefined Behavior

poison

if (freeze(cond))while (n > 0){ A }

elsewhile (n > 0){ B }

(cond)

40

while (n > 0) {if (cond)A

elseB

}

0

Our Solution

Loop Unswitching

UB

Software Foundations Lab

Page 99: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Our UB Model:

Branching on poison is

Undefined Behavior

if (freeze(cond))while (n > 0){ A }

elsewhile (n > 0){ B }

poison

40

while (n > 0) {if (cond)A

elseB

}

0

Our Solution

Loop Unswitching

UB

Software Foundations Lab

Page 100: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Our UB Model:

Branching on poison is

Undefined Behavior

if (freeze(cond))while (n > 0){ A }

elsewhile (n > 0){ B }

poison

40

while (n > 0) {if (cond)A

elseB

}

true false0

Our Solution

Loop Unswitching

UB

Software Foundations Lab

Page 101: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Our UB Model:

Branching on poison is

Undefined Behavior

if (freeze(cond))while (n > 0){ A }

elsewhile (n > 0){ B }

poison

40

while (n > 0) {if (cond)A

elseB

}

true false0

Our Solution

Loop Unswitching

UB

Software Foundations Lab

Page 102: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Summary of Freeze

• Branching on freeze(poison) => Nondet.

- Used for Loop Unswitching

• Branching on poison => UB

- Used for Global Value Numbering

41

Compilers can control poison!

Software Foundations Lab

Page 103: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Summary of Freeze

• Branching on freeze(poison) => Nondet.

- Used for Loop Unswitching

• Branching on poison => UB

- Used for Global Value Numbering

41

Compilers can control poison!

Freeze can also fix many other

UB-related problems.

Software Foundations Lab

Page 104: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

// bitwise-or

k = x | 0x1

t = 100 / k

while (n > 0)

use(t)

42

// bitwise-or

k = x | 0x1

while (n > 0)

use(100 / k)

Hoisting DivisionFurther Example

Software Foundations Lab

Page 105: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

// bitwise-or

k = x | 0x1

t = 100 / k

while (n > 0)

use(t)

42

// bitwise-or

k = x | 0x1

while (n > 0)

use(100 / k)

Hoisting DivisionFurther Example

poison

0

poison

Software Foundations Lab

Page 106: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

// bitwise-or

k = x | 0x1

t = 100 / k

while (n > 0)

use(t)

42

// bitwise-or

k = x | 0x1

while (n > 0)

use(100 / k)

Hoisting DivisionFurther Example

poison poison

0

poison poison

Software Foundations Lab

Page 107: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

// bitwise-or

k = x | 0x1

t = 100 / k

while (n > 0)

use(t)

42

// bitwise-or

k = x | 0x1

while (n > 0)

use(100 / k)

Hoisting DivisionFurther Example

poison poison

0

poison poison

Software Foundations Lab

Page 108: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

// bitwise-or

k = x | 0x1

t = 100 / k

while (n > 0)

use(t)

42

// bitwise-or

k = x | 0x1

while (n > 0)

use(100 / k)

Hoisting DivisionFurther Example

poison poison

0

UB

poison poison

Software Foundations Lab

Page 109: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

LLVM does not currently support it.

// bitwise-or

k = x | 0x1

t = 100 / k

while (n > 0)

use(t)

42

// bitwise-or

k = x | 0x1

while (n > 0)

use(100 / k)

Hoisting DivisionFurther Example

poison poison

0

UB

poison poison

Software Foundations Lab

Page 110: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

LLVM does not currently support it.

// bitwise-or

k = x | 0x1

t = 100 / k

while (n > 0)

use(t)

42

// bitwise-or

k = x | 0x1

while (n > 0)

use(100 / k)

Hoisting DivisionFurther Example

poison poison

0

poison

Software Foundations Lab

Page 111: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

LLVM does not currently support it.

// bitwise-or

k = x | 0x1

t = 100 / k

while (n > 0)

use(t)

freeze(x) | 0x1

42

// bitwise-or

k = x | 0x1

while (n > 0)

use(100 / k)

Hoisting DivisionFurther Example

poison poison

0

poison

Software Foundations Lab

Page 112: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

LLVM does not currently support it.

// bitwise-or

k = x | 0x1

t = 100 / k

while (n > 0)

use(t)

freeze(x) | 0x1

42

// bitwise-or

k = x | 0x1

while (n > 0)

use(100 / k)

Hoisting DivisionFurther Example

poison poison

0

poisonA defined

value

Software Foundations Lab

Page 113: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

LLVM does not currently support it.

// bitwise-or

k = x | 0x1

t = 100 / k

while (n > 0)

use(t)

freeze(x) | 0x1

42

// bitwise-or

k = x | 0x1

while (n > 0)

use(100 / k)

Hoisting DivisionFurther Example

poison poison

0

poisonA defined

value

non-zero

Software Foundations Lab

Page 114: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

LLVM does not currently support it.Freeze can make LLVM support it!

// bitwise-or

k = x | 0x1

t = 100 / k

while (n > 0)

use(t)

freeze(x) | 0x1

42

// bitwise-or

k = x | 0x1

while (n > 0)

use(100 / k)

Hoisting DivisionFurther Example

poison poison

0

poisonA defined

value

non-zero

Software Foundations Lab

Page 115: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Implementation

• Target: LLVM 4.0 RC 4 (Mar. 2017)

• Add Freeze instruction to LLVM IR

• Bug Fixes Using Freeze

- Loop Unswitching Optimization

- C Bitfield Translation to LLVM IR

- InstCombine Optimizations

43

* More details are given in the paper

Software Foundations Lab

Page 116: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Experiment Results

• Benchmarks (4.6M LOC):

- SPEC CPU2006

- LLVM Nightly Test

- Large Single File Benchmarks

• Compilation Time: ± 1%

• Compilation Memory Usage: Max + 2%

• Generated Code Size: ± 0.5%

• Execution Time: ± 3%

44

* More details are given in the paper

Software Foundations Lab

Page 117: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Experiment Results

• Benchmarks (4.6M LOC):

- SPEC CPU2006

- LLVM Nightly Test

- Large Single File Benchmarks

• Compilation Time: ± 1%

• Compilation Memory Usage: Max + 2%

• Generated Code Size: ± 0.5%

• Execution Time: ± 3%

44

* More details are given in the paper

“Freeze” Can Fix UB Semantics

Without Significant Performance Penalty

Software Foundations Lab

Page 118: Taming Undefined Behavior in LLVM - SIGPLsigpl.or.kr/conf/2017/pdf/sigpl09_jylee.pdf · 2019-12-11 · Taming Undefined Behavior in LLVM SIGPL 2017 Summer Juneyoung Lee Software Foundations

/ 45

Summary

• Modern compilers’ UB models cannot support some textbook optimizations.

• We propose “freeze” to fix such problems.

• Freeze has little impact on performance.

45Software Foundations Lab