Prof. Dr. Michael Backes

Control Flow Hijacking Attacks

Control Flow Hijacking

[Figure: malicious.pdf is opened in a viewer that contains a bug in its PDF parser; control of the viewer can be hijacked.]

Foundations of Cybersecurity 2016

Control Flow Hijacking – Principles

Normal Control Flow (depicted as Control Flow Graph – CFG)

[Figure: CFG with the nodes Start, User Action, Open File, Delete File and Display Content.]

Control Flow Hijacking – Principles

Attacked Control Flow (depicted as Control Flow Graph – CFG)

[Figure: the same CFG (Start, User Action, Open File, Delete File, Display Content) with one additional node that is not part of the program, labelled with garbled characters.]

Control Flow Hijacking Attacks

Attacker’s goal

• Take over target machine (e.g. web server)

• Execute arbitrary code on target by hijacking application control flow

Control Flow Hijacking Attacks

Attack Pattern is Always Similar

Find bug in program → Create code to exploit bug and control program → Feed vulnerable program with exploit → Hijacked

Common Types of Control Flow Attacks

Discover which assumptions are made → Craft exploit outside assumptions

Assumptions are vulnerabilities

Example: Coffee shop, handing out ordered drinks

- Assumptions:

• Only one Bill in shop right now?

• Legitimate Bill picks up? (ID check?)

• Name correctly understood/pronounced by staff?

- Exploit: Pretend to be Bill, be faster than original Bill

Computer program: What we commonly assume

We write our code in languages that offer several layers of abstraction over machine code; even C

- High-level statements: “=” (assign), “;” (seq), if, while, for, etc.

- Procedures / functions

Naturally, our execution model assumes:

- Basic statements (e.g. assignment) are atomic

- Only one of the branches of an if statement can be taken

- Functions start at the beginning

- They (typically) execute from beginning to end

- And, when done, they return to their call site

- Only the code in the program can be executed

- The set of executable instructions is limited to those output during compilation of the program

Computer program: The Ugly Truth

We write our code in languages that offer several layers of abstraction over machine code; even C

- High-level statements: “=” (assign), “;” (seq), if, while, for, etc.

- Procedures / functions

But, actually, at the level of machine code

- Each basic statement compiled down to many instructions

- There is no restriction on the target of a jump

- Can start executing in the middle of functions

- A fragment of a function may be executed

- Returns can go to any program instruction

- Dead code (e.g. unused library functions) can be executed

- On the x86, can start executing not only in the middle of functions, but in the middle of instructions!

Common Types of Control Flow Attacks

Discover which assumptions are made → Craft exploit outside assumptions → Hijacked

Assumptions are vulnerabilities

Two assumptions often exploited:

- Target buffer is large enough for source data

• Buffer overflows deliberately break this assumption

- Computer integers behave like math integers

• Integer overflows violate this assumption

Control Flow Hijacking Attacks

There are many attacks that execute code on an attacker’s behalf

Buffer Overflows

Integer Overflows

Return-Oriented Programming

Format String Vulnerabilities

Use after Free

This lecture

Buffer Overflow – History

1972 : First described – no documented use

1988 : “Morris Worm” hits the wild; it exploited a buffer overflow in the Unix service finger

... Many, many since then...

6,203

Buffer Overflow

Why is an assumption about data length dangerous?

Computer programs organise internal data values in variables stored in memory

[Figure: PC memory holding the program “Car configurator” next to another program. Variables of the car configurator: Colour = Red, Nav = Yes, Email = [email protected], Price = 19750. 50 characters are reserved for the email: the memory for the “email” variable (size: 50 characters) is followed by the memory for the “price” variable. Example input: looooooooooooooooooongemail]

Buffer Overflow and Changing Variable Values

The car configurator example allows us to overwrite the associated price using a very long email address

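[Figure: an email input of 51 characters, looooooooooooo...oooooongemail7, is written into the email memory. Characters 1 to 50 fill the reserved positions; the 51st character, 7, lands in the memory of the price variable, which previously held 19750. Result: car for 7 €.]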

Buffer Overflow and Control Flow Hijacking

This technique can also be used to overwrite program control information that programs internally store:

- Where they came from when calling a sub-routine

- Local variables that are only used within a sub-routine

This data is organized in a memory region called the stack

Stack operations

[Figure: stacks holding the letters of “CySec”, illustrating the operations push(c), pop(), pop() and push(e). push places an element on top of the stack; pop removes the topmost element.]

Control Flow and the Stack

Calling a simple program on shell:
$ prog ABCD

Source code of prog:

    01  void dangerous()
    02  {
    03      fprintf(stdout, "Hidden functionality!\n");
    04  }
    05
    06  int bar(char *arg, char *out)
    07  {
    08      strcpy(out, arg);
    09      return 0;
    10  }
    11
    12  int foo(char *argv[])
    13  {
    14      char buf[128];
    15      bar(argv[1], buf);
    16  }
    17
    18  int main(int argc, char *argv[])
    19  {
    20      foo(argv);
    21      return 0;
    22  }

Control flow: the entry point is main.

Stack during execution (simplified), in the order drawn:

- Return address: 20
- Address argv
- Memory for buf (128 bytes), holding ABCD
- Address argv[1]
- Address of buf
- Return address: 15

[Figure: an arrow marks the direction in which the buffer is filled.]

[Figure: the same slide after the program has run. ABCD fits into buf, the return addresses 15 and 20 are left intact, and control flow proceeds from the entry point to the exit.]

Buffer Overflow and Control Flow Hijacking

Calling a simple program on shell:
$ prog AAAAAAAAAAAAAAAAA...AAAAAAAAAA01

Source code of prog: as above.

Stack during execution (simplified), in the order drawn:

- Return address: 20, overwritten with Return address: 01
- Address argv
- Memory for buf (128 bytes), filled with 128 bytes of A
- Address argv[1]
- Address of buf
- Return address: 15

Control flow hijacked! When foo returns, execution continues at line 01 (dangerous) instead of line 20.

Buffer Overflow

We have just created our first Control Flow Hijacking attack!

Control Flow Hijacking Attack

What can we do with an overwritten return address?

Everything!

• Call an arbitrary address in memory

  Call arbitrary functions or only parts of functions

• Call our own code by pointing the return address to the stack that we have overwritten

Executing shell code

$ prog \x90\x90…\x90\xeb\x1f\x5e…0x01234

Source code of prog: as above.

Input layout: “NOP Slide” (\x90\x90\x90\x90 \x90\x90\x90\x90 ..., to account for stack changes), then the shellcode (\xeb\x1f\x5e\x89 \x76\x08\x31\xc0 ...), then the address of buf.

Stack during execution (simplified), in the order drawn:

- Return address: 20, overwritten with Return address: 0x1234
- Address argv
- Memory for buf (128 bytes) at address 0x1234, now holding the NOP slide and the shellcode
- Address argv[1]
- Address of buf
- Return address: 15

Execute shellcode! After pop(): shellcode still present in memory at old address of buf!

Reasons for Buffer Overflows

The lower the level of abstraction of a programming language, the more vulnerable it is to buffer overflows.

Assembler (2GL): Intel, ARM, MIPS
- Example: mov r8, r9 / add r8, $42 / test r8 / jne 0x48aac1
- Directly interpreted by the CPU. Must be written for a particular CPU
- PROGRAMMER MUST ORGANISE BUFFER LENGTHS AND VARIABLE LAYOUT

High-Level Languages (3GL): C, C++, ObjC
- Example: int main(int argc, char* argv[])
- Translated to assembler
- PROGRAMMER MUST ORGANISE BUFFER LENGTHS AND VARIABLE LAYOUT

High-Level Languages (Advanced 3GL): C#, Java, Python
- Example: Object o = new Map<int, String>; o.getIterator();
- Compiled at run-time
- LANGUAGE ORGANISES AND IS SAFE

2GL = 2nd Generation Language, 3GL = 3rd Generation Language

Finding Buffer Overflows

Finding Buffer Overflows is useful

- for programmers to test their software for vulnerabilities

- for attackers to ...

Methods (white box vs. black box)

- Source Code Analysis (white box)

• Manual code analysis

• Automatic Tools

- Fuzzing (black box)

• Input massive amounts of random data to test whether program crashes

Source Code Analysis

Manually or automatically scan source for fixed-length buffers that can be overwritten by user input

Only necessary for lower 3GL languages (C, C++, ObjC, ...)

Fixes:

- Either: allocate dynamic buffer as needed
  char* buf = malloc(length);

- Or: check length when touching buffers
  strncpy() instead of strcpy()

Fuzzing

Fuzzing feeds unexpected input to a program and makes it explore corner cases in the hopes that it eventually crashes

Crash dump (=memory content at time of crash) is analyzed to guess the cause of the crash

Fuzzing is not an exact science

Very helpful as a starting point

Fuzzing

Simple Fuzzing

- Run target app on local machine

- Feed input with long string “$$$$$$$$$$$$$$$$$$$$....”

- If app crashes, search crash dump for “$$$$$” to find overflow location

Fuzzing

Advanced Fuzzing

- Run target app on local machine

- Feed input with a long string in the form of a pattern in which each three-character group is unique and thus identifies its position:
  Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8Aa9Ab0Ab1Ab2Ab3Ab4Ab5Ab6Ab7Ab8Ab9Ac0A...

- If app crashes, check the return address on the stack and calculate at which position of the generated string its value occurs:
  Aa0 = 0, Aa1 = 3, Aa2 = 6, Aa3 = 9, ...

- Now we know the exact buffer length and the position of the return address to overwrite

The mother of all suspicious files

https://xkcd.com/1247/

Demo