Buffer Overflow: A Short Study
Jonathan Hutchison
Robert Lee
Connor Mahoney
Caleb Wherry
Information Security
Mrs. Nancy Smithfield
November 13, 2007
GroupA1
Buffer overflow is a serious problem in the computer world today. It threatens almost any
piece of code and has the potential to do great damage on any system, no matter the operating
system or manufacturer. A buffer overflow attack takes seemingly harmless code and pushes more
data into it than it was written to hold, making a machine do things it normally would not do.
By some estimates, this method of attack accounts for around 50% of all malicious attacks
that currently take place in the computer world. No area of computing is totally
immune to a buffer overflow attack. It is prevalent in C/C++, SQL, Java, and even in
image processing. Malicious code can be hidden in images, and when the image is opened, the
code is executed in the background. This can cause serious damage to someone’s computer in
many ways. Buffers can be easily accessed while the person at the computer is unaware of what
is going on, making it incredibly hard to stop one of these attacks. We will discuss some of
these dangers, how they can be prevented, and what the future holds for each of them.
When looking at buffer overflows, one of the largest areas that pertains to them is
the C and C++ programming languages. These languages do not check the bounds of arrays and
vectors, which makes it easy to cause a buffer overflow and do some serious damage. But before
we get into the technical details of a buffer overflow attack in C and C++, let’s first define
what a buffer overflow really is. A buffer overflow occurs when more data is written to a buffer
than it can hold, so the excess flows over into adjacent parts of memory. But what is
a buffer, you ask. A buffer is a temporary storage area in memory. It
stores input from the keyboard or information that is currently being populated in a running
program. This is where the first steps of a buffer overflow attack start, with the basic knowledge
of how the computer buffers actually work. Since C and C++ are two of the most vulnerable
languages in use at the moment, we will look at ways to create a buffer overflow in these
languages. There are many ways to cause damage through a buffer, but we will look
specifically at the case of stack-based buffer overflows. A stack is a type
of memory storage with a distinctive “First-In-Last-Out” order: the last item placed on it is
the first one removed. Each function call places its local variables and its return address on
the stack. We will look at this type, as opposed to heap-based buffer overflows, because it is
much easier to understand and code. Take this C code example:
#include <stdio.h>
#include <string.h>

// Shell code that will be executed once the buffer is overflowed.
// It is 32-bit x86 Linux machine code that starts the shell /bin/sh.
// If the program runs with root privileges, that shell is a “root” shell.
char shellcode[] =
    "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
    "\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
    "\x80\xe8\xdc\xff\xff\xff/bin/sh";

char large_string[128];

int main(int argc, char *argv[])
{
    char buffer[96];    // buffer to overflow
    int i;
    long *long_ptr = (long *)large_string;

    // Fill large_string with the address of buffer, one word at a time
    // (32 words of 4 bytes each on 32-bit x86).
    for (i = 0; i < 32; i++)
        *(long_ptr + i) = (int)buffer;

    // Copy the shell code over the start of large_string.
    for (i = 0; i < (int)strlen(shellcode); i++)
        large_string[i] = shellcode[i];

    // The string copy function in C should be used with the utmost caution.
    // This is where the code blows up: 128 bytes are copied into a 96-byte
    // buffer, overwriting the saved return address so that the shell code
    // runs when main returns.
    strcpy(buffer, large_string);
    return 0;
}
This code, when run with root privileges on an unprotected Linux system (for example, as a
setuid-root program), hands the user a root shell without any type of authentication at all! To
really understand why this does what it does, you have to first understand that many functions
in C and C++ are extremely risky to use, such as the “strcpy()” function. There is no automatic
check in this function, or for that matter in array accesses, that makes sure the operation stays
within the space that was allocated for the buffer. This is where
knowledge of the buffer can really help someone cause a lot of damage. Because “large_string”
is longer than “buffer”, the copy runs past the end of the buffer and overwrites the return
address saved on the stack with the address of the buffer itself. When “main” returns, the
processor jumps to the shell code now sitting at the start of the buffer instead of returning
normally. The shell code then executes the program “/bin/sh”, the system shell. Since the shell
inherits the privileges of the program that was exploited, a program running as root yields a
shell with full root privileges. The root user has full control of everything on the
Linux system. Once a person has access to the root account they can do anything they want on
the system. They have full rights to anything and everything. Just by executing this simple C
code they can gain full access to a Linux system within a few minutes. Preventing these kinds of
attacks is the next topic of interest for us in this study.
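The unchecked “strcpy()” call in the example can be contrasted with a copy that refuses input that does not fit. The following is only a minimal sketch: the helper name copy_bounded and its return convention are assumptions made for illustration, not part of any standard library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * copy_bounded (hypothetical helper): copy src into dst only if the text
 * and its terminating NUL both fit in dst_size bytes.
 * Returns 0 on success. Returns -1 if src is too long, in which case dst
 * is left holding an empty string.
 */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    size_t len = strlen(src);
    if (len >= dst_size) {      /* no room for the text plus the NUL */
        dst[0] = '\0';
        return -1;
    }
    memcpy(dst, src, len + 1);  /* len + 1 copies the NUL as well */
    return 0;
}
```

Called with the 96-byte buffer and a 127-character string, this sketch returns -1 and leaves the stack untouched, where “strcpy()” would have written past the end of the buffer.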
Preventing buffer overflow attacks is almost as easy as carrying one out.
First and foremost, use trusted libraries in your code. Do not use random libraries,
because you have no clue what is really going on when you call a function from such a
library. Along with this goes the use of good programming techniques as a
programmer. Do not write code that is susceptible to overflow attacks. Make sure all
data coming from the user is in the right format and within the limits of the program or
variable that receives it. Letting a user input as much information as they want is a sure way
to open your system up to buffer overflow attacks. Also test the software before it is released.
Do not release alpha versions of software without knowing full well that the code is safe. As
the above example shows, a simple C program can cause much havoc on a Linux system. Test the
programs and make sure all data going in and out is properly checked. A software developer
should also use readily available programs such as Flawfinder and Viega’s RATS to scan
their source code for possible weaknesses where an attack could happen. Up front, these
tasks may seem pointless and costly, but in the long run they help. One of the main causes of
software failure due to buffer overflow is the ineptitude of the programmers behind the software.
Time constraints and money bonuses get in the way of the real objective of a software product.
Steps are skipped in the writing of code, no one makes a note of it, and no one ever goes back
and fixes the issues, thus causing potential threats to people who use the software.
Best Practices for Prevention of Buffer Overflow In Code
Use trusted libraries when including them in projects.
Use buffer overflow prevention software to scan code to check for potential trouble spots.
Use trusted releases of programs. Alpha and beta versions tend to contain bugs that allow buffer overflows.
Check user input for validity and make sure it is in the correct format and within the range the function expects.
Use trusted functions within trusted libraries.
Know what you are using!
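The practice of checking user input for format and range can be sketched in C as follows. This is an illustration under stated assumptions: the function name parse_long_in_range and its return convention are hypothetical, while “strtol()” and “errno” are standard C.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * parse_long_in_range (hypothetical helper): convert text to a number and
 * accept it only if the whole string is a decimal number between min and
 * max inclusive. Returns 0 and stores the value in *out on success,
 * -1 on any kind of invalid input.
 */
int parse_long_in_range(const char *text, long min, long max, long *out)
{
    assert(text != NULL && out != NULL && min <= max);

    char *end = NULL;
    errno = 0;
    long value = strtol(text, &end, 10);

    if (end == text)                    /* no digits were found */
        return -1;
    if (*end != '\0')                   /* trailing characters after the number */
        return -1;
    if (errno == ERANGE)                /* number does not fit in a long */
        return -1;
    if (value < min || value > max)     /* outside the range the caller accepts */
        return -1;

    *out = value;
    return 0;
}
```

The text passed in would typically be read with “fgets()”, which takes the size of the destination buffer, so that the read itself cannot overflow.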
The next big area of interest in buffer overflow is SQL (Structured Query
Language). SQL is used in the development of software databases and is widely used with online
database systems as well. Together these make a large target for exploits that cause
buffer overflows in SQL database servers. These attacks are done by sending input that exceeds
the space of the variable assigned to receive it. The system can then be attacked by overwriting
data in order to execute malicious code or just cause corruption of data. For example, in 2002 an
advisory was issued by NGS Software Insight Security Research describing remote buffer
overrun vulnerabilities in Microsoft SQL Server 2000. Both stack-based and heap-based overflow
vulnerabilities existed, allowing a hacker to execute code of his or her choice without ever
needing to authenticate with the server. In each case, the security vulnerabilities in question
were reachable through UDP port 1434, the Microsoft SQL Monitor port. This port is routinely
used by legitimate clients in order to establish a connection to the server. UDP, or “User Datagram
Protocol”, is a connectionless communications protocol operating on the transport layer. Unlike
TCP, UDP does not need to establish a connection to function. This makes it faster than TCP, but
easier to spoof. This, combined with the stateless nature of UDP, also makes it easy to bypass
firewalls by appearing to be legitimate traffic.
Ordinarily, traffic through this port will
contain a certain byte that lets the SQL Monitor know that a client is trying to dynamically
discover how to connect to the server. However, other instructions can be sent through this port.
For instance, if a hacker wanted to exploit the stack-based overflow vulnerability, the first byte
of the packet can be set to 0x04, which tells the SQL Monitor to open a registry key using the
remaining data supplied in the packet. If this is followed by a large number of bytes, an overflow
occurs and the stack, including the saved return address, is overwritten. The hacker can include a
new return address, and when the current function finishes the processor will go there instead
and begin executing the malicious code.
A heap-based overflow attack can also be carried out
using a similar process. If the data appended to a packet is formatted a certain way, it can
overflow the heap-based buffer and exploit a vulnerability in SQL Server in which return values
are not validated before being passed to the next function. The result is a corrupted heap
structure, which has the effect of a denial of service attack.
When this vulnerability was
discovered in 2002, there wasn’t much that could be done to eliminate the risk until Microsoft
was able to create a patch. The risk could be reduced, however, by following best practices and
adding rules to the firewall to filter out packets destined for port 1434 that, while
they might ordinarily look legitimate, shouldn’t be destined for that port. An example would be a
DNS query response. Microsoft rated these particular threats as critical and released a patch
with its July 2002 security bulletin.
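The stack-based overflow described above comes down to a handler copying the data that follows the request byte into a fixed-size buffer without checking its length. A UDP datagram arrives with a known length, so the check is cheap to make. The sketch below shows the idea only: handle_request, KEY_NAME_MAX, and the packet layout are hypothetical and are not taken from SQL Server.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define KEY_NAME_MAX 64   /* hypothetical limit, chosen for this example */

/*
 * handle_request (hypothetical): the first byte of the packet is a request
 * type and the remaining bytes are a name. Returns 0 and fills key_name
 * with a NUL-terminated copy on success, or -1 if the packet carries no
 * name or the name would not fit.
 */
int handle_request(const unsigned char *packet, size_t packet_len,
                   char key_name[KEY_NAME_MAX])
{
    assert(packet != NULL && key_name != NULL);

    if (packet_len < 2)
        return -1;                      /* need a type byte plus some data */

    size_t name_len = packet_len - 1;   /* bytes after the type byte */
    if (name_len >= KEY_NAME_MAX)
        return -1;                      /* reject rather than overflow */

    memcpy(key_name, packet + 1, name_len);
    key_name[name_len] = '\0';
    return 0;
}
```

A packet consisting of the byte 0x04 followed by several hundred bytes is simply rejected by this sketch, so the saved return address is never reached.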
The last type of buffer overflow that we will look at involves
steganography. Steganography is the act of hiding messages where they will not be
detected. The modern use of this concept has been hiding information in image or video files.
However, messages aren’t the only things that can be hidden in digital images. Buffer overflows
can occur in almost any unprotected area. While an exploit usually carries its payload in a
malicious program, several image formats are vulnerable to this type of exploit as well. When a
computer attempts to display the image, the code is unknowingly inserted into the machine’s
memory via a buffer overflow attack. Steganography can be used in both traditional and digital
images, although it is much harder to detect in digital pictures. One use of this has caused
controversy among privacy groups. Many laser printer manufacturers have begun printing tiny
yellow dots onto printed pages. These yellow dots are embedded into
every document printed, and their pattern is unique to each printer, indicating the printer
model, manufacturer, and serial number. The government originally asked manufacturers to
include these dots as a way to track counterfeit bills, and many companies have complied.
A view of dots printed under a blue LED. (http://w2.eff.org/Privacy/printers/docucolor/)
This “traditional” form of steganography is considered easier to detect, as it can be seen by
anyone looking in the correct place with a microscope. However, digital steganography is much
harder to detect. One reason is the amount of data contained in a digital photograph.
If a small message were embedded in a high-resolution photograph, it would be nearly
impossible to detect unless the exact location of the message in the data was known. For
instance, in a 24-bit image, each pixel totals 24 bits (8 bits each for red, green, and blue).
That leaves 256 shades of each color, for each pixel. If someone wanted to write a message using
only the last bit of every color value, they would only need to modify 3 pixels for every letter,
since an 8-bit character fits in the 9 low bits that 3 pixels provide. One shade of
difference per color, per pixel, would be almost imperceptible to the human eye, even if one
were comparing it with the original side by side. Text isn’t the only data that can be hidden using
digital steganography. It is even possible to hide a lower-resolution image inside of a
higher-resolution image, with almost no perceived difference in the original image. Using this
technique, it may even be possible to “hide” a video inside of another video, if one were to know
how to extract the data. Despite the possibilities with steganography, the real security issues are
related to buffer overflows in images. Many major systems and image formats have been exploited.
For Windows, there are exploits for formats such as JPEG, BMP, and GIF files. For Linux, there
was a hole found in the Portable Network Graphics format (PNG). However, the most recent
example has been a TIFF exploit in both the PlayStation Portable game system and the Apple
iPhone.
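The last-bit technique described above can be sketched in a few lines of C. This is a minimal illustration, not a description of any particular steganography tool: the function names and the choice to store the most significant bit first are assumptions. Each array element stands for one 8-bit color value, so eight values cover the red, green, and blue of two pixels plus two colors of a third.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * lsb_embed_byte (hypothetical): hide one byte in the lowest bit of eight
 * consecutive color values, most significant bit first. Each color value
 * changes by at most one shade.
 */
void lsb_embed_byte(uint8_t channels[8], uint8_t secret)
{
    assert(channels != NULL);
    for (size_t i = 0; i < 8; i++) {
        uint8_t bit = (uint8_t)((secret >> (7 - i)) & 1u);
        channels[i] = (uint8_t)((channels[i] & 0xFEu) | bit);
    }
}

/* lsb_extract_byte (hypothetical): rebuild the byte from the eight low bits. */
uint8_t lsb_extract_byte(const uint8_t channels[8])
{
    assert(channels != NULL);
    uint8_t value = 0;
    for (size_t i = 0; i < 8; i++)
        value = (uint8_t)((value << 1) | (channels[i] & 1u));
    return value;
}
```

Embedding a letter and extracting it again returns the same letter, while the ninth color value of the third pixel is left unchanged.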
The Apple iPhone is a closed system. While it has a fully functioning operating system,
it is limited to only Apple applications. However, a group of programmers found a way to get
around this. By visiting a website on the iPhone, http://www.jailbreakme.com/, the phone would
download a TIFF image file. The image had code embedded into it. This code crashed the
internet browser and overwrote part of the file system. Once this occurred, the user was free to
install any compatible application they liked. Buffer overflows in images generally all work in the
same way. The program opening the image expects it to be a certain size, and allocates memory
for that size. However, it is possible to trick the program by way of an altered file header, or by
altering the structure of an image. When the image is loaded, it unexpectedly overfills the
buffer, and the program keeps writing data into the adjacent memory. If crafted correctly, the
overflowing data can overwrite a pointer or return address with a location of the attacker’s
choosing, allowing a program to be executed, or further code to be written.
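The altered-header trick works when a loader trusts the size the file declares. A loader can defend itself by checking the declared size against both the data actually present and the memory it has set aside. The sketch below assumes a simplified, uncompressed image with a width, height, and bytes-per-pixel field; load_pixels and its parameters are hypothetical and do not correspond to any real image library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * load_pixels (hypothetical): copy pixel data into dst only when the size
 * implied by the header fields matches the data actually present and fits
 * in dst. Returns 0 on success, -1 if the image is rejected.
 */
int load_pixels(uint32_t width, uint32_t height, uint32_t bytes_per_pixel,
                const uint8_t *data, size_t data_len,
                uint8_t *dst, size_t dst_size)
{
    assert(data != NULL && dst != NULL);

    if (width == 0 || height == 0 || bytes_per_pixel == 0)
        return -1;

    /* Multiply step by step, refusing any product that would wrap around. */
    size_t needed = width;
    if (needed > SIZE_MAX / height)
        return -1;
    needed *= height;
    if (needed > SIZE_MAX / bytes_per_pixel)
        return -1;
    needed *= bytes_per_pixel;

    if (needed != data_len)     /* header disagrees with the file contents */
        return -1;
    if (needed > dst_size)      /* image is larger than the buffer */
        return -1;

    memcpy(dst, data, needed);
    return 0;
}
```

The overflow check on the multiplication matters because a header with very large dimensions could otherwise wrap around to a small number and pass the later comparisons.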
There are several ways to protect a computer from a buffer overflow attack in an image.
The most secure would be to disable loading of images in the web browser, and only open
pictures from trusted sources. This, however, is not practical. Another way is to periodically
check for updates and patches for these vulnerabilities, such as Microsoft’s patch for the GDI+
vulnerability that affected JPEGs in Windows operating systems. A third and easiest
approach is to simply be on the lookout. Don’t visit “shady” websites, or anywhere else that
may try to infect your computer. In conclusion, images and videos can be used for many
purposes that were not originally intended. They can carry hidden messages without being
detected, or contain embedded code. This code can be used to simply crash a system, or to
overwrite memory and execute malicious code. To prevent this, always apply the latest patches
and follow current security advice for your computers and digital devices.
Buffer overflow is a major problem in the world of computers as we know it. Someone
can easily take over a whole computer network by using buffer overflow to gain access to private
accounts, destroy valuable information by getting into an SQL database, or just mess around with
things by using this well-known type of exploit. With the speed of computers roughly doubling
every 18 months, buffer overflows are always going to be prevalent. Technology is
advancing so fast that the “minor” problems get overlooked. The business world of computers
today is more worried about how fast the machine will run and how much money can be made off of
the system. A single unchecked array in one line out of 50 million lines of code doesn’t mean
anything to a lot of people anymore. Software deadlines and the promise of extra money drive
programmers to put out less than perfect software. As long as this is the case, buffer overflow
will always be a problem with code. A software developer who cuts corners because of deadlines
is almost certainly going to have trouble down the road with their code. A good hacker can find
vulnerabilities in any system, no matter how small they may seem. All they need is one small hole
and they are in. Buffer overflow attacks will become more and more troublesome as the
technological world leans more towards marketability instead of user satisfaction.
Works Cited
“Buffer Overruns in SQL Server 2000.” Microsoft Security. 24 July 2002.
<http://www.microsoft.com/technet/security>
Cobb, Michael. “How Buffer-overflow Vulnerabilities Occur.” Search Security. 12 December
2005. <http://searchsecurity.techtarget.com>
Fayolle, Pierre-Alain. “A Buffer Overflow Study: Attacks & Defense.” Network & Distributed
Systems. 2002.
Ghost_Rider. “Introduction to Buffer Overflow.” <http://www.governmentsecurity.org>
Grover, Sandeep. “Buffer Overflow Attacks & Their Countermeasures.” Linux Journal. 10
March 2003. <http://www.linuxjournal.com/article/6701>
Kharrazi, Mehdi. “Image Steganography: Concepts and Practice.” 22 April 2004.
Ogorkiewicz, Maciej. “Analysis of Buffer Overflow Attacks.” Windows Security. 23 July 2003.
<http://www.windowsecurity.com>
“Unauthenticated Remote Compromise in MS SQL Server 2000.” 25 July 2002.
<http://www.nextgenss.com/advisories/mssql-udp.txt>