
October 28, 2010 COMS W4156

COMS W4156: Advanced Software Engineering

Prof. Gail Kaiser


http://bank.cs.columbia.edu/classes/cs4156/




Topics covered in this lecture

• Security Testing

• Stress Testing


Security Testing


Software Security Overview

• Software failures usually happen spontaneously in the real world, without intentional mischief

• Software security is about making software behave correctly in the presence of a malicious attack

• Standard software testing literature is concerned only with what happens when software fails, regardless of intent

• The difference between software safety and software security is the presence of an intelligent adversary intent on breaking the system


Security Flaws

• Security flaws are any conditions or circumstances that can result in
  – denial of service to authorized users
  – provision of service to unauthorized users
  – unauthorized disclosure of data
  – unauthorized destruction or modification of data


Security Flaws

• Any part of a program that can cause the system to violate its security requirements, usually concerning
  – identification and authentication of users (who are you?)
  – authorization of particular actions (are you allowed to do it?)
  – accountability for actions taken (who did it?)


Genesis of Security Flaws

• May be introduced intentionally or inadvertently

• Different strategies can be used to avoid, detect or compensate for accidental flaws as opposed to those intentionally inserted


Accidentally Introduced

• aka vulnerability

• Increasing the resources devoted to code inspections and testing may be reasonably effective in reducing the number of accidentally introduced flaws

• Can be difficult to detect because residual flaws may be more likely to occur in rarely-invoked parts of the software


Maliciously Introduced

• May be most productive to take measures to hire more trustworthy programmers and devote more effort to penetration testing (and educate users not to download software from the Web!)

• Can be difficult to detect because it has been intentionally hidden


Genesis of Security Flaws

Characterizing “intent” is tricky:

• Some features intentionally placed in programs can at the same time inadvertently introduce security flaws, e.g., a feature that facilitates remote debugging or system maintenance may at the same time provide a trapdoor

• A trapdoor is a hidden piece of code that responds to a special input, allowing its user access to resources without passing through normal security enforcement


Malicious Flaws: Trap Door

• For example, an automated teller machine (ATM) might be required to check a personal identification number (PIN) read from a card against the number keyed in by the user.
  – If the numbers match, the user is permitted to enter transactions
  – By adding a disjunct to the condition that implements this test, the programmer can provide a trapdoor:

      if PINcard = PINkeyed OR PINkeyed = 9999
      then {permit transactions}

  – The code in this example would be easy for a code reviewer, although not an ATM user, to spot
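The effect of the extra disjunct is easy to see when both versions of the check sit side by side. The following is a minimal Python sketch of the slide's pseudocode; the function names are hypothetical and chosen only for illustration.

```python
def pin_check_intended(pin_card: str, pin_keyed: str) -> bool:
    """Intended behaviour: permit transactions only when the PINs match."""
    return pin_card == pin_keyed


def pin_check_trapdoor(pin_card: str, pin_keyed: str) -> bool:
    """Same test plus the extra disjunct: keying 9999 always passes."""
    return pin_card == pin_keyed or pin_keyed == "9999"
```

A reviewer comparing the two conditions sees the difference at once, whereas an ATM user would only discover it by keying in the magic value.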


Malicious Flaws: Trojan Horse

• Trojan horse generally refers to a program that masquerades as a useful service but exploits rights of the program’s user—rights not possessed by the author of the Trojan horse—in a way the user does not intend or realize

• A Trojan horse (or other malware) that replicates itself by copying its code into other program files is a virus

• Malware that replicates itself by creating new processes or files to contain its code, instead of modifying existing storage entities, is a worm


Malicious Flaws: Time Bomb

• A time-bomb is a piece of code that remains dormant in the host system until a certain detonation time or event occurs

• Triggered, a time-bomb may deny service by crashing the system, deleting files, or degrading system response time

• Might be placed within either a replicating or a non-replicating Trojan horse


Inadvertent Flaws: Domain flaw

• Domain flaws, which correspond to holes in fences, occur when the intended boundaries between protection environments are porous

• For example, a user creates a new file and discovers that it contains information from a file deleted by a different user (this can also occur with in-memory objects)


Inadvertent Flaws: Serialization flaw

• A serialization flaw permits the asynchronous behavior of different system components to be exploited to cause a security violation

• A program may appear to correctly validate all of its parameters, but the flaw permits the asynchronous behavior of another program to change one of those parameters after it has been checked but before it is used (TOCTTOU = time-of-check-to-time-of-use)
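A time-of-check-to-time-of-use gap can be simulated deterministically. The sketch below is an illustration under stated assumptions, not a description of any particular system: the `interfere` callback is a hypothetical stand-in for the other program, and the size limit is an arbitrary example of a validated parameter. The first reader checks the file by name and then opens it by name; the second opens the file once and enforces the limit on the bytes it actually reads.

```python
import os
import tempfile


def read_if_small_racy(path, limit, interfere=lambda: None):
    """Check the size by name, then open by name: two separate lookups."""
    if os.path.getsize(path) > limit:      # time of check
        raise ValueError("file too large")
    interfere()                            # stands in for another program
    with open(path, "rb") as f:            # time of use
        return f.read()


def read_if_small_safer(path, limit, interfere=lambda: None):
    """Open once, then enforce the limit on the bytes actually read."""
    with open(path, "rb") as f:
        interfere()
        data = f.read(limit + 1)           # never read more than limit + 1 bytes
    if len(data) > limit:
        raise ValueError("file too large")
    return data


def demo():
    """Return (bytes the racy reader accepted, whether the safer reader refused)."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "input.txt")

        def write(content):
            with open(path, "wb") as f:
                f.write(content)

        def grow():
            write(b"x" * 100)              # the change made after the check

        write(b"ok")
        accepted = len(read_if_small_racy(path, 10, grow))
        write(b"ok")
        try:
            read_if_small_safer(path, 10, grow)
            refused = False
        except ValueError:
            refused = True
        return accepted, refused
```

In this simulation the racy reader returns 100 bytes despite a limit of 10, because the check and the use refer to different states of the file.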


Inadvertent Flaws: Aliasing flaw

• An aliasing flaw occurs where two names exist for the same object, which can cause its contents to change unexpectedly and, consequently, invalidate checks already applied to it


Inadvertent Flaws: Validation flaw

• Validation flaws occur when a program fails to check that the parameters supplied or returned to it conform to its assumptions about them

• These assumptions may include the number of parameters provided, the type of each, the location or maximum length of a buffer, or the access permissions on a file


Inadvertent Flaws: Validation flaw

• Incomplete validation – where some but not all parameters are checked

• Inconsistent validation - where different interface routines to a common data structure fail to apply the same set of checks


Inadvertent Flaws: Identification/Authentication flaw

• An identification/authentication flaw is one that permits a protected operation to be invoked without sufficiently checking the identity and authority of the invoking agent

• Could be counted as validation flaws, since presumably some routine is failing to validate authorizations properly


Inadvertent Flaws: Boundary condition flaw

• Boundary condition flaws typically reflect omission of checks to assure constraints are not exceeded (e.g., on buffer size, table size, file allocation, or other resource consumption)

• These flaws may lead to system crashes or degraded service, or they may cause unpredictable behavior

• One of the most widely exploited vulnerabilities


Buffer Overflows

• Simple bugs can be exploited to cause security problems

• A lot of the challenge is the immense popularity of the C programming language for systems programs (e.g., OS, application server, web server, database system):
  – Character strings in C are arrays of chars
  – There is no array bounds checking done in C
  – Strings are often allocated on the stack

• Attacker’s goal: overflow array in a controlled fashion, to inject executable code onto the stack and overwrite the return address to point to it


Buffer Overflow Solutions

• The role of specifications:
  “File names may be up to 1024 bytes long”
  vs.
  “File names may be up to 1024 bytes long; longer file names must be rejected”

• Alerts programmer and tester to the real requirements
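The second wording of the specification translates directly into a check that both programmer and tester can exercise. A minimal sketch, assuming names are measured in UTF-8 bytes (the encoding and the function name are assumptions, not part of the specification quoted above):

```python
MAX_FILE_NAME_BYTES = 1024  # limit taken from the example specification


def check_file_name(name: str) -> bytes:
    """Return the encoded name, or raise if it exceeds the specified limit."""
    encoded = name.encode("utf-8")  # assumption: the limit counts UTF-8 bytes
    if len(encoded) > MAX_FILE_NAME_BYTES:
        raise ValueError("file name longer than 1024 bytes")
    return encoded
```

Test cases follow from the requirement: a name of exactly 1024 bytes must be accepted and a name of 1025 bytes must be rejected.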


Buffer Overflow Solutions

• C:
  – Check buffer lengths
  – Use safer library functions:
    fgets() instead of gets()
    strncpy() instead of strcpy()
    strncat() instead of strcat()
    snprintf() instead of sprintf()

• Use C++ and class String

• Use Java or C#

• Canaries, heap allocation, protected memory pages, …


Inadvertent flaws: Debug Commands

• Induce a server to enter debug mode

• Implicit debugging-related elements left in production code and I/O messages

• Debugging code often does not validate data and may access unintended parts of application


Inadvertent flaws: Default Accounts

• Many applications have at least one user activated by default, in many cases with a standard password

• Administrator, test, guest accounts with varying privileges


So what to do?


Validate Input and Output

• Trust nothing supplied by the user

• User input and output to and from the system is the route for malicious payloads into or out of the system

• All user input and user output should be checked to ensure it is both appropriate and expected

• The correct strategy for dealing with system input and output is to allow only explicitly defined characteristics and drop all other data

• A common mistake is to filter for specific strings or payloads (“signatures”) in the belief that specific problems can be prevented


Fail Securely

• Any security mechanism should be designed in such a way that when it fails, it fails closed

• It should fail to a state that rejects all subsequent security-related requests rather than allows them

• Example: If a firewall fails it should drop all subsequent packets (catastrophic vs. graceful failure)
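A minimal sketch of a mechanism that fails closed, under the assumption that the security decision is delegated to a `policy_check` callable (a hypothetical name). Once the check itself fails, the gate latches shut and rejects every later request as well.

```python
class FailClosedGate:
    """Wraps a policy check so that any internal failure closes the gate."""

    def __init__(self, policy_check):
        self._policy_check = policy_check
        self._failed = False

    def allows(self, request) -> bool:
        if self._failed:
            return False                    # latched closed after a failure
        try:
            # Only an explicit True counts as permission.
            return self._policy_check(request) is True
        except Exception:
            self._failed = True             # the mechanism failed: reject from now on
            return False
```

The design choice is that an error is treated like a denial, never like a missing rule that lets the request through.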


Keep It Simple

• If a security system is too complex for its user base, it will either not be used or users will try to find measures to bypass it

• Economy of mechanism, psychological acceptability

• Do not expect users to set up 12 passwords and then answer the system's request for one (randomly selected) of those 12


Use and Reuse Trusted Components

• Invariably other system designers (either on your development team or on the Internet) have faced the same problems as you

• In many cases they will have improved components through an iterative process and learned from common mistakes along the way

• When someone else has got it right, take advantage of it!


Defense in Depth

• Relying on one component to perform its function 100% of the time is unrealistic

• Good systems don't predict the unexpected, but plan for it

• If one component fails to catch a security event, a second one should catch it (multiple security layers)


Only as Secure as the Weakest Link

• “This system is 100% secure, it uses 128-bit SSL”

• The focus of security mechanisms is often in the wrong place: as in the real world, there is no point in placing all of one's locks on the front door only to leave the back door swinging on its hinges

• Attackers will find the weakest point and attempt to exploit it


Security By Obscurity Won’t Work

• Hiding things from prying eyes may buy some amount of time

• But obscuring information is very different from protecting it

• You are relying on the premise that no one will stumble onto your obfuscation

• Related to open design: The security of a mechanism should not depend on the secrecy of its design or implementation


Least Privilege

• Systems should be designed in such a way that they run with the least amount of system privilege they need to do their job (the "need to know" approach)

• If an account doesn't need root privileges to operate, don't assign them (e.g., run as “nobody”, not as <myuserid> or “root”)

• Minimize time that privilege can be used


Fail-Safe Defaults

• Unless a subject is given explicit access to an object, it should be denied access

• Allow as default - in case of mistake, access allowed, often not noticed

• Deny as default – in case of mistake, access denied, usually noticed quickly (legit user screaming!)
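Deny-as-default falls out naturally when access is looked up in a table of explicit grants. A minimal sketch; the table contents and names are hypothetical.

```python
# Hypothetical access table: (subject, object) -> set of permitted actions.
ACCESS = {
    ("alice", "report.txt"): {"read", "write"},
    ("bob", "report.txt"): {"read"},
}


def permitted(subject: str, obj: str, action: str) -> bool:
    """Deny unless an explicit grant exists for this subject, object and action."""
    return action in ACCESS.get((subject, obj), set())
```

A forgotten entry in the table results in a denied request, which the legitimate user reports quickly, rather than in silent access.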


Compartmentalization (Separation of Privileges)

• Compartmentalizing users, processes and data helps contain problems if they do occur

• If one user account or process is compromised, others will remain ok

• Example: “sandbox” email attachments


Complete Mediation

• All accesses to objects must be checked to ensure that they are allowed

• Including normal operation, initialization, maintenance

• Performance vs. security issues: access checks should not be cached

• Example problem: Forked and exec’d processes inherit file descriptors, with permissions checked only when first opened


Data Validation

• Most of the common attacks on systems can be prevented, or the threat of their occurring can be significantly reduced, by appropriate data validation

• Refers to both input to and output from an application

• If a system takes a typical architectural approach of providing common services then one common component can filter all input and output, thus optimizing the rules and minimizing efforts


Data Validation Strategies

1. Accept only known valid data

2. Reject known bad data

3. Sanitize all data

• All three methods must check
  – Data type
  – Syntax
  – Length


Accept Only Known Valid Data

• The preferred way to validate data

• Applications should accept only input that is known to be safe and expected

• Example:
  – A password reset system takes in usernames as input
  – Valid usernames would be defined as ASCII A-Z and 0-9
  – The application should check that the input is of type string, consists only of A-Z and 0-9, and is of a valid length
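The username example can be sketched as a single allow-list check covering type, syntax and length. The maximum length of 32 is an assumption made for illustration; the slides do not give one.

```python
import re

# Assumed policy for illustration: 1 to 32 characters, ASCII A-Z and 0-9 only.
USERNAME_RE = re.compile(r"[A-Z0-9]{1,32}")


def is_valid_username(value) -> bool:
    """Accept only known valid data: a string that matches the whole pattern."""
    return isinstance(value, str) and USERNAME_RE.fullmatch(value) is not None
```

Using `fullmatch` means the entire input has to fit the pattern, so nothing outside the allowed alphabet can ride along at the start or end.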


Reject Known Bad Data

• Relies on the application knowing about specific malicious payloads

• This strategy can limit exposure, but it is very difficult for any application to maintain an up-to-date database of web application attack signatures

• Useless for zero day attacks


Sanitize All Data

• Canonicalization – convert to standard simple format before filtering

• Attempting to make bad data harmless is an effective second line of defense, especially in combination with rejecting bad input

• Should not be relied upon as a primary defense technique


Never Rely on Client-Side Data Validation

• Client-side validation can always be bypassed

• With any client-side processing an attacker can simply watch the return value and modify it at will

• All data validation must be done on the trusted server or under control of the application

• Data validation on the client side, for purposes of ease of use or user friendliness, is acceptable, but should not be considered a true validation process

• All validation should be on the server side, even if it is redundant to cursory validation performed on the client side


Generic Meta-Characters Problem

• Meta-characters are non-printable and printable characters, which affect the behavior of programming language commands, operating system commands, individual program procedures and database queries

• Meta-characters can be encoded in non-obvious ways, so canonicalization of data (conversion to a common character set) before stripping meta-characters is essential


Some Examples

• [ ; ] Semicolons for additional command-execution

• [ & ] Used for command-execution

• [ \x00 ] Null bytes for truncating strings and filenames

• [ \x04 ] EOT for faking file ends

• [ \x0a ] New lines for additional command-execution

• [ \x08 ] Backspace


Some More Examples

• [ ' " ] Quotation marks (often in combination with database-queries)

• [ /\ ] Slashes and backslashes for faking paths and queries

• [ ../ ] Two dots and a slash or backslash, for faking file system paths

• [ $ ] Programming/scripting-language related
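Because meta-characters can arrive encoded, a filter that inspects the raw input can miss them. The sketch below assumes the encoding in question is URL percent-encoding, decodes until the value stops changing, and only then looks for a few of the meta-characters listed above. The function names and the limit on decoding rounds are assumptions for illustration.

```python
from urllib.parse import unquote

# A few of the meta-characters listed on the slides.
META_CHARACTERS = set(";&'\"$\x00\x04\x08\n")


def canonicalize(value: str, max_rounds: int = 5) -> str:
    """Percent-decode repeatedly until the value stops changing."""
    for _ in range(max_rounds):
        decoded = unquote(value)
        if decoded == value:
            return decoded
        value = decoded
    raise ValueError("too many layers of encoding")


def has_meta_characters(value: str) -> bool:
    """Look for meta-characters only after the value is in canonical form."""
    return any(ch in META_CHARACTERS for ch in canonicalize(value))
```

For example, `report%3Brm` contains no literal semicolon, yet it decodes to `report;rm`; the doubly encoded `report%253Brm` needs two rounds of decoding before the semicolon appears.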


Cross-Site Scripting

• Normally an attack on the system user and not the system itself

• The victim is tricked into making a specific and carefully crafted HTTP request

• The attacker has previously discovered an application that doesn't filter input and will return to the user the requested page and the malicious code added to the request

• When the web server receives the page request it sends the page and the piece of code that was requested

• When the user's browser receives the new page, the malicious script is parsed and executed in the security context of the user




Cross-Site Scripting

• Problem significantly reduced if only expected input accepted, but still problems if the application needs to accept HTML as input

• Web server should set character set, and then make sure data is free from special byte sequences in that encoding – particularly dynamic output
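For dynamic output, one common measure is to escape untrusted values at the point where they are written into the page. A minimal sketch using Python's standard `html.escape`; the `render_greeting` function is a hypothetical stand-in for whatever code generates the page.

```python
import html


def render_greeting(name: str) -> str:
    """Build an HTML fragment, escaping the untrusted value on output."""
    return "<p>Hello, " + html.escape(name, quote=True) + "</p>"
```

With this in place, a request parameter such as `<script>alert(1)</script>` is returned to the browser as inert text rather than as a script element. This covers values placed in element content; an application that must accept HTML as input needs more than escaping.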


Direct SQL Injection

• Some applications do not validate user input and allow users to make direct calls to the database

• Attacker appends an additional database query to legitimate data

• Can be used to:
  – Change SQL values (e.g., passwords)
  – Concatenate SQL statements
  – Add function calls and stored procedures to a statement
  – Typecast and concatenate retrieved data


Direct SQL Injection

• Filter SQL commands directly prior to their execution

• Note some escaped input must be allowed, e.g., last name “O’Neil”

• Construct all queries using prepared statements, rather than strings, to encapsulate variables and escape special characters automatically in a manner suited to the target database

• Build queries from data values, never other SQL queries or parts thereof
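A minimal sketch of a query built from data values with a bound parameter, using Python's standard `sqlite3` module and an in-memory database. The table and column names are hypothetical.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a small, made-up users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, last_name TEXT)")
    conn.executemany(
        "INSERT INTO users (last_name) VALUES (?)",
        [("O'Neil",), ("Smith",)],
    )
    return conn


def find_users(conn: sqlite3.Connection, last_name: str):
    """The query text is fixed; the value is bound separately by the driver."""
    cur = conn.execute(
        "SELECT id, last_name FROM users WHERE last_name = ?",
        (last_name,),
    )
    return cur.fetchall()
```

The legitimate name with an apostrophe is handled without manual escaping, while an input such as `x' OR '1'='1` is compared as a plain string and matches nothing.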


Direct OS Commands

• System interfaces in programming and scripting languages pass input commands to the underlying OS, commonly used for file handling, sending emails, etc.

• May be possible to:
  – Alter system commands, including parameters
  – Execute additional commands and OS command line tools
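One way to keep user input from altering a command is to pass it as a single argument in an argument list, so that no shell ever parses it. The sketch below uses Python's `subprocess.run` and, purely to stay self-contained, runs the current Python interpreter as the child program; a real application would invoke its own tool instead.

```python
import subprocess
import sys


def echo_argument(value: str) -> str:
    """Pass an untrusted value as one argument; no shell interprets it."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", value],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.rstrip("\n")
```

Called with `"a; echo injected"`, the child process receives the whole string, semicolon included, as one argument, and no second command runs. The argument can still change what the invoked program does, so it should also be validated against what that program expects.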


Path Traversal and Path Disclosure

• Web applications may store data inside and/or outside WWW-ROOT in designated locations

• Attacker can construct a malicious request to return data about physical file locations

http://example.com/a/b/../../../../etc/passwd

• Traversing back to system directories that contain binaries makes it possible to execute system commands outside designated paths


Path Traversal and Path Disclosure

• Attacker looks for old, backup, temp, hidden files, e.g., guesses file names and tries them

• Includes known vulnerable files or applications

• Counter by using path normalization functions

• Filter input strings like “../” as well as their Unicode variants

• Use “chrooted” servers
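Path normalization can be sketched as follows: resolve the requested path against the document root, then confirm that the result still lies under that root. The function name is hypothetical, and the sketch assumes root and target are on the same drive (relevant on Windows only).

```python
import os


def resolve_under(root: str, requested: str) -> str:
    """Return the normalized absolute path, or raise if it leaves root."""
    root_real = os.path.realpath(root)
    target = os.path.realpath(os.path.join(root_real, requested))
    # After normalization, the root must be a prefix of the target path.
    if os.path.commonpath([root_real, target]) != root_real:
        raise PermissionError("path escapes the document root")
    return target
```

A request for `a/b/../../../../etc/passwd` normalizes to a location outside the root and is refused, as is an absolute path such as `/etc/passwd`.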


Null Bytes

• Many applications, developed in whatever programming language, often pass data to underlying lower level C functions

• C perceives the null byte as the termination of a string

• Can be used to:
  – Disclose physical paths and OS information
  – Truncate strings, e.g., passed to SQL queries
  – Bypass validity checks that look for substrings in parameters
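The bypass can be illustrated with a file-extension check. In this sketch `c_view` is a hypothetical helper that models what a C string function would see, namely everything before the first null byte; it is not a real library call.

```python
def c_view(value: str) -> str:
    """Model of a C string: everything before the first null byte."""
    return value.split("\x00", 1)[0]


def naive_is_image(name: str) -> bool:
    """Suffix check that an embedded null byte can defeat."""
    return name.endswith(".jpg")


def strict_is_image(name: str) -> bool:
    """Reject any embedded null byte before applying the same check."""
    return "\x00" not in name and name.endswith(".jpg")
```

The name `shell.php\x00.jpg` passes the naive check, yet the lower-level code would act on `shell.php`. Rejecting input that contains a null byte closes the gap.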


Parameter Manipulation

• No data sent to a client (e.g., web browser) can be relied upon to stay the same unless cryptographically protected at the application layer (not just transport layer)

• Web parameter tampering:
  – Cookies
  – Form fields
  – URL query strings
  – HTTP headers
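Application-layer protection of a value sent to the client can be sketched with a keyed hash (HMAC) from Python's standard library. The key shown is a placeholder, and the token format (`value.tag`) is an assumption made for illustration.

```python
import hashlib
import hmac

# Placeholder only: a real key would be random and kept on the server.
SECRET_KEY = b"server-side-secret"


def _tag(value: str) -> str:
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()


def sign(value: str) -> str:
    """Append an HMAC-SHA256 tag so later tampering can be detected."""
    return value + "." + _tag(value)


def verify(token: str):
    """Return the original value if the tag matches, otherwise None."""
    value, sep, tag = token.rpartition(".")
    if not sep:
        return None
    expected = _tag(value)
    # Constant-time comparison of the received and expected tags.
    if hmac.compare_digest(tag.encode("utf-8"), expected.encode("utf-8")):
        return value
    return None
```

This detects modification of a cookie, form field or query string; it does not hide the value from the client, which would require encryption as well.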


Risk-based Testing

• Think of risk on three dimensions:
  1. The ways the program could fail
  2. How likely it is that the program could fail that way
  3. What the consequences of that failure could be

• Use threat modeling, failure mode catalogs, bug taxonomies, etc. to channel testing strategies

• Reduce the RASQ (Relative Attack Surface Quotient), which is roughly the number of input channels


Stress Testing


Stress Testing: QuickTests

• A quicktest is a cheap test that has some value but requires little preparation, knowledge, or time to perform

• Classic example: the Shoe test

• Find an input field, move the cursor to it, put your shoe on the keyboard, and go to lunch

• Uses the auto-repeat on the keyboard for a cheap stress test (will often overflow input buffers)


Another Classic Quicktest

• Find a dialog box so constructed that pressing a key leads to another dialog box (perhaps an error message) that also has a button - connected to the same key - that returns to the first dialog box

• A related approach: Repeat the same input or series of inputs numerous times

• This may expose some types of long-sequence errors (stack overflows, memory leaks, etc.)
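Repeating the same input many times is easy to automate. The sketch below is one possible harness, not a prescribed tool: it repeats a hypothetical `action` callable and uses Python's standard `tracemalloc` module to report how much traced memory grew across the repetitions, which can hint at a leak.

```python
import tracemalloc


def memory_growth(action, repetitions: int = 10_000) -> int:
    """Repeat action and return the growth in traced memory, in bytes."""
    tracemalloc.start()
    action()                                   # warm-up, so one-time setup is not counted
    before, _peak = tracemalloc.get_traced_memory()
    for _ in range(repetitions):
        action()
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before
```

An action that keeps a reference to every buffer it allocates shows growth roughly proportional to the number of repetitions, while one that releases its buffers stays close to zero.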


Interference Tests

• Force the screen to refresh

• Change the video resolution

• Toggle “accessibility” options

• Change the system date/time

• Change localization settings

• Move the mouse and click somewhere random

• Click lots of times, all over the screen, really fast


Interference Tests

• Set this program or another program’s timer reminder to go off

• Change focus to another application

• Load enough other applications to force the program out of memory

• Put the database under use by a competing program that accesses the same records, preferably locking them


Interference Tests

• Cancel this program’s current task or another program’s task

• Kill or pause the program’s client or server

• Pause the program, for both short and long time periods

• Leave the program running for a long time

• Run lots of other applications in the meantime, with heavy disk use


File System Interference Tests

• Remove a CD-ROM, flash drive or other media while in use

• Fill the file system to its capacity

• Assign an invalid file name

• Vary file name and access permissions

• Change or corrupt the contents of a file that the program is reading or writing


Scalability Tests

• Connect very large numbers of users simultaneously

• Connect, disconnect, reconnect the same user repeatedly

• Benchmarks and tools available for testing databases, application servers, web servers



Upcoming Assignments

• First Iteration Demo Wednesday 3 November – Thursday 11 November.

• First Iteration Final Report due Friday 12 November.

• Midterm Exam posted Thursday 11 November and due Friday 19 November.


http://bank.cs.columbia.edu/classes/cs4156/homeworks/firstreport.htm#demo

http://bank.cs.columbia.edu/classes/cs4156/homeworks/firstreport.htm#report


