input validation with regular expressions

Input Validation with Regular Expressions COEN 351

Upload: summer

Post on 11-Feb-2016



TRANSCRIPT

Page 1: Input Validation with Regular Expressions

Input Validation with Regular Expressions

COEN 351

Page 2: Input Validation with Regular Expressions

Input Validation Security Strategies

Black List
    List all things that are NOT allowed
    List is difficult to create
    Adding insecure constructs on a continuous basis means that the previous version was unsafe
    Testing is based on known attacks.
    List from others might not be trustworthy.

White List
    List of things that are allowed
    List might be incomplete and disallow good content
    Adding exceptions on a continuous basis does not imply security holes in previous versions.
    Testing can be based on known attacks.
    List from others can be trusted if source can be trusted.
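
The difference between the two strategies can be sketched in Perl. This is a minimal illustration, assuming a hypothetical user-name field; the sub names `blacklist_ok` and `whitelist_ok` and the chosen character sets are assumptions made for the example, not part of the slides.

```perl
use strict;
use warnings;

# Black list: reject input containing a character we know to be dangerous.
# Anything we forgot to list slips through.
sub blacklist_ok {
    my ($input) = @_;
    return $input !~ /[<>;|&]/ ? 1 : 0;
}

# White list: accept only input made up entirely of allowed characters.
# Anything we forgot to list is rejected, which is the safer failure mode.
sub whitelist_ok {
    my ($input) = @_;
    return $input =~ /\A[A-Za-z0-9_]{1,32}\z/ ? 1 : 0;
}

for my $name ('alice', 'bob; rm -rf /', '`id`') {
    printf "%-16s blacklist=%d whitelist=%d\n",
        $name, blacklist_ok($name), whitelist_ok($name);
}
```

The backtick input passes the black list only because backticks were never added to it, while the white list rejects it without having to anticipate it.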

Page 3: Input Validation with Regular Expressions

Perl Regular Expressions

Regular Expression = Pattern
    Template that either matches or does not match a string

Page 4: Input Validation with Regular Expressions

Excursus: Getting Input in Perl

Use <STDIN> to read from standard input
Use 'defined' construct to tell if read was successful

    while (defined($line = <STDIN>)) {
        print "I saw $line";
    }

Page 5: Input Validation with Regular Expressions

Excursus: Getting Input in Perl

Non-sensical shortcut
Uses standard loop variable $_

    while (<STDIN>) {
        print "I saw $_";
    }
    Gets line, executes body of loop.

    foreach (<STDIN>) {
        print "I saw $_";
    }
    Gets all the lines, then executes body of loop.

$_ is the default loop variable.

Page 6: Input Validation with Regular Expressions

Excursus: Getting Input in Perl

STDIN is the default input for <> when no files are named on the command line
chomp acts on default variable $_

    while (<>) {
        chomp;
        print "I saw $_\n";
    }

Page 7: Input Validation with Regular Expressions

Perl Regular Expressions

Matching and substitution are fundamental tasks in Perl
Implemented using one letter operators:
    m/PATTERN/ (m//)
        Pattern matching
    s/PATTERN/REPLACEMENT/ (s///)
        Substitution
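
A short sketch of the two operators in use. The sub names `has_digit` and `mask_first_digit` are made up for this illustration.

```perl
use strict;
use warnings;

# m// tests whether a pattern occurs somewhere in a string.
sub has_digit {
    my ($text) = @_;
    return $text =~ m/\d/ ? 1 : 0;
}

# s/// replaces the first match. $text is a private copy of the argument,
# so the caller's string is left unchanged.
sub mask_first_digit {
    my ($text) = @_;
    $text =~ s/\d/#/;
    return $text;
}

print has_digit("room 42"), "\n";         # prints 1
print mask_first_digit("room 42"), "\n";  # prints "room #2"
```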

Page 8: Input Validation with Regular Expressions

Perl Regular Expressions
Meta-characters in a pattern need escaping with a backslash to be matched literally:
    \ | ( ) [ ] { } ^ $ * + ? .
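
The effect of escaping can be seen with the dot, which matches any character unless it is escaped. The sketch below also uses Perl's \Q...\E (the in-pattern form of `quotemeta`) for the case where the literal text comes from a variable; the sub names are illustrative assumptions.

```perl
use strict;
use warnings;

# Unescaped '.' matches any single character, so "1x5" is accepted too.
sub loose_match {
    my ($s) = @_;
    return $s =~ /\A1.5\z/ ? 1 : 0;
}

# Escaped '\.' matches only a literal dot.
sub strict_match {
    my ($s) = @_;
    return $s =~ /\A1\.5\z/ ? 1 : 0;
}

# \Q...\E escapes every meta-character in the interpolated string.
sub literal_match {
    my ($s, $literal) = @_;
    return $s =~ /\A\Q$literal\E\z/ ? 1 : 0;
}

print loose_match("1x5"), strict_match("1x5"), "\n";   # prints 10
print literal_match("a+b", "a+b"), "\n";               # prints 1
```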

Page 9: Input Validation with Regular Expressions

Perl Regular Expressions

Interpolation
Perl substitutes strings in strings:

    $foo = "bar";
    /$foo$/;

Equivalent to:

    /bar$/;

Page 10: Input Validation with Regular Expressions

Perl Regular Expression: Binding Operator
    Pattern matching is so frequent in Perl that there is a special operator
    Normally, pattern matching is done on default operand $_
    =~ binds a string expression to a pattern match (substitution, transliteration)

Page 11: Input Validation with Regular Expressions

Perl Regular Expression: Binding Operator
    =~ has left operand a string
    =~ has right operand a pattern
        Could be interpreted at run time.
    Returns true / false depending on the success of match.
    !~ operation is the same, but result is negated.
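
A brief sketch of =~, !~, and the default operand. The sub names are hypothetical and chosen only for this example.

```perl
use strict;
use warnings;

# =~ binds the string on the left to the pattern on the right.
sub has_script_tag {
    my ($html) = @_;
    return $html =~ /<script\b/i ? 1 : 0;
}

# !~ is the negated form: true when the pattern does NOT match.
sub has_no_digits {
    my ($text) = @_;
    return $text !~ /\d/ ? 1 : 0;
}

# Without a binding operator, the match applies to the default operand $_.
sub count_lines_with_digits {
    my @lines = @_;
    my $count = 0;
    for (@lines) {
        $count++ if /\d/;
    }
    return $count;
}
```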

Page 12: Input Validation with Regular Expressions

Perl Regular Expression:Binding Operator

    $_ =~ $pat;

is equivalent to

    $_ =~ /$pat/;

but is less efficient than giving the pattern directly, since the regular expression will be recompiled at run time

Page 13: Input Validation with Regular Expressions

Perl Regular Expression: Binding Operator
Example:

    if ( ($k, $v) = $string =~ m/(\w+)=(\w*)/ ) {
        print "Key $k Value $v\n";
    }

Since =~ has higher precedence than =, it is evaluated first.
The binding operator binds the variable $string to a pattern looking for expressions like "key=word".
The binding expression is evaluated in list context, hence the resulting matches are returned as a list.
The list is then assigned to ($k,$v).
The result of the assignment is the number of things assigned, i.e. typically 2.
Since 2 is not 0, this is equivalent to true and hence the if-block is entered.
If the match fails, the list is empty, the count is 0, and the if-block is skipped.
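
The same idiom wrapped in a sub makes both outcomes easy to try. This is a sketch; the name `parse_pair` is an assumption introduced for the example.

```perl
use strict;
use warnings;

# Returns (key, value) when the string contains key=value, else an empty list.
sub parse_pair {
    my ($string) = @_;
    # A match in list context returns the captured groups,
    # or an empty list when the pattern does not match.
    if ( my ($k, $v) = $string =~ m/(\w+)=(\w*)/ ) {
        return ($k, $v);
    }
    return;
}

my @pair = parse_pair("color=red");      # ("color", "red")
my @none = parse_pair("no pair here");   # ()
print scalar(@pair), " ", scalar(@none), "\n";   # prints "2 0"
```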

Page 14: Input Validation with Regular Expressions

Perl Regular Expressions
Quantifiers:
    * matches the preceding character zero or more times.
    Pattern "abc*d" is matched by
        rabd
        zabccccd
    Use parentheses to group letters

    #!/perl/bin/perl
    # Reads lines until 'stop'; shows the text before, inside, and after the match.
    while (<>) {
        chomp;
        last if $_ eq 'stop';
        if (/abc*d/) {
            print "Matched: |$`<$&>$'|\n";
        } else {
            print "No match.\n";
        }
    }

    #!/perl/bin/perl
    # Same loop, but * now applies to the group (bc) instead of the single letter c.
    while (<>) {
        chomp;
        last if $_ eq 'stop';
        if (/a(bc)*d/) {
            print "Matched: |$`<$&>$'|\n";
        } else {
            print "No match.\n";
        }
    }

Page 15: Input Validation with Regular Expressions

Perl Regular Expressions

Quantifiers:
    '*' matches zero or more instances
    '+' matches one or more instances
        "ab(cde)+fg"
    '?' matches none or one
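
The three quantifiers can be compared with a small helper. This is a sketch; `matches` is a hypothetical helper that anchors the given pattern so the whole string has to match.

```perl
use strict;
use warnings;

# True when the whole string $s matches the pattern text $pattern.
sub matches {
    my ($pattern, $s) = @_;
    return $s =~ /\A(?:$pattern)\z/ ? 1 : 0;
}

print matches('abc*d',      'abd'),        "\n";  # prints 1: zero 'c' is allowed
print matches('ab(cde)+fg', 'abfg'),       "\n";  # prints 0: at least one "cde" needed
print matches('ab(cde)+fg', 'abcdecdefg'), "\n";  # prints 1: "cde" twice
print matches('colou?r',    'color'),      "\n";  # prints 1: 'u' is optional
```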

Page 16: Input Validation with Regular Expressions

Perl Regular Expressions

Alternatives ‘|’ “or”

Either the right or the left side matches
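
One point worth a sketch when alternation is used for validation: '|' has low precedence, so anchors bind to only one side unless the alternatives are grouped. The sub names below are illustrative assumptions.

```perl
use strict;
use warnings;

# Reads as (^cat) or (dog$): each anchor applies to one alternative only.
sub loose_pet {
    my ($s) = @_;
    return $s =~ /^cat|dog$/ ? 1 : 0;
}

# Grouping makes both anchors apply to either alternative.
sub strict_pet {
    my ($s) = @_;
    return $s =~ /\A(?:cat|dog)\z/ ? 1 : 0;
}

print loose_pet("catalog"), strict_pet("catalog"), "\n";  # prints 10
print loose_pet("dog"),     strict_pet("dog"),     "\n";  # prints 11
```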

Page 17: Input Validation with Regular Expressions

Perl Regular Expressions

Character Classes
    List of possible characters inside a square bracket
    Example:
        [a-cw-z]+
        [a-zA-Z0-9]
    Negation provided by caret
        [^n\-z] matches any character but 'n', '-', 'z'

Page 18: Input Validation with Regular Expressions

Perl Regular Expressions

Character classes shortcuts
    \w (word) is a shortcut for [A-Za-z0-9_] (note the underscore)
    \s (space) is a shortcut for [\f\t\n\r ]
    \d (digit) is a shortcut for [0-9]
    [^\d] anything but a digit
    [^\s] anything but a space character
    [^\w] anything but a word character
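
A sketch of the shortcuts applied to two small validation tasks. The sub names and the sample inputs are assumptions made for the example.

```perl
use strict;
use warnings;

# A letter or underscore followed by word characters.
# \w includes the underscore but not the hyphen.
sub is_identifier {
    my ($s) = @_;
    return $s =~ /\A[A-Za-z_]\w*\z/ ? 1 : 0;
}

# Remove every character that is not a digit (negated class).
sub strip_non_digits {
    my ($s) = @_;
    $s =~ s/[^\d]//g;
    return $s;
}

print is_identifier("user_1"), is_identifier("user-1"), "\n";  # prints 10
print strip_non_digits("(408) 555-0100"), "\n";                # prints 4085550100
```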

Page 19: Input Validation with Regular Expressions

Perl Regular Expressions

Perl regex semantics are based on:
    Greed
        Perl tries to match as much of an expression as is possible
    Eagerness
        Perl gives the first possible match
        The left-most match wins
    Backtracking
        The entire expression needs to match
        Perl regex evaluation backtracks if match is impossible

Page 20: Input Validation with Regular Expressions

Perl Regular Expressions
Eagerness Example:
What is the result of this snippet?

    $string = "boo hoo";
    $string =~ s/o*/e/;    # left side of =~ needs to be an l-value

    boo hoo
    be hoo
    bee hoo
    boo heo
    boo hee
    eboo hoo
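
Running the snippet settles the question. Because o* can match zero characters, the left-most possible match is the empty string at the very start of "boo hoo", and eagerness takes precedence over greed. The sketch below wraps the substitution in two hypothetical subs to contrast o* with o+.

```perl
use strict;
use warnings;

# o* can match the empty string, so the match succeeds at position 0.
sub replace_star {
    my ($string) = @_;
    $string =~ s/o*/e/;
    return $string;
}

# o+ needs at least one 'o', so the first match is the "oo" in "boo".
sub replace_plus {
    my ($string) = @_;
    $string =~ s/o+/e/;
    return $string;
}

print replace_star("boo hoo"), "\n";  # prints "eboo hoo"
print replace_plus("boo hoo"), "\n";  # prints "be hoo"
```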

Page 21: Input Validation with Regular Expressions

Perl Regular Expressions

Quantifiers *, +, ? are not always enough
Specify number of occurrences by placing comma separated range in curly brackets
    /a{2,12}/
        2 to 12 'a'
    /a{5,}/
        5 or more 'a'
    /a{5}/
        exactly 5 'a'
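
A counted range is a natural fit for length limits in input validation. The sketch below assumes a hypothetical PIN field of four to six digits; the rule and the sub name are made up for the example.

```perl
use strict;
use warnings;

# Accept only strings made of 4 to 6 digits.
sub is_pin {
    my ($s) = @_;
    return $s =~ /\A\d{4,6}\z/ ? 1 : 0;
}

print is_pin("1234"), is_pin("123"), is_pin("1234567"), "\n";  # prints 100
```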

Page 22: Input Validation with Regular Expressions

Perl Regular Expressions
Anchors
    pattern can match everywhere in the string unless you use anchors
        ^ beginning of string
        $ end of string
        \b start or end of a group of \w-characters
        \B non-word boundary anchor
    Examples:
        /^hello/ matches only at beginning of string
        /world$/ matches only at the end of string
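
A sketch of the anchors at work, with hypothetical sub names. It also shows a detail that matters for validation: $ still matches before a trailing newline, while \z matches only at the very end of the string.

```perl
use strict;
use warnings;

sub starts_with_hello {
    my ($s) = @_;
    return $s =~ /^hello/ ? 1 : 0;
}

# \b requires a word boundary on both sides of "cat".
sub has_word_cat {
    my ($s) = @_;
    return $s =~ /\bcat\b/ ? 1 : 0;
}

# $ tolerates one trailing newline; \z does not.
sub ends_with_world {
    my ($s) = @_;
    return $s =~ /world$/ ? 1 : 0;
}

sub ends_exactly_with_world {
    my ($s) = @_;
    return $s =~ /world\z/ ? 1 : 0;
}

print starts_with_hello("say hello"), "\n";                 # prints 0
print has_word_cat("the cat sat"), has_word_cat("concatenate"), "\n";  # prints 10
print ends_with_world("hello world\n"),
      ends_exactly_with_world("hello world\n"), "\n";       # prints 10
```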

Page 23: Input Validation with Regular Expressions

Perl Regular Expressions
Parentheses and Memory
    ( ) group together part of a pattern
    Also remember corresponding match part of string.
    These are put into a backreference
        Made by backslash followed by number
        Available as $1, … after matching
    Examples
        /(.)\1/ matches any character followed by itself
        /../ matches any two characters
        /(['"]).*\1/ matches any string starting with single or double quotes followed by zero or more arbitrary characters followed by the same type of quotes.
            "does not match'
            "does match"
            'does match'
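
A sketch of backreferences inside the pattern (\1) and captures after the match ($2). The sub names are hypothetical.

```perl
use strict;
use warnings;

# \1 refers back to whatever the first group matched.
sub has_doubled_char {
    my ($s) = @_;
    return $s =~ /(.)\1/ ? 1 : 0;
}

# Returns the text between matching quotes, or undef when the quotes differ.
sub quoted_text {
    my ($s) = @_;
    return $s =~ /(['"])(.*)\1/ ? $2 : undef;
}

print has_doubled_char("book"), has_doubled_char("bok"), "\n";  # prints 10
print quoted_text(q{say "hi" now}), "\n";                       # prints hi
```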

Page 24: Input Validation with Regular Expressions

Perl Regular Expressions
Validating e-mail
    Out of channel verification:
        Ask for email addresses twice to weed out typos.
        Send email to address given.
            Still need to prevent command-line insertion
    Lookup of DNS records for MX records
        Assumes site connectivity
    Regular expressions
        Typically have subtle errors
        tom&[email protected] is valid, but fails simple regex
        [email protected] is valid, deliverable, but probably fake

Page 25: Input Validation with Regular Expressions

Perl Regular Expressions
Validating email
    if ( $email =~ /\@/ ) { … }
        checks for an at sign
    if ( $email =~ /\S+\@\S+/ )
        checks for non-white space characters divided by an at sign
        matches thomas@hotmail
    if ( $email =~ /\S+\@\S+\.\S+/ )
    if ( $email =~ /[\w\-]+\@[\w\-]+\.[\w\-]+/ )
        matches most valid emails, but allows multiple emails
    if ( $email =~ /^[\w\-]+\@[\w\-]+\.[\w\-]+$/ )
        anchored at beginning and end of string
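
The progression can be tried out with a small sketch. It uses the last, anchored pattern (with \A and \z in place of ^ and $, an adjustment made here so a trailing newline is rejected too); the sub name `looks_like_email` is an assumption. The final two cases illustrate the "subtle errors" such patterns tend to have: both addresses are syntactically ordinary, yet the pattern rejects them.

```perl
use strict;
use warnings;

# Anchored pattern: word characters or hyphens, '@', then exactly two
# dot-separated labels of word characters or hyphens.
sub looks_like_email {
    my ($email) = @_;
    return $email =~ /\A[\w\-]+\@[\w\-]+\.[\w\-]+\z/ ? 1 : 0;
}

for my $candidate (
    'thomas@hotmail',            # no dot in the domain
    'thomas@hotmail.com',
    'a@b.com, c@d.com',          # two addresses in one field
    'first.last@example.com',    # dot in the local part
    'user@mail.example.com',     # more than two domain labels
) {
    printf "%-24s %d\n", $candidate, looks_like_email($candidate);
}
```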

Page 26: Input Validation with Regular Expressions

Perl Regular Expressions
Checking for strings that only contain alphabetic characters.
    ASCII based regex is insufficient:
        if($var =~ /^[a-zA-Z]+$/)
        Does not work for characters with diacritic marks
    Best solution is to use Unicode properties
        if($var =~ /^[^\W\d_]+$/)
    Explanation:
        \w matches alphabetic, numeric, underscore (alphanumunder)
        \W is a non-alphanumunder
        [^\W\d_] is a character that is neither non-alphanumunder, digit, nor underscore, hence an alphabetic character
    Could also use POSIX character classes, but those depend on locale
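
A sketch comparing the two checks. The sample name is written with a \x{...} escape for a code point above 255 so that Perl treats the string as Unicode text regardless of source encoding; that choice, and the sub names, are assumptions made for the example.

```perl
use strict;
use warnings;

# ASCII-only check.
sub ascii_alpha {
    my ($var) = @_;
    return $var =~ /\A[a-zA-Z]+\z/ ? 1 : 0;
}

# Word character that is neither a digit nor an underscore.
sub unicode_alpha {
    my ($var) = @_;
    return $var =~ /\A[^\W\d_]+\z/ ? 1 : 0;
}

my $name = "\x{141}ukasz";   # U+0141 is LATIN CAPITAL LETTER L WITH STROKE

print ascii_alpha($name), unicode_alpha($name), "\n";    # prints 01
print unicode_alpha("abc1"), unicode_alpha("a_b"), "\n"; # prints 00
```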

Page 27: Input Validation with Regular Expressions

Perl Regular Expressions

Making regex readable
Place semantic units into a variable with an appropriate name

    $optional_sign    = '[-+]?';
    $mandatory_digits = '\d+';
    $decimal_point    = '\.?';
    $optional_digits  = '\d*';
    $number = $optional_sign
            . $mandatory_digits
            . $decimal_point
            . $optional_digits;
    if ( /($number)/ ) { … }
