CS346 Regular Expressions 1 Pattern Matching Regular Expression

Upload: hector-doyle

Post on 05-Jan-2016

Page 1: CS346 Regular Expressions1 Pattern Matching Regular Expression

CS346 Regular Expressions 1

Pattern Matching

Regular Expression

Pattern Matching

JavaScript provides two ways to do pattern matching:

1. Using RegExp objects

2. Using methods on String objects

Both approaches use the same regular expression syntax, which is based on Perl's.

Simple patterns

Two categories of characters in patterns:

a. normal characters (match themselves)

b. metacharacters (can have special meanings in patterns--do not match themselves)

\ | ( ) [ ] { } ^ $ * + ? .

- A metacharacter is treated as a normal character if it is backslashed

- period (.) is a special metacharacter - it matches any character except newline
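
The difference between the period as a metacharacter and a backslashed period can be sketched as follows. The sample strings and variable names are hypothetical, chosen only for illustration.

```javascript
// "." is a metacharacter: it matches any single character except newline.
var anyChar = /a.c/;
// "\." is backslashed, so it matches only a literal period.
var literalDot = /a\.c/;

console.log(anyChar.test("abc"));     // true
console.log(anyChar.test("a.c"));     // true
console.log(anyChar.test("a\nc"));    // false (newline is excluded)
console.log(literalDot.test("abc"));  // false
console.log(literalDot.test("a.c"));  // true
```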

Creating RegExp objects

var varname = /reg_ex_pattern/flags;

Simplest example: exact match. To match an occurrence of "our" in a string containing your, our, sour, four, pour:

var toMatch = /our/;

1. Matching in RegExp objects

Use the test() method of a RegExp object to test a string for a pattern match. It returns a Boolean that indicates whether the pattern occurs anywhere in the searched string. This is the most commonly used method for validation.

Format: regexp.test( string_to_be_tested )

    var toMatch = /our/;
    var result = toMatch.test("pour"); // Boolean result: true

Example: 16-0-checkName.html
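
A minimal sketch of test() applied to the words listed on the earlier slide. The extra word "out" is an assumption added here to show a failing case; it is not in the original list.

```javascript
var toMatch = /our/;
// "out" is a hypothetical extra word that does not contain "our".
var words = ["your", "our", "sour", "four", "pour", "out"];

// test() returns true when the pattern occurs anywhere in the string.
var results = words.map(function (word) {
  return toMatch.test(word);
});

console.log(results); // [ true, true, true, true, true, false ]
```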

Pattern Modifiers (Adding flags)

Flag(s)   Purpose

i         Makes the match case insensitive; /oak/i matches "OAK" and "Oak"

g         Performs a global match (all matches, not just the first)

ig        Makes the match both case insensitive and global
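
The effect of each flag can be sketched with the String match() method, which is covered in more detail later. The sample text is hypothetical.

```javascript
var text = "Oak, OAK and oak"; // hypothetical sample text

// No flags: case sensitive.
console.log(/oak/.test("OAK"));   // false
// i flag: case insensitive.
console.log(/oak/i.test("OAK"));  // true

// g flag only: all case-sensitive matches.
var globalOnly = text.match(/oak/g);
// ig flags: all matches regardless of case.
var globalIgnoreCase = text.match(/oak/ig);

console.log(globalOnly);        // [ 'oak' ]
console.log(globalIgnoreCase);  // [ 'Oak', 'OAK', 'oak' ]
```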

2. Matching in Strings

search() method: returns the position in the string of the first match of the RE pattern (positions start at zero); returns -1 if there is no match.

    var str = "Gluckenheimer";
    var position = str.search(/n/); /* position is now 6 */

match() method: compares a RE and a string to see whether they match.

replace() method: finds out whether a RE matches a string and then replaces the matched substring with a new string.

search() method

Format: string.search(reg_exp)

Searches the string for the first match of the given regular expression and returns an integer giving the position of that match in the string (zero-indexed). If no match is found, the method returns -1.

Similar to the indexOf() method, except that it takes a regular expression instead of a plain substring.

Example: To find the location of the first absolute link within an HTML document:

    pos = htmlString.search(/<a href\s*=\s*"http:\/\//i);
    if (pos != -1) {
      alert('First absolute link found at position ' + pos + '.');
    } else {
      alert('Absolute links not found');
    }
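
A runnable sketch of the same idea outside the browser, using console.log instead of alert. The HTML fragment and the helper name findFirstAbsoluteLink are assumptions made for illustration. The pattern is not anchored with ^ or $, because the link may appear anywhere in the document.

```javascript
// Hypothetical HTML fragment: one relative link, then one absolute link.
var htmlString =
  '<p><a href="page2.html">next</a> <a href="http://example.com/">site</a></p>';

// Hypothetical helper: position of the first absolute link, or -1.
function findFirstAbsoluteLink(html) {
  return html.search(/<a href\s*=\s*"http:\/\//i);
}

var pos = findFirstAbsoluteLink(htmlString);
var missing = findFirstAbsoluteLink('<a href="local.html">local</a>');

if (pos != -1) {
  console.log("First absolute link found at position " + pos + ".");
}
console.log(missing); // -1, because there is no absolute link
```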

match() method

Format: string.match( regular_expression )

Returns an array of the matching strings found in the given string. If no match is found, match() returns null.

Example: To check that a phone number entered by a user has the form (XXX) XXX-XXXX:

    function checkPhone(phone) {
      var phoneRegex = /^\(\d\d\d\) \d\d\d-\d\d\d\d$/;
      if (!phone.match(phoneRegex)) {
        alert('Please enter a valid phone number');
        return false;
      }
      return true;
    }
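
A minimal sketch of the same check without the alert() call, so that it can run outside a browser. The function name isValidPhone and the sample numbers are hypothetical. It also shows that match() yields null, not false, when nothing matches.

```javascript
var phoneRegex = /^\(\d\d\d\) \d\d\d-\d\d\d\d$/;

// Hypothetical helper: true only for the form (XXX) XXX-XXXX.
function isValidPhone(phone) {
  return phoneRegex.test(phone);
}

console.log(isValidPhone("(555) 123-4567")); // true
console.log(isValidPhone("555-123-4567"));   // false

// match() returns null when there is no match.
var noMatch = "555-123-4567".match(phoneRegex);
console.log(noMatch); // null
```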

replace() method

Format: string.replace(reg_exp, new_string)

Replaces matches of a given regular expression with a new string and returns the result; the original string is not changed.

Example: To replace every newline character (\n) with a break tag (<br />):

    /* assumes that the HTML form is the first one in the document
       and that it has a field named "comments" */
    comment = document.forms[0].comments.value;
    comment = comment.replace(/\n/g, "<br />");

The same operation as a function:

    function formatField(fieldValue) {
      return fieldValue.replace(/\n/g, "<br />");
    }

The function accepts any string as a parameter and returns a new string with all of the newline characters replaced by <br /> tags.
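
A short usage sketch for formatField, run on a hypothetical three-line comment in place of a real form field:

```javascript
function formatField(fieldValue) {
  // The g flag replaces every newline, not just the first one.
  return fieldValue.replace(/\n/g, "<br />");
}

// Hypothetical input standing in for the value of a form field.
var comment = "line one\nline two\nline three";
var formatted = formatField(comment);

console.log(formatted); // line one<br />line two<br />line three
// The original string still contains its newlines.
console.log(comment.indexOf("\n") !== -1); // true
```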

Character classes – [ ]

Sequence of characters in brackets defines a set of characters, any one of which matches

e.g. [abcd]

Dashes used to specify spans of characters in a class

e.g. [a-z]

A caret at the left end of a class definition inverts the set: the class then matches any one character that is not listed

e.g. [^0-9]
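
The three kinds of class above can be sketched with test(). The sample strings are hypothetical.

```javascript
// A set: any one of a, b, c or d.
console.log(/[abcd]/.test("cab"));    // true
console.log(/[abcd]/.test("xyz"));    // false

// A span: only lowercase letters from start to end.
var lowercaseOnly = /^[a-z]+$/;
console.log(lowercaseOnly.test("hello"));  // true
console.log(lowercaseOnly.test("Hello"));  // false

// An inverted class: any character that is not a digit.
var hasNonDigit = /[^0-9]/;
console.log(hasNonDigit.test("12345"));  // false
console.log(hasNonDigit.test("123a5"));  // true
```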

Character class abbreviations

Abbreviation   Equiv. Pattern    Matches

\d             [0-9]             a digit
\D             [^0-9]            not a digit
\w             [A-Za-z_0-9]      a word char.
\W             [^A-Za-z_0-9]     not a word char.
\s             [ \r\t\n\f]       a whitespace char.
\S             [^ \r\t\n\f]      not a whitespace char.

From Chapter 25 of text - Perl

Symbol   Matches                         Symbol   Matches

^        Beginning of line               \d       Digit (i.e., 0 to 9)
$        End of line                     \D       Nondigit
\b       Word boundary                   \s       Whitespace
\B       Nonword boundary                \S       Nonwhitespace
\w       Word (alphanumeric) character   \n       Newline
\W       Nonword character               \t       Tab

Fig. 25.9 Some of Perl's metacharacters.

Note the difference between ^ as used here (an anchor for the beginning of the line) and ^ at the left end of a character class (which inverts the class)
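
A brief sketch of the word-boundary symbols and of the two uses of ^, written in JavaScript since its syntax follows Perl's here. The sample strings are hypothetical.

```javascript
var text = "cat concatenate cats"; // hypothetical sample text

// "cat" anywhere, as a whole word, and strictly inside a word.
var anywhere = text.match(/cat/g);
var wholeWord = text.match(/\bcat\b/g);
var insideWord = text.match(/\Bcat\B/g);

console.log(anywhere.length);   // 3
console.log(wholeWord.length);  // 1 (only the first word)
console.log(insideWord.length); // 1 (only inside "concatenate")

// ^ outside a class anchors the match at the beginning.
console.log(/^\d/.test("7 up"));   // true
console.log(/^\d/.test("up 7"));   // false
// ^ at the left end of a class inverts it.
console.log(/[^\d]/.test("777"));  // false
```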

Quantifiers

Quantifiers in braces specify repetitions:

Quantifier    Meaning

{n}           exactly n repetitions
{m,}          at least m repetitions
{min,max}     at least min but at most max repetitions
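
The three brace forms can be sketched as follows. The patterns are anchored with ^ and $ so that the whole string must consist of the repeated digits; the variable names and sample values are hypothetical.

```javascript
var exactlyFive = /^\d{5}$/;   // {n}: exactly 5 digits
var atLeastTwo = /^\d{2,}$/;   // {m,}: 2 or more digits
var twoToFour = /^\d{2,4}$/;   // {min,max}: 2 to 4 digits

console.log(exactlyFive.test("12345")); // true
console.log(exactlyFive.test("1234"));  // false
console.log(atLeastTwo.test("1"));      // false
console.log(atLeastTwo.test("123"));    // true
console.log(twoToFour.test("1234"));    // true
console.log(twoToFour.test("12345"));   // false
```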

Some other common Quantifiers

*   zero or more repetitions; e.g., \d* means zero or more digits

+   one or more repetitions; e.g., \d+ means one or more digits

?   zero or one repetition; e.g., \d? means zero or one digit

.   exactly one character, except the newline character; e.g., /.l/ matches "al" or "@l" but not "\nl" nor "l" alone
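
A small sketch of these symbols in use. The signed-integer pattern is a hypothetical example, not one from the slides.

```javascript
// * allows zero digits, + requires at least one.
console.log(/^\d*$/.test(""));  // true
console.log(/^\d+$/.test(""));  // false

// ? makes the leading minus sign optional (hypothetical pattern).
var signedInteger = /^-?\d+$/;
console.log(signedInteger.test("42"));   // true
console.log(signedInteger.test("-42"));  // true
console.log(signedInteger.test("--42")); // false

// . needs exactly one non-newline character before the l.
console.log(/.l/.test("al"));   // true
console.log(/.l/.test("@l"));   // true
console.log(/.l/.test("\nl"));  // false
console.log(/.l/.test("l"));    // false
```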

Anchors

The pattern can be forced to match only at the beginning of the string with ^, or only at the end with $

e.g., /^Lee/ matches "Lee Ann" but not "Mary Lee Ann"

/Lee Ann$/ matches "Mary Lee Ann" but not "Mary Lee Ann is nice"

The anchor operators (^ and $) do not match characters in the string; they match positions, at the beginning or end
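
The anchor examples can be checked directly. The last two lines are an added illustration of the point that anchors match positions: replacing the match of /^/ or /$/ inserts text without consuming any character.

```javascript
console.log(/^Lee/.test("Lee Ann"));                   // true
console.log(/^Lee/.test("Mary Lee Ann"));              // false
console.log(/Lee Ann$/.test("Mary Lee Ann"));          // true
console.log(/Lee Ann$/.test("Mary Lee Ann is nice"));  // false

// Anchors match positions, so nothing in "abc" is replaced.
var atStart = "abc".replace(/^/, ">");
var atEnd = "abc".replace(/$/, "<");
console.log(atStart); // >abc
console.log(atEnd);   // abc<
```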

Examples

test() method of RegExp: see 16-1checkURL.html and 16-2validEmail.html

search() method of String: see 16-3check_phone.html

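replace() method

replace(RE_pattern, string)

Finds a substring that matches the pattern and replaces it with the string. The g modifier is applicable: with it, every match is replaced, not just the first.

    var str = "Some rabbits are rabid";
    var result = str.replace(/rab/g, "tim");

result is now "Some timbits are timid"; str itself is unchanged, because replace() returns a new string.

Substrings matched by parenthesized parts of the pattern can be referenced in the replacement string as $1, $2, etc. The pattern /rab/ has no parentheses, so it sets no $1 or $2.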
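
A minimal sketch of replace() with and without parenthesized groups. The name-swapping pattern and its input are hypothetical, added to show $1 and $2 in a replacement string.

```javascript
var str = "Some rabbits are rabid";
// replace() returns a new string; str is left as it was.
var result = str.replace(/rab/g, "tim");
console.log(result); // Some timbits are timid
console.log(str);    // Some rabbits are rabid

// Hypothetical example: $1 and $2 refer to the two parenthesized groups.
var swapped = "Doe, John".replace(/(\w+), (\w+)/, "$2 $1");
console.log(swapped); // John Doe
```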

match(pattern)

The most general pattern-matching method. Returns an array of results of the pattern-matching operation.

With the g modifier, returns an array of all the substrings that matched.

Without the g modifier, the first element of the returned array is the matched substring; the remaining elements hold the values of $1, $2, ... captured by the parenthesized parts of the pattern.

var str = "My 3 kings beat your 2 aces"; var matches = str.match(/[ab]/g);

- matches is set to ["b", "a", "a"]
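
The two behaviours can be compared side by side on the same string. The second and third patterns are hypothetical additions for illustration.

```javascript
var str = "My 3 kings beat your 2 aces";

// With g: every matching substring.
var allMatches = str.match(/[ab]/g);
console.log(allMatches); // [ 'b', 'a', 'a' ]

// Without g: the whole match first, then the parenthesized parts.
var firstMatch = str.match(/(\d) (\w+)/);
console.log(firstMatch[0]); // 3 kings
console.log(firstMatch[1]); // 3
console.log(firstMatch[2]); // kings

// No match at all gives null.
var none = str.match(/z/);
console.log(none); // null
```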

match(pattern) example

See 16-4matchExample.html

    var str = "Having a take-home exam that takes 3 hours to complete is better than a 1-hour in-class exam";
    var matches = str.match(/\d/g);

matches is set to ["3", "1"] (the elements are strings, not numbers)

Parentheses in RE

Example: 16-5complexMatchEx.html

    var str = "I have 118 credits; but I need 120 to graduate";
    var matches = str.match(/(\d+)([^\d]+)(\d+)/);
    document.write(matches, "<br />");

The 1st element of matches is the whole match, the 2nd is the value of $1, the 3rd is $2, the 4th is $3, etc.

matches array:

    element 0 (whole match): "118 credits; but I need 120"
    element 1 ($1): "118"
    element 2 ($2): " credits; but I need "
    element 3 ($3): "120"

Output of document.write: 118 credits; but I need 120,118, credits; but I need ,120

Alternate patterns

Use the alternation operator |, which matches either the pattern on its left or the pattern on its right

Example: 16-6matchAlternatives.html
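
The referenced example file is not reproduced here, so the following is only a hypothetical sketch of alternation, with made-up patterns and strings. Parentheses limit how far the alternatives extend.

```javascript
// Either "a" or "e" between "gr" and "y".
var grayOrGrey = /^gr(a|e)y$/;
console.log(grayOrGrey.test("gray")); // true
console.log(grayOrGrey.test("grey")); // true
console.log(grayOrGrey.test("griy")); // false

// Any one of three alternative words.
var pet = "I have a dog".match(/cat|dog|bird/);
console.log(pet[0]); // dog
```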

split(parameter) of String

Splits a string into an array of substrings, using a pattern as the separator

The separator can be a string or a regular expression: ":" and /:/ both work

Example: 16-7splitEx.html
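
A minimal sketch of split() with a string separator and with a regular expression. The input strings are hypothetical. A regular expression is useful when the separator can vary.

```javascript
var colors = "red:green:blue"; // hypothetical input

var byString = colors.split(":");
var byRegex = colors.split(/:/);
console.log(byString); // [ 'red', 'green', 'blue' ]
console.log(byRegex);  // [ 'red', 'green', 'blue' ]

// Separator is a comma or semicolon followed by optional whitespace.
var mixed = "a, b;c".split(/[,;]\s*/);
console.log(mixed); // [ 'a', 'b', 'c' ]
```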

Program Structure

Example: 16-3check_phone.html

Limitations?

How can you make it more flexible?

Can you generalize it for checking multiple fields?

Uniform Program Structure for multiple tests

regex_name.test( string_to_be_tested ) to test each field

if test() returns false, compile an error message

See 16-8Structure.html
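
One possible shape for such a structure is sketched below. It is not the content of 16-8Structure.html; the field names, patterns, messages and the validate function are all assumptions made for illustration. Each field is paired with a pattern, test() is applied to each in turn, and the messages for failed tests are collected.

```javascript
// Hypothetical table of checks: one pattern and one message per field.
var checks = [
  { field: "name", pattern: /^[A-Za-z][A-Za-z' -]*$/,
    message: "Name may contain only letters, spaces, apostrophes and hyphens." },
  { field: "phone", pattern: /^\(\d{3}\) \d{3}-\d{4}$/,
    message: "Phone must have the form (XXX) XXX-XXXX." },
  { field: "zip", pattern: /^\d{5}$/,
    message: "ZIP code must be five digits." }
];

// Hypothetical helper: returns the error messages for all failed tests.
function validate(values) {
  var errors = [];
  for (var i = 0; i < checks.length; i++) {
    var value = values[checks[i].field] || "";
    if (!checks[i].pattern.test(value)) {
      errors.push(checks[i].message);
    }
  }
  return errors;
}

var errors = validate({ name: "Mary Lee Ann", phone: "(555) 123-4567", zip: "1234" });
console.log(errors); // [ 'ZIP code must be five digits.' ]
```

Adding another field then only requires another entry in the table, not another block of code.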

Examples of curly braces { }

16-9-curly_braces.html

Table – Regular Expression Codes

See “Regular Expression Codes.doc”