practical extraction & report language
Practical Extraction & Report Language
Picture taken from http://www.wendy.org/DPW2006/shirt.htm
Agenda
• Why Perl?
• Getting/Installing Perl
• Using Perl
• Structure of basic program (Hello world)
• Variables & Operators
• Regular Expressions
• Other Topics
Why Perl
• Perl is built around regular expressions
– REs are good for string processing
– Therefore Perl is a good scripting language
– Perl is especially popular for CGI scripts
• Perl makes full use of the power of UNIX
• Short Perl programs can be very short
– "Perl is designed to make the easy jobs easy, without making the difficult jobs impossible." -- Larry Wall, Programming Perl
Getting/Installing Perl
• Windows
– www.activestate.com
– Download "ActivePerl"
– Run installer
• Linux
– Usually installed when Linux is installed; check with:
    userX@machineY$ which perl
– If it is missing, get it from
    • Linux distribution CDs (update your installation and, during package selection, select perl)
    • ActiveState.com
    • CPAN
Other Possibilities
• Using Virtual Machines
– VMware
    • Install VMware Workstation on Windows
    • Install Linux under VMware Workstation (select perl to be installed)
– Cygwin
    • Install Cygwin on Windows
    • It provides a Linux-like interface in which perl can be used
    • http://www.cygwin.com/mirrors.html
Using Perl
• Windows
– Write a program and save it with the .pl extension
– C:\perl\bin>perl program_name.pl
• Linux
– Write a program and save it with the .pl extension
– userX@machineY$ perl program_name.pl
– userX@machineY$ ./program_name.pl (after making the file executable with chmod +x program_name.pl)
– Same under VMware & Cygwin
Structure of a basic program
#!/usr/bin/perl
# Program to do the obvious
print 'Hello world.';
• The first line is special: it gives the path to the perl installation. This path can be different on other systems, e.g., /bin/perl
• # denotes a comment; anything after # until the end of the line is a comment
• print is a built-in function; its argument, in this case, is a string constant
• Every statement ends with a semicolon
• Run the program with: userX@machineY$ perl hello.pl
Variables
• Scalar variables
– Only one value at a time
• List variables
– List of values (Arrays)
Scalar Variables
• A scalar variable stores only one value at a time.
• Scalar names are always preceded by the $ symbol, e.g., $var1
• Variables do not have to be declared beforehand (but declaring them is recommended).
• There are no separate data types such as character or numeric. A scalar can hold a single character, a string of any length, or a number, and Perl treats the value as text or as a number depending on how it is used.
Example scalar
#!/usr/bin/perl
$x = "100\n";
print $x;
$x = $x + 1;
print $x;
Output:
100
101
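The example above relies on Perl converting the string "100\n" to a number when it is used with +. A minimal sketch of the same idea, assuming a perl 5 interpreter; the variable names are made up for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $x = "42";          # stored as a string

my $sum  = $x + 1;     # + treats $x as a number: 43
my $text = $x . 1;     # . treats $x as a string: "421"

print "$sum\n";        # prints 43
print "$text\n";       # prints 421
```

The operator, not the variable, decides whether the value is handled as a number or as text.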
List/Array Variables
• They are like arrays. A list variable can be considered a group of scalar variables.
• They are always preceded by the @ symbol.
• e.g., @items = ("Apple", "Bell", "Chair");
• As in C, the index starts from 0.
• If you want the second name, use $items[1]
• Watch the $ symbol here: each element is a scalar variable.
• $# followed by the list variable name gives the index of the last element, which is one less than the length of the list.
• $#items gives the index of the last element of @items
• $len = @items; # assigns the length of the array to $len
Is the result of the two statements $len = @items and print @items the same?
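One way to explore the question above is to run both forms side by side. This is a small sketch assuming a perl 5 interpreter with default settings; $last is a name introduced here for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @items = ("Apple", "Bell", "Chair");

my $len  = @items;     # scalar context: the number of elements
my $last = $#items;    # index of the last element (length minus one)

print "$len\n";        # prints 3
print "$last\n";       # prints 2
print @items, "\n";    # list context: prints AppleBellChair
print "@items\n";      # inside double quotes: prints Apple Bell Chair
```

Assigning an array to a scalar yields its length, while print receives the elements themselves.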
Example List/Array
#!/usr/bin/perl
@myarray = (1721, 2974, "string");
print @myarray;
$myarray[0] = "string";
$myarray[1] = "1234";
$myarray[2] = "5646";
print @myarray;
print "$myarray[0]" . "$myarray[1]" . "$myarray[0]";
Operations on Arrays
• Push
– push adds one or more things to the end of a list
– push(@items, "table", "chair");
– push returns the new length of the list
• Pop
– pop removes and returns the last element
– $myitem = pop(@items);
• shift, unshift, reverse
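The last bullet names shift, unshift and reverse without showing them. A short sketch of how they behave, assuming a perl 5 interpreter; the data and variable names are illustrative only:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @items = ("Apple", "Bell", "Chair");

my $first = shift(@items);             # removes and returns the first element
print "$first\n";                      # prints Apple
print "@items\n";                      # prints Bell Chair

my $count = unshift(@items, "Desk");   # adds to the front, returns the new length
print "$count\n";                      # prints 3
print "@items\n";                      # prints Desk Bell Chair

my @backwards = reverse(@items);       # returns a reversed copy; @items is unchanged
print "@backwards\n";                  # prints Chair Bell Desk
```

shift and unshift work on the front of the list in the same way that pop and push work on the end.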
Example (Push & Pop)
#!/usr/bin/perl
@myarray = (1721, 2974, "string");
print "@myarray\n";
push(@myarray, "newval1", "newval2");
print "@myarray\n";
$popvalue = pop(@myarray);
print "$popvalue\n";    # the element that was removed
print "@myarray";
Operators
• Arithmetic
• String
• Single and Double quotes
• Conditional
Arithmetic in Perl
$a = 1 + 2;    # Add 1 and 2 and store in $a
$a = 3 - 4;    # Subtract 4 from 3 and store in $a
$a = 5 * 6;    # Multiply 5 and 6
$a = 7 / 8;    # Divide 7 by 8 to give 0.875
$a = 9 ** 10;  # Nine to the power of 10, that is, 9^10
$a = 5 % 2;    # Remainder of 5 divided by 2
++$a;          # Increment $a and then return it
$a++;          # Return $a and then increment it
--$a;          # Decrement $a and then return it
$a--;          # Return $a and then decrement it
String and assignment operators
$a = $b . $c;  # Concatenate $b and $c
$a = $b x $c;  # $b repeated $c times
$a = $b;       # Assign $b to $a
$a += $b;      # Add $b to $a
$a -= $b;      # Subtract $b from $a
$a .= $b;      # Append $b onto $a
Single and double quotes
• $a = 'apples';
• $b = 'bananas';
• print $a . ' and ' . $b;
– prints: apples and bananas
• print '$a and $b';
– prints: $a and $b
• print "$a and $b";
– prints: apples and bananas
Conditions
Strings   Numbers
eq        ==        # equal to
ne        !=        # not equal to
lt        <         # less than
gt        >         # greater than
le        <=        # less than or equal to
ge        >=        # greater than or equal to

Logical
&&        # and
||        # or
!         # negation
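The string and numeric operators can give different answers for the same operands, which is why Perl keeps two sets. A small sketch, assuming a perl 5 interpreter; the values are chosen only to illustrate the difference:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $p = "1.0";
my $q = "1";

my $numeric_equal = ($p == $q) ? 1 : 0;   # compared as numbers: 1.0 equals 1
my $string_equal  = ($p eq $q) ? 1 : 0;   # compared as text: "1.0" differs from "1"

my $numeric_less = (10 < 9)       ? 1 : 0;   # 10 is not less than 9
my $string_less  = ("10" lt "9")  ? 1 : 0;   # "1" sorts before "9" character by character

print "$numeric_equal $string_equal\n";   # prints 1 0
print "$numeric_less $string_less\n";     # prints 0 1
```

Use the numeric column for numbers and the string column for text.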
Control structures
• Loops
– foreach
– for
– while
• Condition
– if / else
• Subroutines
foreach
# Visit each item in turn and call it $myitem
@items = ("item1", "item2", "item3");
foreach $myitem (@items)
{
    print "$myitem\n";
}
for loops
• for loops are just as in C or Java
for ($i = 0; $i < 10; ++$i)
{
    print "$i\n";
}
while loops
#!/usr/local/bin/perl
$a = 1;
while ($a != 10)
{
    $a++;
}
do..while loops
#!/usr/local/bin/perl
$a = 1;
do
{
    $a++;
}
while ($a != 10);
if statements
if ($a)
{
    print "The string is not empty\n";
}
else
{
    print "The string is empty\n";
}
if - elsif statements
if (!$a)
{
    print "The string is empty\n";
}
elsif (length($a) == 1)
{
    print "The string has one character\n";
}
elsif (length($a) == 2)
{
    print "The string has two characters\n";
}
else
{
    print "The string has many characters\n";
}
Calling subroutines
• Assume you have a subroutine printargs that just prints out its arguments
• Subroutine calls:
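– printargs("arg1", "arg2");
    • Prints: arg1 arg2
– $returnvalue = printargs("arg1", "arg2");
    • Prints: arg1 arg2
    • $returnvalue is assigned the value of the last expression evaluated in printargs. If the body only calls print, that is the result of print (1 on success), not the number of arguments.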
Defining subroutines
• Here's the definition of printargs:
sub printargs
{
    print "@_\n";
}
• Parameters are put in the array @_, which can be accessed using
– $_[0], $_[1] etc.
How many parameters are passed to the subroutine?
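One way to answer the question above is to evaluate @_ in scalar context, just as with any other array. The sketch below assumes a perl 5 interpreter; countargs is a hypothetical subroutine written for this illustration, not part of the original slides:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: prints its arguments and returns how many it received.
sub countargs
{
    my $count = scalar(@_);    # number of elements in @_
    print "Got $count: @_\n";
    return $count;
}

my $n = countargs("arg1", "arg2");   # prints Got 2: arg1 arg2
print "$n\n";                        # prints 2
```

Because the subroutine ends with an explicit return, the caller receives the count rather than the result of print.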
Returning a result
• Use return statement
sub maximum
{
    if ($_[0] > $_[1])
    {
        return $_[0];
    }
    else
    {
        return $_[1];
    }
}
$biggest = maximum(37, 24);
Basic pattern matching
• $sentence =~ /the/
– True if $sentence contains "the"
• $sentence = "The dog bites.";
  if ($sentence =~ /the/) # is false
– …because Perl is case-sensitive
• !~ is "does not contain"
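The case-sensitivity point can be checked directly. The sketch below assumes a perl 5 interpreter; the /i modifier (ignore case) is standard Perl but is not covered in the slides, so treat it as an addition:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $sentence = "The dog bites.";

my $exact  = ($sentence =~ /the/)  ? 1 : 0;   # 0: "The" has a capital T
my $nocase = ($sentence =~ /the/i) ? 1 : 0;   # 1: /i ignores case
my $no_cat = ($sentence !~ /cat/)  ? 1 : 0;   # 1: the sentence does not contain "cat"

print "$exact $nocase $no_cat\n";   # prints 0 1 1
```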
RE special characters
. # Any single character except a newline
^ # The beginning of the line or string
$ # The end of the line or string
* # Zero or more of the last character
+ # One or more of the last character
? # Zero or one of the last character
RE examples
^.*$ # matches the entire string
hi.*bye # matches from "hi" to "bye" inclusive
x +y # matches x, one or more blanks, and y
^Dear # matches "Dear" only at beginning
bags? # matches "bag" or "bags"
hiss+ # matches "hiss", "hisss", "hissss", etc.
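The patterns above can be tried against sample strings. This is a minimal sketch assuming a perl 5 interpreter; matches is a hypothetical helper and the test strings are invented for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $string matches $pattern, 0 otherwise.
sub matches
{
    my ($pattern, $string) = @_;
    return ($string =~ /$pattern/) ? 1 : 0;
}

print matches('^Dear',   "Dear Sir"),       "\n";   # prints 1
print matches('^Dear',   "Oh Dear"),        "\n";   # prints 0
print matches('bags?',   "bag"),            "\n";   # prints 1
print matches('hiss+',   "his"),            "\n";   # prints 0
print matches('x +y',    "x   y"),          "\n";   # prints 1
print matches('x +y',    "xy"),             "\n";   # prints 0
print matches('hi.*bye', "hi and goodbye"), "\n";   # prints 1
```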
Other Topics
• Split() and join()
• File handling
• Perl 5
– Modules
• http://www.pageresource.com/cgirec/index2.htm
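As a brief taste of the first topic listed above, split() breaks a string into a list and join() glues a list back into a string. A minimal sketch assuming a perl 5 interpreter, with made-up sample data:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = "apple,banana,cherry";

my @fruits = split(/,/, $line);      # break the string at each comma
my $count  = @fruits;                # number of pieces
my $joined = join(" - ", @fruits);   # glue the pieces back with a new separator

print "$count\n";     # prints 3
print "$joined\n";    # prints apple - banana - cherry
```

The first argument to split is a regular expression, so the same pattern syntax shown earlier applies.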