practical extraction & report language


Upload: thaddeus-munoz

Post on 03-Jan-2016


Page 1: Practical Extraction & Report Language

Practical Extraction & Report Language

Picture taken from http://www.wendy.org/DPW2006/shirt.htm


Agenda

• Why Perl?

• Getting/Installing Perl

• Using Perl

• Structure of basic program (Hello world)

• Variables & Operators

• Regular Expressions

• Other Topics


Why Perl

• Perl is built around regular expressions
  – REs are good for string processing
  – Therefore Perl is a good scripting language
  – Perl is especially popular for CGI scripts
• Perl makes full use of the power of UNIX
• Short Perl programs can be very short
  – “Perl is designed to make the easy jobs easy, without making the difficult jobs impossible.” -- Larry Wall, Programming Perl


Getting/Installing Perl

• Windows
  – www.activestate.com
  – Download "ActivePerl"
  – Run installer
• Linux
  – Mostly installed when Linux is installed
    • userX@machineY$ which perl
  – Get it from
    • Linux distribution CDs
      – Update your installation and, during package selection, select perl
    • ActiveState.com
    • CPAN


Other Possibilities

• Using Virtual Machines
  – VMware
    • Install VMware Workstation on Windows
    • Install Linux under VMware Workstation (select perl to be installed)
  – Cygwin
    • Install Cygwin on Windows
    • It will provide a Linux-like interface in which perl can be used
    • http://www.cygwin.com/mirrors.html


Using Perl

• Windows
  – Write a program and save it with the .pl extension
  – C:\perl\bin>perl program_name.pl
• Linux
  – Write a program and save it with the .pl extension
  – userX@machineY$ perl program_name.pl
  – userX@machineY$ ./program_name.pl (make the file executable first: chmod +x program_name.pl)
  – Same under VMware & Cygwin


Structure of a basic program

#!/usr/bin/perl

# Program to do the obvious

print 'Hello world.';

• The first line is special: it gives the path to the perl installation. This path can be different, e.g., /bin/perl
• # denotes a comment; anything after # until the end of the line is a comment
• print is a built-in function; its argument, in this case, is a string constant
• Each statement ends with a semicolon
• To run it: userX@machineY$ perl hello.pl


Variables

• Scalar variables– Only one value at a time

• List variables– List of values (Arrays)


Scalar Variables

• A scalar variable can store only one value at a time.
• Scalar variables are always preceded by the $ symbol, e.g., $var1
• There is no need to declare a variable beforehand (but doing so is recommended).
• There are no data types such as character or numeric. If you treat the variable as a character, it can store a character. If you treat it as a string, it can store a string. If you treat it as a number, it can store one number.


Example scalar

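#!/usr/bin/perl
$x = "100\n";   # a string that looks like a number
print $x;
$x = $x + 1;    # used in arithmetic, the string is converted to the number 100
print $x;       # prints 101 (without a trailing newline)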

Output:

100

101
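
The point that a scalar has no fixed type can be sketched a little further. This is a minimal illustration, not part of the original slides; the variable names are made up, and `use strict` with `my` is used because declaring variables is the recommended habit.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $value = 42;            # the scalar holds a number
$value = "forty-two";      # the same scalar now holds a string
print "$value\n";          # prints forty-two

my $digits = "7";          # a string that looks like a number
my $sum    = $digits + 3;  # + treats it as a number: 10
my $text   = $digits . 3;  # . treats both operands as strings: 73
print "$sum $text\n";      # prints 10 73
```

The operator, not the variable, decides whether the value is used as a number or as a string.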


List/Array Variables

• A list variable is like an array. It can be considered a group of scalar variables.
• List variables are always preceded by the @ symbol, e.g., @items = ("Apple", "Bell", "Chair");
• As in C, the index starts from 0.
• If you want the second name you should use $items[1]
  – Watch the $ symbol here, because each element is a scalar variable.
• $# followed by the name of the list variable gives the index of its last element, which is one less than the length of the list.
  – $#items will provide the index of the last element of @items
• $len = @items; # will assign the length of the array to $len

Is the result of the two statements $len = @items and print @items the same?
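
The question above comes down to context: assigning an array to a scalar yields its length, while print receives the elements themselves. A minimal sketch, assuming the @items list from this slide:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @items = ("Apple", "Bell", "Chair");

my $len = @items;        # scalar context: the number of elements
print "$len\n";          # prints 3
print @items, "\n";      # list context: prints AppleBellChair
print "$#items\n";       # prints 2, the index of the last element
print "$items[1]\n";     # prints Bell, the second element
```

So the two statements do not give the same result: one yields a count, the other the contents.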


Example List/Array

#!/usr/bin/perl
@myarray = (1721, 2974, "string");
print @myarray;            # prints 17212974string
$myarray[0] = "string";
$myarray[1] = "1234";
$myarray[2] = "5646";
print @myarray;            # prints string12345646
print "$myarray[0]" . "$myarray[1]" . "$myarray[0]";   # prints string1234string


Operations on Arrays

• Push
  – push adds one or more things to the end of a list
  – push(@items, "table", "chair");
  – push returns the new length of the list
• Pop
  – pop removes and returns the last element
  – $myitem = pop(@items);
• shift, unshift, reverse
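
The last bullet names shift, unshift and reverse without showing them. A minimal sketch of what each does, with list contents invented for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @items = ("Apple", "Bell", "Chair");

my $first = shift(@items);         # removes and returns the first element
print "$first\n";                  # prints Apple
print "@items\n";                  # prints Bell Chair

unshift(@items, "Desk", "Easel");  # adds elements to the front of the list
print "@items\n";                  # prints Desk Easel Bell Chair

my @backwards = reverse(@items);   # returns a reversed copy; @items is unchanged
print "@backwards\n";              # prints Chair Bell Easel Desk
```

shift and unshift work on the front of the list the way pop and push work on the end.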


Example (Push & Pop)

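#!/usr/bin/perl
@myarray = (1721, 2974, "string");
print "@myarray\n";                     # prints 1721 2974 string
push(@myarray, "newval1", "newval2");
print "@myarray\n";                     # prints 1721 2974 string newval1 newval2
$popvalue = pop(@myarray);
print "$popvalue\n";                    # prints newval2
print "@myarray";                       # prints 1721 2974 string newval1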


Operators

• Arithmetic

• String

• Single and Double quotes

• Conditional


Arithmetic in Perl

$a = 1 + 2;   # Add 1 and 2 and store in $a
$a = 3 - 4;   # Subtract 4 from 3 and store in $a
$a = 5 * 6;   # Multiply 5 and 6
$a = 7 / 8;   # Divide 7 by 8 to give 0.875
$a = 9 ** 10; # Nine to the power of 10, that is, 9^10
$a = 5 % 2;   # Remainder of 5 divided by 2
++$a;         # Increment $a and then return it
$a++;         # Return $a and then increment it
--$a;         # Decrement $a and then return it
$a--;         # Return $a and then decrement it


String and assignment operators

$a = $b . $c;  # Concatenate $b and $c
$a = $b x $c;  # $b repeated $c times

$a = $b;       # Assign $b to $a
$a += $b;      # Add $b to $a
$a -= $b;      # Subtract $b from $a
$a .= $b;      # Append $b onto $a
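
With concrete values, the operators above behave as follows. This is an illustrative sketch only; the variable names and values are invented.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $word     = "ab";
my $joined   = $word . "cd";   # concatenation: abcd
my $repeated = $word x 3;      # repetition: ababab

my $total = 10;
$total += 5;                   # $total is now 15
$total -= 3;                   # $total is now 12

my $text = "Perl";
$text .= " 5";                 # $text is now "Perl 5"

print "$joined $repeated $total $text\n";   # prints abcd ababab 12 Perl 5
```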


Single and double quotes

• $a = 'apples';
• $b = 'bananas';
• print $a . ' and ' . $b;
  – prints: apples and bananas
• print '$a and $b';
  – prints: $a and $b
• print "$a and $b";
  – prints: apples and bananas


Conditions

Strings   Numbers
eq        ==        # equal to
ne        !=        # not equal to
lt        <         # less than
gt        >         # greater than
le        <=        # less than or equal to
ge        >=        # greater than or equal to

Logical
&&        # And
||        # Or
!         # negation
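
Choosing the string or the numeric operator changes the answer, because the operator decides how the operands are compared. A small sketch with invented values:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $x = "10";
my $y = "10.0";

my $numeric_equal = ($x == $y)     ? "yes" : "no";  # compared as numbers: yes
my $string_equal  = ($x eq $y)     ? "yes" : "no";  # compared as strings: no
my $numeric_less  = (9 < 10)       ? "yes" : "no";  # 9 is less than 10: yes
my $string_less   = ("9" lt "10")  ? "yes" : "no";  # "9" sorts after "1": no

print "$numeric_equal $string_equal $numeric_less $string_less\n";  # prints yes no yes no

# Logical operators combine conditions.
my $both = ($x == 10 && $y ne "10") ? "yes" : "no"; # both conditions hold: yes
print "$both\n";                                     # prints yes
```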


Control structures

• Loops
  – foreach
  – for
  – while
• Condition
  – if / else
• Subroutines


foreach

# Visit each item in turn and call it $myitem

@items = ("item1", "item2", "item3");

foreach $myitem (@items) {
    print "$myitem\n";
}


for loops

• for loops are just as in C or Java

for ($i = 0; $i < 10; ++$i) {
    print "$i\n";
}


while loops

#!/usr/local/bin/perl
$a = 1;
while ($a != 10) {
    $a++;
}


do..while loops

#!/usr/local/bin/perl
$a = 1;
do {
    $a++;
} while ($a != 10);
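
The two loop forms differ in when the condition is tested: while tests before the body runs, do..while tests after it. A minimal sketch with invented counters, starting from a value for which the condition is already false:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $n = 10;
my $while_runs = 0;
while ($n < 10) {          # condition is false at the start: body never runs
    $while_runs++;
    $n++;
}

$n = 10;
my $do_runs = 0;
do {                       # body runs once before the condition is tested
    $do_runs++;
    $n++;
} while ($n < 10);

print "$while_runs $do_runs\n";   # prints 0 1
```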


if statements

if ($a) {
    print "The string is not empty\n";
}
else {
    print "The string is empty\n";
}


if - elsif statements

if (!$a) {
    print "The string is empty\n";
}
elsif (length($a) == 1) {
    print "The string has one character\n";
}
elsif (length($a) == 2) {
    print "The string has two characters\n";
}
else {
    print "The string has many characters\n";
}


Calling subroutines

• Assume you have a subroutine printargs that just prints out its arguments

• Subroutine calls:

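  – printargs("arg1", "arg2");
    • Prints: arg1 arg2
  – $returnvalue = printargs("arg1", "arg2");
    • Prints: arg1 arg2
    • $returnvalue will be assigned the value of the last expression evaluated in printargs; with the definition given under Defining subroutines, that is the result of print (1 on success), not the number of arguments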


Defining subroutines

• Here's the definition of printargs:

sub printargs {
    print "@_\n";
}

• Parameters are put in the array @_, which can be accessed using
  – $_[0], $_[1] etc.

How many parameters are passed to the subroutine?
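
One way to answer the question above: in scalar context, @_ gives the number of arguments the subroutine received. A minimal sketch; count_args and first_arg are made-up names used only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: reports how many arguments it was called with.
sub count_args {
    my $count = scalar(@_);   # @_ in scalar context is the argument count
    return $count;
}

# Hypothetical helper: returns its first argument, or undef if there is none.
sub first_arg {
    return $_[0];
}

my $two  = count_args("arg1", "arg2");
my $none = count_args();
my $head = first_arg("arg1", "arg2");

print "$two $none $head\n";   # prints 2 0 arg1
```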


Returning a result

• Use return statement

sub maximum {
    if ($_[0] > $_[1]) {
        return $_[0];
    }
    else {
        return $_[1];
    }
}

$biggest = maximum(37, 24);


Basic pattern matching

• $sentence =~ /the/
  – True if $sentence contains "the"
• $sentence = "The dog bites.";
  if ($sentence =~ /the/) # is false
  – …because pattern matching in Perl is case-sensitive by default
• !~ is "does not contain"
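
The case-sensitivity point can be checked directly, and the /i modifier (not mentioned on the slide) is the standard way to ask for a case-insensitive match. A short sketch using the sentence from above:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $sentence = "The dog bites.";

my $exact  = ($sentence =~ /the/)  ? "match" : "no match";  # "The" has a capital T
my $nocase = ($sentence =~ /the/i) ? "match" : "no match";  # /i ignores case
my $no_cat = ($sentence !~ /cat/)  ? "true"  : "false";     # sentence has no "cat"

print "$exact\n";    # prints no match
print "$nocase\n";   # prints match
print "$no_cat\n";   # prints true
```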


RE special characters

. # Any single character except a newline

^ # The beginning of the line or string

$ # The end of the line or string

* # Zero or more of the last character

+ # One or more of the last character

? # Zero or one of the last character


RE examples

^.*$ # matches the entire string

hi.*bye # matches from "hi" to "bye" inclusive

x +y # matches x, one or more blanks, and y

^Dear # matches "Dear" only at beginning

bags? # matches "bag" or "bags"

hiss+ # matches "hiss", "hisss", "hissss", etc.
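
The patterns above can be tried against sample strings. In this sketch, matched_text is a hypothetical helper (not from the slides) that returns the part of a string a pattern matched; the sample strings are invented.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the text matched by $pattern in $string,
# or undef when there is no match.
sub matched_text {
    my ($string, $pattern) = @_;
    return $string =~ /($pattern)/ ? $1 : undef;
}

print matched_text("say hi and then bye now", 'hi.*bye'), "\n";  # prints hi and then bye
print matched_text("x   y", 'x +y'), "\n";                       # prints x   y
print matched_text("two bags", 'bags?'), "\n";                   # prints bags
print matched_text("one bag", 'bags?'), "\n";                    # prints bag
print matched_text("hissss", 'hiss+'), "\n";                     # prints hissss

# ^Dear matches only when "Dear" is at the beginning of the string.
my $anchored = defined(matched_text("My Dear", '^Dear')) ? "match" : "no match";
print "$anchored\n";                                             # prints no match
```

Note that * and + are greedy: hi.*bye runs to the last "bye" it can reach.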


Other Topics

• Split() and join()

• File handling

• Perl 5
  – Modules

• http://www.pageresource.com/cgirec/index2.htm
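
As a pointer for the first topic in the list, split breaks a string into a list on a pattern and join glues a list back into one string. A minimal sketch with an invented comma-separated line:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = "apple,bell,chair";

my @fields   = split(/,/, $line);     # ("apple", "bell", "chair")
my $count    = @fields;               # 3 fields
my $rejoined = join(" | ", @fields);  # "apple | bell | chair"

print "$count\n";        # prints 3
print "$fields[1]\n";    # prints bell
print "$rejoined\n";     # prints apple | bell | chair
```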