1 perl: subroutines (for sorting). 2 good programming strategies for subroutines #!/usr/bin/perl #...

31
1 Perl: subroutines (for sorting)

Upload: christina-singleton

Post on 28-Dec-2015

215 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 1 Perl: subroutines (for sorting). 2 Good Programming Strategies for Subroutines #!/usr/bin/perl # example why globals are bad $one = ; $two = ; $max

1

Perl: subroutines (for sorting)

Page 2: 1 Perl: subroutines (for sorting). 2 Good Programming Strategies for Subroutines #!/usr/bin/perl # example why globals are bad $one = ; $two = ; $max

2

Good Programming Strategies for Subroutines

#!/usr/bin/perl# example why globals are bad$one = <STDIN>;$two = <STDIN>;$max = &larger;print "$max\n";

sub larger { if ($one > $two) { $one; } else { $two; }In this example, it is tedious to use the same code for 2 different variables, such

as $fred and $barney

Page 3: 1 Perl: subroutines (for sorting). 2 Good Programming Strategies for Subroutines #!/usr/bin/perl # example why globals are bad $one = ; $two = ; $max

3

#!/usr/bin/perl# example why globals are bad$one = <STDIN>;$two = <STDIN>;$max = &larger;print "$max\n";

$fred = <STDIN>; #hack to make "larger" work with 2 different variables$barney = <STDIN>;

$keep_one = $one; # redundant stuff$keep_two = $two;$one = $fred;$two = $barney;$max = &larger;print "$max\n";$one = $keep_one;$two = $keep_two;

sub larger { if ($one > $two) { $one; } else { $two; }#there HAS to be a better way

Page 4: 1 Perl: subroutines (for sorting). 2 Good Programming Strategies for Subroutines #!/usr/bin/perl # example why globals are bad $one = ; $two = ; $max

4

Arguments Passed to Subroutines

• To pass parameters to a subroutine, simply list them in parentheses after the subroutine call

$n = &larger (10, 15);

$X = &larger ($one, $two);

Page 5: 1 Perl: subroutines (for sorting). 2 Good Programming Strategies for Subroutines #!/usr/bin/perl # example why globals are bad $one = ; $two = ; $max

5

Arguments Passed to Subroutines

New automatic variable (@_) • parameters passed to a subroutine are automatically stored in this

array – as many as are required• recall array syntax: $_[0] is first element of @_ array

– do NOT confuse this with $_ -- the default variable– example: $i is scalar, and distinct from @i and $i[0] – both arrays

• scope – the variables in @_ are local to the subroutine• this means that when the subroutine is called, the values of the

variables used in calling the subroutine are copied into the automatic argument array – so changes are only local to the subroutine

• This also means that if a subroutine calls a subroutine, the local scope of @_ is maintained – there would be two separate and distinct versions of @_– this is why recursion works– this is taken care of by perl – we don't have to worry about it – we just

have to understand that it occurs

Page 6: 1 Perl: subroutines (for sorting). 2 Good Programming Strategies for Subroutines #!/usr/bin/perl # example why globals are bad $one = ; $two = ; $max

6

Arguments Passed to Subroutines

We can re-write our subroutine with parameter passing as:

#!/usr/bin/perl$x = 5;$y = 18;$max = &larger ($x, $y); #call the subroutineprint "$max";

###########################sub larger { if($_[0] > $_[1]) { $_[0]; #local copy of X } else { $_[1]; #local copy of Y}

However, this is hard to read, write, check, and debug

Another problem – what happens if an extra parameter is included – you don't know from the context of the name of the subroutine "larger" that it only accepts 2 parameters.

$x = larger(10, 15, 30); #30 will be ignored

Page 7: 1 Perl: subroutines (for sorting). 2 Good Programming Strategies for Subroutines #!/usr/bin/perl # example why globals are bad $one = ; $two = ; $max

7

Arguments Passed to Subroutines

• all variables in perl are global by default• you can specify local variables with "my"sub larger { my ($a, $b); #my $a; my $b; ($a, $b) = @_; #list assignment

# $a = $_[0];# $b = $_[1];

if ($a > $b) { $a; } else {$b;}#note, you may omit ;'s for single lines of code within a #block

}

Page 8: 1 Perl: subroutines (for sorting). 2 Good Programming Strategies for Subroutines #!/usr/bin/perl # example why globals are bad $one = ; $two = ; $max

8

Argument Passing to Sub's

• The variables $a and $b are private ("scoped") to only this code block – the subroutine

• They will not affect any other $a/$b in the program

• Changes to $a/$b in other parts of the program will NOT affect $a/$b in this subroutine

• The subroutine is modular and reusable, it can be placed into virtually any perl program and operate predictably.

Page 9: 1 Perl: subroutines (for sorting). 2 Good Programming Strategies for Subroutines #!/usr/bin/perl # example why globals are bad $one = ; $two = ; $max

9

More condensed without comments

sub larger { my ($a, $b); ($a, $b) = @_; if ($a > $b) { $a; } else {$b;}}# even simplersub larger { my ($a, $b) = @_; #subroutines often have this line if ($a > $b) { $a; } else {$b;}}

Page 10: 1 Perl: subroutines (for sorting). 2 Good Programming Strategies for Subroutines #!/usr/bin/perl # example why globals are bad $one = ; $two = ; $max

10

Variable-length Parameter Lists• many traditional programming languages require subroutine

parameter lists to be strictly typed– predefined number and type of parameters

• this may be enforced in perl as well

sub larger { if(@_ != 2) { # scalar context print "Warning: 2 arguments expected\n" }::}

Page 11: 1 Perl: subroutines (for sorting). 2 Good Programming Strategies for Subroutines #!/usr/bin/perl # example why globals are bad $one = ; $two = ; $max

11

Another way: Variable-length Parameter Lists

#!/usr/bin/perl

$max = &larger (3, 5, 10, 4, 5);

sub larger{ my ($largest) = shift @_; # shift element off of the LHS of the array foreach (@_) { # default variable used ($_) if($_ > $ largest) { $largest = $_; } } $largest; }

Page 12: 1 Perl: subroutines (for sorting). 2 Good Programming Strategies for Subroutines #!/usr/bin/perl # example why globals are bad $one = ; $two = ; $max

12

Pragma: use strict;• A pragma is used to convey information to the perl compiler• use strict;

– tells perl's compiler that it should enforce some good programming rules for this code block

– essentially forces you to declare all variables with "my"

• -w (warnings pragma) – already talked about• same as: use warnings;• -w applies to all code (modules, subroutines, etc.)

#!/usr/bin/perl$i =~ m/tag/; #warnings: Name "main::i" used only once: possible typo at test.pl line2 # Use of uninitialized value in pattern match (m//) at line 2.

Page 13: 1 Perl: subroutines (for sorting). 2 Good Programming Strategies for Subroutines #!/usr/bin/perl # example why globals are bad $one = ; $two = ; $max

13

use strict;#!/usr/bin/perluse strict;$i++;$i =~ m/tag/;

# Global symbol "$i" requires explicit package name at ./strict.pl line 3.# Global symbol "$i" requires explicit package name at ./strict.pl line 4.# Execution of ./strict.pl aborted due to compilation errors.

perldoc warningsperldoc perllexwarnperldoc strict

More useful than appears:example

Page 14: 1 Perl: subroutines (for sorting). 2 Good Programming Strategies for Subroutines #!/usr/bin/perl # example why globals are bad $one = ; $two = ; $max

14

#!/usr/bin/perl# maxBad.pl# Example program to show use of subroutines and strict

use strict;

my @numbers = (1, 10, 11, 3, 9, 8, 5, 3);my $max;

$max = &max(@numbers);print "max = $max\n";if($number[1] == $max) { print "The largest number is at position 1 in array\n"; }

sub max { my $large = shift @_; foreach my $i (@_) { if($i > $large) { $large = $i; } $large;}

Global symbol "@number" requires explicit package name at ./maxBad.pl line 12.Execution of ./maxBad.pl aborted due to compilation errors.

Missing right curly or square bracket at ./maxBad.pl line 24, at end of linesyntax error at ./maxBad.pl line 24, at EOF

Page 15: 1 Perl: subroutines (for sorting). 2 Good Programming Strategies for Subroutines #!/usr/bin/perl # example why globals are bad $one = ; $two = ; $max

15

Comment on Pragmas

use warnings;use strict;

Ideally these would be used from the beginning of every program that is longer than a few lines.

It can be quite a challenge to take a large and complicated program and clean up all the warnings/errors

Adding these checks after the program is written and debugged defeats the benefit of reducing development time by finding "mistakes".

Page 16: 1 Perl: subroutines (for sorting). 2 Good Programming Strategies for Subroutines #!/usr/bin/perl # example why globals are bad $one = ; $two = ; $max

16

return Operator• return EXPRESSION

– immediately returns a value from a subroutine

my @seqs = qw/TTT GTC CTG ATG GTA CGA/;my $index = &which_codon_is("ATG", @seqs);

sub which_codon_is { my ($this_codon, @list) = @_; foreach (0..$#list) { # indices of list if($this_codon eq $list[$_] ) { # missing ) on web return $_; } } return(-1); # -1 if not found} # return is optional here – could just put -1

Page 17: 1 Perl: subroutines (for sorting). 2 Good Programming Strategies for Subroutines #!/usr/bin/perl # example why globals are bad $one = ; $two = ; $max

17

#!/usr/bin/perl# translate# Take input from STDIN,# convert DNA to AA's## Modified -- to have function call another function

$line = <>;while( $line = <>) { chomp($line); #take care of new lines $sequence = $sequence.$line; }# amino acid hash# key values pairs%aminos = ( "TTT", "F", "TTC", "F", "TTA", "L", "TTG", "L", "CTT", "L", "CTC", "L", "CTA", "L", "CTG", "L", "ATT", "I", "ATC", "I", "ATA", "I", "ATG", "M", "GTT", "V", "GTC", "V", "GTA", "V", "GTG", "V", "TCT", "S", "TCC", "S", "TCA", "S", "TCG", "S", "CCT", "P", "CCC", "P", "CCA", "P", "CCG", "P", "ACT", "T", "ACC", "T", "ACA", "T", "ACG", "T", "GCT", "A", "GCC", "A", "GCA", "A", "GCG", "A", "TAT", "Y", "TAC", "Y", "TAA", ".", "TAG", ".", "CAT", "H", "CAC", "H", "CAA", "Q", "CAG", "Q", "AAT", "N", "AAC", "N", "AAA", "K", "AAG", "K", "GAT", "D", "GAC", "D", "GAA", "E", "GAG", "E", "TGT", "C", "TGC", "C", "TGA", ".", "TGG", "W", "CGT", "R", "CGC", "R", "CGA", "R", "CGG", "R", "AGT", "S", "AGC", "S", "AGA", "R", "AGG", "R", "GGT", "G", "GGC", "G", "GGA", "G", "GGG", "G", );

$peptide = &tlate($sequence);print "$peptide\n";

############# Subroutines

sub tlate { $seq = $_[0]; $peptide=""; $length = length($seq); while($length>=3) { ($codon,$seq) = getCodon($seq); if($aminos{$codon}) { $peptide = $peptide.$aminos{$codon}; } else { $peptide = $peptide."."; #just put in a stop for XXX, ATX, etc } $length = length($seq); #print "$length\r"; } print "\n"; return($peptide) }

sub getCodon { $sequence = $_[0]; $codon = substr($sequence,0,3); # look up codon $sequence = substr($sequence,3,$length-3); # remove that codon #print "codon=$codon\n"; return($codon,$sequence); }

Page 18: 1 Perl: subroutines (for sorting). 2 Good Programming Strategies for Subroutines #!/usr/bin/perl # example why globals are bad $one = ; $two = ; $max

18

Output

./subsubRoutine.plMain programs1 a = 4 b = 6s2 a = 6 b = 10return1 = 60val = product =

Page 19: 1 Perl: subroutines (for sorting). 2 Good Programming Strategies for Subroutines #!/usr/bin/perl # example why globals are bad $one = ; $two = ; $max

19

Sort SubroutineA sort subroutine might (INCORRECTLY?) be expected to take 2

values, compare those values, and return them in sorted order:

# POOR examplesub sort_sub { my $a = shift; my $b = shift; # my ($a, $b) = @_; #same thing if ($a > $b) { return ($b, $a)} else {return ($a, $b)} }

Page 20: 1 Perl: subroutines (for sorting). 2 Good Programming Strategies for Subroutines #!/usr/bin/perl # example why globals are bad $one = ; $two = ; $max

20

Sort Subroutine Problems• this subroutine may be called hundreds/thousands of times• inefficient because

– allocate variables– assign values– return 2 values

• alternative way– both $a and $b have been allocated in the "calling" code block– return one coded value

• -1 if $a < $b #Don't swap• 1 if $a > $b # swap• 0 if order doesn't matter

• Note – sorting is a special case that is performed so often that it has been highly optimized (that's why it gets all of this "special" attention)

Page 21: 1 Perl: subroutines (for sorting). 2 Good Programming Strategies for Subroutines #!/usr/bin/perl # example why globals are bad $one = ; $two = ; $max

21

numeric sort subroutine

sub by_num {

# expect $a and $b

if($a < $b) {-1} elsif ($a > $b) {1} else {0}

} # ;'s are optional if one line in code block

Page 22: 1 Perl: subroutines (for sorting). 2 Good Programming Strategies for Subroutines #!/usr/bin/perl # example why globals are bad $one = ; $two = ; $max

22

"sort" with custom subroutineThe custom sorting subroutine (without &) may be specified for the "sort" operator in perl. It assumes

the -1, 1, 0 cases

#!/usr/bin/perluse strict;

my @nums = qw/5 2 9 12 37 2 18/;@results = sort by_num @nums;

sub by_num { if($a < $b) {-1} elsif ($a > $b) {1} else {0}}

Note, in "by_num", we need don't need to declare $a (even with strict)If we do, this will not work correctly.

This is an example of "pass by reference"The values of the array are passed to the subroutine by reference – the actual values are not passed back and forth.That is why this works – and why changing $a changes the array.

Page 23: 1 Perl: subroutines (for sorting). 2 Good Programming Strategies for Subroutines #!/usr/bin/perl # example why globals are bad $one = ; $two = ; $max

23

sort Syntax (pg 217)

sort SUBNAME LIST

sort BLOCK list

sort LIST

#Note, this is a special case where the subroutine is called WITHOUT the &

Page 24: 1 Perl: subroutines (for sorting). 2 Good Programming Strategies for Subroutines #!/usr/bin/perl # example why globals are bad $one = ; $two = ; $max

24

Example#!/usr/bin/perl# sort.pl#

@array = (9, 8, 7, 1, 2, 3, 9, 69, 5,);

@sorted = sort by_num @array;

print "sorted = @sorted\n";

sub by_num { print "$a $b\n"; if($a < $b) {-1} elsif ($a > $b) {1} else {0}}

# example -- sort-brok.pl

Page 25: 1 Perl: subroutines (for sorting). 2 Good Programming Strategies for Subroutines #!/usr/bin/perl # example why globals are bad $one = ; $two = ; $max

25

Shortcut to Sorting

• sorting like this is so common, a special operator exists to replace:if($a < $b) {-1} elsif ($a > $b) {1} else {0}

Therefore, we can also replace the whole subroutine:@results = sort by_num @nums;@results = sort { $a <=> $b} @nums;Descending order:@results = sort { $b <=> $a} @nums;

cmp can be used for string comparisons – but this is how the sort operator already sorts.

Page 26: 1 Perl: subroutines (for sorting). 2 Good Programming Strategies for Subroutines #!/usr/bin/perl # example why globals are bad $one = ; $two = ; $max

26

Example Profile#!/usr/bin/perl# sort-prof.pl (for profile)_# # Note how if I change the value of $a, it changes in the original array

use strict;

my @nums = qw/5 2 9 12 37 2 18/;my @results = sort by_num @nums;print "@nums\n";print "@results\n";

sub by_num { print "$a $b\n"; if($a == 5) { $a = 55} if($a < $b) {-1} elsif ($a > $b) {1} else {0} }

Page 27: 1 Perl: subroutines (for sorting). 2 Good Programming Strategies for Subroutines #!/usr/bin/perl # example why globals are bad $one = ; $two = ; $max

27

Example Profile5 29 1237 22 99 5555 122 1818 372 22 99 1818 1218 5555 3755 2 9 12 37 2 182 2 9 12 18 37 55

Page 28: 1 Perl: subroutines (for sorting). 2 Good Programming Strategies for Subroutines #!/usr/bin/perl # example why globals are bad $one = ; $two = ; $max

28

Sorting a Hash#!/usr/bin/perluse strict;

my %score = ("tim" => 195, "tracy" => 205, "indy" => 30);# sort keys of hash, based on values of hashmy @winners = sort by_score keys %score;foreach (@winners) { print "$_ $score{$_}\n";}

sub by_score { $score{$b} <=> $score{$a} } #note descending order

#tracy 205#tim 195#indy 30

Page 29: 1 Perl: subroutines (for sorting). 2 Good Programming Strategies for Subroutines #!/usr/bin/perl # example why globals are bad $one = ; $two = ; $max

29

Regular Expression Example

#!/usr/bin/perl

$i = "4-18-69";

# match 1 digit, dashe, 2 digits, dash, and 2 digitsif($i =~ m/\d-\d\d-\d\d/) { print "$i\n"; print "$&\n"; }

#But what happens if day is 1 digit#match both cases$i = "4-1-69";if($i =~ m/\d-(\d\d|\d)-\d\d/) { print "$i\n"; print "$&\n"; }

#Match 1 or more digits#But, this also matches 100 for day$i = "4-1-69";if($i =~ m/\d-\d+-\d\d/) { print "$i\n"; print "$&\n"; }

#What about 2 digit years VS 4 digit years# Carefull, this will match 19 before 69$i = "4-1-1969";if($i =~ m/\d-\d+-\d\d|\d\d\d\d/) { print "$i\n"; print "$&\n"; }

#$i = "4-1-1969";if($i =~ m/\d-\d+-(\d\d\d\d|\d\d)/) { print "$i\n"; print "$&\n"; }

#What about 2 digit months# But this will match 13$i = "12-1-1969";if($i =~ m/\d+-\d+-\d\d\d\d|\d\d/) { print "$i\n"; print "$&\n"; }

#How about this#Doesn work, becuase it matches 1$i = "13-1-1969";if($i =~ m/1|2|3|4|5|6|7|8|9|10|11|12-\d+-\d\d\d\d|\d\d/) { print "$i\n"; print "$&\n"; }

# Added ^ and $# Added parenthises for grouping# Will NOT match because of 14$i = "14-1-1969";if($i =~ m/^(1|2|3|4|5|6|7|8|9|10|11|12)-\d+-(\d\d\d\d|\d\d)$/) { print "$i\n"; print "$&\n"; }

# Now it will match$i = "12-1-1969";if($i =~ m/^(1|2|3|4|5|6|7|8|9|10|11|12)-\d+-(\d\d\d\d|\d\d)$/) { print "$i\n"; print "$&\n"; }

# Note limit on days to 1 or 2 digits# also, storing results into vars$i = "9-28-1969";if($i =~ m/^(1|2|3|4|5|6|7|8|9|10|11|12)-(\d{1,2})-(\d\d\d\d|\d\d)$/) { print "$i\n"; print "$&\n"; $month = $1; $day = $2; $year = $3; print "$month $day $year\n"; }

Page 30: 1 Perl: subroutines (for sorting). 2 Good Programming Strategies for Subroutines #!/usr/bin/perl # example why globals are bad $one = ; $two = ; $max

30

Page 31: 1 Perl: subroutines (for sorting). 2 Good Programming Strategies for Subroutines #!/usr/bin/perl # example why globals are bad $one = ; $two = ; $max

31

End