An Introduction to Perl
MBG8680 2006
Gerard Tromp

References
Books:
• Wall L, Christiansen T, Orwant J. Programming Perl. Sebastopol, CA: O'Reilly, 2000:1-1070
• Cozens S. Advanced Perl Programming. Sebastopol, CA: O'Reilly, 2005:1-281
• Christiansen T, Torkington N. Perl Cookbook. Sebastopol, CA: O'Reilly, 1998:1-757
• Perl manual pages
Web: (not exhaustive – try Google: learning perl)
• http://www.oreilly.com/
• http://www.perl.com/ (O'Reilly maintains)
• http://www.cpan.org (Comprehensive Perl Archive Network)
• http://learn.perl.org/

What is Perl?
Scripting language
• Interpreted at run-time
Developed as an improved awk/nawk
• Data/text extraction tool on UNIX
• awk: Aho, Weinberger and Kernighan (Bell Laboratories)
  – A. Aho, B. Kernighan, and P. Weinberger. AWK -- A pattern scanning and processing language. Software Practice and Experience, 9(4):267--280, 1979
Extremely powerful pattern matching capabilities (regular expression engine)

What is Perl? (2)
Extensible
• Modules and Packages (CPAN: www.cpan.org)
General programming language
Can be used for:
• system calls (date, time, sockets, network)
• file IO
Complex programming tasks
• Genome builds are performed with Perl
What is Perl? The official description.
Perl is a general-purpose programming language originally developed for text manipulation and now used for a wide range of tasks including system administration, web development, network programming, GUI development, and more.
The language is intended to be practical (easy to use, efficient, complete) rather than beautiful (tiny, elegant, minimal). Its major features are that it's easy to use, supports both procedural and object-oriented (OO) programming, has powerful built-in support for text processing, and has one of the world's most impressive collections of third-party modules.

Some important concepts
Perl uses punctuation and some characters to distinguish specific meaning (as do most computer languages).
Train your eyes to note the difference between:
• ( ), [ ], { } – important delimiters
• $, @, % – important data types (variables)

Basics – variables
Variable syntax: variables contain data
Types
• scalar       $   $foo   simple value, e.g., string, number
• array        @   @foo   list of values
• hash         %   %foo   paired lists of keys – values
• subroutine   &   &foo   block (chunk) of code that can be called
• typeglob     *   *foo   all things called foo
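As a minimal sketch of the variable types listed above, the following declares one of each common kind and reads single elements back. All names and values are made up for illustration and are not from the lecture.

```perl
use strict;
use warnings;

# Scalar: a single value (string or number)
my $gene = "geneA";

# Array: an ordered list of values, indexed from 0
my @exons = (101, 202, 303);

# Hash: unordered key/value pairs
my %length_of = (geneA => 1200, geneB => 850);

# Subroutine: a named block of code that can be called
sub describe {
    my ($name) = @_;
    return "gene $name";
}

# A single element is itself a scalar, so it is accessed with the $ sigil
my $first_exon = $exons[0];
my $gene_len   = $length_of{geneA};
my $label      = describe($gene);

print "$label: first exon $first_exon, length $gene_len\n";
```

Note that `$exons[0]` and `$length_of{geneA}` use `$` even though the containers are written `@exons` and `%length_of`; the brackets or braces tell Perl which container is meant.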

Basics – functions (procedures)
Function syntax:
Perl does not distinguish between functions, procedures and subroutines (other languages do)
Function syntax is defined in the manual pages
• man perlfunc
• see perldoc perlfunc, or perldoc -f "function name"
Some functions take no arguments, others take variable/optional arguments, e.g.,
• print FILEHANDLE LIST (LIST is list of variables)
• print LIST
• print
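The three forms of print listed above can be sketched as follows. The in-memory filehandle at the end is an addition for illustration (it is not part of the lecture); it only serves to capture what print writes.

```perl
use strict;
use warnings;

my @words = ("alpha", "beta");

# print FILEHANDLE LIST: write the list to a named filehandle
print STDOUT "to STDOUT: ", @words, "\n";
print STDERR "to STDERR\n";

# print LIST: no filehandle given, so the default (normally STDOUT) is used
print "count: ", scalar(@words), "\n";

# print with no arguments prints the default variable $_
for (@words) {
    print;          # same as: print $_;
    print "\n";
}

# Write into an in-memory filehandle to capture what print produces
my $captured = "";
open(my $out, '>', \$captured) or die "cannot open in-memory file: $!";
print $out @words;  # list elements are written one after another, no separator
close($out);
```

There is no comma between the filehandle and the list; `print $out, @words` would instead print the filehandle itself as part of the list.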

Basics – operators
Operators "do things":
Mathematical
• addition         +    $foo + $bar
• multiplication   *    $foo * $bar
• division         /    $foo / $bar
• subtraction      -    $foo - $bar
• modulus          %    $foo % $bar
• exponentiation   **   $foo ** $bar

Basics – operators (2)
Assignment
• simple    =    $a = 3; $a = "abc"
• complex
  – mathematical
      *=    multiply      $a *= 3     ($a == 9)
      -=    subtract      $a -= 4     ($a == 5)
      +=    add           $a += 5     ($a == 10)
  – string
      .=    concatenate   $a .= "d"   ($a eq "abcd")
      x=    repeat        $a x= 3
      ||=   conditional   $a ||= "a"
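The arithmetic and assignment operators can be traced step by step in a short sketch. The variable names are illustrative; the numeric sequence follows the values shown on the slide (3, 9, 5, 10).

```perl
use strict;
use warnings;

my $n = 3;
$n *= 3;    # 9
$n -= 4;    # 5
$n += 5;    # 10

my $power = 2 ** 5;      # exponentiation: 32
my $rest  = 17 % 5;      # modulus: 2

my $s = "abc";
$s .= "d";               # concatenate: "abcd"
$s x= 2;                 # repeat: "abcdabcd"

my $unset;
$unset ||= "default";    # assigned, because $unset was false (undef)
my $kept = "value";
$kept  ||= "default";    # unchanged, because $kept was already true

print "$n $power $rest $s $unset $kept\n";
# prints: 10 32 2 abcdabcd default value
```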

Basics – operators (3)
Logical
• and   &&, and   $a && $b
• or    ||, or    $a || $b
• not   !, not    ! $a
• xor   xor       $a xor $b

Basics – operators (4)
Test
                            numeric   string
• equality                  ==        eq
• inequality                !=        ne
• less than                 <         lt
• greater than              >         gt
• less than or equal        <=        le
• greater than or equal     >=        ge
• comparison                <=>       cmp
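Why there are two sets of test operators becomes clearer with a small example: the same values compare differently as numbers and as strings. This is an illustrative sketch with made-up values.

```perl
use strict;
use warnings;

# Numeric vs string comparison of the same pair of values
my $num_less = (10 < 9)      ? 1 : 0;   # 0: ten is not less than nine
my $str_less = ("10" lt "9") ? 1 : 0;   # 1: the character "1" sorts before "9"

# <=> and cmp return -1, 0 or 1, which is what sort expects
my @values  = (10, 9, 100, 1);
my @numeric = sort { $a <=> $b } @values;   # 1 9 10 100
my @string  = sort { $a cmp $b } @values;   # 1 10 100 9

# Logical operators combine tests
my $in_range = ($values[0] >= 1 && $values[0] <= 10) ? "yes" : "no";

print "@numeric\n@string\n$in_range\n";
```

Using `==` on strings or `eq` on numbers is a common source of silent errors; running perl with warnings enabled reports the first case when a string does not look like a number.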

Basics – control
Flow control (execute till condition is met)
conditional
• if        if( CONDITION ){ }
            if( CONDITION ){ }
            elsif( CONDITION ){ }
            else{ }
• unless    unless( CONDITION ){ }
• while     while( CONDITION ){ }
• for       for( $a=1; $a<10; $a++ ){ }
• foreach   foreach( LIST ){ }

Basics – control (2)
Flow control (execute till condition is met)
termination
• next    next;
          next if ( CONDITION );
          skips the rest of the current iteration and starts the next one
• last    last;
          last if ( CONDITION );
          terminates the loop
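The control structures and the two termination statements can be combined in one short sketch. The numbers and labels are arbitrary and chosen only to make each branch visible.

```perl
use strict;
use warnings;

my @kept;
foreach my $i (1 .. 10) {
    next if $i % 2;      # odd number: skip to the next iteration
    last if $i > 8;      # stop the loop entirely once past 8
    push @kept, $i;
}
# @kept now holds 2 4 6 8

my $size;
my $count = scalar @kept;
if ($count == 0) {
    $size = "empty";
} elsif ($count < 3) {
    $size = "small";
} else {                 # else takes no condition
    $size = "large";
}

# C-style for loop and while loop computing the same sum
my $sum_for = 0;
for (my $j = 1; $j < 10; $j++) { $sum_for += $j; }

my ($sum_while, $k) = (0, 1);
while ($k < 10) { $sum_while += $k; $k++; }

print "@kept $size $sum_for $sum_while\n";
# prints: 2 4 6 8 large 45 45
```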

Text Manipulation in Perl
Text manipulation was the primary reason for developing Perl originally
The text manipulation “engine” in Perl is an extended Unix Regular Expression (REGEX)
History
• Derived from “regular sets” (mathematical language theory)
• Part of Unix editors ‘qed’ and ‘ed’ -> grep/egrep
• Incorporated into sed, awk (nawk)
• Extended in some current versions of Unix (Linux) to reflect the Perl extensions
• Incorporated into Java regular expressions

Regular Expressions
Way to specify a set of strings without enumerating each possibility
Way to specify a pattern to match
Distinct syntax
Delimiters
• /PATTERN/     traditional
• m#PATTERN#    almost any other character, when the pattern is prefixed with m
Metacharacters
• special interpretation of specific characters/character combinations

Regular Expression – Metacharacters (1)
[ ]    list / character class      Match any character listed between brackets
[^ ]   negated list                Match any character except listed characters
$      terminal anchor             Match end of string
^      proximal anchor             Match beginning of string
.      single-character wildcard   Match any character (once), except newline by default
*      multi-character wildcard    Repeat the preceding character/pattern 0 or more times; as .* it matches as many characters as possible (greedy)
|      alternation / or            Match pattern preceding or following
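A brief sketch of these metacharacters at work, filtering a small made-up word list with grep. The list and patterns are illustrative only.

```perl
use strict;
use warnings;

my @lines = ("cat", "cot", "coat", "scatter", "dog");

# [ ] character class with ^ and $ anchors: c, then a or o, then t, nothing else
my @class   = grep { /^c[ao]t$/ } @lines;     # cat cot

# [^ ] negated class: c, any one character that is not "a", then t
my @negated = grep { /^c[^a]t$/ } @lines;     # cot

# . matches any single character; .* matches any run of characters
my @dot     = grep { /^c.*t$/ } @lines;       # cat cot coat

# | alternation: either pattern, anywhere in the string
my @either  = grep { /dog|scat/ } @lines;     # scatter dog

print "@class | @negated | @dot | @either\n";
```

Without the anchors, `/c[ao]t/` would also match "scatter", because an unanchored pattern may match anywhere in the string.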

Regular Expression – Metacharacters (2)
Unix escape characters (metacharacters)
\ – backslash
• “escapes” meaning of special (non-alphanumeric) character, e.g., $, %, ^
• converts some alphabetical characters into special metacharacters
  – \n newline
  – \r carriage return
  – \t tab
  – \f form-feed
  – \a alarm (BEL)
  – \0 ASCII NULL
  – \e escape

Regular Expression – Metacharacters (3)
Perl extensions
\s [ \t\n\r\f] whitespace
\S [^ \t\n\r\f] not whitespace
\w [a-zA-Z_0-9] word character
\W [^a-zA-Z_0-9] not word character
\d [0-9] digit
\D [^0-9] non-digit
\b true at word boundary
\B true not at word boundary
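The shorthand classes are easiest to understand when applied to a sample record. The record below is invented for illustration; it mixes a tab and multiple spaces on purpose.

```perl
use strict;
use warnings;

my $record = "id 42\tsample_A  run7";

# \s+ splits on any run of whitespace (space, tab, newline ...)
my @fields  = split /\s+/, $record;          # id, 42, sample_A, run7

# \d+ finds runs of digits
my @numbers = $record =~ /(\d+)/g;           # 42, 7

# \w+ finds runs of word characters (letters, digits, underscore)
my @words   = $record =~ /(\w+)/g;           # id, 42, sample_A, run7

# \b is true at a word boundary
my $starts_word = ($record =~ /\brun/)   ? 1 : 0;   # 1: "run" begins a word
my $whole_word  = ($record =~ /\brun\b/) ? 1 : 0;   # 0: the word continues with "7"

print scalar(@fields), " fields; numbers: @numbers\n";
```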

Regular Expression – Quantifiers
Quantifiers allow specification of how many times the previous character/pattern should be matched
Originally limited in Unix
*            match 0 or more times
{Min,Max}    match at least Min times and no more than Max times
{Min,}       match at least Min times
{,Max}       match no more than Max times (in Perl write {0,Max}; perl versions before 5.34 treat {,Max} as literal text)
{Count}      match exactly Count times

Regular Expression – Quantifiers (2)
Perl extensions
+ match at least once
? match zero or 1 times
*? match minimum of 0 or more times
+? match at least once but minimum times
?? match minimum of 0 or 1 times
{}? minimal form of specific quantifiers
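The difference between greedy and minimal quantifiers is clearest on input where both can succeed. The following sketch uses a made-up HTML fragment; the counted quantifiers are tested against the string "2006".

```perl
use strict;
use warnings;

my $html = "<b>bold</b> and <i>italic</i>";

# Greedy: .* takes as much as it can, so the match runs to the last ">"
my ($greedy)  = $html =~ /<(.*)>/;      # b>bold</b> and <i>italic</i

# Minimal: .*? stops at the first ">" that lets the pattern succeed
my ($minimal) = $html =~ /<(.*?)>/;     # b

# Counted and optional quantifiers
my $three_digits = ("2006" =~ /^\d{3}$/)   ? 1 : 0;   # 0: there are four digits
my $two_to_four  = ("2006" =~ /^\d{2,4}$/) ? 1 : 0;   # 1
my $optional_s   = ("gene" =~ /^genes?$/)  ? 1 : 0;   # 1: the s may be absent

print "greedy:  $greedy\nminimal: $minimal\n";
```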

Regular Expression – Capturing
Capturing allows (a portion of) the pattern to be used elsewhere
Originally limited in Unix (awk/sed)
\(PATTERN\)   escaped parentheses
Captured pattern(s) stored in buffers: $1, $2 … $n (sed itself refers to them as \1, \2 … \9)
For input line:
  “This is a test”
the pattern:
  /\([Tt]his\).*\(t[es]*t\)/
yields two buffers:
  $1 == “This”; $2 == “test”

Regular Expression – Perl Capturing
Capturing allows (a portion of) the pattern to be used elsewhere
In Perl – do NOT escape parentheses
(PATTERN)   parentheses
Captured pattern(s) stored in variables: $1, $2 … $n
For input line:
  “This is a test”
the pattern:
  /([Tt]his).*(t[es]*t)/
yields two variables:
  $1 == “This”; $2 == “test”
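The capturing example from the slide can be run directly. The sketch below also shows a second, commonly used idiom (a match in list context), which is an addition and not taken from the lecture.

```perl
use strict;
use warnings;

my $line = "This is a test";
my ($first, $second);

if ($line =~ /([Tt]his).*(t[es]*t)/) {
    # $1 and $2 are only meaningful after a successful match
    ($first, $second) = ($1, $2);
}

# In list context a match returns the captured substrings directly
my ($word1, $word2) = $line =~ /^(\w+)\s+(\w+)/;

print "$first $second $word1 $word2\n";
# prints: This test This is
```

Copying `$1` and `$2` into named variables right away is a defensive habit: the next successful match overwrites them.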

Regular Expression – Perl Capturing and Clustering
(?#…)    comment – ignore
(?:…)    cluster, but do not capture
(?=…)    look ahead: test to see if pattern matches ahead
(?!…)    negative look ahead: test that pattern does NOT match ahead
(?<=…)   look behind
(?<!…)   negative look behind
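Each of these constructs can be tried on a short string. The strings below are invented for illustration; the point is that clustering and look-around affect what matches without adding to what is captured.

```perl
use strict;
use warnings;

# (?:...) groups alternatives without capturing, so the digits land in $1
my ($number) = "chromosome20" =~ /^(?:chr|chromosome)(\d+)$/;   # 20

my $sizes = "size 150kb, 75bp";

# (?=...) look ahead: digits only if followed by "kb"; "kb" is not consumed
my ($kb) = $sizes =~ /(\d+)(?=kb)/;                              # 150

# (?!...) negative look ahead: digits NOT followed by "k" or another digit
my ($bp) = $sizes =~ /(\d+)(?![\dk])/;                           # 75

# (?<=...) look behind: digits only if preceded by "chr"
my ($chr) = "gene12 on chr20" =~ /(?<=chr)(\d+)/;                # 20

# (?<!...) negative look behind: count "20" not preceded by "chr"
my $loose = () = "chr20 pos20" =~ /(?<!chr)20/g;                 # 1

print "$number $kb $bp $chr $loose\n";
```

The negative look ahead excludes a following digit as well as "k"; otherwise the engine could backtrack and match a shorter run such as "15" out of "150kb".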

Perl quotes
Different quote characters have specific meaning and properties.
Interpolation is the expansion of variables; it occurs for some quote types but not others
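A minimal sketch of how the choice of quote affects interpolation. It uses the conventional quotes and their generic forms (q, qq, qw, qr, s, tr); command execution with backticks is left out so that the example does not depend on any external program. Names and strings are illustrative.

```perl
use strict;
use warnings;

my $name = "perl";

my $single = 'Hello $name\n';        # no interpolation: literal $name and \n
my $double = "Hello $name";          # interpolation: Hello perl
my $q      = q{Hello $name};         # generic form of single quotes
my $qq     = qq{Hello "$name"};      # generic double quotes; inner " needs no escape
my @list   = qw(abc def ghi);        # word list, same as ('abc', 'def', 'ghi')

my $re     = qr/^$name$/i;           # compiled pattern; $name is interpolated
my $match  = ("PERL" =~ $re) ? 1 : 0;        # 1: case-insensitive match

(my $upper = $name)   =~ tr/a-z/A-Z/;        # character translation: PERL
(my $swap  = $double) =~ s/$name/world/;     # substitution: Hello world

print "$single | $double | $qq | $upper | $swap\n";
```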

Perl quotes (2)
Conventional   Generic   Interpretation          Interpolation
' '            q//       Literal string          No
" "            qq//      Literal string          Yes
` `            qx//      Command execution       Yes
( )            qw//      Word list               No
/ /            m//       Pattern match           Yes
s///           s///      Pattern substitution    Yes
y///           tr///     Character translation   No
(none)         qr//      Regular expression      Yes

Variable assignment
$x = "abc";
@x = ("abc", "def", "ghi", "klm");
%x = (1, "abc", 2, "def", 3, "ghi", 4, "klm");
What does the following produce?
print $x, "\n";
print $x[3], "\n";
print $x{2}, "\n";
Output:
abc
klm
def
What happened and why?

A Simple Command-line Script Using an Array
Type the following on a line in the PuTTY window (shell window)
perl -e '@x=(2,5,7,9,11); print "@x\n";'
perl -e '@x=(2,5,7,9,11); print "$x[4]\n";'
perl -e '@x=(2,5,7,9,11); foreach $x (@x) {print "$x\n"; }'
NOTE: command-line scripts are tricky since the entire script must be enclosed in single quotes

A Simple Command-line Script Using a Hash
Type the following on a line
perl -e '%x=(2,5,7,9,11,15); print "%x\n";'
perl -e '%x=(2,5,7,9,11,15); print "$x{5}\n";'
perl -we '%x=(2,5,7,9,11,15); print "$x{5}\n";'
perl -e '%x=(2,5,7,9,11,15); print "$x{7}\n";'
perl -e '%x=(2,5,7,9,11,15); foreach $x (keys %x) {print "$x\t$x{$x}\n"; }'

A Simple (file) Program
A program to extract specific URL data from html generated by NCBI Map viewer “view as table”
#! /usr/bin/perl -w
while(<>){
    if ( /href=\"(http:.*?list_uids=)(\d+)\">([-\w\*]+?)</ ){
        print "$1$2\t$2\t$3\n";
        &mysub($1,$2,$3);
    }
}
sub mysub{
    # … do something;
}

Dissection of a simple program
Examine program line by line
 1 #! /usr/bin/perl -w
 2
 3 while(<>){
 4     if ( /href=\"(http:.*?list_uids=)(\d+)\">([-\w\*]+?)</ ){
 5         print "$1$2\t$2\t$3\n";
 6         &mysub($1,$2,$3);
 7     }
 8 }
 9
10 sub mysub{
11     # … do something;
12 }

Dissection of a simple program (2)
Invocation line
 1 #! /usr/bin/perl -w
This line ‘starts’ the Perl program
Syntax is derived from Unix shell script syntax
• #! (pound-bang)
  – tells Unix shell that the next argument is the name, or path and name, of a program (executable)
• /usr/bin/perl
  – tells Unix shell which executable (perl) to find in which path (directory location)
• -w
  – “flag(s)” passed to executable (program)
  – tell program to “do things” or adopt specific behavior
  – here: turn on perl warnings

Dissection of a simple program (3)
Control loop and input operator
 3 while(<>){
       # elided lines 4 – 7
 8 }
while ( CONDITION ) BLOCK
• execute loop until condition becomes false
• here CONDITION is <> , an input operator
  – reads from the files named on the command line or, if there are none, from STDIN, a filehandle accessible to every program
  – reads one line at a time until the end-of-file, i.e., until no further data
• BLOCK is a block (chunk) of code
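The read-until-end-of-file behaviour of the loop can be sketched without a real input file. The example below substitutes an in-memory filehandle for STDIN (an assumption made so that the sketch is self-contained) and a named variable for the implicit $_.

```perl
use strict;
use warnings;

# Stand-in for a file or STDIN: an in-memory filehandle over a string
my $text = "first line\nsecond line\nthird line\n";
open(my $in, '<', \$text) or die "cannot open in-memory file: $!";

my $count = 0;
my @seen;

# <$in> returns one line per iteration and undef at end-of-file,
# which makes the while condition false and ends the loop.
# A bare while(<>) behaves the same way but puts each line in $_.
while (my $line = <$in>) {
    chomp $line;            # remove the trailing newline
    $count++;
    push @seen, $line;
}
close($in);

print "read $count lines; last was '$seen[-1]'\n";
```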

Dissection of a simple program (4)
if statement – if ( CONDITION ) BLOCK
 4     if ( /href=\"(http:.*?list_uids=)(\d+)\">([-\w\*]+?)</ ){
 5         print "$1$2\t$2\t$3\n";
 6         &mysub($1,$2,$3);
 7     }
if ( /PATTERN/ ) BLOCK
• if PATTERN matches execute the BLOCK
href=\"(http:.*?list_uids=)(\d+)\">([-\w\*]+?)<
• what are literals?
• what are character classes?
• what does the pattern match?

Dissection of a simple program (5)
if statement – if ( CONDITION ) BLOCK
 4     if ( /href=\"(http:.*?list_uids=)(\d+)\">([-\w\*]+?)</ ){
 5         print "$1$2\t$2\t$3\n";
 6         &mysub($1,$2,$3);
 7     }
print "$1$2\t$2\t$3\n";
• what does the line do?
• what are $1, $2 and $3? where are they in the pattern?
&mysub($1,$2,$3);
• what is &mysub?
• what are $1, $2 and $3 with respect to mysub?
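One way to explore these questions is to apply the pattern to a single line and inspect the captures. The input line below is hypothetical: the host, query string, ID and gene symbol are made up and only mimic the general shape of a link containing list_uids=.

```perl
use strict;
use warnings;

# Hypothetical input line; host, query string, ID and symbol are invented
my $line = '<td><a href="http://www.example.org/query.fcgi?db=gene&list_uids=1234">ABC-1*</a></td>';

my ($url_prefix, $id, $symbol);
if ($line =~ /href=\"(http:.*?list_uids=)(\d+)\">([-\w\*]+?)</) {
    # $1: everything from http: up to and including list_uids=
    # $2: the run of digits that follows
    # $3: the link text (word characters, "-" or "*") up to the next "<"
    ($url_prefix, $id, $symbol) = ($1, $2, $3);
    print "$1$2\t$2\t$3\n";
}
```

Here `$1$2` reassembles the full URL, so the printed row holds the URL, the numeric ID and the link text separated by tabs.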

Dissection of a simple program (6)
Subroutine
10 sub mysub{
11     # … do something;
12 }
sub mysub BLOCK
• subroutine declaration and code BLOCK
• BLOCK consists of { code }
• everything after # is a comment
• this is a null subroutine – does nothing
• perl does not require declaration of parameters
• all parameters passed in a call are made available to the subroutine as an array – @_
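How arguments reach a subroutine through @_ can be sketched with two small subroutines. The names format_row and count_args are hypothetical and stand in for whatever mysub would do.

```perl
use strict;
use warnings;

# Arguments arrive in the array @_; the sub unpacks them into named variables
sub format_row {
    my ($url, $id, $name) = @_;
    return join("\t", $name, $id, $url);
}

# The number of arguments is not declared, so a sub can accept any number
sub count_args {
    return scalar @_;
}

my $row   = format_row("http://www.example.org/?id=7", 7, "geneA");
my $nargs = count_args(1, 2, 3, 4);      # 4

print "$row\n$nargs\n";
```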

Getting Help
perldoc              very important program
perldoc perldoc      how to use perldoc
perldoc perl         list of available topics*
perldoc perlintro    useful material like this lecture
perldoc perlfaq      common questions answered
perldoc 'topic'      information on specified topic from list above
* perldoc will extract documentation embedded in packages. The list returned by ‘perldoc perl' is for the base perl installation

Getting Help (2)
Books – see references
Web – see references
Unix ‘man’ command
• Although perldoc will return help/information for most perl-related items, there are still a few that only have ‘man pages’

Hands-on Problems
Write a Perl script that will do the following.
1: for chromosome 20, create a tab-delimited list of:
• gene names
• gene ids (GeneID number)
• chromosomal location (beginning, end)
• orientation
2: extend the columns to include (where appropriate):
• HUGO HGNC ID
• OMIM ID

Improvements on the script(s)
Wouldn’t it be great if you could skip the browsing part and go straight to the web page in Perl?
Look at the LWP module
• http://search.cpan.org/dist/libwww-perl/lib/LWP.pm
What is a ‘module’?

Perl modules
A module is a collection of scripts (code) that have already been written for you
Strictly speaking, a module is a collection of one or more packages
A package is a small collection of code
package NAME;
BLOCK1;

Perl modules (2)
Why packages?
• allows namespace to be uncluttered
• keeps related code in one place
• allows reusability of code
Modules?
• can think of as extended packages
• can be procedural (traditional) or object-oriented

Perl modules (3)
modules that are not part of the base perl installation must be installed first (from source, CPAN)
module included in script by:
• use MODULE;
  – executes the module at compile time
  – complains immediately if not found
• require MODULE;
  – executes the module at run time
  – only complains later
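A short sketch of both ways of including a module, together with a small package declared inline. List::Util and POSIX are used because they ship with perl; the package name Local::Stats is hypothetical and exists only for this illustration.

```perl
use strict;
use warnings;

# use: loaded at compile time; a missing module stops the script before it runs
use List::Util qw(sum max);

# require: loaded at run time, when this statement is reached
require POSIX;

# A small package of our own, declared inline (hypothetical name)
package Local::Stats;
sub mean {
    my @values = @_;
    # sum was imported into package main only, so it is fully qualified here
    return List::Util::sum(@values) / @values;
}

package main;                # back to the default namespace

my @data    = (2, 5, 7, 9, 11);
my $total   = sum(@data);                    # 34
my $largest = max(@data);                    # 11
my $mean    = Local::Stats::mean(@data);     # 6.8
my $floor   = POSIX::floor($mean);           # 6

print "$total $largest $mean $floor\n";
```

Because each package has its own namespace, mean in Local::Stats cannot clash with a subroutine of the same name in main, which is the "uncluttered namespace" benefit mentioned earlier.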
Perl modules (4)
Module allows access to module specific functions (methods)
Some Modules have hundreds of functions
Functions are written as generically as possible to make them extensible

Perl modules (5)
DBI
• database interface
• abstract database interface that makes database access as generic as possible
DBD (DBD::* modules, e.g., DBD::Oracle, DBD::mysql)
• DBI database driver (specific to database or interface, e.g., Oracle, Sybase, MySQL, WINODBC32)
• performs the database-specific calls and allows DBI to ‘hide’ them from the user
  – interprets DBI generic calls to database in database-specific manner

Modules
Insufficient time to delve into these very important bioinformatic modules
DBI
• http://search.cpan.org/~timb/DBI-1.51/DBI.pm
BioPerl
• http://search.cpan.org/~birney/bioperl-1.4/Bio/Perl.pm
• http://www.bioperl.org/wiki/Main_Page
• http://www.bioperl.org/wiki/Bptutorial.pl
• http://doc.bioperl.org/releases/bioperl-1.4

Homework Problem
You have performed a large-scale SNP genotyping project. The data are provided to you in a tabular list in the following format:
• Some header lines
  – includes blank lines
  – column descriptions
• columns
  – Gene ID
  – Polymorphism ID
  – Fragment (no data [-])
  – Subject ID
  – Allele 1
  – Allele 2

Homework Problem (2)
You have to write a script to transform the data into a wide table that has
• Individual ID as rows
• Polymorphisms as columns
• Genotype data as a string “Allele1/Allele2”
• Polymorphisms must be grouped by gene
• Genes must be in order (left to right)
Notes
• There will be about 5,300 individuals, 200 genes and a total of about 1,300 polymorphisms
• the solution is to use hashes and nested hashes
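To illustrate what "nested hashes" means here, the following is a small sketch on three invented records, not a complete solution. It assumes tab-separated fields in the column order given above, skips header handling entirely, sorts genes and polymorphisms by name (the required gene order may differ), and marks missing genotypes with "-"; all of these are assumptions.

```perl
use strict;
use warnings;

# Hypothetical records: gene, polymorphism, fragment, subject, allele 1, allele 2
my @records = (
    "G1\tP1\t-\tS1\tA\tG",
    "G1\tP2\t-\tS1\tC\tC",
    "G1\tP1\t-\tS2\tG\tG",
);

my %genotype;    # $genotype{subject}{polymorphism} = "Allele1/Allele2"
my %poly_seen;   # $poly_seen{gene}{polymorphism} = 1, to group columns by gene

for my $record (@records) {
    my ($gene, $poly, $fragment, $subject, $a1, $a2) = split /\t/, $record;
    $genotype{$subject}{$poly} = "$a1/$a2";
    $poly_seen{$gene}{$poly}   = 1;
}

# Column order: genes sorted by name, polymorphisms sorted within each gene
my @columns = map { my $g = $_; sort keys %{ $poly_seen{$g} } } sort keys %poly_seen;

my @rows;
for my $subject (sort keys %genotype) {
    # Missing genotypes are shown as "-" (an assumption, not from the assignment)
    my @cells = map { exists $genotype{$subject}{$_} ? $genotype{$subject}{$_} : "-" } @columns;
    push @rows, join("\t", $subject, @cells);
}

print join("\t", "Subject", @columns), "\n";
print "$_\n" for @rows;
```

The outer hash key selects the row (subject) and the inner key selects the column (polymorphism), so records can arrive in any order and still land in the right cell.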