introduction to unix text processing harendra guturu 11 oct 2013
Introduction to UNIX Text Processing
Sandeep Chinchali
10 Oct 2014
Thank you to Cory McLean, Gus Katsiapis, Aaron Wenger, Harendra Guturu, & Jim Notwell.
Stanford UNIX resources
• Host: cardinal.stanford.edu
• To connect from Unix/Linux/Mac, open a terminal:
    ssh [email protected]
    ssh [email protected]
    ssh [email protected]
• To connect from Windows:
  – SecureCRT/SecureFX (software.stanford.edu)
  – PuTTY (http://goo.gl/s0itD)
Many useful text processing UNIX commands
• awk bzcat cat column cut grep head join sed sort tail tee tr uniq wc zcat …
• UNIX commands work together via text streams.
• Example usage and others available at:
  – http://tldp.org/LDP/abs/html/textproc.html
  – http://en.wikipedia.org/wiki/Cat_%28Unix%29#Other
Huge suite of tools
Knowing UNIX commands eliminates having to reinvent the wheel
• For homework #1 last year, to perform a simple file sort, submissions used:
  – 35 lines of Python
  – 19 lines of Perl
  – 73 lines of Java
  – 1 line of UNIX commands
Anatomy of a UNIX command
    command [options] [FILE1] [FILE2]

• Options can be combined: -n 1 -g -c = -n1 -gc
• Output is directed to “standard output” (stdout)
• If no input file is specified, input comes from “standard input” (stdin)
  – “-” also means stdin in a file list
The real power of UNIX commands comes from combinations through piping (“|”)
• Pipes are used to pass the output of one program (stdout) as the input (stdin) to another
• Pipe character is <Shift>-\

    grep "CS273a" grades.txt | sort -k 2,2gr | uniq

• grep "CS273a" grades.txt : find all lines in the file that have “CS273a” in them somewhere
• sort -k 2,2gr : sort those lines by second column, in numerical order, highest to lowest
• uniq : remove duplicates and print to standard output
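To make the three pipeline stages concrete, here is a minimal Python sketch of the same filter, sort, and de-duplicate steps. The function name `grep_sort_uniq` and the sample grade lines are hypothetical, and the sketch assumes whitespace-separated fields with a number in the second column.

```python
def grep_sort_uniq(lines, pattern="CS273a", key_field=2):
    """Mimic: grep PATTERN | sort -k 2,2gr | uniq (illustrative only)."""
    # grep: keep lines that contain the pattern anywhere
    matching = [ln for ln in lines if pattern in ln]
    # sort -k 2,2gr: order by the second field, numerically, highest first.
    # Ties keep their input order here; GNU sort applies a last-resort
    # whole-line comparison instead.
    ordered = sorted(
        matching,
        key=lambda ln: float(ln.split()[key_field - 1]),
        reverse=True,
    )
    # uniq: drop a line when it equals the line just before it
    result = []
    for ln in ordered:
        if not result or result[-1] != ln:
            result.append(ln)
    return result


# Hypothetical contents of grades.txt
grades = [
    "CS273a 91 alice",
    "CS229 88 bob",
    "CS273a 97 carol",
    "CS273a 91 alice",
]
pipeline_output = grep_sort_uniq(grades)
print("\n".join(pipeline_output))
```

The one-line shell pipeline does the same work without any of this scaffolding, which is the point of the slide.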
Output redirection (>, >>)
• Instead of writing everything to standard output, we can write (>) or append (>>) to a file

    grep "CS273a" allClasses.txt > CS273aInfo.txt
    cat addlInfo.txt >> CS273aInfo.txt
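The difference between > and >> corresponds to opening a file in write mode versus append mode. A small Python sketch, using a hypothetical scratch file in a temporary directory rather than the real CS273aInfo.txt:

```python
import os
import tempfile

# Hypothetical scratch file standing in for CS273aInfo.txt
path = os.path.join(tempfile.mkdtemp(), "CS273aInfo.txt")

with open(path, "w") as out:   # like ">": create or truncate, then write
    out.write("CS273a lecture notes\n")
with open(path, "a") as out:   # like ">>": keep existing content, add to the end
    out.write("additional info\n")
with open(path) as f:
    after_append = f.read()

with open(path, "w") as out:   # ">" again discards what was there before
    out.write("replaced\n")
with open(path) as f:
    after_overwrite = f.read()

print(after_append, after_overwrite, sep="---\n")
```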
UCSC KENT SOURCE UTILITIES
http://genomewiki.ucsc.edu/index.php/Kent_source_utilities
/afs/ir/class/cs273a/bin/@sys/
• Many C programs in this directory that do manipulation of sequences or chromosome ranges
• Run programs with no arguments to see help message

    overlapSelect [OPTION]… selectFile inFile outFile

• Many useful options to alter how overlaps are computed
• Output is all inFile elements that overlap any selectFile elements
• Figure: selectFile, inFile, and outFile intervals
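The selection rule stated on this slide (keep every inFile element that overlaps at least one selectFile element) can be sketched in Python. This is not the Kent implementation: the tuple layout, the half-open BED-style coordinates, and the function names are assumptions for illustration, and the real tool's options (strand handling, overlap thresholds, and so on) are ignored.

```python
def overlaps(a, b):
    """True when two (chrom, start, end) half-open intervals share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def overlap_select(select_records, in_records):
    """Keep each in_record that overlaps any select_record, in input order."""
    return [
        rec
        for rec in in_records
        if any(overlaps(rec, sel) for sel in select_records)
    ]


# Hypothetical records, not taken from the course files
select_records = [("chr1", 100, 200), ("chr2", 50, 60)]
in_records = [
    ("chr1", 150, 250),  # overlaps chr1:100-200
    ("chr1", 200, 300),  # only touches the end of chr1:100-200, no shared base
    ("chr2", 10, 55),    # overlaps chr2:50-60
    ("chr3", 50, 60),    # different chromosome
]
selected = overlap_select(select_records, in_records)
print(selected)
```

The nested scan is quadratic and only meant to show the rule; a production tool would index the select intervals.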
Kent Source and Mysql
• Linux + Mac Binaries
  – http://hgdownload.soe.ucsc.edu/admin/exe/
• Using MySQL on browser
  – http://genome.ucsc.edu/goldenPath/help/mysql.html
Interacting with UCSC Genome Browser MySQL Tables
• Galaxy (a GUI to make SQL commands easy)
  – http://main.g2.bx.psu.edu/
• Direct interaction with the tables:

    mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -Ne "<STMT>"

  e.g.

    mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -Ne \
      "select count(*) from hg18.knownGene";
    +-------+
    | 66803 |
    +-------+

• MySQL tutorial: http://dev.mysql.com/doc/refman/5.1/en/tutorial.html
Stop for PRACTICE EXERCISE
Exercise From Class

    ## inspect your data
    head 50kbx5kbWindows.txt              # Gets top 10 lines
    column -t 50kbx5kbWindows.txt | head  # Spaces the columns for nice viewing and then shows top 10 lines
    ## NOTE: Don't process output from column since the pretty output usually is mangled in terms of field separators

    ## get the desired fields
    cat 50kbx5kbWindows.txt | cut -f1,4 | sort -k1,1 -o 50kbx5kbWindows.cut.txt

    ## data verifications
    md5sum 50kbx5kbWindows.cut.txt 50kbx5kbWindows.awk.txt

    ## sort the bed file by name
    cat mm9.50kbx5kbWindows.bed | sort -k4,4 > mm9.50kbx5kbWindows.sort.bed
    ## NOTE: cat fileName | sort -k4,4 > fileName
    ## Will kill the file, but
    ## cat fileName | sort -k4,4 -o fileName
    ## Will not

    ## select windows overlapping a p300 peak, join on the name, sort to find the most
    ## enriched windows, clean, and take the top 20
    overlapSelect mm9.wgEncodeUwDnaseCerebellumC57bl6MAdult8wksPkRep1.bed mm9.50kbx5kbWindows.sort.bed stdout \
      | join -t$'\t' -1 4 -2 1 - 50kbx5kbWindows.cut.txt \
      | sort -k5,5 -gr \
      | awk '{print $2"\t"$3"\t"$4"\t"$5}' \
      | head -20 > top20.bed

    ## remove the temporary files
    rm 50kbx5kbWindows.cut.txt mm9.50kbx5kbWindows.sort.bed
Other operations with bash/shell
• http://www.catonmat.net/blog/set-operations-in-unix-shell/
• Bash noclobber

    $ set -o noclobber
    $ echo "Can we overwrite it again?" >file.txt
    -bash: file.txt: cannot overwrite existing file
    $ echo "Can we overwrite it again?" >| file.txt

• Bash dual pipes, i.e. process substitution (tricky, be careful)

    sort <(cat file1) <(cat file2)
SPECIFIC UNIX COMMANDS
man, whatis, apropos
• man: UNIX program that displays the manual written for a particular program
• man sort
  – Shows all info about the program sort
  – Hit <space> to scroll down, “q” to exit
• whatis sort
  – Shows short description of all programs that have “sort” in their names
• apropos sort
  – Shows all programs that have “sort” in their names or short descriptions
cat
• Concatenates files and prints them to standard output
• cat [OPTION] [FILE]…
• Variants for compressed input files:
  – zcat (.gz files)
  – bzcat (.bz2 files)
• Figure: inputs “ABCD” and “123” concatenate to “ABCD123”
head, tail
• head: first ten lines
• tail: last ten lines
• -n option: number of lines
  – For tail, -n+K means line K to the end.
• head -n5 : first five lines
• tail -n73 : last 73 lines
• tail -n+10 | head -n 5 : lines 10-14
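The "lines 10-14" combination is easy to check with a short Python sketch built on `itertools.islice`. The helper names `head` and `tail_from` and the twenty numbered sample lines are hypothetical.

```python
from itertools import islice


def head(lines, n=10):
    """First n lines, like head -n N."""
    return list(islice(lines, n))


def tail_from(lines, k):
    """Line k (1-based) through the end, like tail -n +K."""
    return list(islice(lines, k - 1, None))


# Hypothetical input: twenty numbered lines
numbered = ["line %d" % i for i in range(1, 21)]

# Equivalent of: tail -n+10 | head -n 5
selected = head(tail_from(numbered, 10), 5)
print(selected)
```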
cut
• Prints selected parts of lines from each file to standard output
• cut [OPTION]… [FILE]…
• -d Choose delimiter between columns (default TAB)
• -f Fields to print
  – -f1,7 : fields 1 and 7
  – -f1-4,7,11-13 : fields 1,2,3,4,7,11,12,13
cut example
file.txt:
    CS 273 a
    CS.273.a
    CS 273 a

cut -f1,3 file.txt (same as: cat file.txt | cut -f1,3):
    CS a
    CS.273.a
    CS

cut -d '.' -f1,3 file.txt:
    CS 273 a
    CS.a
    CS 273 a

In general, you should make sure your file columns are all delimited with the same character(s) before applying cut!
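The outputs on this slide follow from two rules: cut splits only on the chosen delimiter, and a line that does not contain the delimiter is printed unchanged. The Python sketch below mimics that behaviour. The `cut` function is a hypothetical stand-in for the command, and the exact whitespace in file.txt is an assumption (the transcript does not preserve it): the first line is taken to be tab-delimited and the third to mix a tab and a space.

```python
def cut(lines, fields, delimiter="\t"):
    """Sketch of cut -f FIELDS [-d DELIM]; fields are 1-based numbers."""
    out = []
    for line in lines:
        parts = line.split(delimiter)
        if len(parts) == 1:
            # No delimiter on this line: it is passed through unchanged
            out.append(line)
        else:
            # Keep the requested fields that exist, in file order
            kept = [parts[i - 1] for i in sorted(fields) if i <= len(parts)]
            out.append(delimiter.join(kept))
    return out


# Assumed contents of file.txt, with tabs written as \t
file_lines = ["CS\t273\ta", "CS.273.a", "CS\t273 a"]

by_tab = cut(file_lines, [1, 3])                 # cut -f1,3
by_dot = cut(file_lines, [1, 3], delimiter=".")  # cut -d '.' -f1,3
print(by_tab)
print(by_dot)
```

Under these assumptions the third line has only two tab-separated fields, which is why only "CS" survives the first command.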
wc
• Print line, word, and character (byte) counts for each file, and totals of each if more than one file specified
• wc [OPTION]… [FILE]…
• -l Print only line counts
sort
• Sorts lines of a delimited file (fields are split on whitespace by default; use -t to set a delimiter such as tab)
• -k m,n sorts by columns m to n (1-based)
• -g sorts by general numerical value (can handle scientific format)
• -r sorts in descending order
• sort -k1,1gr -k2,3
  – Sort on field 1 numerically (high to low because of r).
  – Break ties on field 2 alphabetically.
  – Break further ties on field 3 alphabetically.
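The multi-key ordering described for sort -k1,1gr -k2,3 can be sketched in Python with a tuple sort key. The sample rows are hypothetical, and the sketch follows the slide's reading of the command (field 1 numeric descending, then fields 2 and 3 alphabetically); it does not model locale-dependent collation.

```python
# Hypothetical rows, already split into fields
rows = [
    ["2.5", "beta", "x"],
    ["1e1", "alpha", "z"],   # scientific notation, as handled by -g
    ["2.5", "alpha", "y"],
    ["2.5", "alpha", "x"],
]

# Negating the numeric key gives high-to-low order for field 1,
# while fields 2 and 3 break ties in ascending alphabetical order.
sorted_rows = sorted(rows, key=lambda r: (-float(r[0]), r[1], r[2]))

for row in sorted_rows:
    print("\t".join(row))
```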
uniq
• Discard all but one of successive identical lines from input and print to standard output
• -d Only print duplicate lines
• -i Ignore case in comparison
• -u Only print unique lines
uniq example
file.txt:
    CS 273a
    CS 273a
    TA: Cory McLean
    CS 273a

uniq file.txt:
    CS 273a
    TA: Cory McLean
    CS 273a

uniq -u file.txt:
    TA: Cory McLean
    CS 273a

uniq -d file.txt:
    CS 273a

In general, you probably want to make sure your file is sorted before applying uniq!
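Because uniq only compares each line with its neighbour, it behaves like grouping consecutive equal lines. A Python sketch with `itertools.groupby` reproduces the three outputs on this slide and shows why sorting first changes the result; the `uniq` function and its `mode` argument are hypothetical names.

```python
from itertools import groupby


def uniq(lines, mode="all"):
    """mode 'all' ~ uniq, 'unique' ~ uniq -u, 'duplicates' ~ uniq -d."""
    out = []
    for line, run in groupby(lines):   # groups only *adjacent* equal lines
        count = len(list(run))
        if (
            mode == "all"
            or (mode == "unique" and count == 1)
            or (mode == "duplicates" and count > 1)
        ):
            out.append(line)
    return out


file_lines = ["CS 273a", "CS 273a", "TA: Cory McLean", "CS 273a"]

plain = uniq(file_lines)
only_unique = uniq(file_lines, mode="unique")
only_dups = uniq(file_lines, mode="duplicates")
sorted_first = uniq(sorted(file_lines))   # like: sort file.txt | uniq
print(plain, only_unique, only_dups, sorted_first, sep="\n")
```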
grep
• Search for lines that contain a word or match a regular expression
• grep [options] PATTERN [FILE…]
• -i Ignore case
• -v Output lines that do not match
• -E Use extended regular expressions
• -f <FILE> Read patterns from a file (1 per line)
grep example
    grep -E "^CS[[:space:]]+273$" file

• Search through “file”
• For lines that start with CS
• Then have one or more spaces (or tabs)
• And end with 273

file:
    CS 273a
    CS273
    CS 273
    cs 273
    CS 273

Output:
    CS 273
    CS 273
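The same pattern can be tried in Python's `re` module, where `\s` plays roughly the role of `[[:space:]]`. The candidate lines below mirror the slide's file, but the exact whitespace of the last line (a tab followed by a space) is an assumption.

```python
import re

# ^CS, then one or more whitespace characters, then 273 at end of line
pattern = re.compile(r"^CS\s+273$")

candidates = ["CS 273a", "CS273", "CS 273", "cs 273", "CS\t 273"]

matches = [line for line in candidates if pattern.search(line)]

# Rough equivalent of adding -i to ignore case
pattern_i = re.compile(r"^CS\s+273$", re.IGNORECASE)
matches_ignoring_case = [line for line in candidates if pattern_i.search(line)]

# Rough equivalent of -v: lines that do NOT match
non_matches = [line for line in candidates if not pattern.search(line)]

print(matches)
```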
tr
• Translate or delete characters from standard input to standard output
• tr [OPTION]… SET1 [SET2]
• -d Delete chars in SET1, don’t translate

    cat file.txt | tr '\n' ','

file.txt:
    This
    is an
    Example.

Output:
    This,is an,Example.,
sed: stream editor
• Most common use is a string replace.
• sed -e "s/SEARCH/REPLACE/g"

    cat file.txt | sed -e "s/is/EEE/g"

file.txt:
    This
    is an
    Example.

Output:
    ThEEE
    EEE an
    Example.
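Both the tr example and the sed example operate on the same three-line file, so they can be compared side by side in a short Python sketch. It assumes the file ends with a newline, which is what produces the trailing comma in the tr output.

```python
import re

# Assumed contents of file.txt, including the final newline
text = "This\nis an\nExample.\n"

# tr '\n' ',' : replace every newline character with a comma
translated = text.replace("\n", ",")

# tr -d '\n' : delete the characters instead of translating them
deleted = text.replace("\n", "")

# sed -e "s/is/EEE/g" : replace every match of the pattern on each line
substituted = re.sub(r"is", "EEE", text)

print(translated)
print(substituted, end="")
```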
join
• Join lines of two files on a common field
• join [OPTION]… FILE1 FILE2
• -1 Specify which column of FILE1 to join on
• -2 Specify which column of FILE2 to join on
• Important: FILE1 and FILE2 must already be sorted on their join fields!
join example
file1.txt:
    Bejerano CS273a
    Villeneuve DB210
    Batzoglou DB273a

file2.txt:
    CS273a Comp Tour Hum Gen.
    CS229 Machine Learning
    DB210 Devel. Biol.

    join -1 2 -2 1 file1.txt file2.txt

Output:
    CS273a Bejerano Comp Tour Hum Gen.
    DB210 Villeneuve Devel. Biol.
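The join on this slide pairs lines whose course codes match and drops lines without a partner. A minimal Python sketch of the same result uses a dictionary lookup. Representing each file as a list of tuples is an assumption for illustration, as is the requirement that each code appears once in file2; unlike join, a dictionary lookup does not need sorted input.

```python
# Each file as (field1, field2) pairs; layout assumed for illustration
file1 = [
    ("Bejerano", "CS273a"),
    ("Villeneuve", "DB210"),
    ("Batzoglou", "DB273a"),
]
file2 = [
    ("CS273a", "Comp Tour Hum Gen."),
    ("CS229", "Machine Learning"),
    ("DB210", "Devel. Biol."),
]

# Index file2 by its join field (column 1)
titles = dict(file2)

# join -1 2 -2 1: match column 2 of file1 against column 1 of file2,
# printing the join field first, then the remaining fields of each file
joined = [
    (course, name, titles[course])
    for name, course in file1
    if course in titles
]

for row in joined:
    print(" ".join(row))
```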
SHELL SCRIPTING
Common shells
• Two common shells: bash and tcsh
• Run ps to see which you are using.
Multiple UNIX commands can be combined into a single shell script.
script.sh:
    #!/bin/bash
    set -beEu -o pipefail
    cat $1 $2 > tmp.txt
    paste tmp.txt $3 > $4
    export A="Value"

script.csh:
    #!/bin/tcsh -e
    cat $1 $2 > tmp.txt
    paste tmp.txt $3 > $4
    setenv A "Value"

• The set -beEu -o pipefail line (bash) and the -e flag (tcsh) mean die on error.
• Scripts must first be set to be executable:
    % chmod u+x script.sh script.csh
• Command prompt:
    % ./script.sh file1.txt file2.txt file3.txt out.txt
    % ./script.csh file1.txt file2.txt file3.txt out.txt
• http://www.faqs.org/docs/bashman/bashref_toc.html
• http://www.the4cs.com/~corin/acm/tutorial/unix/tcsh-help.html
for loop
    # BASH for loop to print 1,2,3 on separate lines
    for i in `seq 1 3`
    do
        echo ${i}
    done

    # TCSH for loop to print 1,2,3 on separate lines
    foreach i ( `seq 1 3` )
        echo ${i}
    end

• The backtick (`) is a special quote character, usually left of “1” on the keyboard, that indicates we should execute the command within the quotes
SCRIPTING LANGUAGES
awk
• A quick-and-easy shell scripting language
• http://www.grymoire.com/Unix/Awk.html
• Treats each line of a file as a record, and splits fields by whitespace
• Fields referenced as $1, $2, $3, … ($0 is entire line)
Anatomy of an awk script.
    awk 'BEGIN {…} {…} END {…}'

• BEGIN {…} : before first line
• {…} : once per line
• END {…} : after last line
awk example
• Output the lines where column 3 is less than column 5 in a comma-delimited file. Output a summary line at the end.

    awk -F',' 'BEGIN { ct = 0; }
               { if ($3 < $5) { print $0; ct = ct + 1; } }
               END { print "TOTAL LINES: " ct; }'
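For comparison, here is the same filter-and-count task as a minimal Python sketch. The function name and the sample data are hypothetical, and the sketch assumes columns 3 and 5 always hold numbers (awk would fall back to a string comparison otherwise).

```python
import io


def lines_where_col3_lt_col5(stream):
    """Return lines whose 3rd comma-separated field is below the 5th."""
    kept = []
    for raw in stream:
        line = raw.rstrip("\n")
        fields = line.split(",")
        # Assumes both columns are numeric
        if float(fields[2]) < float(fields[4]):
            kept.append(line)
    return kept


# Hypothetical comma-delimited input
sample = io.StringIO("a,b,1,c,5\nd,e,9,f,2\ng,h,3.5,i,4\n")

kept = lines_where_col3_lt_col5(sample)
for line in kept:
    print(line)
print("TOTAL LINES: %d" % len(kept))
```

The awk version keeps the counter in the per-line block and reports it in END; here the list length serves the same purpose.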
Useful things from awk
• Make sure fields are delimited with tabs (to be used by cut, sort, join, etc.)

    awk '{print $1 "\t" $2 "\t" $3}' whiteDelim.txt > tabDelim.txt

• Good string processing using substr, index, length functions

    awk '{print substr($1, 1, 10)}' longNames.txt > shortNames.txt

  – substr arguments: string to manipulate, start position, length
  – substr("helloworld", 4, 3) = "low"
  – index("helloworld", "low") = 4
  – length("helloworld") = 10
  – index("helloworld", "notpresent") = 0
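Note that awk string positions are 1-based and index returns 0 for "not found", whereas Python slices are 0-based and `str.find` returns -1. The sketch below wraps Python's operations to reproduce the four awk results above; `awk_substr` and `awk_index` are hypothetical helper names, and `awk_substr` assumes a start position of at least 1.

```python
def awk_substr(s, start, length):
    """Like awk substr(s, start, length) for start >= 1 (1-based)."""
    return s[start - 1:start - 1 + length]


def awk_index(s, target):
    """Like awk index(s, target): 1-based position, or 0 if absent."""
    return s.find(target) + 1   # find() gives -1 when absent, so -1 + 1 == 0


word = "helloworld"
print(awk_substr(word, 4, 3))
print(awk_index(word, "low"))
print(len(word))
print(awk_index(word, "notpresent"))
```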
Python
• A scripting language with many useful constructs
• Easier to read than Perl
• http://wiki.python.org/moin/BeginnersGuide
• http://docs.python.org/tutorial/index.html
• Call a python program from the command line:
    python myProg.py
Number types
• Numbers: int, float

    >>> f = 4.7
    >>> i = int(f)
    >>> j = round(f)
    >>> i
    4
    >>> j
    5.0
    >>> i*j
    20.0
    >>> 2**i
    16
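The session above reflects Python 2, where round returns a float. As a sketch under the assumption that the reader runs Python 3, the same steps give integer results for j and i*j:

```python
f = 4.7
i = int(f)      # truncates toward zero
j = round(f)    # Python 3: round() with one argument returns an int

product = i * j
power = 2 ** i

print(i, j, product, power)
```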
Strings

    >>> dir("")
    […, 'capitalize', 'center', 'count', 'decode', 'encode', 'endswith',
     'expandtabs', 'find', 'index', 'isalnum', 'isalpha', 'isdigit', 'islower',
     'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip',
     'replace', 'rfind', 'rindex', 'rjust', 'rstrip', 'split', 'splitlines',
     'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']
    >>> s = "hi how are you?"
    >>> len(s)
    15
    >>> s[5:10]
    'w are'
    >>> s.find("how")
    3
    >>> s.find("CS273")
    -1
    >>> s.split(" ")
    ['hi', 'how', 'are', 'you?']
    >>> s.startswith("hi")
    True
    >>> s.replace("hi", "hey buddy,")
    'hey buddy, how are you?'
    >>> " extraBlanks ".strip()
    'extraBlanks'
Lists
• A container that holds zero or more objects in sequential order

    >>> dir([])
    […, 'append', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']
    >>> myList = ["hi", "how", "are", "you?"]
    >>> myList[0]
    'hi'
    >>> len(myList)
    4
    >>> for word in myList:
    ...     print(word[0:2])
    ...
    hi
    ho
    ar
    yo
    >>> nums = [1,2,3,4]
    >>> squares = [n*n for n in nums]
    >>> squares
    [1, 4, 9, 16]
Dictionaries
• A container like a list, except the key can be any hashable object, such as a string (instead of a non-negative integer)

    >>> dir({})
    […, 'clear', 'copy', 'fromkeys', 'get', 'has_key', 'items',
     'iteritems', 'iterkeys', 'itervalues', 'keys', 'pop', 'popitem',
     'setdefault', 'update', 'values']
    >>> fruits = {"apple": True, "banana": True}
    >>> fruits["apple"]
    True
    >>> fruits.get("apple", "Not a fruit!")
    True
    >>> fruits.get("carrot", "Not a fruit!")
    'Not a fruit!'
    >>> fruits.items()
    [('apple', True), ('banana', True)]
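The method list and the items() output above are from Python 2. A short sketch of the same dictionary under Python 3 (an assumption about the reader's interpreter), where has_key is gone in favour of the `in` operator and items() returns a view rather than a list:

```python
fruits = {"apple": True, "banana": True}

# Membership test: replaces fruits.has_key("apple") from Python 2
has_apple = "apple" in fruits
has_carrot = "carrot" in fruits

# get() with a default behaves the same in both versions
carrot_status = fruits.get("carrot", "Not a fruit!")

# items() is a view in Python 3; wrap it in list() to get a list of pairs
pairs = list(fruits.items())

for name, is_fruit in fruits.items():
    print(name, is_fruit)
```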
Reading from files
file.txt:
    Hello, world!
    This is a file-reading
        example.

    >>> openFile = open("file.txt", "r")
    >>> allLines = openFile.readlines()
    >>> openFile.close()
    >>> allLines
    ['Hello, world!\n', 'This is a file-reading\n', '\texample.\n']
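An alternative to the explicit open/close pair is a `with` block, which closes the file automatically even if an error occurs. The sketch below first writes a hypothetical copy of file.txt into a temporary directory so that it runs on its own, then reads it back and strips the trailing newlines that readlines keeps.

```python
import os
import tempfile

# Hypothetical copy of file.txt, created here so the sketch is self-contained
path = os.path.join(tempfile.mkdtemp(), "file.txt")
with open(path, "w") as writer:
    writer.write("Hello, world!\nThis is a file-reading\n\texample.\n")

# The file is closed automatically when the with block ends
with open(path, "r") as reader:
    all_lines = reader.readlines()

# readlines() keeps the newline on each line; strip it when it is not wanted
stripped = [line.rstrip("\n") for line in all_lines]
print(stripped)
```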
Writing to files

    >>> writer = open("file2.txt", "w")
    >>> writer.write("Hello again.\n")
    >>> name = "Cory"
    >>> writer.write("My name is %s, what's yours?\n" % name)
    >>> writer.close()

file2.txt:
    Hello again.
    My name is Cory, what's yours?
Creating functions

    def compareParameters(param1, param2):
        if param1 < param2:
            return -1
        elif param1 > param2:
            return 1
        else:
            return 0

    def factorial(n):
        if n < 0:
            return None
        elif n == 0:
            return 1
        else:
            retval = 1
            num = 1
            while num <= n:
                retval = retval*num
                num = num + 1
            return retval
Example program
Example.py:

    #!/usr/bin/env python
    import sys  # Required to read arguments from command line

    if len(sys.argv) != 3:
        print("Wrong number of arguments supplied to Example.py")
        sys.exit(1)

    inFile = open(sys.argv[1], "r")
    allLines = inFile.readlines()
    inFile.close()

    outFile = open(sys.argv[2], "w")
    for line in allLines:
        outFile.write(line)
    outFile.close()
Example program
python Example.py file1 file2
sys.argv = ['Example.py', 'file1', 'file2']
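To show how the argument list drives the program, here is a hedged Python 3 sketch of the same copy logic written as functions that take the argument list explicitly. The names `copy_lines` and `main`, the temporary files, and the choice to return an exit status instead of calling sys.exit directly are all assumptions made so the sketch can run on its own.

```python
import os
import sys
import tempfile


def copy_lines(in_path, out_path):
    """Copy in_path to out_path line by line."""
    with open(in_path, "r") as in_file, open(out_path, "w") as out_file:
        for line in in_file:
            out_file.write(line)


def main(argv):
    """argv mirrors sys.argv: [program name, input file, output file]."""
    if len(argv) != 3:
        print("Wrong number of arguments supplied to Example.py", file=sys.stderr)
        return 1
    copy_lines(argv[1], argv[2])
    return 0


# Hypothetical stand-ins for file1 and file2
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "file1")
dst = os.path.join(workdir, "file2")
with open(src, "w") as f:
    f.write("first\nsecond\n")

status = main(["Example.py", src, dst])
with open(dst) as f:
    copied = f.read()
print(status, repr(copied))
```

In a real script the last step would be `sys.exit(main(sys.argv))`.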