CIT 500: IT Fundamentals
Text Processing
Topics
1. Displaying files: cat, less, od, head, tail
2. Creating and appending
3. Concatenating files
4. Comparing files
5. Printing files
6. Sorting files
7. Searching files and regular expressions
8. Sed and awk
Displaying Files
1. cat
2. less
3. od
4. head
5. tail
Displaying files: cat
cat [options] [file1 [file2 … ]]
-e Displays $ at the end of each line.
-n Print line numbers before each line.
-t Displays tabs as ^I and formfeeds as ^L
-v Display nonprintable characters, except for tab, newline, and formfeed.
-vet Combines -v, -e, -t to display all nonprintable characters.
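A minimal sketch of the options above on a scratch file. The file name and contents are hypothetical, and the marker characters shown in the comments assume a cat that supports -v, -e, and -t as described here.

```shell
# Hypothetical scratch file; name and contents are made up for illustration.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
printf 'one\ttwo\nthree\n' > "$tmpdir/demo.txt"

# -n numbers each line.
numbered=$(cat -n "$tmpdir/demo.txt")
printf '%s\n' "$numbered"

# -vet marks tabs as ^I and line ends as $.
visible=$(cat -vet "$tmpdir/demo.txt")
printf '%s\n' "$visible"
```

This is handy for spotting stray tabs or trailing whitespace that a plain cat would hide.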
Displaying files: less
less [file1 [file2 … ]]
h Displays help.
q Quit.
space Forward one page.
return Forward one line.
b Back one page.
y Back one line.
:n Next file.
:p Previous file.
/ Search file.
Displaying files: od
od [options] [file1 [file2 … ]]
-c Also display character values.
-x Display numbers in hexadecimal.
> file /kernel/genunix
/kernel/genunix: ELF 32-bit MSB relocatable SPARC
> od -c /kernel/genunix
0000000 177 E L F 001 002 001 \0 \0 \0 \0 \0 \0 \0 \0
0000020 \0 001 \0 002 \0 \0 \0 001 \0 004 246 230 \0 \0 \0
0000040 \0 033 ^ ` \0 \0 \0 \0 \0 4 \0 \0 \0 \0 \0
0000060 \0 017 \0 \n 235 343 277 240 310 006 004 244 020
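The same idea can be tried without a kernel image. This sketch dumps three known bytes; the -t x1 and -A n options are not on the slide (they are standard od options for byte-wise hex and for hiding the offset column), and are used here because -x groups bytes into 2-byte units whose appearance depends on the machine's byte order.

```shell
# Dump three bytes ("A", "B", newline) as characters and as single-byte hex.
printf 'AB\n' | od -c
printf 'AB\n' | od -t x1

# Same hex dump without the offset column, squeezed into one token.
hex_bytes=$(printf 'AB\n' | od -A n -t x1 | tr -d ' \n')
echo "$hex_bytes"
```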
Displaying files: head and tail
By default, display the first/last 10 lines of a file.
head [-#] [file1 [file2 … ]]
-# Display first # lines.
tail [-#] [file1 [file2 … ]]
-# Display last # lines.
-f If data is appended to file, continue displaying new lines as they are added.
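A short sketch of head and tail on a generated file. The file name and its 20 numbered lines are assumptions for illustration; -n # is the standard spelling of the -# form shown above.

```shell
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT

# Build a hypothetical 20-line file: "line 1" ... "line 20".
i=1
while [ "$i" -le 20 ]; do
    echo "line $i"
    i=$((i + 1))
done > "$tmpdir/lines.txt"

first=$(head -n 3 "$tmpdir/lines.txt")   # same as head -3
last=$(tail -n 2 "$tmpdir/lines.txt")    # same as tail -2
printf '%s\n' "$first" "$last"
```

tail -f is left out because it runs until interrupted; it is typically used to watch a log file grow.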
File Size
Determining File Size
– ls -l
– wc [options] file-list
Word count: wc
wc [options] target1 [target2, …]
-c Count bytes in file only.
-l Count lines in file only.
-w Count words in file only.
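A small sketch of the three counts on a two-line scratch file (name and contents are hypothetical). Reading from standard input keeps the file name out of the output, and tr removes the padding some wc implementations add.

```shell
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
printf 'hello world\nsecond line here\n' > "$tmpdir/sample.txt"

lines=$(wc -l < "$tmpdir/sample.txt" | tr -d ' ')
words=$(wc -w < "$tmpdir/sample.txt" | tr -d ' ')
bytes=$(wc -c < "$tmpdir/sample.txt" | tr -d ' ')
echo "lines=$lines words=$words bytes=$bytes"
```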
Creating and Appending to Files
Creating files
> cat >file
Hello world
Ctrl-d
Appending to files
> cat >> file
Hello world line 2
Ctrl-d
> cat file
Hello world
Hello world line 2
Concatenating Files
> cat >file1
This is file #1
> cat >file2
This is file #2
> cat file1 file2 >joinedfile
> cat joinedfile
This is file #1
This is file #2
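The interactive sessions above can also be scripted. This sketch uses echo in place of typed input and works in a temporary directory; the appended line is an illustrative addition, not part of the slide.

```shell
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT

echo 'This is file #1' > "$tmpdir/file1"     # > creates or overwrites
echo 'This is file #2' > "$tmpdir/file2"
# Concatenate in argument order.
cat "$tmpdir/file1" "$tmpdir/file2" > "$tmpdir/joinedfile"
echo 'Appended line' >> "$tmpdir/joinedfile" # >> appends to the end
joined=$(cat "$tmpdir/joinedfile")
printf '%s\n' "$joined"
```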
Comparing files: diff
diff [options] oldfile newfile
-b Ignore trailing blanks and treat other strings of blanks as equivalent.
-c Output contextual diff format.
-e Output ed script for converting oldfile to newfile.
-i Ignore case in letter comparisons.
-u Output unified diff format.
Comparing Files with diff
diff [options][file1][file2]
diff Example
> diff Fall_Hours Spring_Hours
1c1
< Hours for Fall 2004
---
> Hours for Spring 2005
6a7
> 1:00 - 2:00 p.m.
9d9
< 3:00 - 4:00 p.m.
12,13d11
< 2:00 - 3:00 p.m.
< 4:00 - 4:30 p.m.
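To read output like this, a smaller case may help. In the sketch below (file names and contents are hypothetical), c marks a changed line, a an added line, and d would mark a deleted one; lines starting with < come from the old file and lines starting with > from the new file.

```shell
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
printf 'a\nb\nc\n' > "$tmpdir/old.txt"
printf 'a\nx\nc\nd\n' > "$tmpdir/new.txt"

# diff exits with status 1 when the files differ, 0 when they are identical.
diff_out=$(diff "$tmpdir/old.txt" "$tmpdir/new.txt")
diff_status=$?
printf '%s\n' "$diff_out"
echo "exit status: $diff_status"
```

The exit status makes diff usable in scripts, for example to act only when a file has changed.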
Removing Repeated Lines
uniq [options][+N][input-file][output-file]
> cat sample
This is a test file for the uniq command.
It contains some repeated and some nonrepeated lines.
Some of the repeated lines are consecutive, like this.
Some of the repeated lines are consecutive, like this.
Some of the repeated lines are consecutive, like this.
And, some are not consecutive, like the following.
Some of the repeated lines are consecutive, like this.
The above line, therefore, will not be considered a repeated
line by the uniq command, but this will be considered repeated!
line by the uniq command, but this will be considered repeated!
> uniq sample
This is a test file for the uniq command.
It contains some repeated and some nonrepeated lines.
Some of the repeated lines are consecutive, like this.
And, some are not consecutive, like the following.
Some of the repeated lines are consecutive, like this.
The above line, therefore, will not be considered a repeated
line by the uniq command, but this will be considered repeated!
uniq
uniq [options] input [output file]
-c Precedes each output line with a count of the number of times the line occurred in the input.
-d Suppresses the writing of lines that are not repeated in the input.
-u Suppresses the writing of lines that are repeated in the input.
Removing Repeated Lines
uniq [options][+N][input-file][output-file]
> uniq -c sample
1 This is a test file for the uniq command.
1 It contains some repeated and some nonrepeated lines.
3 Some of the repeated lines are consecutive, like this.
1 And, some are not consecutive, like the following.
1 Some of the repeated lines are consecutive, like this.
1 The above line, therefore, will not be considered a repeated
2 line by the uniq command, but this will be considered repeated!
> uniq -d sample
Some of the repeated lines are consecutive, like this.
line by the uniq command, but this will be considered repeated!
> uniq -d sample out
> cat out
Some of the repeated lines are consecutive, like this.
line by the uniq command, but this will be considered repeated!
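Because uniq only compares adjacent lines, a common idiom is to sort first. A minimal sketch on a hypothetical five-line file:

```shell
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
printf 'b\na\nb\nb\na\n' > "$tmpdir/letters.txt"

# uniq alone merges only the adjacent pair of b lines.
adjacent_only=$(uniq "$tmpdir/letters.txt")
# Sorting first makes every duplicate adjacent, so each line appears once.
all_dups=$(sort "$tmpdir/letters.txt" | uniq)
printf '%s\n' "$adjacent_only" '--' "$all_dups"
```

Note that sorting changes the line order, so this fits cases where order does not matter.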
Printing Files
lp [options] file-list
lpr [options] file-list
lpq [options]
Canceling Your Print Job
cancel [options] [printer]
Canceling Your Print Job (Contd)
lprm [options][jobID-list][user(s)]
Sorting
Ordering a set of items by some criterion.
Systems in which sorting is used include:
– Words in a dictionary.
– Names of people in a telephone directory.
– Numbers.
Sorting: sort
sort [-d] [-f] [-i] [-k #] [-n] [-r] [-u] files
-d Sort in dictionary order (only blanks and alphanumeric characters are compared).
-f Ignore case of letters.
-i Ignore non-printable characters.
-k # Sort by field number #
-n Sort in numerical order.
-r Reverse order of sort
-u Do not list duplicate lines in output.
sort Example
> cat days.txt
Sunday
Monday
Tuesday
Wednesday
Thursday
Friday
Saturday
> sort days.txt
Friday
Monday
Saturday
Sunday
Thursday
Tuesday
Wednesday
sort Example
> cat days.txt
Sunday
Monday
Tuesday
Wednesday
Thursday
Friday
Saturday
> sort -r days.txt
Wednesday
Tuesday
Thursday
Sunday
Saturday
Monday
Friday
sort Example
> cat numbers.txt
101
5571
58
2001
9
> sort numbers.txt
101
2001
5571
58
9
> sort -n numbers.txt
9
58
101
2001
5571
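The -k option becomes useful once lines have several fields. The sketch below sorts hypothetical name:score records by the second field; -t, which sets the field separator, is a standard sort option that is not in the option list above.

```shell
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
printf 'carol:30\nalice:5\nbob:200\n' > "$tmpdir/scores.txt"

# -t: splits on colons, -k 2 selects the second field, -n compares numerically.
by_score=$(sort -t: -k 2 -n "$tmpdir/scores.txt")
# Same key in reverse order.
by_score_desc=$(sort -t: -k 2 -n -r "$tmpdir/scores.txt")
printf '%s\n' "$by_score" '--' "$by_score_desc"
```

Without -n the scores would be compared as text, which would place 200 before 30 and 30 before 5.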
Searching Files: grep
grep [-c] [-i] [-l] [-n] [-v] pattern file1 [file2, ...]
Search for pattern in the file arguments.
-c Print only a count of matching lines.
-i Ignore case of letters in files.
-l Print only the names of files that contain matches.
-n Print line numbers along with matching lines.
-v Print only nonmatching lines.
Simple Searches
> grep catt /usr/share/dict/words
cattail
...
wildcatting
> grep -c catt /usr/share/dict/words
29
> grep -c -v catt /usr/share/dict/words
98540
> wc -l /usr/share/dict/words
98569 /usr/share/dict/words
> grep -n catt /usr/share/dict/words
28762:cattail
…
97276:wildcatting
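The counts above add up: matching lines plus nonmatching lines equal the total line count (29 + 98540 = 98569). The sketch below checks the same relationship on a hypothetical four-word list, since dictionary files differ between systems.

```shell
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
printf 'cattail\ndogged\nwildcatting\nzebra\n' > "$tmpdir/words.txt"

matching=$(grep -c catt "$tmpdir/words.txt")         # lines containing catt
nonmatching=$(grep -c -v catt "$tmpdir/words.txt")   # lines without it
total=$(wc -l < "$tmpdir/words.txt" | tr -d ' ')
echo "$matching + $nonmatching = $total"

numbered=$(grep -n catt "$tmpdir/words.txt")         # prefix line numbers
printf '%s\n' "$numbered"
```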
Regular Expressions
^ Beginning of line
$ End of line
[a-z] Character range (all lower case)
[aeiou] Character set (vowels)
. Any character
* Zero or more of previous pattern
{n} Repeat previous match n times
{n,m} Repeat previous match n to m times
a|b Match a or b
Regular Expression Searches
> egrep ^dogg /usr/share/dict/words
dogged
…
doggy's
> egrep dogg$ /usr/share/dict/words
> egrep mann$ /usr/share/dict/words
Bertelsmann
…
Weizmann
> egrep ^mann /usr/share/dict/words
manna
…
mannishness's
Regular Expression Searches
> egrep 'catt|dogg' /usr/share/dict/words
boondoggle
boondoggled
...
wildcatting
> egrep 'catt|dogg' /usr/share/dict/words | wc -l
54
> egrep '^(catt|dogg)' /usr/share/dict/words
cattail
…
doggy's
Character classes
> egrep [0-9] /usr/share/dict/words
> egrep -c ^xz /usr/share/dict/words
0
> egrep -c ^[xz] /usr/share/dict/words
153
> egrep -c [xz]$ /usr/share/dict/words
321
> egrep -c [aeiou][aeiou][aeiou][aeiou] /usr/share/dict/words
36
> egrep [aeiou][aeiou][aeiou][aeiou][aeiou] /usr/share/dict/words
queueing
> egrep [aeiou]{5} /usr/share/dict/words
queueing
> egrep -c :[0-9][0-9]: /etc/passwd
9
> egrep -c ':[0-9]{2,3}:' /etc/passwd
18
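A sketch of the same patterns on a small hypothetical word list. It uses grep -E, the standard equivalent of egrep, and quotes every pattern so that the shell does not try to interpret the brackets, braces, or bar itself.

```shell
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
printf 'cattail\ndogged\nboondoggle\nqueueing\nzebra\n' > "$tmpdir/words.txt"

# Anchored alternation: words that start with catt or dogg.
anchored=$(grep -E '^(catt|dogg)' "$tmpdir/words.txt")
# Repetition: five vowels in a row.
five_vowels=$(grep -E '[aeiou]{5}' "$tmpdir/words.txt")
# Character set: count of words starting with x or z.
starts_xz=$(grep -E -c '^[xz]' "$tmpdir/words.txt")
printf '%s\n' "$anchored" "$five_vowels" "$starts_xz"
```

Note that boondoggle contains dogg but is excluded by the ^ anchor.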
Extracting Fields: cut
cut [-f #] [-d delim] [-b #] [-c #] file
Select sections from each line of file.
-f # Select field #.
-d delim Use delim instead of tab to separate fields.
-b # Select specified bytes instead of fields.
-c # Select specified characters instead of fields.
Cut Examples
> cut -d: -f 1 /etc/passwd | head -5
root
daemon
bin
sys
sync
> cut -d: -f 1,3 /etc/passwd | head -5
root:0
daemon:1
bin:2
sys:3
sync:4
> cut -d: -f 1,3-5,7 /etc/passwd | head -5
root:0:0:root:/bin/bash
daemon:1:1:daemon:/bin/sh
bin:2:2:bin:/bin/sh
sys:3:3:sys:/bin/sh
sync:4:65534:sync:/bin/sync
Cut Examples
> cut -c1-4 /etc/passwd | head -5
root
daem
bin:
sys:
sync
> cut -d: -f7 /etc/passwd | cut -c1-4 | head -5
/bin
/bin
/bin
/bin
/bin
> cut -d: -f7 /etc/passwd | cut -c6-20 | head -5
bash
sh
sh
sh
sync
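Since /etc/passwd differs on every machine, the sketch below repeats two of these cuts on a fixed sample. The two records are hypothetical lines in /etc/passwd layout.

```shell
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
# Hypothetical records in /etc/passwd layout.
cat > "$tmpdir/passwd.sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
EOF

# Fields 1 and 3: login name and user ID.
name_uid=$(cut -d: -f 1,3 "$tmpdir/passwd.sample")
# Field 7 is the shell path; characters 6 onward drop the leading "/bin/".
shell_name=$(cut -d: -f 7 "$tmpdir/passwd.sample" | cut -c 6-)
printf '%s\n' "$name_uid" "$shell_name"
```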
Searching + Extracting: awk
awk [-F delim] '/pattern/ {action}' file
Execute awk program on each line of file.
-F delim Use delim to separate fields
Patterns are regular expressions.
Actions are extremely powerful, as awk is a simple programming language, but we'll just use print $#, where # is the field we want to print.
Awk Examples
> awk -F: '{print $1}' /etc/passwd|head -5
root
daemon
bin
sys
sync
> awk -F: '{print $1, $3}' /etc/passwd|head -5
root 0
daemon 1
bin 2
sys 3
sync 4
> awk -F: '/root/ {print $1, $3}' /etc/passwd
root 0
> awk -F: '/bin\/false/ {print $1, $3}' /etc/passwd
dhcp 101
syslog 102
klog 103
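A reproducible sketch of the pattern-plus-print form, run on hypothetical records in /etc/passwd layout. The second command goes slightly beyond the slide: awk also accepts a comparison on a single field as the pattern, which avoids matching the text elsewhere on the line.

```shell
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
# Hypothetical records in /etc/passwd layout.
cat > "$tmpdir/passwd.sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
dhcp:x:101:101::/nonexistent:/bin/false
syslog:x:102:102::/home/syslog:/bin/false
EOF

# Regular expression pattern matched against the whole line.
no_login=$(awk -F: '/bin\/false/ {print $1, $3}' "$tmpdir/passwd.sample")
# Comparison against field 7 only.
exact=$(awk -F: '$7 == "/bin/false" {print $1}' "$tmpdir/passwd.sample")
printf '%s\n' "$no_login" "$exact"
```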
Stream Editor: sed
sed [-n] '/pattern/action' files
sed [-n] '[line1,line2]s/pat1/pat2/options' files
Filter and modify (if specified) each line of file.
-n Do not print lines unless action specifies printing.
Patterns are regular expressions.
Actions:
p = print matching lines
d = delete matching lines
s = replace pat1 with pat2
Using Sed like Grep
> sed -n '/catt/p' /usr/share/dict/words
cattail
…
wildcatting
> sed -n '/catt/p' /usr/share/dict/words | wc -l
29
> sed '/catt/d' /usr/share/dict/words | wc -l
98540
> sed -n '/^dogg/p' /usr/share/dict/words
dogged
…
doggy's
> sed -n '/dogg$/p' /usr/share/dict/words
> sed -n '/mann$/p' /usr/share/dict/words
Bertelsmann
…
Weizmann
Sed Examples
> cat phones.txt
Our phone bill for last year was $859,800,513.57.
This is our list of phone numbers:
859-572-7568
859-572-7721
859-572-7568
859-572-5468
859-572-6930
859-572-5334
859-572-5320
859-572-5659
859-572-7568
859-572-7739
859-572-0000
859-572-6544
859-572-6346
859-572-5330
859-572-7551
859-572-5571
859-572-7786
859-572-1453
859-572-6025
859-572-5333
Sed Substitutions
> sed 's/859/(513)/' phones.txt | head -5
Our phone bill for last year was $(513),800,513.57.
This is our list of phone numbers:
(513)-572-7568
(513)-572-7721
(513)-572-7568
> sed 's/859-/(513)-/' phones.txt | head -5
Our phone bill for last year was $859,800,513.57.
This is our list of phone numbers:
(513)-572-7568
(513)-572-7721
(513)-572-7568
> sed '3,99s/859/(513)/' phones.txt | head -5
Our phone bill for last year was $859,800,513.57.
This is our list of phone numbers:
(513)-572-7568
(513)-572-7721
(513)-572-7568
Sed Substitutions
> sed 's/[0-9]*-[0-9]*-[0-9]*/Number Redacted/' phones.txt | head -5
Our phone bill for last year was $859,800,513.57.
This is our list of phone numbers:
Number Redacted
Number Redacted
Number Redacted
> sed 's/\([0-9]*-[0-9]*-[0-9]*\)/Phone number is \1/' phones.txt | head -5
Our phone bill for last year was $859,800,513.57.
This is our list of phone numbers:
Phone number is 859-572-7568
Phone number is 859-572-7721
Phone number is 859-572-7568
> sed 's/\([0-9]*\)-\([0-9]*\)-\([0-9]*\)/(\1) \2-\3/' phones.txt | head -5
Our phone bill for last year was $859,800,513.57.
This is our list of phone numbers:
(859) 572-7568
(859) 572-7721
(859) 572-7568
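Two details of the s action are worth a quick sketch: anchoring the pattern limits where it can match, and without the g option only the first match on each line is replaced. The sample file below is a hypothetical three-line stand-in for phones.txt.

```shell
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
printf '%s\n' 'Bill: $859,800,513.57' '859-572-7568' '859-572-7721' \
    > "$tmpdir/phones.sample"

# ^ restricts the match to the start of a line, so the bill amount is untouched.
anchored=$(sed 's/^859-/(513) /' "$tmpdir/phones.sample")
printf '%s\n' "$anchored"

# Without g only the first hyphen changes; with g every hyphen changes.
first_only=$(echo '859-572-7568' | sed 's/-/./')
every=$(echo '859-572-7568' | sed 's/-/./g')
printf '%s\n' "$first_only" "$every"
```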
Sed and Awk Applications
Sed
• Double space a file.
• DOS to UNIX line endings.
• Trim leading spaces.
• Delete consecutive blank lines.
• Remove blanks from begin/end of file.
Awk
• Manage small file db.
• Generate reports.
• Validate data.
• Produce indexes.
• Extract fields from UNIX command output.
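A few of these applications can be sketched as one-liners. The commands below are common idioms rather than anything prescribed by the slides, and the file names and data are hypothetical.

```shell
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT

# Double space a file: G appends an empty line after every input line.
printf 'a\nb\n' > "$tmpdir/short.txt"
double_spaced_lines=$(sed G "$tmpdir/short.txt" | wc -l | tr -d ' ')

# Trim leading spaces and tabs.
printf '   indented\n\tTabbed\n' > "$tmpdir/indent.txt"
trimmed=$(sed 's/^[[:space:]]*//' "$tmpdir/indent.txt")

# DOS to UNIX line endings: delete the carriage return before each newline.
cr=$(printf '\r')
printf 'dos line\r\n' > "$tmpdir/dos.txt"
unix_bytes=$(sed "s/$cr\$//" "$tmpdir/dos.txt" | wc -c | tr -d ' ')

# Generate a small report with awk: total of the second column.
report=$(printf 'alice 3\nbob 4\n' | awk '{sum += $2} END {print "total", sum}')

echo "$double_spaced_lines"
printf '%s\n' "$trimmed"
echo "$unix_bytes"
echo "$report"
```

The awk line uses an END block and a variable, which goes beyond the print $# form used earlier, but it shows why awk suits report generation.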
Sed and Awk vs. Ruby and Others
Sed and Awk
– Small languages
– Cryptic syntax
– Best for writing one-liners in the shell
Ruby, Python, Perl, etc.
– Large languages
– Easy syntax
– Best for writing longer programs