unix utilities 1
TRANSCRIPT
-
8/3/2019 Unix Utilities 1
Unix Utilities 1
awk
sed
awk
awk is a programmable pattern-matching and processing tool that works equally well with text and numbers.
The command line syntax for awk is:
awk 'pattern { action }' [filename]
- since awk can take its input from standard input, providing the name of a file is optional
- single quotes are used to protect the pattern and action from the shell
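The two notes above can be tried with a small sketch. The temporary file and its two lines are made-up sample data, not part of the original slides.

```shell
# Hypothetical sample data; any text file would do.
tmpfile=$(mktemp)
printf 'alpha 1\nbeta 2\n' > "$tmpfile"

# Input named on the command line
from_file=$(awk '{ print $1 }' "$tmpfile")

# Same program, with the input arriving on standard input instead
from_stdin=$(awk '{ print $1 }' < "$tmpfile")

echo "$from_file"
echo "$from_stdin"
rm -f "$tmpfile"
```

Both invocations should print the same first fields, since only the source of the input differs.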
The two key components of the command line format are the pattern (a regular expression) and the action. awk searches its input for lines that match the specified pattern. When it finds a match, it performs the specified action, which could be writing the entire line (record) or individual fields from the line to standard output.
If a pattern is not specified, the action is performed on every line. If the action is absent, the default action of printing the line to standard output is performed.
The awk command can be easily used to extract specific fields of
information.
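As a rough illustration of the two defaults described above, the sketch below runs one program with only a pattern and one with only an action. The five input lines are sample data held in a shell variable named data, which is an assumption made for this example.

```shell
# Sample input held in a variable (illustrative data)
data='unix training
learn unix
unix class
learning unix
unix course'

# Pattern only: the default action prints each matching line
pattern_only=$(printf '%s\n' "$data" | awk '/learn/')

# Action only: with no pattern, the action runs on every line
action_only=$(printf '%s\n' "$data" | awk '{ print $1 }')

echo "$pattern_only"
echo "$action_only"
```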
Let's consider an example with a file unix_training containing:
unix training
learn unix
unix class
learning unix
unix course
Let's look at the simplest task in awk, displaying all the input lines in the file unix_training:
$ awk '{ print ; }' unix_training
unix training
learn unix
unix class
learning unix
unix course
Here the awk command is used to print each line of the input. When the print command is given without arguments, it prints the input line exactly as it was read.
Notice that there is a semicolon (;) after the print command. The semicolon marks the end of a statement. It is optional before the closing brace, but it is needed to separate multiple statements written on the same line.
Field Editing
One of the features available with awk is that it automatically divides input lines into fields. A field is a set of characters delimited by one or more field separator characters. The default field separator characters are tab and space.
When a line is read, awk numbers the fields that it has parsed: 1 for the first field, 2 for the second field, and so on. To access a field, use the field operator, $. Thus, the first field is $1.
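A minimal sketch of field splitting follows. The input line is invented for illustration and mixes a tab with a run of spaces. NF is awk's built-in count of fields in the current record and is not mentioned in the slides.

```shell
# One line with a tab and several spaces between words (illustrative data)
line=$(printf 'learn\tunix   basics')

first=$(printf '%s\n' "$line" | awk '{ print $1 }')   # first field
third=$(printf '%s\n' "$line" | awk '{ print $3 }')   # third field
count=$(printf '%s\n' "$line" | awk '{ print NF }')   # number of fields

echo "$first $third $count"
```

Runs of tabs and spaces count as a single separator, so the line splits into three fields.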
Let's consider the file unix_training again:
unix training
learn unix
unix class
learning unix
unix course
Use the awk tool to select the records from the file containing the word learn and then print the two fields for each matching record in reverse order with a space between them.
$ awk '/learn/ { print $2, $1 }' unix_training
unix learn
unix learning
The program is enclosed in single quotes to protect it from the shell, and the comma in the print statement puts a space between the two fields.
Similar to shell script arguments, each field in the record can be referenced using the $ sign followed by a number indicating its position in the record. $0 references the entire record.
Note that the $1 and $2 used within the curly braces are not the $1 and $2 that the shell uses to signify the first and second arguments passed to a script.
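The sketch below illustrates two points that are easy to trip over here: how print joins its arguments, and the difference between awk's $1 and the shell's $1. The value shell_arg is a made-up positional parameter set only for this demonstration.

```shell
# Adjacent expressions are concatenated; a comma inserts the output separator
joined=$(echo 'learn unix' | awk '{ print $2 $1 }')
spaced=$(echo 'learn unix' | awk '{ print $2, $1 }')
explicit=$(echo 'learn unix' | awk '{ print $2 " " $1 }')

# Give the shell its own $1 (hypothetical value)
set -- shell_arg

# Single quotes: awk receives $1 and prints the first field
awk_field=$(echo 'learn unix' | awk '{ print $1 }')

# Double quotes: the shell expands $1 before awk ever runs
shell_value=$(echo 'learn unix' | awk "{ print \"$1\" }")

echo "$joined | $spaced | $explicit | $awk_field | $shell_value"
```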
Let's take another example in which we pass input through standard input.
$ df -k | grep -v used | awk '{ print $6 "\t" $5 }'
/               27%
/usr            79%
/boot           17%
/proc           0%
/dev/fd         0%
/etc/mnttab     0%
/var            3%
/var/run        1%
/tmp            1%
/opt            2%
/export/home    1%
The above command extracts the file system mount point and the percentage of space used (by file system) for every record awk receives from df. The grep command is used to discard df's column headers, and the "\t" in the print statement inserts a tab between the two resultant fields. Notice that there is no pattern specified, so the action is performed on every record.
This command can be used to monitor the system's file system space usage.
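One way the monitoring idea might be taken a step further is to report only file systems above a usage limit. This is a sketch under assumptions: the df-style text in sample_df, the device names, and the threshold of 50 are all invented so the example runs the same everywhere. On a real system the text would come from df -k, whose column layout varies between platforms.

```shell
# Made-up df-style output; real input would come from: df -k
sample_df='Filesystem kbytes used avail capacity Mounted on
/dev/dsk/a 1000 270 730 27% /
/dev/dsk/b 1000 790 210 79% /usr
/dev/dsk/c 1000 30 970 3% /var'

threshold=50   # hypothetical limit, in percent

# NR > 1 skips the header line; sub() strips the % sign so the
# value can be compared as a number
alerts=$(printf '%s\n' "$sample_df" |
  awk -v limit="$threshold" 'NR > 1 {
    pct = $5
    sub(/%/, "", pct)
    if (pct + 0 > limit + 0) print $6 "\t" $5
  }')

echo "$alerts"
```

Here the NR > 1 pattern takes the place of grep for discarding the header, which avoids depending on the exact header wording.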
sed (stream editor)
The power and usefulness of sed can be seen when the same edit, or a series of edits, has to be performed multiple times in a single file, or one or more times in multiple files. Imagine having to use vi interactively to make the same changes to 100 different files.
sed receives text input, whether from stdin or from a file, performs certain operations on specified lines of the input, one line at a time, then outputs the result to stdout or to a file.
The command line syntax for sed is:
sed [-e] 'command1' [-e 'command2' ...] [file]
Notice that more than one sed command can be given by prefacing each command with the -e option. The -e option is not required if only one command is used. Specifying a file is optional.
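A short sketch of the syntax above, using a made-up input line: one command without -e, then two commands each introduced by -e.

```shell
input='Windows and UNIX'   # illustrative input line

# A single command does not need -e
one=$(printf '%s\n' "$input" | sed 's/Windows/Linux/')

# Two commands, each prefaced with -e, applied in order to every line
two=$(printf '%s\n' "$input" | sed -e 's/Windows/Linux/' -e 's/UNIX/Unix/')

echo "$one"
echo "$two"
```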
Basic sed operators
Operator                               Name        Effect
[address-range]p                       Print       Print [specified address range]
[address-range]d                       Delete      Delete [specified address range]
s/pattern1/pattern2/                   Substitute  Substitute pattern2 for first instance of pattern1 in a line
[address-range]s/pattern1/pattern2/    Substitute  Substitute pattern2 for first instance of pattern1 in a line, over address-range
[address-range]y/pattern1/pattern2/    Transform   Replace any character in pattern1 with the corresponding character in pattern2, over address-range (equivalent of tr)
g                                      Global      Operate on every pattern match within each matched line of input
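The operators in the table can be exercised on a few lines of throwaway input. This is only a sketch: the four-line text is invented, and the address range 2,3 is chosen arbitrarily.

```shell
# Four numbered lines of illustrative input
text='one
two
three
four'

# Print only lines 2 to 3 (-n suppresses the default output)
printed=$(printf '%s\n' "$text" | sed -n '2,3p')

# Delete lines 2 to 3, keep the rest
deleted=$(printf '%s\n' "$text" | sed '2,3d')

# Transform characters on line 1 only: o->O, n->N, e->E
upper=$(printf '%s\n' "$text" | sed '1y/one/ONE/')

echo "$printed"
echo "$deleted"
echo "$upper"
```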
Examples
Operator             Effect
8d                   Delete 8th line of input
/^$/d                Delete all blank lines
1,/^$/d              Delete from beginning of input up to, and including, the first blank line
/[Uu]nix/!d          Delete lines that do not contain "Unix" or "unix"
/UNIX/p              Print only lines containing "UNIX" (with -n option)
/[0-9]/p             Print only lines containing a digit (with -n option)
s/Windows/Linux/     Substitute "Linux" for first instance of "Windows" found in each input line
s/Windows/Linux/g    Substitute "Linux" for every instance of "Windows" found in each input line
s/ *$//              Delete all spaces at the end of every line
s/00*/0/g            Compress all consecutive sequences of zeroes into a single zero
/UNIX/d              Delete all lines containing "UNIX"
s/UNIX//g            Delete all instances of "UNIX", leaving the remainder of each line intact
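A few of these examples can be checked against a small invented input. The sample text below, including its blank line and trailing spaces, is made up for this sketch.

```shell
# Illustrative input: a line with digits, a blank line,
# a line with trailing spaces, and a plain line
sample=$(printf 'UNIX 1000\n\nunix rocks   \nplain text')

# Delete all blank lines
no_blank=$(printf '%s\n' "$sample" | sed '/^$/d')

# Print only lines containing a digit
digits=$(printf '%s\n' "$sample" | sed -n '/[0-9]/p')

# Compress runs of zeroes into a single zero
zeros=$(printf 'UNIX 1000\n' | sed 's/00*/0/g')

# Delete spaces at the end of the line
trimmed=$(printf 'unix rocks   \n' | sed 's/ *$//')

echo "$no_blank"
echo "$digits"
echo "$zeros"
echo "[$trimmed]"
```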
Consider an example with a sample file named address_list:
Name Address City State
John Daggett, 341 King Road, Plymouth MA
Alice Ford, 22 East Broadway, Richmond VA
Orville Thomas, 11345 Oak Bridge Road, Tulsa OK
Terry Kalkas, 402 Lans Road, Beaver Falls PA
Eric Adams, 20 Post Road, Sudbury MA
Hubert Sims, 328A Brook Road, Roanoke VA
Amy Wilde, 334 Bayshore Pkwy, Mountain View CA
Sal Carpenter, 73 6th Street, Boston MA
Let's replace MA with Massachusetts:
$ sed 's/MA/Massachusetts/' address_list
John Daggett, 341 King Road, Plymouth Massachusetts
Alice Ford, 22 East Broadway, Richmond VA
Orville Thomas, 11345 Oak Bridge Road, Tulsa OK
Terry Kalkas, 402 Lans Road, Beaver Falls PA
Eric Adams, 20 Post Road, Sudbury Massachusetts
Hubert Sims, 328A Brook Road, Roanoke VA
Amy Wilde, 334 Bayshore Pkwy, Mountain View CA
Sal Carpenter, 73 6th Street, Boston Massachusetts
Three lines are affected by the sed operation, and all the lines are displayed on the screen.
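Because the substitution replaces the first instance of MA anywhere in the line, it could hit the wrong text if MA appeared earlier in a record. The sketch below shows this with a hypothetical record (MARY Jones is not in address_list) and a variant anchored to the end of the line with $.

```shell
# Hypothetical record in which "MA" also appears in the name
record='MARY Jones, 1 Main St, Salem MA'

# Unanchored: the first "MA" on the line is replaced, which is in the name
unanchored=$(printf '%s\n' "$record" | sed 's/MA/Massachusetts/')

# Anchored: only " MA" at the very end of the line is replaced
anchored=$(printf '%s\n' "$record" | sed 's/ MA$/ Massachusetts/')

echo "$unanchored"
echo "$anchored"
```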
Multiple Substitutions with Sed
$ sed 's/MA/Massachusetts/; s/CA/California/' address_list
The above example shows how we can do multiple substitutions using the ; (semicolon) to separate them.
Let's consider another example, which places a comma after the city.
$ sed 's/ \([A-Z]\{2\}\)$/, \1/' address_list
John Daggett, 341 King Road, Plymouth, MA
Alice Ford, 22 East Broadway, Richmond, VA
Orville Thomas, 11345 Oak Bridge Road, Tulsa, OK
Terry Kalkas, 402 Lans Road, Beaver Falls, PA
Eric Adams, 20 Post Road, Sudbury, MA
Hubert Sims, 328A Brook Road, Roanoke, VA
Amy Wilde, 334 Bayshore Pkwy, Mountain View, CA
Sal Carpenter, 73 6th Street, Boston, MA
Deleting Lines
Consider the file address_list.
Name Address City State
John Daggett, 341 King Road, Plymouth MA
Alice Ford, 22 East Broadway, Richmond VA
Orville Thomas, 11345 Oak Bridge Road, Tulsa OK
Terry Kalkas, 402 Lans Road, Beaver Falls PA
Eric Adams, 20 Post Road, Sudbury MA
Hubert Sims, 328A Brook Road, Roanoke VA
Amy Wilde, 334 Bayshore Pkwy, Mountain View CA
Sal Carpenter, 73 6th Street, Boston MA
John Daggett's address is no longer valid and needs to be deleted from address_list.
The sed command to delete it is:
$ sed '/^[Jj]ohn/d' address_list
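sed writes its result to standard output and leaves the input file as it was, so keeping the change means redirecting the output to a file. The sketch below assumes a scratch directory and a two-record copy of address_list. The output name address_list.new is a hypothetical choice for this example.

```shell
# Work in a scratch directory with a two-record copy of the file
workdir=$(mktemp -d)
printf '%s\n' \
  'John Daggett, 341 King Road, Plymouth MA' \
  'Alice Ford, 22 East Broadway, Richmond VA' > "$workdir/address_list"

# Write the edited text to a new file; the original is not modified
sed '/^[Jj]ohn/d' "$workdir/address_list" > "$workdir/address_list.new"

original_lines=$(wc -l < "$workdir/address_list" | tr -d ' ')
new_contents=$(cat "$workdir/address_list.new")

echo "original still has $original_lines lines"
echo "$new_contents"
rm -rf "$workdir"
```

The output should not be redirected onto the input file itself, because the shell truncates that file before sed reads it.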
A few of the notations used in the comma example above are:
1. \( \) (escaped parentheses) - These remember whatever is matched inside them so that you can use it in the replacement expression. Depending on how many \( \) pairs you used, you refer to them as \1, \2, \3 and so on in the replacement expression.
2. [A-Z] - The character set. The [] expression means match any single character in that set. The A-Z part means that the set includes all characters from A through Z. Note that this is case sensitive, and a-z will not match the same set of characters that A-Z will. If you just put [AZ] in, it will only match A or Z, not B, C, D, E, ... , W, X or Y.
3. \{2\} - This simply means the match must have exactly 2 of the previous expression. The $ character at the end of the matching expression matches the end of the line.
Together this all means: match a space followed by exactly 2 characters that are from A through Z and are at the end of the line. In this file, as long as the format remains the same, it will only match the last field containing the state's abbreviation following the city name. So it is relatively easy to add the comma between those two fields.
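To see the numbered references in action, here is a sketch on a single record from address_list. The first command is the comma substitution discussed above. The second uses two groups to swap the first and last name, which is an extra illustration and not something the slides do.

```shell
record='John Daggett, 341 King Road, Plymouth MA'

# One group: \1 holds the two-letter state matched at the end of the line
with_comma=$(printf '%s\n' "$record" | sed 's/ \([A-Z]\{2\}\)$/, \1/')

# Two groups: \1 is the first word, \2 the second; emit them in reverse order
swapped=$(printf '%s\n' "$record" |
  sed 's/^\([A-Za-z]*\) \([A-Za-z]*\),/\2 \1,/')

echo "$with_comma"
echo "$swapped"
```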