unix utilities 1

Upload: sreedobbidi

Post on 06-Apr-2018

  • 8/3/2019 Unix Utilities 1

    1/13

    Unix Utilities 1

    awk

    sed

    awk

    awk is a programmable pattern-matching and processing tool that works equally well with text and numbers.

    The command-line syntax for awk is:

    awk 'pattern { action }' [filename]

    - Since awk can take its input from standard input, providing the name of a file is optional.

    - Single quotes protect the pattern and action from the shell.
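
    As a minimal sketch of the optional filename, the same awk program can be run once with a file argument and once with its input arriving through a pipe. The temporary file and its two lines are assumptions made up for this illustration.

```shell
# Create a small sample file (hypothetical contents for this sketch).
sample=$(mktemp)
printf 'unix training\nlearn unix\n' > "$sample"

# 1) Input named on the command line.
from_file=$(awk '/learn/ { print }' "$sample")

# 2) Same program, input arriving on standard input through a pipe.
from_stdin=$(cat "$sample" | awk '/learn/ { print }')

echo "$from_file"     # learn unix
echo "$from_stdin"    # learn unix
rm -f "$sample"
```

    Both invocations should produce the same line, because awk only falls back to standard input when no file is named.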

    The two key components of the command-line format are the pattern (typically a regular expression) and the action. awk searches its input for lines that match the specified pattern. When it finds a match, it performs the specified action, which could be writing the entire line (record) or individual fields from the line to standard output.

    If no pattern is specified, the action is performed on every line. If the action is absent, the default action of printing the line to standard output is performed.

    The awk command makes it easy to extract specific fields of information.
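
    The two defaults described above can be sketched as follows. The sample file is a made-up assumption; the first program has a pattern but no action, and the second has an action but no pattern.

```shell
sample=$(mktemp)
printf 'unix training\nlearn unix\nunix class\n' > "$sample"

# Pattern only: the default action (print the whole line) applies.
pattern_only=$(awk '/class/' "$sample")

# Action only: with no pattern, the action runs on every line.
action_only=$(awk '{ print $1 }' "$sample")

echo "$pattern_only"   # unix class
echo "$action_only"    # first word of each of the three lines
rm -f "$sample"
```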

    Let's consider an example with a file unix_training containing:

    unix training
    learn unix
    unix class
    learning unix
    unix course

    Let's look at the simplest task in awk: displaying all the input lines in the file unix_training.

    $ awk '{ print ; }' unix_training
    unix training
    learn unix
    unix class
    learning unix
    unix course

    Here the awk command prints each line of the input. When the print command is given without arguments, it prints the input line exactly as it was read.

    Notice the semicolon (;) after the print command. The semicolon marks the end of a statement. It is optional when the statement is the last one before the closing brace, as it is here, but it is needed to separate two statements written on the same line.

    Field Editing

    One of the features of awk is that it automatically divides input lines into fields. A field is a set of characters delimited by one or more field separator characters. The default field separator characters are tab and space.

    When a line is read, awk numbers the fields it has parsed: 1 for the first field, 2 for the second field, and so on. To access a field, use the field operator, $. Thus, the first field is $1.
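
    A short sketch of field splitting follows. The input strings are invented for the illustration, and the -F option (which selects a different field separator) is not covered in the slides; it is shown here only to contrast with the default.

```shell
# By default, a run of spaces and tabs counts as a single separator.
second=$(printf 'learn \t  unix\n' | awk '{ print $2 }')

# -F sets a different separator; a colon is an arbitrary choice here.
third=$(printf 'a:b:c\n' | awk -F: '{ print $3 }')

echo "$second"   # unix
echo "$third"    # c
```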

    Let's consider the file unix_training again:

    unix training
    learn unix
    unix class
    learning unix
    unix course

    Use awk to select the records that contain the word learn, and print the two fields of each matching record in reverse order with a space between them.

    $ awk '/learn/ { print $2, $1 }' unix_training
    unix learn
    unix learning

    The comma in the print statement inserts the output field separator, a space by default, between the two fields. Without it, the fields would be printed with nothing between them.

    Similar to shell script arguments, each field in the record can be referenced using the $ sign followed by a number indicating its position in the record. $0 references the entire record.

    Note that the $1 and $2 used within the curly braces are not the $1 and $2 that the shell uses for the first and second arguments passed to a script.
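
    The difference between awk's $1 and the shell's $1 can be sketched with a small shell function. The function name demo and its argument are hypothetical; the -v option, which passes a shell value into awk as a variable, is standard awk but is not covered in the slides.

```shell
demo() {
    # Inside this function, the shell's $1 is the function's first argument.

    # Single quotes: the shell leaves $1 alone, so awk sees its own field operator.
    echo 'learn unix' | awk '{ print $1 }'

    # Double quotes: the shell substitutes its own $1 before awk runs.
    echo 'learn unix' | awk "{ print \"$1\" }"

    # A cleaner way to hand a shell value to awk: assign it with -v.
    echo 'learn unix' | awk -v arg="$1" '{ print arg, $2 }'
}

demo_out=$(demo shell_arg)
echo "$demo_out"
```

    The three output lines should be learn, shell_arg, and shell_arg unix, in that order.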

    Let's take another example, in which awk reads from standard input.

    $ df -k | grep -v used | awk '{ print $6 "\t" $5 }'
    /               27%
    /usr            79%
    /boot           17%
    /proc           0%
    /dev/fd         0%
    /etc/mnttab     0%
    /var            3%
    /var/run        1%
    /tmp            1%
    /opt            2%
    /export/home    1%

    The above command extracts the file system mount point and the percentage of space in use (df's capacity column) for every record awk receives from df. The grep command discards df's column headers, which works on systems where the header line contains the lowercase word used. The "\t" in the print statement inserts a tab between the two resulting fields. Notice that no pattern is specified, so the action is performed on every record.

    This command can be used to monitor the system's file system space usage.
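
    For monitoring, a pattern can restrict the report to file systems above a threshold. The following is a sketch under stated assumptions: it uses canned df-style text with made-up device names and numbers so that the result is reproducible, the threshold of 50 is arbitrary, and the built-in variable NR (the current record number) is used instead of grep to skip the header.

```shell
# Canned "df -k" style output; device names and sizes are invented.
df_sample='Filesystem kbytes used avail capacity Mounted on
/dev/dsk/a 1000 270 730 27% /
/dev/dsk/b 1000 790 210 79% /usr
/dev/dsk/c 1000 30 970 3% /var'

# NR > 1 skips the header line; $5 + 0 converts "79%" to the number 79.
busy=$(printf '%s\n' "$df_sample" |
    awk -v limit=50 'NR > 1 && $5 + 0 > limit { print $6 "\t" $5 }')

echo "$busy"   # only /usr exceeds the limit in this sample
```

    With real data, the printf line would be replaced by df -k itself.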

    sed (stream editor)

    The power and usefulness of sed become apparent when the same edit, or a series of edits, has to be performed multiple times in a single file, or one or more times in multiple files. Imagine having to use vi interactively to make the same changes to 100 different files.

    sed receives text input, whether from stdin or from a file, performs certain operations on specified lines of the input, one line at a time, then outputs the result to stdout or to a file.

    The command-line syntax for sed is:

    sed [-e] 'command1' [-e 'command2' ...] [file]

    Notice that more than one sed command can be given by prefacing each command with the -e option. The -e option is not required if only one command is used. Specifying a file is optional.
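
    A minimal sketch of the two forms follows, using a made-up input line. With several -e options, the commands are applied to each line in the order given.

```shell
line='Windows on Windows'

# One command: -e may be omitted.
one=$(echo "$line" | sed 's/Windows/Linux/')

# Two commands: each is introduced by its own -e.
two=$(echo "$line" | sed -e 's/Windows/Linux/' -e 's/on/and/')

echo "$one"   # Linux on Windows
echo "$two"   # Linux and Windows
```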

    Basic sed operators

    Operator                              Name        Effect
    [address-range]p                      Print       Print [specified address range]
    [address-range]d                      Delete      Delete [specified address range]
    s/pattern1/pattern2/                  Substitute  Substitute pattern2 for first instance of pattern1 in a line
    [address-range]s/pattern1/pattern2/   Substitute  Substitute pattern2 for first instance of pattern1 in a line, over address-range
    [address-range]y/pattern1/pattern2/   Transform   Replace any character in pattern1 with the corresponding character in pattern2, over address-range (equivalent of tr)
    g                                     Global      Operate on every pattern match within each matched line of input

    The address range is written immediately before the command letter, for example 2,5d. The g flag is appended to a substitute command, as in s/pattern1/pattern2/g.
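
    The print, delete, and transform operators can be sketched on a four-line input invented for the purpose. Each address is written directly in front of the command letter.

```shell
text=$(printf 'one\ntwo\nthree\nfour\n')

# 2,3p with -n: print only lines 2 through 3.
printed=$(printf '%s\n' "$text" | sed -n '2,3p')

# 2,3d: delete lines 2 through 3 and pass the rest through.
deleted=$(printf '%s\n' "$text" | sed '2,3d')

# 1y/one/ONE/: on line 1 only, map o->O, n->N, e->E.
upper=$(printf '%s\n' "$text" | sed '1y/one/ONE/')

echo "$printed"
echo "$deleted"
echo "$upper"
```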

    Examples

    Operator             Effect
    8d                   Delete 8th line of input.
    /^$/d                Delete all blank lines.
    1,/^$/d              Delete from beginning of input up to, and including, the first blank line.
    /[Uu]nix/!d          Delete all lines that do not contain "Unix" or "unix".
    /UNIX/p              Print only lines containing "UNIX" (with -n option).
    /[0-9]/p             Print only lines containing a digit (with -n option).
    s/Windows/Linux/     Substitute "Linux" for first instance of "Windows" found in each input line.
    s/Windows/Linux/g    Substitute "Linux" for every instance of "Windows" found in each input line.
    s/ *$//              Delete all spaces at the end of every line.
    s/00*/0/g            Compress all consecutive sequences of zeroes into a single zero.
    /UNIX/d              Delete all lines containing "UNIX".
    s/UNIX//g            Delete all instances of "UNIX", leaving the remainder of each line intact.
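
    A few of these operators can be tried on small made-up inputs. The sketch also shows why the print examples call for the -n option: without it, sed prints every line by default, so each matching line appears twice.

```shell
# /UNIX/p without and with -n.
without_n=$(printf 'UNIX\nlinux\n' | sed '/UNIX/p')
with_n=$(printf 'UNIX\nlinux\n' | sed -n '/UNIX/p')

# Compress runs of zeroes into a single zero.
zeros=$(echo '10002000' | sed 's/00*/0/g')

# Delete blank lines.
no_blank=$(printf 'a\n\nb\n' | sed '/^$/d')

echo "$without_n"   # UNIX, UNIX, linux
echo "$with_n"      # UNIX
echo "$zeros"       # 1020
echo "$no_blank"    # a, b
```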

    Consider an example with a sample file named address_list.

    Name Address City State

    John Daggett, 341 King Road, Plymouth MA
    Alice Ford, 22 East Broadway, Richmond VA
    Orville Thomas, 11345 Oak Bridge Road, Tulsa OK
    Terry Kalkas, 402 Lans Road, Beaver Falls PA
    Eric Adams, 20 Post Road, Sudbury MA
    Hubert Sims, 328A Brook Road, Roanoke VA
    Amy Wilde, 334 Bayshore Pkwy, Mountain View CA
    Sal Carpenter, 73 6th Street, Boston MA

    Let's replace MA with Massachusetts.

    $ sed 's/MA/Massachusetts/' address_list
    John Daggett, 341 King Road, Plymouth Massachusetts
    Alice Ford, 22 East Broadway, Richmond VA
    Orville Thomas, 11345 Oak Bridge Road, Tulsa OK
    Terry Kalkas, 402 Lans Road, Beaver Falls PA
    Eric Adams, 20 Post Road, Sudbury Massachusetts
    Hubert Sims, 328A Brook Road, Roanoke VA
    Amy Wilde, 334 Bayshore Pkwy, Mountain View CA
    Sal Carpenter, 73 6th Street, Boston Massachusetts

    Three lines are affected by the sed operation, and all the lines are displayed on the screen.
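
    One caveat worth sketching: the pattern MA is not tied to the state field, so it matches the first MA anywhere in a line. The record below is hypothetical and not part of address_list; anchoring the pattern to the end of the line with $ is one possible way to restrict the substitution to the state abbreviation.

```shell
# Hypothetical record whose name happens to contain "MA".
record='MARY Smith, 1 Elm Road, Salem MA'

# Unanchored: the first "MA" in the line is replaced, which is the wrong one.
loose=$(echo "$record" | sed 's/MA/Massachusetts/')

# Anchored: only " MA" at the end of the line is replaced.
anchored=$(echo "$record" | sed 's/ MA$/ Massachusetts/')

echo "$loose"      # MassachusettsRY Smith, 1 Elm Road, Salem MA
echo "$anchored"   # MARY Smith, 1 Elm Road, Salem Massachusetts
```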

    Multiple Substitutions with Sed

    $ sed 's/MA/Massachusetts/; s/CA/California/' address_list

    The above example shows how to perform multiple substitutions in one sed script, using a semicolon (;) to separate them.
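
    As a quick check on a single record from address_list, the semicolon form and the -e form described earlier should give the same result. This is a sketch on one line rather than the whole file.

```shell
record='Amy Wilde, 334 Bayshore Pkwy, Mountain View CA'

# Two substitutions separated by a semicolon in one script.
semi=$(echo "$record" | sed 's/MA/Massachusetts/; s/CA/California/')

# The same two substitutions, each given with its own -e option.
dash_e=$(echo "$record" | sed -e 's/MA/Massachusetts/' -e 's/CA/California/')

echo "$semi"     # Amy Wilde, 334 Bayshore Pkwy, Mountain View California
echo "$dash_e"   # same line
```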

    Let's consider another example, which places a comma after the city.

    $ sed 's/ \([A-Z]\{2\}\)$/, \1/' address_list

    John Daggett, 341 King Road, Plymouth, MA

    Alice Ford, 22 East Broadway, Richmond, VA

    Orville Thomas, 11345 Oak Bridge Road, Tulsa, OK

    Terry Kalkas, 402 Lans Road, Beaver Falls, PA

    Eric Adams, 20 Post Road, Sudbury, MA

    Hubert Sims, 328A Brook Road, Roanoke, VA

    Amy Wilde, 334 Bayshore Pkwy, Mountain View, CA

    Sal Carpenter, 73 6th Street, Boston, MA

    Deleting Lines

    Consider the file address_list.

    Name Address City State

    John Daggett, 341 King Road, Plymouth MA
    Alice Ford, 22 East Broadway, Richmond VA
    Orville Thomas, 11345 Oak Bridge Road, Tulsa OK
    Terry Kalkas, 402 Lans Road, Beaver Falls PA
    Eric Adams, 20 Post Road, Sudbury MA
    Hubert Sims, 328A Brook Road, Roanoke VA
    Amy Wilde, 334 Bayshore Pkwy, Mountain View CA
    Sal Carpenter, 73 6th Street, Boston MA

    John Daggett's address is no longer valid, and his record needs to be deleted from address_list.

    The sed command to delete it is:

    $ sed '/^[Jj]ohn/d' address_list
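
    Because sed writes its result to standard output, the command above leaves address_list itself untouched. A minimal sketch of making the deletion permanent follows; it assumes a two-record temporary copy of the file and uses a redirect plus mv, which is one common approach rather than the only one.

```shell
list=$(mktemp)
printf '%s\n' \
    'John Daggett, 341 King Road, Plymouth MA' \
    'Alice Ford, 22 East Broadway, Richmond VA' > "$list"

# sed prints the edited text; the input file still holds John's record.
remaining=$(sed '/^[Jj]ohn/d' "$list")
still_there=$(grep -c '^John' "$list")

# To keep the change, write to a new file and then replace the original.
sed '/^[Jj]ohn/d' "$list" > "$list.new" && mv "$list.new" "$list"
after=$(cat "$list")

echo "$remaining"     # Alice Ford, 22 East Broadway, Richmond VA
echo "$still_there"   # 1
echo "$after"         # Alice Ford, 22 East Broadway, Richmond VA
rm -f "$list"
```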

    A few notations used in the earlier comma example, s/ \([A-Z]\{2\}\)$/, \1/, are:

    1. \( \) - This remembers whatever is matched inside it so that you can use it in the replacement expression. Depending on how many \( \) pairs you used, you refer to them as \1, \2, \3 and so on in the replacement expression.

    2. [A-Z] - The character set. The [] expression means match any single character in that set. The A-Z part means that the set includes all characters from A through Z. Note that this is case sensitive: a-z will not match the same set of characters that A-Z will. If you just put [AZ] in, it will only match A or Z, not B, C, D, E, ... , W, X or Y.

    3. \{2\} - This simply means the match must have 2 of the previous expression. The $ character at the end of the matching expression matches the end of the line.

    Together this all means: match a space followed by exactly 2 characters that are from A through Z and are at the end of the line. In this file, as long as the format remains the same, it will only match the last field containing the state's abbreviation following the city name. So it is relatively easy to add the comma between those two fields.
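
    To illustrate how \1 and \2 refer to successive \( \) pairs, the following sketch captures the city and the state separately and prints them in swapped order. The swapped output format is an invention for this example. The second command shows that a three-letter ending is left alone, because the pattern requires a space, exactly two capitals, and then the end of the line.

```shell
# \1 holds the city (everything before the last space), \2 the state.
swapped=$(echo 'Beaver Falls PA' | sed 's/^\(.*\) \([A-Z]\{2\}\)$/\2: \1/')

# Three capitals after the space: the pattern does not match, so no comma is added.
unchanged=$(echo 'Tulsa OKL' | sed 's/ \([A-Z]\{2\}\)$/, \1/')

echo "$swapped"     # PA: Beaver Falls
echo "$unchanged"   # Tulsa OKL
```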