Unix and Software Tools (P51UST): Awk Programming


Post on 01-Feb-2016

Page 1: Unix and Software Tools (P51UST)  Awk Programming

P51UST: Unix and Software Tools

Unix and Software Tools (P51UST)

Awk Programming

Ruibin Bai (Room AB326)

Division of Computer Science

The University of Nottingham Ningbo, China

Page 2: Unix and Software Tools (P51UST)  Awk Programming

What is awk?

• A pattern matching program for processing texts, initially implemented in 1977.

• The name AWK is derived from the family names of its authors: Alfred Aho, Peter Weinberger and Brian Kernighan

• There are different versions of awk:

– awk: the original version, sometimes called old awk, or oawk

– New awk: additional features added in 1985. Often called nawk

– GNU awk (gawk): has even more features

Page 3: Unix and Software Tools (P51UST)  Awk Programming

What does awk do?

• A text file is thought of as being made up of records and fields

• On this file you can:

– Do arithmetic and string operations

– Use loops and conditionals (if-then-else)

– Produce formatted reports
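As a rough sketch of these abilities, the snippet below treats each input line as a record with two fields, sums the second field, and prints a formatted report. The item names and prices are made-up sample data, not part of the lecture.

```shell
# Hypothetical sample data: one record per line, two space-separated fields.
report=$(awk '
    {
        total += $2                      # arithmetic on the second field
        printf "%-8s %6.2f\n", $1, $2    # one formatted line per record
    }
    END { printf "%-8s %6.2f\n", "TOTAL", total }
' <<'EOF'
apple 1.50
pear 0.75
plum 2.25
EOF
)
printf '%s\n' "$report"
```

The last line of the report holds the sum of the second field.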

Page 4: Unix and Software Tools (P51UST)  Awk Programming

What does awk do? (2)

• awk (new awk) also allows you to:

– Execute UNIX commands from within a script

– Process the output from UNIX commands

– Work with multiple input streams

– Define functions
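A small sketch of two of these features, assuming a new awk (nawk, gawk or any POSIX awk): it reads the output of a UNIX command with getline and uses a user-defined function. The function name max and the echo command are illustrative stand-ins only.

```shell
out=$(awk '
    # User-defined function: returns the larger of two numbers.
    function max(a, b) { return (a > b) ? a : b }
    BEGIN {
        cmd = "echo 3 9 4"        # any UNIX command; echo is a stand-in
        cmd | getline line        # read one line of the command output
        close(cmd)
        n = split(line, v, " ")   # break the line into numbers
        biggest = v[1] + 0
        for (i = 2; i <= n; i++)
            biggest = max(biggest, v[i] + 0)
        print "largest value: " biggest
    }
')
printf '%s\n' "$out"
```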

Page 5: Unix and Software Tools (P51UST)  Awk Programming

What does awk do? (3)

• awk can also be combined with shell scripting!

– Shell scripts are very easy and quick to write, but the shell lacks functionality.

– awk and shell are designed to be integrated

• Simply invoke the awk interpreter from within the shell script, rather than from the command line!
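One possible shape for such a script is sketched below. The shell holds the user name in a variable and hands it to awk with the standard -v option (available in POSIX awk, not in the original awk). The variable names and the three sample lines are assumptions chosen for illustration.

```shell
# Hypothetical script body: count the sessions of one user.
user="zliyccj2"     # in a real script this might come from "$1"

# -v copies the shell value into the awk variable u.
count=$(awk -v u="$user" '$1 == u { n++ } END { print n + 0 }' <<'EOF'
zliyccj Cai Jiangliang pts/4 1d Apr 8 10:51
zliyccj2 Chen Jianjun *pts/6 13:48 Apr 8 21:29
zliyccj2 Chen Jianjun pts/30 14:47 Apr 8 20:31
EOF
)
echo "$user has $count session(s)"
```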

Page 6: Unix and Software Tools (P51UST)  Awk Programming

Awk Syntax

The awk command has the following syntax:

awk [-F field_sep] 'program' target-file(s)

Or

awk [-F field_sep] -f program.file target-file(s)

Page 7: Unix and Software Tools (P51UST)  Awk Programming

Awk Syntax

awk [-F field_sep] 'program' target-file(s)

• The program is one or more awk programming commands, typed at the command line and enclosed in single quotes.

• target-file(s) is one or more input files for the command to process.

• Option -F: allows you to change awk's field separator. The default field separator is white space (one or more spaces or tabs).
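To illustrate the -F option, the sketch below splits colon-separated lines and prints the first field. The two account lines are invented sample data in the style of /etc/passwd.

```shell
# -F: makes the colon the field separator instead of white space.
names=$(awk -F: '{ print $1 }' <<'EOF'
alice:x:1001:1001:Alice Example:/home/alice:/bin/sh
bob:x:1002:1002:Bob Example:/home/bob:/bin/sh
EOF
)
printf '%s\n' "$names"
# prints:
# alice
# bob
```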

Page 8: Unix and Software Tools (P51UST)  Awk Programming

Awk Syntax

awk [-F field_sep] -f program.file target-file(s)

• Option -f specifies that the filename that follows contains the awk programming commands.

• Awk takes its instructions from that file rather than from the command line.

• target-file(s) is one or more input files for the command to process.

• Using the -f option is preferred:

– It is easier to debug, modify and enhance your awk programming commands.

– You can use the awk script again in future.

– It is easier to manage if the program grows over time.
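A minimal sketch of the -f form follows. The file names first.awk and users.txt are hypothetical, and both files are created in a temporary directory so the example is self-contained.

```shell
dir=$(mktemp -d)

# The awk program lives in its own file and can be reused later.
cat > "$dir/first.awk" <<'EOF'
# first.awk: print the first field of every line
{ print $1 }
EOF

printf 'zliyccl Cao Lizhou\nzliychl Huang Lun\n' > "$dir/users.txt"

users=$(awk -f "$dir/first.awk" "$dir/users.txt")
printf '%s\n' "$users"
rm -r "$dir"
```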

Page 9: Unix and Software Tools (P51UST)  Awk Programming

Example – First Taste of awk

• When you type the following command in a terminal:

$ cat fingr.txt
zliyccj Cai Jiangliang pts/4 1d Apr 8 10:51
zliyccj Cai Jiangliang *pts/17 15:12 Apr 8 19:11
zliyccj2 Chen Jianjun *pts/6 13:48 Apr 8 21:29
zliyccj2 Chen Jianjun pts/30 14:47 Apr 8 20:31
zliyccl Cao Lizhou pts/98 20:09 Apr 8 15:08
zliychj2 He Jiansen pts/19 17:04 Apr 8 18:14
zliychl Huang Lun *pts/26 14:23 Apr 8 18:04
...

• Task: extract the username field

Page 10: Unix and Software Tools (P51UST)  Awk Programming

Example – First Taste of awk

• Field separator: white space (spaces and tabs), the default

• Target-file: fingr.txt

• Commands:

awk '{print $1}' fingr.txt

Or

awk -F " " '{print $1}' fingr.txt
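For reference, this is roughly what the first command produces. The sketch feeds three of the fingr.txt lines shown earlier to awk through a here-document instead of reading the real file.

```shell
users=$(awk '{ print $1 }' <<'EOF'
zliyccj Cai Jiangliang pts/4 1d Apr 8 10:51
zliyccj2 Chen Jianjun *pts/6 13:48 Apr 8 21:29
zliyccl Cao Lizhou pts/98 20:09 Apr 8 15:08
EOF
)
printf '%s\n' "$users"
# prints:
# zliyccj
# zliyccj2
# zliyccl
```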

Page 11: Unix and Software Tools (P51UST)  Awk Programming

Awk Variables

• Some common awk variables

Variable    Contents
FS          Input field separator; default is white space
OFS         Output field separator; default is a single space
RS          Record separator; default is "\n"
NR          Number of the current record (line)
NF          Number of fields in the current record, based on FS
$1-$(NF)    The strings from the first field to the last field
$0          The whole current line
FILENAME    The name of the current target file
ARGV        An array of the command-line arguments (not supported by the original awk)
ARGC        The number of command-line arguments (not supported by the original awk)
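A short sketch of how several of these variables behave, assuming a POSIX awk. The second input line is made up, and OFS is set to " | " only to make the separator visible.

```shell
out=$(awk '
    BEGIN { OFS = " | " }           # placed between the print arguments
    { print NR, NF, $1, $NF }       # record number, field count, first and last field
' <<'EOF'
zliyccl Cao Lizhou pts/98 20:09 Apr 8 15:08
one two
EOF
)
printf '%s\n' "$out"
# prints:
# 1 | 8 | zliyccl | 15:08
# 2 | 2 | one | two
```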

Page 12: Unix and Software Tools (P51UST)  Awk Programming

Awk Program Format

• General Format

pattern {action}

– A pattern selects which of the lines from the file or files, if any, are acted upon.

– Patterns can be relational expressions, regular expressions, or others.

– An action is defined as one or more awk commands enclosed in a pair of curly brackets '{ }'.

Page 13: Unix and Software Tools (P51UST)  Awk Programming

Awk Program Format

– The action associated with a particular pattern must begin on the same line as the pattern with which it is associated.

– Program 1 and Program 2 behave very differently.

Program 1:

pattern {
    action 1
    action 2
}

Program 2:

pattern
{
    action 1
    action 2
}

Page 14: Unix and Software Tools (P51UST)  Awk Programming

Awk Program Format

• Patterns without associated actions will print the lines that are matched, while actions without patterns will be applied on every line.

• In Program 1, the actions will be performed only on lines that match the pattern.

• In Program 2, the opening curly bracket is on a new line, so the pattern has no associated action: each line that matches the pattern is displayed, and the actions are performed on every line.

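The difference between the two layouts can be checked with a small experiment. In this sketch the pattern /^a/ and the three fruit names are arbitrary sample choices.

```shell
data='apple
banana
avocado'

# Program 1: brace on the pattern line, so the action runs only on matches.
p1=$(printf '%s\n' "$data" | awk '/^a/ {
    print "match: " $0
}')

# Program 2: brace on the next line. The bare pattern prints matching lines,
# and the separate action block runs on every line.
p2=$(printf '%s\n' "$data" | awk '/^a/
{
    print "match: " $0
}')

printf 'Program 1:\n%s\n\nProgram 2:\n%s\n' "$p1" "$p2"
```

Program 1 prints two lines. Program 2 prints five, because banana also reaches the action and the two matching lines are printed twice.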

Page 15: Unix and Software Tools (P51UST)  Awk Programming

Patterns (1)

• Relational expressions

$1 < 4 {action}
$1 <= 4 {action}
$1 == "bac" {action}
($1 < $2 && $3 > $1) || $4 != "abc" {action}

• The tilde (~)

$1 ~ /^z/ {action}

Note: A regular expression must be enclosed in forward slashes '/'

Symbol    Meaning
< (<=)    Less than (less than or equal to)
> (>=)    Greater than (greater than or equal to)
!=        Not equal to
==        Equal to
~         Contains (or matches) the regular expression
!~        Does not contain (or match) the regular expression
&&        Logical AND
||        Logical OR
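As an illustration of combining these operators, the sketch below keeps user names that start with "zliych" but do not end in "2". The pattern is an invented example, applied to three lines in the fingr.txt format.

```shell
out=$(awk '$1 ~ /^zliych/ && $1 !~ /2$/ { print $1 }' <<'EOF'
zliyccj Cai Jiangliang pts/4 1d Apr 8 10:51
zliychj2 He Jiansen pts/19 17:04 Apr 8 18:14
zliychl Huang Lun *pts/26 14:23 Apr 8 18:04
EOF
)
printf '%s\n' "$out"
# prints:
# zliychl
```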

Page 16: Unix and Software Tools (P51UST)  Awk Programming

Example

(($1 < $2 && $3 > $1) && $4 != "frog") || $6 > 0

Data:

1 4 0 toad frog 0
3 7 9 frog salamander 0
3 7 9 fish salamander 0
0 0 1 cricket fish 9
5 1 3 toad spider -4
0 0 0 wasp bee 1

Which of these lines do you predict will match?
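One way to check a prediction is to run the expression as an awk pattern over the six data lines. The sketch below does that with a here-document and prints every record that matches.

```shell
matched=$(awk '(($1 < $2 && $3 > $1) && $4 != "frog") || $6 > 0 { print $0 }' <<'EOF'
1 4 0 toad frog 0
3 7 9 frog salamander 0
3 7 9 fish salamander 0
0 0 1 cricket fish 9
5 1 3 toad spider -4
0 0 0 wasp bee 1
EOF
)
printf '%s\n' "$matched"
# prints:
# 3 7 9 fish salamander 0
# 0 0 1 cricket fish 9
# 0 0 0 wasp bee 1
```

The third record satisfies the left-hand side. The fourth and sixth match only because their sixth field is greater than 0.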

Page 17: Unix and Software Tools (P51UST)  Awk Programming

Patterns (2)

• Regular expressions

– You should know this already

– A regular expression should be enclosed in forward slashes

awk '/6$/ {print $0}' fingr.txt


Page 19: Unix and Software Tools (P51UST)  Awk Programming

Pattern (3)

• Special Patterns: BEGIN, END

BEGIN:

– Is the first pattern in an awk script

– The associated actions will be performed before the target file is opened.

– The open curly bracket must appear on the same line as the BEGIN

– A good place to print out headers, initialise counters, and set the field separator for the target file.

– Do not try to manipulate target files in the associated actions.

Page 20: Unix and Software Tools (P51UST)  Awk Programming

Pattern -BEGIN

BEGIN {
    FS = ":"
    OFS = " "
    count = 0
    print ("This is a cool heading")
}
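A sketch of a BEGIN block like this one at work: the separator and counter are set up before any input is read, then used by the main action and an END block. The heading text and the two account lines are made-up sample values.

```shell
out=$(awk '
BEGIN {
    FS = ":"              # input fields are separated by colons
    OFS = " "
    count = 0
    print "user shell"    # heading, printed before any input is read
}
{ print $1, $7; count++ }
END { print count " accounts" }
' <<'EOF'
alice:x:1001:1001:Alice Example:/home/alice:/bin/sh
bob:x:1002:1002:Bob Example:/home/bob:/bin/bash
EOF
)
printf '%s\n' "$out"
# prints:
# user shell
# alice /bin/sh
# bob /bin/bash
# 2 accounts
```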

Page 21: Unix and Software Tools (P51UST)  Awk Programming

Pattern (4)

• Special Patterns: END

– Is the last pattern in an awk script

– The associated actions will be performed after the target file is successfully closed.

– The open curly bracket must appear on the same line as the END

– Useful for putting trailers on files, summing up the data presented, and doing other end-of-file housekeeping tasks.

Page 22: Unix and Software Tools (P51UST)  Awk Programming

Patterns – Putting Things Together

BEGIN {
    print ("Here come the lines from the input file:")
    print " "
}

{print $0}

END {
    print " "
    print "There were "NR" lines in the file "FILENAME"."
}

Page 23: Unix and Software Tools (P51UST)  Awk Programming

Examples

BEGIN {
    FS = "\n"
    RS = ""    # empty line as a record separator
}
{ print $0 }
END {
    # In END, NF holds the field count of the last record read (POSIX awk)
    print "total number of fields is " NF
}
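To see what these FS and RS settings do, the sketch below reads two blank-line-separated records and reports how many lines (fields) each one contains. The names and addresses are invented sample data.

```shell
out=$(awk '
BEGIN { FS = "\n"; RS = "" }      # each blank-line-separated block is one record
{ print "record " NR " has " NF " lines; first line: " $1 }
' <<'EOF'
Cai Jiangliang
Room 101

Chen Jianjun
Room 202
Ningbo
EOF
)
printf '%s\n' "$out"
# prints:
# record 1 has 2 lines; first line: Cai Jiangliang
# record 2 has 3 lines; first line: Chen Jianjun
```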

Page 24: Unix and Software Tools (P51UST)  Awk Programming

Next Lecture

• Awk commands

• Loops and conditionals

• Arrays

• Functions
