Filters and Utilities
Notes:
•This is a simple overview of the filtering capability
•Some of these commands are very powerful
▫Only showing some of the basics of a few of the commands
Reminder:
•Grave accent (`)
▫AKA backtick or backquote
▫Used for command substitution in bash and other Linux utilities and languages
▫Typical use: put a command between a pair of backticks; the std out of the command is substituted in its place
▫Example:
  #echo The date is:`date`!
  The date is:Sun Mar 17 15:51:28 EDT 2013!
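A minimal sketch of command substitution, assuming a POSIX-style shell such as bash. The variable names are illustrative, and the `$(...)` form (not shown on the slide) is the equivalent, easier-to-nest alternative to backticks.

```shell
# Backtick form: the std out of the inner command replaces the backticked text.
greeting=`echo hello`
# Equivalent $(...) form; here it captures the number of lines produced.
count=$(printf 'a\nb\nc\n' | wc -l | tr -d ' ')
echo "greeting=$greeting count=$count"
```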
What are Filters?
▫Use std in and std out
  Monitor the input
  Modify data as appropriate
    Change
    Delete
    Move
    "as appropriate"
  Send data to standard out
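As a rough illustration of the idea, the sketch below chains three filters with pipes; each one reads std in, changes the data, and writes std out. The sample words and the variable name are made up for the example.

```shell
# tr changes the data, sort moves lines, uniq deletes adjacent repeats.
result=$(printf 'pear\napple\npear\n' | tr 'a-z' 'A-Z' | sort | uniq)
echo "$result"
```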
Filter examples
• Simple
▫ pr, cmp, diff, comm, head, tail, cut, paste, sort, uniq, tr
• Complex
▫ grep, sed
• Filter/script
▫ awk
pr: Paginate Files
• Prepare files for printing
• Adds:
▫ Headers
▫ Footers
▫ Formatted text
• Default adds a 5-line header and a 5-line footer to each page
• Options:
▫ Make columns
▫ Set page length
▫ Set page width
▫ Number lines in output
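A short sketch of the options listed above, assuming a POSIX pr. The -t option (omit the header and footer) is not on the slide; it is used here only to keep the captured output short. The sample text is invented.

```shell
# -n numbers the lines; -t (an assumption, not from the slides) drops header and footer.
numbered=$(printf 'one\ntwo\nthree\n' | pr -t -n)
echo "$numbered"
# Two columns (-2), 20-line pages (-l 20), 40 characters wide (-w 40).
# The header printed here includes the current date and a page number.
printf 'one\ntwo\nthree\nfour\n' | pr -2 -l 20 -w 40
```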
cmp: Byte by Byte Compare
•Compares two files
•Terminates on first delta
▫Echoes the location of the first mismatch
  Reports the byte offset and line number
▫Returns (as exit status):
  True (0) if identical
  False (non-zero) otherwise
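The sketch below shows one way to use the return value in a script. The file names are hypothetical, and the -s (silent) option is an addition not covered on the slide.

```shell
tmp=$(mktemp -d)
printf 'abc\n' > "$tmp/one"
printf 'abd\n' > "$tmp/two"
# -s suppresses the message so only the exit status is used.
if cmp -s "$tmp/one" "$tmp/one"; then same=yes; else same=no; fi
if cmp -s "$tmp/one" "$tmp/two"; then differ=no; else differ=yes; fi
echo "same=$same differ=$differ"
# Without -s, cmp reports where the first mismatch is.
cmp "$tmp/one" "$tmp/two" || true
rm -rf "$tmp"
```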
comm: What Is Common between files
•Compares files line by line
▫Requires sorted files to work properly
•Returns 3 types of differently indented lines
▫ Lines unique to first file
▫ Lines unique to second file
▫ Lines common to both
• Output is “weird” in columns
  1st col is lines unique to 1st file
  2nd col is lines unique to 2nd file
  3rd col is common lines
• comm.sh in ~/ITIS3110/bashscripts
• commbad.sh (with error)
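A small sketch of the three-column output, using two made-up sorted files. This is not the comm.sh script mentioned above. The -12 and -23 options, which suppress the named columns, are not on the slide.

```shell
tmp=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$tmp/a"
printf 'banana\ncherry\ndate\n' > "$tmp/b"
# Full output: three tab-indented columns.
comm "$tmp/a" "$tmp/b"
# Suppress columns 1 and 2 to keep only the common lines.
common=$(comm -12 "$tmp/a" "$tmp/b")
# Suppress columns 2 and 3 to keep only lines unique to the first file.
only_a=$(comm -23 "$tmp/a" "$tmp/b")
echo "common: $common"
echo "only in a: $only_a"
rm -rf "$tmp"
```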
diff: "How to make files the same"
•Details how to change one file to make it the same as the other
▫For each delta, gives instructions for how to change it
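To make the "instructions" concrete, here is a sketch with two invented files that differ on one line. In the default output, `2c2` means "change line 2 of the first file to match line 2 of the second".

```shell
tmp=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$tmp/old"
printf 'one\n2\nthree\n' > "$tmp/new"
# diff exits non-zero when the files differ, so guard it with || true.
delta=$(diff "$tmp/old" "$tmp/new" || true)
echo "$delta"
rm -rf "$tmp"
```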
head: Display beginning of file
•Show the first n lines of a file
▫Default is 10
▫Can change with -n x
•Example use:
▫Want to re-edit the last file you edited:
▫nano `ls -t | head -n 1`
  ls -t: list by time, newest first
  head -n 1: list first entry
  Feed as a parameter to nano with the backticks
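The same idea can be tried without opening an editor. This sketch assumes a scratch directory with two hypothetical files whose timestamps are set explicitly with touch -t.

```shell
tmp=$(mktemp -d)
touch -t 202001010000 "$tmp/old.txt"
touch -t 202101010000 "$tmp/new.txt"
# ls -t lists newest first, so head -n 1 picks the most recently modified file.
newest=$(ls -t "$tmp" | head -n 1)
# With no -n, head shows the first 10 lines.
default_count=$(printf '%s\n' 1 2 3 4 5 6 7 8 9 10 11 12 | head | wc -l | tr -d ' ')
echo "newest=$newest default_count=$default_count"
rm -rf "$tmp"
```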
tail: Display end of file
•Show the last n lines of a file
▫Default is 10
▫Can change with -n x
•Options
▫-f
  Monitor the file as it grows
  Must terminate with <ctrl-C>
▫-c n
  Do the last n chars (bytes) instead of lines
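A quick sketch of -n and -c on invented input; -f is left out because it runs until interrupted.

```shell
# Last 2 lines of a 5-line stream.
last_lines=$(printf '%s\n' 1 2 3 4 5 | tail -n 2)
# Last 3 characters (bytes) instead of lines.
last_chars=$(printf 'abcdef' | tail -c 3)
echo "$last_lines"
echo "$last_chars"
```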
cut: Splitting a file vertically
• Cuts a range out by:
▫Columns
  Good for fixed length entries
  -c range
    -c1-4
▫Fields
  Good for delimited entries
  Tab is default
  -d specifies delimiter
    -d/ sets the / as the delimiter
  -f specifies the fields to use
    -f1,4 specifies the first and fourth fields
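The two modes can be sketched as follows; the input strings are made up for illustration.

```shell
# Columns: keep characters 1 through 4.
by_col=$(printf 'abcdefgh\n' | cut -c1-4)
# Fields: / is the delimiter, keep the first and fourth fields.
by_field=$(printf 'usr/local/bin/tool\n' | cut -d/ -f1,4)
echo "$by_col"
echo "$by_field"
```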
paste: Paste files vertically
•Paste two files together line by line
•Can be used on a single file to join multiple sequential lines together
▫-s
  Do serial on a single file
▫-d
  Separate joined elements with the list of delimiters
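A sketch of both uses with invented data. The file names are hypothetical, and a lone `-` as the file name tells paste to read std in.

```shell
tmp=$(mktemp -d)
printf 'a\nb\n' > "$tmp/names"
printf '1\n2\n' > "$tmp/nums"
# Two files side by side, joined with : instead of the default tab.
side=$(paste -d: "$tmp/names" "$tmp/nums")
# Serial mode: join all lines of one input; the delimiter list , ; is cycled.
serial=$(printf 'a\nb\nc\nd\n' | paste -s -d ',;' -)
echo "$side"
echo "$serial"
rm -rf "$tmp"
```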
sort: Order files
• Put files in order
▫ Default is ascending ASCII order, comparing whole lines starting at column 1
• Options:
▫ -t
  Define a delimiter
▫ -k
  Used with -t, which field to use
  Can have multiple keys
    Use commas to separate ranges
    Use -k again to denote a new field
  Can sort on columns in a field
    Use a dot to separate
▫ -n
  Treat a field as a number, not an ASCII character
  Remember the number 1 is different than the character "1"
▫ -u
  Remove repeated lines
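The difference between text and numeric keys is easiest to see on a small example. The records below are invented, and LC_ALL=C is set (an assumption, not from the slides) so the text comparisons use plain ASCII order.

```shell
data='carol:30:west
alice:9:east
bob:30:east
dave:4:west'
# Field 2 compared as text: "30" sorts before "4" and "9".
textual=$(printf '%s\n' "$data" | LC_ALL=C sort -t: -k2,2)
# Field 2 compared as a number.
numeric=$(printf '%s\n' "$data" | LC_ALL=C sort -t: -k2,2n)
# Two keys: field 3 first, then field 2 as a number.
twokey=$(printf '%s\n' "$data" | LC_ALL=C sort -t: -k3,3 -k2,2n)
# -u removes repeated lines.
unique=$(printf 'b\na\nb\n' | LC_ALL=C sort -u)
echo "$numeric"
```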
uniq: Locating identical lines
•Returns only unique lines
▫Only adjacent repeated lines are detected, so sort the input first
▫Options:
  -u
    Return only the non-repeated lines
  -d
    Return only the repeated lines
    But only one copy of each
  -c
    Return the count of how many times each line is repeated
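A sketch of the options on a small invented input. The input is sorted first because uniq only compares neighbouring lines, and the sed at the end just strips the padding that uniq -c puts before each count.

```shell
sorted=$(printf 'b\na\nb\nc\nb\n' | sort)
# One copy of every line.
all=$(printf '%s\n' "$sorted" | uniq)
# Only the lines that are not repeated.
singles=$(printf '%s\n' "$sorted" | uniq -u)
# Only the repeated lines, one copy of each.
repeats=$(printf '%s\n' "$sorted" | uniq -d)
# Count for each line, with leading spaces removed.
counts=$(printf '%s\n' "$sorted" | uniq -c | sed 's/^ *//')
echo "$counts"
```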
tr: Translate characters
• Changes one set of characters to another, default input is the standard input
• Example:
  #tr 'ab' 'cd'
  This is abnormal
  This is cdnormcl
  absolute
  cdsolute
  ab a b c
  cd c d c
  ^C
  Each line typed on std in is followed by the translated line on std out
▫ Note: a becomes c and b becomes d, not ab becomes cd
▫ Note: ^D can be used to denote end of file to tr instead of the shown ^C, which stops the tr process
tr: Translate characters
• More examples:
▫Can be used to translate case for a file
  tr a-z A-Z <file1
  or
  tr '[a-z]' '[A-Z]' <file1
  Takes the input from file1 with the < redirection
  Turns all lower case letters to upper case
  Output goes to std out
▫Get rid of characters
  tr -d 'a-z' <file1
  Gets rid of all lower case chars from file1
  Again output is std out
▫Compressing repeated chars
  tr -s ' ' <file1
  Changes repeated spaces to a single space
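The tr examples can be run non-interactively by piping text in instead of typing it. The input strings here are illustrative.

```shell
# Character-by-character mapping: a becomes c, b becomes d.
mapped=$(printf 'This is abnormal\n' | tr 'ab' 'cd')
# Lower case to upper case.
upper=$(printf 'This is abnormal\n' | tr 'a-z' 'A-Z')
# Delete all lower case characters.
deleted=$(printf 'Hello World\n' | tr -d 'a-z')
# Squeeze runs of spaces down to one space.
squeezed=$(printf 'a   b    c\n' | tr -s ' ')
echo "$mapped"
echo "$upper"
echo "$deleted"
echo "$squeezed"
```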
Filters Using Regular Expressions
Regular Expression Review
Regular Expression
•A pattern to match strings of text which is:
▫Concise
▫Flexible
•Used by many programming languages and operating systems
Regular Expressions
•BRE
▫Basic Regular Expression
•ERE
▫Extended Regular Expression
•IRE
▫Interval Regular Expression
•TRE
▫Tagged Regular Expression
Character class
•Set of characters enclosed within square brackets [ ]
▫Can be a list of single characters
  [aD1]
  a, D, and the character 1 only
▫Can be a range of characters
  [a-zA-Z]
  All the upper and lower case chars
▫Negate a class
  [^0-9]
  Not the numeric chars 0-9
Regular Expressions
•*
▫Refers to the immediately preceding character
▫Any number of repeated character(s)
  0 or more
  Used with other patterns
▫A*
  Anything that matches 0 or more 'A's in a row
▫s*print will match sprint, ssprint, sssprint and print!
• Note: this is not related to the familiar wildcard *
Regular Expressions
• .
▫Any character
  Exactly one
▫S... will match Sort, Sxxx, …
  Any four char string starting with S
▫Note .* means 0 or more of any character
•Pattern starting locations
▫^
  Pattern starts at the beginning of a line
▫$
  Pattern ends at the end of a line
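These basic patterns can be tried with grep. The sketch below uses an invented word list and grep's -c option (count matching lines, not covered on the slides) so the results are easy to check.

```shell
sample='print
sprint
ssprint
Sort
Sxxx
Short
a1
D
x9'
# s* = zero or more s; ^ and $ anchor the pattern to the whole line.
star=$(printf '%s\n' "$sample" | grep -c '^s*print$')
# S followed by exactly three characters (Short is too long).
dots=$(printf '%s\n' "$sample" | grep -c '^S...$')
# Lines starting with a, D, or the character 1.
class=$(printf '%s\n' "$sample" | grep -c '^[aD1]')
# Lines whose last character is not a digit.
negated=$(printf '%s\n' "$sample" | grep -c '[^0-9]$')
echo "star=$star dots=$dots class=$class negated=$negated"
```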
Extended Regular Expressions
•|
▫Either one of a set
▫a|b
  Matches if an a or a b
•( and )
▫Group the chars between the parentheses so the alternatives combine with what is before or after
▫'animaltype:(dog|cat)'
  look for animaltype:dog or animaltype:cat
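A sketch of alternation and grouping, assuming grep -E to enable extended regular expressions. The sample records are invented.

```shell
animals='animaltype:dog
animaltype:cat
animaltype:emu'
# The group (dog|cat) is combined with the literal prefix before it.
pets=$(printf '%s\n' "$animals" | grep -E 'animaltype:(dog|cat)')
# Plain alternation: lines containing an a or a b.
either=$(printf 'a\nb\nc\n' | grep -E 'a|b')
echo "$pets"
echo "$either"
```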
Resume 9/8
Advanced Filters
grep
grep – Search a pattern
•Searches for a pattern in a file
▫grep options pattern filename(s)
  std in is used if there is no filename
  Can also pipe data to grep
▫Notes:
  Pattern does not need to be quoted if there are no delimiters or special chars in it
  Can always use quotes to be safe
grep - Options
• -i
▫ Ignore case
• -v
▫ Don't display lines matching expression
• -l
▫ Display only the names of files that contain a match
  Useful when grepping multiple files
• -e pattern
▫ Useful when grepping for a pattern that starts with -
• -x
▫ Match entire line
• -f file
▫ Takes expression from a file
grep - examples
•Examples:
▫#grep 3 bigfile
  file 3 text
▫#grep file bigfile
  file 1 text
  file 1 text
  file 3 text
  file 1 text
  file 1 text
  file 1 text
▫#cat bigfile
  file 1 text
  file 1 text
  file 3 text
  file 1 text
  file 1 text
  file 1 text
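The options can be exercised on a small scratch file. The file names and contents below are hypothetical and chosen so that each option selects something different.

```shell
tmp=$(mktemp -d)
printf 'file 1 text\nFile 3 text\n-v flag\n' > "$tmp/bigfile"
printf 'nothing here\n' > "$tmp/other"
printf 'flag\n' > "$tmp/patterns"
# -i: ignore case, so FILE matches both file and File.
nocase=$(grep -i 'FILE' "$tmp/bigfile")
# -v: lines that do NOT match.
inverted=$(grep -v 'text' "$tmp/bigfile")
# -l: only the names of the files that contain a match.
names=$(cd "$tmp" && grep -l 'text' bigfile other)
# -e: lets the pattern itself start with a dash.
dash=$(grep -e '-v' "$tmp/bigfile")
# -x: the pattern must match the entire line.
whole=$(grep -x 'file 1 text' "$tmp/bigfile")
# -f: read the pattern(s) from a file.
fromfile=$(grep -f "$tmp/patterns" "$tmp/bigfile")
echo "$names"
rm -rf "$tmp"
```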
sed
sed – Stream Editor
• Edit a file(s) with a specified action
▫ sed options 'address action' file(s)
• Basics:
▫ Takes input from the file(s)
▫ Performs the action on each input line (the files themselves are not changed)
▫ Sends output to std out
• Uses:
▫ Select part(s) of a file
  By line
  By content
▫ Edit a file
  e.g. create a template, then use sed to customize for a run
• Oddities
▫ Usually need -n to get rid of unwanted duplicated lines
sed – Line addressing
• Select specific lines
▫ #sed '3q' tenline.file
  Line 1
  Line 2
  Line 3
  Selects the first 3 lines then quits
▫ #sed -n '$p' tenline.file
  Last Line
  Prints last line
    $ - last line
    p - print
    -n - do not echo the input lines
▫ #sed -n '5,7p' tenline.file
  Line 5
  Line 6
  Line 7
  Prints lines 5 through 7
• #cat tenline.file
  Line 1
  Line 2
  Line 3
  Line 4
  Line 5
  Line 6
  Line 7
  Line 8
  Line 9
  Last Line
sed – Line addressing
• Select specific lines with ;
▫ #sed -n '1p;3p;$p' tenline.file
  Line 1
  Line 3
  Last Line
  Prints line 1, 3 and the last line ($)
• ! Will negate operations
▫ #sed -n '3,$!p' tenline.file
  Line 1
  Line 2
  Does not print line 3 through the end
• Notes:
▫ By default sed will echo the input lines as well as the selected lines, so you get duplicated lines
  Use -n to not echo the input lines
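The line-addressing examples can be reproduced with a scratch copy of the ten-line file. The sketch below rebuilds that file in a temporary directory (the path is an assumption) and also counts the output of `$p` without -n to show the duplicated line.

```shell
tmp=$(mktemp -d)
f="$tmp/tenline.file"
printf 'Line %s\n' 1 2 3 4 5 6 7 8 9 > "$f"
printf 'Last Line\n' >> "$f"
first3=$(sed '3q' "$f")            # quit after line 3
last=$(sed -n '$p' "$f")           # only the last line
middle=$(sed -n '5,7p' "$f")       # lines 5 through 7
picked=$(sed -n '1p;3p;$p' "$f")   # lines 1, 3 and the last
negated=$(sed -n '3,$!p' "$f")     # everything except line 3 to the end
# Without -n every input line is echoed and the last line is printed twice.
echoed=$(sed '$p' "$f" | wc -l | tr -d ' ')
echo "$picked"
rm -rf "$tmp"
```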
sed – Context addressing
•Use a pattern to identify lines to work with
▫Use / to delimit the pattern
•Examples
▫#sed -n '/2/p' tenline.file
  Line 2
  Find all lines with 2 in them and print
▫#sed -n '/^2/p' tenline.file
  Finds all lines that start with 2 and prints them
  ^ - start of the line
sed – Writing selected lines to a file
•Can use w to write the selected lines to a file
•Example
▫sed -n '/2/w twos.file' tenline.file
  w instead of p puts the output in a file
  -n does not echo the input lines to std out
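A sketch of context addressing and the w action, again on a scratch copy of the ten-line file. The temporary paths are assumptions; everything after `w ` up to the end of the command is taken as the output file name.

```shell
tmp=$(mktemp -d)
f="$tmp/tenline.file"
printf 'Line %s\n' 1 2 3 4 5 6 7 8 9 > "$f"
printf 'Last Line\n' >> "$f"
# Lines containing a 2.
twos=$(sed -n '/2/p' "$f")
# Lines starting with a 2: none here, since every line starts with L.
starts=$(sed -n '/^2/p' "$f")
# Same selection as the first command, written to a file instead of std out.
sed -n "/2/w $tmp/twos.file" "$f"
written=$(cat "$tmp/twos.file")
echo "twos=$twos written=$written"
rm -rf "$tmp"
```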
sed – Text editing
•Can edit the stream
▫i Insert
▫a Append
▫c Change
▫d Delete
▫s Substitute
sed - editing
•Example: inserting
▫#sed '1i\
  >#!/bin/bash\
  ># using the bash shell
  >' test.sh > $$
  Notes:
    The leading > on each continued line is the shell's continuation prompt, not typed text
    1i inserts text starting at line 1
    Need \ as a continuation character within the quotes
    Input is the code or text in test.sh
    Redirecting the output to $$ (temporary file)
    Ends up with the 2 new lines at the beginning in $$
    Can further modify $$
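The insert can be tried as a script rather than at the prompt. This sketch assumes GNU sed, uses an invented one-line test.sh, and writes to a named scratch file instead of $$.

```shell
tmp=$(mktemp -d)
printf 'echo hello\n' > "$tmp/test.sh"
# Each inserted line except the last ends with \ to continue the text.
sed '1i\
#!/bin/bash\
# using the bash shell
' "$tmp/test.sh" > "$tmp/new.sh"
first=$(head -n 1 "$tmp/new.sh")
total=$(wc -l < "$tmp/new.sh" | tr -d ' ')
cat "$tmp/new.sh"
rm -rf "$tmp"
```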
sed - editing
•Use s to indicate substitution
•Example: substituting
▫sed 's/a/b/' file
  replaces a with b for the first instance on each line
▫sed 's/a/b/g' file
  g (global) replaces a with b for all instances on each line
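The difference the g flag makes shows up on any line with more than one match. The word used here is just an illustration.

```shell
# Without g only the first a on the line is replaced.
first_only=$(printf 'banana\n' | sed 's/a/b/')
# With g every a on the line is replaced.
every=$(printf 'banana\n' | sed 's/a/b/g')
echo "$first_only"
echo "$every"
```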