
LinuxCBT_Awk_

Author: Anonymous | Source: Linux系统 | Published: 2012-03-28

###FEATURES COMMON TO BOTH AWK & SED###

1. Both are scripting languages
2. Both work primarily with text files
3. Both are programmable editors
4. Both accept command-line options and can be scripted (-f script_name)
5. Both GNU versions support POSIX (GREP) and EGREP RegExes
6. Lineage = ed (editor) -> sed -> awk

###SED's FEATURES###
1. Non-interactive editor
2. Stream Editor
a. Manipulates input - performing edits as instructed
b. Sed accepts input on/from: STDIN (Keyboard), File, Pipe (|)
3. Sed Loops through ALL input lines of input stream or file, by DEFAULT
4. Does NOT operate on the source file, by default. (Will NOT clobber the original file, unless instructed to do so)
5. Supports addresses to indicate which lines to operate on: /^$/d - deletes blank lines
6. Stores the active (current) line in the 'pattern space' and maintains a 'hold space' for later use
7. Used primarily to perform Search-and-Replaces

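The pattern space and hold space from item 6 can be seen at work in the classic line-reversal one-liner. This is a minimal sketch on hypothetical three-line input, not something taken from the course material:

```shell
# Reverse the order of lines using the hold space (hypothetical input).
#   1!G  - on every line except the first, append the hold space to the pattern space
#   h    - copy the pattern space into the hold space
#   $p   - on the last line, print the accumulated pattern space
reversed=$(printf '%s\n' 'cat' 'dog' 'deer' | sed -n '1!G;h;$p')
printf '%s\n' "$reversed"
```

Each line is stacked on top of the lines seen so far, so the last line printed first is the natural result.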
###AWK's FEATURES###
1. Field processor based on whitespace, by default
2. Used for reporting (extracting specific columns) from a data feed
3. Supports programming constructs:
a. loops (for, while, do)
b. conditions (if, else)
c. arrays (lists)
d. functions (string, numeric, user-defined)
4. Automatically tokenizes words in a line for later usage - $1, $2, $3, etc. (This is based on the current delimiter)
5. Automatically loops through input like Sed, making lines available for processing
6. Ability to execute shell commands using the 'system()' function

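The programming constructs in item 3 can be sketched briefly. The two-column feed and the function name 'twice' are hypothetical, chosen only for illustration:

```shell
# Hypothetical two-column feed: animal name and a count.
feed=$(printf '%s\n' 'dog 3' 'cat 5' 'dog 4')

# Array + loop: total the second column per animal (sort makes the order predictable).
totals=$(printf '%s\n' "$feed" | awk '{ sum[$1] += $2 } END { for (k in sum) print k, sum[k] }' | sort)

# User-defined function + condition: label each row by the size of its count.
labels=$(printf '%s\n' "$feed" | awk '
function twice(x) { return x * 2 }
{ if ($2 > 3) print $1, twice($2); else print $1, "small" }')

printf '%s\n' "$totals" "$labels"
```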

###REGULAR EXPRESSIONS (RegEx) REVIEW###
Regular Expressions (RegExes) are key to mastering Awk & Sed

###METACHARACTERS###
^ - matches the character(s) at the beginning of a line
a. sed -ne '/^dog/p' animals.txt

$ - matches the character(s) at the end of a line
a. sed -ne '/dog$/p' animals.txt

Task: Match lines which contain only 'dog':
a. sed -ne '/^dog$/p' animals.txt
b. sed -ne '/^dog$/p' - reads from STDIN, Press Enter after each line, Terminate with CTRL-D
c. cat animals.txt | sed -ne '/^dog$/p'
d. cat animals.txt | sed -ne '/^dog$/Ip' - Prints matches case-insensitively ('I' is a GNU Sed extension)

. - matches any single character (typically except newline)
a. sed -ne '/^d...$/Ip' animals.txt
b. sed -ne '/^d.../Ip' animals.txt

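The anchors and the '.' metacharacter can be tried against a small file. The contents of animals.txt are not given in these notes, so the sample data here is hypothetical:

```shell
# Hypothetical sample data standing in for animals.txt.
tmp=$(mktemp -d)
printf '%s\n' 'dog' 'Dog' 'dog12' 'hotdog' 'deer' 'cat' > "$tmp/animals.txt"

# ^ anchors the match to the start of the line
starts=$(sed -ne '/^dog/p' "$tmp/animals.txt")
# $ anchors the match to the end of the line
ends=$(sed -ne '/dog$/p' "$tmp/animals.txt")
# Both anchors together: the whole line must be 'dog'
only=$(sed -ne '/^dog$/p' "$tmp/animals.txt")
# . matches any single character: 'd' followed by exactly three more characters
four=$(sed -ne '/^d...$/p' "$tmp/animals.txt")

printf 'starts: %s\nends: %s\nonly: %s\nfour: %s\n' \
  "$(echo $starts)" "$(echo $ends)" "$only" "$four"
rm -rf "$tmp"
```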
###REGEX QUANTIFIERS###
* - 0 or more matches of the previous character
+ - 1 or more matches of the previous character
? - 0 or 1 of the previous character

a. sed -ne '/^d.\+/Ip' animals.txt
Note: In Sed's default (basic) RegEx syntax, '+' and '?' act as quantifiers only when escaped with '\' ('\+' and '\?' in GNU Sed); '*' needs no escape

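The three quantifiers behave differently on the same input. A minimal sketch, assuming GNU Sed (for '\+' and '\?') and hypothetical input lines:

```shell
# Hypothetical input: 'd' and 'g' separated by zero, one, or two 'o' characters.
words=$(printf '%s\n' 'd' 'do' 'dog' 'doog' 'dg')

# *  : zero or more 'o'
star=$(printf '%s\n' "$words" | sed -ne '/^do*g$/p')
# \+ : one or more 'o' (escaped in basic RegEx syntax)
plus=$(printf '%s\n' "$words" | sed -ne '/^do\+g$/p')
# \? : zero or one 'o' (escaped in basic RegEx syntax)
opt=$(printf '%s\n' "$words" | sed -ne '/^do\?g$/p')
# With -E (extended RegExes) the quantifier is written without the backslash
ere=$(printf '%s\n' "$words" | sed -E -ne '/^do+g$/p')

printf '%s\n' "star: $(echo $star)" "plus: $(echo $plus)" "opt: $(echo $opt)" "ere: $(echo $ere)"
```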
###CHARACTERS CLASSES###
Allow searching for a range of characters
a. [0-9]
b. [a-z][A-Z]

a. sed -ne '/^d.\+[0-9]/Ip' animals.txt

Note: Character Classes match 1, and only 1 character

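The point that a class matches exactly one character is easiest to see with anchors around it. A small sketch on hypothetical data; LC_ALL=C is set as a precaution because some locales make '[a-z]' match uppercase letters too:

```shell
# Hypothetical sample lines.
lines=$(printf '%s\n' 'dog' 'dog1' 'dog12' 'Deer7' 'cat')

# Any line containing at least one digit
digit=$(printf '%s\n' "$lines" | LC_ALL=C sed -ne '/[0-9]/p')
# Lines starting with one uppercase letter
upper=$(printf '%s\n' "$lines" | LC_ALL=C sed -ne '/^[A-Z]/p')
# Lowercase letters followed by exactly ONE digit: 'dog12' fails because
# [0-9] consumes a single character and '$' must follow immediately
one=$(printf '%s\n' "$lines" | LC_ALL=C sed -ne '/^[a-z]*[0-9]$/p')

printf '%s\n' "digit: $(echo $digit)" "upper: $upper" "one: $one"
```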

###INTRO TO SED###
Usage:
1. sed [options] 'instruction' file | PIPE | STDIN
2. sed -e 'instruction1' -e 'instruction2' ...
3. sed -f script_file_name file
Note: Execute Sed by indicating instruction on one of the following:
1. Command-line
2. Script File

Note: Sed accepts instructions based on '/pattern_to_match/action'

###Print Specific Lines of a file###
Note: '-e' is optional if there is only 1 instruction to execute
sed -ne '1p' animals.txt - prints first line of file
sed -ne '2p' animals.txt - prints second line of file
sed -ne '$p' animals.txt - prints last line of file
sed -ne '2,4p' animals.txt - prints lines 2-4 from file
sed -ne '1!p' animals.txt - prints ALL EXCEPT line #1
sed -ne '1,4!p' animals.txt - prints ALL EXCEPT lines 1 - 4
sed -ne '/dog/p' animals.txt - prints ALL lines containing 'dog' - case-sensitive
sed -ne '/dog/Ip' animals.txt - prints ALL lines containing 'dog' - case-insensitive
sed -ne '/[0-9]/p' animals.txt - prints ALL lines with AT LEAST 1 numeric
sed -ne '/cat/,/deer/p' animals.txt - prints ALL lines from the first line containing 'cat' through the next line containing 'deer'
sed -ne '/deer/,+2p' animals.txt - prints the line with 'deer' plus 2 extra lines (GNU Sed)

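The address forms listed above can be compared side by side. A minimal sketch, assuming GNU Sed (needed for the ',+2' form) and a hypothetical six-line file:

```shell
# Hypothetical six-line file standing in for animals.txt.
tmp=$(mktemp -d)
printf '%s\n' 'cat' 'dog' 'deer' 'fox' 'owl' 'elk' > "$tmp/animals.txt"

two_to_four=$(sed -ne '2,4p' "$tmp/animals.txt")          # line-number range
last=$(sed -ne '$p' "$tmp/animals.txt")                   # last line
not_first_four=$(sed -ne '1,4!p' "$tmp/animals.txt")      # negated range
pattern_range=$(sed -ne '/dog/,/fox/p' "$tmp/animals.txt")  # from a match through a match
plus_two=$(sed -ne '/deer/,+2p' "$tmp/animals.txt")       # match plus the next 2 lines

printf '%s\n' "2,4: $(echo $two_to_four)" "last: $last" \
  "1,4!: $(echo $not_first_four)" "range: $(echo $pattern_range)" "+2: $(echo $plus_two)"
rm -rf "$tmp"
```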
###Delete Lines using Sed Addresses###
sed -e '/^$/d' animals.txt - deletes blank lines from file
Note: Drop '-n' to see the new output when deleting

sed -e '1d' animals.txt - deletes the first line from animals.txt
sed -e '1,4d' animals.txt - deletes lines 1-4 from animals.txt
sed -e '1~2d' animals.txt - deletes every 2nd line beginning with line 1 - 1, 3, 5... (GNU Sed)

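A quick check of the delete addresses, assuming GNU Sed for the 'first~step' form. The input lines are hypothetical:

```shell
# Blank-line removal on hypothetical input with two empty lines.
no_blank=$(printf '%s\n' 'cat' '' 'dog' '' 'deer' | sed -e '/^$/d')

# first~step: start at line 1 and delete every 2nd line after it (1, 3, 5).
evens_left=$(printf '%s\n' 'l1' 'l2' 'l3' 'l4' 'l5' | sed -e '1~2d')

# Plain numeric range: delete lines 1-2.
tail_left=$(printf '%s\n' 'l1' 'l2' 'l3' 'l4' 'l5' | sed -e '1,2d')

printf '%s\n' "no blanks: $(echo $no_blank)" "1~2d: $(echo $evens_left)" "1,2d: $(echo $tail_left)"
```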
###Save Sed's Changes using Output Redirection###

sed -e '/^$/d' animals.txt > animals2.txt - deletes blank lines from file and creates new output file 'animals2.txt'


###SEARCH & REPLACE USING Sed###
General Usage:
sed -e 's/find/replace/g' animals.txt - replaces 'find' with 'replace'
Note: Left Hand Side (LHS) supports literals and RegExes
Note: Right Hand Side (RHS) supports literals and back references

Examples:
sed -e 's/LinuxCBT/UnixCBT/' - replaces 'LinuxCBT' with 'UnixCBT' on STDIN to STDOUT
sed -e 's/LinuxCBT/UnixCBT/I' - replaces 'LinuxCBT' with 'UnixCBT' on STDIN to STDOUT (Case-Insensitive)

Note: Replacements occur on the FIRST match on each line, unless 'g' is appended to the s/find/replace/g sequence
sed -e 's/LinuxCBT/UnixCBT/Ig' - replaces ALL occurrences of 'LinuxCBT' with 'UnixCBT' on STDIN to STDOUT (Case-Insensitive)

Task:
1. Remove ALL blank lines
2. Substitute 'cat', regardless of case, with 'Tiger'

Note: With the '-n' option nothing is printed unless you specify the print flag 'p'; 's/find/replace/p' then prints ONLY the lines where a substitution occurred
sed -e '/^$/d' -e 's/cat/Tiger/Ig' animals.txt - removes blank lines & substitutes 'cat' with 'Tiger'
OR sed -e '/^$/d; s/cat/Tiger/Ig' animals.txt - does the same as above
Note: Simply separate multiple commands with semicolons

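The effect of the 'g' flag and of combining '-n' with 'p' can be checked on a few lines. This sketch assumes GNU Sed (for the 'I' flag) and hypothetical input:

```shell
# First match only versus every match on the line.
first=$(echo 'cat cat' | sed -e 's/cat/Tiger/')
all=$(echo 'cat Cat' | sed -e 's/cat/Tiger/Ig')

# The task above on hypothetical data: drop blank lines, then substitute.
data=$(printf '%s\n' 'cat' '' 'Cat and dog' 'deer')
full=$(printf '%s\n' "$data" | sed -e '/^$/d; s/cat/Tiger/Ig')

# With -n and the p flag, only lines that were actually changed are printed.
changed=$(printf '%s\n' "$data" | sed -ne '/^$/d; s/cat/Tiger/Igp')

printf '%s\n' "$first" "$all" "--- full" "$full" "--- changed only" "$changed"
```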
###Update Source File - Backup Source File###
sed -i.bak -e '/^$/d; s/Cat/Tiger/Ig' animals.txt - performs as above, but ALSO replaces the source file and backs up the original as 'animals.txt.bak'

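In-place editing with a backup can be tried safely in a scratch directory. A minimal sketch, assuming GNU Sed and a hypothetical three-line file:

```shell
# Work in a throwaway directory so nothing real is overwritten.
tmp=$(mktemp -d)
printf '%s\n' 'cat' '' 'Cat' > "$tmp/animals.txt"

# -i.bak rewrites animals.txt and keeps the original as animals.txt.bak
sed -i.bak -e '/^$/d; s/cat/Tiger/Ig' "$tmp/animals.txt"

updated=$(cat "$tmp/animals.txt")
backup=$(cat "$tmp/animals.txt.bak")

printf '%s\n' '--- updated' "$updated" '--- backup' "$backup"
rm -rf "$tmp"
```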

###Search & Replace (Text Substitution) Continued###
sed -e '/address/s/find/replace/g' file
sed -e '/Tiger/s/dog/mutt/g' animals.txt
sed -ne '/Tiger/s/dog/mutt/gp' animals.txt - substitutes 'dog' with 'mutt' where line contains 'Tiger'
sed -e '/Tiger/s/dog/mutt/gI' animals.txt
sed -e '/^Tiger/s/dog/mutt/gI' animals.txt - Updates lines that begin with 'Tiger'
sed -e '/^Tiger/Is/dog/mutt/gI' animals.txt - Updates lines that begin with 'Tiger' (Case-Insensitive)

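An address in front of 's' limits which lines the substitution touches. A short sketch, assuming GNU Sed (for the 'I' address modifier) and hypothetical input:

```shell
# Hypothetical input lines.
data=$(printf '%s\n' 'Tiger dog dog' 'Lion dog' 'tiger dog')

# Substitute only on lines containing 'Tiger' (case-sensitive address)
exact=$(printf '%s\n' "$data" | sed -e '/Tiger/s/dog/mutt/g')
# The I modifier on the address also selects the lowercase 'tiger' line
nocase=$(printf '%s\n' "$data" | sed -e '/^tiger/Is/dog/mutt/g')
# With -n and p, only the changed lines are shown
shown=$(printf '%s\n' "$data" | sed -ne '/Tiger/s/dog/mutt/gp')

printf '%s\n' '--- exact' "$exact" '--- nocase' "$nocase" '--- shown' "$shown"
```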
###Focus on the Right Hand Side (RHS) of Search & Replace Function in SED###
Note: SED reserves a few characters to help with substitutions based on the matched pattern from the LHS
& = The full value of the LHS (Pattern Matched) OR the values in the pattern space

Task:
Prefix each line with the word 'Animal '
sed -ne 's/.*/&/p' animals.txt - replaces the matched pattern with the matched pattern
sed -ne 's/.*/Animal &/p' animals.txt - Prefixes each line with 'Animal'
sed -ne 's/.*/Animal: &/p' animals.txt - Prefixes each line with 'Animal:'

sed -ne 's/.*[0-9]/&/p' animals.txt - returns lines containing at least 1 numeric
sed -ne 's/.*[0-9]\{1\}/&/p' animals.txt - same result; '\{1\}' requires exactly 1 repetition of '[0-9]', but '.*' still absorbs any preceding digits
sed -ne 's/[a-z][0-9]\{4\}$/&/pI' animals.txt - returns animal(s) with a letter followed by exactly 4 numeric values at the end of the line
sed -ne 's/[a-z][0-9]\{1,4\}$/&/pI' animals.txt - returns animal(s) with at least 1, up to 4 numeric values at the end of the name

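The '&' placeholder and the '\{n,m\}' interval can be tried on a few hypothetical names. The square brackets in the last command are only there to make the matched text visible:

```shell
# Hypothetical animal names, some with trailing digits.
data=$(printf '%s\n' 'dog' 'cat7' 'deer1234')

# & re-inserts whatever the LHS matched, here the whole line
prefixed=$(printf '%s\n' "$data" | sed -e 's/.*/Animal: &/')
# A letter followed by exactly four digits at the end of the line
four=$(printf '%s\n' "$data" | sed -ne 's/[a-z][0-9]\{4\}$/&/p')
# One to four trailing digits, wrapped in brackets to show what & holds
marked=$(printf '%s\n' "$data" | sed -ne 's/[0-9]\{1,4\}$/[&]/p')

printf '%s\n' "$prefixed" '---' "$four" '---' "$marked"
```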
###Grouping & Backreferences###
Note: Segment matches into backreferences using escaped parentheses: \(RegEx\)
sed -ne 's/\(.*\)\([0-9]\)/&/p' animals.txt - This creates 2 variables: \1 & \2
sed -ne 's/\(.*\)\([0-9]\)$/\1/p' animals.txt - This creates 2 variables: \1 & \2 but references \1
sed -ne 's/\(.*\)\([0-9]\)$/\2/p' animals.txt - This creates 2 variables: \1 & \2 but references \2
sed -ne 's/\(.*\)\([0-9]\)$/\1 \2/p' animals.txt - This creates 2 variables: \1 & \2 but references \1 and \2

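What each group captures is easier to see on concrete input. In this sketch the data is hypothetical; note that '.*' is greedy, so \1 takes everything up to the final digit and \2 holds only that last digit:

```shell
# Hypothetical names; 'deer' has no trailing digit and never matches.
data=$(printf '%s\n' 'dog12' 'cat7' 'deer')

g1=$(printf '%s\n' "$data" | sed -ne 's/\(.*\)\([0-9]\)$/\1/p')          # keep group 1
g2=$(printf '%s\n' "$data" | sed -ne 's/\(.*\)\([0-9]\)$/\2/p')          # keep group 2
swapped=$(printf '%s\n' "$data" | sed -ne 's/\(.*\)\([0-9]\)$/\2 \1/p')  # reorder the groups

printf '%s\n' "g1: $(echo $g1)" "g2: $(echo $g2)" '--- swapped' "$swapped"
```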

###Apply Changes to Multiple Files###
The shell expands globs (*, ?) into a list of file names before Sed runs, so one command can process multiple files
sed -ne 's/\(.*\)\([0-9]\)$/\1 \2/p' animals*.txt - applies the same substitution to every file matching 'animals*.txt'

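When several files are given, Sed treats them as one continuous stream by default. A small sketch with two hypothetical files; the '-s' (separate) option used in the last command is a GNU Sed feature:

```shell
# Two hypothetical files in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' 'dog 1' 'cat 1' > "$tmp/animals1.txt"
printf '%s\n' 'dog 2' 'cat 2' > "$tmp/animals2.txt"

# The shell expands animals*.txt; Sed sees both files as one stream
dogs=$(sed -ne '/dog/p' "$tmp"/animals*.txt)
# '$' therefore means the last line of the LAST file
last_all=$(sed -ne '$p' "$tmp"/animals*.txt)
# With -s each file is handled separately, so '$' matches once per file
last_each=$(sed -s -ne '$p' "$tmp"/animals*.txt)

printf '%s\n' "$dogs" '---' "$last_all" '---' "$last_each"
rm -rf "$tmp"
```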
###Sed Scripts###
Note: Sed supports scripting, which means, the ability to dump 1 or more instructions into 1 file

sed -f script_file_name text_file

sed -f animals.sed animals.txt

Task:
Perform multiple transformations on animals.txt
1. /^$/d - removes blank lines
2. s/dog/frog/Ig - substitutes globally 'dog' with 'frog' - (case-insensitive)
3. s/tiger/lion/Ig - substitutes globally 'tiger' with 'lion' - (case-insensitive)
4. s/.*/Animals: &/ - prefixes each line with 'Animals:'
5. s/animals/mammals/Ig - replaces 'Animals' with 'mammals' - (case-insensitive)
6. s/\([a-z]*\)\([0-9]*\)/\1/Ip - Strips trailing numeric values from alphas

Sed Scripting Rules:
1. Sed applies ALL rules to each line
2. Sed applies ALL changes dynamically to the pattern space
3. Sed ALWAYS works with the current line

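The scripting rules can be observed by putting the first five instructions from the task into a script file. This sketch assumes GNU Sed (for the 'I' flag) and hypothetical input; step 6 is left out to keep it short:

```shell
tmp=$(mktemp -d)

# Script file: one instruction per line, applied top to bottom to every input line.
cat > "$tmp/animals.sed" <<'EOF'
/^$/d
s/dog/frog/Ig
s/tiger/lion/Ig
s/.*/Animals: &/
s/animals/mammals/Ig
EOF

# Hypothetical input standing in for animals.txt.
printf '%s\n' 'Dog12' '' 'tiger' 'cat' > "$tmp/animals.txt"

result=$(sed -f "$tmp/animals.sed" "$tmp/animals.txt")
printf '%s\n' "$result"
rm -rf "$tmp"
```

Instruction 5 sees the 'Animals:' prefix added by instruction 4, which shows that every rule works on the pattern space as left by the rule before it.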

###Awk - Intro###
Features:
1. Reporter
2. Field Processor
3. Supports Scripting
4. Programming Constructs
5. Default delimiter is whitespace
6. Supports: Pipes, Files, and STDIN as sources of input
7. Automatically tokenizes processed columns/fields into the variables: $1, $2, $3 .. $n
8. Supports GREP and EGREP RegExes

Usage:
awk '{instructions}' file(s)
awk '/pattern/ { procedure }' file
awk -f script_file file(s)


Tasks:
Note: $0 represents the current record or row
1. Print the entire row, one at a time, from an input file (animals.txt)
a. awk '{ print $0 }' animals.txt

2. Print specific columns from (animals.txt)
a. awk '{ print $1 }' animals.txt - this prints the 1st column from the file

3. Print multiple columns from (animals.txt)
a. awk '{ print $1; print $2; }' animals.txt
b. awk '{ print $1,$2; }' animals.txt

4. Print columns from lines containing 'deer' using RegEx Support
a. awk '/deer/ { print $0 }' animals.txt

5. Print columns from lines containing digits
a. awk '/[0-9]/ { print $0 }' animals.txt

6. Remove blank lines with Sed and pipe output to awk for processing
a. sed -e '/^$/d' animals.txt | awk '/[0-9]/ { print $0 }'

7. Print blank lines
a. awk '/^$/ { print }' animals.txt
b. awk '/^$/ { print $0 }' animals.txt

8. Print ALL lines beginning with the animal 'dog' case-insensitive
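The tasks above can be tried against a small file. The data is hypothetical, and because no command is given for task 8, the 'tolower()' approach shown for it is an assumption about one portable way to do it, not the course's own answer:

```shell
# Hypothetical three-column file standing in for animals.txt.
tmp=$(mktemp -d)
printf '%s\n' 'dog 12 brown' '' 'deer 7 tan' 'Dog x black' > "$tmp/animals.txt"

# Tasks 3 and 6: strip blank lines with Sed, then print two columns with Awk
first_two=$(sed -e '/^$/d' "$tmp/animals.txt" | awk '{ print $1, $2 }')
# Task 4: whole rows matching a RegEx
deer=$(awk '/deer/ { print $0 }' "$tmp/animals.txt")
# Task 5: first column of rows that contain a digit
digits=$(awk '/[0-9]/ { print $1 }' "$tmp/animals.txt")
# Task 8 (assumed approach): lowercase the row before matching
dogs=$(awk 'tolower($0) ~ /^dog/ { print $0 }' "$tmp/animals.txt")

printf '%s\n' "$first_two" '---' "$deer" '---' "$digits" '---' "$dogs"
rm -rf "$tmp"
```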
b. Effect the change to ALL product files and create .new output files without clobbering the source file
for i in products_*php; do sed -e 's/<b>Shipping<\/b>:&nbsp;Free<br>//' "$i" > "$i.new"; done

2. Strip '.new' suffix from newly generated files
a. echo "products_linuxcbt.php.new" | sed -e 's/\.new$//'
b. for i in `ls -A products_*new | sed -e 's/\.new$//'`; do echo $i; done
c. for i in `ls -A products_*new | sed -e 's/\.new$//'`; do mv "$i.new" "$i"; done

3. Remove 'Free Shipping' from faq.php file
a. Code to remove: <li>Free Shipping
b. sed -e 's/<li>Free Shipping//' faq.php > faq.php.new

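The write-to-.new-then-rename workflow can be rehearsed in a scratch directory before touching real files. The file names and contents in this sketch are hypothetical:

```shell
# Two hypothetical product pages in a throwaway directory.
tmp=$(mktemp -d)
printf '%s\n' '<b>Shipping</b>:&nbsp;Free<br>Widget A' > "$tmp/products_a.php"
printf '%s\n' '<b>Shipping</b>:&nbsp;Free<br>Widget B' > "$tmp/products_b.php"

# Step 1: write the edited copy next to each source file (sources stay untouched)
for i in "$tmp"/products_*php; do
  sed -e 's/<b>Shipping<\/b>:&nbsp;Free<br>//' "$i" > "$i.new"
done

# Step 2: strip the .new suffix with Sed and move each copy over its source
for f in "$tmp"/products_*.new; do
  target=$(printf '%s\n' "$f" | sed -e 's/\.new$//')
  mv "$f" "$target"
done

content_a=$(cat "$tmp/products_a.php")
names=$(ls "$tmp")
printf '%s\n' "$content_a" '---' "$names"
rm -rf "$tmp"
```

Keeping the two steps separate leaves room to inspect the .new files before they replace the originals.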

Use Awk & Sed Together to update specific rows in /var/log/messages
Task:
a. Update Month information for kernel messages for September 3
awk '$1 ~ /Sep/ && $2 ~ /^3$/ && $5 ~ /kernel/ { print }' /var/log/messages
b. awk '$1 ~ /Sep/ && $2 ~ /^3$/ && $5 ~ /kernel/ { total++; print } END { print "Total Records Updated: " total }' /var/log/messages | sed -e 's/Sep/September/'

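The same pipeline can be tested without reading the real system log. The four log lines below are hypothetical and only mimic the usual syslog layout (month, day, time, host, program):

```shell
# Hypothetical syslog-style lines; only the first is a kernel message from Sep 3.
log=$(printf '%s\n' \
  'Sep  3 10:00:01 host1 kernel: eth0 up' \
  'Sep 13 10:00:02 host1 kernel: eth0 down' \
  'Sep  3 10:00:03 host1 sshd[99]: login' \
  'Oct  3 10:00:04 host1 kernel: usb')

# Awk selects and counts the rows; Sed rewrites the month on the way out.
report=$(printf '%s\n' "$log" |
  awk '$1 == "Sep" && $2 == 3 && $5 ~ /kernel/ { total++; print }
       END { print "Total Records Updated: " total }' |
  sed -e 's/^Sep/September/')

printf '%s\n' "$report"
```

Comparing '$2 == 3' numerically keeps days such as 13 or 30 out of the result, and running Sed without '-n' lets the summary line pass through unchanged.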
###Windows Support for GNU Sed & Awk###
Download GNU Sed & Awk from: http://gnuwin32.sourceforge.net

Windows Stuff:
gawk "BEGIN { max=ARGV[1]; for (i=1;i<=max; i++) print i }" 10 - reads 10 from ARGV[1] and passes it to 'max' var for use in the 'for' loop
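On a Unix shell the same loop is written with single quotes. This sketch uses plain 'awk' and adds '+ 0' to force a numeric comparison, which is a defensive assumption rather than something the original command requires:

```shell
# A BEGIN-only program reads no input files, so the trailing 3 is just an argument.
count=$(awk 'BEGIN { max = ARGV[1] + 0; for (i = 1; i <= max; i++) print i }' 3)
printf '%s\n' "$count"
```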
