Linux Shell: Awk - 白红宇

Published: 2019-03-11

Awk

Introduction

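Awk is a programming language created in 1977. Its name is formed from the first letters of its three authors' surnames: Alfred Aho, Peter Weinberger, and Brian Kernighan.

Awk is mainly used on Linux/Unix to scan and process text and data. The data can come from standard input, files, or pipes.

Awk has many implementations, including awk, nawk, gawk, MKS awk, and tawk, covering both open-source and commercial products. The awk implementations most commonly used on Linux today are mawk and gawk: RHEL systems ship gawk by default, and Ubuntu-family systems use mawk.

This article uses gawk, the RHEL default, which is the GNU project's open-source implementation of the awk interpreter.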

Workflow

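Awk scans the input line by line, from the first line to the last, finds the lines that match a given pattern, and performs the requested operations on those lines. An awk program is built from pattern matching and processing (the action):

    pattern { action }

Note: as awk reads each line, it checks whether that line matches the given pattern. If it matches, the action is executed; otherwise nothing is done. If no action is given, matching lines are written to standard output, because the default action is print. If no pattern is given, every line matches.

Awk has two special patterns, BEGIN and END. A BEGIN rule runs before any data is read, and an END rule runs after all data has been read.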
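The workflow above can be sketched with one small program that has a BEGIN rule, an ordinary pattern rule, and an END rule. The function name demo_workflow and the sample input are assumptions made up for this illustration; they do not come from the original article.

```shell
# demo_workflow is a hypothetical helper; the input lines are invented.
demo_workflow() {
  printf 'alpha\nbeta\nalpha beta\n' | awk '
    BEGIN   { print "start" }       # runs once, before any input is read
    /alpha/ { n++; print }          # runs for every line that matches /alpha/
    END     { print "matched " n }  # runs once, after the last line
  '
}

demo_workflow
```

The line "beta" does not match the pattern, so awk does nothing for it.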

Basic syntax

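Format: gawk [options] -f program-file [--] file ...

Options:

    -F fs, --field-separator fs
        Use fs as the input field separator (the default separator is a space or tab).
    -v var=val, --assign var=val
        Set the variable var to the value val before the program starts executing.
    -f program-file, --file program-file
        Read the awk program from a script file instead of from the command line.
    -W compat, -W traditional, --compat, --traditional
        Run awk in compatibility mode; GNU extensions are ignored.
    -W dump-variables[=file], --dump-variables[=file]
        Print the global variables (name, type, value) to a file. If no file name is given, the output goes to a file named awkvars.out.
    -W copyleft, -W copyright, --copyleft, --copyright
        Print a short version of the GNU copyright information.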
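As a minimal sketch of how -F and -v combine, the function below splits colon-separated records and passes a shell value into the awk program. The name over_threshold, the variable threshold, and the sample records are all assumptions for this example.

```shell
# over_threshold is a hypothetical helper; the records are invented.
over_threshold() {
  printf 'a:10\nb:600\nc:501\n' |
    # -F: splits each line on ":"; -v copies the shell argument into awk
    awk -F: -v threshold="$1" '$2 + 0 > threshold + 0 { print $1 }'
}

over_threshold 500
```

Adding 0 to both operands forces a numeric comparison, which avoids surprises if either value is ever treated as a string.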

Awk program structure: an awk program consists of a series of pattern { action } statements or function definitions.

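A pattern can be BEGIN, END, or an expression that restricts which records are processed. Two expressions separated by a comma form a range pattern, which matches from a record matching the first expression through a record matching the second.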

Actions must be enclosed in {}.
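As a minimal sketch of the pattern { action } structure, the function below joins two regular expressions with a comma, which awk treats as a range pattern: the action runs from the first matching line through the line that matches the second expression. The name print_range and the START/STOP markers are hypothetical.

```shell
# print_range is a hypothetical helper; START and STOP are invented markers.
print_range() {
  printf 'one\nSTART\ntwo\nSTOP\nthree\n' |
    awk '/START/,/STOP/ { print NR ": " $0 }'
}

print_range
```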

Examples

1. Use the regular expression /^$/ to match blank lines, with an action that prints "Haha, found a blank line":

    [root@localhost ~]# awk '/^$/ {print "Haha, found a blank line"}' quote.txt
    Haha, found a blank line
    Haha, found a blank line
    Haha, found a blank line
    Haha, found a blank line
    [root@localhost ~]#

2. Match the lines of /etc/sysconfig/network that contain anaconda and print them:

    [root@#localhost ~]# awk '/anaconda/' /etc/sysconfig/network
    # Created by anaconda
    [root@#localhost ~]#

3. Use a script file to check for blank lines:

    [root@#localhost ~]# cat awk1.sh
    /^$/  {print "Hahahaha"}
    [root@#localhost ~]#
    [root@#localhost ~]# awk -f awk1.sh quote.txt
    Hahahaha
    Hahahaha
    Hahahaha
    Hahahaha
    [root@#localhost ~]#
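The script-file approach can also be sketched end to end without relying on quote.txt. The function below writes a two-rule awk program to a temporary file and runs it with -f; count_blank_lines and the temporary script are assumptions for illustration, and mktemp is assumed to be available.

```shell
# count_blank_lines is a hypothetical helper built around awk -f.
count_blank_lines() {
  script=$(mktemp)
  # Rule 1 counts blank lines; the END rule prints the total (0 if none).
  printf '%s\n' '/^$/ { n++ }' 'END { print n + 0 }' > "$script"
  awk -f "$script" "$@"
  rm -f "$script"
}

printf 'a\n\nb\n\n' | count_blank_lines
```

With no file arguments, awk reads standard input, so the function works both in a pipeline and with file names.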

Awk operations

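1. Records and fields

Awk reads one record at a time from the input and stores the whole record in $0. The record is split into fields, which are stored in $1, $2, $3, ... $NF (the default separator is a space or tab). NF is a built-in variable that holds the number of fields in the record.

    # Print the first, second, and third fields
    [root@#localhost ~]# echo I love my girl friend | awk '{print $1,$2,$3}'
    I love my
    [root@#localhost ~]# echo I love my girl friend | awk '{print $1,$2,$3,$5,$4}'
    I love my friend girl
    [root@#localhost ~]#
    # Print the whole record, followed by its last field
    [root@#localhost ~]# echo hello world | awk '{print $0,$NF}'
    hello world world
    # The input ends with the separator "a", so the last field is empty
    [root@#localhost ~]# echo hahahhahaha | awk -F"a" '{print $NF}'

    [root@#localhost ~]# echo hahahhahaha | awk -F"h" '{print $NF}'
    a
    [root@#localhost ~]#

2. Fields and separators

By default awk splits input on spaces or tabs, but the separator can be changed with -F or by setting FS.

    [root@#localhost ~]# awk -F: '{print $1}' /etc/passwd
    [root@#localhost ~]# awk 'BEGIN {FS=":"} {print $1}' /etc/passwd
    # Specify several separators at once
    [root@#localhost ~]# echo 'hello the:world,!' | awk 'BEGIN {FS="[:,]"} {print $1,$2,$3,$4}'
    hello the world ! 
    [root@#localhost ~]#

3. Built-in variables

    Variable    Description
    ARGC        Number of command-line arguments
    FILENAME    Name of the current input file
    FNR         Record number within the current input file; useful with several input files
    NR          Record number within the whole input stream
    NF          Number of fields in the current record
    FS          Field separator
    OFS         Output field separator; a space by default
    ORS         Output record separator; a newline (\n) by default
    RS          Input record separator; a newline by default

    # Sample files
    [root@#localhost ~]# cat test1.txt
    This is a test file
    Welcome to Jacob's Class
    [root@#localhost ~]# cat test2.txt
    Hello the world
    Wow! I'm overwhelmed.
    Ask for more
    [root@#localhost ~]#
    # Print the record number within each file: the first file has two lines,
    # the second has three, so FNR restarts at 1 for the second file
    [root@#localhost ~]# awk '{print FNR}' test1.txt test2.txt
    # Awk treats the two files as a single input stream; NR numbers the records across both
    [root@#localhost ~]# awk '{print NR}' test1.txt test2.txt
    [root@#localhost ~]# awk '{print $1,$2,$3,$4}' test1.txt
    This is a test
    Welcome to Jacob's Class
    [root@#localhost ~]#
    # Set the output field separator to "-" with OFS; print then places "-" between the fields
    [root@#localhost ~]# awk 'BEGIN {OFS="-"} {print $1,$2,$3,$4}' test1.txt
    This-is-a-test
    Welcome-to-Jacob's-Class
    [root@#localhost ~]#
    # Sample file
    [root@#localhost ~]# cat test3.txt
    mail from:tomcat@gmail.com
    subjectLhello
    data:2018-11-12 17:00
    content:Hello,The world

    mail from: jerry@gamil.com
    subject:congregation
    data:2018-11-12 08:31
    content:Congregation to you.

    mail from: jacob@gmail.com
    subject:Test
    data:2018-11-12 10:20
    content:This is a test mail
    [root@#localhost ~]#
    # Use blank lines as the record separator, so everything up to the first
    # blank line is one record, and use newline as the field separator
    [root@#localhost ~]# awk 'BEGIN {FS="\n";RS=""} {print $2}' test3.txt
    subjectLhello
    subject:congregation
    subject:Test
    [root@#localhost ~]#

4. Expressions and operators

Expressions are built from variables, constants, functions, regular expressions, and operators. Awk variables hold strings or numbers. A variable that is used without being initialized has the value of the empty string or 0. String values must always be quoted.

    # Define variables
    a="hello"
    b=12

    # Operators
    +  -  *  /         addition, subtraction, multiplication, division
    %                  remainder
    ^                  exponentiation
    ++  --             increment, decrement
    +=  -=  *=  /=     assignment with an operation
    >  <  >=  <=       comparison
    ==                 equal to
    !=                 not equal to
    ~                  matches
    !~                 does not match
    &&                 logical AND
    ||                 logical OR

    # Examples
    # An assignment can serve as the pattern: x=2 is non-zero, so it is true
    [root@#localhost ~]# echo test | awk 'x=2 {print x+3}'
    [root@#localhost ~]# echo hello | awk 'x=1,y=3 {print x*2,y*3}'
    2 9
    [root@#localhost ~]#
    # Count all blank lines
    [root@#localhost ~]# awk '/^$/ {print x+=1}' test3.txt
    1
    2
    [root@#localhost ~]# awk '/^$/ {x+=1} END {print x}' test3.txt
    2
    [root@#localhost ~]#
    # Print the UID of the root user
    [root@#localhost ~]# awk -F: '$1~/root/ {print $3}' /etc/passwd
    0
    # Print the users whose UID is greater than 500
    [root@#localhost ~]# awk -F: '$3>500 {print $1}' /etc/passwd
    polkitd
    unbound
    colord
    saslauth
    libstoragemgmt
    nfsnobody
    chrony
    gnome-initial-setup
    hello
    [root@#localhost ~]#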
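The difference between NR and FNR is easiest to see side by side. The sketch below creates two throwaway files and prints FILENAME, NR, and FNR for every record; the function name show_record_numbers and the files one.txt and two.txt are assumptions, and mktemp -d is assumed to be available.

```shell
# show_record_numbers is a hypothetical helper using two temporary files.
show_record_numbers() {
  dir=$(mktemp -d)
  printf 'a\nb\n' > "$dir/one.txt"
  printf 'c\nd\ne\n' > "$dir/two.txt"
  # NR keeps counting across both files; FNR restarts at 1 for each file.
  (cd "$dir" && awk '{ print FILENAME, NR, FNR }' one.txt two.txt)
  rm -rf "$dir"
}

show_record_numbers
```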

Advanced Awk usage

Conditional tests with if

    # if syntax, form 1:
    if (expression)
        action1
    else
        action2

    # if syntax, form 2:
    if (expression) action1; else action2

If the expression is true, action1 is executed; otherwise action2 is executed.

    [root@server0 ~]# df | grep boot | awk '{print $4}'
    387300
    [root@server0 ~]# df | grep boot | awk '{if($4<200)print "Error!!";else print "OK"}'
    OK
    [root@server0 ~]# df | grep boot | awk '{if($4<2000000)print "Error!!";else print "OK"}'
    Error!!
    [root@server0 ~]#
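The same if/else check can be sketched without depending on the output of df on a particular machine. Here the free-space figure is passed in as an argument; check_free, its two parameters, and the variable min are hypothetical names for this example.

```shell
# check_free is a hypothetical helper.
# $1: free space in KB (sample value), $2: minimum acceptable free space in KB
check_free() {
  echo "$1" |
    awk -v min="$2" '{ if ($1 + 0 < min + 0) print "Error!!"; else print "OK" }'
}

check_free 387300 200
check_free 387300 2000000
```

The first call reports OK and the second reports Error!!, mirroring the two thresholds used in the transcript above.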

while loops

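while syntax, form 1:

    while (condition)
        action

    x=1
    while (x<10){
        print $x
        x++
    }

    [root@server0 ~]# awk 'BEGIN {i=1; while (i<=10){print i;++i}}'

while syntax, form 2:

    do
        action
    while (condition)

    # The body runs before the condition is tested, so this prints 1 through 11
    [root@server0 ~]# awk 'BEGIN {do {++x;print x}while (x<=10)}'
    # Print every field of each line, prefixed with its field number
    [root@server0 ~]#  awk -F: '{i=1;while(i<=NF){print i":"$i;i++}}' passwd.bak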
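The practical difference between the two loop forms is that do ... while always runs its body at least once. A minimal sketch, with count_while and count_do as hypothetical helper names and n as an assumed upper bound:

```shell
# count_while and count_do are hypothetical helpers; n is the upper bound.
count_while() {
  awk -v n="$1" 'BEGIN { i = 1; while (i <= n) { print i; i++ } }'
}

count_do() {
  awk -v n="$1" 'BEGIN { i = 1; do { print i; i++ } while (i <= n) }'
}

count_while 0   # the condition is false at once, so nothing is printed
count_do 0      # the body runs once before the test, so 1 is printed
```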

for loops

    for (initialization; condition; increment)
        action

    [root@server0 ~]# awk 'BEGIN {for(i=1;i<=5;i++)print i}'
    [root@server0 ~]# awk 'BEGIN {for(i=10;i>=1;i--)print i}'
    [root@server0 ~]# awk -F: '{for(i=1;i<=NF;i++){print i":"$i}}' passwd.bak
    [root@server0 ~]# awk -F: '{a[$7]++}END{for(i in a)if(i !=""){print i":"a[i]}}' passwd.bak
    /bin/sync:1
    /bin/bash:2
    /sbin/nologin:33
    /sbin/halt:1
    /bin/false:1
    /sbin/shutdown:1
    [root@server0 ~]#

In the last example, a[$7]++ uses $7 as the key of the array a and counts how many times each value occurs. Once all input has been read, for (i in a) loops over every key i in the array a and prints the key together with its count a[i].
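The array-counting idiom can be sketched on a few inline records so that it does not depend on passwd.bak. The function name count_shells and the three sample records are assumptions. The order in which for (key in array) visits keys is not specified, so the output is piped through sort to make it stable.

```shell
# count_shells is a hypothetical helper; the passwd-style records are invented.
count_shells() {
  printf '%s\n' \
    'root:x:0:0:root:/root:/bin/bash' \
    'bin:x:1:1:bin:/bin:/sbin/nologin' \
    'daemon:x:2:2:daemon:/sbin:/sbin/nologin' |
    awk -F: '
      { count[$7]++ }                 # field 7 (the login shell) is the array key
      END {
        for (shell in count)          # visit every key stored in the array
          print shell ":" count[shell]
      }' |
    sort
}

count_shells
```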

break and continue

break: leaves the loop immediately.
continue: skips the rest of the current iteration and moves on to the next one.

    # Print 1-4
    for (i=1;i<=10;i++){
        if (i==5)
            break
        print i
    }

    [root@server0 ~]# awk 'BEGIN {for(i=1;i<=10;i++){if(i==5)break;print i}}'

    # Print 1-4 and 6-10
    for (i=1;i<=10;i++){
        if (i==5)
            continue
        print i
    }

    [root@server0 ~]# awk 'BEGIN {for(i=1;i<=10;i++){if(i==5)continue;print i}}'
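To compare the two statements directly, the sketch below runs the same loop in either mode and collects the printed numbers on one line. The function loop_demo and its mode argument are hypothetical names invented for this example.

```shell
# loop_demo is a hypothetical helper; $1 must be "break" or "continue".
loop_demo() {
  awk -v mode="$1" 'BEGIN {
    for (i = 1; i <= 10; i++) {
      if (i == 5) {
        if (mode == "break") { break } else { continue }
      }
      # Append i to the output line, separated by single spaces.
      out = out (out == "" ? "" : " ") i
    }
    print out
  }'
}

loop_demo break
loop_demo continue
```

With break the loop stops at 5, so only 1 to 4 appear; with continue only 5 is skipped.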

Functions

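1. rand()

Returns a floating-point random number between 0 and 1. Seed the generator with srand() before calling rand(); otherwise rand() on its own produces the same numbers on every run.

    [root@server0 ~]# awk 'BEGIN{print rand();srand();print rand()}'

2. gsub(x,y,z)

In the string z, replaces every substring that matches the regular expression x with the string y. z defaults to $0.

3. sub(x,y,z)

In the string z, replaces the first substring that matches the regular expression x with the string y. z defaults to $0.

    [root@server0 ~]# awk -F: 'gsub(/root/,"hello",$0){print $0}' passwd.bak
    hello:x:0:0:hello:/hello:/bin/bash
    operator:x:11:0:operator:/hello:/sbin/nologin
    [root@server0 ~]#
    [root@server0 ~]# awk -F: 'sub(/root/,"hello",$0){print $0}' passwd.bak
    hello:x:0:0:root:/root:/bin/bash
    operator:x:11:0:operator:/hello:/sbin/nologin
    [root@server0 ~]#

sub corresponds to s/// in sed, and gsub corresponds to s///g.

4. length(z)

Returns the length of the string z. Without an argument it returns the length of $0.

    [root@server0 ~]# awk '{print length()}' test.txt

5. getline

Reads the next line of input. In the example below, when df wraps a long device name onto a line of its own, NF is 1, so getline reads the continuation line and the free space is taken from field 3.

    [root@server0 ~]# df -h | awk 'BEGIN {print "Disk FREE"}{if(NF==1){getline;print $3};if(NF==6)print $4}'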
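The contrast between sub and gsub can be sketched on a single passwd-style line. Both functions return the number of replacements they made, which the second helper prints in front of the result. The names replace_first and replace_all are assumptions for this example.

```shell
# replace_first and replace_all are hypothetical helpers.
replace_first() {
  # sub changes only the first match in $0
  echo "$1" | awk '{ sub(/root/, "hello"); print }'
}

replace_all() {
  # gsub changes every match in $0 and returns how many it replaced
  echo "$1" | awk '{ n = gsub(/root/, "hello"); print n ": " $0 }'
}

replace_first 'root:x:0:0:root:/root:/bin/bash'
replace_all 'root:x:0:0:root:/root:/bin/bash'
```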

Reposted from: http://ipztz.baihongyu.com/
