wsx@wsx-laptop:~/tmp$ cat data1.txt
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
Enter the command and look at the output:
wsx@wsx-laptop:~/tmp$ sed 's/dog/cat/' data1.txt
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
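As the output shows, the matching string has been replaced on every line.
Keep in mind that sed does not modify the data in the text file; it only sends the modified data to STDOUT.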
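The point about STDOUT can be checked with a small sketch. The file names `sample.txt` and `edited.txt` are made up for illustration, and the files live in a throwaway directory:

```shell
# Work in a temporary directory so nothing outside it is touched.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT

printf 'The quick brown fox jumps over the lazy dog.\n' > "$tmpdir/sample.txt"

# sed writes the edited text to STDOUT; redirect it to keep the result.
sed 's/dog/cat/' "$tmpdir/sample.txt" > "$tmpdir/edited.txt"

original=$(cat "$tmpdir/sample.txt")   # the input file is unchanged
edited=$(cat "$tmpdir/edited.txt")     # the redirected output has the edit
echo "original: $original"
echo "edited:   $edited"
```

To keep the edited text, redirect the output to a new file as shown here rather than expecting the input file to change.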
Using multiple editor commands on the command line
Use the -e option to run more than one command:
wsx@wsx-laptop:~/tmp$ sed -e 's/brown/green/; s/dog/cat/' data1.txt
The quick green fox jumps over the lazy cat.
The quick green fox jumps over the lazy cat.
The quick green fox jumps over the lazy cat.
The quick green fox jumps over the lazy cat.
The quick green fox jumps over the lazy cat.
The quick green fox jumps over the lazy cat.
The quick green fox jumps over the lazy cat.
The quick green fox jumps over the lazy cat.
The quick green fox jumps over the lazy cat.
Both commands are applied to every line of data in the file. Commands must be separated by semicolons, and there should be no space between the end of a command and the semicolon.
If you would rather not use semicolons, you can use the bash shell's secondary prompt to separate the commands.
wsx@wsx-laptop:~/tmp$ sed -e '
> s/brown/green/
> s/fox/elephant/
> s/dog/cat/' data1.txt
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
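A third way to pass several commands, not shown above, is to repeat the -e option once per command. A minimal sketch, using a single sample line instead of data1.txt:

```shell
line='The quick brown fox jumps over the lazy dog.'

# One -e per command; the expressions are applied in order to every line.
result=$(printf '%s\n' "$line" | sed -e 's/brown/green/' -e 's/dog/cat/')
echo "$result"
```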
wsx@wsx-laptop:~/tmp$ sed -f script1.sed data1.txt
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
wsx@wsx-laptop:~/tmp$ cat data2.txt
One line of test text.
Two lines of test text.
Three lines of test text.
wsx@wsx-laptop:~/tmp$ gawk '{print $1}' data2.txt
One
Two
Three
We can use the -F option to specify a different field separator:
wsx@wsx-laptop:~/tmp$ gawk -F: '{print $1}' /etc/passwd
root
daemon
bin
sys
sync
games
man
lp
mail
news
uucp
proxy
www-data
backup
...
This short program displays the first data field of the system's password file.
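Because /etc/passwd differs from system to system, the same idea can be tried on fixed input. The two records below are hypothetical, laid out like passwd entries, and plain `awk` is used on the assumption that gawk may not be installed; gawk accepts the same program.

```shell
# Hypothetical sample records in the colon-separated layout of /etc/passwd.
records='alice:x:1000:1000:Alice:/home/alice:/bin/bash
bob:x:1001:1001:Bob:/home/bob:/bin/sh'

# -F: sets the field separator to a colon, so $1 is the login name.
names=$(printf '%s\n' "$records" | awk -F: '{print $1}')
echo "$names"
```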
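Using multiple commands in the program script
Just put a semicolon between the commands.
wsx@wsx-laptop:~/tmp$ echo "My name is Shixiang" | gawk '{$4="Christine"; print $0}'
My name is Christine
You can also use the secondary prompt to enter the program script commands one line at a time (as with sed).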
Reading the program from a file
wsx@wsx-laptop:~/tmp$ cat script2.gawk
{print $1 " 's home directory is " $6}
wsx@wsx-laptop:~/tmp$ gawk -F: -f script2.gawk /etc/passwd
root 's home directory is /root
daemon 's home directory is /usr/sbin
bin 's home directory is /bin
sys 's home directory is /dev
sync 's home directory is /bin
games 's home directory is /usr/games
man 's home directory is /var/cache/man
lp 's home directory is /var/spool/lpd
mail 's home directory is /var/mail
news 's home directory is /var/spool/news
uucp 's home directory is /var/spool/uucp
proxy 's home directory is /bin
...
You can specify multiple commands in the program file:
wsx@wsx-laptop:~/tmp$ cat script3.gawk
{
text = "'s home directory is "
print $1 text $6
}
wsx@wsx-laptop:~/tmp$ gawk -F: -f script3.gawk /etc/passwd
root's home directory is /root
daemon's home directory is /usr/sbin
bin's home directory is /bin
sys's home directory is /dev
sync's home directory is /bin
games's home directory is /usr/games
man's home directory is /var/cache/man
lp's home directory is /var/spool/lpd
mail's home directory is /var/mail
news's home directory is /var/spool/news
...
Running a script before processing data
The BEGIN keyword forces gawk to run the program script that follows it before reading any data.
wsx@wsx-laptop:~/tmp$ cat data3.txt
Line 1
Line 2
Line 3
wsx@wsx-laptop:~/tmp$ gawk 'BEGIN {print "The data3 File Contents:"}
> {print$0}' data3.txt
The data3 File Contents:
Line 1
Line 2
Line 3
After gawk runs the BEGIN script, it uses the second script to process the file data.
Running a script after processing data
Like BEGIN, the END keyword lets us specify a script that gawk runs after it has read all the data.
wsx@wsx-laptop:~/tmp$ gawk 'BEGIN {print "The data3 File Contents:"}
> {print$0}
> END {print"End of File"}' data3.txt
The data3 File Contents:
Line 1
Line 2
Line 3
End of File
wsx@wsx-laptop:~/tmp$ cat script4.gawk
BEGIN {
print "The latest list of users and shells"
print " UserID \t Shell"
print "-------- \t ------"
FS=":"
}

{
print $1 " \t " $7
}

END {
print "This concludes the listing"
}
wsx@wsx-laptop:~/tmp$ gawk -f script4.gawk /etc/passwd
The latest list of users and shells
UserID Shell
-------- ------
root /bin/bash
daemon /usr/sbin/nologin
bin /usr/sbin/nologin
sys /usr/sbin/nologin
sync /bin/sync
games /usr/sbin/nologin
man /usr/sbin/nologin
lp /usr/sbin/nologin
mail /usr/sbin/nologin
news /usr/sbin/nologin
uucp /usr/sbin/nologin
proxy /usr/sbin/nologin
www-data /usr/sbin/nologin
backup /usr/sbin/nologin
list /usr/sbin/nologin
irc /usr/sbin/nologin
gnats /usr/sbin/nologin
nobody /usr/sbin/nologin
systemd-timesync /bin/false
systemd-network /bin/false
systemd-resolve /bin/false
systemd-bus-proxy /bin/false
syslog /bin/false
_apt /bin/false
lxd /bin/false
messagebus /bin/false
uuidd /bin/false
dnsmasq /bin/false
sshd /usr/sbin/nologin
pollinate /bin/false
wsx /bin/bash
This concludes the listing
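The three phases (BEGIN, per-line, END) can be seen together in a short sketch on fixed input. Two assumptions: plain `awk` stands in for gawk so the example runs where gawk is not installed, and `NR` is awk's built-in record counter, which the text above does not cover.

```shell
summary=$(printf 'Line 1\nLine 2\nLine 3\n' | awk '
BEGIN { print "Start of data" }   # runs once, before any input is read
      { print "> " $0 }           # runs for every input line
END   { print NR " lines read" }  # runs once, after the last line
')
echo "$summary"
```

The END block is a natural place for totals, since every line has been seen by the time it runs.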
wsx@wsx-laptop:~/tmp$ cat data4.txt
This is a test of the test script.
This is the second test of the test script.
wsx@wsx-laptop:~/tmp$ sed 's/test/trial/2' data4.txt
This is a test of the trial script.
This is the second test of the trial script.
This command replaces only the second occurrence of the pattern on each line. The g flag, by contrast, replaces every occurrence.
wsx@wsx-laptop:~/tmp$ sed 's/test/trial/g' data4.txt
This is a trial of the trial script.
This is the second trial of the trial script.
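A side-by-side sketch of the three cases (no flag, a number, and g) on one made-up line with three matches:

```shell
line='test one, test two, test three'

first=$(printf '%s\n' "$line" | sed 's/test/trial/')    # no flag: first match only
second=$(printf '%s\n' "$line" | sed 's/test/trial/2')  # number: only the 2nd match
all=$(printf '%s\n' "$line" | sed 's/test/trial/g')     # g: every match
printf '%s\n' "$first" "$second" "$all"
```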
The p substitution flag prints the lines on which the substitute command's pattern matched. It is usually used together with sed's -n option.
wsx@wsx-laptop:~/tmp$ cat data5.txt
This is a test line.
This is a different line.
wsx@wsx-laptop:~/tmp$ sed -n 's/test/trial/p' data5.txt
This is a trial line.
wsx@wsx-laptop:~/tmp$ sed 's/test/trial/w test.txt' data5.txt
This is a trial line.
This is a different line.
wsx@wsx-laptop:~/tmp$ cat test.txt
This is a trial line.
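The p and w flags can also be combined in one substitute command, as long as w and its file name come last. A sketch under that assumption, with hypothetical file names in a temporary directory:

```shell
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
printf 'This is a test line.\nThis is a different line.\n' > "$tmpdir/data.txt"

# -n suppresses the normal output; p prints only the lines that were changed,
# and w writes those same lines to the named file.
printed=$(sed -n "s/test/trial/pw $tmpdir/changed.txt" "$tmpdir/data.txt")
saved=$(cat "$tmpdir/changed.txt")
echo "printed: $printed"
echo "saved:   $saved"
```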
wsx@wsx-laptop:~/tmp$ cat data1.txt
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
wsx@wsx-laptop:~/tmp$ sed '2s/dog/cat/' data1.txt
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
wsx@wsx-laptop:~/tmp$ sed '2,3s/dog/cat/' data1.txt
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
wsx@wsx-laptop:~/tmp$ sed '2,$s/dog/cat/' data1.txt # the dollar sign refers to the last line
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy cat.
wsx@wsx-laptop:~/tmp$ sed '2{
> s/fox/elephant/
> s/dog/cat/
> }' data1.txt
The quick brown fox jumps over the lazy dog.
The quick brown elephant jumps over the lazy cat.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
You can also put an address range in front of a group of commands.
wsx@wsx-laptop:~/tmp$ sed '3,${
s/brown/green/
s/lazy/active/
}' data1.txt
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick green fox jumps over the active dog.
The quick green fox jumps over the active dog.
The quick green fox jumps over the active dog.
The quick green fox jumps over the active dog.
The quick green fox jumps over the active dog.
The quick green fox jumps over the active dog.
The quick green fox jumps over the active dog.
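The addressing forms (a single line, a range ending in $, and a range applied to a command group) are easier to compare on short, numbered input. A sketch with made-up four-line data:

```shell
input=$(printf 'dog 1\ndog 2\ndog 3\ndog 4\n')

# Single line address: only line 2 is edited.
only_second=$(printf '%s\n' "$input" | sed '2s/dog/cat/')

# Range address: line 2 through the last line ($).
two_to_end=$(printf '%s\n' "$input" | sed '2,$s/dog/cat/')

# Range address applied to a group of commands.
grouped=$(printf '%s\n' "$input" | sed '3,${
s/dog/cat/
s/$/ (changed)/
}')

printf '%s\n---\n%s\n---\n%s\n' "$only_second" "$two_to_end" "$grouped"
```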
wsx@wsx-laptop:~/tmp$ cat data1.txt
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
wsx@wsx-laptop:~/tmp$ sed 'd' data1.txt
The delete command is most useful when combined with an address.
wsx@wsx-laptop:~/tmp$ cat data6.txt
This is line number 1.
This is line number 2.
This is line number 3.
This is line number 4.
wsx@wsx-laptop:~/tmp$ sed '3d' data6.txt
This is line number 1.
This is line number 2.
This is line number 4.
By a specific range of lines:
wsx@wsx-laptop:~/tmp$ sed '2,3d' data6.txt
This is line number 1.
This is line number 4.
By the special end-of-file character:
wsx@wsx-laptop:~/tmp$ sed '2,$d' data6.txt
This is line number 1.
You can also use pattern matching:
wsx@wsx-laptop:~/tmp$ sed '/number 1/d' data6.txt
This is line number 2.
This is line number 3.
This is line number 4.
wsx@wsx-laptop:~/tmp$ cat data7.txt
This is line number 1.
This is line number 2.
This is line number 3.
This is line number 4.
This is line number 1 again.
This is text you want to keep.
This is the last line in the file.
wsx@wsx-laptop:~/tmp$ sed '/1/,/3/d' data7.txt
This is line number 4.
wsx@wsx-laptop:~/tmp$ sed '/1/,/5/d' data7.txt
wsx@wsx-laptop:~/tmp$ sed '/2/,/4/d' data7.txt
This is line number 1.
This is line number 1 again.
This is text you want to keep.
This is the last line in the file.
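The transcripts above show the pitfall of pattern ranges: deletion switches on again whenever the start pattern reappears, and if the stop pattern never follows, everything to the end of the input is deleted. A compact sketch with made-up markers `start` and `stop`:

```shell
input=$(printf 'start A\nmiddle\nstop A\nkept\nstart B\nlost\n')

# The first range (start A .. stop A) is deleted as intended.
# "start B" opens a second range that never finds "stop",
# so deletion runs to the end of the input.
result=$(printf '%s\n' "$input" | sed '/start/,/stop/d')
echo "$result"
```

Only the line between the two ranges survives, so it is worth checking that the start pattern cannot match later in the data.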
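Inserting and appending text
sed lets you insert and append lines of text in the data stream:
The insert command (i) adds a new line before the specified line
The append command (a) adds a new line after the specified line
Note that in the traditional syntax these commands cannot be written on a single line: the text to insert or append goes on its own line after the command. GNU sed also accepts the one-line form used here.
wsx@wsx-laptop:~/tmp$ echo "Test Line 2" | sed 'i\Test Line 1'
Test Line 1
Test Line 2
wsx@wsx-laptop:~/tmp$ echo "Test Line 2" | sed 'a\Test Line 1'
Test Line 2
Test Line 1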
To insert or append data inside the data stream, you must use addressing to tell sed where the data should appear.
wsx@wsx-laptop:~/tmp$ sed '3i\ This is an inserted line.' data6.txt
This is line number 1.
This is line number 2.
This is an inserted line.
This is line number 3.
This is line number 4.
wsx@wsx-laptop:~/tmp$ sed '3a\ This is an inserted line.' data6.txt
This is line number 1.
This is line number 2.
This is line number 3.
This is an inserted line.
This is line number 4.
To add new lines at the end of the data stream, use $ as the address.
This is line number 1.
This is line number 2.
This is line number 3.
This is line number 4.
This is a new line.
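The command that produced the output above is not shown, so the following is only a sketch of how appending at the end might look. It uses the traditional multi-line form of the append command, where each added line except the last ends with a backslash; the input lines are made up.

```shell
# $ addresses the last line; a\ appends the text that follows it.
# A trailing backslash continues the appended text onto another line.
result=$(printf 'This is line number 1.\nThis is line number 2.\n' | sed '$a\
This is a new line.\
This is another new line.')
echo "$result"
```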
Changing lines
The change command (c) replaces the entire contents of a line in the data stream. It works the same way as the insert and append commands.
wsx@wsx-laptop:~/tmp$ sed '3c\This is a changed line.' data6.txt
This is line number 1.
This is line number 2.
This is a changed line.
This is line number 4.
wsx@wsx-laptop:~/tmp$ sed '/number 3/c\This is a changed line.' data6.txt
This is line number 1.
This is line number 2.
This is a changed line.
This is line number 4.
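Insert, append, and change can be compared in one script. This sketch uses the traditional form, with the text on the line after the command, and made-up three-line input:

```shell
input=$(printf 'line 1\nline 2\nline 3\n')

result=$(printf '%s\n' "$input" | sed '2i\
inserted before line 2
2a\
appended after line 2
3c\
line 3 was changed')
echo "$result"
```

Each text block ends at the first line that does not finish with a backslash, so the next line is read as a new command.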
wsx@wsx-laptop:~/tmp$ sed 'y/123/789/' data6.txt
This is line number 7.
This is line number 8.
This is line number 9.
This is line number 4.
The transform command is global: it converts every occurrence of the specified characters found in a line, regardless of where they appear.
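A one-line sketch of that global behaviour, on a made-up line in which the character 1 appears twice:

```shell
# y maps 1->4, 2->5, 3->6 for every occurrence on the line; no g flag is needed.
result=$(printf 'This 1 is a test of 1 try.\n' | sed 'y/123/456/')
echo "$result"
```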
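Revisiting the print commands
Three more commands can be used to print information from the data stream:
The p command prints a line of text
The equal sign (=) command prints the line number
The l command lists a line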
Printing lines
wsx@wsx-laptop:~/tmp$ echo "this is a test" | sed 'p'
this is a test
this is a test
The p command prints the existing data text. Its most common use is printing the lines that match a text pattern.
wsx@wsx-laptop:~/tmp$ cat data6.txt
This is line number 1.
This is line number 2.
This is line number 3.
This is line number 4.
wsx@wsx-laptop:~/tmp$ sed -n '/number 3/p' data6.txt
This is line number 3.
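With the -n option on the command line, all other output is suppressed and only the lines containing the matching text pattern are printed.
It is also a quick way to print specific lines of the data stream:
wsx@wsx-laptop:~/tmp$ sed -n '2,3p' data6.txt
This is line number 2.
This is line number 3.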
wsx@wsx-laptop:~/tmp$ cat data1.txt
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
wsx@wsx-laptop:~/tmp$ sed '=' data1.txt
1
The quick brown fox jumps over the lazy dog.
2
The quick brown fox jumps over the lazy dog.
3
The quick brown fox jumps over the lazy dog.
4
The quick brown fox jumps over the lazy dog.
5
The quick brown fox jumps over the lazy dog.
6
The quick brown fox jumps over the lazy dog.
7
The quick brown fox jumps over the lazy dog.
8
The quick brown fox jumps over the lazy dog.
9
The quick brown fox jumps over the lazy dog.
This is handy when you are searching for a specific text pattern:
wsx@wsx-laptop:~/tmp$ sed -n '/number 4/{
> =
> p
> }' data6.txt
4
This is line number 4.
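If only the line numbers of the matches are wanted, the = command can be used on its own with -n. A sketch on made-up input:

```shell
# -n suppresses the lines themselves; = prints the number of each matching line.
numbers=$(printf 'alpha\nbeta\nalpha beta\n' | sed -n '/beta/=')
echo "$numbers"
```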
Listing lines
wsx@wsx-laptop:~/tmp$ cat data9.txt
This line contains tabs.
wsx@wsx-laptop:~/tmp$ sed -n 'l' data9.txt
This\tline\tcontains\ttabs.$
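The same listing can be reproduced without a data file by generating the tab characters with printf. A minimal sketch:

```shell
# printf turns each \t into a real tab; sed's l command then shows the tabs
# as \t and marks the end of the line with $.
listed=$(printf 'This\tline\tcontains\ttabs.\n' | sed -n 'l')
echo "$listed"
```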
Using sed with files
Writing to a file
The w command writes lines to a file. Its format is:
[address]w filename
Write the first two lines of the data to another file:
wsx@wsx-laptop:~/tmp$ sed '1,2w test.txt' data6.txt
This is line number 1.
This is line number 2.
This is line number 3.
This is line number 4.
wsx@wsx-laptop:~/tmp$ cat test.txt
This is line number 1.
This is line number 2.
If you do not want the lines displayed on STDOUT (by default sed prints every line of the data stream), use sed's -n option.
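A sketch of the w command combined with -n, using hypothetical file names in a temporary directory: nothing reaches STDOUT, but the addressed lines are still written to the file.

```shell
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
printf 'one\ntwo\nthree\n' > "$tmpdir/data.txt"

# -n silences STDOUT; lines 1-2 are still written to first_two.txt.
shown=$(sed -n "1,2w $tmpdir/first_two.txt" "$tmpdir/data.txt")
saved=$(cat "$tmpdir/first_two.txt")
echo "shown on STDOUT: '$shown'"
echo "saved to file:"
echo "$saved"
```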
Reading data
The read command is r; it inserts the contents of a separate file into the data stream after the addressed line.
wsx@wsx-laptop:~/tmp$ cat data12.txt
This is an added line.
This is the second added line.
wsx@wsx-laptop:~/tmp$ sed '3r data12.txt' data6.txt
This is line number 1.
This is line number 2.
This is line number 3.
This is an added line.
This is the second added line.
This is line number 4.
The effect is somewhat like the insert command i and the append command a.
It works with text pattern addresses as well:
wsx@wsx-laptop:~/tmp$ sed '/number 2/r data12.txt' data6.txt
This is line number 1.
This is line number 2.
This is an added line.
This is the second added line.
This is line number 3.
This is line number 4.
Adding at the end of the text:
wsx@wsx-laptop:~/tmp$ sed '$r data12.txt' data6.txt
This is line number 1.
This is line number 2.
This is line number 3.
This is line number 4.
This is an added line.
This is the second added line.
wsx@wsx-laptop:~/tmp$ cat notice.std
Would the following people:
LIST
please report to the ship's captain.
The form letter uses the generic placeholder LIST where the list of people goes. We first insert the text at the placeholder, then delete the placeholder.
wsx@wsx-laptop:~/tmp$ sed '/LIST/{
> r data10.txt
> d
> }' notice.std
Would the following people:
This line contains an escape character.
please report to the ship's captain.
wsx@wsx-laptop:~/tmp$ cat data10.txt
This line contains an escape character.
wsx@wsx-laptop:~/tmp$ cat data11.txt
wangshx
zhdan
wsx@wsx-laptop:~/tmp$ sed '/LIST/{
r data11.txt
d
}' notice.std
Would the following people:
wangshx
zhdan
please report to the ship's captain.
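The placeholder technique can be wrapped into a self-contained sketch. The files are recreated in a temporary directory, and `names.txt` is a hypothetical stand-in for the list file:

```shell
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT

printf "Would the following people:\nLIST\nplease report to the ship's captain.\n" > "$tmpdir/notice.std"
printf 'wangshx\nzhdan\n' > "$tmpdir/names.txt"

# On the LIST line: r queues the file contents for output, then d drops the placeholder.
# The file name must end the r line, so the commands sit on separate lines.
letter=$(sed "/LIST/{
r $tmpdir/names.txt
d
}" "$tmpdir/notice.std")
echo "$letter"
```

Double quotes are used around the sed script here only so that the shell expands `$tmpdir`; with a fixed file name, single quotes work as in the transcripts above.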