Print unique lines, comparing no more than N characters

Time: 2013-05-01 20:56:38

Tags: bash awk

With uniq, you can choose to compare only the first N characters of each line:

$ cat foo.txt
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy mouse.

$ uniq -w 40 foo.txt
The quick brown fox jumps over the lazy dog.

Can the same effect be achieved with awk? I have read this example:

awk '!a[$0]++'

But it compares whole lines.
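As far as I understand it, that one-liner is shorthand for roughly the following. This is only a sketch: the function name `dedupe_whole_lines` and the array name `seen` are my own placeholders, not anything taken from the linked example.

```shell
# Long form of awk '!a[$0]++': print a line only the first time it appears.
dedupe_whole_lines() {
    awk '{
        if (seen[$0] == 0) print   # count is still zero: first occurrence, so print it
        seen[$0]++                 # count the line so later copies are skipped
    }'
}

printf '%s\n' apple banana apple cherry banana | dedupe_whole_lines
```

The array key is the entire line, which is why lines that differ only near the end are all kept.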

2 answers:

Answer 0: (score: 11)

awk has the substr() function:

awk '!a[substr($0,1,40)]++'

With your example:

kent$  echo "The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy cat.
The quick brown fox jumps over the lazy mouse."|awk '!a[substr($0,1,40)]++'
The quick brown fox jumps over the lazy dog.
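If the width should not be hard-coded, it can be passed in with -v. A minimal sketch, where the function name `first_by_prefix` and the variable name `n` are illustrative choices and not part of the original command:

```shell
# Sketch: keep the first line seen for each distinct N-character prefix.
# "n" is an illustrative awk variable, set from the first shell argument via -v.
first_by_prefix() {
    awk -v n="$1" '!seen[substr($0, 1, n)]++'
}

printf '%s\n' 'aaa 1' 'bbb 2' 'aaa 3' | first_by_prefix 3
```

Note one difference from uniq: uniq only compares adjacent lines, so on this sample input `uniq -w 3` would keep all three lines, while the awk version drops the later `aaa 3` wherever it appears in the input.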

Answer 1: (score: 0)

Two alternatives, using FIELDWIDTHS or FPAT (both are GNU awk extensions):

awk '!a[$1]++' FIELDWIDTHS=40

awk '!a[$1]++' FPAT='.{40}'
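The two variants are not quite equivalent for lines shorter than the width. The sketch below assumes GNU awk is installed as `gawk`; the wrapper names `by_fieldwidths` and `by_fpat` are made up for illustration, and a width of 3 stands in for 40 to keep the sample short.

```shell
# Sketch only: requires GNU awk (gawk), since FIELDWIDTHS and FPAT are gawk extensions.
# Inside the single quotes $1 is awk's first field; "$1" outside is the shell argument (the width).
by_fieldwidths() { gawk '!a[$1]++' FIELDWIDTHS="$1"; }
by_fpat()        { gawk '!a[$1]++' FPAT=".{$1}"; }

# Example invocations:
#   printf '%s\n' abc1 abc2 x y | by_fieldwidths 3
#   printf '%s\n' abc1 abc2 x y | by_fpat 3
```

With FIELDWIDTHS, a short line simply becomes a shorter first field, so `x` and `y` stay distinct. With FPAT, a line shorter than the width does not match the pattern at all, the first field is empty, and every short line after the first is treated as a duplicate. The substr() approach behaves like the FIELDWIDTHS one in this respect.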