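I have a list in which each line of data is paired with an IP address. I only want to see each IP address once, and I don't want to change the order.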
192.168.0.100 fred is happy
192.168.0.100 fred likes pie
192.168.0.100 pie is good
192.168.0.110 tom like cake
192.168.0.110 cake is good
192.168.0.110 pie is better
192.168.0.112 bill like lettuce
192.168.0.112 lettuce is good for you
192.168.0.112 cake and pie are better tasting than lettuce
All I want to do is remove the repeated IP addresses while keeping everything else exactly the same.
I'd like it to look like this:
192.168.0.100 fred is happy
              fred likes pie
              pie is good
192.168.0.110 tom like cake
              cake is good
              pie is better
192.168.0.112 bill like lettuce
              lettuce is good for you
              cake and pie are better tasting than lettuce
I don't want to touch any repeated words, and I can't change the order.
Thanks for any help you can offer.
Answer 0 (score: 2)
This will work no matter what kind of spacing and/or RE metacharacters are in your file:
$ awk '
{ key = $1 }
key == prev { sub(/[^[:space:]]+/,sprintf("%*s",length(key),"")) }
{ prev = key; print }
' file
192.168.0.100 fred is happy
              fred likes pie
              pie is good
192.168.0.110 tom like cake
              cake is good
              pie is better
192.168.0.112 bill like lettuce
              lettuce is good for you
              cake and pie are better tasting than lettuce
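Beware of solutions that use $1 in an RE context: the "." in an IP address is an RE metacharacter meaning "any character", so such solutions may work on some sample data but can produce false matches given other input.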
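To make that warning concrete, here is a small sketch comparing a regex match against a literal string comparison. The variable names and the sample line `192x168y0z100 ...` are made up for illustration and are not from the question's data.

```shell
# Hypothetical demo data: the second string is NOT the same address,
# but every "." in the key can match the letters x, y and z.
key='192.168.0.100'
line='192x168y0z100 not the same address'

# Key used as a dynamic regex: "." matches any character, so this is a false positive.
regex_match=$(printf '%s\n' "$line" |
  awk -v k="$key" '{ print (($1 ~ ("^" k "$")) ? "match" : "no match") }')

# Key compared as a plain string: only an identical string counts.
literal_match=$(printf '%s\n' "$line" |
  awk -v k="$key" '{ print (($1 == k) ? "match" : "no match") }')

printf 'regex:   %s\n' "$regex_match"
printf 'literal: %s\n' "$literal_match"
```

The regex test reports "match" while the string comparison reports "no match", which is why comparing keys with `==` is the safer choice here.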
Answer 1 (score: 1)
I'm guessing the delimiter between the IP and the text is a tab; if so, this one-liner should work for you:
awk -F'\t' -v OFS='\t' 'a[$1]{gsub(/./," ",$1);print;next}{a[$1]=1}7' file
Test with your file:
kent$ awk -F'\t' -v OFS='\t' 'a[$1]{gsub(/./," ",$1);print;next}{a[$1]=1}7' f
192.168.0.100 fred is happy
              fred likes pie
              pie is good
192.168.0.110 tom like cake
              cake is good
              pie is better
192.168.0.112 bill like lettuce
              lettuce is good for you
              cake and pie are better tasting than lettuce
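Since the one-liner is terse, here is the same logic spread out with comments, as a sketch only. The wrapper function name `blank_repeated_keys` is hypothetical, and the tab separator is still just the assumption stated in this answer.

```shell
# blank_repeated_keys is a hypothetical wrapper name, not part of the original answer.
# Assumes the IP and the text are separated by a single tab.
blank_repeated_keys() {
  awk -F'\t' -v OFS='\t' '
    a[$1] {                 # true only if this key was seen on an earlier line
      gsub(/./, " ", $1)    # overwrite every character of the key with a space
      print                 # print the rebuilt record (fields rejoined with OFS)
      next                  # skip the remaining rules for this line
    }
    { a[$1] = 1 }           # first occurrence of this key: remember it
    7                       # any non-zero constant is a true pattern, so the line is printed
  ' "$@"
}
```

It can be called with a file name, `blank_repeated_keys file`, or fed from a pipe. Because it remembers every key seen so far rather than only the previous one, a key that reappears later in the file is blanked as well.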
Answer 2 (score: 1)
Using awk:
awk 'BEGIN{FS=OFS=" "}{t=$1;if(t in a){gsub(/./," ",$1);a[t]=a[t]RS$0}else{k[++n]=t;a[t]=$0}}END{for(i=1;i<=n;i++)print a[k[i]]}' file
Output:
192.168.0.100 fred is happy
              fred likes pie
              pie is good
192.168.0.110 tom like cake
              cake is good
              pie is better
192.168.0.112 bill like lettuce
              lettuce is good for you
              cake and pie are better tasting than lettuce
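One thing to keep in mind when collecting lines in an array and printing them in an END block: awk leaves the iteration order of `for (i in a)` unspecified, so relying on it can reorder the groups. Below is a minimal sketch that records the first-seen order of the keys explicitly. The function name `group_by_key_in_order` and the array names are made up for illustration.

```shell
# group_by_key_in_order is a hypothetical helper name used only for this sketch.
group_by_key_in_order() {
  awk '
    {
      key = $1
      if (key in rows) {
        # Repeated key: build a run of spaces as long as the key ...
        pad = key
        gsub(/./, " ", pad)
        # ... and put it in place of the key, leaving the rest of the line untouched.
        sub(/[^[:space:]]+/, pad)
        rows[key] = rows[key] "\n" $0
      } else {
        order[++n] = key    # remember the order in which keys first appear
        rows[key] = $0
      }
    }
    END {
      # Walk the keys by index instead of using for (k in rows).
      for (i = 1; i <= n; i++) print rows[order[i]]
    }
  ' "$@"
}
```

Note that this approach gathers all lines sharing a key under that key's first occurrence, so if the same IP shows up again later in the file, those lines are moved up rather than left in place.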
Answer 3 (score: 1)
And one more:
awk 'A[$1]++{s=$1; gsub(/./,FS,s); sub($1,s)}1' file
Answer 4 (score: 0)
This might work for you (GNU sed):
sed -r '1{:a;p;h;s/\s.*//;s/./ /g;H;d};G;s/^(\S+)(\s.*)\n\1.*\n(.*)/\3\2/;t;s/\n.*//;ba' file
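Print the first record, and any record where the key changes, and store the key together with its complement (a run of spaces of the same length) in the hold space. For each subsequent record, compare the stored key with the current key; where they match, replace the current key with the complement of spaces. Where they do not match, delete the stored key and complement and repeat from the start.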