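Find and delete lines in a file by string, but keep the last occurrence

Time: 2017-03-09 13:38:19

Tags: awk sed grep

I need some help. I have been searching all day and have not found a solution specific to what I need.

In a file: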

Lots
of
other
lines
...
...
# Client=HOSTNAME@ ..........1323    <- Do not include '# Client=HOSTNAME'
# Client=HOSTNAME@ ..........123123  <- Do not include '# Client=HOSTNAME'
Client=hostname1@ ....rndChars.... <- delete line
Client=hostname1@ ....rndChars.... <- delete line
Client=hostname2@ ....rndChars.... <- delete line
Client=hostname2@ ....rndChars.... <- delete line
Client=hostname2@ ....rndChars.... <- keep last occurrence
Client=hostname1@ ....rndChars.... <- keep last occurrence
Client=hostname3@ ....rndChars.... <- delete line
Client=hostname3@ ....rndChars.... <- delete line
Client=hostname3@ ....rndChars.... <- keep last occurrence
...
...
more
lines

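I need to find all lines that start with "Client=" and, for each hostname, delete every line except its last occurrence. Commented lines starting with "# Client=" must be left alone. The problem is that I never know in advance what the hostnames will be.

The output should be: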

Lots
of
other
lines
...
...
# Client=HOSTNAME@ ..........1323    <- Do not include '# Client=HOSTNAME'
# Client=HOSTNAME@ ..........123123  <- Do not include '# Client=HOSTNAME'
Client=hostname2@ ....rndChars.... <- keep last occurrence
Client=hostname1@ ....rndChars.... <- keep last occurrence
Client=hostname3@ ....rndChars.... <- keep last occurrence
...
...
more
lines

Thanks in advance.

4 answers:

Answer 0 (score: 1)

$ tac file | awk '/^Client=/{if (seen[$1]++) next} 1' | tac
Lots
of
other
lines
...
...
# Client=HOSTNAME@ ..........1323    <- Do not include '# Client=HOSTNAME'
# Client=HOSTNAME@ ..........123123  <- Do not include '# Client=HOSTNAME'
Client=hostname2@ ....rndChars.... <- keep last occurrence
Client=hostname1@ ....rndChars.... <- keep last occurrence
Client=hostname3@ ....rndChars.... <- keep last occurrence
...
...
more
lines
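
To try the pipeline above on a small file, a minimal sketch might look like the following. The sample file contents and the function name `keep_last` are made up for illustration, and it assumes `tac` (GNU coreutils) is available. The awk key is the first whitespace-separated field, so it relies on the host part ending in "@" followed by a space, as in the question's sample.

```shell
#!/bin/sh
# Hypothetical sample data; the payloads are illustrative only.
sample=$(mktemp)
cat > "$sample" <<'EOF'
header
# Client=HOSTNAME@ aaa
Client=hostname1@ one
Client=hostname2@ two
Client=hostname2@ three
Client=hostname1@ four
footer
EOF

# keep_last FILE: print FILE, keeping only the last "Client=..." line per host.
# (keep_last is a hypothetical helper name, not part of the original answer.)
keep_last() {
    # Reverse the file so the last occurrence is seen first, skip any
    # Client= line whose first field was already seen, then reverse
    # again to restore the original order.
    tac "$1" | awk '/^Client=/ { if (seen[$1]++) next } 1' | tac
}

result=$(keep_last "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

Lines that do not start with `Client=`, including the commented `# Client=` lines, fall through to the trailing `1` and are printed unchanged.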

Answer 1 (score: 0)

Perl to the rescue. Read the file twice, keeping the last line number seen for each host in a hash table.

#!/usr/bin/perl
use warnings;
use strict;

my $client_re = qr/Client=(.*?)@/;

my $filename = shift;

open my $IN, '<', $filename or die $!;

my %lines;
while(<$IN>) {
    next if /^#/;

    # Overwrite the line number if already present.
    $lines{$1} = $. if /$client_re/;
}

seek $IN, 0, 0;  # Rewind the file handle.
$. = 0;          # Restart the line counter.
while (<$IN>) {
    if (! /^#/ && ( my ($hostname) = /$client_re/ )) {
        print if $lines{$hostname} == $.;  # Only print the stored line.
    } else {
        print;
    }
}
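
The same two-pass idea can also be sketched in awk by passing the file name twice, which avoids the need for `tac`. This is only an illustration under assumptions: the function name `keep_last_two_pass` and the sample data are hypothetical, and unlike the Perl regex it only considers lines that start with `Client=`.

```shell
#!/bin/sh
# keep_last_two_pass FILE: keep only the last Client= line per host,
# reading FILE twice. (Hypothetical helper name.)
keep_last_two_pass() {
    awk '
        # Build the host key: everything before the first "@".
        function hostkey(line) { sub(/@.*/, "", line); return line }

        # First pass (NR == FNR): remember the last line number per host.
        NR == FNR { if (/^Client=/) last[hostkey($0)] = FNR; next }

        # Second pass: skip Client= lines that are not the recorded last one.
        /^Client=/ && last[hostkey($0)] != FNR { next }

        { print }
    ' "$1" "$1"
}

# Hypothetical sample data for a quick check.
sample=$(mktemp)
printf '%s\n' 'top' '# Client=HOSTNAME@ x' 'Client=a@ 1' 'Client=b@ 2' 'Client=a@ 3' 'end' > "$sample"
two_pass_result=$(keep_last_two_pass "$sample")
printf '%s\n' "$two_pass_result"
rm -f "$sample"
```

Because it reads the file twice by name, this variant needs a regular file and does not work on piped input.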

Answer 2 (score: 0)

Using tac & awk

tac file | awk '/^Client/{ if(!a[$1]){a[$1]++;print};next}1' | tac

Output:

$ tac file | awk '/^Client/{ if(!a[$1]){a[$1]++;print};next}1' | tac
Lots
of
other
lines
...
...
# Client=HOSTNAME@ ..........1323    <- Do not include '# Client=HOSTNAME'
# Client=HOSTNAME@ ..........123123  <- Do not include '# Client=HOSTNAME'
Client=hostname2@ ....rndChars.... <- keep last occurrence
Client=hostname1@ ....rndChars.... <- keep last occurrence
Client=hostname3@ ....rndChars.... <- keep last occurrence
...
...
more
lines
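
The pipeline above prints to standard output and leaves the file itself unchanged. If the file should be updated, one possible sketch is to write the result to a temporary file first and copy it back only if the pipeline succeeded. The function name `dedupe_in_place` and the demo data are hypothetical, and `tac` is again assumed to be available.

```shell
#!/bin/sh
# dedupe_in_place FILE: rewrite FILE, keeping only the last Client line
# per host. (Hypothetical helper name.)
dedupe_in_place() {
    tmp=$(mktemp) || return 1
    # The exit status tested here is that of the final tac in the pipeline.
    if tac "$1" | awk '/^Client/ { if (!a[$1]) { a[$1]++; print }; next } 1' | tac > "$tmp"
    then
        # Copy the contents back so the original file keeps its permissions.
        cat "$tmp" > "$1"
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Hypothetical demo file.
demo=$(mktemp)
printf '%s\n' 'Client=x@ 1' 'Client=x@ 2' 'tail' > "$demo"
dedupe_in_place "$demo"
in_place_result=$(cat "$demo")
printf '%s\n' "$in_place_result"
rm -f "$demo"
```

Redirecting the pipeline straight back into the input file would truncate it before `tac` has read it, which is why the sketch goes through a temporary file.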

Answer 3 (score: 0)

sed -r ':a;N;$!ba;:b;s/(.*)(Client=[^@]+\b)[^\n]+\n*(.*\2)/\1\3/;tb' file