How to compare two files line by line and output the whole line if they differ

Time: 2018-05-16 00:40:16

Tags: bash shell awk text-processing

I have two sorted files in question:

1) one is a control file (ctrl.txt), which is generated by an external process
2) the other is a line-count file (count.txt) that I generate using `wc -l`

$ more ctrl.txt

Thunderbird|1000
Mustang|2000
Hurricane|3000

$ more count.txt

Thunder_bird|1000
MUSTANG|2000
Hurricane|3001

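I want to compare these two files while ignoring cosmetic differences in column 1 (the file names), such as the "_" in Thunder_bird or the upper case in MUSTANG, so that my output shows only the entry below, the one file whose counts really do not match.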

Hurricane|3000

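My idea is to compare only the second column of the two files and output the whole line if the values differ.

I have looked at other examples in AWK, but I could not get any of them to work.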

1 answer:

Answer 0: (score: 1)

Could you please try the following awk command and let me know if it helps you.

awk -F"|" 'FNR==NR{gsub(/_/,"");a[tolower($1)]=$2;next} {gsub(/_/,"")} ((tolower($1) in a) && $2!=a[tolower($1)])' ctrl.txt count.txt

Adding a non-one-liner form of the solution too.

awk -F"|" '
FNR==NR{
  gsub(/_/,"");
  a[tolower($1)]=$2;
  next}
{ gsub(/_/,"") }
((tolower($1) in a) && $2!=a[tolower($1)])
' ctrl.txt count.txt
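
As a quick check, the command can be run against the sample data from the question. The sketch below is only an illustration: it recreates both files in a scratch directory, and the `mktemp` directory and the `result` variable are assumptions added for the demo, not part of the original answer.

```shell
#!/bin/sh
# Recreate the two sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' 'Thunderbird|1000' 'Mustang|2000' 'Hurricane|3000' > "$tmpdir/ctrl.txt"
printf '%s\n' 'Thunder_bird|1000' 'MUSTANG|2000' 'Hurricane|3001' > "$tmpdir/count.txt"

# Same program as above: remember the counts from ctrl.txt, then print
# every count.txt line whose normalized name is known but whose count differs.
result=$(awk -F'|' '
FNR==NR { gsub(/_/,""); a[tolower($1)]=$2; next }
{ gsub(/_/,"") }
(tolower($1) in a) && $2 != a[tolower($1)]
' "$tmpdir/ctrl.txt" "$tmpdir/count.txt")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

With the sample data this prints Hurricane|3001, that is, the line as it appears in count.txt. Because gsub edits the whole line before it is printed, a mismatching Thunder_bird entry would be shown as Thunderbird.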

Explanation: Adding an explanation of the above code here too.

awk -F"|" '                                ##Set the field separator to | (pipe) for all lines of the input files.
FNR==NR{                                   ##FNR==NR is TRUE only while the first input file (ctrl.txt here) is being read; the block below runs for those lines.
  gsub(/_/,"");                            ##Use the awk gsub function to remove every _ from the current line.
  a[tolower($1)]=$2;                       ##Store $2 in array a, indexed by the lower-cased first field so that case differences are ignored.
  next}                                    ##next skips the remaining rules for this line, so they only run while the second input file (count.txt) is being read.
{ gsub(/_/,"") }                           ##From here on the second input file is being read; remove every _ from the current line.
((tolower($1) in a) && $2!=a[tolower($1)]) ##If the lower-cased $1 is an index of array a and $2 differs from the stored value, the condition is TRUE. No action is given, so awk prints the current line (from count.txt, with any _ removed) by default.
' ctrl.txt count.txt                       ##The input file names passed to awk.
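
If the line from ctrl.txt should be printed instead (the question lists Hurricane|3000 as the expected output), one possible variant reads count.txt first and normalizes the name in a helper function rather than editing the line itself. This is a sketch under assumptions, not the original answer: the function name `norm`, the array name `count`, and the scratch directory are made up for illustration.

```shell
#!/bin/sh
# Recreate the two sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' 'Thunderbird|1000' 'Mustang|2000' 'Hurricane|3000' > "$tmpdir/ctrl.txt"
printf '%s\n' 'Thunder_bird|1000' 'MUSTANG|2000' 'Hurricane|3001' > "$tmpdir/count.txt"

ctrl_mismatches=$(awk -F'|' '
# norm() is a hypothetical helper: strip underscores, then lower-case.
# It works on a copy of the name, so the input line stays untouched.
function norm(s) { gsub(/_/, "", s); return tolower(s) }

# First file (count.txt): remember each count under its normalized name.
FNR==NR { count[norm($1)] = $2; next }

# Second file (ctrl.txt): print the unmodified line when the name is
# known but the counts differ.
{ k = norm($1) }
(k in count) && $2 != count[k]
' "$tmpdir/count.txt" "$tmpdir/ctrl.txt")

printf '%s\n' "$ctrl_mismatches"
rm -rf "$tmpdir"
```

With the sample data this prints Hurricane|3000. As in the original command, names that appear in only one of the two files are not reported.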