awk: compare 7 files by field and print matches and differences

Time: 2014-01-27 12:57:06

Tags: unix awk

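I need to compare 7 files, Ref.txt against Jan.txt through Jun.txt, and report both the matches and the mismatches. For each line of Ref.txt, I want to look up its second field in the first field of each of Jan.txt through Jun.txt. If a match is found, print all fields of the Ref.txt line (the master dump), followed by the whole matching line from that monthly file. If no match is found in a given monthly file, print the status "Notfound" in its place.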

Ref.txt

abc 10  xxyyzz
bdc 20  xxyyzz
edf 30  xxyyzz
ghi 40  xxyyzz
ofg 50  xxyyzz
mgf 60  xxyyzz

Jan.txt

10  Jan 100
30  Jan 300
50  Jan 500

Feb.txt

10  Feb 200
20  Feb 400
40  Feb 800
60  Feb 1200

Mar.txt

20  Mar 600
50  Mar 1500

Apr.txt

10  Apr 100
30  Apr 300
50  Apr 500

May.txt

10  May 200
20  May 400
40  May 800
60  May 1200

Jun.txt

20  Jun 600
50  Jun 1500

Expected output:

Ref.txt Ref.txt Ref.txt Jan.txt Jan.txt Jan.txt Feb.txt Feb.txt Feb.txt Mar.txt Mar.txt Mar.txt Apr.txt Apr.txt Apr.txt May.txt May.txt May.txt Jun.txt Jun.txt Jun.txt
abc 10  xxyyzz  10  Jan 100 10  Feb 200 Notfound    Notfound    Notfound    10  Apr 100 10  May 200 Notfound    Notfound    Notfound
bdc 20  xxyyzz  Notfound    Notfound    Notfound    20  Feb 400 20  Mar 600 Notfound    Notfound    Notfound    20  May 400 20  Jun 600
edf 30  xxyyzz  30  Jan 300 Notfound    Notfound    Notfound    Notfound    Notfound    Notfound    30  Apr 300 Notfound    Notfound    Notfound    Notfound    Notfound    Notfound
ghi 40  xxyyzz  Notfound    Notfound    Notfound    40  Feb 800 Notfound    Notfound    Notfound    Notfound    Notfound    Notfound    40  May 800 Notfound    Notfound    Notfound
ofg 50  xxyyzz  50  Jan 500 Notfound    Notfound    Notfound    50  Mar 1500    50  Apr 500 Notfound    Notfound    Notfound    50  Jun 1500
mgf 60  xxyyzz  Notfound    Notfound    Notfound    60  Feb 1200    Notfound    Notfound    Notfound    Notfound    Notfound    Notfound    60  May 1200    Notfound    Notfound    Notfound

Thanks in advance for your replies.

1 answer:

Answer 0 (score: 3)

Here is a gift. Please ask about anything you don't understand.

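awk '
    # First line of each file: print its name three times as a header group
    FNR == 1 {
        printf "%s %s %s\t", FILENAME, FILENAME, FILENAME
        # Remember every file after the first (Ref.txt) in command-line order
        if (NR > FNR) file[++num_files] = FILENAME
    }
    # NR == FNR holds only while reading the first file, Ref.txt
    NR == FNR {
        id[NR] = $2       # lookup key: second field of Ref.txt
        ref[NR] = $0      # whole Ref.txt line, kept in input order
        num_ids++
        next
    }
    # Monthly files: index each line by (file name, first field)
    { value[FILENAME,$1] = $0 }
    END {
        print ""          # finish the header line
        for (row=1; row<=num_ids; row++) {
            printf "%s\t", ref[row]
            for (f=1; f<=num_files; f++) {
                key = file[f] SUBSEP id[row]
                # Print the matching line, or a placeholder when there is none
                printf "%s\t", (key in value ? value[key] : "Notfound")
            }
            print ""
        }
    }
' {Ref,Jan,Feb,Mar,Apr,May,Jun}.txt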

Output:

Ref.txt Ref.txt Ref.txt Jan.txt Jan.txt Jan.txt Feb.txt Feb.txt Feb.txt Mar.txt Mar.txt Mar.txt Apr.txt Apr.txt Apr.txt May.txt May.txt May.txt Jun.txt Jun.txt Jun.txt 
abc 10  xxyyzz  10  Jan 100 10  Feb 200 Notfound    10  Apr 100 10  May 200 Notfound    
bdc 20  xxyyzz  Notfound    20  Feb 400 20  Mar 600 Notfound    20  May 400 20  Jun 600 
edf 30  xxyyzz  30  Jan 300 Notfound    Notfound    30  Apr 300 Notfound    Notfound    
ghi 40  xxyyzz  Notfound    40  Feb 800 Notfound    Notfound    40  May 800 Notfound    
ofg 50  xxyyzz  50  Jan 500 Notfound    50  Mar 1500    50  Apr 500 Notfound    50  Jun 1500    
mgf 60  xxyyzz  Notfound    60  Feb 1200    Notfound    Notfound    60  May 1200    Notfound
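
The output above prints a single "Notfound" per monthly file, whereas the expected output in the question shows one "Notfound" for each of the three columns. If that column-aligned layout is what is wanted, a variant along the following lines might work. This is only a sketch, not part of the original answer: the function name compare_months is made up for illustration, and it assumes Ref.txt is non-empty and that every line in every file has exactly three whitespace-separated fields.

```shell
# Hypothetical helper (name is illustrative, not from the original answer).
# First argument: the reference file; remaining arguments: the monthly files.
# Assumes exactly three whitespace-separated fields on every input line.
compare_months() {
    awk '
        # First line of each file: add three header cells carrying its name
        FNR == 1 {
            header = header sep FILENAME "\t" FILENAME "\t" FILENAME
            sep = "\t"
            if (NR > FNR) file[++num_files] = FILENAME
        }
        # Still reading the reference file
        NR == FNR {
            id[FNR] = $2
            ref[FNR] = $1 "\t" $2 "\t" $3
            num_ids = FNR
            next
        }
        # Monthly files: store the line as three tab-separated cells
        { value[FILENAME, $1] = $1 "\t" $2 "\t" $3 }
        END {
            print header
            for (row = 1; row <= num_ids; row++) {
                line = ref[row]
                for (f = 1; f <= num_files; f++) {
                    key = file[f] SUBSEP id[row]
                    # Pad a miss to three cells so the columns stay aligned
                    line = line "\t" ((key in value) ? value[key] : "Notfound\tNotfound\tNotfound")
                }
                print line
            }
        }
    ' "$@"
}
```

It would be called the same way as the original command, for example `compare_months Ref.txt Jan.txt Feb.txt Mar.txt Apr.txt May.txt Jun.txt`. Every cell is separated by a single tab and there is no trailing tab, so each row has the same number of columns and the result can be pasted into a spreadsheet or piped through `column -t`.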