awk: compare two files by start and end range, print matches and differences

Time: 2014-01-21 07:08:44

Tags: unix awk

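I need to join two files, f1.txt and f2.txt, and report both the matches and the non-matches. I want to check whether the second field of f2.txt lies between StartRange and EndRange in f1.txt. If it does, print the second field of f2.txt first, followed by the whole line from f1.txt. If no match is found in f1.txt, report the status "Not Found" and print the whole line from f2.txt.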

f1.txt

Flag,StartRange,EndRange,Month
aa,1000,2000,cc,Jan-13
bb,2500,3000,cc,Feb-13
dd,5000,9000,cc,Mar-13

f2.txt

ss,1500
xx,500
gg,2800
yy,15000

Expected output

ss,1500,aa,1000,2000,cc,Jan-13
xx,500,Not Found,Not Found,Not Found,Not Found
gg,2800,bb,2500,3000,cc,Feb-13
yy,15000,Not Found,Not Found,Not Found,Not Found

1 answer:

Answer 0: (score: 0)

This might work for you:

gawk 'BEGIN { 
        FS="," # Field separator
        c=1    # counter
        while ((getline line < ARGV[1]) > 0) { 
            if (line !~ "Flag,StartRange,EndRange,Month") { # No need for header
                F[c]=line;                   # store line
                split(line,a,",")            # split line
                F2[c]=a[2] ; F3[c]=a[3]      # store the lines' range parts
                c++
            }
        }
      } 
FILENAME==ARGV[2] { 
    # Work on second file
    found=0                   # reset for every record of the second file
    for (i=1; i<c; i++) {     # scan the stored first-file lines in file order
        # if within a range (forced numeric comparison), step out
        if ($2+0>=F2[i]+0 && $2+0<=F3[i]+0) {found=i ; break}
    }
    # if the above found anything print the line from second file
    # with the relevant line from the first
    if (found>0) { 
        print $0 "," F[found] 
    } 
    # otherwise the not found message
    else { 
        print $0 ",Not Found,Not Found,Not Found,Not Found" 
    } 
}' f1.txt f2.txt
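
The same lookup can probably also be written without getline, using the common NR==FNR two-file idiom. This is only a sketch, not part of the original answer. The function name range_join and the array names row, lo and hi are made up for illustration. It assumes that f1.txt starts with a single header line, that the ranges do not overlap, and that f1.txt is not empty (with an empty first file, NR==FNR would also be true for the second file).

```shell
# Sketch only: range lookup with the NR==FNR idiom (plain POSIX awk).
# range_join, row, lo and hi are hypothetical names chosen for this example.
# Usage: range_join f1.txt f2.txt
range_join() {
    awk -F, '
    NR == FNR {                  # still reading the first file (ranges)
        if (FNR > 1) {           # assumption: line 1 is the header, skip it
            n++
            row[n] = $0          # keep the whole line for printing later
            lo[n]  = $2 + 0      # StartRange as a number
            hi[n]  = $3 + 0      # EndRange as a number
        }
        next
    }
    {                            # second file (values to look up)
        hit = 0
        for (i = 1; i <= n; i++) # scan the ranges in file order
            if ($2 + 0 >= lo[i] && $2 + 0 <= hi[i]) { hit = i; break }
        if (hit)
            print $0 "," row[hit]
        else
            print $0 ",Not Found,Not Found,Not Found,Not Found"
    }' "$1" "$2"
}
```

Looping from 1 to n keeps the ranges in file order, which for (i in F) does not guarantee. Adding 0 to each value forces a numeric rather than a string comparison.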