Using awk to replace the values of multiple columns based on conditions

Date: 2019-10-03 21:09:17

Tags: linux awk

I have a CSV file:

D,FNAME,MNAME,LNAME,GENDER,DOB,snapshot,PID.1,PID2,FNAME.1,MNAME.1,LNAME.1,FNAME2,MNAME2,LNAME2
,,,,,,201211.0,2,6,6,J,J,D,,
,,,,,,201211.0,3,4,6,H,H,M,,
,,,,,,201211.0,6,7,8,d,d,d,,
,,,,,,201211.0,0,2,5,6,7,8,,
,,,,,,201211.0,,,,,,,,
,,,,,,201211.0,,,,,,,,

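I want to fill the D, FNAME, MNAME, and LNAME columns with the values from the PID.1, FNAME.1, MNAME.1, and LNAME.1 columns wherever the former are empty, and then write all columns to a new CSV file. My expected output is: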

D,FNAME,MNAME,LNAME,GENDER,DOB,snapshot,PID.1,PID2,FNAME.1,MNAME.1,LNAME.1,FNAME2,MNAME2,LNAME2
2,6,J,J,,,201211.0,2,6,6,J,J,D,,
3,6,H,H,,,201211.0,3,4,6,H,H,M,,
6,8,d,d,,,201211.0,6,7,8,d,d,d,,
0,5,6,7,,,201211.0,0,2,5,6,7,8,,
,,,,,,201211.0,,,,,,,,
,,,,,,201211.0,,,,,,,,

I tried to do this myself with awk. Here is my code:

awk -F, '{if ($1=="" && $8!="")$1=$8;print $0}' test4.csv | awk -F, '{if ($2=="" && $10!="")$2=$10;print $0}' | awk -F, '{if ($3=="" && $11!="")$3=$11;print $0}' | awk -F, '{if ($4=="" && $12!="")$4=$12;print $0}'

The output is:

D,FNAME,MNAME,LNAME,GENDER,DOB,snapshot,PID.1,PID2,FNAME.1,MNAME.1,LNAME.1,FNAME2,MNAME2,LNAME2
2      201211.0 2 6 6 J J D  
3      201211.0 3 4 6 H H M  
6      201211.0 6 7 8 d d d  
0      201211.0 0 2 5 6 7 8  
,,,,,,201211.0,,,,,,,,
,,,,,,201211.0,,,,,,,,

So this did not work. Can anyone help me? Thanks.

2 answers:

Answer 0: (score: 2)

$ awk -F',' -vOFS=',' '($1=="" && $8!=""){$1=$8} ($2=="" && $10!=""){$2=$10} ($3=="" && $11!=""){$3=$11} ($4=="" && $12!=""){$4=$12} {print $0}' test4.csv
D,FNAME,MNAME,LNAME,GENDER,DOB,snapshot,PID.1,PID2,FNAME.1,MNAME.1,LNAME.1,FNAME2,MNAME2,LNAME2
2,6,J,J,,,201211.0,2,6,6,J,J,D,,
3,6,H,H,,,201211.0,3,4,6,H,H,M,,
6,8,d,d,,,201211.0,6,7,8,d,d,d,,
0,5,6,7,,,201211.0,0,2,5,6,7,8,,
,,,,,,201211.0,,,,,,,,
,,,,,,201211.0,,,,,,,,
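
The command above succeeds where the pipeline in the question did not because it sets OFS. When a field is assigned, awk rebuilds $0 by joining the fields with OFS, which defaults to a single space, so the question's first awk stage emitted space-separated lines that the later `-F,` stages could no longer split. The same fill-if-empty logic can also be written as a loop over target:source column pairs. The following is only a sketch: the `pairs` variable, the `fill_empty` function name, and the `sample.csv` file (two data rows taken from the question) are assumptions made for illustration.

```shell
# Hypothetical sample file: header plus two rows from the question.
cat > sample.csv <<'EOF'
D,FNAME,MNAME,LNAME,GENDER,DOB,snapshot,PID.1,PID2,FNAME.1,MNAME.1,LNAME.1,FNAME2,MNAME2,LNAME2
,,,,,,201211.0,2,6,6,J,J,D,,
,,,,,,201211.0,,,,,,,,
EOF

# fill_empty FILE: copy source column into target column when the
# target is empty and the source is not. Pairs are "target:source".
fill_empty() {
    awk -F',' -v OFS=',' -v pairs='1:8 2:10 3:11 4:12' '
    BEGIN {
        n = split(pairs, p, " ")        # one entry per column pair
        for (i = 1; i <= n; i++) {
            split(p[i], kv, ":")
            dst[i] = kv[1]              # target column number
            src[i] = kv[2]              # source column number
        }
    }
    {
        for (i = 1; i <= n; i++)
            if ($(dst[i]) == "" && $(src[i]) != "")
                $(dst[i]) = $(src[i])   # assignment rebuilds $0 using OFS
        print
    }' "$1"
}

fill_empty sample.csv
```

The header line passes through unchanged because its first four fields are already non-empty, and a row whose source columns are empty is printed exactly as read since no field is assigned.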

Answer 1: (score: 0)

Another approach that aims to be more generic:

awk '
BEGIN {
    FS=OFS=","                      # set input and output delimiters
}
NR==1 {
    for(i=1;i<=NF;i++)              # map each header name to its column index in h
        h[$i]=i
}
NR>1 {                              # copy fields unconditionally for all but the first record
    $h["D"]=$h["PID.1"]             # reference fields by header name
    $h["FNAME"]=$h["FNAME.1"]       # a misspelled header name leads to a catastrophe
    $h["MNAME"]=$h["MNAME.1"]       # if you cannot spell, guard with ifs:
    if(h["LNAME"] && h["LNAME.1"])  # but now you have to spell correctly :D
        $h["LNAME"]=$h["LNAME.1"]
}
1' file                             # output every record

Output:

D,FNAME,MNAME,LNAME,GENDER,DOB,snapshot,PID.1,PID2,FNAME.1,MNAME.1,LNAME.1,FNAME2,MNAME2,LNAME2
2,6,J,J,,,201211.0,2,6,6,J,J,D,,
3,6,H,H,,,201211.0,3,4,6,H,H,M,,
6,8,d,d,,,201211.0,6,7,8,d,d,d,,
0,5,6,7,,,201211.0,0,2,5,6,7,8,,
,,,,,,201211.0,,,,,,,,
,,,,,,201211.0,,,,,,,,
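
Note that the header-based script copies the source columns unconditionally; its output matches the expected result only because every target field in the sample is empty. A sketch that combines the header lookup with the question's fill-only-if-empty rule, and that skips a misspelled header name instead of corrupting the record, might look like the following. The `map` variable, the `fill_by_header` function name, and the `people.csv` file are assumptions for illustration, and the data row is a hypothetical variation of the question's data with FNAME pre-filled as X.

```shell
# Hypothetical sample: FNAME already holds "X" and must be preserved.
cat > people.csv <<'EOF'
D,FNAME,MNAME,LNAME,GENDER,DOB,snapshot,PID.1,PID2,FNAME.1,MNAME.1,LNAME.1,FNAME2,MNAME2,LNAME2
,X,,,,,201211.0,3,4,6,H,H,M,,
EOF

# fill_by_header FILE: pairs in map are "target=source" header names.
fill_by_header() {
    awk -v map='D=PID.1 FNAME=FNAME.1 MNAME=MNAME.1 LNAME=LNAME.1' '
    BEGIN {
        FS = OFS = ","
        n = split(map, m, " ")
    }
    NR == 1 {
        for (i = 1; i <= NF; i++)       # header name -> column index
            h[$i] = i
        for (i = 1; i <= n; i++) {
            split(m[i], kv, "=")
            if ((kv[1] in h) && (kv[2] in h)) {
                k++
                dst[k] = h[kv[1]]       # resolved target column
                src[k] = h[kv[2]]       # resolved source column
            } else                      # /dev/stderr: gawk, or any system providing that file
                print "skipping unknown pair: " m[i] > "/dev/stderr"
        }
        print
        next
    }
    {
        for (i = 1; i <= k; i++)
            if ($(dst[i]) == "" && $(src[i]) != "")
                $(dst[i]) = $(src[i])   # fill only empty targets
        print
    }' "$1"
}

fill_by_header people.csv
```

Resolving the names once on the header line keeps the per-record loop to plain column numbers, and the `in` test avoids creating empty entries in `h` for names that do not exist.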