Print ' +' or ' -' only if the string matches (with two conditions)

Date: 2014-03-14 19:35:36

Tags: bash awk

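I would like to add two additional conditions to my current code: print '+' only if, in File2, field 5 is greater than 35 and field 7 is greater than 90.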

Code:

while read -r line
do
    grep -q "$line" File2.txt && echo "$line +" || echo "$line -"
done < File1.txt

Input file 1:

HAPS_0001
HAPS_0002
HAPS_0005
HAPS_0006
HAPS_0007
HAPS_0008
HAPS_0009
HAPS_0010

Input file 2 (tab-separated):

Query   DEG_ID  E-value Score   %Identity   %Positive   %Matching_Len
HAPS_0001   protein:plasmid:149679  3.00E-67    645 45  59  91
HAPS_0002   protein:plasmid:139928  4.00E-99    924 34  50  85
HAPS_0005   protein:plasmid:134646  3.00E-98    915 38  55  91
HAPS_0006   protein:plasmid:111988  1.00E-32    345 33  54  86
HAPS_0007   -   -   0   0   0   0
HAPS_0008   -   -   0   0   0   0
HAPS_0009   -   -   0   0   0   0
HAPS_0010   -   -   0   0   0   0

Desired output (tab-separated):

HAPS_0001   +
HAPS_0002   -
HAPS_0005   +
HAPS_0006   -
HAPS_0007   -
HAPS_0008   -
HAPS_0009   -
HAPS_0010   -

Thanks!

2 answers:

Answer 0: (score: 2)

This should work:

$ awk '
BEGIN {FS = OFS = "\t"} 
NR==FNR {if($5>35 && $7>90) a[$1]++; next}
{print (($1 in a) ? $0 FS "+" : $0 FS "-")}' f2 f1
HAPS_0001       +
HAPS_0002       -
HAPS_0005       +
HAPS_0006       -
HAPS_0007       -
HAPS_0008       -
HAPS_0009       -
HAPS_0010       -
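
To see why this works, it can help to run the same two-pass idiom on a trimmed copy of the sample data. The following is a minimal sketch rather than the exact command above: the function name `mark_hits`, the scratch directory, and the explicit `FNR > 1` header skip are assumptions added for illustration, and it prints only the ID and the marker.

```shell
#!/bin/sh
# Hypothetical scratch files holding a trimmed copy of the sample data.
tmp=$(mktemp -d)
printf '%s\n' HAPS_0001 HAPS_0002 HAPS_0005 HAPS_0007 > "$tmp/f1"
printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
    Query DEG_ID E-value Score %Identity %Positive %Matching_Len \
    HAPS_0001 protein:plasmid:149679 3.00E-67 645 45 59 91 \
    HAPS_0002 protein:plasmid:139928 4.00E-99 924 34 50 85 \
    HAPS_0005 protein:plasmid:134646 3.00E-98 915 38 55 91 \
    HAPS_0007 - - 0 0 0 0 > "$tmp/f2"

# mark_hits TABLE IDS  (hypothetical helper name)
# TABLE is the File2-style table, IDS is the File1-style list of IDs.
mark_hits() {
    awk '
    BEGIN { FS = OFS = "\t" }
    # Pass 1 (table): remember IDs that meet both thresholds.
    # FNR > 1 skips the header line; "+ 0" forces a numeric comparison.
    NR == FNR { if (FNR > 1 && $5 + 0 > 35 && $7 + 0 > 90) ok[$1] = 1; next }
    # Pass 2 (ID list): print the ID followed by + or -.
    { print $1, (($1 in ok) ? "+" : "-") }
    ' "$1" "$2"
}

mark_hits "$tmp/f2" "$tmp/f1"
```

`NR == FNR` is true only while awk reads its first file argument, so the table has to be passed first and the ID list second.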

Answer 1: (score: 1)

join file1.txt <( tail -n +2 file2.txt) | awk '
 $2 = ($5 > 35 && $7 > 90)?"+":"-" { print $1, $2 }'

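You don't need the second field in the output, so overwrite it with the appropriate marker. The assignment sits in the pattern position; because it always evaluates to a non-empty string ("+" or "-"), the print action runs for every line.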
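
Two things are worth checking before relying on the `join` approach: `join` expects both inputs to be sorted on the join field, and the command above prints space-separated fields, whereas the question asks for tabs. A hedged sketch that sorts first and sets `OFS` follows. The function name `join_marks` and the temporary files are hypothetical, and it reads one input from standard input instead of using process substitution so that it also runs under plain `sh`.

```shell
#!/bin/sh
# Hypothetical scratch files holding a trimmed copy of the sample data.
tmp=$(mktemp -d)
printf '%s\n' HAPS_0001 HAPS_0002 HAPS_0005 HAPS_0007 > "$tmp/ids"
printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
    Query DEG_ID E-value Score %Identity %Positive %Matching_Len \
    HAPS_0001 protein:plasmid:149679 3.00E-67 645 45 59 91 \
    HAPS_0002 protein:plasmid:139928 4.00E-99 924 34 50 85 \
    HAPS_0005 protein:plasmid:134646 3.00E-98 915 38 55 91 \
    HAPS_0007 - - 0 0 0 0 > "$tmp/table"

# join_marks IDS TABLE  (hypothetical helper name)
join_marks() {
    sorted_table=$(mktemp)
    # Drop the header line, then sort the table on its first field.
    tail -n +2 "$2" | sort -k1,1 > "$sorted_table"
    # Sort the IDs, join on field 1 ("-" means standard input),
    # then print the ID and the marker separated by a tab.
    sort "$1" | join - "$sorted_table" |
        awk -v OFS='\t' '{ print $1, (($5 > 35 && $7 > 90) ? "+" : "-") }'
    rm -f "$sorted_table"
}

join_marks "$tmp/ids" "$tmp/table"
```

Note that the output comes back in sorted ID order, and that `join` by default drops any ID that has no row in the table, so such IDs would get no line at all instead of a "-".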