Adding a column to a CSV file with awk when only the column index is known

Date: 2018-11-26 11:02:17

Tags: python-3.x csv awk

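I'm stuck on an exercise.

  • I have an array of strings that must be added to the file as a column.
  • I have a CSV file and I only know the column index.
  • I also need to start inserting at line 12 and continue until the array is exhausted.

This is what I have tried in Python: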

index = 0 
col = "17"
while index < len(packages):
    cmdw = "awk -F \"\t\" -V OFS=\"\t\" -v col=" + col + " -v dato=" + packages[index] + " '{$col=$dato};' 1540476113.gt.tie "
    print("eseguo il comando ",cmd)
    os.system(cmdw)
    print("comando eseguito")
    index = index + 1

print("packages aggiunti!")

Sample input file:

# TIE output version: 1.0 (text format)
# generated by: . -a ndping_1.0 -r /home/giuseppe/Scrivania/gruppo30/1540476113/traffic.pcap 

# Working Mode: off-line
# Session Type: biflow
# 1 plugins enabled: ndping 

# begin trace interval: 1540476116.42434

# begin TIE Table
# id    src_ip      dst_ip      proto   sport   dport   dwpkts  uppkts  dwbytes upbytes t_start         t_last          app_id  sub_id  app_details confidence
17  192.168.20.105  216.58.205.42   6   50854   443 8   9   1507    1728    1540476136.698920   1540476136.879543   501 0   Google  100
26  192.168.20.105  151.101.66.202  6   40107   443 15  18  5874    1882    1540476194.196948   1540476204.641949   501 0   SSL_with_certificate    100
27  192.168.20.105  31.13.90.2  6   48133   443 10  15  4991    1598    1540476194.218949   1540476196.358946   501 0   Facebook    100
38  192.168.20.105  13.32.71.69 6   52108   443 9   12  5297    2062    1540476195.492946   1540476308.604998   501 0   SSL_with_certificate    100
0   34.246.212.92   192.168.20.105  6   443 37981   3   2   187 98  1540476116.042434   1540476189.868844   0   0   Other TCP   0
29  192.168.20.105  13.32.123.222   6   36481   443 11  15  6638    1914    1540476194.376945   1540476308.572998   501 0   SSL_with_certificate    100
31  192.168.20.105  8.8.8.8 17  1219    53  1   1   253 68  1540476194.898945   1540476194.931198   501 0   DNS 100
42  192.168.20.105  8.8.8.8 17  8339    53  1   1   198 70  1540476215.626959   1540476215.643374   501 0   DNS 100
33  192.168.20.105  8.8.8.8 17  10529   53  1   1   198 70  1540476194.960946   1540476194.977174   501 0   DNS 100
35  192.168.20.105  8.8.8.8 17  10916   53  1   1   169 64  1540476195.149943   1540476195.189064   501 0   DNS 100
44  192.168.20.105  8.8.8.8 17  11736   53  1   1   111 63  1540476217.327956   1540476217.369471   501 0   DNS 100
21  192.168.20.105  8.8.8.8 17  13249   53  1   1   102 70  1540476189.828943   1540476189.869843   501 0   DNS 100
24  192.168.20.105  8.8.8.8 17  14312   53  1   1   128 64  1540476194.150951   1540476194.166601   501 0   DNS 100
28  192.168.20.105  8.8.8.8 17  15049   53  1   1   174 67  1540476194.312946   1540476194.354500   501 0   DNS 100
37  192.168.20.105  8.8.8.8 17  17362   53  1   1   75  59  1540476195.428947   1540476195.468915   501 0   DNS 100
39  192.168.20.105  8.8.8.8 17  25274   53  1   1   258 63  1540476195.683944   1540476195.699796   501 0   DNS 100
25  192.168.20.105  8.8.8.8 17  26608   53  1   1   122 64  1540476194.191945   1540476194.207576   501 0   DNS 100
14  192.168.20.105  8.8.8.8 17  35680   53  1   1   120 59  1540476133.452918   1540476133.486316   501 0   DNS 100
18  192.168.20.105  8.8.8.8 17  43833   53  1   1   118 72  1540476136.868920   1540476136.902531   501 0   DNS 100
4   192.168.20.105  8.8.8.8 17  43919   53  1   1   93  61  1540476126.806916   1540476126.822800   501 0   DNS 100
2   192.168.20.105  8.8.8.8 17  51340   53  1   1   141 63  1540476124.935913   1540476124.967768   501 0   DNS 100
3   192.168.20.105  8.8.8.8 17  64815   53  1   1   141 63  1540476124.974914   1540476125.006749   501 0   DNS 100
30  192.168.20.105  216.58.198.14   6   48980   443 4   2   1093    884 1540476194.835944   1540476195.102945   0   0   Other TCP   0

This is the file; I want to add a new column to it, starting after the 12 header lines.

1 answer:

Answer 0 (score: 1)

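You can read the file directly in Python, skip the 12 header lines, and append each value as a new last column. There is no need to call awk at all.

The code below only prints the result; you can write it to a new file instead if you need to.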

    header_lines = 12  # number of leading lines to leave untouched
    index = 0          # position of the next value in packages
    packages = ["dato =" + str(i) for i in range(100)]  # dummy list, replace it with the real one
    with open("input_file.txt") as f:
        for line_no, line in enumerate(f):
            if line_no >= header_lines and index < len(packages):
                # data row and a value is still available: append it as the last column
                print("{}\t{}".format(line.rstrip('\n'), packages[index]))
                index += 1
            else:
                # header line, or the list is exhausted: print the line unchanged
                print(line, end='')

Sample output:

# TIE output version: 1.0 (text format)
# generated by: . -a ndping_1.0 -r /home/giuseppe/Scrivania/gruppo30/1540476113/traffic.pcap

# Working Mode: off-line
# Session Type: biflow
# 1 plugins enabled: ndping

# begin trace interval: 1540476116.42434

# begin TIE Table
# id    src_ip      dst_ip      proto   sport   dport   dwpkts  uppkts  dwbytes upbytes t_start         t_last          app_id  sub_id  app_details confidence
17  192.168.20.105  216.58.205.42   6   50854   443 8   9   1507    1728    1540476136.698920   1540476136.879543   501 0   Google  100
26  192.168.20.105  151.101.66.202  6   40107   443 15  18  5874    1882    1540476194.196948   1540476204.641949   501 0   SSL_with_certificate    100 dato =0
27  192.168.20.105  31.13.90.2  6   48133   443 10  15  4991    1598    1540476194.218949   1540476196.358946   501 0   Facebook    100 dato =1
38  192.168.20.105  13.32.71.69 6   52108   443 9   12  5297    2062    1540476195.492946   1540476308.604998   501 0   SSL_with_certificate    100     dato =2
0   34.246.212.92   192.168.20.105  6   443 37981   3   2   187 98  1540476116.042434   1540476189.868844   0   0   Other TCP   0       dato =3
29  192.168.20.105  13.32.123.222   6   36481   443 11  15  6638    1914    1540476194.376945   1540476308.572998   501 0   SSL_with_certificate    100 dato =4
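
If the values have to land in a specific column rather than simply at the end of each row, and the result should go to a new file, the same loop can be adapted. The following is only a sketch under a few assumptions: the file is tab-separated, `col` is a 1-based column index as in the question, and rows shorter than `col` are padded with empty fields. The function names `add_column` and `add_column_to_file` are made up for this illustration.

```python
def add_column(lines, values, col, header_lines=12, sep="\t"):
    """Return the lines (without trailing newlines) with values[i] stored
    in 1-based column `col` of the i-th line after the header."""
    out = []
    pending = iter(values)
    for line_no, line in enumerate(lines):
        text = line.rstrip("\n")
        # Only lines after the header consume a value; None means "leave as is".
        value = next(pending, None) if line_no >= header_lines else None
        if value is None:
            out.append(text)
            continue
        fields = text.split(sep)
        # Pad short rows with empty fields so that index col - 1 exists.
        fields.extend([""] * (col - len(fields)))
        fields[col - 1] = value
        out.append(sep.join(fields))
    return out


def add_column_to_file(src_path, dst_path, values, col, header_lines=12):
    """Read src_path, add the column, and write the result to dst_path."""
    with open(src_path) as src:
        new_lines = add_column(src, values, col, header_lines)
    with open(dst_path, "w") as dst:
        dst.writelines(line + "\n" for line in new_lines)
```

For the file in the question this might be called as `add_column_to_file("1540476113.gt.tie", "out.tie", packages, col=17)`, where `out.tie` is a placeholder output name. Writing to a separate file keeps the original intact in case the column index turns out to be wrong.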