Join files by IP address

Date: 2016-05-31 10:27:31

Tags: bash shell join awk

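I have two files, each listing IP addresses, MAC addresses, and descriptions in separate columns. Some IP addresses appear in both files. I want to join the files by IP address so that the output file has: 1) all IP addresses from both files, and 2) for any IP listed in both files, the IP followed by four columns: the MAC and description from each of the two files.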

File 1:

11.16.31.13     00:a0:c8:b5:c2:d5  keshav-ae1.0
10.16.31.17     f0:ad:4e:01:c5:c8  keshav-ge-2/1/5.0
108.16.31.3     4c:96:14:5d:5f:f0  keshav-ae0.0
108.16.31.4     00:0a:9c:52:74:b2  keshav-ae1.0
27.16.32.1      00:00:5e:00:01:4c  keshav-ae0.0

File 2:

192.16.31.10     00:25:90:cd:4e:3c  keshav-ae0.0
10.16.31.17     f0:ad:4e:01:c5:c8  keshav-ae0.0
17.16.31.2      b0:a8:6e:28:87:f0  keshav-ae0.0
108.16.31.4      00:0a:9c:52:74:b2  keshav-ae0.0
10.16.31.5      2c:36:f8:ce:65:42  keshav-ae0.0

The output file should be:

11.16.31.13     00:a0:c8:b5:c2:d5  keshav-ae1.0
10.16.31.17     f0:ad:4e:01:c5:c8  keshav-ge-2/1/5.0 f0:ad:4e:01:c5:c8  keshav-ae0.0
108.16.31.3     4c:96:14:5d:5f:f0  keshav-ae0.0
108.16.31.4     00:0a:9c:52:74:b2  keshav-ae1.0 00:0a:9c:52:74:b2  keshav-ae0.0
27.16.32.1      00:00:5e:00:01:4c  keshav-ae0.0
192.16.31.10    00:25:90:cd:4e:3c  keshav-ae0.0
17.16.31.2      b0:a8:6e:28:87:f0  keshav-ae0.0
10.16.31.5      2c:36:f8:ce:65:42  keshav-ae0.0

I tried joining the sorted files (using "sort -n" or "sort -n -t . -k 1,1 -k 2,2 -k 3,3 -k 4,4"), but that does not give the desired output.

sort -n file1 > file3
sort -n file2 > file4
join -j 1 -a 1 -a 2 -e UNKNOWN file3 file4 > output

The output file looks like this:

10.16.31.17 f0:ad:4e:01:c5:c8 keshav-ge-2/1/5.0 f0:ad:4e:01:c5:c8 keshav-ae0.0
10.16.31.5 2c:36:f8:ce:65:42 keshav-ae0.0
11.16.31.13 00:a0:c8:b5:c2:d5 keshav-ae1.0
17.16.31.2 b0:a8:6e:28:87:f0 keshav-ae0.0
108.16.31.4 00:0a:9c:52:74:b2 keshav-ae0.0
192.16.31.10 00:25:90:cd:4e:3c keshav-ae0.0
27.16.32.1 00:00:5e:00:01:4c keshav-ae0.0
108.16.31.3 4c:96:14:5d:5f:f0 keshav-ae0.0
108.16.31.4 00:0a:9c:52:74:b2 keshav-ae1.0
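
A note on the attempt above: join expects both inputs to be sorted on the join field in the same collation order it uses for comparison, and sort -n does not produce that order for dotted IP strings. The sketch below assumes a POSIX sort and join and pins byte order with LC_ALL=C; it re-sorts both files with sort -k1,1 before joining. The temp directory and variable names are illustrative, and the result comes out in lexical order rather than the order of the original files.

```shell
#!/bin/sh
# Sample data copied from the question; paths and variable names are illustrative.
tmp=$(mktemp -d)
cat > "$tmp/file1" <<'EOF'
11.16.31.13     00:a0:c8:b5:c2:d5  keshav-ae1.0
10.16.31.17     f0:ad:4e:01:c5:c8  keshav-ge-2/1/5.0
108.16.31.3     4c:96:14:5d:5f:f0  keshav-ae0.0
108.16.31.4     00:0a:9c:52:74:b2  keshav-ae1.0
27.16.32.1      00:00:5e:00:01:4c  keshav-ae0.0
EOF
cat > "$tmp/file2" <<'EOF'
192.16.31.10     00:25:90:cd:4e:3c  keshav-ae0.0
10.16.31.17     f0:ad:4e:01:c5:c8  keshav-ae0.0
17.16.31.2      b0:a8:6e:28:87:f0  keshav-ae0.0
108.16.31.4      00:0a:9c:52:74:b2  keshav-ae0.0
10.16.31.5      2c:36:f8:ce:65:42  keshav-ae0.0
EOF

# Sort on the first field in byte order, the order join compares in under LC_ALL=C.
LC_ALL=C sort -k1,1 "$tmp/file1" > "$tmp/file3"
LC_ALL=C sort -k1,1 "$tmp/file2" > "$tmp/file4"

# -a 1 -a 2 also prints unpairable lines from either file (a full outer join).
joined=$(LC_ALL=C join -a 1 -a 2 "$tmp/file3" "$tmp/file4")
printf '%s\n' "$joined"

rm -rf "$tmp"
```

The -e UNKNOWN option from the original command is left out here because it only affects fields selected with -o.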

3 answers:

Answer 0 (score: 0)

awk '!/^$/{ if(!a[$1]){a[$1]=$1"\t"$2"\t"$3} else {a[$1]=a[$1]"\n\t\t"$2"\t"$3} } END { for(i in a) print a[i] }' f1 f2

Output:

17.16.31.2      b0:a8:6e:28:87:f0       keshav-ae0.0
192.16.31.10    00:25:90:cd:4e:3c       keshav-ae0.0
108.16.31.3     4c:96:14:5d:5f:f0       keshav-ae0.0
108.16.31.4     00:0a:9c:52:74:b2       keshav-ae1.0
                00:0a:9c:52:74:b2       keshav-ae0.0
10.16.31.5      2c:36:f8:ce:65:42       keshav-ae0.0
27.16.32.1      00:00:5e:00:01:4c       keshav-ae0.0
11.16.31.13     00:a0:c8:b5:c2:d5       keshav-ae1.0
10.16.31.17     f0:ad:4e:01:c5:c8       keshav-ge-2/1/5.0
                f0:ad:4e:01:c5:c8       keshav-ae0.0

Answer 1 (score: 0)

If you want to keep the order of the expected output, you can try this.

awk 'ARGIND<3{dic[$1]=dic[$1]" "$2" "$3}ARGIND>=3 && !($1 in a){print $1dic[$1];a[$1]}' file1 file2 file1 file2

Output:

11.16.31.13 00:a0:c8:b5:c2:d5 keshav-ae1.0
10.16.31.17 f0:ad:4e:01:c5:c8 keshav-ge-2/1/5.0 f0:ad:4e:01:c5:c8 keshav-ae0.0
108.16.31.3 4c:96:14:5d:5f:f0 keshav-ae0.0
108.16.31.4 00:0a:9c:52:74:b2 keshav-ae1.0 00:0a:9c:52:74:b2 keshav-ae0.0
27.16.32.1 00:00:5e:00:01:4c keshav-ae0.0
192.16.31.10 00:25:90:cd:4e:3c keshav-ae0.0
17.16.31.2 b0:a8:6e:28:87:f0 keshav-ae0.0
10.16.31.5 2c:36:f8:ce:65:42 keshav-ae0.0

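Explanation:

ARGIND<3 means the following statement {dic[$1]=dic[$1]" "$2" "$3} runs only for the first 2 input files. That statement simply puts the data into a dictionary (in Python terms) or map (in C++/Java terms) named dic, using the IP as the key and the MAC plus description as the value. Note that ARGIND is a gawk extension, so this command needs gawk.

ARGIND>=3 means the following statement {print $1dic[$1];a[$1]} runs for the remaining files, excluding the first 2. !($1 in a) means the key is not yet in dictionary a, so no IP is printed twice. The statement {print $1dic[$1];a[$1]} simply prints the IP, MAC, and description, and adds the IP to dictionary a.

The third and fourth arguments are there only to scan the files for keys a second time, so that the results come out in the order you want.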
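
A variation on the idea above, for awk implementations without ARGIND: the sketch below reads each file once and remembers the order in which each IP is first seen. The function name join_by_ip and the arrays dic and order are illustrative choices, not part of the original answer; it assumes any POSIX awk.

```shell
#!/bin/sh
# Sample data copied from the question; paths and names are illustrative.
tmp=$(mktemp -d)
cat > "$tmp/file1" <<'EOF'
11.16.31.13     00:a0:c8:b5:c2:d5  keshav-ae1.0
10.16.31.17     f0:ad:4e:01:c5:c8  keshav-ge-2/1/5.0
108.16.31.3     4c:96:14:5d:5f:f0  keshav-ae0.0
108.16.31.4     00:0a:9c:52:74:b2  keshav-ae1.0
27.16.32.1      00:00:5e:00:01:4c  keshav-ae0.0
EOF
cat > "$tmp/file2" <<'EOF'
192.16.31.10     00:25:90:cd:4e:3c  keshav-ae0.0
10.16.31.17     f0:ad:4e:01:c5:c8  keshav-ae0.0
17.16.31.2      b0:a8:6e:28:87:f0  keshav-ae0.0
108.16.31.4      00:0a:9c:52:74:b2  keshav-ae0.0
10.16.31.5      2c:36:f8:ce:65:42  keshav-ae0.0
EOF

# Hypothetical helper: join any number of files on the first column.
join_by_ip() {
  awk '
    NF == 0 { next }                       # skip blank lines
    !($1 in dic) { order[++n] = $1 }       # remember the first-seen order of each IP
    { dic[$1] = dic[$1] " " $2 " " $3 }    # append MAC and description to the entry
    END { for (i = 1; i <= n; i++) print order[i] dic[order[i]] }
  ' "$@"
}

result=$(join_by_ip "$tmp/file1" "$tmp/file2")
printf '%s\n' "$result"

rm -rf "$tmp"
```

Because the order is stored in a numerically indexed array, the END loop does not depend on the unspecified traversal order of for (item in dic).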

If you only want to join the tables and do not care about the output order, you can use

awk '{dic[$1]=dic[$1]" "$2" "$3}END{for(item in dic) print item""dic[item]}' file1 file2

Output:

17.16.31.2 b0:a8:6e:28:87:f0 keshav-ae0.0
192.16.31.10 00:25:90:cd:4e:3c keshav-ae0.0
108.16.31.3 4c:96:14:5d:5f:f0 keshav-ae0.0
108.16.31.4 00:0a:9c:52:74:b2 keshav-ae1.0 00:0a:9c:52:74:b2 keshav-ae0.0
10.16.31.5 2c:36:f8:ce:65:42 keshav-ae0.0
27.16.32.1 00:00:5e:00:01:4c keshav-ae0.0
11.16.31.13 00:a0:c8:b5:c2:d5 keshav-ae1.0
10.16.31.17 f0:ad:4e:01:c5:c8 keshav-ge-2/1/5.0 f0:ad:4e:01:c5:c8 keshav-ae0.0

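Explanation:

The statement {dic[$1]=dic[$1]" "$2" "$3} does the same thing as before (it puts the data into the dictionary dic). The statement END{for(item in dic) print item""dic[item]} means that after all input files have been processed, the key and value are printed for each key item in the dictionary.

Answer 2 (score: 0)

awk should help: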

awk '{a[$1]=a[$1]" "$2" "$3}END{for (i in a){print i a[i]}}' file1 file2

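Explanation: for each IP address, record the MAC address and description (appending to the existing entry, if there is one), and print everything at the end.

If you want the array traversed in IP address order, add PROCINFO["sorted_in"]="@ind_num_asc" before the for loop (this is a gawk-specific feature):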

awk '{a[$1]=a[$1]" "$2" "$3}END{PROCINFO["sorted_in"]="@ind_num_asc"; for (i in a){print i a[i]}}' file1 file2
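
As an alternative that does not rely on gawk, the unordered awk output can be piped through sort with one numeric key per octet. This is a minimal sketch assuming a POSIX awk and sort; the temp directory and the variable name sorted are illustrative.

```shell
#!/bin/sh
# Sample data copied from the question; paths and names are illustrative.
tmp=$(mktemp -d)
cat > "$tmp/file1" <<'EOF'
11.16.31.13     00:a0:c8:b5:c2:d5  keshav-ae1.0
10.16.31.17     f0:ad:4e:01:c5:c8  keshav-ge-2/1/5.0
108.16.31.3     4c:96:14:5d:5f:f0  keshav-ae0.0
108.16.31.4     00:0a:9c:52:74:b2  keshav-ae1.0
27.16.32.1      00:00:5e:00:01:4c  keshav-ae0.0
EOF
cat > "$tmp/file2" <<'EOF'
192.16.31.10     00:25:90:cd:4e:3c  keshav-ae0.0
10.16.31.17     f0:ad:4e:01:c5:c8  keshav-ae0.0
17.16.31.2      b0:a8:6e:28:87:f0  keshav-ae0.0
108.16.31.4      00:0a:9c:52:74:b2  keshav-ae0.0
10.16.31.5      2c:36:f8:ce:65:42  keshav-ae0.0
EOF

# Join with awk (order unspecified), then sort numerically on each dot-separated octet.
# The fourth key also contains the rest of the line, but -n only reads its leading number.
sorted=$(awk 'NF { a[$1] = a[$1] " " $2 " " $3 } END { for (i in a) print i a[i] }' \
           "$tmp/file1" "$tmp/file2" |
         sort -t . -k1,1n -k2,2n -k3,3n -k4,4n)
printf '%s\n' "$sorted"

rm -rf "$tmp"
```

With the sample data this places 10.16.31.5 before 10.16.31.17 and 192.16.31.10 last, since each octet is compared as a number.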