Python: removing duplicate lines

Time: 2015-11-25 12:37:53

Tags: python duplicates

I wrote the following code to extract all IP addresses from a file and print them:

with open("C:\\users\\joey\\desktop\\access.log",'r') as bestand:
    for line in bestand:
        try:
            splittedline = line.split('sftp-session')[1].split("[")[1].split("]")[0]
        except Exception:
            continue
        print splittedline

The following code prints all IP addresses from another file:

with open("C:\\users\\joey\\desktop\\exit_nodes.csv",'r') as bestand:
    for line in bestand:
        print line

How can I compare the two files, show only the unique IP addresses, and remove the duplicates?

The output at the moment looks like this:

217.172.190.19
217.210.165.43
218.250.241.229
223.18.115.229
223.133.243.101

1 answer:

Answer 0 (score: 2)

If order doesn't matter, use sets:

ips_1 = set()

with open("C:\\users\\joey\\desktop\\access.log",'r') as bestand:
    for line in bestand:
        try:
            ips_1.add(line.split('sftp-session')[1].split("[")[1].split("]")[0])
        except Exception:
            continue

ips_2 = set()
with open("C:\\users\\joey\\desktop\\exit_nodes.csv",'r') as bestand:
    for line in bestand:
        ips_2.add(line.strip())  # strip the trailing newline so entries match those in ips_1
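
One detail worth checking before comparing the two sets: lines read from a file keep their trailing newline, so a raw line such as "1.2.3.4\n" would not equal the bare "1.2.3.4" extracted from the log. Here is a minimal sketch of that step, written for Python 3. It uses `io.StringIO` with made-up contents in place of exit_nodes.csv and assumes the file holds one IP per line:

```python
import io

# Hypothetical stand-in for exit_nodes.csv: one IP per line (an assumption).
fake_csv = io.StringIO("217.210.165.43\n223.18.115.229\n\n")

ips_2 = set()
for line in fake_csv:
    ip = line.strip()  # drop the trailing newline and surrounding whitespace
    if ip:             # skip blank lines
        ips_2.add(ip)

print(sorted(ips_2))
```

If the real CSV has more than one column, the IP column would have to be selected first, for example with the `csv` module.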

You can then use set methods to find the IPs that appear in both files, the IPs that appear in only one of them, or all unique IPs:

Which IPs are in both files?

ips_1.intersection(ips_2)

Which IPs are only in file 1?

ips_1.difference(ips_2)
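
For the IPs that appear in exactly one of the two files, whichever one that is, `symmetric_difference` may be what you want. This is a minimal Python 3 sketch; the two small hand-made sets are illustrative sample values standing in for the real file contents:

```python
# Hypothetical sample data standing in for the contents of the two files.
ips_1 = {"217.172.190.19", "217.210.165.43", "218.250.241.229"}
ips_2 = {"217.210.165.43", "223.18.115.229"}

# IPs that occur in exactly one of the two sets (either side).
only_in_one = ips_1.symmetric_difference(ips_2)  # same as ips_1 ^ ips_2

for ip in sorted(only_in_one):
    print(ip)
```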

All unique IPs:

ips_1.union(ips_2)
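
If order does matter, a set alone will not preserve it. One possible sketch, assuming Python 3.7+ (where `dict` keeps insertion order), removes duplicates while keeping the first occurrence of each IP. The helper name `unique_in_order` and the sample list are hypothetical:

```python
def unique_in_order(items):
    """Return the items with duplicates removed, keeping first-seen order."""
    # dict.fromkeys keeps one key per distinct item, in insertion order.
    return list(dict.fromkeys(items))


# Hypothetical sample data; in practice these would be the IPs read from the files.
ips = ["223.18.115.229", "217.172.190.19", "223.18.115.229", "217.172.190.19"]
print(unique_in_order(ips))
```

On older Python versions, `collections.OrderedDict.fromkeys` could be used in the same way.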