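Python: printing duplicate data sets in a CSV file (csv.reader)

Time: 2019-03-15 09:59:43

Tags: python python-3.x csv reader

I am trying to print every data set that occurs more than once in a CSV file. Each data set is a group of several values, not a single value.

For example: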

//dataset1: 25,41,1,23,12,//dataset2: 11, 2
//dataset1: 25,41,1,22,13,//dataset2: 11, 2
//dataset1: 25,41,1,23,14,//dataset2: 11, 3
//dataset1: 25,41,1,23,15,//dataset2: 11, 4
//dataset1: 25,41,1,23,15,//dataset2: 11, 5

This is what I tried:

    with open(str(csv_file)) as file:
        reader = csv.reader(file)
        for row in reader:
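            rowset1 = [row[0], row[1], row[2], row[3], row[4]]
            rowset2 = [row[5], row[6]]
            # This is where I am stuck. The two names below are placeholders for
            # "rowset1 exists more than once" and "rowset2 exists more than once".
            if rowset1_exists_more_than_once or rowset2_exists_more_than_once: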
                print("True")
            else:
                print("False")

Edit:

The output should be:

True
True
False
True
True

I also tried something like the following, but I think it is the wrong approach:

len(my_list) != len(set(my_list))
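
To illustrate what that check does and does not give: it produces a single boolean for the whole list, so it cannot tell which rows are the duplicates. A small sketch, assuming hypothetical sample values stored as tuples (lists cannot be put into a set because they are not hashable):

```python
# Hypothetical sample data: each data set is a tuple so that it is hashable.
my_list = [("11", "2"), ("11", "2"), ("11", "3")]

# True if at least one item occurs more than once; it does not say which one.
has_duplicates = len(my_list) != len(set(my_list))
print(has_duplicates)
```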

2 answers:

Answer 0: (score: 1)

If I understand correctly:

File

25,41,1,23,12,11,2
25,41,1,22,13,11,2
25,41,1,23,14,11,3
25,41,1,23,15,11,4
25,41,1,23,15,11,5

Code

#!/usr/bin/python3
# -*- coding: utf-8 -*-

import csv

if __name__ == '__main__':

    # Data sets seen so far, tracked separately for each column group.
    res1 = []
    res2 = []

    # newline='' is what the csv module documentation recommends for file objects.
    with open('file.csv', 'r', newline='') as csv_file:
        reader_orig = csv.reader(csv_file, delimiter=',')
        for row in reader_orig:
            row_set1 = row[0:5]  # first data set: columns 0-4
            row_set2 = row[5:7]  # second data set: columns 5-6

            # Print a data set if an earlier row already contained it.
            if row_set1 not in res1:
                res1.append(row_set1)
            else:
                print(row_set1)

            if row_set2 not in res2:
                res2.append(row_set2)
            else:
                print(row_set2)

Output

['11', '2']
['25', '41', '1', '23', '15']
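
Note that this prints each repeated data set when it shows up again, rather than the per-row True/False listed in the question. One possible way to get that output is a two-pass approach with `collections.Counter`: count every data set first, then flag each row. This is only a sketch under assumptions: the helper name `flag_duplicates` is made up for illustration, and `io.StringIO` with the sample rows stands in for the real file.

```python
import csv
import io
from collections import Counter

# Sample rows copied from the file above; io.StringIO stands in for
# open('file.csv', newline='') so the sketch runs without a file on disk.
SAMPLE = """25,41,1,23,12,11,2
25,41,1,22,13,11,2
25,41,1,23,14,11,3
25,41,1,23,15,11,4
25,41,1,23,15,11,5
"""


def flag_duplicates(csv_source):
    """Return one bool per row: True if either data set of that row occurs more than once."""
    rows = list(csv.reader(csv_source))
    # First pass: count every data set. Tuples are used because lists are not hashable.
    count1 = Counter(tuple(row[:5]) for row in rows)
    count2 = Counter(tuple(row[5:7]) for row in rows)
    # Second pass: flag a row if either of its data sets was counted more than once.
    return [
        count1[tuple(row[:5])] > 1 or count2[tuple(row[5:7])] > 1
        for row in rows
    ]


flags = flag_duplicates(io.StringIO(SAMPLE))
for flag in flags:
    print(flag)
```

For the sample rows this prints True, True, False, True, True, matching the output requested in the question. Reading all rows into memory keeps the sketch short; for a very large file the two passes could instead read the file twice.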

Answer 1: (score: 0)

Your question is unclear. What is a "data set" in your CSV? Please post an actual excerpt of the CSV, without the added "//" comments.