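Working with CSV files in Python

Date: 2015-07-19 18:13:35

Tags: python csv

I am trying to output the differences between two CSV files, compared on two columns, and write the result to a third CSV file. How can I make the following code compare on column 0 and column 3?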

SELECT questions.question_id,
       questions.username,
       questions.question,
       userlog.user_mail,
       COUNT(answers.answer) as answerCount
  FROM questions
  LEFT JOIN userlog ON questions.username = userlog.username
  LEFT JOIN answers ON answers.question_id = questions.question_id
 WHERE questions.topic_id = '0d3fb89c012b5af12e1e0'
 GROUP BY questions.question_id, questions.username, questions.question, userlog.user_mail
 ORDER BY questions.username, questions.question_id

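1 answer:

Answer 0 (score: 2)

If you want the rows from ted.csv whose first and fourth column values (indexes 0 and 3) do not both match any row in ted2.csv, build a set of those value pairs from ted2.csv and check each row of ted.csv against it before writing: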

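import csv

with open("ted.csv", newline="") as f1, open("ted2.csv", newline="") as f2, \
        open("foo.csv", "w", newline="") as out:
    r1, r2 = csv.reader(f1), csv.reader(f2)
    # Keys (column 0, column 3) that occur in ted2.csv
    st = set((row[0], row[3]) for row in r2)
    wr = csv.writer(out)
    # Write only the ted.csv rows whose key is not in ted2.csv
    for row in r1:
        if (row[0], row[3]) not in st:
            wr.writerow(row)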
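
To try the filter without files on disk, here is a minimal sketch that runs on in-memory data. The helper name `rows_not_in` and the sample rows are made up for illustration; only the (column 0, column 3) key comes from the question.

```python
import csv
import io

def rows_not_in(source, other, key_cols=(0, 3)):
    """Return the rows of `source` whose key columns do not occur in `other`.

    Hypothetical helper: `source` and `other` are any file-like objects
    containing CSV text.
    """
    seen = {tuple(row[i] for i in key_cols) for row in csv.reader(other)}
    return [row for row in csv.reader(source)
            if tuple(row[i] for i in key_cols) not in seen]

# Made-up sample data standing in for ted.csv and ted2.csv
ted = io.StringIO("a,1,x,10\nb,2,y,20\nc,3,z,30\n")
ted2 = io.StringIO("a,9,q,10\nc,3,z,99\n")

result = rows_not_in(ted, ted2)
print(result)
```

Only the row starting with `a` is dropped here, because it is the only one whose columns 0 and 3 both match a row in the second data set. The row starting with `c` is kept since its fourth column differs.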

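If you actually want something like a symmetric difference, that is, the rows from both files whose key is unique to that file, build a set of the first and fourth column values (indexes 0 and 3) from each of the two files: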

import csv
from itertools import chain

with open("ted.csv", newline="") as f1, open("ted2.csv", newline="") as f2, \
        open("foo.csv", "w", newline="") as out:
    r1, r2 = csv.reader(f1), csv.reader(f2)
    # First pass: collect the (column 0, column 3) keys of each file
    st1 = set((row[0], row[3]) for row in r1)
    st2 = set((row[0], row[3]) for row in r2)
    # Rewind both files and create fresh readers for the second pass
    f1.seek(0)
    f2.seek(0)
    r1, r2 = csv.reader(f1), csv.reader(f2)
    wr = csv.writer(out)
    # Rows whose key appears in only one of the two files
    output1 = (row for row in r1 if (row[0], row[3]) not in st2)
    output2 = (row for row in r2 if (row[0], row[3]) not in st1)
    for row in chain(output1, output2):
        wr.writerow(row)
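
If both files fit comfortably in memory, one possible variation reads each file into a list once, which avoids rewinding the file handles. The following is only a sketch under that assumption; `key`, `symmetric_diff`, and the sample rows are hypothetical names and data, not part of the answer above.

```python
def key(row, cols=(0, 3)):
    """Build the comparison key from the chosen column indexes."""
    return tuple(row[i] for i in cols)

def symmetric_diff(rows_a, rows_b, cols=(0, 3)):
    """Return rows whose key occurs in only one of the two row lists."""
    keys_a = {key(r, cols) for r in rows_a}
    keys_b = {key(r, cols) for r in rows_b}
    only_a = [r for r in rows_a if key(r, cols) not in keys_b]
    only_b = [r for r in rows_b if key(r, cols) not in keys_a]
    return only_a + only_b

# Made-up rows standing in for the parsed contents of ted.csv and ted2.csv
rows_a = [["a", "1", "x", "10"], ["b", "2", "y", "20"]]
rows_b = [["a", "9", "q", "10"], ["c", "3", "z", "30"]]

diff = symmetric_diff(rows_a, rows_b)
print(diff)
```

With real files, `rows_a` and `rows_b` could be produced with `list(csv.reader(f1))` and `list(csv.reader(f2))`, and `diff` written out with `csv.writer(out).writerows(diff)`.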