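Comparing CSV files with Python

Time: 2014-04-09 09:54:47

Tags: python csv comparison

I need help comparing two CSV files: records that match in both files should be written to one output file, and records that do not match should be written to a second output file. For example, my first CSV file has two columns, Name and Salary, with this data: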

A 20000
B 15000
C 10000
D 5000

The second CSV file also contains Name and Salary, with this data:

A 40000
D 10000
B 15000

My output should be two files. The first contains the matching records from file1 and file2, here B 15000 B 15000; the second contains the non-matching records:

A 20000,A 40000
C 10000,-------(no record in file2)
D 5000, D 10000

1 answer:

Answer 0 (score: 0)

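def read_records(path):
    # map each name to its salary; fields are assumed to be separated by
    # whitespace, as in the sample data (use line.split(",") for commas)
    records = {}
    with open(path, "r") as f_in:
        for line in f_in:
            fields = line.split()
            if len(fields) >= 2:  # skip blank or incomplete lines
                records[fields[0]] = fields[1]
    return records

f1_dict = read_records("f1.csv")
f2_dict = read_records("f2.csv")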

f_same = open("same.txt","w")
f_different = open("different.txt","w")

for k1 in f1_dict.keys():
    if k1 not in f2_dict:
        # the name only exists in the first file
        f_different.write("{0} {1}, ------\n".format(k1, f1_dict[k1]))
    elif f2_dict[k1] == f1_dict[k1]:
        # same name and same salary in both files
        f_same.write("{0} {1}, {0} {2}\n".format(k1, f1_dict[k1], f2_dict[k1]))
    else:
        # same name, different salary
        f_different.write("{0} {1}, {0} {2}\n".format(k1, f1_dict[k1], f2_dict[k1]))

f_same.close()
f_different.close()
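
To check the comparison logic without touching any files, it can be pulled into a function that works on two plain dictionaries. This is only a sketch: the name compare_records and the in-memory sample data (copied from the question) are assumptions made for illustration, not part of the original answer.

```python
def compare_records(first, second):
    """Split the records of `first` into matching and non-matching lines."""
    same, different = [], []
    for name in sorted(first):  # sorted so the output order is predictable
        left = "{0} {1}".format(name, first[name])
        if name not in second:
            # the name only exists in the first file
            different.append("{0}, ------".format(left))
        elif second[name] == first[name]:
            # same name and same salary in both files
            same.append("{0}, {1}".format(left, left))
        else:
            # same name, different salary
            different.append("{0}, {1} {2}".format(left, name, second[name]))
    return same, different


# sample data from the question, keyed by name (values kept as strings)
file1 = {"A": "20000", "B": "15000", "C": "10000", "D": "5000"}
file2 = {"A": "40000", "D": "10000", "B": "15000"}

same, different = compare_records(file1, file2)
print(same)       # ['B 15000, B 15000']
print(different)  # ['A 20000, A 40000', 'C 10000, ------', 'D 5000, D 10000']
```

Like the loop above, this only walks the first dictionary, so a name that appears only in the second file is not reported.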

Edit: to sort the dictionary by key before looping over the keys (for k1 in f1_dict.keys()):

# get the keys as a sorted list; sorted() works in Python 2 and 3,
# whereas keys() in Python 3 returns a view that has no sort() method
my_keys = sorted(f1_dict.keys())
# use sorted list
for k1 in my_keys:
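
A small self-contained illustration of iterating in key order, assuming a hypothetical dictionary named salaries in place of f1_dict:

```python
# hypothetical sample data, deliberately inserted out of order
salaries = {"C": "10000", "A": "20000", "D": "5000", "B": "15000"}

# sorted() accepts any iterable, including a dict (it iterates over the keys)
ordered_names = sorted(salaries)

for name in ordered_names:
    print(name, salaries[name])
```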

To sort the dict by value before looping over the keys:

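# pair each key with its value: (key1, value1), (key2, value2), ...
my_zip = zip(f1_dict.keys(), f1_dict.values())
# sorted() returns a list of those tuples, ordered by the second element
my_sorted_list = sorted(my_zip, key=lambda item: item[1])

key=lambda item: item[1] means that my_zip is sorted by the second element (index 1, the value) of each tuple. The values were read from the file as strings, so they are compared as text; use key=lambda item: int(item[1]) to order the salaries numerically.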

# use sorted list
for t in my_sorted_list:
    # t is a tuple
    k1 = t[0]
    value = t[1]
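
The difference between sorting the values as text and as numbers can be seen in a short sketch. The dictionary salaries is a hypothetical stand-in for f1_dict, and dict.items() is used here as a shorter equivalent of zipping keys() with values().

```python
# hypothetical stand-in for f1_dict; values are strings, as read from the file
salaries = {"A": "20000", "B": "15000", "C": "10000", "D": "5000"}

# text comparison: "5000" sorts after "20000" because "5" > "2"
by_text = sorted(salaries.items(), key=lambda item: item[1])

# numeric comparison: convert each value to int inside the key function
by_number = sorted(salaries.items(), key=lambda item: int(item[1]))

print([name for name, _ in by_text])    # ['C', 'B', 'A', 'D']
print([name for name, _ in by_number])  # ['D', 'C', 'B', 'A']
```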