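csv module problem with a CSV file that contains extra commas

Time: 2017-08-20 00:53:11

Tags: python csv

I'm following the book Foundations for Analytics with Python by Clinton W. Brownley (O'Reilly Media Inc.).

Chapter 2 - Read and Write a CSV File (Part 2): Base Python, with csv module

The script is as follows: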

#!/usr/bin/env python3
import sys
import csv

input_file = sys.argv[1]
output_file = sys.argv[2]

with open(input_file, 'r', newline='') as csv_input_file:
    with open(output_file, 'w', newline='') as csv_output_file:

        filereader = csv.reader(csv_input_file, delimiter=',')
        filewriter = csv.writer(csv_output_file, delimiter=',')

        for row_list in filereader:
            print(row_list)
            filewriter.writerow(row_list)

The input file has fields that contain commas (the dollar amounts in the last two rows):

Supplier Name,Invoice Number,Part Number,Cost,Purchase Date
Supplier X,001-1001,2341,$500.00,1/20/14
Supplier X,001-1001,2341,$500.00,1/20/14
Supplier X,001-1001,5467,$750.00,1/20/14
Supplier X,001-1001,5467,$750.00,1/20/14
Supplier Y,50-9501,7009,$250.00,1/30/14
Supplier Y,50-9501,7009,$250.00,1/30/14
Supplier Y,50-9505,6650,$125.00,2/3/14
Supplier Y,50-9505,6650,$125.00,2/3/14
Supplier Z,920-4803,3321,$615.00,2/3/14
Supplier Z,920-4804,3321,$615.00,2/10/14
Supplier Z,920-4805,3321,$6,015.00,2/17/14
Supplier Z,920-4806,3321,$1,006,015.00,2/24/14

Running the script produces the following output in the terminal:

['Supplier Name', 'Invoice Number', 'Part Number', 'Cost', 'Purchase Date']
['Supplier X', '001-1001', '2341', '$500.00', '1/20/14']
['Supplier X', '001-1001', '2341', '$500.00', '1/20/14']
['Supplier X', '001-1001', '5467', '$750.00', '1/20/14']
['Supplier X', '001-1001', '5467', '$750.00', '1/20/14']
['Supplier Y', '50-9501', '7009', '$250.00', '1/30/14']
['Supplier Y', '50-9501', '7009', '$250.00', '1/30/14']
['Supplier Y', '50-9505', '6650', '$125.00', '2/3/14']
['Supplier Y', '50-9505', '6650', '$125.00', '2/3/14']
['Supplier Z', '920-4803', '3321', '$615.00', '2/3/14']
['Supplier Z', '920-4804', '3321', '$615.00', '2/10/14']
['Supplier Z', '920-4805', '3321', '$6', '015.00', '2/17/14']
['Supplier Z', '920-4806', '3321', '$1', '006', '015.00', '2/24/14']

But the book shows this as the expected output:

[Image: screenshot of the expected output from the book]

What am I doing wrong?

2 answers:

Answer 0: (score: 6)

You have three ways to correct the output:

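  1. Remove the commas from the amounts.
  2. Use quoting: enclose the amounts in double quotes. For example, in the first row $500.00 would become "$500.00". Quoting is a popular technique. When using quotes, change the read statement to: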

    filereader = csv.reader(csv_input_file, delimiter=',', quotechar='"')

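  3. Use a different delimiter. You don't have to use a comma as the delimiter. To use this approach, change the delimiter in the input file to another character. I like pipe-delimited files because the pipe character rarely appears in ordinary text.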

    filereader = csv.reader(csv_input_file, delimiter='|')
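As a minimal sketch of options 2 and 3, the snippet below parses one quoted row and one pipe-delimited row from in-memory strings. The sample rows and the variable names are made up for illustration, modeled on the question's data; note that `quotechar='"'` is already the csv module's default, so spelling it out only makes the intent explicit.

```python
import csv
import io

# Hypothetical sample row: the amount containing a comma is wrapped in double quotes.
quoted_text = 'Supplier Z,920-4805,3321,"$6,015.00",2/17/14\n'

# quotechar='"' is the default; it is written out here only for clarity.
quoted_rows = list(csv.reader(io.StringIO(quoted_text), delimiter=',', quotechar='"'))
print(quoted_rows[0])

# Hypothetical sample row using a pipe as the delimiter instead of a comma.
piped_text = 'Supplier Z|920-4806|3321|$1,006,015.00|2/24/14\n'

piped_rows = list(csv.reader(io.StringIO(piped_text), delimiter='|'))
print(piped_rows[0])
```

In both cases the amount should come back as a single field with its commas intact, so each row has five fields again.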

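Answer 1: (score: 0)

Just double-check: the screenshot in Figure 2-7 shows Excel's interface.

If you edit the csv file in an application such as Excel or Numbers and then export it as csv, any cell that contains a comma will be enclosed in double quotes.

Thank you all for the detailed explanations!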
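Python's own `csv.writer` behaves in a similar way: with its default `quoting=csv.QUOTE_MINIMAL`, a field that contains the delimiter is wrapped in double quotes when written. The sketch below writes one row to an in-memory buffer and reads it back; the row is taken from the question's data and the variable names are illustrative only.

```python
import csv
import io

# One row whose amount contains a comma, held as a single list element.
row = ['Supplier Z', '920-4805', '3321', '$6,015.00', '2/17/14']

# Write the row to an in-memory buffer using the writer's default settings
# (quoting=csv.QUOTE_MINIMAL, lineterminator='\r\n').
buffer = io.StringIO()
csv.writer(buffer).writerow(row)
written = buffer.getvalue()
print(repr(written))

# Read the written text back to confirm the amount survives as one field.
round_trip = next(csv.reader(io.StringIO(written)))
print(round_trip)
```

So a file produced this way can be fed to the script from the question without the amounts being split.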