Question

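I am trying to write a condition based on whether the value in a CSV column equals a particular string.

This is my code, which should do something whenever the cell in the "type" column is "Question":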

import csv

f = open('/Users/samuelfinegold/Documents/harvard/edXresearch/snaCreationFiles/time_series/time_series.csv','rU')
reader = csv.DictReader(f, delimiter=',')

for line in reader:
    if line['type'] == 'Question':
        print "T"

The error I get: AttributeError: DictReader instance has no attribute '__getitem__'

CSV:

post_id thread_id   author_id   post_content  types       time     votes_up votes_down posters  
1           0           Jan     NULL          Question    3/1/12 10:45  5   1   Jan, Janet, Jack
2           0           Janet   NULL          Answer      3/1/12 11:00  2   1   Jan, Janet, Jack
3           0           Jack    NULL          Comment     3/2/12 8:00   0   0   Jan, Janet, Jack
4           1           Jason   NULL          Question    3/4/12 9:00   3   1   Jason, Jan, Janet
5           1           Jan     NULL          Answer      3/7/12 1:00   3   1   Jason, Jan, Janet
6           1           Janet   NULL          Answer      3/7/12 2:00   1   2   Jason, Jan, Janet

Answer 1

I put the data you provided into a comma-separated CSV file and ran your code against it. I got a KeyError for 'type', so I changed if line['type'] to if line['types'], and it worked.

My code:

import csv

f = open('test.csv','rU')
reader = csv.DictReader(f,delimiter=',')

for line in reader:
    print line
    if line['types'] == 'Question':
        print 'The above line has type question'

My output:

{'thread_id': '0', 'posters  ': 'Jan', None: ['Janet', 'Jack'], 'post_id': '1', 'post_content': 'NULL', 'time': '3/1/12 10:45', 'votes_down': '1', 'votes_up': '5', 'author_id': 'Jan', 'types': 'Question'}
The above line has type question
{'thread_id': '0', 'posters  ': 'Jan', None: ['Janet', 'Jack'], 'post_id': '2', 'post_content': 'NULL', 'time': '3/1/12 11:00', 'votes_down': '1', 'votes_up': '2', 'author_id': 'Janet', 'types': 'Answer'}
{'thread_id': '0', 'posters  ': 'Jan', None: ['Janet', 'Jack'], 'post_id': '3', 'post_content': 'NULL', 'time': '3/2/12 8:00', 'votes_down': '0', 'votes_up': '0', 'author_id': 'Jack', 'types': 'Comment'}
{'thread_id': '1', 'posters  ': 'Jason', None: ['Jan', 'Janet'], 'post_id': '4', 'post_content': 'NULL', 'time': '3/4/12 9:00', 'votes_down': '1', 'votes_up': '3', 'author_id': 'Jason', 'types': 'Question'}
The above line has type question
{'thread_id': '1', 'posters  ': 'Jason', None: ['Jan', 'Janet'], 'post_id': '5', 'post_content': 'NULL', 'time': '3/7/12 1:00', 'votes_down': '1', 'votes_up': '3', 'author_id': 'Jan', 'types': 'Answer'}
{'thread_id': '1', 'posters  ': 'Jason', None: ['Jan', 'Janet'], 'post_id': '6', 'post_content': 'NULL', 'time': '3/7/12 2:00', 'votes_down': '2', 'votes_up': '1', 'author_id': 'Janet', 'types': 'Answer'}

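You have a key named None because the data in the posters column is itself comma-separated, so only the first value in that column is assigned to the posters key; the remaining values are collected in a list under the key None.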
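To illustrate the None key, here is a minimal sketch written for Python 3 (the thread's own code is Python 2). The two in-memory samples are hypothetical and only mimic the shape of the question's data; the second one assumes you are able to quote the posters field when the file is written.

```python
import csv
import io

# Hypothetical samples shaped like the question's data (not the real file).
unquoted = (
    "post_id,types,posters\n"
    "1,Question,Jan, Janet, Jack\n"
)
quoted = (
    "post_id,types,posters\n"
    '1,Question,"Jan, Janet, Jack"\n'
)

# Without quotes the row has more fields than the header has names,
# so DictReader stores the surplus under its restkey, which defaults to None.
unquoted_row = next(csv.DictReader(io.StringIO(unquoted)))

# With the field quoted, the embedded commas are part of a single value.
quoted_row = next(csv.DictReader(io.StringIO(quoted)))

print(unquoted_row["posters"], unquoted_row[None])
print(quoted_row["posters"])
```

If the file cannot be changed, passing a restkey argument to csv.DictReader at least gives the overflow list a readable name instead of None.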

I'm still not sure why you are getting the AttributeError, but a simple change to the code should be enough.
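One possible explanation, offered only as a guess because the code as posted does not reproduce the error: that message is what Python 2 reports when the DictReader object itself is indexed (for example reader['type']) instead of a row taken from it. The sketch below is written for Python 3, where the same mistake raises a TypeError instead; the sample data is hypothetical.

```python
import csv
import io

# Hypothetical sample; the column is named "types", as in the question's header.
sample = "post_id,types\n1,Question\n2,Answer\n"
reader = csv.DictReader(io.StringIO(sample))

# Indexing the reader itself fails: it is an iterator of rows, not a row.
try:
    reader["types"]
    indexing_error = None
except TypeError as exc:
    indexing_error = exc

# Indexing each row (a dict) works as intended.
question_ids = [row["post_id"] for row in reader if row["types"] == "Question"]

print(type(indexing_error).__name__)
print(question_ids)
```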

Answer 2

Python has a module in its standard library for handling CSV files.

https://www.google.com/search?q=python+csv

First hit:

http://docs.python.org/library/csv.html

Answer 3

Maybe you should check whether your data has a header row:

csv.Sniffer().has_header(sample)
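has_header is a method of csv.Sniffer in the standard library, and it works on a text sample rather than on an open reader. A rough sketch for Python 3, using two hypothetical samples; the check is a heuristic, so treat its answer as a hint rather than a guarantee.

```python
import csv

# Hypothetical samples: the same rows with and without a header line.
with_header = "post_id,thread_id,types\n1,0,Question\n2,0,Answer\n4,1,Question\n"
without_header = "1,0,Question\n2,0,Answer\n4,1,Question\n"

sniffer = csv.Sniffer()

# has_header compares the first row against the rows that follow it.
header_detected = sniffer.has_header(with_header)
header_missing = sniffer.has_header(without_header)

print(header_detected, header_missing)
```

For a real file, a common approach is to read the first few kilobytes with f.read(), pass that to has_header, and then call f.seek(0) before building the reader.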

How to check the value of a column in a spreadsheet in Python