I want to convert multiple CSV files into a text file (from each CSV file I only need the first five elements of the first column). Here is my code.
import pandas as pd
import os

for root, dirs, files in os.walk("./data_v6/level3/"):
    count = 1
    for dir in dirs:
        print(dir)
        count = count + 1
        print(count)
        df = pd.read_csv('data_v6/level3/'+dir+'/tweets_topic.csv', usecols=[0])
        print(df.loc[0:4])
        #print(df)
        df.to_csv('data_v6/level3/topic_DIC.txt', header=None, index=None, sep=' ', mode='a')
But it doesn't work. I get this error:
File "F:/RUN/RUN/GetDictionary.py", line 11, in <module>
print(df.loc[0:4])
File "F:\anaconda\lib\site-packages\pandas\core\indexing.py", line 879, in __getitem__
return self._getitem_axis(maybe_callable, axis=axis)
File "F:\anaconda\lib\site-packages\pandas\core\indexing.py", line 1088, in _getitem_axis
return self._get_slice_axis(key, axis=axis)
File "F:\anaconda\lib\site-packages\pandas\core\indexing.py", line 1122, in _get_slice_axis
indexer = labels.slice_indexer(
File "F:\anaconda\lib\site-packages\pandas\core\indexes\base.py", line 4966, in slice_indexer
start_slice, end_slice = self.slice_locs(start, end, step=step, kind=kind)
File "F:\anaconda\lib\site-packages\pandas\core\indexes\base.py", line 5167, in slice_locs
start_slice = self.get_slice_bound(start, "left", kind)
File "F:\anaconda\lib\site-packages\pandas\core\indexes\base.py", line 5079, in get_slice_bound
label = self._maybe_cast_slice_bound(label, side, kind)
File "F:\anaconda\lib\site-packages\pandas\core\indexes\base.py", line 5031, in _maybe_cast_slice_bound
self._invalid_indexer("slice", label)
File "F:\anaconda\lib\site-packages\pandas\core\indexes\base.py", line 3267, in _invalid_indexer
raise TypeError(
TypeError: cannot do slice indexing on Index with these indexers [0] of type int
Answer 0: (score: 0)
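You are reaching for a heavy-duty hammer drill where a simple screwdriver would do. Pandas is indeed a very powerful library, and its automatic type detection handles csv files well, but you don't need any of that here: you only need the first few fields of the first column.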
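If you would rather stay with pandas, the TypeError can likely be avoided by slicing by position instead of by label. `.loc[0:4]` looks up the labels 0 to 4 in the index, and the error message suggests the index of your DataFrame is not integer-based for at least one file; `.iloc[0:5]` (or `.head(5)`) takes the first five rows whatever the index holds. A minimal sketch, assuming a made-up sample in place of one `tweets_topic.csv` (the column names and the string index are hypothetical):

```python
import io

import pandas as pd

# Hypothetical sample standing in for one tweets_topic.csv file.
sample = io.StringIO(
    "topic,weight\n"
    "alpha,1\nbeta,2\ngamma,3\ndelta,4\nepsilon,5\nzeta,6\n"
)

df = pd.read_csv(sample, usecols=[0])

# Give the frame a non-integer index to mimic the situation in the traceback.
df.index = [f"r{i}" for i in range(len(df))]

# .iloc slices by position and excludes the end point, so 0:5 is five rows.
first_five = df.iloc[0:5]

# Write only the selected rows, not the whole DataFrame.
out = io.StringIO()
first_five.to_csv(out, header=False, index=False)
print(out.getvalue())
```

Note that the code in the question passes `df` rather than the slice to `to_csv`, so even without the error it would write every row of the column.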
The csv module is enough here:
import csv
import os

with open('data_v6/level3/topic_DIC.txt', 'a') as outfile:
    for root, dirs, files in os.walk("./data_v6/level3/"):
        count = 1
        for dir in dirs:
            print(dir)
            count = count + 1
            print(count)
            with open('data_v6/level3/'+dir+'/tweets_topic.csv') as fd:
                rd = csv.reader(fd)
                try:
                    _ = next(rd)  # skip header line
                except StopIteration:
                    print('data_v6/level3/'+dir+'/tweets_topic.csv is empty')
                    continue
                # first five data rows, as asked in the question
                for i in range(5):
                    try:
                        row = next(rd)
                        print(row[0], file=outfile)
                    except StopIteration:
                        break  # the file has fewer than five data rows
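The same idea can be packaged more compactly with `itertools.islice`, which stops after a fixed number of rows without any explicit `StopIteration` handling. This is only a sketch under some assumptions: the helper names `first_column_head` and `collect` are made up for illustration, and it looks only at the immediate subdirectories of the base directory instead of walking the whole tree.

```python
import csv
import itertools
import os


def first_column_head(csv_path, n=5):
    """Return the first-column values of up to n data rows, skipping the header."""
    with open(csv_path, newline="") as fd:
        reader = csv.reader(fd)
        next(reader, None)  # skip the header; the default avoids StopIteration on an empty file
        # islice takes at most n rows; blank rows among them are dropped
        return [row[0] for row in itertools.islice(reader, n) if row]


def collect(base_dir, out_path, name="tweets_topic.csv", n=5):
    """Append the head of the first column of every base_dir/<subdir>/<name> to out_path."""
    with open(out_path, "a") as outfile:
        for entry in sorted(os.listdir(base_dir)):
            csv_path = os.path.join(base_dir, entry, name)
            if os.path.isfile(csv_path):  # skips plain files and subdirs without the csv
                for value in first_column_head(csv_path, n):
                    print(value, file=outfile)
```

With the layout from the question, the call would be `collect("data_v6/level3", "data_v6/level3/topic_DIC.txt")`.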