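csv.Error: new-line character seen in unquoted field

Time: 2016-05-16 22:01:17

Tags: python numpy

I am using some sample code (below) to test a Naive Bayes classifier, and I get the following error from line 22: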

_csv.Error: new-line character seen in unquoted field - do you need to open the file in universal-newline mode?

Here is a sample row from the CSV file:

b8:27:eb:38:72:a7,df598b5eb8f4,5/9/16 14:47,154aec250ef6,-84,outside

Code sample:

from sklearn.preprocessing import LabelBinarizer
import numpy as np
from sklearn import naive_bayes
import csv
import random
from sklearn import metrics
import urllib
url = "example.com"
webpage = urllib.urlopen(url)
# download the file
#raw_data = urllib.urlopen(url)

datareader = csv.reader(webpage) #line 22 is this one

ct = 0;
for row in datareader:
  ct = ct+1
webpage = urllib.urlopen(url)
datareader = csv.reader(webpage)
data = np.array(-1*np.ones((ct,6),float),object);
k=0;
for row in datareader:
    data[k,:] = np.array(row)
    k = k+1;

featnames = np.array(['unti','dongle','timestamp','tracker','rssi','label'],str)

keys = [[]]*np.size(data,1)
numdata = -1*np.ones_like(data);

for k in range(np.size(data,1)):
    keys[k],garbage,numdata[:,k] = np.unique(data[:,k],True,True)

numrows = np.size(numdata,0);
numcols = np.size(numdata,1);
numdata = np.array(numdata, int)
xdata = numdata[:,:-1]
ydata = numdata[:,-1]

lbin = LabelBinarizer();
for k in range(np.size(xdata,1)):
 if k==0:
   xdata_ml = lbin.fit_transform(xdata[:,k]);
 else:
   xdata_ml = np.hstack((xdata_ml,lbin.fit_transform(xdata[:,k])))
ydata_ml = lbin.fit_transform(ydata)


allIDX = np.arange(numrows);
random.shuffle(allIDX);
holdout_number = numrows/10;
testIDX = allIDX[0:holdout_number];
trainIDX = allIDX[holdout_number:];

xtest = xdata_ml[testIDX,:];
xtrain = xdata_ml[trainIDX,:];
ytest = ydata[testIDX];
ytrain = ydata[trainIDX];

mnb = naive_bayes.MultinomialNB();
mnb.fit(xtrain,ytrain);
print "Classification accuracy of MNB =", mnb.score(xtest,ytest)

Can anyone help me find the error and suggest a fix?

2 answers:

Answer 0: (score: 0)

Are you on Windows? If so, it can be solved with:

explicit

Answer 1: (score: 0)

Some of the answers to "CSV new-line character seen in unquoted field error" point out that this happens with CSV files produced on a Mac.
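As a rough illustration of why Mac-produced files can trigger this error, here is a minimal Python 3 sketch (the question's code is Python 2). It assumes the file uses bare carriage returns ("\r") as line endings, which is not confirmed by the question. The helper `parse_rows`, the variable names, and the second data row are made up for the example; only the first row comes from the question.

```python
import csv

def parse_rows(lines):
    """Parse an iterable of text lines with csv.reader and return the rows."""
    return list(csv.reader(lines))

# Assumed payload: records separated by bare "\r" (old Mac style).
# The second record is invented for illustration.
raw_text = (
    "b8:27:eb:38:72:a7,df598b5eb8f4,5/9/16 14:47,154aec250ef6,-84,outside\r"
    "b8:27:eb:38:72:a7,df598b5eb8f4,5/9/16 14:48,154aec250ef6,-80,inside\r"
)

# If the whole payload reaches csv.reader as one "line", the "\r" sits in the
# middle of an unquoted field and the reader raises csv.Error.
try:
    parse_rows([raw_text])
    failed = False
except csv.Error as exc:
    failed = True
    print("csv.Error:", exc)

# str.splitlines() splits on "\r", "\n" and "\r\n", so each record becomes
# its own line before it is handed to csv.reader.
rows = parse_rows(raw_text.splitlines())
print(len(rows), rows[0][-1])
```

If this assumption about the line endings holds, the same normalization could be applied to text read from a URL, so downloading the file by hand would not be strictly necessary.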

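Could you try downloading the file manually to your Mac and opening it as a local file, as follows:

1) Save the file as CSV (MS-DOS comma-separated)

2) Save the file as CSV (Windows comma-separated)

3) Run the following script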

with open(csv_filename, 'rU') as csvfile:
    csvreader = csv.reader(csvfile)
    for row in csvreader:
        print ', '.join(row)
An explanation of 'rU', from https://www.python.org/dev/peps/pep-0278/:

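In a Python with universal newline support, the mode parameter of open() can also be "U", meaning "open for input as a text file with universal newline interpretation". Mode "rU" is also allowed, for symmetry with "rb".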
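The 'rU' mode applies to Python 2, which the question uses. In Python 3 the "U" flag was deprecated and later removed (Python 3.11), and the csv documentation recommends opening files with newline='' instead. The sketch below is a Python 3 equivalent under that assumption. The helper `read_csv_rows` and the temporary file contents are made up for the example.

```python
import csv
import os
import tempfile

def read_csv_rows(path):
    """Read a CSV file in Python 3, letting the csv module handle line endings."""
    # newline="" passes the original line endings through untranslated,
    # while the file is still split into lines on "\r", "\n" and "\r\n".
    with open(path, newline="") as csvfile:
        return list(csv.reader(csvfile))

# Write a small illustrative file that uses bare "\r" line endings.
fd, csv_path = tempfile.mkstemp(suffix=".csv")
with os.fdopen(fd, "wb") as f:
    f.write(b"a,b,c\r1,2,3\r")

try:
    rows = read_csv_rows(csv_path)
finally:
    os.remove(csv_path)

print(rows)
```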

Rationale

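Universal newline support is implemented in C, not in Python. This is done because we want files with a foreign newline convention to be importable, so a Python Lib directory can be shared over a remote file system connection, or between MacPython and Unix-Python on Mac OS X.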