Sklearn: ValueError: Found input variables with inconsistent numbers of samples: [1, 6]

Time: 2017-05-25 13:20:07

Tags: python scikit-learn

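import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([1994., 1995., 1996., 1997., 1998., 1999.])  # 1-D, shape (6,)
y = np.array([1.2, 2.3, 3.4, 4.5, 5.6, 6.7])              # 1-D, shape (6,)
clf = LinearRegression()
clf.fit(X, y)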

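This gives the error above. Both X and y are numpy arrays.

How can I get rid of this error?

I tried the approach given here and reshaped X and y using X.reshape((-1,1)) and y.reshape((-1,1)), but it did not work.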

2 Answers:

Answer 0: (score: 2)

This works fine for me. Make sure the arrays are numpy arrays before reshaping.
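
A minimal sketch of what this advice might look like, using the data from the question. The name X_2d is made up for illustration. One possible reason the reshape in the question had no effect (an assumption, since that code is not shown) is that reshape returns a new array rather than modifying X in place, so the result has to be assigned before calling fit.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Make sure both inputs are numpy arrays (np.asarray also accepts plain lists).
X = np.asarray([1994., 1995., 1996., 1997., 1998., 1999.])
y = np.asarray([1.2, 2.3, 3.4, 4.5, 5.6, 6.7])

# reshape returns a new array; X itself stays 1-D unless the result is assigned.
X.reshape((-1, 1))
print(X.shape)     # (6,)

# X_2d is a hypothetical name: 6 samples, 1 feature each.
X_2d = X.reshape((-1, 1))
print(X_2d.shape)  # (6, 1)

clf = LinearRegression()
clf.fit(X_2d, y)   # X is 2-D (n_samples, n_features); y can stay 1-D

# predict also expects a 2-D array of samples
prediction = clf.predict(np.array([[2000.]]))
print(prediction)
```

Only X needs the extra dimension here; scikit-learn accepts a 1-D y with one target value per sample.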


Answer 1: (score: -1)

import pandas as pd
import numpy as np
from sklearn import linear_model
from sklearn.model_selection import train_test_split  # sklearn.cross_validation was removed in scikit-learn 0.20

df_house = pd.read_csv('CSVFiles/kc_house_data.csv', index_col=0, engine='c')

df_house.drop(df_house.columns[[1, 0, 10, 11, 12, 13, 14, 15, 16, 17, 18]], axis=1, inplace=True)

reg = linear_model.LinearRegression()
df_y = df_house[df_house.columns[1:2]]

df_house.drop(df_house.columns[[6, 7, 8, 5]], axis=1, inplace=True)

x_train, x_test, y_train, y_test = train_test_split(df_house, df_y, test_size=0.1, random_state=7)

print(x_train.shape, y_train.shape)

reg.fit(x_train, x_test)  # bug: x_test is passed as the target; this should be y_train

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=False)

My shapes are:
(19451, 5) (19451, 1)

ValueError: Found input variables with inconsistent numbers of samples: [19451, 2162]
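
The two numbers in this message match the row counts of x_train (19451) and x_test (2162), which points at the call reg.fit(x_train, x_test): the test features were passed where the training target belongs. The sketch below reproduces the mismatch and the likely fix on a small synthetic dataset; the arrays features and target are invented stand-ins for df_house and df_y, since the CSV file is not available here.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for df_house / df_y (hypothetical data, 100 rows, 5 features).
rng = np.random.default_rng(7)
features = rng.normal(size=(100, 5))
target = features @ np.array([1.0, 2.0, 3.0, 4.0, 5.0]) + 0.5

x_train, x_test, y_train, y_test = train_test_split(
    features, target, test_size=0.1, random_state=7
)
print(x_train.shape, x_test.shape)  # (90, 5) (10, 5)

reg = LinearRegression()

# Passing x_test as the target: 90 training rows vs. 10 "target" rows.
error_message = None
try:
    reg.fit(x_train, x_test)
except ValueError as exc:
    error_message = str(exc)
    print(error_message)

# The likely intended call: features and target from the same split.
reg.fit(x_train, y_train)
print(reg.score(x_test, y_test))
```

With the answer's own variables, the corresponding change would be reg.fit(x_train, y_train), whose shapes (19451, 5) and (19451, 1) agree on the number of samples.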