Question

Here are my data points.

data = [[25593.14, 39426.66],
        [98411.00, 81869.75],
        [71498.80, 62495.80],
        [38068.00, 54774.00],
        [58188.00, 43453.65],
        [10220.00, 18465.25]]

The data above is my data model.

The x-coordinates are the "salary" values and the y-coordinates are the "expenses" values.

I want to predict the expenses when I supply a "salary", i.e. an x-coordinate.

Here is my sample code. Please help me.

from sklearn.linear_model import LinearRegression

data = [[25593.14, 39426.66],
        [98411.00, 81869.75],
        [71498.80, 62495.80],
        [38068.00, 54774.00],
        [58188.00, 43453.65],
        [10220.00, 18465.25]]

salary=[]
expenses=[]

for dataset in data:
    # import pdb; pdb.set_trace()
    salary.append(dataset[0])
    expenses.append(dataset[1])

model = LinearRegression()
model.fit(salary, expenses)
prediction = model.predict([10200.00])
print(prediction)

The error I get:

ValueError: Expected 2D array, got 1D array instead:
array=[ 25593.14  98411.    71498.8   38068.    58188.    10220.  ].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample


Answer 1

As the comments suggest, something like this would be a better way to handle the data you want to feed to a scikit-learn model. Another example is here.

from sklearn.linear_model import LinearRegression
import numpy as np

data = np.array(
        [[25593.14, 39426.66],
        [98411.00, 81869.75],
        [71498.80, 62495.80],
        [38068.00, 54774.00],
        [58188.00, 43453.65],
        [10220.00, 18465.25]]
).T

salary = data[0].reshape(-1, 1)
expenses = data[1]

model = LinearRegression()
model.fit(salary, expenses)
prediction = model.predict(np.array([10200.00]).reshape(-1, 1))
print(prediction)

Answer 2

Using list comprehensions instead of loops is more Pythonic (and expected).
Also, as your error states, you need to reshape your data.

Working code:

from sklearn.linear_model import LinearRegression
import numpy as np

dataset = [[25593.14, 39426.66],
           [98411.00, 81869.75],
           [71498.80, 62495.80],
           [38068.00, 54774.00],
           [58188.00, 43453.65],
           [10220.00, 18465.25]]

salary = np.array([data[0] for data in dataset]).reshape(-1,1)
expenses = np.array([data[1] for data in dataset]).reshape(-1,1)
model = LinearRegression()
model.fit(salary, expenses)
# predict expects a 2D array of shape (n_samples, n_features), not a bare scalar
prediction = model.predict(np.array([[10200.00]]))
print(prediction)
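
The reshaping that the error message asks for can be checked on its own, without fitting a model. The snippet below is a minimal sketch that uses only NumPy and the salary values from the question; the variable names `X` and `single_sample` are illustrative and do not come from the original code.

```python
import numpy as np

# Salary values from the question, as a flat 1D array
salary = np.array([25593.14, 98411.00, 71498.80, 38068.00, 58188.00, 10220.00])
print(salary.shape)  # 1D: six values, no feature axis

# reshape(-1, 1): one feature per row, NumPy infers the number of rows
X = salary.reshape(-1, 1)
print(X.shape)  # six samples, one feature each

# reshape(1, -1): a single sample, NumPy infers the number of features
single_sample = np.array([10200.00]).reshape(1, -1)
print(single_sample.shape)  # one sample, one feature
```

With a single feature, the training data needs the `reshape(-1, 1)` form, and a single value passed to `predict` needs to end up with shape `(1, 1)`.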

Answer 3

A quick fix: replace the model.fit line with this one

# requires: import numpy as np
model.fit(np.array(salary).reshape(-1, 1), np.array(expenses))

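X should be an array of arrays, array([arr1, arr2, arr3, ...]), where each of arr1, arr2, ... is an array holding at least one feature. The same goes for y: it should be an array containing the list of target values.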
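
The layout described above can be illustrated with plain Python lists, which scikit-learn estimators accept as array-like input. This is only a sketch built from the question's data: the names `X`, `y`, and `new_X` are illustrative, and the model fitting itself is left out.

```python
data = [[25593.14, 39426.66],
        [98411.00, 81869.75],
        [71498.80, 62495.80],
        [38068.00, 54774.00],
        [58188.00, 43453.65],
        [10220.00, 18465.25]]

# X: one inner list per sample, each holding that sample's features (just salary here)
X = [[row[0]] for row in data]

# y: one target value (expenses) per sample
y = [row[1] for row in data]

# A value to predict for must use the same layout: a list of samples
new_X = [[10200.00]]

print(X[:2])  # first two samples, each a one-element list
print(y[:2])  # the matching targets
```

Lists in this shape can be passed as `model.fit(X, y)` and `model.predict(new_X)`.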

Linear regression using scikit-learn (LinearRegression)
