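ValueError: Found input variables with inconsistent numbers of samples: [12600, 4200]

Time: 2018-06-24 15:25:17

Tags: python scikit-learn svm

In this code I perform a time-series split and then use scikit-learn to build an SVR model for prediction. My code is: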

from sklearn import preprocessing as pre 
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import TimeSeriesSplit
from sklearn import svm
from sklearn.preprocessing import MinMaxScaler



X_feature = wind_speed

X_feature = X_feature.reshape(-1, 1)  # Reshape the array from 1D to 2D (a single column)

y_label = Power
y_label = y_label.reshape(-1,1)

timeseries_split = TimeSeriesSplit(n_splits=3)
for train1, test1 in timeseries_split.split(X_feature):

    print("Training data:",train1, "Testing data test:", test1)
train1 = train1.reshape(-1,1)  # Reshape the array from 1D to 2D (a single column)
test1 = test1.reshape(-1,1)

timeseries_split = TimeSeriesSplit(n_splits=3)
for train, test in timeseries_split.split(y_label):
    print("Training data_1:",train, "Testing data test_1:", test)

scaler =pre.MinMaxScaler(feature_range=(0,1)).fit(train1)


scaled_wind_speed_train = scaler.transform(train1)
print("scaler", scaled_wind_speed_train)
scaled_wind_speed_test = scaler.transform(test1)

SVR_model = svm.SVR(kernel='rbf',C=100,gamma=.001).fit(scaled_wind_speed_train,train)

y_prediction = SVR_model.predict(y_label)


    print (y_prediction)
    SVR_model.score(scaled_wind_speed_test,train)

The output and error I get are:

Training data: [   0    1    2 ... 4197 4198 4199] Testing data test: [4200 4201 4202 ... 8397 8398 8399]
Training data: [   0    1    2 ... 8397 8398 8399] Testing data test: [ 8400  8401  8402 ... 12597 12598 12599]
Training data: [    0     1     2 ... 12597 12598 12599] Testing data test: [12600 12601 12602 ... 16797 16798 16799]
Training data_1: [   0    1    2 ... 4197 4198 4199] Testing data test_1: [4200 4201 4202 ... 8397 8398 8399]
Training data_1: [   0    1    2 ... 8397 8398 8399] Testing data test_1: [ 8400  8401  8402 ... 12597 12598 12599]
Training data_1: [    0     1     2 ... 12597 12598 12599] Testing data test_1: [12600 12601 12602 ... 16797 16798 16799]
scaler [[0.00000000e+00]
 [7.93713787e-05]
 [1.58742757e-04]
 ...
 [9.99841257e-01]
 [9.99920629e-01]
 [1.00000000e+00]]
/home/nbuser/anaconda3_501/lib/python3.6/site-packages/sklearn/utils/validation.py:475: DataConversionWarning: Data with input dtype int64 was converted to float64 by MinMaxScaler.
  warnings.warn(msg, DataConversionWarning)
[6153.41834275 6006.33852041 5997.57462806 ... 6569.44075144 6393.55696288
 6112.57831243]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-53-925646f8c16a> in <module>()
     43 
     44 print (y_prediction)
---> 45 SVR_model.score(scaled_wind_speed_test,train)
     46 
     47 

~/anaconda3_501/lib/python3.6/site-packages/sklearn/base.py in score(self, X, y, sample_weight)
    385         from .metrics import r2_score
    386         return r2_score(y, self.predict(X), sample_weight=sample_weight,
--> 387                         multioutput='variance_weighted')
    388 
    389 

~/anaconda3_501/lib/python3.6/site-packages/sklearn/metrics/regression.py in r2_score(y_true, y_pred, sample_weight, multioutput)
    528     """
    529     y_type, y_true, y_pred, multioutput = _check_reg_targets(
--> 530         y_true, y_pred, multioutput)
    531 
    532     if sample_weight is not None:

~/anaconda3_501/lib/python3.6/site-packages/sklearn/metrics/regression.py in _check_reg_targets(y_true, y_pred, multioutput)
     73 
     74     """
---> 75     check_consistent_length(y_true, y_pred)
     76     y_true = check_array(y_true, ensure_2d=False)
     77     y_pred = check_array(y_pred, ensure_2d=False)

~/anaconda3_501/lib/python3.6/site-packages/sklearn/utils/validation.py in check_consistent_length(*arrays)
    202     if len(uniques) > 1:
    203         raise ValueError("Found input variables with inconsistent numbers of"
--> 204                          " samples: %r" % [int(l) for l in lengths])
    205 
    206 

ValueError: Found input variables with inconsistent numbers of samples: [12600, 4200]

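I think the error may be in the line SVR_model.score(scaled_wind_speed_test,train), but I don't know how to fix it. I have edited the indentation to match the original exactly, but I am not sure whether some unintended indentation might be causing the error.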
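
For reference, the two numbers in the message are the lengths of the arrays that `score` compares: `train` from the last split has 12600 entries, while predicting on the test rows yields 4200 values. The following is a minimal sketch that reproduces the same kind of mismatch. It assumes synthetic data scaled down to 168 samples (so the lengths become 126 and 42); the arrays `X` and `y` are made up for illustration and are not the wind data from the question.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit
from sklearn.svm import SVR

# Synthetic stand-ins for the question's data (assumption): 168 samples
rng = np.random.RandomState(0)
X = rng.rand(168, 1)
y = 3.0 * X.ravel() + rng.normal(scale=0.1, size=168)

# After the loop ends, only the indices of the last split remain,
# which is also what happens with the loops in the question
for train_idx, test_idx in TimeSeriesSplit(n_splits=3).split(X):
    pass

model = SVR(kernel="rbf").fit(X[train_idx], y[train_idx])

try:
    # Test features (42 rows) paired with training targets (126 entries)
    model.score(X[test_idx], y[train_idx])
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(len(train_idx), len(test_idx))
print(error_message)
```

Passing features and targets from the same split to `score` avoids the length check failing.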

1 Answer:

Answer 0 (score: 3)

Assuming I have understood your code correctly, the following line should fix the error:

SVR_model.score(scaled_wind_speed_test,test)
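
Note also that `TimeSeriesSplit.split` yields row indices rather than data, so the code in the question appears to fit the scaler and the model on index values (the evenly spaced `scaler` output and the int64 conversion warning are consistent with this). Below is a minimal sketch of how the pieces might fit together. It assumes synthetic `wind_speed` and `power` arrays in place of the real measurements, and it scores each fold on test features and test targets taken from the same split.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVR

# Synthetic stand-ins for the question's data (assumption, not the real measurements)
rng = np.random.RandomState(0)
wind_speed = rng.uniform(0.0, 25.0, size=400)
power = 50.0 * wind_speed + rng.normal(scale=20.0, size=400)

X = wind_speed.reshape(-1, 1)  # features must be 2D: (n_samples, 1)
y = power                      # targets can stay 1D

scores = []
for train_idx, test_idx in TimeSeriesSplit(n_splits=3).split(X):
    # Use the indices to select rows; do not fit on the indices themselves
    X_train, X_test = X[train_idx], X[test_idx]
    y_train, y_test = y[train_idx], y[test_idx]

    # Fit the scaler on the training features only, then apply it to both sets
    scaler = MinMaxScaler(feature_range=(0, 1)).fit(X_train)
    model = SVR(kernel="rbf", C=100, gamma=0.001)
    model.fit(scaler.transform(X_train), y_train)

    # Features and targets passed to score() come from the same split,
    # so their lengths match
    scores.append(model.score(scaler.transform(X_test), y_test))

print(scores)
```

Fitting the scaler inside the loop keeps information from later time steps out of the training data for each fold. The `C` and `gamma` values are simply those from the question and are not tuned for the synthetic data.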