ValueError: Unknown label type

Time: 2018-07-08 10:55:16

Tags: python-3.x machine-learning scikit-learn jupyter-notebook

I am trying to run a grid search over SVR, following the tutorial "Parameter estimation using grid search with cross-validation", but I get this error:

 ValueError: Unknown label type: (array([[0.0970681 ],
           [0.04160906],
           [0.00209168],
           ...,
           [0.92857565],
           [0.64930691],
           [0.20325924]]), array([6.38226813, 6.18596882, 6.03850002, ..., 4.68553846, 7.06541915,
           7.8636379 ]))

My code is:

    import sklearn.preprocessing as pre
    from sklearn.metrics import classification_report
    from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
    from sklearn.svm import SVR

    param = {'kernel': ['rbf'], 'gamma': [1e-2, 1e-3, 1e-4, 1e-5],
             'C': [1, 10, 100, 1000]}

    regressor_1 = SVR(C=1)
    TS_split = TimeSeriesSplit(n_splits=3)
    scoring = 'neg_mean_squared_error'
    clf = GridSearchCV(regressor_1, param, cv=TS_split, scoring=scoring, verbose=True)

    # Scale features and labels to [0, 1]
    X_gridsearch = pre.MinMaxScaler(feature_range=(0, 1)).fit(X_feature)
    scaled_X_gridsearch = X_gridsearch.transform(X_feature)

    y_gridsearch = pre.MinMaxScaler(feature_range=(0, 1)).fit(y_label)
    scaled_y_gridsearch = y_gridsearch.transform(y_label)

    print("Hyper Parameters for %s" % scoring)

    clf.fit(scaled_X_gridsearch, scaled_y_gridsearch)

    print(scaled_y_gridsearch)
    print(clf.best_params_)
    means = clf.cv_results_['mean_test_score']
    stds = clf.cv_results_['std_test_score']
    for mean, std, params in zip(means, stds, clf.cv_results_['params']):
        print("%0.3f (+/-%0.03f) for %r" % (mean, std * 2, params))

    print("Detailed classification report:")

    y_true, y_pred = y_test, clf.predict(X_test)
    print(classification_report(y_true, y_pred))  # this line raises the ValueError

The scaled data in scaled_y_gridsearch is:

[[0.11321139]
 [0.07218848]
 ...
 [0.64844211]
 [0.4926122 ]
 [0.4030334 ]]

The data in scaled_X_gridsearch is:

[[0.2681013 ]
 [0.03454225]
 [0.02062136]
 ...
 [0.92857565]
 [0.64930691]
 [0.20325924]]

The full traceback is:

     50 y_true, y_pred= y_test, clf.predict(X_test)
---> 51 print(classification_report(y_true, y_pred))
     52 
     53 

~/anaconda3_501/lib/python3.6/site-packages/sklearn/metrics/classification.py in classification_report(y_true, y_pred, labels, target_names, sample_weight, digits)
   1419 
   1420     if labels is None:
-> 1421         labels = unique_labels(y_true, y_pred)
   1422     else:
   1423         labels = np.asarray(labels)

~/anaconda3_501/lib/python3.6/site-packages/sklearn/utils/multiclass.py in unique_labels(*ys)
     95     _unique_labels = _FN_UNIQUE_LABELS.get(label_type, None)
     96     if not _unique_labels:
---> 97         raise ValueError("Unknown label type: %s" % repr(ys))
     98 
     99     ys_labels = set(chain.from_iterable(_unique_labels(y) for y in ys))

I'm not sure why this happens, since I followed the scikit-learn example as closely as I could. Any help would be appreciated.

1 Answer:

Answer 0: (score: 1)

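classification_report is not meant for regression; it is for classification problems. It expects discrete class labels, while SVR predicts continuous values, which is why unique_labels raises "Unknown label type". Check out this link and look under "Regression metrics" for options such as r2_score, mean_squared_error, and mean squared log error.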
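As a rough sketch of what that could look like, the snippet below keeps the SVR grid search but scores the predictions with regression metrics instead of classification_report. The data is synthetic and only stands in for the asker's X_feature and y_label, and the gamma and C values are illustrative assumptions, not tuned recommendations.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVR
from sklearn.utils.multiclass import type_of_target

# Synthetic stand-in for the asker's data (assumption: one feature, continuous target).
rng = np.random.RandomState(0)
X = rng.uniform(0, 10, size=(200, 1))
y = np.sin(X).ravel() + 0.1 * rng.randn(200)  # 1-D target, as SVR expects

# Keep the time order: first 150 rows for training, the rest for testing.
X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

# Fit the scaler on the training part only, then apply it to both parts.
scaler = MinMaxScaler(feature_range=(0, 1)).fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

# Illustrative grid (assumption); scoring is passed explicitly to GridSearchCV.
param_grid = {'kernel': ['rbf'], 'gamma': [1e-2, 1e-1, 1], 'C': [1, 10, 100]}
search = GridSearchCV(SVR(), param_grid, cv=TimeSeriesSplit(n_splits=3),
                      scoring='neg_mean_squared_error')
search.fit(X_train_s, y_train)

y_pred = search.predict(X_test_s)

# A continuous target is exactly what classification_report rejects.
target_kind = type_of_target(y_test)
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(search.best_params_)
print("target type: %s, MSE: %.4f, R^2: %.4f" % (target_kind, mse, r2))
```

Note that the scaler here is applied to X_test as well; predicting on unscaled test data after training on scaled data would give misleading scores.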

If this is actually a classification problem, change the estimator from SVR to SVC, and it should work.
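For the classification case, a minimal sketch might look like the following. The binary labels are made up for illustration (a hypothetical threshold on a single random feature), and the parameter grid is again an assumption.

```python
import numpy as np
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Hypothetical data: one feature in [0, 1] and a binary label derived from it.
rng = np.random.RandomState(0)
X = rng.uniform(0, 1, size=(200, 1))
y = (X.ravel() > 0.5).astype(int)  # discrete class labels, not continuous values

X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

# Illustrative grid (assumption); SVC replaces SVR because the target is categorical.
param_grid = {'kernel': ['rbf'], 'gamma': [1e-1, 1, 10], 'C': [1, 10, 100]}
clf = GridSearchCV(SVC(), param_grid, cv=3)
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)

# With discrete labels on both sides, classification_report works as intended.
report = classification_report(y_test, y_pred)
print(clf.best_params_)
print(report)
```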