ValueError: continuous format is not supported

Asked: 2017-06-10 00:04:51

Tags: python scikit-learn

I wrote a simple function that uses average_precision_score from scikit-learn to compute average precision.

My code:

def compute_average_precision(predictions, gold):
    gold_predictions = np.zeros(predictions.size, dtype=np.int)
    for idx in range(gold):
        gold_predictions[idx] = 1
    return average_precision_score(predictions, gold_predictions)

When I run the function, it produces the following error.

Traceback (most recent call last):
  File "test.py", line 91, in <module>
    total_avg_precision += compute_average_precision(np.asarray(probs), len(gold_candidates))
  File "test.py", line 29, in compute_average_precision
    return average_precision_score(predictions, gold_predictions)
  File "/if5/wua4nw/anaconda3/lib/python3.5/site-packages/sklearn/metrics/ranking.py", line 184, in average_precision_score
    average, sample_weight=sample_weight)
  File "/if5/wua4nw/anaconda3/lib/python3.5/site-packages/sklearn/metrics/base.py", line 81, in _average_binary_score
    raise ValueError("{0} format is not supported".format(y_type))
ValueError: continuous format is not supported

If I print the two numpy arrays predictions and gold_predictions for a single example, they look fine. [An example is provided below.]

[ 0.40865014  0.26047812  0.07588802  0.26604077  0.10586583  0.17118802
  0.26797949  0.34618672  0.33659923  0.22075308  0.42288553  0.24908153
  0.26506338  0.28224747  0.32942101  0.19986877  0.39831917  0.23635269
  0.34715138  0.39831917  0.23635269  0.35822859  0.12110706]
[1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]

What am I doing wrong here? What does the error mean?

1 Answer:

Answer 0: (score: 6)

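Just look at the sklearn docs:

Parameters:

y_true : array, shape = [n_samples] or [n_samples, n_classes]
    True binary labels in binary label indicators.

y_score : array, shape = [n_samples] or [n_samples, n_classes]
    Target scores, can either be probability estimates of the positive class, confidence values, or non-thresholded measure of decisions (as returned by "decision_function" on some classifiers).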

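So your first argument must be an array of binary labels, but you are passing an array of floats as the first argument. That is what the error means: sklearn inspected the array passed as y_true, classified it as "continuous", and rejected it. I believe you just need to swap the order of the arguments you pass.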
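A minimal sketch of the corrected function, assuming (as in the question) that the first `gold` entries of `predictions` are the positive examples. The toy `scores` and `labels` arrays are made up for illustration and are not the asker's data. `type_of_target` is the sklearn helper that classifies a target array, and it shows why the float array is reported as "continuous". The built-in `int` replaces `np.int`, which newer NumPy releases no longer provide.

```python
import numpy as np
from sklearn.metrics import average_precision_score
from sklearn.utils.multiclass import type_of_target


def compute_average_precision(predictions, gold):
    # Assumption from the question: the first `gold` entries are the positives.
    gold_predictions = np.zeros(predictions.size, dtype=int)
    gold_predictions[:gold] = 1
    # y_true (binary labels) comes first, y_score (float scores) second.
    return average_precision_score(gold_predictions, predictions)


# Hypothetical toy data: four scores, the first one belongs to the positive example.
scores = np.asarray([0.9, 0.1, 0.8, 0.2])
labels = np.asarray([1, 0, 0, 0])

print(type_of_target(scores))   # continuous  (not valid as y_true)
print(type_of_target(labels))   # binary      (valid as y_true)
print(compute_average_precision(scores, 1))  # 1.0, the positive is ranked first
```

With the arguments in the original order, the float scores would be treated as y_true, which is exactly the case that raises the ValueError from the traceback.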