How to use OneVsRestClassifier with the NearestCentroid classifier to convert multi-label classification into binary classification

Time: 2020-08-29 08:21:43

Tags: python pandas machine-learning scikit-learn multilabel-classification

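I have a multiclass dataset and a classifier (NearestCentroid), and I need to wrap it in a OneVsRestClassifier so that the problem becomes a set of binary classifications (for plotting ROC curves).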

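So far I have been able to make this work with SVM, KNN, and bagging. However, when I try to use the NearestCentroid classifier with OneVsRestClassifier, I get an error, apparently because the classifier has neither a predict_proba() nor a decision_function() method.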
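To double-check that claim, a quick probe along these lines may help. It is only a sketch: `scoring_methods` is a helper name I made up, and the result depends on the installed scikit-learn version (on my installation NearestCentroid reports neither method; other releases may differ).

```python
from sklearn.neighbors import KNeighborsClassifier, NearestCentroid
from sklearn.svm import SVC


def scoring_methods(estimator):
    """Return the continuous-score methods this estimator exposes.

    Hypothetical helper, not part of scikit-learn.
    """
    candidates = ("decision_function", "predict_proba")
    return [name for name in candidates if hasattr(estimator, name)]


# Map each classifier's class name to the scoring methods it offers.
report = {
    type(est).__name__: scoring_methods(est)
    for est in (SVC(), KNeighborsClassifier(), NearestCentroid())
}
print(report)
```

OneVsRestClassifier needs at least one of these two methods on the wrapped estimator to produce per-class scores, which is why classifiers that list one of them work for me.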

Here is the main part of the code I am using (it works with the other classifiers):

import numpy as np
import pandas as pd
from sklearn.neighbors import NearestCentroid
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfTransformer


text_clf = Pipeline([('vect', CountVectorizer()),
                     ('tfidf', TfidfTransformer()),
                     ('clf', NearestCentroid()),
                     ])


from sklearn.metrics import roc_curve, auc
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import label_binarize

categories = ['Business','Sci/Tech','World','Sports']

y_train = label_binarize(y_train, classes=categories)
y_test = label_binarize(y_test, classes=categories)


# classifier
new_clf = OneVsRestClassifier(text_clf)
another_new_clf = new_clf.fit(X_train, y_train)
y_score = another_new_clf.predict(X_test)

This is the error I get:

AttributeError                            Traceback (most recent call last)
~\Anaconda3\lib\site-packages\sklearn\multiclass.py in _predict_binary(estimator, X)
     93     try:
---> 94         score = np.ravel(estimator.decision_function(X))
     95     except (AttributeError, NotImplementedError):

~\Anaconda3\lib\site-packages\sklearn\utils\metaestimators.py in __get__(self, obj, type)
    109                 else:
--> 110                     getattr(delegate, self.attribute_name)
    111                     break

AttributeError: 'NearestCentroid' object has no attribute 'decision_function'

During handling of the above exception, another exception occurred:

AttributeError                            Traceback (most recent call last)
<ipython-input-25-34dca44072e5> in <module>
      2 new_clf = OneVsRestClassifier(text_clf)
      3 another_new_clf = new_clf.fit(X_train, y_train)
----> 4 y_score = another_new_clf.predict(X_test)
      5 
      6 # new_clf = OneVsRestClassifier(text_clf)

~\Anaconda3\lib\site-packages\sklearn\multiclass.py in predict(self, X)
    331             indptr = array.array('i', [0])
    332             for e in self.estimators_:
--> 333                 indices.extend(np.where(_predict_binary(e, X) > thresh)[0])
    334                 indptr.append(len(indices))
    335             data = np.ones(len(indices), dtype=int)

~\Anaconda3\lib\site-packages\sklearn\multiclass.py in _predict_binary(estimator, X)
     95     except (AttributeError, NotImplementedError):
     96         # probabilities of the positive class
---> 97         score = estimator.predict_proba(X)[:, 1]
     98     return score
     99 

~\Anaconda3\lib\site-packages\sklearn\utils\metaestimators.py in __get__(self, obj, type)
    108                     continue
    109                 else:
--> 110                     getattr(delegate, self.attribute_name)
    111                     break
    112             else:

AttributeError: 'NearestCentroid' object has no attribute 'predict_proba'
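Reading the traceback, OneVsRestClassifier first tries decision_function() and then falls back to predict_proba(), and NearestCentroid offers neither here. One workaround I am considering is sketched below, under my own assumptions: `ScoredNearestCentroid` is a hypothetical subclass (not a scikit-learn class) that uses the difference in distances to the two class centroids as a score, and the documents are made-up toy data standing in for my real X_train.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.metrics import auc, pairwise_distances, roc_curve
from sklearn.multiclass import OneVsRestClassifier
from sklearn.neighbors import NearestCentroid
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import label_binarize
from sklearn.utils.validation import check_is_fitted


class ScoredNearestCentroid(NearestCentroid):
    """Hypothetical helper: NearestCentroid plus a distance-based score."""

    def decision_function(self, X):
        check_is_fitted(self)
        # Distance from every sample to every class centroid.
        distances = pairwise_distances(X, self.centroids_, metric=self.metric)
        if len(self.classes_) == 2:
            # Binary case (what one-vs-rest fits): the score is positive
            # when the sample is closer to the centroid of classes_[1].
            return distances[:, 0] - distances[:, 1]
        # Multiclass case: closer centroid means higher score.
        return -distances


categories = ['Business', 'Sci/Tech', 'World', 'Sports']

# Made-up toy documents, two per category, for illustration only.
docs = [
    "stocks market shares profit", "market earnings bank profit",
    "software chip internet launch", "internet software computer browser",
    "election minister war treaty", "minister treaty government summit",
    "match team coach win", "team league match goal",
]
labels = ['Business', 'Business', 'Sci/Tech', 'Sci/Tech',
          'World', 'World', 'Sports', 'Sports']
Y = label_binarize(labels, classes=categories)

text_clf = Pipeline([('vect', CountVectorizer()),
                     ('tfidf', TfidfTransformer()),
                     ('clf', ScoredNearestCentroid())])

ovr = OneVsRestClassifier(text_clf).fit(docs, Y)
# One column of scores per category; scored on the training documents
# here only to keep the sketch short.
y_score = ovr.decision_function(docs)

roc_auc = {}
for i, name in enumerate(categories):
    fpr, tpr, _ = roc_curve(Y[:, i], y_score[:, i])
    roc_auc[name] = auc(fpr, tpr)
```

These scores are not calibrated probabilities; they only rank samples, which is all roc_curve needs. I am not sure this is the recommended way to do it.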

Any help would be greatly appreciated. Thanks in advance.

0 answers:

No answers