Why do I get a Not Fitted error with a random forest classifier?

Time: 2018-03-30 16:46:49

Tags: python pandas numpy machine-learning scikit-learn

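I have 4 features and one target variable. I am using RandomForestRegressor instead of RandomForestClassifier because my target variable is a float. When I try to fit my model and then print the features sorted by importance, I get a Not Fitted error. How can I fix it?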

Code:

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn import datasets
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.feature_selection import SelectFromModel
from sklearn.metrics import accuracy_score

# Split the data into 30% test and 70% training
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
feat_labels = data.columns[:4]

regr = RandomForestRegressor(max_depth=2, random_state=0)
#clf = RandomForestClassifier(n_estimators=100, random_state=0)

# Train the classifier
#clf.fit(X_train, y_train)
regr.fit(X, y)

importances = clf.feature_importances_
indices = np.argsort(importances)[::-1]

for f in range(X_train.shape[1]):
    print("%2d) %-*s %f" % (f + 1, 30, feat_labels[indices[f]], importances[indices[f]]))

2 Answers:

Answer 0: (score: 4)

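You fit regr, but you read the feature importances from clf, which was never fitted. Try calling this instead:

importances = regr.feature_importances_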
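The mismatch this answer points at can be reproduced in isolation. The following is only a sketch under assumptions: the data comes from make_regression because the question never shows its dataset, and the name unfitted is a hypothetical stand-in for the asker's clf.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.exceptions import NotFittedError

# Assumption: synthetic data with 4 features and a float target,
# standing in for the dataset that the question does not show.
X, y = make_regression(n_samples=200, n_features=4, random_state=0)

regr = RandomForestRegressor(max_depth=2, random_state=0)
# Hypothetical second estimator that is created but never fitted,
# playing the role of clf in the question.
unfitted = RandomForestRegressor(max_depth=2, random_state=0)

regr.fit(X, y)

# Reading importances from the estimator that was never fitted fails.
try:
    unfitted.feature_importances_
    raised = False
except NotFittedError:
    raised = True

# Reading them from the estimator that was fitted works.
importances = regr.feature_importances_
print(raised, importances)
```

Fitting one object and querying another is enough to trigger the error, so the object passed to fit and the object queried for feature_importances_ have to be the same one.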

Answer 1: (score: 1)

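I noticed that your classifier was previously fitted on the training split you created, whereas the regressor is now fitted on the full X and y.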

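However, I don't see where you define X and y in the first place, or, more to the point, where you actually load the dataset. Could you have forgotten that step, in addition to the one Harpal mentioned in the other answer?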
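Putting both answers together, one possible end-to-end version is sketched below. It is not the asker's actual pipeline: the data is generated with make_regression because the real dataset is never shown, and the feat_labels names are hypothetical placeholders for data.columns[:4].

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Assumption: synthetic data replaces the dataset the question never loads.
X, y = make_regression(n_samples=300, n_features=4, noise=0.1, random_state=0)
# Hypothetical feature names, standing in for data.columns[:4].
feat_labels = ["feature_0", "feature_1", "feature_2", "feature_3"]

# Split the data into 30% test and 70% training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Fit the regressor on the training split only.
regr = RandomForestRegressor(max_depth=2, random_state=0)
regr.fit(X_train, y_train)

# Read the importances from the same object that was fitted.
importances = regr.feature_importances_
indices = np.argsort(importances)[::-1]  # most important first

for rank, idx in enumerate(indices, start=1):
    print("%2d) %-*s %f" % (rank, 30, feat_labels[idx], importances[idx]))

# R^2 on the held-out split; accuracy_score is meant for classification.
r2 = regr.score(X_test, y_test)
print("R^2 on test split:", r2)
```

Fitting on X_train and y_train keeps the test split unseen, which is presumably what the original split was for.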