KerasClassifier error with categorical data

Asked: 2017-09-01 05:51:33

Tags: python neural-network keras keras-layer sklearn-pandas

I am trying to build a neural network for categorical data in Python (3.5).

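I have a table with 47 independent variables (X) and a table with a single column holding the dependent variable (y). This variable is categorical and takes one of three possible values. I therefore encoded it with LabelEncoder(), so that the variable is now 0, 1, or 2. Then I spread these numbers over three columns with OneHotEncoder and dropped the last column. Reason: two columns of 1s and 0s are enough to represent the 3 possible outcomes.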
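To make the encoding steps concrete, here is a minimal pure-Python sketch of what the two encoders are doing conceptually. It does not call scikit-learn; the sample labels and the variable names (labels, class_to_code, one_hot, reduced) are made up for illustration.

```python
# Hypothetical sample of the dependent variable (not the real data).
labels = ["first", "second", "third", "second", "first"]

# Label encoding: map each distinct class to an integer code.
# Sorting the classes mirrors how LabelEncoder orders them.
classes = sorted(set(labels))
class_to_code = {name: code for code, name in enumerate(classes)}
codes = [class_to_code[name] for name in labels]

# One-hot encoding: one column per class, with a single 1.0 per row.
one_hot = [[1.0 if code == k else 0.0 for k in range(len(classes))]
           for code in codes]

# Dropping the last column: the third class becomes the all-zero row.
reduced = [row[:-1] for row in one_hot]

print(codes)
print(one_hot)
print(reduced)
```

Note that the reduced form has only two columns, whereas an output layer with three softmax units produces three values per sample, so the target shape and the output shape have to agree one way or the other.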

For the neural network I use softmax in the output layer and categorical_crossentropy as the loss function (which is supposed to be used for categorical data).
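As a rough illustration of why these two are usually paired, the sketch below computes a softmax over three raw scores and the categorical cross-entropy against a one-hot target. It is plain Python rather than Keras code, and the scores and target are hypothetical values chosen for the example.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(one_hot_target, probabilities):
    """Negative log-probability assigned to the true class."""
    return -sum(t * math.log(p)
                for t, p in zip(one_hot_target, probabilities))

logits = [2.0, 1.0, 0.1]   # hypothetical raw outputs of a 3-unit layer
target = [1.0, 0.0, 0.0]   # hypothetical one-hot target (first class)

probabilities = softmax(logits)
loss = categorical_crossentropy(target, probabilities)

print(probabilities)
print(loss)
```

With a one-hot target, only the probability of the true class contributes to the loss, which is why the target needs one column per output unit.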

When I run my code, I get this error:

 classification.py in _check_targets(y_true=array([[ 1.,  0.,  0.],
   [ 0.,  1.,  0.],
...
   [ 0.,  0.,  1.],
   [ 0.,  1.,  0.]]), y_pred=array([2, 2, 2, 2, 2]))
 77     if y_type == set(["binary", "multiclass"]):
 78         y_type = set(["multiclass"])
 79 
 80     if len(y_type) > 1:
 81         raise ValueError("Can't handle mix of {0} and {1}"
---> 82                          "".format(type_true, type_pred))
    type_true = 'multilabel-indicator'
    type_pred = 'binary'
 83 
 84     # We can't have more than one value on y_type => The set is no more needed
 85     y_type = y_type.pop()
 86 

ValueError: Can't handle mix of multilabel-indicator and binary

I don't understand the error: type_true is presumably the type of the true data (the real data I have), and as far as I can see they are binary.
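One possible reading of the traceback, offered as a sketch rather than a confirmed diagnosis: scikit-learn classifies the format of each array separately, and the accuracy scorer refuses to compare a one-hot matrix with a 1-D array of class indices. The arrays below are small illustrative stand-ins shaped like the ones in the traceback, not the real data, and the use of np.argmax to convert the targets is an assumption about one way to make the formats agree.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.utils.multiclass import type_of_target

# Illustrative stand-ins for the arrays shown in the traceback.
y_true = np.array([[1., 0., 0.],
                   [0., 1., 0.],
                   [0., 0., 1.],
                   [0., 0., 1.],
                   [0., 1., 0.]])
y_pred = np.array([2, 2, 2, 2, 2])

# A 2-D matrix of 0s and 1s is treated as a multilabel indicator.
true_type = type_of_target(y_true)
# A 1-D array with at most two distinct values is treated as binary.
pred_type = type_of_target(y_pred)
print(true_type, pred_type)

# Comparing the two formats directly is rejected.
try:
    accuracy_score(y_true, y_pred)
    mixed_formats_rejected = False
except ValueError:
    mixed_formats_rejected = True

# Assumption: converting the one-hot rows back to class indices puts
# both arrays in the same format, so they can be compared.
y_true_labels = np.argmax(y_true, axis=1)
accuracy = accuracy_score(y_true_labels, y_pred)
print(accuracy)
```

Under this reading, type_pred = 'binary' does not describe the training data at all; it describes the predictions, which here happen to contain a single distinct class index.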

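P.S.

If I drop two columns of y instead of one (so that only one column is left) and use the sigmoid function with the binary_crossentropy loss function, I don't get any error. So is the data prepared correctly?

P.P.S.

My code looks like this: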

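from sklearn.preprocessing import LabelEncoder, OneHotEncoder

# y is like [['first'], ['second'], ['third'],...]
# Encode the class names as integers 0, 1, 2
labelencoder_y_1 = LabelEncoder()
y[:, 0] = labelencoder_y_1.fit_transform(y[:, 0])

# Expand the integer codes into one column per class
onehotencoder_y = OneHotEncoder(categorical_features = [0])
y = onehotencoder_y.fit_transform(y).toarray()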



# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)

# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

# Tuning the ANN
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import GridSearchCV
from keras.models import Sequential
from keras.layers import Dense

def build_classifier(optimizer, units, layers):
    classifier = Sequential()
    classifier.add(Dense(units = units, kernel_initializer = 'uniform', activation = 'relu', input_dim = 47))
    for i in range(layers):
        classifier.add(Dense(units = units, kernel_initializer = 'uniform', activation = 'relu'))
    classifier.add(Dense(units = 3, kernel_initializer = 'uniform', activation = 'softmax'))
    classifier.compile(optimizer = optimizer, loss = 'categorical_crossentropy', metrics = ['accuracy'])
    return classifier

classifier = KerasClassifier(build_fn = build_classifier)


parameters = {'batch_size': [32],
          'epochs': [64],
          'optimizer': ['rmsprop'],
          'units': [16],
          'layers': [2]}

grid_search = GridSearchCV(estimator = classifier,
                       param_grid = parameters,
                       scoring = 'accuracy',
                       cv = 10,
                       n_jobs = 3)
grid_search = grid_search.fit(X_train, y_train)
best_parameters = grid_search.best_params_
best_accuracy = grid_search.best_score_

Edit: Following the comment from @djk47463, I edited my question to keep all three columns.

0 answers:

No answers yet.