Keras training accuracy lower than 60%

Time: 2017-08-12 06:35:17

Tags: python tensorflow keras

I need help improving my training accuracy with Keras using the TensorFlow backend.

First, I downloaded the public electrical appliance dataset (PLAID) from here -> http://plaidplug.com/

Then I selected 5 datasets for each category, took 2000 rows of current (I) from each dataset, stacked them, and saved the result as input.h5.

The file has the following structure; I call this stacked matrix currentdata (a sketch of how it could be built is shown after the structure below).

[[~2000 of data for AC],
 [~2000 of data for AC],
 ...,
 ...,
 ...,
 [~2000 of data for CFL],
 [~2000 of data for CFL],
 ...,
 ...,
 ...,
 [~2000 of data for Fridge],
 ...,
 ...,
 ...,
 ...,
 ...,
 [~2000 of data for Heater]]
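
For reference, here is a minimal sketch of how such a stacked matrix might be written to input.h5 with h5py. The per-appliance CSV paths and the 'current' column name are assumptions; the actual layout of the downloaded PLAID files may differ.

import numpy as np
import pandas
import h5py

def stack_current_traces(csv_files, n_samples=2000, out_file='input.h5'):
    # Take the first n_samples current readings from each file and stack
    # them row-wise into a (num_files, n_samples) matrix.
    rows = []
    for path in csv_files:
        df = pandas.read_csv(path)        # hypothetical per-appliance CSV
        current = df['current'].values    # 'current' column name is an assumption
        rows.append(current[:n_samples])
    currentdata = np.vstack(rows)

    # Save under the same dataset name that the training script reads back.
    with h5py.File(out_file, 'w') as f:
        f.create_dataset('currentdata', data=currentdata)
    return currentdata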

Then I created a txt file for the output. It consists of the following string, giving one label for each of the 5 datasets per category:

AC,AC,AC,AC,AC,CFL,CFL,CFL,CFL,CFL,Fridge,Fridge,...,Heater,Heater,Heater,Heater,Heater
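
A minimal sketch of how output.txt could be generated so that there is exactly one label per row of currentdata; the four category names below are the ones mentioned above, and the remaining PLAID appliance types would be appended the same way.

categories = ['AC', 'CFL', 'Fridge', 'Heater']  # plus the other appliance types
labels = []
for name in categories:
    labels.extend([name] * 5)  # 5 datasets were selected per category

with open('output.txt', 'w') as f:
    f.write(','.join(labels))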

Here is my code:

import numpy as np
import h5py
import pandas
from keras import optimizers
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.layers.normalization import BatchNormalization
from keras.wrappers.scikit_learn import KerasClassifier
from keras.utils import np_utils
from keras.callbacks import ModelCheckpoint
from sklearn.model_selection import cross_val_score, KFold
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.pipeline import Pipeline

seed = 7 
np.random.seed(seed)

### load dataset with 2000 row ###
input_data = h5py.File('input.h5', 'r')
output_type = open('output.txt', 'r')

X = input_data['currentdata'][:] ## input the stacked matrix
Y = output_type.read().split(',') ## read output 

## encode class value as integer
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)
## convert integers to dummy variables (one hot encoded)
dummy_y = np_utils.to_categorical(encoded_Y)

weight_path = "/home/fang/workspace/myproject/weight/"

def baseline_model():
    # create model
    model = Sequential()
    model.add(Dense(800, input_dim=2000, init='normal', activation='relu'))
    model.add(Dense(400, init='normal', activation='relu'))
    model.add(Dense(200, init='normal', activation='relu'))
    model.add(Dense(11, activation='softmax')) ## 11 output categories in the PLAID dataset

    ## Compile model
    #opt = optimizers.SGD(lr=0.02, decay=1e-6, momentum=0.9, nesterov=True)
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

    ## save the architecture
    model_json = model.to_json()
    with open('current_data_11type.json', 'w') as json_file:
        json_file.write(model_json)

    ## save the weight
    model.save_weights(weight_path + 'current_data_4type.h5')

    return model

#estimator = KerasClassifier(build_fn=baseline_model, nb_epoch=500,batch_size=64, verbose=0)
estimators = []
estimators.append(('standardized', StandardScaler()))
estimators.append(('mlp', KerasClassifier(build_fn=baseline_model, nb_epoch=400, batch_size=32, verbose=0)))
pipeline = Pipeline(estimators)
kfold = KFold(n_splits=10, shuffle=True, random_state=seed)

#results = cross_val_score(estimator, X, dummy_y, cv=kfold)
results = cross_val_score(pipeline, X, dummy_y, cv=kfold)
print ("Accuracy: %.2f%% (%.2f%%)" %(results.mean()*100, results.std()*100))

I tried with and without StandardScaler(), but that did not give any good result either. The highest accuracy I got is 57%.

I also tried the SGD optimizer with learning rates of 0.01~0.06 instead of adam, but that did not give me good results either.
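
For clarity, this is how the commented-out SGD line in the code above would be passed to compile() in place of the 'adam' string (shown with one of the tried learning rates):

from keras import optimizers

# Inside baseline_model(), replacing the existing compile() call:
opt = optimizers.SGD(lr=0.02, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])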

Adding layers and changing batch_size from as low as 5 to 10, as well as using 32, 64, and 128, did not help me at all.

My system:

OS: Ubuntu 16.04 LTS
processor: Intel® Core™ i3-5005U CPU @ 2.00GHz × 4
RAM: 4GB
GPU: GeForce 920MX 2GB

I also tried to increase my data size (> 5000 rows), but that did not help either.

Does anyone have any ideas or suggestions on how to solve this kind of problem?

Thanks for your help.

0 Answers:

No answers yet.