Low training loss, but high validation loss and low validation accuracy

Date: 2019-04-16 17:37:28

Tags: python tensorflow keras neural-network

I'm trying to get good accuracy on a multi-class classification problem (the heart disease dataset) with Keras (TensorFlow backend) using categorical_crossentropy. My model reaches good training accuracy, but validation accuracy stays low and validation loss stays high. I've tried the usual remedies for overfitting (normalization, dropout, regularization, etc.), but I keep running into the same problem. So far I've also experimented with optimizers, losses, epochs, and batch sizes, without success. Here is the code I'm using:

import pandas as pd
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.optimizers import SGD,Adam
from keras.layers import Dense, Dropout
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.impute import SimpleImputer
from keras.utils import to_categorical
from sklearn.model_selection import train_test_split
from keras.models import load_model
from keras.regularizers import l1,l2
# fix random seed for reproducibility
np.random.seed(5)
data = pd.read_csv('ProcessedClevelandData.csv',delimiter=',',header=None)
#Missing Values
Imp = SimpleImputer(missing_values=np.nan, strategy='mean', copy=True)
Imp = Imp.fit(data.values)
# transform() returns a new array; assign it back, otherwise the imputation is silently discarded
data = pd.DataFrame(Imp.transform(data.values))
X = data.iloc[:, :-1].values
y=data.iloc[:,-1].values

y=to_categorical(y)
X_train, X_test, y_train, y_test = train_test_split(X,y, test_size=0.1)
scaler = StandardScaler()
X_train_norm = scaler.fit_transform(X_train)
X_test_norm=scaler.transform(X_test)
# create model
model = Sequential()
model.add(Dense(13, input_dim=13, activation='relu',use_bias=True,kernel_regularizer=l2(0.0001)))
#model.add(Dropout(0.05))
model.add(Dense(9, activation='relu',use_bias=True,kernel_regularizer=l2(0.0001)))
#model.add(Dropout(0.05))
model.add(Dense(5,activation='softmax'))
sgd = SGD(lr=0.01, decay=0.01/32, nesterov=False)  # note: unused; compile() below passes 'adam'
# Compile model
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])#adam,adadelta,
print(model.summary())
history=model.fit(X_train_norm, y_train,validation_data=(X_test_norm,y_test), epochs=1200, batch_size=32,shuffle=True)
# list all data in history
print(history.history.keys())
# summarize history for accuracy
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()
# summarize history for loss
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()

Here is part of the output, where you can see the behavior described above:

Layer (type)                 Output Shape              Param #   
=================================================================
dense_1 (Dense)              (None, 13)                182       
_________________________________________________________________
dense_2 (Dense)              (None, 9)                 126       
_________________________________________________________________
dense_3 (Dense)              (None, 5)                 50        
=================================================================
Total params: 358
Trainable params: 358
Non-trainable params: 0
_________________________________________________________________

Train on 272 samples, validate on 31 samples
Epoch 1/1200

 32/272 [==>...........................] - ETA: 21s - loss: 1.9390 - acc: 0.1562
272/272 [==============================] - 3s 11ms/step - loss: 2.0505 - acc: 0.1434 - val_loss: 2.0875 - val_acc: 0.1613
Epoch 2/1200

 32/272 [==>...........................] - ETA: 0s - loss: 1.6747 - acc: 0.2188
272/272 [==============================] - 0s 33us/step - loss: 1.9416 - acc: 0.1544 - val_loss: 1.9749 - val_acc: 0.1290
Epoch 3/1200

 32/272 [==>...........................] - ETA: 0s - loss: 1.7708 - acc: 0.2812
272/272 [==============================] - 0s 37us/step - loss: 1.8493 - acc: 0.1801 - val_loss: 1.8823 - val_acc: 0.1290
Epoch 4/1200

 32/272 [==>...........................] - ETA: 0s - loss: 1.9051 - acc: 0.2188
272/272 [==============================] - 0s 33us/step - loss: 1.7763 - acc: 0.1949 - val_loss: 1.8002 - val_acc: 0.1613
Epoch 5/1200

 32/272 [==>...........................] - ETA: 0s - loss: 1.6337 - acc: 0.2812
272/272 [==============================] - 0s 33us/step - loss: 1.7099 - acc: 0.2426 - val_loss: 1.7284 - val_acc: 0.1935
Epoch 6/1200
....
 32/272 [==>...........................] - ETA: 0s - loss: 0.0494 - acc: 1.0000
272/272 [==============================] - 0s 37us/step - loss: 0.0532 - acc: 1.0000 - val_loss: 4.1031 - val_acc: 0.5806
Epoch 1197/1200

 32/272 [==>...........................] - ETA: 0s - loss: 0.0462 - acc: 1.0000
272/272 [==============================] - 0s 33us/step - loss: 0.0529 - acc: 1.0000 - val_loss: 4.1174 - val_acc: 0.5806
Epoch 1198/1200

 32/272 [==>...........................] - ETA: 0s - loss: 0.0648 - acc: 1.0000
272/272 [==============================] - 0s 37us/step - loss: 0.0533 - acc: 1.0000 - val_loss: 4.1247 - val_acc: 0.5806
Epoch 1199/1200

 32/272 [==>...........................] - ETA: 0s - loss: 0.0610 - acc: 1.0000
272/272 [==============================] - 0s 29us/step - loss: 0.0532 - acc: 1.0000 - val_loss: 4.1113 - val_acc: 0.5484
Epoch 1200/1200

 32/272 [==>...........................] - ETA: 0s - loss: 0.0511 - acc: 1.0000
272/272 [==============================] - 0s 29us/step - loss: 0.0529 - acc: 1.0000 - val_loss: 4.1209 - val_acc: 0.5484

2 Answers:

Answer 0 (score: 0)

The problem may be that your data is unevenly distributed between the training and test splits (as mentioned in the comments). Check whether the class distribution is uneven, and if it is, try a different seed. I've run into similar problems before when working with small medical datasets: the smaller the dataset, the higher the chance that the split subsets fail to represent the true distribution.
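A rough sketch of how to inspect the split, assuming the X, y, y_train, and y_test from the question (y is one-hot after to_categorical, hence the argmax). Note that scikit-learn's stratify argument, which this answer does not mention, sidesteps the seed lottery entirely by forcing both splits to keep the class proportions:

import numpy as np
from sklearn.model_selection import train_test_split

# compare class frequencies in the full set and in each split
print(np.unique(y.argmax(axis=1), return_counts=True))
print(np.unique(y_train.argmax(axis=1), return_counts=True))
print(np.unique(y_test.argmax(axis=1), return_counts=True))

# a stratified split keeps the class proportions equal in train and test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, stratify=y.argmax(axis=1))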

Edit: It depends on how you set the seed: np.random.seed(my_seed) sets it for numpy, random.seed(my_seed) sets it for Python's random module, and to set it for Keras, follow their documentation.
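For completeness, a sketch of setting all three seeds at once; tf.set_random_seed is the TensorFlow 1.x call (it became tf.random.set_seed in 2.x), so the TF version here is an assumption:

import random
import numpy as np
import tensorflow as tf

my_seed = 5
random.seed(my_seed)         # Python's built-in random module
np.random.seed(my_seed)      # numpy (also used by scikit-learn's splitter)
tf.set_random_seed(my_seed)  # TensorFlow 1.x graph-level seed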

Answer 1 (score: 0)

Help yourself by increasing your validation split to more like ~30%, unless you really have a large data set; even 50/50 is often used. With only 31 validation samples, a single misclassified sample swings val_acc by roughly 3 percentage points.
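As a minimal sketch, this is a one-argument change to the split from the question:

from sklearn.model_selection import train_test_split

# hold out 30% instead of 10% for validation
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)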

Remember that good loss and acc with bad val_loss and val_acc implies overfitting.
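One further counter-measure, not part of this answer but consistent with what the question already tried, is to re-enable the Dropout layers that are commented out in the question's model, with a somewhat higher rate (the 0.3 used here is an illustrative assumption; 0.2-0.5 is a common range):

from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.regularizers import l2

model = Sequential()
model.add(Dense(13, input_dim=13, activation='relu', kernel_regularizer=l2(0.0001)))
model.add(Dropout(0.3))  # illustrative rate, tune on validation data
model.add(Dense(9, activation='relu', kernel_regularizer=l2(0.0001)))
model.add(Dropout(0.3))
model.add(Dense(5, activation='softmax'))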

Try this basic solution:

from keras.callbacks import EarlyStopping, ReduceLROnPlateau

early_stop = EarlyStopping(monitor='val_loss',patience=10)
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.1,
                              patience=6, verbose=1, mode='auto',
                              min_delta=0.0001, cooldown=0, min_lr=1e-8)

# num_epochs is not a Keras argument (it is `epochs`), and both callbacks
# monitor val_loss, so fit() needs validation data to compute it
history = model.fit(X_train_norm, y_train,
                    validation_data=(X_test_norm, y_test),
                    epochs=666, callbacks=[early_stop, reduce_lr])

Hope that helps!