Wrong predictions despite high model accuracy

Asked: 2018-08-25 08:44:07

Tags: python tensorflow machine-learning keras

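I am trying to build my first Keras deep neural network on TensorFlow and want to deploy it with Flask. I took sample airline data and want to predict whether a flight is delayed. To start, I use only these sample columns: Year, Month, DayOfWeek, UniqueCarrier, FlightNum, Origin, Dest, Distance. I convert UniqueCarrier, Origin, and Dest to numeric values with a label encoder. After running the program below, I get an accuracy of 93%. However, when I run a prediction manually by sending parameters through the REST API, I always get 1 as the output. I am not sure what I need to do.

Here is some of the code and sample output: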

  le = LabelEncoder()

  data["UniqueCarrier"] = le.fit_transform(data["UniqueCarrier"])
  UniqueCarrier = list(le.classes_)
  print(UniqueCarrier)
  data["Origin"] = le.fit_transform(data["Origin"])
  Carrier = list(le.classes_)
  print(Carrier)
  data["Dest"] = le.fit_transform(data["Dest"])
  TailNum = list(le.classes_)
  print(TailNum)

Split the data into predictors and target:

rfDataOriginal = pd.DataFrame(data)
Delay_YesNo = rfDataOriginal['IsDepDelayed']
rfDataOriginal.drop(['IsDepDelayed'], axis=1, inplace=True)

After dropping the target variable:

print('Dimension reduced to:')
print(len(rfDataOriginal.columns))

Feature scaling:

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

Create the model:

from keras.models import Sequential
from keras.layers import Dense
model = Sequential()
model.add(Dense(15, input_dim=12, activation='relu'))
model.add(Dense(15, activation='relu'))
model.add(Dense(15, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.summary()
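
As a sanity check on what `model.summary()` should report for this architecture, the parameter count of each fully connected layer is inputs × units plus one bias per unit. A small sketch of that arithmetic in plain Python, using the layer sizes from the code above (`layer_sizes` is an illustrative name, not part of Keras):

```python
# Layer widths from the model above: 12 inputs, three hidden layers of 15, 1 output.
layer_sizes = [12, 15, 15, 15, 1]

# Each Dense layer has n_in * n_out weights plus n_out biases.
params = [n_in * n_out + n_out
          for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]
total_params = sum(params)

print(params)        # per-layer parameter counts
print(total_params)  # total trainable parameters
```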

Compile the model:

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

Logging for the TensorBoard graph:

import keras
tbCallBack = keras.callbacks.TensorBoard(log_dir='/tmp/keras_logs',  write_graph=True)

Fit the model:

model.fit(X_train, y_train, epochs=5, batch_size=30,  verbose=1, callbacks=[tbCallBack])

Create the confusion matrix:

from sklearn.metrics import confusion_matrix,accuracy_score
cm = confusion_matrix(y_test, Y_pred)
print("\nConfusion Matrix:")
print(cm)
acs = accuracy_score(y_test, Y_pred)
print("\nAccuracy Score: %.2f%%" % (acs * 100))

Confusion Matrix:
[[41614   322]
[ 5664 35894]]

 Accuracy Score: 92.83%
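
The reported accuracy can be reproduced from the confusion matrix itself. A short sketch, assuming the scikit-learn convention that rows are true classes and columns are predicted classes:

```python
# Confusion matrix as printed above (rows: true class, columns: predicted class).
cm = [[41614, 322],
      [5664, 35894]]

correct = cm[0][0] + cm[1][1]          # diagonal entries are correct predictions
total = sum(sum(row) for row in cm)    # all test samples
accuracy = correct / total

print("Accuracy Score: %.2f%%" % (accuracy * 100))
```

Under that convention, the 5664 in the lower-left cell counts test samples whose true class is 1 but which were predicted as 0, so on the test set the model does not output 1 for everything.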

Test a prediction by passing parameters:

inputFeature = [1989, 9, 14, 1719, 1720, 1845, 1859, 11, 927, 58, 68, 997]
inputFeature = np.asarray(inputFeature).reshape(1, 12)
model.predict(inputFeature)

Output: array([[ 1.]], dtype=float32)

inputFeature = [1989, 11, 24, 1144, 1144, 1633, 1635, 0, 816, 213, 59, 1205]
inputFeature = np.asarray(inputFeature).reshape(1, 12)
model.predict(inputFeature)

array([[ 1.]], dtype=float32)
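
One thing worth checking, assuming the model was trained on the output of `sc.fit_transform(X_train)` as the scaling step suggests: the two vectors above are passed to `model.predict` as raw values, without going through the same scaler. The sketch below does not use the real model. It uses synthetic training data and a single hypothetical sigmoid unit with made-up weights `w`, only to illustrate how unscaled inputs in the thousands can saturate a sigmoid to exactly 1.0:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the 12 unscaled training features (an assumption).
rng = np.random.default_rng(0)
X_train_raw = rng.normal(loc=1000.0, scale=300.0, size=(500, 12))

sc = StandardScaler()
X_train = sc.fit_transform(X_train_raw)

# First input vector from the question, as floats.
inputFeature = np.asarray(
    [1989, 9, 14, 1719, 1720, 1845, 1859, 11, 927, 58, 68, 997],
    dtype=float,
).reshape(1, 12)

# Apply the scaler fitted on the training data to the new sample.
scaled_input = sc.transform(inputFeature)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical weights for a single output unit, not taken from the real model.
w = np.full(12, 0.1)

raw_prob = sigmoid(inputFeature @ w)      # huge pre-activation, saturates
scaled_prob = sigmoid(scaled_input @ w)   # moderate pre-activation
print(raw_prob, scaled_prob)
```

With the actual objects from the question, the corresponding call would be `model.predict(sc.transform(inputFeature))`, with `sc` being the scaler fitted during training and loaded in the Flask app.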

0 answers:

No answers yet.