Keras custom loss function vs. Lambda layer

Time: 2019-06-24 23:55:06

Tags: keras

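I have a model that trains with a custom loss function and works fine. I want to replace the custom loss with the standard mean_squared_error by moving some of the computation into a Lambda layer.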

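Some details: the model ultimately needs to produce a single floating-point number. The original model has 60 outputs, which I convert to one number via a weighted average. I did this inside the loss function in order to compare against the label, but the same conversion also has to be done after inference. I would like to embed this weighted average in the network itself as the final layer, which would simplify things.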
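For reference, the weighted average described above can be sketched in plain NumPy. This is only an illustration of the arithmetic, not the code from the question: the function name weighted_avg_np and the values chosen for BINSIZE and XMIN are assumptions, since the question does not state them.

```python
import numpy as np

# Hypothetical values for illustration; the question does not give BINSIZE or XMIN.
BINSIZE = 0.5
XMIN = -15.0

def weighted_avg_np(outputs, binsize=BINSIZE, xmin=XMIN):
    """Collapse (batch, n_bins) activations into a (batch,) weighted average."""
    outputs = np.asarray(outputs, dtype=np.float64)
    idx = np.arange(1, outputs.shape[1] + 1)    # bin indices 1..n_bins
    norm = outputs.sum(axis=1)                  # per-sample normalization
    wavg = (idx * outputs).sum(axis=1) / norm   # weighted avg. in units of bins
    return wavg * binsize + xmin                # weighted avg. in physical units

batch = np.array([[0.0, 1.0, 1.0, 0.0],
                  [0.0, 0.0, 0.0, 2.0]])
result = weighted_avg_np(batch)   # one value per sample, shape (2,)
```

Note that if every output in a row is zero, which a relu layer can produce, norm is zero and the division is undefined.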

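I anticipate the suggestion to just add a single-node Dense layer at the end and let the network figure it out. I tried that, and it did not work very well. (I think the problem is that the weighted average requires a division, which would have to be emulated by a couple more Dense layers.) In any case, I really want to understand Lambda layers so I can add them to my toolbox.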

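Here is some code showing both things I have done. I have pared it down to a minimum. These are excerpts from a larger script; the parts not shown are identical between the two versions, so what follows is the only difference: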

#-----------------------------------------------------
# customLoss
#-----------------------------------------------------
# Define custom loss function that compares calculated phi
# to the true value
def customLoss(y_true, y_pred):

    # Calculate weighted sum of prediction
    ones = K.ones_like(y_pred[0,:])       # [1, 1, 1, 1....]   (size Nouts)
    idx  = K.cumsum(ones)                 # [1, 2, 3, 4....]   (size Nouts)
    norm = K.sum(y_pred, axis=1)          # normalization of all outputs by batch. shape is 1D array of size batch
    wavg = K.sum(idx*y_pred, axis=1)/norm # array of size batch with weighted avg. of mean in units of bins
    wavg_cm = wavg*BINSIZE + XMIN         # array of size batch with weighted avg. of mean in physical units

    # Calculate loss
    loss_wavg = K.mean(K.square(y_true[:,0] - wavg_cm), axis=-1)

    return loss_wavg

#-----------------------------------------------------
# DefineModel
#-----------------------------------------------------
# This is used to define the model. It is only called if no model
# file is found in the model_checkpoints directory.
def DefineModel():

    # Build model
    inputs = Input(shape=(height, width, 1), name='image_inputs')
    x = Flatten()(inputs)
    x = Dense( int(Nouts*5), activation='linear')(x)
    x = Dense( Nouts, activation='relu')(x)
    model = Model(inputs=inputs, outputs=[x])

    # Compile the model and print a summary of it
    opt = Adadelta(clipnorm=1.0)
    model.compile(loss=customLoss, optimizer=opt)

    return model

#-----------------------------------------------------
# MyWeightedAvg
#
# This is used by the final Lambda layer of the network.
# It defines the function for calculating the weighted
# average of the inputs from the previous layer.
#-----------------------------------------------------
def MyWeightedAvg(inputs):

    # Calculate weighted sum of inputs
    ones = K.ones_like(inputs[0,:])       # [1, 1, 1, 1....]   (size Nouts)
    idx  = K.cumsum(ones)                 # [1, 2, 3, 4....]   (size Nouts)
    norm = K.sum(inputs, axis=1)          # normalization of all outputs by batch. shape is 1D array of size batch
    wavg = K.sum(idx*inputs, axis=1)/norm # array of size batch with weighted avg. of mean in units of bins
    wavg_cm = wavg*BINSIZE + XMIN         # array of size batch with weighted avg. of mean in physical units

    return wavg_cm

#-----------------------------------------------------
# DefineModel
#-----------------------------------------------------
# This is used to define the model. It is only called if no model
# file is found in the model_checkpoints directory.
def DefineModel():

    # Build model
    inputs = Input(shape=(height, width, 1), name='image_inputs')
    x = Flatten()(inputs)
    x = Dense( int(Nouts*5), activation='linear')(x)
    x = Dense( Nouts, activation='relu')(x)
    x = Lambda(MyWeightedAvg, output_shape=(1,), name='z_output')(x)
    model = Model(inputs=inputs, outputs=[x])

    # Compile the model and print a summary of it
    opt = Adadelta(clipnorm=1.0)
    model.compile(loss='mean_squared_error', optimizer=opt)

    return model

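I expected the two to give the same results, but the custom loss function seems to train well, with loss values that decrease steadily over several epochs, while the Lambda version drops to a value of 18.72... and then oscillates close to that.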

1 answer:

Answer 0 (score: 0)

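Use keepdims=True in the K.sum operations. This is needed to keep the correct shape: without it, the Lambda layer returns a tensor of shape (batch,) rather than (batch, 1).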
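Why the shape matters can be illustrated with NumPy broadcasting, which follows the same rules as the backend tensor operations. This sketch assumes the labels have shape (batch, 1), as the y_true[:,0] indexing in the question's custom loss suggests; the array names and values are made up for illustration.

```python
import numpy as np

y_true = np.array([[1.0], [2.0], [3.0]])   # labels, shape (3, 1)
pred_flat = np.array([1.0, 2.0, 3.0])      # like K.sum(..., axis=1): shape (3,)
pred_col = pred_flat[:, np.newaxis]        # like keepdims=True: shape (3, 1)

diff_flat = y_true - pred_flat             # broadcasts to shape (3, 3)
diff_col = y_true - pred_col               # stays shape (3, 1)

# Every label is compared with every prediction in the flat case,
# so the error is non-zero even though the predictions are exact.
mse_flat = np.mean(np.square(diff_flat))
mse_col = np.mean(np.square(diff_col))
print(diff_flat.shape, mse_flat)
print(diff_col.shape, mse_col)
```

With the (batch,) output, the squared error is averaged over all label/prediction pairs in the batch, so it cannot reach zero unless all labels are equal. That would be consistent with a loss that plateaus instead of continuing to fall, though the question does not give enough detail to confirm it.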

Try the following:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Input, Lambda
from tensorflow.keras.models import Model
from tensorflow.keras import backend as K

BINSIZE = 1
XMIN = 0

def weighted_avg(inputs):
    # Calculate weighted sum of inputs
    ones = K.ones_like(inputs[0,:])       # [1, 1, 1, 1....]   (size Nouts)
    idx  = K.cumsum(ones)                 # [1, 2, 3, 4....]   (size Nouts)
    norm = K.sum(inputs, axis=-1, keepdims=True)          # per-sample normalization; keepdims gives shape (batch, 1)
    wavg = K.sum(idx*inputs, axis=-1, keepdims=True)/norm # shape (batch, 1): weighted avg. of mean in units of bins
    wavg_cm = wavg*BINSIZE + XMIN         # shape (batch, 1): weighted avg. of mean in physical units

    return wavg_cm

def make_model():
    # Minimal model: the Lambda layer is applied directly to a 4-element input
    inp = Input(shape=(4,))
    out = Lambda(weighted_avg)(inp)
    model = Model(inp, out)
    model.compile(optimizer='adam', loss='mse')
    return model

model = make_model()
model.summary()

Simple test code:

import numpy as np
X = np.array([
    [1, 1, 1, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1]
])
model.predict(X)

predict should return a column vector, for example:

array([[2.5],
       [1. ],
       [2. ],
       [3. ],
       [4. ]], dtype=float32)