Question

In this code, the author defines two inputs, but the model is only fed one. I would expect an error, yet the code runs. I don't understand why it runs successfully.

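# Imports assumed for standalone Keras; the original post did not show them.
from keras import backend as K
from keras.layers import (Input, Dense, Activation, Lambda, multiply,
                          TimeDistributed, Bidirectional, GRU)
from keras.models import Model


def han():
    # Refer to section 4.2 of the paper while reading the following code

    # Input for one day: max articles per day = 40, dim_vec = 200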
    input1 = Input(shape=(40, 200), dtype='float32')

    # Attention Layer
    dense_layer = Dense(200, activation='tanh')(input1)
    softmax_layer = Activation('softmax')(dense_layer)
    attention_mul = multiply([softmax_layer,input1])
    # End of attention layer

    vec_sum = Lambda(lambda x: K.sum(x, axis=1))(attention_mul)
    pre_model1 = Model(input1, vec_sum)
    pre_model2 = Model(input1, vec_sum)

    # Input of the HAN, shape (None, 11, 40, 200)
    # 11 = window size (N in the paper), 40 = max articles per day, dim_vec = 200
    input2 = Input(shape=(11, 40, 200), dtype='float32')

    # TimeDistributed applies a layer to every temporal slice of an input,
    # so we use it here to apply our attention layer (pre_model1) to the articles of each day
    # and focus on the most critical article
    pre_gru = TimeDistributed(pre_model1)(input2)

    # Bidirectional GRU
    l_gru = Bidirectional(GRU(100, return_sequences=True))(pre_gru)

    # We apply the attention layer to every day to focus on the most critical day
    post_gru = TimeDistributed(pre_model2)(l_gru)

    # MLP to perform classification
    dense1 = Dense(100, activation='tanh')(post_gru)
    dense2 = Dense(3, activation='tanh')(dense1)
    final = Activation('softmax')(dense2)
    final_model = Model(input2, final)
    final_model.summary()

    return final_model

Answer 1

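Keras models can be used as layers. In the code above, input1 is only used to define pre_model{1,2}; it describes the shape those sub-models expect and is never fed directly. final_model then calls these sub-models as layers on tensors derived from input2.

So final_model has a single input layer, input2.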
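
To illustrate the point, here is a minimal sketch of a Keras model being reused as a layer. It is not the author's network: the names day_input, day_model, window_input and outer_model are hypothetical, GlobalAveragePooling1D is only a stand-in for the attention block, and the tensorflow.keras import path is an assumption (the original post does not show its imports).

```python
from tensorflow import keras
from tensorflow.keras import layers

# Sub-model: its Input only describes the shape of one day
# (40 articles x 200 dims). It is a placeholder, not something you feed.
day_input = keras.Input(shape=(40, 200))
day_vector = layers.GlobalAveragePooling1D()(day_input)  # stand-in for attention
day_model = keras.Model(day_input, day_vector)

# Outer model: window_input is the only tensor supplied at fit/predict time.
window_input = keras.Input(shape=(11, 40, 200))
# The sub-model is applied as a layer to each of the 11 days.
per_day = layers.TimeDistributed(day_model)(window_input)
outer_model = keras.Model(window_input, per_day)

print(len(outer_model.inputs))  # 1
```

Although two Input tensors are created, only window_input belongs to outer_model's inputs; day_input lives inside day_model and is replaced by each time slice when the sub-model is called.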
