How do I extract the weights from a softmax layer?

Time: 2018-05-18 11:20:05

Tags: python keras neural-network lstm

I am having trouble understanding the output shape of the weight matrix in Keras.

I have a simple BiLSTM, as follows:

model = Sequential()
model.add(Embedding(vocab_size, embedding_size, input_length=55, weights=[pretrained_weights])) 
model.add(Bidirectional(LSTM(units=embedding_size)))
model.add(Dense(5926, activation='softmax')) # number of classes

print(model.summary())

weights = model.layers[-1].get_weights()
print(weights)
print(len(weights))
print(weights[0][0].shape)
print(weights[0][0])

for e in zip(model.layers[-1].trainable_weights, model.layers[-1].get_weights()):
    print('Param %s:\n%s' % (e[0],e[1]))

model.compile(loss='categorical_crossentropy',
          optimizer = RMSprop(lr=0.0005),
          metrics=['accuracy'])

model.fit(np.array(X_train), np.array(y_train), epochs=100, validation_data=(np.array(X_val), np.array(y_val)))

If I print the shape of the last layer's weights, I get this:

Param <tf.Variable 'dense_14/kernel:0' shape=(200, 5926) dtype=float32_ref>:

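So the shape is (200, 5926).

That is, the number of neurons in my network by the number of classes. I would like to find a way to extract the weights associated with each prediction, because I then need to update the weight matrix.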
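For reference, here is a minimal NumPy sketch of what a Dense softmax layer computes, with toy sizes standing in for 680, 200 and 5926. It assumes that `get_weights()` on a Dense layer returns `[kernel, bias]`, and that the 200 comes from the concatenated forward and backward LSTM outputs under the default merge mode. The `features` array is a random stand-in for the BiLSTM output, not real data.

```python
import numpy as np

# Toy sizes standing in for 680 sentences, 200 BiLSTM features, 5926 classes.
n_sentences, n_features, n_classes = 4, 6, 5

rng = np.random.default_rng(0)
features = rng.normal(size=(n_sentences, n_features))  # stand-in for the BiLSTM output
kernel = rng.normal(size=(n_features, n_classes))      # plays the role of get_weights()[0]
bias = np.zeros(n_classes)                             # plays the role of get_weights()[1]

def softmax(z):
    # Subtract the row maximum for numerical stability.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# One row of class probabilities per sentence.
predictions = softmax(features @ kernel + bias)  # shape (n_sentences, n_classes)

# The same kernel is used for every sentence; column j feeds class j.
class_index = 2
weights_for_class = kernel[:, class_index]       # shape (n_features,)
```

Under these assumptions the kernel has one column per class but no per-sentence dimension, so there is no (680, 5926) weight array to read off directly. Only the predictions have that shape.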

My test set consists of 680 sentences, with one label per sentence. The predictions have the following shape:

predictions = model.predict(np.array(X_test))
# shape predictions = (680, 5926)

Is there a way to extract the weights for each prediction from the softmax layer (shape = (680, 5926))? Something like:

predictions = [probability_class_1, probability_class_2,......, probability_class_5926] 
weights = [weight_class_1, weight_class_2, ......., weight_class_5926]

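1 Answer:

Answer 0: (score: 1)

You should use a second input holding a mask that says which verbs apply to which sentences, and perform a simple element-wise multiplication: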

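from keras.layers import Input, Embedding, Bidirectional, LSTM, Dense, Multiply, Activation
from keras.models import Model

sentenceLength = 55  # matches input_length below

sentenceInputs = Input((sentenceLength,))
desiredVerbs = Input((5926,))  # one mask value per class

sentenceOutputs = Embedding(vocab_size, embedding_size, input_length=55, weights=[pretrained_weights])(sentenceInputs)
sentenceOutputs = Bidirectional(LSTM(units=embedding_size))(sentenceOutputs)

# no activation here: the softmax is applied after the mask
sentenceOutputs = Dense(5926)(sentenceOutputs)
selectedOutputs = Multiply()([sentenceOutputs, desiredVerbs])
selectedOutputs = Activation('softmax')(selectedOutputs)

model = Model([sentenceInputs, desiredVerbs], selectedOutputs)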
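One thing worth checking is what the multiplication does numerically for a single sentence. The NumPy sketch below uses made-up logits and a made-up mask. It shows that multiplying by 0 turns a masked logit into 0, which the softmax still maps to a non-zero probability. The additive variant at the end is an alternative added here for comparison only; it is an assumption, not part of the model above.

```python
import numpy as np

def softmax(z):
    # Subtract the maximum for numerical stability.
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

logits = np.array([2.0, -1.0, 0.5, 3.0])  # hypothetical Dense outputs for one sentence
mask = np.array([1.0, 0.0, 0.0, 1.0])     # 1 = desired class, 0 = not desired

# What Multiply followed by softmax computes: masked logits become 0, not -inf,
# so the masked classes keep some probability mass.
multiplied = softmax(logits * mask)

# Alternative for comparison (an assumption): push masked logits far down,
# so their probability is effectively zero.
additive = softmax(logits + (1.0 - mask) * -1e9)
```

Whether the leftover probability on masked classes matters depends on how the outputs are used afterwards.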

Now, create an array with the desired verbs:

desired = np.zeros((X_train.shape[0], 5926))

#for each sentence, make the desired verbs be one:
desired[sentenceIndex, verbIndex] = 1.

#now, how you're going to do this is up to you

#if they're the same for all sentences:
verbs = [selectedVerbIndex1, selectedVerbIndex2, ...... ]
for verbIndex in verbs:
    desired[:, verbIndex] = 1.
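When the verbs differ per sentence, the mask can be filled row by row. The sketch below is one way to do it, using toy sizes in place of `X_train.shape[0]` and 5926. The list `verbs_per_sentence` is a hypothetical input holding, for each sentence, the class indices to keep.

```python
import numpy as np

n_sentences, n_classes = 3, 8  # toy sizes standing in for X_train.shape[0] and 5926

# Hypothetical input: for each sentence, the class indices to keep.
verbs_per_sentence = [[0, 3], [5], [1, 2, 7]]

desired = np.zeros((n_sentences, n_classes), dtype="float32")
for sentence_index, verb_indices in enumerate(verbs_per_sentence):
    # Fancy indexing sets all listed columns of this row at once.
    desired[sentence_index, verb_indices] = 1.0
```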

Fit with both inputs:

model.fit([np.array(X_train), desired], np.array(y_train), ......)

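Using the class_weight parameter in fit:

You can try using the original model (without following the suggestion above) and pass the class_weight parameter to the fit method.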

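This will behave a bit differently, though. You can only select verbs when training, not when predicting. You cannot select different verbs for different sentences. The other verbs will never get any training (maybe a model with Dense(5) would be more interesting?).

I'm also not quite sure whether weights like these are reasonable.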

verbWeights = { i: 0. for i in range(5926) }

desiredVerbs = [verbIndex1, verbIndex2, .... ]
for verb in desiredVerbs:
    verbWeights[verb] = 1.

model.fit(X_train, y_train, class_weight = verbWeights, ....)
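To see why the other verbs would get no training, it may help to look at roughly what a class weight does to the loss. The NumPy sketch below is not Keras's implementation. It assumes that each sample's loss is scaled by the weight of its true class, and it uses toy sizes, made-up labels and uniform predictions.

```python
import numpy as np

n_classes = 6           # toy size standing in for 5926
desired_verbs = [1, 4]  # hypothetical class indices to keep

# Same construction as above: weight 1 for desired classes, 0 for the rest.
verb_weights = {i: 0.0 for i in range(n_classes)}
for verb in desired_verbs:
    verb_weights[verb] = 1.0

y_true = np.array([1, 2, 4])                      # made-up true class per sample
probs = np.full((3, n_classes), 1.0 / n_classes)  # made-up uniform predictions

# Categorical cross-entropy per sample, then scaled by the true class's weight.
per_sample_loss = -np.log(probs[np.arange(3), y_true])
sample_weights = np.array([verb_weights[c] for c in y_true])
weighted_loss = per_sample_loss * sample_weights
```

Under that assumption the sample whose true class is 2 contributes a loss of exactly zero, so it produces no gradient.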