I'm having trouble understanding the output shape of the weight matrix in Keras.
I have a simple BiLSTM, as shown below:
import numpy as np
from keras.models import Sequential
from keras.layers import Embedding, Bidirectional, LSTM, Dense
from keras.optimizers import RMSprop

model = Sequential()
model.add(Embedding(vocab_size, embedding_size, input_length=55, weights=[pretrained_weights]))
model.add(Bidirectional(LSTM(units=embedding_size)))
model.add(Dense(5926, activation='softmax'))  # number of classes
print(model.summary())

weights = model.layers[-1].get_weights()
print(weights)
print(len(weights))
print(weights[0][0].shape)
print(weights[0][0])

for e in zip(model.layers[-1].trainable_weights, model.layers[-1].get_weights()):
    print('Param %s:\n%s' % (e[0], e[1]))

model.compile(loss='categorical_crossentropy',
              optimizer=RMSprop(lr=0.0005),
              metrics=['accuracy'])
model.fit(np.array(X_train), np.array(y_train), epochs=100, validation_data=(np.array(X_val), np.array(y_val)))
If I print the weight shapes of the last layer, I get this:
Param <tf.Variable 'dense_14/kernel:0' shape=(200, 5926) dtype=float32_ref>:
So the shape is (200, 5926): 200 is the output size of the Bidirectional LSTM (twice units=embedding_size, since the forward and backward states are concatenated), and 5926 is the number of neurons in my network, matching the number of classes. I would like to find a way to extract the weights associated with each prediction, because I then need to update the weight matrix.
My test set consists of 680 sentences, with 1 label per sentence. The predictions have the following shape:
predictions = model.predict(np.array(X_test))
# shape predictions = (680, 5926)
Is there a way to extract, for each prediction, the corresponding weights from the softmax layer (shape = (680, 5926))? Something like:
predictions = [probability_class_1, probability_class_2,......, probability_class_5926]
weights = [weight_class_1, weight_class_2, ......., weight_class_5926]
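As a starting point, here is a minimal sketch (assuming the trained model above) of how such per-prediction weights could be read off the layer: the Dense kernel has shape (200, 5926), so column j holds the 200 weights feeding class j, and for each sentence the weights tied to its prediction are the column of its argmax class:

W, b = model.layers[-1].get_weights()  # W: (200, 5926), b: (5926,)
predictions = model.predict(np.array(X_test))    # (680, 5926)
predicted_classes = predictions.argmax(axis=-1)  # (680,)
weights_per_prediction = W[:, predicted_classes].T  # (680, 200): one weight vector per sentence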
Answer 0 (score: 1):
You should use a second input carrying a mask that tells which verbs apply to each sentence, and perform a simple element-wise multiplication:
from keras.models import Model
from keras.layers import Input, Embedding, Bidirectional, LSTM, Dense, Multiply, Activation

sentenceInputs = Input((sentenceLength,))
desiredVerbs = Input((5926,))

sentenceOutputs = Embedding(vocab_size, embedding_size, input_length=55, weights=[pretrained_weights])(sentenceInputs)
sentenceOutputs = Bidirectional(LSTM(units=embedding_size))(sentenceOutputs)
sentenceOutputs = Dense(5926)(sentenceOutputs)

# zero out the logits of the undesired verbs, then apply the softmax
selectedOutputs = Multiply()([sentenceOutputs, desiredVerbs])
selectedOutputs = Activation('softmax')(selectedOutputs)

model = Model([sentenceInputs, desiredVerbs], selectedOutputs)
Now create an array with the desired verbs:
desired = np.zeros((X_train.shape[0], 5926))

# for each sentence, set its desired verbs to one
# (how you fill this in is up to you)
desired[sentenceIndex, verbIndex] = 1.

# if the desired verbs are the same for all sentences:
verbs = [selectedVerbIndex1, selectedVerbIndex2, ......]
for verbIndex in verbs:
    desired[:, verbIndex] = 1.
Fit with both inputs:
model.fit([np.array(X_train), desired], np.array(y_train), ......)
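Note that the mask also has to be supplied at prediction time. A minimal sketch, assuming a desired_test mask built from the same verbs list as above for the 680 test sentences:

desired_test = np.zeros((len(X_test), 5926))
for verbIndex in verbs:  # same verb indices as above
    desired_test[:, verbIndex] = 1.

predictions = model.predict([np.array(X_test), desired_test])  # (680, 5926)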
The class_weight parameter of fit: you may try using the original model (without the suggestion above) and pass the class_weight parameter to the fit method. (Wouldn't a model that drops the other classes entirely, keeping only a Dense(5) output, be more interesting? See the sketch after the code below.) I'm also not really sure the resulting weights will behave well:
# weight 1 for the desired verbs, weight 0 for all the others
verbWeights = {i: 0. for i in range(5926)}
desiredVerbs = [verbIndex1, verbIndex2, ....]
for verb in desiredVerbs:
    verbWeights[verb] = 1.

model.fit(X_train, y_train, class_weight=verbWeights, ....)
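For completeness, a minimal sketch of the Dense(5) idea mentioned above: drop the other classes entirely and train only on the selected verbs. This assumes the labels are re-indexed so that class k corresponds to the k-th selected verb (an assumption, not code from the question):

# same architecture as the original model, but with only 5 output classes
small_model = Sequential()
small_model.add(Embedding(vocab_size, embedding_size, input_length=55,
                          weights=[pretrained_weights]))
small_model.add(Bidirectional(LSTM(units=embedding_size)))
small_model.add(Dense(5, activation='softmax'))  # only the 5 desired verbs
small_model.compile(loss='categorical_crossentropy',
                    optimizer=RMSprop(lr=0.0005),
                    metrics=['accuracy'])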