Question

I am following the Keras word embeddings tutorial and copied the code from it (with a few modifications):

Using pre-trained word embeddings in a Keras model

It is a topic classification problem in which they load pre-trained word vectors and use them through a frozen embedding layer.

With the pre-trained embedding vectors I do reach 95% accuracy. Here is the code:

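from keras.layers import Conv1D, Dense, Dropout, Embedding, Flatten, Input, MaxPooling1D
from keras.models import Model

# Frozen embedding layer initialized with the pre-trained vectors
embedding_layer = Embedding(len(embed_matrix), len(embed_matrix.columns), weights=[embed_matrix],
                            input_length=data.shape[1], trainable=False)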

sequence_input = Input(shape=(MAXLEN,), dtype='int32')
embedded_sequences = embedding_layer(sequence_input)

x = Conv1D(128, 5, activation='relu')(embedded_sequences)
x = MaxPooling1D(5)(x)
x = Conv1D(128, 5, activation='relu')(x)
x = MaxPooling1D(5)(x)
x = Dropout(0.2)(x)
x = Conv1D(128, 5, activation='relu')(x)
x = MaxPooling1D(35)(x)  # global max pooling
x = Flatten()(x)
x = Dense(128, activation='relu')(x)
output = Dense(target.shape[1], activation='softmax')(x)

model = Model(sequence_input, output)
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['acc'])
model.fit(x_train, y_train, validation_data=(x_test, y_test), epochs=2,
          batch_size=128)

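The problem appears when I drop the pre-trained vectors and use completely random vectors instead: surprisingly, the accuracy is even higher, at 96.5%.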

The code is the same with one change: weights=[random_matrix]. This is a matrix with the same shape as embed_matrix, but filled with random values. So the embedding layer is now:

embedding_layer = Embedding(len(embed_matrix), len(embed_matrix.columns), weights=[random_matrix],
                            input_length=data.shape[1], trainable=False)
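
For reference, a random matrix of this kind could be built roughly as follows. This is only a sketch: the sizes, the uniform range, and the NumPy stand-in for embed_matrix are assumptions for illustration (in my code embed_matrix is a DataFrame loaded from the pre-trained vectors).

```python
import numpy as np

# Hypothetical sizes; in the question they come from embed_matrix itself.
vocab_size, embed_dim = 20000, 100

rng = np.random.default_rng(0)

# Stand-in for the pre-trained matrix (assumption: a plain float32 array).
embed_matrix = rng.normal(size=(vocab_size, embed_dim)).astype("float32")

# Random replacement with the same shape, drawn independently of embed_matrix.
# The uniform range [-1, 1) is an arbitrary choice for this sketch.
random_matrix = rng.uniform(-1.0, 1.0, size=embed_matrix.shape).astype("float32")
```

Only the shape is shared with the pre-trained matrix; none of its values are reused.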

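I ran the experiment with random weights several times and the results were always similar. Note that even though these weights are random, the trainable parameter is still False, so the network does not update them.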

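After that, I removed the embedding layer entirely and used the word-index sequences as input, expecting to confirm that those weights do not contribute to the model's accuracy. With this setup I got only 16% accuracy.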
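
To make that last setup concrete, here is a minimal sketch of feeding normalized word indices directly to the network. The sizes and the random stand-in data are hypothetical, and dividing by the vocabulary size is just one possible normalization.

```python
import numpy as np

MAXLEN = 1000        # hypothetical padded sequence length
vocab_size = 20000   # hypothetical vocabulary size

rng = np.random.default_rng(0)

# Stand-in for the padded index sequences (assumption: integer ids in [0, vocab_size)).
x_train = rng.integers(0, vocab_size, size=(8, MAXLEN))

# Scale the ids into [0, 1) and add a channel axis so that Conv1D
# receives input of shape (batch, steps, 1).
x_norm = (x_train / vocab_size).astype("float32")[..., np.newaxis]
```

With this encoding every word is a single scalar instead of a vector, so the Input layer would be declared with shape=(MAXLEN, 1).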

So what is going on here? How can random embeddings perform as well as or better than the pre-trained ones?

And why does using the word indices (normalized, of course) as input give such low accuracy?

I also tried an LSTM, but the accuracy only reached 15%.

Thanks in advance.

Keras: problem using pre-trained word embeddings

0 answers: