Question

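I have implemented a Python script in which I try to train a CNN on the 2 features I pass in (sentence, probability) and predict whether a sentence is true or false. This is similar to the sentiment analysis task that is popular in this domain.

Initially, I generate word embeddings for the sentences, which I call triples. Each triple/sentence has exactly 5 words. The word embeddings therefore look as follows.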

Number of lines 1860
[[2, 194, 21, 17, 227], [12, 228, 22, 17, 229], [2, 230, 21, 17, 229], ...]
Shape of triple:  (1860, 5)
Shape of truth:  (1860,)

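triple holds the sentences and truth is the target class.

In my dataset I have 3 fields (including the target class, truth), of which the following 2 are the features I want to train the model on:

The triples, or sentences (which I have already converted into vectors of word embeddings).
The probability of each sentence (a soft truth value in the range [0,1]).

Therefore, I defined a multi-input CNN model in which the first input is the vector of word embeddings and the second input is the probability. I then merge these 2 inputs, and up to this point everything seems to work fine.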

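However, I run into trouble when passing the two arrays to the model: the word embedding vector array and the probabilities array, which I defined as stv.

I tried to fit these two features with the model.fit call shown further below.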

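However, I keep getting the following error message.

ValueError: Error when checking input: expected input_1 to have 3 dimensions, but got array with shape (1302, 5)

Python implementation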

model.fit([X_train_pad,np.array(stv[:-num_validation_samples])], y_train, batch_size=128, epochs=25, validation_data=([X_test_pad,np.array(stv[-num_validation_samples:])], y_test), verbose=2)

Any suggestions on how to resolve this issue would be much appreciated.

Answer 1

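A Conv1D layer in Keras expects to be fed a 3D tensor of shape (batch_size, steps, channels).
Judging from the error, you appear to be feeding it only a 2D tensor.

If X_train_pad is (1302, 5), I assume it holds 1302 samples, each of which should be a (5, 1) array.

So, before fitting the model, try the following: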

X_train_pad = np.expand_dims(X_train_pad, -1)
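
To make the effect of that call concrete, here is a minimal sketch using a zero-filled stand-in for X_train_pad (hypothetical data; only the shape (1302, 5) comes from the error message). It shows only how the number of dimensions changes, not a full training run.

```python
import numpy as np

# Stand-in for X_train_pad: 1302 samples of 5 integer tokens (hypothetical data).
X_train_pad = np.zeros((1302, 5), dtype="int32")
print(X_train_pad.shape, X_train_pad.ndim)

# Append a trailing axis of size 1 so each sample becomes a (5, 1) array.
X_train_3d = np.expand_dims(X_train_pad, -1)
print(X_train_3d.shape, X_train_3d.ndim)
```

This applies when the array is fed straight into Conv1D. If the first layer is an Embedding, as in the model in Answer 2, that layer takes the 2D integer array and produces the 3D tensor itself, so the extra axis should not be needed.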

Answer 2

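Based on some guidance I received, it seems that I don't need to do any training on the probability (psl) input. Only the embedding branch needs trainable layers up to the merge: the psl input can be concatenated directly with the pooled output of the embedding branch before fitting the model.

So, here is the modified part of the model from the working script.

Defining the model layers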

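# Assumed imports: the original snippet omits them, and the exact Keras
# package used by the author is not stated.
import numpy as np
from termcolor import colored
from tensorflow.keras import layers, models

input1 = layers.Input(shape=(max_length,))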
embedding = layers.Embedding(vocab_size, EMBEDDING_DIM, input_length=max_length)(input1)

cov = layers.Conv1D(128, 4, activation='relu')(embedding)
pooling = layers.GlobalMaxPooling1D()(cov)

input2 = layers.Input(shape=(1,))

concat = layers.Concatenate(axis=-1)([pooling, input2])

l1 = layers.Dense(10, activation='relu')(concat)
out = layers.Dense(1, activation='sigmoid')(l1)

model = models.Model(inputs=[input1, input2], outputs=[out])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
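
As a rough check on how the tensor widths flow through this model, the sketch below works out the shapes by hand. It assumes max_length is 5 (the question says each triple has exactly 5 words) and relies on the Conv1D defaults of 'valid' padding and stride 1; the helper conv1d_valid_length is a hypothetical name introduced only for this illustration.

```python
def conv1d_valid_length(steps: int, kernel_size: int, stride: int = 1) -> int:
    """Output length of a 1D convolution with 'valid' padding (no padding)."""
    return (steps - kernel_size) // stride + 1


max_length = 5   # assumed: words per triple, taken from the question
filters = 128    # Conv1D(128, 4, ...)
kernel_size = 4

# Number of positions the kernel can take along the sentence.
conv_steps = conv1d_valid_length(max_length, kernel_size)
# GlobalMaxPooling1D keeps one value per filter, whatever conv_steps is.
pooled_width = filters
# Concatenate appends the single probability value from input2.
concat_width = pooled_width + 1

print(conv_steps, pooled_width, concat_width)
```

Under these assumptions the convolution leaves 2 steps, pooling gives a 128-wide vector per sample, and the first Dense layer sees 129 features, the last of which is the probability.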

Splitting and fitting the model

VALIDATION_SPLIT = 0.3

indices = np.arange(triple_pad.shape[0])
np.random.shuffle(indices)
triple_pad = triple_pad[indices]
stv = np.asarray(stv)[indices]  # keep the probabilities aligned with the shuffled rows
truth = truth[indices]
num_validation_samples = int(VALIDATION_SPLIT * triple_pad.shape[0])

X_train_pad = triple_pad[:-num_validation_samples]
X_train_psl = stv[:-num_validation_samples]
y_train = truth[:-num_validation_samples]

X_test_pad = triple_pad[-num_validation_samples:]
X_test_psl = stv[-num_validation_samples:]
y_test = truth[-num_validation_samples:]

print('Shape of X_train_pad tensor: ', X_train_pad.shape)
print('Shape of y_train tensor: ', y_train.shape)
print('Shape of X_test_pad tensor: ', X_test_pad.shape)
print('Shape of y_test tensor: ', y_test.shape)

print(colored('Training...', 'green'))

history = model.fit([X_train_pad, X_train_psl], y_train, batch_size=128, epochs=25,
                    validation_data=([X_test_pad, X_test_psl], y_test), verbose=2)
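
When several arrays describe the same samples, every one of them has to be reordered with the same permutation before splitting, otherwise the probabilities no longer belong to their sentences. The following is a minimal sketch of that idea with NumPy only; shuffled_split is a hypothetical helper, and the small arrays are made-up stand-ins for triple_pad, stv and truth.

```python
import numpy as np


def shuffled_split(arrays, validation_split=0.3, seed=0):
    """Shuffle several row-aligned arrays with one permutation, then split each."""
    arrays = [np.asarray(a) for a in arrays]
    n = arrays[0].shape[0]
    if any(a.shape[0] != n for a in arrays):
        raise ValueError("all arrays must have the same number of rows")
    indices = np.random.default_rng(seed).permutation(n)
    n_val = int(validation_split * n)
    train = [a[indices[:n - n_val]] for a in arrays]
    val = [a[indices[n - n_val:]] for a in arrays]
    return train, val


# Hypothetical stand-ins: each probability is derived from its own row,
# so alignment can be verified after shuffling.
triple_pad = np.arange(50).reshape(10, 5)
stv = triple_pad[:, 0] / 100.0
truth = triple_pad[:, 0] % 2

(X_train_pad, X_train_psl, y_train), (X_test_pad, X_test_psl, y_test) = shuffled_split(
    [triple_pad, stv, truth]
)

# An Input(shape=(1,)) layer expects one column per sample.
X_train_psl = X_train_psl.reshape(-1, 1)
X_test_psl = X_test_psl.reshape(-1, 1)
print(X_train_pad.shape, X_train_psl.shape, X_test_pad.shape, X_test_psl.shape)
```

Reshaping the probabilities to a single column matches the (1,) shape declared for input2; whether Keras would accept the 1D array unchanged depends on the version, so the explicit reshape is the cautious choice.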

I hope this helps anyone else who runs into this problem while trying to use multiple inputs in a deep learning model.
