Convert sequences in an input data set for LSTM

Time: 2018-06-04 17:41:38

Tags: keras lstm

I am completely new to Python and LSTMs. I have a data set that looks like this (with more than 600 labels):

Label,sequence

1,18 2 32 38 20 3314 3315 3316 3317 3318 3319

1,536 794 795 960 25 335 336 26 27

1,296 23 24 25 28 28 336 26 27

1,112 30 28 336 26 27

2,296 23 24 1955 1955 348 27 437

2,112 31 426 3724 3714 3715 3715 529 390 391 390 531

3,1300

3,1300 1320 1321 1322 1322

3,1303 1304 1305 1306 1307 1309 1357 1333

3,1323 1324 1325 1326 1327 1328

3,1300 1320

....

From what I have read, the input data for an LSTM in Keras must be 3D. I do not know how to convert my data to a suitable format. I found this code but am not sure how to use it:

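from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential()
# input_shape=(timesteps, features): 10 timesteps with 16 features each
model.add(LSTM(128, input_shape=(10, 16)))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='sgd')

# fit model
model.fit(X, y, epochs=3, batch_size=16)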

Please advise: how can I reshape my data set?

1 answer:

Answer 0 (score: 0)

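Your sequences have different lengths, so the first step is to use pad_sequences to pad all of them to some maximum length. pad_sequences returns a 2D array of shape (samples, max_seq_length); add a trailing axis and you get (samples, max_seq_length, 1). You can then feed that into LSTM(..., input_shape=(max_seq_length, 1)). To make the LSTM skip the padded timesteps, you must put a Masking layer before the LSTM. In short: padding gives your data a uniform shape, masking skips the padded values, and the LSTM processes each sequence in one pass.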
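The steps above could be sketched roughly as follows. This is a minimal sketch under several assumptions, not a definitive implementation: the RAW sample, the helper names parse_rows and build_and_fit, the softmax output with sparse_categorical_crossentropy (chosen because the question has many labels rather than two), and the adam optimizer are all illustrative and not from the original post. It also assumes 0 never appears as a real value in a sequence, so it can be used as the padding value.

```python
import csv
import io

# Hypothetical sample in the question's "Label,sequence" layout.
RAW = """Label,sequence
1,18 2 32 38 20 3314 3315 3316 3317 3318 3319
1,112 30 28 336 26 27
2,296 23 24 1955 1955 348 27 437
3,1300
3,1300 1320
"""


def parse_rows(text):
    """Split each row into an int label and a list of int sequence values."""
    labels, sequences = [], []
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the "Label,sequence" header
    for row in reader:
        if not row:
            continue  # ignore blank lines
        labels.append(int(row[0]))
        sequences.append([int(token) for token in row[1].split()])
    return labels, sequences


def build_and_fit(labels, sequences, epochs=3):
    """Pad, reshape to 3D, and train an LSTM that ignores the padding."""
    # Imported here so the parsing step above runs without TensorFlow.
    import numpy as np
    from tensorflow.keras import Input
    from tensorflow.keras.layers import LSTM, Dense, Masking
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.preprocessing.sequence import pad_sequences

    # (samples, max_seq_length), padded with 0 up to the longest sequence
    X = pad_sequences(sequences, value=0)
    # Add the feature axis: (samples, max_seq_length, 1)
    X = X[..., np.newaxis].astype("float32")

    # Map the original labels to class indices 0..n_classes-1
    classes = sorted(set(labels))
    class_index = {label: i for i, label in enumerate(classes)}
    y = np.array([class_index[label] for label in labels])

    model = Sequential([
        Input(shape=(X.shape[1], 1)),
        Masking(mask_value=0.0),  # timesteps equal to 0 are skipped
        LSTM(128),
        Dense(len(classes), activation="softmax"),
    ])
    model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
    model.fit(X, y, epochs=epochs, batch_size=16)
    return model


labels, sequences = parse_rows(RAW)
# build_and_fit(labels, sequences)  # uncomment to train (needs TensorFlow)
```

With the real file, you would read its contents into a string and pass that to parse_rows instead of RAW. Feeding raw integer IDs as a single numeric feature follows the shape suggested in the answer; whether that representation suits your data is a separate question.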