So I am trying to do simple classification with TensorFlow, and I have a doubt:
If I use an LSTM for text classification (e.g. sentiment classification), we pad the data and then run it through a word embedding before the LSTM, so after the word_embedding lookup the 2-D data becomes 3-D (a rank-2 matrix becomes rank 3):
For example, suppose I have two texts:
import tensorflow as tf
text_seq=[[11,21,43,22,11,4,1,3,5,2,8],[4,2,11,4,11,0,0,0,0,0,0]] #2x11
#text_seq are index of words from word_to_index dict
a=tf.get_variable('word_embedding',shape=[50,50],dtype=tf.float32,initializer=tf.random_uniform_initializer(-0.01,0.01))
lookup=tf.nn.embedding_lookup(a,text_seq)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(lookup).shape)
I will get:
(2, 11, 50)
I can easily feed this to an LSTM, since an LSTM accepts rank-3 input.
But my question is: suppose I have numerical float data instead of text data, and I want to use an RNN for classification.
Suppose my data is:
import numpy as np
float_data=[[11.1,21.5,43.6,22.1,11.44],[33.5,12.7,7.4,73.1,89.1],[33.5,12.7,7.4,73.1,89.1],[33.5,12.7,7.4,73.1,89.1],[33.5,12.7,7.4,73.1,89.1],[33.5,12.7,7.4,73.1,89.1]]
labels=[1,2,3,4,5,6]
batch_size=2
input_data_batch=[[11.1,21.5,43.6,22.1,11.44],[33.5,12.7,7.4,73.1,89.1]] #2x5
#now should I reshape my data to make it rank 3 like this
reshape_one=np.reshape(input_data_batch,[-1,batch_size,5])
print(reshape_one)
# or like this ?
reshape_two=np.reshape(input_data_batch,[batch_size,-1,5])
print(reshape_two)
Output:
first one
[[[11.1  21.5  43.6  22.1  11.44]
  [33.5  12.7   7.4  73.1  89.1 ]]]
second one
[[[11.1  21.5  43.6  22.1  11.44]]

 [[33.5  12.7   7.4  73.1  89.1 ]]]
Answer 0 (score: 0)
LSTMs and other sequence models can take inputs that are either time-major (dimensions are time, batch, channels) or batch-major (dimensions are batch, time, channels). I don't know which implementation you are using or which flags you are passing to TensorFlow, so I can't tell from the code you provided whether you want batch-major or time-major input.
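As a sketch of the two layouts (assuming, for illustration, that each of the 5 floats in a row is one time step with a single feature, so the channel dimension is 1), the reshape and the transpose between them look like:

```python
import numpy as np

# Two sequences of 5 time steps each; treating each float as one
# time step with a single feature is an assumption for illustration.
batch = np.array([[11.1, 21.5, 43.6, 22.1, 11.44],
                  [33.5, 12.7,  7.4, 73.1, 89.1]])

# Batch-major layout: (batch, time, channels)
batch_major = batch.reshape(2, 5, 1)
print(batch_major.shape)   # (2, 5, 1)

# Time-major layout: (time, batch, channels) -- swap the first two axes
time_major = np.transpose(batch_major, (1, 0, 2))
print(time_major.shape)    # (5, 2, 1)
```

With `tf.nn.dynamic_rnn` you would pick one of these layouts and set the `time_major` flag to match (it defaults to `False`, i.e. batch-major).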