Keras Tokenizer

Time: 2018-08-15 14:28:57

Tags: tensorflow keras

Does anyone know how keras.preprocessing.text.Tokenizer actually works?

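import numpy as np
from keras.preprocessing.sequence import pad_sequences

# tokenizer, sequences, labels, maxlen, training_samples and
# validation_samples are defined earlier (not shown in this excerpt).
word_index = tokenizer.word_index
print('Found %s unique tokens.' % len(word_index))

# Pad or truncate every sequence to exactly maxlen entries.
data = pad_sequences(sequences, maxlen=maxlen)
labels = np.asarray(labels)
print('Shape of data tensor:', data.shape)
print('Shape of label tensor:', labels.shape)

# Shuffle samples and labels with the same permutation.
indices = np.arange(data.shape[0])
np.random.shuffle(indices)
data = data[indices]
labels = labels[indices]

# Split into training and validation sets.
x_train = data[:training_samples]
y_train = labels[:training_samples]
x_val = data[training_samples: training_samples + validation_samples]
y_val = labels[training_samples: training_samples + validation_samples]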
  

Found 88413 unique tokens.
Shape of data tensor: (24984, 100)
Shape of label tensor: (24984,)

tokenizer.texts_to_sequences('You are Amrock!')
  

Out[18]: [[5128], [1601], [1205], [], [3], [1480], [962], [], [3], [1978], [1480], [1601], [1144], [2292], []]

What exactly does Out[18] mean?
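One observation that may help: Out[18] has 15 entries and 'You are Amrock!' has 15 characters, which suggests the string was iterated character by character, with each character treated as a separate text. The sketch below is a simplified pure-Python imitation of that behavior, not Keras's actual code. The helper names (to_words, fit_word_index, texts_to_sequences), the FILTERS constant, and the toy corpus are all made up for illustration.

```python
from collections import Counter

# Characters replaced by spaces before splitting. Chosen to resemble the
# Tokenizer's default filters (an assumption of this simplified sketch).
FILTERS = '!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n'
_TABLE = str.maketrans({ch: ' ' for ch in FILTERS})


def to_words(text):
    """Lowercase the text, drop filtered characters, split on whitespace."""
    return text.lower().translate(_TABLE).split()


def fit_word_index(texts):
    """Map each word to an integer rank: the most frequent word gets 1."""
    counts = Counter(word for text in texts for word in to_words(text))
    # sorted() is stable, so ties keep their first-seen order.
    ordered = sorted(counts.items(), key=lambda item: -item[1])
    return {word: rank + 1 for rank, (word, _) in enumerate(ordered)}


def texts_to_sequences(word_index, texts):
    """Convert each element of texts into a list of word indices.

    Words missing from word_index are skipped. If texts is a plain str,
    the loop visits one character at a time.
    """
    return [
        [word_index[word] for word in to_words(text) if word in word_index]
        for text in texts
    ]


# Hypothetical toy corpus, only for demonstration.
corpus = ["you are a rock", "a rock is a rock", "you rock"]
word_index = fit_word_index(corpus)

# A list containing one text gives one sequence of word indices.
as_list = texts_to_sequences(word_index, ["You are a rock!"])

# A bare string gives one (mostly empty) list per character.
as_string = texts_to_sequences(word_index, "You are Amrock!")

print(word_index)
print(as_list)
print(as_string)
```

Under this reading, the empty lists in Out[18] correspond to the two spaces and the '!', and the repeated [3] is the letter 'a' (once lowercased from 'A'). With the real Tokenizer, wrapping the sentence in a list, as in tokenizer.texts_to_sequences(['You are Amrock!']), should give a single sequence of word indices instead.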

0 answers:

No answers