Can't print the shape of my tensor (Keras)

Time: 2018-12-24 02:40:25

Tags: python keras

I'm new to Keras and ran into a problem when trying to print the shape of my training data so that I can use it as input_shape. Here is my code so far:

df = pd.read_csv(pathname, encoding = "ISO-8859-1")
df = df[['content_cleaned', 'meaningful']] 
df = df.sample(frac=1) #Shuffling the data

X = np.asarray(df[['content_cleaned']])
y = np.asarray(df[['meaningful']])

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=21) 

tokenizer = Tokenizer() 
X_train = keras.preprocessing.text.Tokenizer(num_words=100)
X_test = keras.preprocessing.text.Tokenizer(num_words=100)

encoder = LabelBinarizer()
encoder.fit(y_train) 
y_train = encoder.transform(y_train)
encoder.fit(y_test)
y_test = encoder.transform(y_test)

print(X_train.shape)

The code fails at the final print statement. Error message:

AttributeError: 'Tokenizer' object has no attribute 'shape'

Again, I'm very new to this and can't figure out how to get past this error. Any help would be great!

Edit: I modified the code a bit to try to implement another user's suggestion. Here is the changed code:

# Create tokenizer
tokenizer = Tokenizer(num_words=100) #No row has more than 100 words.

#Tokenize the predictors (text)
X_train = tokenizer.sequences_to_matrix(X_train, mode="binary")
X_test = tokenizer.sequences_to_matrix(X_test, mode="binary")

It fails on the line that assigns X_train. The error message is:

TypeError: '>=' not supported between instances of 'str' and 'int'

Edit 2: With the following change the code runs, but when I run the print command, nothing is printed:

X_train = tokenizer.sequences_to_matrix(int(input(X_train)), mode="binary")
X_test = tokenizer.sequences_to_matrix(int(input(X_test)), mode="binary")

1 answer:

Answer 0 (score: 0)

I believe this is because, although you first set it up as a numpy array...

X = np.asarray(df[['content_cleaned']])

...and split it into training and test data...

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=21)

...you then overwrite X_train with a Tokenizer object, which has no 'shape' attribute.

X_train = keras.preprocessing.text.Tokenizer(num_words=100)
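One possible way around this is to keep the Tokenizer in its own variable, fit it on the training text, and let it return an array. The sketch below is only an illustration: train_texts and test_texts are made-up stand-ins for the 'content_cleaned' column after the split, passed as flat lists of strings (for example df['content_cleaned'] with single brackets) rather than the 2-D array used in the question, and the import path assumes a TensorFlow 2 install. It also covers the two edits. sequences_to_matrix expects lists of integer word indices, so passing raw strings leads to the str/int comparison error, while texts_to_matrix accepts the strings directly. In Edit 2, input() waits for keyboard input, which is most likely why nothing appears to print. Note also that num_words caps the vocabulary size, not the number of words per row.

```python
from tensorflow.keras.preprocessing.text import Tokenizer

# Hypothetical sample data standing in for the question's text column.
train_texts = ["the cat sat on the mat", "dogs chase cats", "the mat is red"]
test_texts = ["a red cat", "dogs sat"]

# Keep the tokenizer in its own variable instead of overwriting X_train.
# num_words limits the vocabulary to the most frequent words.
tokenizer = Tokenizer(num_words=100)

# Learn the word index from the training text only.
tokenizer.fit_on_texts(train_texts)

# texts_to_matrix takes raw strings and returns a NumPy array
# with one row per text and num_words columns.
X_train = tokenizer.texts_to_matrix(train_texts, mode="binary")
X_test = tokenizer.texts_to_matrix(test_texts, mode="binary")

print(X_train.shape)  # (3, 100)
print(X_test.shape)   # (2, 100)

# The per-sample shape is what a first Keras layer would take as input_shape.
input_shape = X_train.shape[1:]
print(input_shape)  # (100,)
```

With real data the first dimension will be the number of rows in each split, and the second will stay equal to num_words.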