How to make a new prediction

Time: 2018-07-04 21:46:36

Tags: python machine-learning artificial-intelligence prediction

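Let me explain: I am working with an artificial neural network. The model has 15 variables: 14 independent variables and one dependent variable. Among the independent variables I have 3 categorical ones (day of week, month, direction (north, south, etc.)). I have label-encoded them (monday = 1, tuesday = 2, and so on), and I have also one-hot encoded them (monday = [1,0,0,0], tuesday = [0,1,0,0]).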

My question is: how can I make a prediction with new values, such as these?

X=['Monday','January','South']

Here is the code:

# Classification template

# Importing the libraries
import numpy as np
import pandas as pd

# Importing the dataset

dataset = pd.read_csv('clean.csv')

X = dataset.iloc[:, [4,5,6,9,12,15,16]].values
y = dataset.iloc[:, 14].values

#Encoding categorical Data
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelenconder_X = LabelEncoder()
X[:,1] = labelenconder_X.fit_transform(X[:,1])

labelenconder_X_2 = LabelEncoder()
X[:,2] = labelenconder_X_2.fit_transform(X[:,2])

labelenconder_X_7 = LabelEncoder()
X[:,4] = labelenconder_X_7.fit_transform(X[:,4])

labelenconder_X_9 = LabelEncoder()
X[:,5] = labelenconder_X_9.fit_transform(X[:,5])

labelenconder_X_10 = LabelEncoder()
X[:,6] = labelenconder_X_10.fit_transform(X[:,6])

onehotencoder = OneHotEncoder(categorical_features=[1,2,4,5,6])

X = onehotencoder.fit_transform(X).toarray()

X = X[:, 1:]



# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = 0)

# Feature Scaling
#from sklearn.preprocessing import StandardScaler
#sc = StandardScaler()
#X_train = sc.fit_transform(X_train)
#X_test = sc.transform(X_test)

# Fitting classifier to the Training set
# Create your classifier here
import keras
from keras.models import Sequential
from keras.layers import Dense


classifier = Sequential()

                    #INPUT LAYER AND HIDDEN LAYER
classifier.add(Dense(units = 5, kernel_initializer = 'uniform', activation = 'relu', input_dim =9))

                    #ADDING SECOND HIDDEN LAYER
classifier.add(Dense(units = 5, kernel_initializer = 'uniform', activation =  'relu'))

                    #adding output node 
classifier.add(Dense(units= 1, kernel_initializer = 'uniform', activation = 'sigmoid'))

                    #Applying Stochastic Gradient Descent

classifier.compile(optimizer='adam', loss = 'binary_crossentropy', metrics=['accuracy'])

classifier.fit(X_train, y_train, batch_size =28, epochs = 100)


classifier.save('ANN2.h5')
model = keras.models.load_model('ANN2.h5')
y_predict = model.predict(X_test)
y_predict = (y_predict > 0.40)

1 Answer:

Answer 0 (score: 0)

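If you want to encode every day of the week for prediction, Monday should probably be [1,0,0,0,0,0,0]. Alternatively, you could use regression (0.0-6.0) instead of classification.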
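A minimal sketch of the seven-element encoding mentioned above, in plain NumPy. The helper name one_hot_day and the Monday-first ordering are assumptions made for illustration; whatever ordering is chosen has to match the one used when the model was trained.

```python
import numpy as np

# Assumed category order (hypothetical); it must match the training-time order.
DAYS = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
        'Friday', 'Saturday', 'Sunday']

def one_hot_day(day):
    """Return a length-7 vector with a single 1 at the position of `day`."""
    vec = np.zeros(len(DAYS), dtype=int)
    vec[DAYS.index(day)] = 1
    return vec

monday = one_hot_day('Monday')
tuesday = one_hot_day('Tuesday')
```

With seven categories each day gets its own slot, so no two days share a vector, which the four-element example in the question cannot guarantee.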

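However, since you are using X here rather than y, I am not sure whether the X=['Monday','January','South'] you provided is meant as input rather than output (y). If it is input, you do not need one-hot encoding and can encode it with integer codes instead, for example: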

  • Monday: 0
  • Tuesday: 1
  • ...
  • January: 0
  • February: 1
  • North: 0
  • East: 1
  • ...
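The mapping in the list above could be applied to a new sample roughly as follows. This is only a sketch: the dictionary names and the encode_sample helper are hypothetical, the tables are deliberately incomplete, and the code 2 for 'South' is an assumption, since the list does not give it. It also only makes sense for a model trained on the same three integer-encoded columns, which differs from the one-hot pipeline in the question.

```python
import numpy as np

# Hypothetical lookup tables following the list above. Extend each one with
# every category seen in training, using the same codes as at training time.
DAY_CODES = {'Monday': 0, 'Tuesday': 1}
MONTH_CODES = {'January': 0, 'February': 1}
DIRECTION_CODES = {'North': 0, 'East': 1, 'South': 2}  # 'South': 2 is assumed

def encode_sample(sample):
    """Turn [day, month, direction] strings into a (1, 3) integer array."""
    day, month, direction = sample
    return np.array([[DAY_CODES[day],
                      MONTH_CODES[month],
                      DIRECTION_CODES[direction]]])

X_new = encode_sample(['Monday', 'January', 'South'])
# X_new is 2-D (one row per sample), the shape model.predict() expects.
```

An unknown category raises a KeyError here, which is usually preferable to silently feeding the network a code it never saw during training.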

I agree with @morsecodist that more information is needed to answer your question properly.