What should the image dimensions be in a convolutional neural network?

Asked: 2018-02-23 19:12:20

Tags: image-processing tensorflow deep-learning keras

I am a beginner in deep learning.

I am trying to identify slums in the city of Pune from satellite imagery (Google Maps). My training dataset contains about 100 images of slums and 100 images of other areas. But even though the training accuracy is high, my model fails to classify input images correctly. I suspect this may be due to the image dimensions. I resized all images to 128 × 128 pixels, and the kernel size is 3 × 3.

Link to the map: https://www.google.co.in/maps/@18.5129661,73.822531,286m/data=!3m1!1e3?hl=en

Here is the code:

import os,cv2
import glob
import numpy as np
from keras.utils.np_utils import to_categorical
from sklearn.utils import shuffle
from sklearn.model_selection import train_test_split
from keras.models import Model
from keras.layers import Input, Convolution2D, MaxPooling2D, Flatten, Dense, Dropout


PATH = os.getcwd()
data_path = PATH + '/dataset/*'


files = sorted(glob.glob(data_path))  # sort so the slum/noSlum index split below is deterministic
X = []

for my_file in files:
    image = cv2.imread(my_file)
    image_resize = cv2.resize(image, (128, 128))  # match the 128 x 128 size stated above (was 256 x 256)
    X.append(image_resize)

image_data = np.array(X)
image_data = image_data.astype('float32')
image_data /= 255
print("Image_data shape ", image_data.shape)


no_of_classes = 2
no_of_samples = image_data.shape[0]
label = np.ones(no_of_samples, dtype='int64')

label[:87] = 0    # Slum (indices 0-86); the original slice label[0:86] stopped at 85, mislabeling one slum image
label[87:] = 1    # noSlum

Y = to_categorical(label, no_of_classes)


#shuffle dataset

x,y = shuffle(image_data , Y, random_state = 2)

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state = 2)

#print(x_train)
#print(y_train)


input_shape = image_data[0].shape

input = Input(input_shape)

conv_1 = Convolution2D(32,(3,3), padding='same', activation='relu')(input)
conv_2 = Convolution2D(32,(3,3), padding = 'same', activation = 'relu')(conv_1)
pool_1 = MaxPooling2D(pool_size = (2,2))(conv_2)
drop_1 = Dropout(0.5)(pool_1)

conv_3 = Convolution2D(64,(3,3), padding='same', activation='relu')(drop_1)
conv_4 = Convolution2D(64,(3,3), padding='same', activation = 'relu')(conv_3)
pool_2 = MaxPooling2D(pool_size = (2,2))(conv_4)
drop_2 = Dropout(0.5)(pool_2)

flat_1 = Flatten()(drop_2)
hidden = Dense(64,activation='relu')(flat_1)
drop_3 = Dropout(0.5)(hidden)
out = Dense(no_of_classes,activation = 'softmax')(drop_3)

model = Model(inputs = input, outputs = out)

model.compile(loss = 'categorical_crossentropy', optimizer = 'rmsprop',  metrics= ['accuracy'])

model.fit(x_train, y_train, batch_size=10, epochs=20, verbose=1, validation_data=(x_test, y_test))  # `nb_epoch` was renamed `epochs` in Keras 2

model.save('model.h5')

score = model.evaluate(x_test,y_test,verbose=1)
print('Test Loss: ',score[0])
print('Test Accuracy: ',score[1])


test_image = x_test[0:1]
print(test_image.shape)

print (model.predict(test_image))
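One common cause of "high accuracy, wrong predictions" is a preprocessing mismatch: predicting on a new image that was not scaled and batched the same way as the training data. A minimal NumPy-only sketch (using a random array as a stand-in for the `cv2.imread` output) of the shape and scaling the model above expects:

```python
import numpy as np

# stand-in for an image loaded with cv2.imread (BGR, uint8)
img = np.random.randint(0, 256, (128, 128, 3), dtype=np.uint8)

# apply the SAME preprocessing as training: float32 and divide by 255
x = img.astype('float32') / 255.0

# the model expects a batch dimension: (1, height, width, channels)
x = np.expand_dims(x, axis=0)
print(x.shape)   # (1, 128, 128, 3)
```

Anything passed to `model.predict` should go through exactly this pipeline first.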

1 Answer:

Answer 0 (score: 2):

Generally, the behavior you describe looks like the NN is unable to recognize small objects in the input image. Imagine being handed a coarse, noisy 128 × 128 image in which you can see nothing yourself — would you expect the NN to classify the objects correctly?

What to do? 1) Try manually resizing some of your input images to 128 × 128 and look at the data you are actually training the NN on. This will give you more insight -> maybe you need a larger image size.
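To get a feel for how much detail survives downscaling without pulling in OpenCV, here is a rough nearest-neighbour sketch using plain NumPy striding (a crude stand-in for `cv2.resize`, which interpolates rather than dropping pixels):

```python
import numpy as np

# stand-in for a 256 x 256 BGR satellite tile
tile = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)

# naive 2x nearest-neighbour downsample: keep every second pixel in each direction
small = tile[::2, ::2, :]
print(small.shape)   # (128, 128, 3)
```

For real inspection, save the actually-resized training images with `cv2.imwrite` and check whether a slum is still visually distinguishable at that size.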

2) Add more Conv layers with more filters. By adding more non-linearity to the output function you can detect small and more complex objects. Google great network architectures such as ResNet.

3) Add more training data; 100 images are not enough to get proper results.

4) Also add data augmentation techniques (rotation seems a strong choice in your case).
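Satellite tiles have no canonical "up" direction, so rotations and flips are label-preserving and effectively multiply the dataset. A minimal NumPy-only sketch of such augmentation (Keras ships the same idea built in as `ImageDataGenerator`; this just shows the principle):

```python
import numpy as np

def augment_batch(batch, rng=None):
    """Rotate each image by a random multiple of 90 degrees and
    flip it horizontally with probability 0.5."""
    rng = rng or np.random.default_rng(0)
    out = []
    for img in batch:
        img = np.rot90(img, k=int(rng.integers(4)))  # 0, 90, 180 or 270 degrees
        if rng.random() < 0.5:
            img = img[:, ::-1]                       # horizontal flip
        out.append(img)
    return np.stack(out)

batch = np.random.rand(4, 128, 128, 3).astype('float32')
aug = augment_batch(batch)
print(aug.shape)   # (4, 128, 128, 3)
```

Applying this per epoch means the network rarely sees the exact same pixels twice, which helps a lot with only ~200 images.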

Don't give up :) You will solve it eventually. Good luck!