TensorFlow MNIST: prediction on an external image does not work

Date: 2016-04-29 11:41:48

Tags: python image-processing classification tensorflow conv-neural-network

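I am new to neural networks. I have gone through the TensorFlow MNIST For ML Beginners tutorial, and I tried to use the basic MNIST tutorial code to make a prediction on an external image.

I updated the MNIST example provided by TensorFlow.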

On top of that I added a few things:
1. Saving the trained model locally.
2. Loading the saved model.
3. Preprocessing the image to 28 x 28.

I have attached the image for reference.

 1. While training the model, I save it locally so that I can reuse it at any time.
 2. After training, I load the saved model.
 3. I create an external image in GIMP containing a single digit in the range 0-9.
 4. I use OpenCV to convert the image to 28 x 28 and invert the pixel values.
 5. Then I try to predict.
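
Steps 4 and 5 can be sketched in plain NumPy as shown here. This is only a minimal sketch, not the code from my scripts: the helper names `to_mnist_vector` and `center_by_mass` are hypothetical, and it assumes the image has already been resized to 28 x 28. The second helper reflects the MNIST dataset description, which says the digits were centred in the 28 x 28 grid by their centre of mass; whether that matters for this particular problem is an assumption.

```python
import numpy as np

def to_mnist_vector(gray):
    """Convert a 28x28 uint8 image (dark digit on a white background)
    into a (1, 784) float32 row: bright digit, values in [0, 1]."""
    gray = np.asarray(gray)
    if gray.shape != (28, 28):
        raise ValueError("expected a 28x28 image, got %r" % (gray.shape,))
    inverted = 255.0 - gray.astype(np.float32)  # digit becomes bright
    scaled = inverted / 255.0                   # values now in [0, 1]
    return scaled.reshape(1, 784)

def center_by_mass(img):
    """Shift a 28x28 image (bright digit, zero background) so that its
    centre of mass lands on the middle of the grid."""
    img = np.asarray(img, dtype=np.float32)
    total = img.sum()
    if total == 0:
        return img  # blank image, nothing to centre
    rows, cols = np.indices(img.shape)
    cy = (rows * img).sum() / total
    cx = (cols * img).sum() / total
    dy = int(round((img.shape[0] - 1) / 2.0 - cy))
    dx = int(round((img.shape[1] - 1) / 2.0 - cx))
    # np.roll wraps around; that is harmless only while the border is empty.
    return np.roll(np.roll(img, dy, axis=0), dx, axis=1)

# Example: a white image with a single black pixel.
gray = np.full((28, 28), 255, dtype=np.uint8)
gray[3, 4] = 0
vec = to_mnist_vector(gray)
print(vec.shape, vec.min(), vec.max())
```

A vector built this way has the same shape and value range as the rows of `mnist.test.images`, which makes it easy to compare the external image against a real MNIST sample.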

I am able to train the model and save it properly.

I get incorrect predictions.

My code is below.

1. TrainSimple.py

    # Load MNIST Data
    from tensorflow.examples.tutorials.mnist import input_data
    mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

    from random import randint
    from scipy import misc

    # Start TensorFlow InteractiveSession
    import tensorflow as tf
    sess = tf.InteractiveSession()

    # Placeholders
    x = tf.placeholder(tf.float32, shape=[None, 784])
    y_ = tf.placeholder(tf.float32, shape=[None, 10])

    # Variables
    W = tf.Variable(tf.zeros([784,10]))
    b = tf.Variable(tf.zeros([10]))

    sess.run(tf.initialize_all_variables())

    # Predicted Class and Cost Function
    y = tf.nn.softmax(tf.matmul(x,W) + b)
    cross_entropy = -tf.reduce_sum(y_*tf.log(y))

    saver = tf.train.Saver()  # defaults to saving all variables

    # GradientDescentOptimizer
    train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)


    # Train the Model

    for i in range(40000):
        # Save a checkpoint once, on the last iteration
        if (i + 1) == 40000:
            saver.save(sess, "/Users/xxxx/Desktop/TensorFlow/"+"/model.ckpt", global_step=i)
        batch = mnist.train.next_batch(50)
        train_step.run(feed_dict={x: batch[0], y_: batch[1]})

    # Evaluate the Model
    correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))

    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

    print(accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels}))

2. loadImageAndPredict.py

    from random import randint
    from scipy import misc
    import numpy as np
    import cv2
    
    def preProcess(invert_file):
        print("preprocessing the image " + invert_file)
        image=cv2.imread(invert_file,0)
        ret,image_thresh = cv2.threshold(image,127,255,cv2.THRESH_BINARY)
        l,b=image.shape
    
        fr=0
        lr=0
        fc=0
        lc=0
    
        i=0
        while len(set(image_thresh[i,]))==1:
            i+=1
        fr=i
    
        i=0
        while len(set(image_thresh[-1+i,]))==1:
            i-=1
        lr=i+l
    
        j=0
        while len(set(image_thresh[0:,j]))==1:
            j+=1
        fc=j
    
        j=0
        while len(set(image_thresh[0:,-1+j]))==1:
            j-=1
        lc=j+b
    
        image_crop=image_thresh[fr:lr,fc:lc]
        image_padded= cv2.copyMakeBorder(image_crop,5,5,5,5,cv2.BORDER_CONSTANT,value=255)
        image_resized = cv2.resize(image_padded, (28, 28))
        image_resized = (255-image_resized)
        cv2.imwrite(invert_file, image_resized)
    
    import tensorflow as tf
    sess = tf.InteractiveSession()
    
    # Placeholders
    x = tf.placeholder(tf.float32, shape=[None, 784])
    y_ = tf.placeholder(tf.float32, shape=[None, 10])
    
    # # Variables
    W = tf.Variable(tf.zeros([784,10]))
    b = tf.Variable(tf.zeros([10]))
    
    
    
    # Predicted Class and Cost Function
    y = tf.nn.softmax(tf.matmul(x,W) + b)
    cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
    
    saver = tf.train.Saver()  # defaults to saving all variables - in this case w and b
    
    # Train the Model
    # GradientDescentOptimizer
    train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
    flag_1 = 0
    
    # create an an array where we can store 1 picture
    images = np.zeros((1,784))
    # and the correct values
    correct_vals = np.zeros((1,10))
    
    
    preProcess("4_white.png")
    
    gray = cv2.imread("4_white.png", 0)
    
    flatten = gray.flatten() / 255.0
    """
    we need to store the flatten image and generate
    the correct_vals array
    correct_val for a digit (9) would be
    [0,0,0,0,0,0,0,0,0,1]
    """
    images[0] = flatten
    # print images[0]
    print(len(images[0]))
    
    sess.run(tf.initialize_all_variables())
    
    ckpt = tf.train.get_checkpoint_state("/Users/xxxx/Desktop/TensorFlow")
    if ckpt and ckpt.model_checkpoint_path:
        saver.restore(sess, ckpt.model_checkpoint_path)
        my_classification = sess.run(tf.argmax(y, 1), feed_dict={x: [images[0]]})
        print("Neural Network predicted %d for your digit" % my_classification[0])
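
The prediction step in the script computes `softmax(xW + b)` and takes the argmax. The NumPy sketch here mirrors that computation outside TensorFlow; the helper names `softmax` and `predict` are hypothetical. It also shows one thing that may be worth ruling out: if `W` and `b` are still all zeros (for example because the restore did not actually load the trained values), every class gets probability 0.1 and the argmax is always 0.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def predict(x, W, b):
    """Return (predicted labels, class probabilities) for a batch of rows x."""
    probs = softmax(x.dot(W) + b)
    return probs.argmax(axis=1), probs

# Untrained parameters, as created by tf.zeros in the script.
x = np.random.RandomState(0).rand(1, 784)
W = np.zeros((784, 10))
b = np.zeros(10)

labels, probs = predict(x, W, b)
print(labels[0])  # 0, regardless of the input image
print(probs[0])   # ten equal probabilities of 0.1
```

In the TensorFlow script, fetching `y` itself with `sess.run(y, feed_dict={x: [images[0]]})` instead of only `tf.argmax(y, 1)` would show the same kind of probability vector for the real model.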
    
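I am not sure what I did wrong.

Thinking that the simple model might not be good enough, I also used this convolutional code for prediction:

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/models/image/mnist/convolutional.py

Even then, the prediction is not correct :(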

1 Answer:

Answer 0 (score: 0)

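Some things to check:

  1. Is your training loss going down?
  2. Is the accuracy on the training dataset high?
  3. Are you getting high accuracy on a validation dataset (a held-out part of the training set)?
  4. Is your accuracy on the target dataset high?

If the training loss is not going down (1.), the model is not learning and you need to try different hyperparameters, such as the learning rate.

If (2.) is high and (3.) is low, you are overfitting and need to train for less time or use a stronger regularization penalty. If (3.) is high and (4.) is low, your training set is not representative of the data you actually want to classify. You need to make the training set more representative, or at least harder, for example by adding distortions.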
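
As an illustration of "adding distortions", here is a minimal NumPy sketch that shifts a 28 x 28 digit by a few pixels in a random direction. The function name `random_shift` and the default of 2 pixels are assumptions made for the example; rotations, scaling, or noise would be other options.

```python
import numpy as np

def random_shift(img, max_shift=2, rng=None):
    """Return a copy of a 2-D image shifted by up to max_shift pixels
    in each direction, filling the uncovered area with zeros."""
    rng = np.random.RandomState() if rng is None else rng
    dy, dx = rng.randint(-max_shift, max_shift + 1, size=2)
    h, w = img.shape
    # Pad with background, then cut out a window offset by (dy, dx).
    padded = np.pad(img, max_shift, mode="constant", constant_values=0)
    top = max_shift - dy
    left = max_shift - dx
    return padded[top:top + h, left:left + w]

# Example: a bright square in the middle of a dark 28x28 image.
rng = np.random.RandomState(42)
img = np.zeros((28, 28), dtype=np.float32)
img[10:18, 10:18] = 1.0
shifted = random_shift(img, max_shift=2, rng=rng)
print(shifted.shape)
```

Applying such a function to each training image before flattening it to 784 values would give the model slightly different versions of every digit, which is one way to make the training set harder.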