Writing a basic XOR neural network program

Date: 2017-10-29 08:37:35

Tags: python-3.x machine-learning tensorflow neural-network xor

I'm trying to write a neural network from scratch that learns the XOR function. The full code is here (in Python 3).

I currently get this error:

ValueError: No gradients provided for any variable, check your graph for ops that do not support gradients

I'm new to TensorFlow and I don't understand why this is happening. Could anyone help me fix my code? Thanks in advance.

P.S. If the question needs more detail, please let me know before downvoting. Thanks again!

Edit: the relevant part of the code:

def initialize_parameters():
    # Create Weights and Biases for Hidden Layer and Output Layer
    W1 = tf.get_variable("W1", [2, 2], initializer = tf.contrib.layers.xavier_initializer())
    b1 = tf.get_variable("b1", [2, 1], initializer = tf.zeros_initializer())
    W2 = tf.get_variable("W2", [1, 2], initializer = tf.contrib.layers.xavier_initializer())
    b2 = tf.get_variable("b2", [1, 1], initializer = tf.zeros_initializer())
    parameters = {
            "W1" : W1,
            "b1" : b1,
            "W2" : W2,
            "b2" : b2
    }
    return parameters

def forward_propogation(X, parameters):

    threshold = tf.constant(0.5, name = "threshold")
    W1, b1 = parameters["W1"], parameters["b1"]
    W2, b2 = parameters["W2"], parameters["b2"]

    Z1 = tf.add(tf.matmul(W1, X), b1)
    A1 = tf.nn.relu(Z1)
    tf.squeeze(A1)
    Z2 = tf.add(tf.matmul(W2, A1), b2)
    A2 = tf.round(tf.sigmoid(Z2))
    print(A2.shape)
    tf.squeeze(A2)
    A2 = tf.reshape(A2, [1, 1])
    print(A2.shape)
    return A2

def compute_cost(A, Y):

    logits = tf.transpose(A)
    labels = tf.transpose(Y)
    cost = tf.nn.sigmoid_cross_entropy_with_logits(logits = logits, labels = labels)
    return cost

def model(X_train, Y_train, X_test, Y_test, learning_rate = 0.0001, num_epochs = 1500):

    ops.reset_default_graph()
    (n_x, m) = X_train.shape
    n_y = Y_train.shape[0]
    costs = []
    X, Y = create_placeholders(n_x, n_y)
    parameters = initialize_parameters()
    A2 = forward_propogation(X, parameters)
    cost = compute_cost(A2, Y)
    optimizer = tf.train.AdamOptimizer(learning_rate = learning_rate).minimize(cost)
    init = tf.global_variables_initializer()

    with tf.Session() as session:
        session.run(init)
        for epoch in range(num_epochs):
            epoch_cost = 0
            _, epoch_cost = session.run([optimizer, cost], feed_dict = {X : X_train, Y : Y_train})
        parameters = session.run(parameters)
        correct_prediction = tf.equal(tf.argmax(A2), tf.argmax(Y))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
        print("Training Accuracy is {0} %...".format(accuracy.eval({X : X_train, Y : Y_train})))
        print("Test Accuracy is {0} %...".format(accuracy.eval({X : X_test, Y : Y_test})))
    return parameters

1 Answer:

Answer 0 (score: 0):

The error is caused by your use of tf.round when you define A2 (this is a known issue: tf.round is not differentiable, so no gradient can flow back through it to the variables, which is exactly what the "No gradients provided for any variable" error is complaining about).
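A quick way to see this (my own illustration, assuming TF 1.x): asking for the gradient through tf.round directly returns None.

import tensorflow as tf

x = tf.Variable(0.7)
y = tf.round(x)
print(tf.gradients(y, [x]))   # prints [None]: tf.round is registered as not differentiable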

For this particular task, the solution is simply not to use tf.round at all. Note that the output of tf.sigmoid is a value between 0 and 1, which can be interpreted as the probability of the result being 1. The cross-entropy loss function measures the distance to the target (0 or 1) and computes the needed weight updates from that distance. Calling tf.round before the cross-entropy squashes the probability to exactly 0 or 1, which makes the cross-entropy meaningless.
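A minimal sketch of the corrected code (my reconstruction, not the asker's exact fix): forward propagation returns the raw logits Z2 and drops the no-op tf.squeeze calls, the loss helper applies the sigmoid internally, and rounding is kept only for evaluation, where no gradients are needed.

def forward_propogation(X, parameters):
    W1, b1 = parameters["W1"], parameters["b1"]
    W2, b2 = parameters["W2"], parameters["b2"]
    Z1 = tf.add(tf.matmul(W1, X), b1)    # hidden pre-activation, shape [2, m]
    A1 = tf.nn.relu(Z1)                  # hidden activation
    Z2 = tf.add(tf.matmul(W2, A1), b2)   # output logits, shape [1, m]
    return Z2                            # no sigmoid, no round: stays differentiable

def compute_cost(Z2, Y):
    # sigmoid_cross_entropy_with_logits applies the sigmoid itself,
    # so it must receive the raw logits, never sigmoid(...) or round(...)
    cost = tf.reduce_mean(
        tf.nn.sigmoid_cross_entropy_with_logits(logits = Z2, labels = Y))
    return cost

# Rounding is fine at evaluation time, outside the loss:
# prediction = tf.round(tf.sigmoid(Z2))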

By the way, tf.losses.softmax_cross_entropy should work better, since you already apply the sigmoid yourself in the second layer (as written, compute_cost feeds the already-sigmoided and rounded A2 into sigmoid_cross_entropy_with_logits, which then applies the sigmoid a second time).
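For completeness, a sketch of wiring in a loss helper (my assumption of the intent; note that tf.losses.softmax_cross_entropy expects one-hot multi-class labels, so for a single sigmoid output the binary counterpart tf.losses.sigmoid_cross_entropy is the closer match):

# Like sigmoid_cross_entropy_with_logits, this helper expects raw logits
# and returns a scalar loss, so forward propagation should return Z2.
cost = tf.losses.sigmoid_cross_entropy(multi_class_labels = Y, logits = Z2)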