Why does this GradientDescentOptimizer get stuck?

Asked: 2019-07-15 01:41:38

Tags: tensorflow gradient-descent

I am trying to write an example that uses GradientDescentOptimizer, but the optimization gets stuck very quickly. All of my data is generated from the formula y = (2 * x_1) + (8 * x_2), so since there are no local minima, shouldn't gradient descent find the optimal solution easily?

import numpy as np 
import os
import random
import tensorflow as tf 

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
tf.logging.set_verbosity(tf.logging.ERROR)

np.random.seed(101) 
tf.set_random_seed(101) 

n_values = 100
learning_rate = 0.001
training_epochs = 1000

x_vals = np.random.random_sample((n_values, 2))
y_vals = [(2 * x_vals[i][0] + 8 * x_vals[i][1]) for i in range(n_values)]
y_vals = np.reshape(y_vals, (-1, 1))

n_dims = x_vals.shape[1]

X = tf.placeholder(tf.float32, [None, 2]) 
Y = tf.placeholder(tf.float32, [None, 1]) 

W = tf.Variable(tf.ones([1, n_dims])) 

y_pred = tf.reduce_sum(tf.multiply(X, W), axis=(-1, 1))  
cost = tf.reduce_sum(tf.pow(y_pred - Y, 2)) / (2 * tf.cast(n_values, tf.float32))  
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost) 

init = tf.global_variables_initializer()

with tf.Session() as sess: 

    sess.run(init) 

    for epoch in range(training_epochs):           
        sess.run(optimizer, feed_dict = {X : x_vals, Y : y_vals})           
        if (epoch) % 50 == 0: 
            c = sess.run(cost, feed_dict = {X : x_vals, Y : y_vals}) 
            print("Epoch", (epoch + 1), ": cost =", c, "W =", sess.run(W))     

Here are the results:

Epoch 1 : cost = 1048.1746 W = [[1.2004547 1.21069  ]] 
Epoch 51 : cost = 429.50342 W = [[4.111497 4.421291]]
Epoch 101 : cost = 428.04016 W = [[4.170494  4.6341734]]
Epoch 151 : cost = 427.94107 W = [[4.1271544 4.6886673]] 
Epoch 201 : cost = 427.90067 W = [[4.0954566 4.720226 ]] 
Epoch 251 : cost = 427.88373 W = [[4.0747733 4.740489 ]] 
Epoch 301 : cost = 427.87656 W = [[4.0613766 4.7535996]]
Epoch 351 : cost = 427.8736 W = [[4.0527034 4.762087 ]]
Epoch 401 : cost = 427.8724 W = [[4.0470877 4.767582 ]] 
Epoch 451 : cost = 427.87186 W = [[4.043453  4.7711387]] 
Epoch 501 : cost = 427.87167 W = [[4.0411    4.7734404]] 
Epoch 551 : cost = 427.87155 W = [[4.039577  4.7749314]] 
Epoch 601 : cost = 427.87146 W = [[4.0385904 4.775896 ]] 
Epoch 651 : cost = 427.87152 W = [[4.0379524 4.7765207]]
Epoch 701 : cost = 427.87146 W = [[4.0375395 4.776925 ]]
Epoch 751 : cost = 427.87143 W = [[4.0372725 4.7771864]] 
Epoch 801 : cost = 427.87146 W = [[4.0370994 4.7773557]] 
Epoch 851 : cost = 427.8714 W = [[4.0369873 4.777465 ]]
Epoch 901 : cost = 427.87146 W = [[4.036914  4.7775364]] 
Epoch 951 : cost = 427.87146 W = [[4.036866 4.777584]] 

The W values are still changing by tiny amounts, but if I increase the number of epochs they eventually stop changing altogether. I can change the learning rate, but every value I try gets stuck sooner or later.

Why can't GradientDescentOptimizer find the solution for a perfect dataset with no randomness? Is there a problem with my code?

1 Answer:

Answer 0 (score: 0):

The dimensions of y_pred and Y should match in the code below, but y_pred is one-dimensional (shape (100,)) while Y is two-dimensional (shape (100, 1)). When you subtract them, broadcasting expands y_pred - Y to shape (100, 100), so your cost sums 10,000 pairwise errors instead of the 100 per-sample errors you intended, and its minimum is not at W = [2, 8].

y_pred = tf.reduce_sum(tf.multiply(X, W), axis=(-1, 1))
cost = tf.reduce_sum(tf.pow(y_pred - Y, 2)) / (2 * tf.cast(n_values, tf.float32))
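
To see the mismatch concretely, here is a minimal NumPy sketch (NumPy and TensorFlow share the same broadcasting rules); the shapes mirror the ones in your graph:

import numpy as np

y_pred = np.zeros(100)       # 1-D, like your reduce_sum output: shape (100,)
Y = np.zeros((100, 1))       # 2-D, like your placeholder: shape (100, 1)
print((y_pred - Y).shape)    # (100, 100): every prediction is paired with every label

So the cost you are minimizing is not the per-sample squared error, which is why gradient descent converges to a W far from [2, 8].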

You can try the line below instead; it will produce the expected output.

y_pred = tf.reshape(tf.reduce_sum(tf.multiply(X, W), axis=1), shape=(-1, 1))
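
As an end-to-end check, here is a self-contained sketch of the corrected graph (assuming TensorFlow 1.x; I raised the learning rate to 0.05 and the step count to 2000, which are my own choices to make convergence visible quickly, and your original 0.001 also converges, just much more slowly):

import numpy as np
import tensorflow as tf

np.random.seed(101)
tf.set_random_seed(101)

n_values = 100
x_vals = np.random.random_sample((n_values, 2))
y_vals = (2 * x_vals[:, 0] + 8 * x_vals[:, 1]).reshape(-1, 1)

X = tf.placeholder(tf.float32, [None, 2])
Y = tf.placeholder(tf.float32, [None, 1])
W = tf.Variable(tf.ones([1, 2]))

# Reshape y_pred to (None, 1) so it matches Y and no broadcasting occurs.
y_pred = tf.reshape(tf.reduce_sum(tf.multiply(X, W), axis=1), shape=(-1, 1))
cost = tf.reduce_sum(tf.pow(y_pred - Y, 2)) / (2 * tf.cast(n_values, tf.float32))
optimizer = tf.train.GradientDescentOptimizer(0.05).minimize(cost)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(2000):
        sess.run(optimizer, feed_dict={X: x_vals, Y: y_vals})
    print(sess.run(W))  # approaches [[2. 8.]]

Alternatively, y_pred = tf.matmul(X, W, transpose_b=True) produces a (None, 1) tensor directly and avoids the reshape altogether.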