Question

Solution below

If you are only interested in the solution, you can skip to the answer below.

Original question

I am using TensorFlow for reinforcement learning. A group of agents uses the model in parallel, and a central entity trains it on the collected data.

I found here: Is it thread-safe when using tf.Session in inference service?, that TensorFlow sessions are thread-safe. So I simply let prediction and updating run in parallel.

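But now I would like to change the setup. Instead of updating and training a single model, I now need to keep two models: one for prediction and a second one for training. After some training steps, the weights of the second are copied to the first. Below is a minimal example in Keras. For multiprocessing it is recommended to finalize the graph, but then I cannot copy the weights: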

# the usual imports
import numpy as np
import tensorflow as tf

from keras.models import *
from keras.layers import *

# set up the first model
i = Input(shape=(10,))
b = Dense(1)(i)
prediction_model = Model(inputs=i, outputs=b)

# set up the second model
i2 = Input(shape=(10,))
b2 = Dense(1)(i2)
training_model = Model(inputs=i2, outputs=b2)

# look at this code, to check if the weights are the same
# here the output is different
prediction_model.predict(np.ones((1, 10)))
training_model.predict(np.ones((1, 10)))

# now to use them in multiprocessing, the following is necessary
prediction_model._make_predict_function()
training_model._make_predict_function()
sess = tf.Session()
sess.run(tf.global_variables_initializer())
default_graph = tf.get_default_graph()

# the following line is the critical part
# if this is uncommented, the two options below both fail
# default_graph.finalize()

# option 1, use keras methods to update the weights
prediction_model.set_weights(training_model.get_weights())

# option 2, use tensorflow to update the weights
update_ops = [tf.assign(to_var, from_var) for to_var, from_var in
              zip(prediction_model.trainable_weights, training_model.trainable_weights)]
sess.run(update_ops)

# now the predictions are the same
prediction_model.predict(np.ones((1, 10)))
training_model.predict(np.ones((1, 10)))

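According to the question linked above, it is recommended to finalize the graph. If it is not finalized, there may be memory leaks (!?), so this seems to be a strong recommendation.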

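But if I finalize it, I can no longer update the weights. What puzzles me is this: the network can still be trained, so changing the weights is allowed. An assignment looks as if the weights are simply overwritten, so why is it different from applying an optimizer step?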

Answer 1

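In short, my problem was assigning values to the weights of a finalized graph. If the assignment is made after finalization, TensorFlow complains that the graph can no longer be modified.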

I was confused about why this is forbidden. After all, changing the weights through backpropagation is allowed.

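But the problem has nothing to do with changing the weights. Keras set_weights() is confusing because it looks as if the weights are simply overwritten (as in backprop). In fact, behind the scenes, assignment operations are added to the graph and then executed. These new operations are a change to the graph, and that change is what is forbidden. A training step, by contrast, only runs an operation that was already in the graph before it was finalized.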
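To make the distinction concrete, here is a minimal sketch that uses plain variables instead of Keras models. It assumes the TF 1.x graph API, reached through tf.compat.v1 so that it also runs on TensorFlow 2.x; the variable names source and target are made up for illustration. It shows that tf.assign adds nodes to the graph, that a second tf.assign after finalize() fails, and that the assignment built beforehand can still be run.

```python
import tensorflow as tf

tf1 = tf.compat.v1  # TF 1.x-style graph API (assumption: TF >= 1.13 or TF 2.x)

graph = tf.Graph()
with graph.as_default():
    # Two stand-in "weights" (hypothetical names, not from the original code).
    source = tf1.get_variable("source", shape=(3,), initializer=tf1.ones_initializer())
    target = tf1.get_variable("target", shape=(3,), initializer=tf1.zeros_initializer())

    ops_before = len(graph.get_operations())
    copy_op = tf1.assign(target, source)  # building the assignment adds graph nodes
    ops_after = len(graph.get_operations())

    init_op = tf1.global_variables_initializer()
    graph.finalize()  # from here on the graph structure is read-only

    try:
        tf1.assign(target, source)  # tries to add new nodes, so it is rejected
        error_message = None
    except RuntimeError as exc:
        error_message = str(exc)

with tf1.Session(graph=graph) as sess:
    sess.run(init_op)
    sess.run(copy_op)  # running an op that already exists is still allowed
    copied = sess.run(target)

print(ops_before, ops_after)
print(error_message)
print(copied)
```

The variable values change when copy_op runs, but the graph itself does not, which is why this is allowed after finalization.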

So the solution is to set up the assignment operations before finalizing the graph. You have to reorder the code:

# the usual imports
import numpy as np
import tensorflow as tf

from keras.models import *
from keras.layers import *

# set up the first model
i = Input(shape=(10,))
b = Dense(1)(i)
prediction_model = Model(inputs=i, outputs=b)

# set up the second model
i2 = Input(shape=(10,))
b2 = Dense(1)(i2)
training_model = Model(inputs=i2, outputs=b2)

# set up operations to move weights from training to prediction
update_ops = [tf.assign(to_var, from_var) for to_var, from_var in
              zip(prediction_model.trainable_weights, training_model.trainable_weights)]

# now to use them in multiprocessing, the following is necessary
prediction_model._make_predict_function()
training_model._make_predict_function()
sess = tf.Session()
sess.run(tf.global_variables_initializer())
default_graph = tf.get_default_graph()

default_graph.finalize()

# this can be executed now
sess.run(update_ops)

# now the predictions are the same
prediction_model.predict(np.ones((1, 10)))
training_model.predict(np.ones((1, 10)))
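The same ordering carries over to the setup described in the question, where weights are copied every few training steps. The following is only a sketch under assumptions: it replaces the Keras models with a single weight matrix each, uses tf.compat.v1 so it also runs on TensorFlow 2.x, and the names SYNC_EVERY, w_train, w_pred and the toy data are invented for illustration. The point is that the training op and the copy op are both created before finalize(), and the loop afterwards only runs them.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1  # TF 1.x-style graph API (assumption: TF >= 1.13 or TF 2.x)

SYNC_EVERY = 5  # hypothetical: copy weights every 5 training steps

graph = tf.Graph()
with graph.as_default():
    x = tf1.placeholder(tf.float32, shape=(None, 10))
    y = tf1.placeholder(tf.float32, shape=(None, 1))

    # One weight matrix per "model" (stand-ins for the two Keras models).
    with tf1.variable_scope("training"):
        w_train = tf1.get_variable("w", shape=(10, 1), initializer=tf1.zeros_initializer())
    with tf1.variable_scope("prediction"):
        w_pred = tf1.get_variable("w", shape=(10, 1), initializer=tf1.zeros_initializer())

    loss = tf.reduce_mean(tf.square(tf.matmul(x, w_train) - y))
    train_op = tf1.train.GradientDescentOptimizer(0.01).minimize(loss, var_list=[w_train])
    prediction = tf.matmul(x, w_pred)

    # Everything that will ever be run is built before finalizing.
    sync_op = tf1.assign(w_pred, w_train)
    init_op = tf1.global_variables_initializer()
    graph.finalize()

# Toy regression data: the target is the sum of the inputs.
rng = np.random.RandomState(0)
inputs = rng.rand(32, 10).astype(np.float32)
targets = inputs.sum(axis=1, keepdims=True)

with tf1.Session(graph=graph) as sess:
    sess.run(init_op)
    for step in range(1, 21):
        sess.run(train_op, feed_dict={x: inputs, y: targets})
        if step % SYNC_EVERY == 0:
            sess.run(sync_op)  # prediction weights catch up with training weights
    synced = np.allclose(sess.run(w_pred), sess.run(w_train))
    moved = not np.allclose(sess.run(w_train), 0.0)
    preds = sess.run(prediction, feed_dict={x: inputs})
```

In a multi-threaded setup the agents would call sess.run(prediction, ...) from their own threads while the central trainer runs train_op and sync_op; that part is left out here to keep the sketch short.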
