I'm trying to use a ValidationMonitor in DNNRegressor.fit with every_n_steps set to 1, but it is triggered only once, at global_step = 1. How can I fix this?
validation_monitor = tf.contrib.learn.monitors.ValidationMonitor(
    test_dataset.data,
    test_dataset.target,
    every_n_steps=1)
regressor = tf.contrib.learn.DNNRegressor(feature_columns=feature_columns,
                                          hidden_units=[2],
                                          model_dir="/home/maciej/tf-logs")
regressor.fit(x=train_dataset.data,
              y=train_dataset.target,
              steps=1000,
              monitors=[validation_monitor])
What I get on stdout is the loss logs and a single validation log:
INFO:tensorflow:Saving dict for global step 201: global_step = 201, loss = 5.50003
INFO:tensorflow:Validation (step 201): loss = 5.50003, global_step = 201
INFO:tensorflow:global_step/sec: 96.1045
INFO:tensorflow:loss = 6.3935, step = 301
INFO:tensorflow:global_step/sec: 154.978
INFO:tensorflow:loss = 4.77587, step = 401
INFO:tensorflow:global_step/sec: 158.06
INFO:tensorflow:loss = 3.72956, step = 501
INFO:tensorflow:global_step/sec: 151.51
INFO:tensorflow:loss = 3.04578, step = 601
TensorBoard confirms this as well.
The full code and logs are available here: https://gist.github.com/maciejjaskowski/9f791e517f379c41d20cc72619909fe6
Answer 0 (score: 0)
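As described in the Monitoring tutorial, ValidationMonitors rely on saved checkpoints, so validation runs only when a new checkpoint is available to evaluate.
Consider saving a checkpoint every x seconds by passing a RunConfig to the estimator: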
config=tf.contrib.learn.RunConfig(save_checkpoints_secs=5)
This worked for me.
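To illustrate why a single validation run is what you would expect here, below is a simplified pure-Python model of the behavior. It is not the TensorFlow implementation: the function name simulate_validation and the step-based checkpoint_steps argument are hypothetical stand-ins (the real RunConfig setting above is time-based, in seconds). The only assumption it encodes is the one from the answer: the monitor evaluates only when it finds a checkpoint it has not evaluated yet.

```python
def simulate_validation(total_steps, every_n_steps, checkpoint_steps):
    """Return the steps at which a checkpoint-driven monitor would evaluate.

    checkpoint_steps is a set of steps at which a checkpoint is written.
    This is a hypothetical stand-in for time-based checkpoint saving.
    """
    latest_checkpoint = None  # most recent checkpoint "on disk"
    last_evaluated = None     # checkpoint the monitor evaluated last
    evaluated_at = []
    for step in range(1, total_steps + 1):
        if step in checkpoint_steps:
            latest_checkpoint = step
        # The monitor only wakes up every `every_n_steps` steps.
        if step % every_n_steps != 0:
            continue
        # Skip if there is no checkpoint yet, or no new one since last time.
        if latest_checkpoint is None or latest_checkpoint == last_evaluated:
            continue
        last_evaluated = latest_checkpoint
        evaluated_at.append(step)
    return evaluated_at


# Only the initial checkpoint exists during the run: one validation.
print(simulate_validation(1000, 1, {1}))
# Checkpoints written more often: validation follows each new checkpoint.
print(simulate_validation(1000, 1, set(range(1, 1001, 250))))
```

In this toy model, lowering every_n_steps alone changes nothing when only one checkpoint is written during the run; the number of validation runs is bounded by the number of checkpoints, which is why saving checkpoints more frequently helps.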