Optimizing a variable with a dynamic shape

Time: 2016-11-29 10:08:53

Tags: tensorflow

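I want to use a variable whose shape is not known in advance and which will change from time to time (although ndim is known and fixed).

I declare it like this: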

initializer = tf.random_uniform_initializer()
shape = (s0, s1, s2)  # these are symbolic vars
foo_var = tf.Variable(initializer(shape=shape), name="foo", validate_shape=False)

This seems to work when I create the computation graph up to the point where I want to optimize w.r.t. this variable, i.e.:

optimizer = tf.train.AdamOptimizer(learning_rate=0.1, epsilon=1e-4)
optim = optimizer.minimize(loss, var_list=[foo_var])

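The optimizer fails in some function create_zeros_slot, which seems to depend on static shape information (it uses primary.get_shape().as_list()). (I reported this upstream here.)

So, does using the optimizer only work for variables with a static shape?

I.e., do I need to rebuild the computation graph every time the shape of the variable changes? Or is there some way to avoid the recreation?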

2 answers:

Answer 0: (score: 0)

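What you are doing does not make sense. If the shape of a dynamic variable changes, how would you optimize it? Sometimes a value will be there and sometimes it won't. What shape is the variable in when you go to save the graph? The Adam optimizer also needs to store information about each parameter in the variable between executions, and it cannot do that without knowing the size.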
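To make the point about per-parameter state concrete, here is a minimal plain-Python sketch of a single Adam update. It is not TensorFlow's implementation; the name adam_step and the flat-list representation are assumptions made for illustration. The moment estimates m and v hold one entry per parameter and persist between steps, which is why the optimizer needs to know the variable's size when it allocates them.

```python
import math


def adam_step(params, grads, m, v, t, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-4):
    """Apply one Adam update to a flat list of parameters (illustrative only).

    `m` and `v` are the first- and second-moment estimates. They carry over
    between steps and must have exactly one entry per parameter.
    """
    if not (len(params) == len(grads) == len(m) == len(v)):
        raise ValueError("optimizer state does not match the parameter size")
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        m_i = beta1 * m_i + (1.0 - beta1) * g        # running mean of gradients
        v_i = beta2 * v_i + (1.0 - beta2) * g * g    # running mean of squared gradients
        m_hat = m_i / (1.0 - beta1 ** t)             # bias correction for step t
        v_hat = v_i / (1.0 - beta2 ** t)
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v


params = [1.0, -2.0, 3.0]
m = [0.0] * len(params)              # state allocated from the known size
v = [0.0] * len(params)
grads = [2.0 * p for p in params]    # gradient of sum(p**2)
params, m, v = adam_step(params, grads, m, v, t=1)
```

If the number of parameters changed between two steps, the stored m and v would no longer line up with the parameters, and the sketch raises a ValueError in that case.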

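Now, if what you want to do is use only part of the variable at a time, you can take slices of it and work with those. This works fine as long as the variable has a fixed shape equal to the maximum bounds of the slices. For example...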

initializer = tf.random_uniform_initializer()
shape = (S0, S1, S2)  # these are now constants for the max bounds of si
foo_var = tf.Variable(initializer(shape=shape), name="foo")

foo_var_sub = foo_var[:s0, :s1, :s2]

In this case, the optimizer will only act on the parts of foo_var that contribute to the slice.
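The reason only the sliced part is affected is that the gradient with respect to every entry outside the slice is zero. The following plain-Python sketch illustrates this for a 2-D case with a sum-of-squares loss. It is not TensorFlow code; the function sliced_loss_grad and the loss are assumptions chosen for illustration.

```python
def sliced_loss_grad(full, s0, s1):
    """Gradient of the sum of x**2 over the block full[:s0, :s1], w.r.t. all of `full`.

    Entries outside the slice do not enter the loss, so their gradient is 0.
    """
    grad = [[0.0 for _ in row] for row in full]
    for i in range(min(s0, len(full))):
        for j in range(min(s1, len(full[i]))):
            grad[i][j] = 2.0 * full[i][j]
    return grad


S0, S1 = 3, 4                               # fixed maximum bounds
foo = [[1.0] * S1 for _ in range(S0)]       # full-size "variable"
grad = sliced_loss_grad(foo, s0=2, s1=3)    # current dynamic extent
```

The gradient always has the full fixed shape (S0, S1), so optimizer state of that same fixed shape can be allocated once, regardless of the current values of s0 and s1.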

Answer 1: (score: 0)

My current solution is a bit ugly, but it works.

import tensorflow as tf  # needed for tf.shape() in _tf_create_zeros_slot


def _tf_create_slot_var(primary, val, scope):
  """Helper function for creating a slot variable."""

  from tensorflow.python.ops import variables
  slot = variables.Variable(val, name=scope, trainable=False, validate_shape=primary.get_shape().is_fully_defined())
  # pylint: disable=protected-access
  if isinstance(primary, variables.Variable) and primary._save_slice_info:
    # Primary is a partitioned variable, so we need to also indicate that
    # the slot is a partitioned variable.  Slots have the same partitioning
    # as their primaries.
    real_slot_name = scope[len(primary.op.name + "/"):-1]
    slice_info = primary._save_slice_info
    slot._set_save_slice_info(variables.Variable.SaveSliceInfo(
        slice_info.full_name + "/" + real_slot_name,
        slice_info.full_shape[:],
        slice_info.var_offset[:],
        slice_info.var_shape[:]))
  # pylint: enable=protected-access
  return slot


def _tf_create_zeros_slot(primary, name, dtype=None, colocate_with_primary=True):
  """Create a slot initialized to 0 with same shape as the primary object.

  Args:
    primary: The primary `Variable` or `Tensor`.
    name: Name to use for the slot variable.
    dtype: Type of the slot variable.  Defaults to the type of `primary`.
    colocate_with_primary: Boolean.  If True the slot is located
      on the same device as `primary`.

  Returns:
    A `Variable` object.
  """
  if dtype is None:
    dtype = primary.dtype
  from tensorflow.python.ops import array_ops
  val = array_ops.zeros(
      primary.get_shape().as_list() if primary.get_shape().is_fully_defined() else tf.shape(primary),
      dtype=dtype)
  from tensorflow.python.training import slot_creator
  return slot_creator.create_slot(primary, val, name, colocate_with_primary=colocate_with_primary)


def monkey_patch_tf_slot_creator():
    """
    The TensorFlow optimizers cannot handle variables with unknown shape.
    We hack this.
    """
    from tensorflow.python.training import slot_creator
    slot_creator._create_slot_var = _tf_create_slot_var
    slot_creator.create_zeros_slot = _tf_create_zeros_slot

Then I need to call monkey_patch_tf_slot_creator() at some point.