Eager mode is extremely slow (22x slower than graph mode)

Date: 2018-10-26 09:19:58

Tags: tensorflow

I read that TensorFlow 2.0 is going to bring some major changes, a big part of which will be eager execution [1], so I tried out TensorFlow's eager mode.

I took some code from a GitHub repo and tried to run it in eager mode (without using Keras Models/Layers as suggested, however). It turned out to be quite slow, so I tried different modifications and compared them against the model's original (graph-mode) source. The result: graph mode is about 22 times faster than eager mode. I am well aware that graph mode is faster, but by that much?

Is this always the case, or are there special modifications/configurations of the variables needed to reach performance comparable to graph mode?

The source code for both variants can be found at [2].

Thanks!

Eager mode:

# With 
#  with tf.device("/gpu:0"):
#    ...
#
# Runtime is 0.35395
# Runtime is 0.12711
# Runtime is 0.12438
# Runtime is 0.12428
# Runtime is 0.12572
# Runtime is 0.12593
# Runtime is 0.12505
# Runtime is 0.12527
# Runtime is 0.12418
# Runtime is 0.12340

Graph mode:

# Runtime is 0.81241
# Runtime is 0.00573
# Runtime is 0.00573
# Runtime is 0.00570
# Runtime is 0.00555
# Runtime is 0.00564
# Runtime is 0.00545
# Runtime is 0.00540
# Runtime is 0.00591
# Runtime is 0.00574

[1] https://groups.google.com/a/tensorflow.org/forum/#!topic/developers/JHDpgRyFVUs

[2] https://gist.github.com/lhlmgr/f6709e5aba4a5314b5221d58232b09bd

1 Answer:

Answer 0 (score: 2)

Using eager execution may mean undoing some habits developed with TensorFlow graphs, since code snippets that used to run once (e.g., the Python function that constructs the graph to compute the loss) will now run repeatedly (the same Python function will compute the loss on every iteration).
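
To make that concrete, here is a minimal sketch (a hypothetical toy example using the TF 1.x-era API with eager execution enabled, not code from the linked gist): anything created inside the per-step function is re-done on every iteration, so such setup should be hoisted out of it.

import tensorflow as tf
tf.enable_eager_execution()

# Created once, outside the training loop. In graph mode the function
# below would also run only once (to build the graph); under eager
# execution it runs on every call.
w = tf.Variable(3.0)

def compute_loss(x):
  return tf.reduce_mean(tf.square(w * x - 1.0))

for _ in range(10):
  loss = compute_loss(tf.random_normal([32]))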

I took a cursory look at the linked code and noticed some easy wins that would probably also show up with standard Python profiling tools; you may want to use those (cProfile, py-spy, etc.).
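
For instance (a hedged sketch; train_step here is a stand-in placeholder, not a function from the gist), profiling a single step with cProfile shows which calls dominate each iteration:

import cProfile
import pstats

def train_step():
  # Stand-in for one eager-mode training iteration (hypothetical).
  return sum(i * i for i in range(100000))

cProfile.run('train_step()', 'eager_profile')
pstats.Stats('eager_profile').sort_stats('cumulative').print_stats(10)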

For example, the Keras network is currently implemented as:

class NFModel(tf.keras.Model):
  def __init__(self, *args, **kwargs):
    super().__init__(*args, **kwargs)

  def call(self, *args, **kwargs):
    num_layers = 6
    d, r = 2, 2
    bijectors = []

    for i in range(num_layers):
      with tf.variable_scope('bijector_%d' % i):
        V = tf.get_variable('V', [d, r], dtype=DTYPE)  # factor loading
        shift = tf.get_variable('shift', [d], dtype=DTYPE)  # affine shift
        L = tf.get_variable('L', [d * (d + 1) // 2], dtype=DTYPE)  # lower triangular
        bijectors.append(tfb.Affine(
          scale_tril=tfd.fill_triangular(L),
          scale_perturb_factor=V,
          shift=shift,
        ))

        alpha = tf.get_variable('alpha', [], dtype=DTYPE)
        abs_alpha = tf.abs(alpha) + .01
        bijectors.append(LeakyReLU(alpha=abs_alpha))

    base_dist = tfd.MultivariateNormalDiag(loc=tf.zeros([2], DTYPE))
    mlp_bijector = tfb.Chain(list(reversed(bijectors[:-1])), name='2d_mlp_bijector')
    dist = tfd.TransformedDistribution(distribution=base_dist, bijector=mlp_bijector)

    return {"dist": dist}

Instead, if you create the variables once in __init__ and avoid the tf.get_variable calls on every call of the network, you should see a big improvement.

class NFModel(tf.keras.Model):
  def __init__(self, *args, **kwargs):
    super(NFModel, self).__init__(*args, **kwargs)
    num_layers = 6
    d, r = 2, 2
    self.num_layers = num_layers
    self.V = [tf.get_variable('V', [d, r], dtype=DTYPE)  for _ in range(num_layers)]
    self.shift = [tf.get_variable('shift', [d], dtype=DTYPE)   for _ in range(num_layers)]
    self.L = [tf.get_variable('L', [d * (d + 1) // 2], dtype=DTYPE)  for _ in range(num_layers)]
    self.alpha = [tf.get_variable('alpha', [], dtype=DTYPE) for _ in range(num_layers)]


  def call(self, *args, **kwargs):
    bijectors = []

    for i in range(self.num_layers):
      V = self.V[i]
      shift = self.shift[i]
      L = self.L[i]
      bijectors.append(tfb.Affine(
        scale_tril=tfd.fill_triangular(L),
        scale_perturb_factor=V,
        shift=shift,
      ))

      alpha = self.alpha[i]
      abs_alpha = tf.abs(alpha) + .01
      bijectors.append(LeakyReLU(alpha=abs_alpha))

    base_dist = tfd.MultivariateNormalDiag(loc=tf.zeros([2], DTYPE))
    mlp_bijector = tfb.Chain(list(reversed(bijectors[:-1])), name='2d_mlp_bijector')
    dist = tfd.TransformedDistribution(distribution=base_dist, bijector=mlp_bijector)

    return {"dist": dist}

There are probably other easy wins like this; a profiling tool will nudge you in the right direction.

Also, note that per the RFC, TF 2.0 is less about "eager execution" per se and more about changing how one interacts with graphs.
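
For completeness, a minimal sketch of that direction (assuming TF 1.x with tf.contrib.eager available; this is a general illustration, not part of the original answer): tf.contrib.eager.defun, a precursor of the tf.function proposed for 2.0, traces a Python function into a graph once and reuses that graph on later calls, recovering much of graph mode's per-step speed:

import tensorflow as tf
tf.enable_eager_execution()

# defun traces the Python function into a graph on the first call and
# executes that graph on subsequent calls, instead of re-running the
# Python code each time.
@tf.contrib.eager.defun
def square_sum(x):
  return tf.reduce_sum(tf.square(x))

print(square_sum(tf.ones([4])))  # first call traces; later calls reuse the graph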

Hope that helps.
