Question

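My cost function involves computing log(det(A)) (assume det(A) is positive so that the log makes sense; A is not Hermitian, so the Cholesky decomposition is not applicable here). When det(A) is very large or very small, calling det(A) directly will overflow or underflow. To circumvent this, one uses the mathematical fact that

log(det(A)) = Tr(log(A)),

where the latter can be evaluated using the LU decomposition (which is more efficient than eigenvalue/SVD approaches). This algorithm is already implemented in numpy as numpy.linalg.slogdet, so the problem is how to call numpy from TensorFlow.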
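To make the LU idea concrete, here is a minimal pure-Python sketch. It is not numpy's implementation, and the helper name slogdet_lu is made up for illustration. It runs Gaussian elimination with partial pivoting and accumulates the logs of the pivots instead of their product, which is why the result stays finite for the 500x500 matrix used in this question.

```python
import math

def slogdet_lu(matrix):
    """Return (sign, log|det|) via Gaussian elimination with partial pivoting."""
    a = [[float(v) for v in row] for row in matrix]  # work on a copy
    n = len(a)
    sign, logabsdet = 1.0, 0.0
    for k in range(n):
        # Partial pivoting: pick the row with the largest entry in column k.
        p = max(range(k, n), key=lambda r: abs(a[r][k]))
        if a[p][k] == 0.0:
            return 0.0, -math.inf  # singular matrix
        if p != k:
            a[k], a[p] = a[p], a[k]
            sign = -sign  # each row swap flips the sign of the determinant
        pivot = a[k][k]
        if pivot < 0:
            sign = -sign
        logabsdet += math.log(abs(pivot))  # accumulate logs, never the product
        for r in range(k + 1, n):
            factor = a[r][k] / pivot
            if factor == 0.0:
                continue  # nothing to eliminate in this row
            for c in range(k, n):
                a[r][c] -= factor * a[k][c]
    return sign, logabsdet

n = 500
A = [[10.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

# Naive determinant of the diagonal matrix: 10**500 does not fit in a float64.
naive_det = 1.0
for i in range(n):
    naive_det *= A[i][i]
print(naive_det)  # overflows to inf

sign, logabsdet = slogdet_lu(A)
print(sign, logabsdet)  # sign 1.0, log|det| close to 500 * log(10)
```

The same accumulation also protects against underflow, since a product of many tiny pivots becomes a sum of large negative logs.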

Here is what I tried:

import numpy as np
import tensorflow as tf
from tensorflow.python.framework import function

def logdet_np(a):
    _, l = np.linalg.slogdet(a)
    return l

def logdet1(a):
    return tf.py_func(logdet_np, [a], tf.float64)

@function.Defun(tf.float64, func_name='LogDet')
def logdet2(a):
    return tf.py_func(logdet_np, [a], tf.float64)

with tf.Session() as sess:
    a = tf.constant(np.eye(500)*10.)
    #print(sess.run(logdet1(a)))
    print(sess.run(logdet2(a)))

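I first define a Python function that passes out the numpy result. Then I define two versions of the logdet function using tf.py_func. The second one is decorated with function.Defun, which is used later to define a TensorFlow function and its gradient. When I test them, the first function, logdet1, works and gives the correct result, but the second function, logdet2, raises a KeyError.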

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/tensorflow/python/ops/script_ops.py in __call__(self, token, args)
     77   def __call__(self, token, args):
     78     """Calls the registered function for `token` with args."""
---> 79     func = self._funcs[token]
     80     if func is None:
     81       raise ValueError("callback %s is not found" % token)

KeyError: 'pyfunc_0'

My question is: what is wrong with the Defun decorator? Why does it conflict with py_func? How can I wrap numpy functions in TensorFlow correctly?

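The remaining part, defining the gradient for logdet, is related to the question "matrix determinant differentiation in tensorflow". Following the solution in that question, one would attempt to write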

@function.Defun(tf.float64, tf.float64, func_name='LogDet_Gradient')
def logdet_grad(a, grad):
    a_adj_inv = tf.matrix_inverse(a, adjoint=True)
    out_shape = tf.concat([tf.shape(a)[:-2], [1, 1]], axis=0)
    return tf.reshape(grad, out_shape) * a_adj_inv

@function.Defun(tf.float64, func_name='LogDet', grad_func=logdet_grad)
def logdet(a):
    return tf.py_func(logdet_np, [a], tf.float64, stateful=False, name='LogDet')

The above code would work if one could resolve the conflict between Defun and py_func, which is the key question I raised above.

Answer 1

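With the help of @MaxB, I am posting the code here that defines the function logdet for log(abs(det(A))) and its gradient.

logdet calls the numpy function numpy.linalg.slogdet to compute the log of the determinant using the idea log(det(A)) = Tr(log(A)), which is robust against overflow/underflow of the determinant. It is based on the LU decomposition, which is more efficient than eigenvalue/SVD-based methods.
The numpy function slogdet returns a tuple containing the sign of the determinant and log(abs(det(A))). The sign is discarded, because it does not contribute to the gradient signal in optimization.
The gradient of logdet is computed by matrix inversion, according to grad log(det(A)) = inv(A)^T. It is based on TensorFlow's code for _MatrixDeterminantGrad, with slight modifications.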
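The gradient identity grad log(abs(det(A))) = inv(A)^T can be sanity-checked numerically. The following is a small sketch for the 2x2 case only, using central finite differences; the helper names are made up for illustration and the test matrix is arbitrary.

```python
import math

def logabsdet_2x2(m):
    (a, b), (c, d) = m
    return math.log(abs(a * d - b * c))

def inv_transpose_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    # inv(A) = 1/det * [[d, -b], [-c, a]]; its transpose is returned here.
    return [[d / det, -c / det], [-b / det, a / det]]

def finite_difference_grad(f, m, eps=1e-6):
    """Central-difference estimate of df/dm[i][j] for a 2x2 matrix m."""
    grad = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        for j in range(2):
            plus = [row[:] for row in m]
            minus = [row[:] for row in m]
            plus[i][j] += eps
            minus[i][j] -= eps
            grad[i][j] = (f(plus) - f(minus)) / (2 * eps)
    return grad

A = [[3.0, 1.0], [-2.0, 4.0]]  # non-symmetric, det = 14
analytic = inv_transpose_2x2(A)
numeric = finite_difference_grad(logabsdet_2x2, A)
max_err = max(abs(analytic[i][j] - numeric[i][j])
              for i in range(2) for j in range(2))
print(max_err)  # should be tiny if the identity holds
```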

import numpy as np
import tensorflow as tf
# from https://gist.github.com/harpone/3453185b41d8d985356cbe5e57d67342
# Define a custom py_func that also takes a gradient op as an argument:
def py_func(func, inp, Tout, stateful=True, name=None, grad=None):
    # Generate a unique name to avoid registering duplicate gradients:
    rnd_name = 'PyFuncGrad' + str(np.random.randint(0, 10**8))
    tf.RegisterGradient(rnd_name)(grad)  # see _MySquareGrad for grad example
    g = tf.get_default_graph()
    with g.gradient_override_map({"PyFunc": rnd_name}):
        return tf.py_func(func, inp, Tout, stateful=stateful, name=name)

# from https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/ops/linalg_grad.py
# Gradient for logdet
def logdet_grad(op, grad):
    a = op.inputs[0]
    a_adj_inv = tf.matrix_inverse(a, adjoint=True)
    out_shape = tf.concat([tf.shape(a)[:-2], [1, 1]], axis=0)
    return tf.reshape(grad, out_shape) * a_adj_inv

# Define logdet by calling numpy.linalg.slogdet
def logdet(a, name=None):
    with tf.name_scope(name, 'LogDet', [a]) as name:
        res = py_func(lambda a: np.linalg.slogdet(a)[1],
                      [a],
                      tf.float64,
                      name=name,
                      grad=logdet_grad)  # set the gradient
        return res

One can test that logdet works for very large/small determinants and that its gradient is also correct.

i = tf.constant(np.eye(500))
x = tf.Variable(np.array([10.]))
y = logdet(x*i)
dy = tf.gradients(y, [x])
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run([y, dy]))

Result: [1151.2925464970251, [array([ 50.])]]
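These numbers can be checked by hand: for the matrix x*I with I the 500x500 identity, det = x^500, so log(det) = 500*log(x) and its derivative with respect to x is 500/x. A quick check with the standard library (variable names are illustrative):

```python
import math

n, x = 500, 10.0
y = n * math.log(x)   # log det(x * I_n) = n * log(x)
dy_dx = n / x         # d/dx [n * log(x)] = n / x
print(y, dy_dx)       # y is about 1151.29, dy_dx is 50.0
```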

Answer 2

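If your problem is overflow, you can get around it with simple math: the determinant is the product of the eigenvalues, so

log(det(A)) = sum_i log(lambda_i),

where lambda_i are the eigenvalues of A. So you only need to compute the eigenvalues, take their logs, and sum them.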
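One caveat for the non-Hermitian case in the question is that the eigenvalues may be complex. Since the real part of log(lambda) is log(abs(lambda)), the real part of the sum is still log(abs(det(A))). Here is a toy sketch for a 2x2 matrix, using the closed-form eigenvalues rather than a general eigenvalue solver; the helper name is made up for illustration.

```python
import cmath
import math

def eigenvalues_2x2(m):
    """Eigenvalues of a 2x2 matrix from its characteristic polynomial."""
    (a, b), (c, d) = m
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2

A = [[1.0, -3.0], [2.0, 1.0]]  # non-symmetric, det = 7, complex eigenvalues
log_sum = sum(cmath.log(lam) for lam in eigenvalues_2x2(A))
print(log_sum.real, math.log(7.0))  # the two values should agree
```

For larger matrices a general eigenvalue routine would be needed, which is typically more expensive than the LU route discussed in the question.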

Answer 3

You can use SVD to decompose A:

A = U S V'

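Since the determinant of a product is the product of the determinants, the determinants of U and V' are 1 or -1, and the determinant of S is non-negative,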

abs(det(A)) = det(S)

Therefore, the log of the (positive) determinant can be computed as

tf.reduce_sum(tf.log(svd(A, compute_uv=False)))
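The same identity can be checked outside TensorFlow. This is a NumPy sketch, assuming a random real test matrix (the scale factor 1e10 is an arbitrary choice to make the determinant very large); it compares the sum of log singular values against numpy.linalg.slogdet.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 50)) * 1e10  # non-symmetric test matrix

# Singular values only; U and V' are not needed for the determinant.
singular_values = np.linalg.svd(A, compute_uv=False)
logabsdet_svd = np.sum(np.log(singular_values))

# Reference value from the LU-based routine.
_, logabsdet_lu = np.linalg.slogdet(A)
print(logabsdet_svd, logabsdet_lu)  # the two values should agree closely
```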

As of TF 1.1, tf.svd lacks a gradient (future versions will probably have it), so I suggest adopting the implementation from kofd's code:

https://github.com/tensorflow/tensorflow/issues/6503

Computing the log of the determinant in TensorFlow when the determinant overflows/underflows
