Question

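I am building a custom RNN cell in which I want to update only part of the state at each step. The Keras model builds, but I get an error saying that an operation has no gradient, i.e. that it is not differentiable. That makes sense to me, since I use the tf.gather_nd and tf.scatter_nd functions, but I would like to know whether there is a way around this. Any help would be much appreciated!

I am currently using TF 2.0 with a custom Keras layer.

The code below shows the call function of the custom RNN cell. Essentially, most of the state bypasses the step performed in the cell.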

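    def call(self, inputs, state):
        # u: inputs multiplied by B; p: indices of the players in the current game
        u, p = inputs
        # Keras passes the state as a list; adding zeros yields a fresh tensor
        state = state[0] + tf.zeros(state[0].shape)
        u = tf.reshape(u, (u.shape[0], self.players*2*(self.outputs+1), 1))
        # Pull out only the states of the players in the game
        xp = self.get_game_states(state, p)
        # Linear state update applied to the selected states only
        xp_plus = tf.matmul(self.A, xp) + tf.matmul(self.B, u)
        # Write the change back; all other states pass through unchanged
        state = self.update_states(state, p, xp_plus, xp)
        return state, [state]

    def get_game_states(self, states, players_in_game):
        batch_size = players_in_game.shape[0]
        # Gather the selected players' states, one batch element at a time
        state = tf.stack([tf.gather_nd(states[i], players_in_game[i]) for i in range(batch_size)], axis=0)
        # Flatten to a column vector per batch element
        state = tf.reshape(state, (batch_size, self.player_states*self.players*2, 1))
        return state

    def update_states(self, states, players_in_game, xp_plus, xp):
        # Only the difference is scattered, so untouched states stay as they are
        state_change = xp_plus - xp
        state_change = tf.reshape(state_change, (state_change.shape[0], self.players*2, -1))
        # Zero tensor of the full state shape with the changes at the selected indices
        sparse_tensor = tf.stack([tf.scatter_nd(players_in_game[i], state_change[i], states[0].shape) for i in range(states.shape[0])])
        states = states + sparse_tensor
        return states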
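
One possible way to avoid both the per-sample Python loop and the gather/scatter pair is to express the selection as a product with a one-hot matrix, which only relies on tf.matmul. The following is a minimal sketch under stated assumptions, not the original cell: the helper names gather_players and scatter_add_players and the toy shapes are hypothetical, and it assumes the player indices are available as an integer tensor of shape (batch, k) and that the number of players is statically known.

```python
import tensorflow as tf

def gather_players(states, idx):
    """Select rows idx[b] from states[b] for every batch element b.

    states: (batch, n_players, n_states), idx: (batch, k) integer indices.
    Returns a tensor of shape (batch, k, n_states).
    """
    # One-hot selection matrix of shape (batch, k, n_players)
    sel = tf.one_hot(idx, depth=states.shape[1], dtype=states.dtype)
    return tf.matmul(sel, states)

def scatter_add_players(states, idx, delta):
    """Add delta[b, j] to row idx[b, j] of states[b]; other rows are unchanged.

    delta: (batch, k, n_states). Duplicate indices are summed.
    """
    sel = tf.one_hot(idx, depth=states.shape[1], dtype=states.dtype)
    # Transposing the selection matrix maps (batch, k, .) back to (batch, n_players, .)
    return states + tf.matmul(sel, delta, transpose_a=True)

# Toy data (hypothetical): batch of 2, 4 players, 3 state values per player
states = tf.reshape(tf.range(24, dtype=tf.float32), (2, 4, 3))
idx = tf.constant([[0, 2], [1, 3]])

with tf.GradientTape() as tape:
    tape.watch(states)
    xp = gather_players(states, idx)
    # Stand-in for the real update: increase the selected rows by 50 %
    new_states = scatter_add_players(states, idx, 0.5 * xp)
    loss = tf.reduce_sum(new_states)

grad = tape.gradient(loss, states)
print(grad[0])  # gradient rows for the first batch element
```

The one-hot product costs more memory than an index lookup when the number of players is large, so this trade-off is only reasonable for small state tensors.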

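ValueError: An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.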
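
As far as I can tell, tf.gather_nd and tf.scatter_nd both have gradients registered in TensorFlow, so they may not be the actual cause. This Keras error can also be raised when a trainable weight never contributes to the loss. A small check with tf.GradientTape can help narrow this down. The sketch below uses toy shapes and a stand-in weight A; these are assumptions for illustration and not the original model.

```python
import tensorflow as tf

# Toy state (hypothetical): 5 players with 3 state values each
states = tf.random.normal((5, 3))
# Hypothetical stand-in for the cell's trainable matrix self.A
A = tf.Variable(tf.eye(3) * 2.0)
# Indices of the two players taking part in this step
idx = tf.constant([[1], [3]])

with tf.GradientTape() as tape:
    tape.watch(states)
    xp = tf.gather_nd(states, idx)                    # (2, 3): selected rows
    xp_plus = tf.matmul(xp, A)                        # (2, 3): updated rows
    delta = tf.scatter_nd(idx, xp_plus - xp, [5, 3])  # (5, 3): zeros elsewhere
    new_states = states + delta
    loss = tf.reduce_sum(new_states ** 2)

grad_states, grad_A = tape.gradient(loss, [states, A])
# If gather_nd/scatter_nd blocked the gradient, one of these would be None
print(grad_states is not None, grad_A is not None)
```

If both gradients are present in a check like this, it may be worth looking at whether every weight created in the layer is actually used in call.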

How to fix: non-differentiable functions in a custom RNN cell

0 answers: