Checkpointing a Keras model: TypeError: can't pickle _thread.lock objects

Asked: 2017-11-02 02:30:01

Tags: tensorflow keras pickle

This error seems to have come up in different contexts here before, but I am not dumping the model directly; I am using the ModelCheckpoint callback. Any idea what could be going wrong?

Info:

  • Keras version 2.0.8
  • TensorFlow version 1.3.0
  • Python 3.6

Minimal example that reproduces the error:

from keras.layers import Input, Lambda, Dense
from keras.models import Model
from keras.callbacks import ModelCheckpoint
from keras.optimizers import Adam
import tensorflow as tf
import numpy as np

x = Input(shape=(30,3))
low = tf.constant(np.random.rand(30, 3).astype('float32'))
high = tf.constant(1 + np.random.rand(30, 3).astype('float32'))
clipped_out_position = Lambda(lambda x, low, high: tf.clip_by_value(x, low, high),
                                      arguments={'low': low, 'high': high})(x)

model = Model(inputs=x, outputs=[clipped_out_position])
optimizer = Adam(lr=.1)
model.compile(optimizer=optimizer, loss="mean_squared_error")
checkpoint = ModelCheckpoint("debug.hdf", monitor="val_loss", verbose=1, save_best_only=True, mode="min")
training_callbacks = [checkpoint]
model.fit(np.random.rand(100, 30, 3), [np.random.rand(100, 30, 3)], callbacks=training_callbacks, epochs=50, batch_size=10, validation_split=0.33)

Error output:

Train on 67 samples, validate on 33 samples
Epoch 1/50
10/67 [===>..........................] - ETA: 0s - loss: 0.1627Epoch 00001: val_loss improved from inf to 0.17002, saving model to debug.hdf
Traceback (most recent call last):
  File "debug_multitask_inverter.py", line 19, in <module>
    model.fit(np.random.rand(100, 30, 3), [np.random.rand(100, 30, 3)], callbacks=training_callbacks, epochs=50, batch_size=10, validation_split=0.33)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/site-packages/keras/engine/training.py", line 1631, in fit
    validation_steps=validation_steps)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/site-packages/keras/engine/training.py", line 1233, in _fit_loop
    callbacks.on_epoch_end(epoch, epoch_logs)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/site-packages/keras/callbacks.py", line 73, in on_epoch_end
    callback.on_epoch_end(epoch, logs)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/site-packages/keras/callbacks.py", line 414, in on_epoch_end
    self.model.save(filepath, overwrite=True)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/site-packages/keras/engine/topology.py", line 2556, in save
    save_model(self, filepath, overwrite, include_optimizer)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/site-packages/keras/models.py", line 107, in save_model
    'config': model.get_config()
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/site-packages/keras/engine/topology.py", line 2397, in get_config
    return copy.deepcopy(config)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 215, in _deepcopy_list
    append(deepcopy(a, memo))
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 280, in _reconstruct
    state = deepcopy(state, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 280, in _reconstruct
    state = deepcopy(state, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 280, in _reconstruct
    state = deepcopy(state, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 169, in deepcopy
    rv = reductor(4)
TypeError: can't pickle _thread.lock objects

3 Answers:

Answer 0 (score: 11)

When a Lambda layer is saved, the arguments passed to it are saved as well. In this case that includes two tf.Tensors, and it seems Keras currently does not support serializing tf.Tensors in the model config.
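The root cause can be reproduced without Keras at all: model.get_config() deep-copies the layer config, and deepcopy falls back to pickle for objects it does not recognize, which fails on anything holding a thread lock. A minimal stand-in sketch (threading.Lock takes the place of the tf.Tensor, which internally references TF's graph machinery and its lock):

```python
import copy
import threading

# Hypothetical stand-in for a Lambda layer config whose 'arguments'
# contain a tf.Tensor; the lock is what ultimately cannot be pickled.
config = {'arguments': {'low': threading.Lock()}}

try:
    copy.deepcopy(config)  # roughly what model.get_config() triggers
except TypeError as e:
    print(e)  # e.g. "can't pickle _thread.lock objects"
```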

However, numpy arrays can be serialized without problems. So instead of passing tf.Tensors in arguments, pass in numpy arrays and convert them to tf.Tensors inside the lambda function:

x = Input(shape=(30,3))
low = np.random.rand(30, 3)
high = 1 + np.random.rand(30, 3)
clipped_out_position = Lambda(lambda x, low, high: tf.clip_by_value(x, tf.constant(low, dtype='float32'), tf.constant(high, dtype='float32')),
                              arguments={'low': low, 'high': high})(x)

One problem with the lines above is that when you try to load this model, you will likely see NameError: name 'tf' is not defined. That is because TensorFlow is not imported in the file that reconstructs the Lambda layer (core.py).

Changing tf to K.tf fixes the problem. Alternatively, you can replace tf.constant() with K.constant(), which automatically converts low and high into float32 tensors:

from keras import backend as K

x = Input(shape=(30,3))
low = np.random.rand(30, 3)
high = 1 + np.random.rand(30, 3)
clipped_out_position = Lambda(lambda x, low, high: K.tf.clip_by_value(x, K.constant(low), K.constant(high)),
                              arguments={'low': low, 'high': high})(x)


Answer 1 (score: 1)

See my post at https://github.com/keras-team/keras/issues/8343#issuecomment-470376703.

This exception is raised because you are trying to serialize a tf.tensor, so any approach that avoids that serialization will work, including:

  • Don't serialize at all: save only the model weights, since serialization happens when you try to save the model structure as a json/yaml string.

  • Eliminate TensorFlow tensors: make sure every tensor in your model is produced by a Keras layer, or use ndarray data where possible, as @Yu-Yang suggested.
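The weights-only option works because model weights are plain numeric arrays, which pickle fine; it is the model config, which embeds the tf.Tensor arguments, that fails. A toy sketch of the distinction, with nested lists standing in for ndarray weights and a lock standing in for the tensor:

```python
import pickle
import threading

weights = [[0.1, 0.2, 0.3]]                        # stand-in for ndarray weights: picklable
config = {'arguments': {'low': threading.Lock()}}  # stand-in for the Lambda config: not picklable

pickle.dumps(weights)  # succeeds
try:
    pickle.dumps(config)
except TypeError:
    print("config cannot be serialized; save weights only")
```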

Answer 2 (score: 1)

To clarify: the problem is not that Keras cannot pickle a tensor inside a Lambda layer (other scenarios are covered below). The problem is that the arguments of the Python function (here, the lambda function) are serialized independently of the function, outside the context of the lambda function itself. This works for "static" arguments but fails otherwise. To avoid it, non-static function arguments should be wrapped in another function.

Here are several workarounds:


  1. Use static variables, e.g. Python/numpy variables (as mentioned above):
low = np.random.rand(30, 3)
high = 1 + np.random.rand(30, 3)

x = Input(shape=(30,3))
clipped_out_position = Lambda(lambda x: tf.clip_by_value(x, low, high))(x)

  2. Wrap the lambda function with functools.partial:
import functools

clip_by_value = functools.partial(
   tf.clip_by_value,
   clip_value_min=low,
   clip_value_max=high)

x = Input(shape=(30,3))
clipped_out_position = Lambda(lambda x: clip_by_value(x))(x)
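The partial trick in plain Python terms: the frozen keyword arguments live inside the partial object, not in a separately serialized arguments dict. A toy illustration with a hypothetical scalar clip helper (not the TF function):

```python
import functools

def clip(x, lo, hi):
    # Scalar stand-in for tf.clip_by_value.
    return max(lo, min(hi, x))

# lo and hi are frozen into the partial object itself.
clip01 = functools.partial(clip, lo=0.0, hi=1.0)
print(clip01(1.7))   # 1.0
print(clip01(-0.5))  # 0.0
print(clip01(0.3))   # 0.3
```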

  3. Wrap the lambda function in a closure:
low = tf.constant(np.random.rand(30, 3).astype('float32'))
high = tf.constant(1 + np.random.rand(30, 3).astype('float32'))

def clip_by_value(t):
    return tf.clip_by_value(t, low, high)

x = Input(shape=(30,3))
clipped_out_position = Lambda(lambda x: clip_by_value(x))(x)

Note: although you can sometimes skip creating an explicit lambda function and use the more concise snippet:

clipped_out_position = Lambda(clip_by_value)(x)

the missing extra wrapping layer of the lambda function (i.e. lambda t: clip_by_value(t)) may still lead to the same problem when the "deep copy" of the function arguments is performed, and should be avoided.


  4. Finally, you can wrap the model logic into separate Keras layers, which in this case may look a bit over-engineered:
x = Input(shape=(30,3))
low = Lambda(lambda t: tf.constant(np.random.rand(30, 3).astype('float32')))(x)
high = Lambda(lambda t: tf.constant(1 + np.random.rand(30, 3).astype('float32')))(x)
clipped_out_position = Lambda(lambda x: tf.clip_by_value(*x))((x, low, high))

Note: tf.clip_by_value(*x) in the last Lambda layer is just an unpacked argument tuple, which could also be written in the more verbose form tf.clip_by_value(x[0], x[1], x[2]).


(Scenarios mentioned below): note the following case, in which your lambda function tries to capture part of a class instance, which also breaks serialization (due to late binding):

class MyModel:
    def __init__(self):
        self.low = np.random.rand(30, 3)
        self.high = 1 + np.random.rand(30, 3)

    def run(self):
        x = Input(shape=(30,3))
        clipped_out_position = Lambda(lambda x: tf.clip_by_value(x, self.low, self.high))(x)
        model = Model(inputs=x, outputs=[clipped_out_position])
        optimizer = Adam(lr=.1)
        model.compile(optimizer=optimizer, loss="mean_squared_error")
        checkpoint = ModelCheckpoint("debug.hdf", monitor="val_loss", verbose=1, save_best_only=True, mode="min")
        training_callbacks = [checkpoint]
        model.fit(np.random.rand(100, 30, 3), 
                 [np.random.rand(100, 30, 3)], callbacks=training_callbacks, epochs=50, batch_size=10, validation_split=0.33)

MyModel().run()

Early binding can be ensured with the following default-argument trick:

        (...)
        clipped_out_position = Lambda(lambda x, l=self.low, h=self.high: tf.clip_by_value(x, l, h))(x)
        (...)
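The default-argument trick works because of how Python binds closure variables: a plain lambda looks up self.low/self.high at call time (late binding), while a default argument is evaluated once when the lambda is defined. A Keras-free illustration of the difference:

```python
# Late binding: each lambda looks up i only when called, after the loop ends.
funcs_late = [lambda: i for i in range(3)]
print([f() for f in funcs_late])   # [2, 2, 2]

# Early binding via default arguments: i is captured at definition time.
funcs_early = [lambda i=i: i for i in range(3)]
print([f() for f in funcs_early])  # [0, 1, 2]
```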