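Loading custom weights into TensorFlow layers

Time: 2017-07-28 15:13:19

Tags: machine-learning tensorflow neural-network deep-learning conv-neural-network

I trained a DNN in Theano, but because of some problems I have switched to TensorFlow. I converted the weights from the Theano format to the TensorFlow format, and I built the same architecture in TensorFlow that I had in Theano. How can I use the weight file I have on disk to initialize the weights of the layers? This is my basic architecture: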

input_layer = keras.layers.InputLayer(input_shape=(224,224,3),input_tensor=features)

# Conv block 1
conv1_1 = tf.layers.conv2d(inputs=input_layer, 
                           filters=64, kernel_size=[3,3], 
                           padding='same', 
                           activation=tf.nn.relu,
                           name='conv1_1')

conv1_2 = tf.layers.conv2d(inputs=conv1_1, 
                           filters=64, kernel_size=[3,3], 
                           padding='same', 
                           activation=tf.nn.relu,
                           name='conv1_2')

pool1 = tf.layers.max_pooling2d(inputs=conv1_2,
                                pool_size=(2,2), 
                                strides=(2,2), 
                                name='pool1')


# Conv block 2
conv2_1 = tf.layers.conv2d(inputs=pool1, 
                           filters=128, kernel_size=[3,3], 
                           padding='same', 
                           activation=tf.nn.relu,
                           name='conv2_1')

conv2_2 = tf.layers.conv2d(inputs=conv2_1, 
                           filters=128, kernel_size=[3,3], 
                           padding='same', 
                           activation=tf.nn.relu,
                           name='conv2_2')

pool2 = tf.layers.max_pooling2d(inputs=conv2_2,
                                pool_size=(2,2), 
                                strides=(2,2), 
                                name='pool2')

# Conv block 3
conv3_1 = tf.layers.conv2d(inputs=pool2, 
                           filters=256, kernel_size=[3,3], 
                           padding='same', 
                           activation=tf.nn.relu,
                           name='conv3_1')

conv3_2 = tf.layers.conv2d(inputs=conv3_1, 
                           filters=256, kernel_size=[3,3], 
                           padding='same', 
                           activation=tf.nn.relu,
                           name='conv3_2')

conv3_3 = tf.layers.conv2d(inputs=conv3_2, 
                           filters=256, kernel_size=[3,3], 
                           padding='same', 
                           activation=tf.nn.relu,
                           name='conv3_3')

pool3 = tf.layers.max_pooling2d(inputs=conv3_3,
                                pool_size=(2,2), 
                                strides=(2,2), 
                                name='pool3')

How do I load the weights for these layers from the weight file I have on disk? Please help.

1 answer:

Answer 0 (score: 3)

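There are many different ways to do this. I'd say the simplest is to export the weight (parameter) matrices and bias vectors as NumPy arrays using np.savez.

For example, you can build a dictionary and add the arrays to it: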

import numpy as np

params = {}
...

params['fc1/weights'] = this_weight_matrix
params['fc1/biases'] = this_bias_vector
...

np.savez('model_weights', **params)
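As a rough sketch of this export step, assuming the trained parameters are already available as NumPy arrays, the archive can be written and read back to confirm that the key names survive the round trip. The shapes and random values are placeholders rather than the real model, and the temporary directory is only there to keep the example self-contained:

```python
import os
import tempfile

import numpy as np

# Placeholder arrays standing in for the exported Theano parameters.
rng = np.random.RandomState(0)
params = {
    'fc1/weights': rng.randn(8, 4).astype(np.float32),
    'fc1/biases': np.zeros(4, dtype=np.float32),
}

# np.savez appends the .npz extension when the file name lacks it.
path = os.path.join(tempfile.mkdtemp(), 'model_weights')
np.savez(path, **params)

# Read the archive back and copy every array into a plain dictionary.
with np.load(path + '.npz') as archive:
    restored = {key: archive[key] for key in archive.files}

# Every array should come back unchanged under its original key.
for key, value in params.items():
    assert np.array_equal(restored[key], value)
```

Note that the file on disk is named model_weights.npz even though the name passed to np.savez has no extension.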

Then, assuming you have set up your TensorFlow graph, here is an example of a fully connected layer written as a wrapper function:

import tensorflow as tf

def fc_layer(input_tensor, n_output_units, name,
             activation_fn=None, seed=None,
             weight_params=None, bias_params=None):

    with tf.variable_scope(name):

        if weight_params is not None:
            weights = tf.Variable(weight_params, name='weights',
                                  dtype=tf.float32)
        else:
            weights = tf.Variable(
                tf.truncated_normal(
                    shape=[input_tensor.get_shape().as_list()[-1], n_output_units],
                    mean=0.0,
                    stddev=0.1,
                    dtype=tf.float32,
                    seed=seed),
                name='weights')

        if bias_params is not None:
            biases = tf.Variable(bias_params, name='biases', 
                                 dtype=tf.float32)

        else:
            biases = tf.Variable(tf.zeros(shape=[n_output_units]),
                                 name='biases', 
                                 dtype=tf.float32)

        act = tf.matmul(input_tensor, weights) + biases

        if activation_fn is not None:
            act = activation_fn(act)

    return act

Next, assume you load the parameters you saved to disk back into your Python session:

param_dict = np.load('model_weights.npz')
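Before building the graph, it may help to confirm that the archive contains every array the layers expect, with the right shapes. The following is only a sketch: `check_shapes` is a hypothetical helper, the expected shapes are made up, and an in-memory buffer stands in for the file on disk:

```python
import io

import numpy as np


def check_shapes(archive, expected_shapes):
    """Return (name, problem) pairs for missing or mis-shaped arrays."""
    problems = []
    for name, shape in expected_shapes.items():
        if name not in archive.files:
            problems.append((name, 'missing'))
        elif archive[name].shape != tuple(shape):
            problems.append((name, 'shape %s' % (archive[name].shape,)))
    return problems


# In-memory stand-in for the .npz file; the shapes are hypothetical.
buffer = io.BytesIO()
np.savez(buffer, **{'fc1/weights': np.zeros((8, 4), dtype=np.float32),
                    'fc1/biases': np.zeros(4, dtype=np.float32)})
buffer.seek(0)

param_dict = np.load(buffer)
problems = check_shapes(param_dict, {'fc1/weights': (8, 4),
                                     'fc1/biases': (4,),
                                     'fc2/weights': (4, 2)})
print(problems)  # [('fc2/weights', 'missing')]
```

An empty list means every expected array is present with the expected shape.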

Then, when you set up the actual graph (using the wrapper function above), you can proceed as follows:

g = tf.Graph()
with g.as_default():
    fc1 = fc_layer(input_tensor=tf_x, 
                   n_output_units=n_hidden_1, 
                   name='fc1',
                   weight_params=param_dict['fc1/weights'], 
                   bias_params=param_dict['fc1/biases'],
                   activation_fn=tf.nn.relu)
...
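The question is about convolutional layers, whose kernels are stored in a different axis order in the two frameworks. Here is a minimal sketch, assuming the Theano kernels are in (out_channels, in_channels, rows, cols) order and were trained with Theano's default filter flipping. `theano_kernel_to_tf` is a hypothetical helper, and the flip should be skipped if the earlier conversion already accounted for it:

```python
import numpy as np


def theano_kernel_to_tf(kernel, flip=True):
    """Convert a conv kernel from (out, in, rows, cols) to (rows, cols, in, out)."""
    if flip:
        # Theano's conv2d flips filters by default; TensorFlow does not,
        # so reverse both spatial axes to get the same outputs.
        kernel = kernel[:, :, ::-1, ::-1]
    return np.ascontiguousarray(np.transpose(kernel, (2, 3, 1, 0)))


# Placeholder kernel with the shape conv1_1 would have in Theano layout:
# 64 filters, 3 input channels, 3x3 window.
theano_kernel = np.arange(64 * 3 * 3 * 3, dtype=np.float32).reshape(64, 3, 3, 3)
tf_kernel = theano_kernel_to_tf(theano_kernel)
print(tf_kernel.shape)  # (3, 3, 3, 64)
```

The resulting array has the (rows, cols, in_channels, out_channels) layout that `tf.layers.conv2d` uses for its kernel, so in TensorFlow 1.x it could be supplied through `kernel_initializer=tf.constant_initializer(tf_kernel)`, with the bias passed the same way through `bias_initializer`.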