I previously trained a deep neural network in Theano, but due to some issues I switched to TensorFlow. I have already converted the weights from the Theano format to the TensorFlow format, and I have built the same architecture in TensorFlow that I had in Theano. But how do I initialize the weights of these layers
from the weight file on my disk? Here is my base architecture:
input_layer = keras.layers.InputLayer(input_shape=(224, 224, 3), input_tensor=features)

# Conv block 1
conv1_1 = tf.layers.conv2d(inputs=input_layer, filters=64, kernel_size=[3, 3],
                           padding='same', activation=tf.nn.relu, name='conv1_1')
conv1_2 = tf.layers.conv2d(inputs=conv1_1, filters=64, kernel_size=[3, 3],
                           padding='same', activation=tf.nn.relu, name='conv1_2')
pool1 = tf.layers.max_pooling2d(inputs=conv1_2, pool_size=(2, 2), strides=(2, 2), name='pool1')

# Conv block 2
conv2_1 = tf.layers.conv2d(inputs=pool1, filters=128, kernel_size=[3, 3],
                           padding='same', activation=tf.nn.relu, name='conv2_1')
conv2_2 = tf.layers.conv2d(inputs=conv2_1, filters=128, kernel_size=[3, 3],
                           padding='same', activation=tf.nn.relu, name='conv2_2')
pool2 = tf.layers.max_pooling2d(inputs=conv2_2, pool_size=(2, 2), strides=(2, 2), name='pool2')

# Conv block 3
conv3_1 = tf.layers.conv2d(inputs=pool2, filters=256, kernel_size=[3, 3],
                           padding='same', activation=tf.nn.relu, name='conv3_1')
conv3_2 = tf.layers.conv2d(inputs=conv3_1, filters=256, kernel_size=[3, 3],
                           padding='same', activation=tf.nn.relu, name='conv3_2')
conv3_3 = tf.layers.conv2d(inputs=conv3_2, filters=256, kernel_size=[3, 3],
                           padding='same', activation=tf.nn.relu, name='conv3_3')
pool3 = tf.layers.max_pooling2d(inputs=conv3_3, pool_size=(2, 2), strides=(2, 2), name='pool3')
How can I load the weights for these layers from the weight file on my disk? Please help.
Answer:
There are a number of different ways to accomplish this. I think the simplest one is to export the weight (parameter) matrices and bias vectors as NumPy arrays via np.savez.
For example, you can build a dictionary and add the arrays like so:
import numpy as np

params = {}
...
params['fc1/weights'] = this_weight_matrix
params['fc1/biases'] = this_bias_vector
...
np.savez('model_weights', **params)
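Where those arrays come from depends on your Theano setup. Here is a minimal sketch, assuming each layer's parameters are Theano shared variables collected in a theano_layers dict of your own (the name and structure are hypothetical). Note that Theano stores conv kernels as (out_channels, in_channels, rows, cols) while tf.layers.conv2d expects (rows, cols, in_channels, out_channels), hence the transpose:

import numpy as np

params = {}
for name, (W, b) in theano_layers.items():   # hypothetical {layer_name: (W, b)} mapping
    w = W.get_value()                        # shared variable -> NumPy array
    if w.ndim == 4:                          # conv kernel: Theano (out, in, rows, cols)
        w = w.transpose(2, 3, 1, 0)          # -> TensorFlow (rows, cols, in, out)
    params[name + '/weights'] = w
    params[name + '/biases'] = b.get_value()
np.savez('model_weights', **params)

Depending on how you did your conversion, you may also need to flip the conv kernels along their spatial axes, since Theano's conv2d performs true convolution while TensorFlow's performs cross-correlation.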
Then, assuming you have your TensorFlow graph set up, here is an example of a fully connected layer written as a wrapper function:
def fc_layer(input_tensor, n_output_units, name,
             activation_fn=None, seed=None,
             weight_params=None, bias_params=None):
    with tf.variable_scope(name):
        # Use the pre-trained values for the weights if provided,
        # otherwise fall back to random initialization.
        if weight_params is not None:
            weights = tf.Variable(weight_params, name='weights', dtype=tf.float32)
        else:
            weights = tf.Variable(tf.truncated_normal(
                    shape=[input_tensor.get_shape().as_list()[-1], n_output_units],
                    mean=0.0, stddev=0.1, dtype=tf.float32, seed=seed),
                name='weights')

        # Same for the biases: pre-trained values if provided, zeros otherwise.
        if bias_params is not None:
            biases = tf.Variable(bias_params, name='biases', dtype=tf.float32)
        else:
            biases = tf.Variable(tf.zeros(shape=[n_output_units]),
                                 name='biases', dtype=tf.float32)

        act = tf.matmul(input_tensor, weights) + biases
        if activation_fn is not None:
            act = activation_fn(act)
        return act
Next, suppose you load the parameters you saved to disk back into your Python session:
param_dict = np.load('model_weights.npz')
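If you want to double-check what was saved, the NpzFile object returned by np.load lists its array names:

print(param_dict.files)   # e.g. ['fc1/weights', 'fc1/biases', ...]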
Then, when you set up the actual graph (using the wrapper function above), you can do it as follows:
g = tf.Graph()
with g.as_default():
    fc1 = fc_layer(input_tensor=tf_x,
                   n_output_units=n_hidden_1,
                   name='fc1',
                   weight_params=param_dict['fc1/weights'],
                   bias_params=param_dict['fc1/biases'],
                   activation_fn=tf.nn.relu)
...
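Since your architecture is built from tf.layers.conv2d calls rather than hand-rolled tf.Variable objects, a possibly simpler route is to feed the loaded arrays in through tf.constant_initializer. This is a sketch under the assumption that your .npz keys follow the layer names ('conv1_1/weights' etc.); keep in mind the variables only actually take these values once you run tf.global_variables_initializer() in a session:

# Sketch: initializing the tf.layers-based architecture from the loaded arrays.
# The key names ('conv1_1/weights', 'conv1_1/biases') are an assumption about
# how you populated the .npz file.
conv1_1 = tf.layers.conv2d(
    inputs=input_layer, filters=64, kernel_size=[3, 3],
    padding='same', activation=tf.nn.relu, name='conv1_1',
    kernel_initializer=tf.constant_initializer(param_dict['conv1_1/weights']),
    bias_initializer=tf.constant_initializer(param_dict['conv1_1/biases']))
# ... same pattern for the remaining conv layers ...

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # copies the arrays into the variables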