I am confused about the outputs when using the Keras functional API.
model = tf.keras.applications.ResNet50(include_top=False, weights=None, input_shape=(100,100,3), pooling='max', classifier_activation='relu')
layer_outputs = [layer.output for layer in model.layers[:15]]
model2 = tf.keras.models.Model(model.input, layer_outputs)
model2.summary()

Layer (type)                    Output Shape         Param #    Connected to
=============================================================================
input_13 (InputLayer)           [(None, 100, 100, 3) 0
conv1_pad (ZeroPadding2D)       (None, 106, 106, 3)  0          input_13[0][0]
conv1_conv (Conv2D)             (None, 50, 50, 64)   9472       conv1_pad[0][0]
conv1_bn (BatchNormalization)   (None, 50, 50, 64)   256        conv1_conv[0][0]
conv1_relu (Activation)         (None, 50, 50, 64)   0          conv1_bn[0][0]
pool1_pad (ZeroPadding2D)       (None, 52, 52, 64)   0          conv1_relu[0][0]
pool1_pool (MaxPooling2D)       (None, 25, 25, 64)   0          pool1_pad[0][0]
conv2_block1_1_conv (Conv2D)    (None, 25, 25, 64)   4160       pool1_pool[0][0]
conv2_block1_1_bn (BatchNormali (None, 25, 25, 64)   256        conv2_block1_1_conv[0][0]
conv2_block1_1_relu (Activation (None, 25, 25, 64)   0          conv2_block1_1_bn[0][0]
conv2_block1_2_conv (Conv2D)    (None, 25, 25, 64)   36928      conv2_block1_1_relu[0][0]
conv2_block1_2_bn (BatchNormali (None, 25, 25, 64)   256        conv2_block1_2_conv[0][0]
conv2_block1_2_relu (Activation (None, 25, 25, 64)   0          conv2_block1_2_bn[0][0]
conv2_block1_0_conv (Conv2D)    (None, 25, 25, 256)  16640      pool1_pool[0][0]
conv2_block1_3_conv (Conv2D)    (None, 25, 25, 256)  16640      conv2_block1_2_relu[0][0]
=============================================================================
Total params: 84,608
Trainable params: 84,224
Non-trainable params: 384
input_anchor = tf.keras.layers.Input(shape=(100,100,3))
input_positive = tf.keras.layers.Input(shape=(100,100,3))
input_negative = tf.keras.layers.Input(shape=(100,100,3))

embedding_anchor = model2(input_anchor)
embedding_positive = model2(input_positive)
embedding_negative = model2(input_negative)

output = tf.keras.layers.concatenate([embedding_anchor[0], embedding_positive[0], embedding_negative[0]], axis=-1)
siamese = tf.keras.models.Model([input_anchor, input_positive, input_negative], output)
siamese.summary()

Layer (type)                    Output Shape         Param #    Connected to
=============================================================================
input_32 (InputLayer)           [(None, 100, 100, 3) 0
input_33 (InputLayer)           [(None, 100, 100, 3) 0
input_34 (InputLayer)           [(None, 100, 100, 3) 0
model_5 (Functional)            [(None, 100, 100, 3) 84608      input_32[0][0]
                                                                input_33[0][0]
                                                                input_34[0][0]
concatenate_7 (Concatenate)     (None, 100, 100, 9)  0          model_5[15][0]
                                                                model_5[16][0]
                                                                model_5[17][0]
=============================================================================
Total params: 84,608
Trainable params: 84,224
Non-trainable params: 384
What confuses me is why the output is (None, 100, 100, 9) when I expected it to be (None, 25, 25, 768). I suspect this has something to do with my model2, but I don't know how to get the correct shape. Any help would be greatly appreciated.
Answer:
There is another issue in your code that needs consideration as well. But first, since this behavior is what you find confusing, here is some background explaining why you are seeing it.
The source of your confusion
Say we have a model M, structured as A -> B -> C -> D -> E -> F -> G, where each letter represents a layer, with A as the input and G as the output. Now, suppose we have an input image X and we want the output feature map of layer D of model M. To do that, we simply do the following in tf.keras:
feat_model_a = tf.keras.Model(M.input, M.layers[3].output)
Only if, for some reason, we want the activations of all the intermediate layers do we need to do something like this:
all_feat = [layer.output for layer in M.layers]
feat_model_b = tf.keras.Model(inputs=M.input, outputs=all_feat)
Now you can see that these two models, feat_model_a and feat_model_b, have different numbers of outputs. feat_model_a produces a single output, namely M.layers[3].output, whereas feat_model_b produces one output per entry in all_feat, i.e. one for each layer of M.
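To make the difference concrete, here is a minimal self-contained sketch (the three-layer toy model and its layer names are made up for illustration) showing that the single-output model returns one tensor while the multi-output model returns a list:

import tensorflow as tf

# A toy model: Input -> d1 -> d2 -> d3
inp = tf.keras.layers.Input(shape=(8,))
x = tf.keras.layers.Dense(16, name='d1')(inp)
x = tf.keras.layers.Dense(16, name='d2')(x)
out = tf.keras.layers.Dense(4, name='d3')(x)
M = tf.keras.Model(inp, out)

# Single output: calling the model returns one tensor
feat_model_a = tf.keras.Model(M.input, M.layers[2].output)
print(feat_model_a(tf.ones((1, 8))).shape)  # (1, 16)

# Multiple outputs: calling the model returns a list, one entry per layer
all_feat = [layer.output for layer in M.layers]
feat_model_b = tf.keras.Model(inputs=M.input, outputs=all_feat)
preds = feat_model_b(tf.ones((1, 8)))
print(len(preds))  # 4: the input layer plus three Dense layers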
Here is an example using your code. Observe the following two cases. First, we build a model that returns all the output feature maps of the base model (the same thing you did):
# Base model
model = tf.keras.applications.ResNet50(include_top=False, weights=None, input_shape=(100,100,3), pooling='max', classifier_activation='relu')
# Take the outputs of the first 15 layers, 15 outputs in total
layer_outputs = [layer.output for layer in model.layers[:15]]
model2 = tf.keras.models.Model(model.input, layer_outputs)
pred = model2(tf.ones((1, 100, 100, 3)))
print(len(pred))      # model2 produces 15 feature maps, one per layer
print(pred[0].shape)  # feature map of the first layer
print(pred[-1].shape) # feature map of the last of the 15 layers
15
(1, 100, 100, 3)
(1, 25, 25, 256)
Here you can see that model2, by design, returns 15 output feature maps. Hopefully you now understand that when you run
model = tf.keras.applications.ResNet50(include_top=False, weights=None, input_shape=(100,100,3), pooling='max', classifier_activation='relu')
layer_outputs = [layer.output for layer in model.layers[:15]]
model2 = tf.keras.models.Model(model.input, layer_outputs)
input_anchor = tf.keras.layers.Input(shape=(100,100,3))
...
embedding_anchor = model2(input_anchor)
the first element, embedding_anchor[0], does not have the (25, 25, 256) shape of a deep feature map but the (100, 100, 3) shape of the input layer's output – which is exactly how you designed model2. But if we only want the 15th output feature map, we need to do this instead:
# Base model
model = tf.keras.applications.ResNet50(include_top=False, weights=None, input_shape=(100,100,3), pooling='max', classifier_activation='relu')
model2 = tf.keras.models.Model(model.input, model.layers[14].output)
pred = model2(tf.ones((1, 100, 100, 3)))
print(len(pred))      # a single output: the 15th feature map
print(pred[0].shape)  # only the 15th
print(pred[-1].shape) # only the 15th
1
(1, 25, 25, 256)
(1, 25, 25, 256)
Hopefully your confusion is resolved now. The functional API page of the official docs shows this in the section on extracting and reusing nodes in the graph of layers, but without much detail. Whether we need all the feature-map outputs or just a single layer's output depends entirely on our needs.
Choosing the right layer for feature extraction
There is another issue here that I think you should consider. Plotting a Keras model is not just a fancy gimmick; it is a convenient way to debug how features flow through a model. You took the first 15 layers of ResNet, but you overlooked the skip connections inside it. If you plot ResNet and inspect the graph of its first 15 layers, you will see two conv layers branching apart around position 7 or 8 – conv2_block1_1_conv and conv2_block1_0_conv – which merge again around position 17. So by cutting at position 15 you also drop part of the layers' operations. That is why the cut-off position should be at least layer 17.
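If you want to verify this without plotting, a quick sketch like the following (just enumerating model.layers; nothing here is specific to my setup) prints the index, name, and output shape of the early layers so you can see where the residual branch splits off and where it merges back:

import tensorflow as tf

model = tf.keras.applications.ResNet50(include_top=False, weights=None,
                                       input_shape=(100, 100, 3), pooling='max')

# List the first 20 layers to locate the residual branch (conv2_block1_0_conv)
# and the point where the two paths join again
for i, layer in enumerate(model.layers[:20]):
    print(i, layer.name, layer.output.shape)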
Here is the complete working code:
model = tf.keras.applications.ResNet50(include_top=False, weights=None, input_shape=(100,100,3), pooling='max', classifier_activation='relu')
model2 = tf.keras.models.Model(model.input, model.layers[16].output)
input_anchor = tf.keras.layers.Input(shape=(100,100,3))
input_positive = tf.keras.layers.Input(shape=(100,100,3))
input_negative = tf.keras.layers.Input(shape=(100,100,3))
embedding_anchor = model2(input_anchor)
embedding_positive = model2(input_positive)
embedding_negative = model2(input_negative)
# model2 now has a single output, so each embedding is already a tensor;
# indexing with [0] would slice along the batch axis instead
output = tf.keras.layers.concatenate([embedding_anchor, embedding_positive, embedding_negative], axis=-1)
siamese = tf.keras.models.Model([input_anchor, input_positive, input_negative], output)
tf.keras.utils.plot_model(siamese, show_shapes=True, show_layer_names=True, expand_nested=True)
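As a quick sanity check (assuming the model builds as above), each branch emits a (None, 25, 25, 256) feature map, so the concatenated output should have the shape you originally expected:

# Each branch: (None, 25, 25, 256); concatenated along the channel axis: (None, 25, 25, 768)
print(siamese.output_shape)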