I am trying to connect the conv4_3 layer of the VGG16 network, instead of conv5_3, to the RPN of Faster R-CNN. Here is the Python code for the VGG16 network. I changed these lines:
    def _image_to_head(self, is_training, reuse=False):
        with tf.variable_scope(self._scope, self._scope, reuse=reuse):
            net = slim.repeat(self._image, 2, slim.conv2d, 64, [3, 3],
                              trainable=False, scope='conv1')
            net = slim.max_pool2d(net, [2, 2], padding='SAME', scope='pool1')
            net = slim.repeat(net, 2, slim.conv2d, 128, [3, 3],
                              trainable=False, scope='conv2')
            net = slim.max_pool2d(net, [2, 2], padding='SAME', scope='pool2')
            net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3],
                              trainable=is_training, scope='conv3')
            net = slim.max_pool2d(net, [2, 2], padding='SAME', scope='pool3')
            net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3],
                              trainable=is_training, scope='conv4')
            net = slim.max_pool2d(net, [2, 2], padding='SAME', scope='pool4')
            net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3],
                              trainable=is_training, scope='conv5')

        self._act_summaries.append(net)
        self._layers['head'] = net

        return net
to:
    def _image_to_head(self, is_training, reuse=False):
        with tf.variable_scope(self._scope, self._scope, reuse=reuse):
            net = slim.repeat(self._image, 2, slim.conv2d, 64, [3, 3],
                              trainable=False, scope='conv1')
            net = slim.max_pool2d(net, [2, 2], padding='SAME', scope='pool1')
            net = slim.repeat(net, 2, slim.conv2d, 128, [3, 3],
                              trainable=False, scope='conv2')
            net = slim.max_pool2d(net, [2, 2], padding='SAME', scope='pool2')
            net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3],
                              trainable=is_training, scope='conv3')
            net = slim.max_pool2d(net, [2, 2], padding='SAME', scope='pool3')
            net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3],
                              trainable=is_training, scope='conv4')

        self._act_summaries.append(net)
        self._layers['head'] = net

        return net
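As shown above, I removed the pool4 and conv5 layers. Because my target objects are small, I expected better results, but they got worse. Do I need to add a deconvolution layer after conv4, or is there another way?
Thanks
Answer: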
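There are also ways to reduce the length of the bottleneck features.
Why not to add a deconvolution layer:
- You would initialize the deconvolution layer with random values.
- You are not fine-tuning the network; you are only running a forward pass through it.
- As a result, the output of the deconvolution layer would randomize the conv4 features.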
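The point about random initialization can be illustrated with a small, framework-free sketch. It is only a toy model under stated assumptions: `conv_transpose_1d` and `random_kernel` are hypothetical helpers written for this example (not part of TensorFlow or slim), the activations are made up, and a 1-D transposed convolution stands in for a real 2-D deconvolution layer. It shows that the same conv4-like features produce different upsampled outputs depending solely on how the untrained kernel happened to be initialized.

```python
import random


def conv_transpose_1d(features, kernel, stride=2):
    """Plain 1-D transposed convolution (no padding, no bias)."""
    out_len = (len(features) - 1) * stride + len(kernel)
    out = [0.0] * out_len
    for i, value in enumerate(features):
        for j, weight in enumerate(kernel):
            # Each input value is scattered into the output, scaled by the kernel.
            out[i * stride + j] += value * weight
    return out


def random_kernel(seed, size=2):
    """Hypothetical stand-in for a freshly initialized, untrained layer."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


# Made-up activations standing in for one row of conv4 features.
conv4_features = [0.5, 1.0, 0.0, 2.0]

# Same input, two different random initializations.
upsampled_a = conv_transpose_1d(conv4_features, random_kernel(seed=0))
upsampled_b = conv_transpose_1d(conv4_features, random_kernel(seed=1))

print(len(conv4_features), "->", len(upsampled_a))
print(upsampled_a)
print(upsampled_b)
```

The output is twice as long as the input, so the layer does upsample, but its values are determined by the random kernel rather than by anything learned. Unless the new layer is trained, whatever consumes these features receives an arbitrary transformation of conv4.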
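Pooling layers:
- Average pooling (returns the mean of the values in each window). For example, a (2, 2) window with values [3, 2, 4, 3] yields a single value: 3.
- Max pooling (returns the maximum of the values in each window). For example, a (2, 2) window with values [3, 2, 4, 3] yields a single value: 4.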
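The two pooling variants can be checked with a minimal pure-Python sketch. `pool_2x2` and `mean` are hypothetical helpers written for this example; the sketch assumes non-overlapping 2x2 windows with stride 2 and simply drops any incomplete window at the border, which differs from the `padding='SAME'` behavior used in the question's code.

```python
def pool_2x2(feature_map, reduce_fn):
    """Apply non-overlapping 2x2 pooling (stride 2) to a 2-D list."""
    pooled = []
    for r in range(0, len(feature_map) - 1, 2):
        row = []
        for c in range(0, len(feature_map[0]) - 1, 2):
            # Collect the four values covered by the current window.
            window = [feature_map[r][c], feature_map[r][c + 1],
                      feature_map[r + 1][c], feature_map[r + 1][c + 1]]
            row.append(reduce_fn(window))
        pooled.append(row)
    return pooled


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# The window from the example above: values [3, 2, 4, 3].
window = [[3, 2],
          [4, 3]]

print(pool_2x2(window, mean))  # [[3.0]]
print(pool_2x2(window, max))   # [[4]]
```

Either way, each 2x2 window collapses to one value, so the feature map's height and width are halved and the number of bottleneck features drops by a factor of four.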