Training a CNN with Caffe using the scale parameter

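I adapted the train_val.prototxt of bvlc_reference_caffenet to implement a VGG-16 clone in Caffe and trained it successfully on a GTX 1050 with batch_size: 6 and base_lr: 0.0648 (~ 0.01 * sqrt(256/6) ~ 0.01 * sqrt(42)). However, because the target platform has limited precision, I would like to scale the input data from [0;255] to [0;1]. To scale the data I introduced the parameter scale: 0.00390625 (borrowed from Caffe's LeNet example, which runs well on the target platform). With the scale parameter, however, the accuracy does not improve (over 40,000 iterations) and the loss does not change during training.

How can I train this CNN with the scale parameter?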
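As a quick sanity check of the figures quoted in the question, the following sketch reproduces the learning-rate arithmetic and the scale constant. Reading 256 as a reference batch size and 0.01 as a reference learning rate is an assumption taken from the formula above, and the variable names are illustrative only.

```python
import math

reference_batch = 256   # assumed meaning of the "256" in the question's formula
reference_lr = 0.01     # assumed meaning of the "0.01" in the question's formula
batch_size = 6          # batch_size from the question

# Square-root scaling exactly as written in the question.
lr_exact = reference_lr * math.sqrt(reference_batch / batch_size)
# The question rounds 256/6 down to 42 before taking the square root.
lr_rounded = reference_lr * math.sqrt(42)

print(f"0.01 * sqrt(256/6) = {lr_exact:.4f}")
print(f"0.01 * sqrt(42)    = {lr_rounded:.4f}")

# The scale value from the question is exactly 1/256, not 1/255.
scale = 0.00390625
print("scale == 1/256:", scale == 1 / 256)
print("255 * scale    =", 255 * scale)
```

The configured base_lr of 0.0648 corresponds to the rounded form 0.01 * sqrt(42). With scale equal to 1/256, a raw pixel value of 255 maps to slightly below 1 before mean subtraction is taken into account.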

train_val.prototxt

name: "ES VGG"
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TRAIN }
  transform_param {
    scale: 0.00390625
    mirror: true
    crop_size: 224
    mean_file: "/local/datasets/imagenet/ilsvrc12/imagenet_mean.binaryproto"
  }
  data_param {
    source: "/local/datasets/imagenet/ilsvrc12_train_lmdb"
    batch_size: 6
    backend: LMDB
  }
}
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TEST }
  transform_param {
    scale: 0.00390625
    mirror: false
    crop_size: 224
    mean_file: "/local/datasets/imagenet/ilsvrc12/imagenet_mean.binaryproto"
  }
  data_param {
    source: "/local/datasets/imagenet/ilsvrc12_val_lmdb"
    batch_size: 6
    backend: LMDB
  }
}
layer {
  name: "conv1_1"
  type: "Convolution"
  bottom: "data"
  top: "conv1_1"
  convolution_param {
    num_output: 64
    kernel_size: 3
    pad: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "relu1_1"
  type: "ReLU"
  bottom: "conv1_1"
  top: "conv1_1"
}
layer {
  name: "conv1_2"
  type: "Convolution"
  bottom: "conv1_1"
  top: "conv1_2"
  convolution_param {
    num_output: 64
    kernel_size: 3
    pad: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "relu1_2"
  type: "ReLU"
  bottom: "conv1_2"
  top: "conv1_2"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1_2"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2_1"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2_1"
  convolution_param {
    num_output: 128
    kernel_size: 3
    pad: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "relu2_1"
  type: "ReLU"
  bottom: "conv2_1"
  top: "conv2_1"
}
layer {
  name: "conv2_2"
  type: "Convolution"
  bottom: "conv2_1"
  top: "conv2_2"
  convolution_param {
    num_output: 128
    kernel_size: 3
    pad: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "relu2_2"
  type: "ReLU"
  bottom: "conv2_2"
  top: "conv2_2"
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2_2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv3_1"
  type: "Convolution"
  bottom: "pool2"
  top: "conv3_1"
  convolution_param {
    num_output: 256
    kernel_size: 3
    pad: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "relu3_1"
  type: "ReLU"
  bottom: "conv3_1"
  top: "conv3_1"
}
layer {
  name: "conv3_2"
  type: "Convolution"
  bottom: "conv3_1"
  top: "conv3_2"
  convolution_param {
    num_output: 256
    kernel_size: 3
    pad: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "relu3_2"
  type: "ReLU"
  bottom: "conv3_2"
  top: "conv3_2"
}
layer {
  name: "conv3_3"
  type: "Convolution"
  bottom: "conv3_2"
  top: "conv3_3"
  convolution_param {
    num_output: 256
    kernel_size: 3
    pad: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "relu3_3"
  type: "ReLU"
  bottom: "conv3_3"
  top: "conv3_3"
}
layer {
  name: "pool3"
  type: "Pooling"
  bottom: "conv3_3"
  top: "pool3"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv4_1"
  type: "Convolution"
  bottom: "pool3"
  top: "conv4_1"
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "relu4_1"
  type: "ReLU"
  bottom: "conv4_1"
  top: "conv4_1"
}
layer {
  name: "conv4_2"
  type: "Convolution"
  bottom: "conv4_1"
  top: "conv4_2"
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "relu4_2"
  type: "ReLU"
  bottom: "conv4_2"
  top: "conv4_2"
}
layer {
  name: "conv4_3"
  type: "Convolution"
  bottom: "conv4_2"
  top: "conv4_3"
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "relu4_3"
  type: "ReLU"
  bottom: "conv4_3"
  top: "conv4_3"
}
layer {
  name: "pool4"
  type: "Pooling"
  bottom: "conv4_3"
  top: "pool4"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv5_1"
  type: "Convolution"
  bottom: "pool4"
  top: "conv5_1"
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "relu5_1"
  type: "ReLU"
  bottom: "conv5_1"
  top: "conv5_1"
}
layer {
  name: "conv5_2"
  type: "Convolution"
  bottom: "conv5_1"
  top: "conv5_2"
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "relu5_2"
  type: "ReLU"
  bottom: "conv5_2"
  top: "conv5_2"
}
layer {
  name: "conv5_3"
  type: "Convolution"
  bottom: "conv5_2"
  top: "conv5_3"
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "pool5"
  type: "Pooling"
  bottom: "conv5_3"
  top: "pool5"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}

solver.prototxt

net: "models/es_vgg/train_val.prototxt"
test_iter: 1000
test_interval: 1000
base_lr: 0.0648
lr_policy: "step"
gamma: 0.1
stepsize: 100000
display: 20
max_iter: 18900000
momentum: 0.9
weight_decay: 0.0005
snapshot: 10000
snapshot_prefix: "models/es_vgg/es_vgg_train"
solver_mode: GPU

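Answer:

If you divide the inputs by 255, you need to multiply the weights of the first convolution layer, "conv1_1", by 255 to compensate for the change.
See Caffe's net surgery example for how to do this.

For example (in Python):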

import caffe

# No .caffemodel weights are supplied, so the weights are randomly initialized
net = caffe.Net('models/es_vgg/train_val.prototxt', caffe.TEST)
# Scale the kernels of the first convolution layer by 255
net.params['conv1_1'][0].data[...] = 255. * net.params['conv1_1'][0].data
# Save the scaled weights
net.save('models/es_vgg/init_scaled.caffemodel')

Now you need to start your training from 'models/es_vgg/init_scaled.caffemodel'.
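Why the compensation works can be illustrated without Caffe. The sketch below models a single conv1_1 filter response as a plain dot product and assumes the data transformer subtracts the mean before applying scale. The pixel, mean, and weight values are randomly generated placeholders, and `conv_response` is a hypothetical helper, not a Caffe function.

```python
import math
import random

SCALE = 0.00390625  # transform_param scale from the question (exactly 1/256)


def conv_response(weights, pixels, mean, scale):
    """Response of one linear filter to mean-subtracted, scaled pixels."""
    return sum(w * (p - m) * scale for w, p, m in zip(weights, pixels, mean))


random.seed(0)
n = 27  # one 3x3 kernel over 3 input channels, as in conv1_1
pixels = [random.randint(0, 255) for _ in range(n)]
mean = [random.uniform(100.0, 130.0) for _ in range(n)]   # placeholder mean values
weights = [random.uniform(-0.1, 0.1) for _ in range(n)]   # placeholder filter weights

# Original setup: no input scaling.
unscaled = conv_response(weights, pixels, mean, 1.0)
# Scaled input, unchanged weights: the response shrinks by the scale factor.
scaled_input_only = conv_response(weights, pixels, mean, SCALE)
# Weights multiplied by 255, as in the answer.
compensated_255 = conv_response([255.0 * w for w in weights], pixels, mean, SCALE)
# Weights multiplied by 1/scale = 256, the exact inverse of the configured scale.
compensated_256 = conv_response([w / SCALE for w in weights], pixels, mean, SCALE)

print("unscaled         :", unscaled)
print("scaled input only:", scaled_input_only)
print("weights * 255    :", compensated_255)
print("weights * 256    :", compensated_256)
```

Because the configured scale is 1/256 rather than 1/255, multiplying the kernels by 255 restores the original response only up to a factor of 255/256, while multiplying by 256 restores it fully. The bias term is added after the multiplication and needs no rescaling. To start training from the saved weights, one option is the `--weights` flag of the `caffe train` command alongside `--solver`; the location of solver.prototxt is not given above, so that path would have to be filled in.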
