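I adapted the train_val.prototxt from bvlc_reference_caffenet to implement a clone of VGG-16 in Caffe, and trained it successfully on a GTX 1050 with batch_size: 6 and base_lr: 0.0648 (~ 0.01 * sqrt(256/6) ~ 0.01 * sqrt(42)). However, because the target platform has limited precision, I want to scale the input data from [0;255] to [0;1]. To do so, I introduced the parameter scale: 0.00390625 (borrowed from Caffe's LeNet example, which runs well on the target platform). With the scale parameter, however, the accuracy does not improve (over 40000 iterations), and the training loss does not change either.
How can I train this CNN with the scale parameter?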
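As a quick check of the numbers quoted above, here is a minimal sketch. The reference values 0.01 and 256 come from the formula in the question; the variable names are only illustrative.

```python
import math

ref_lr, ref_batch, batch = 0.01, 256, 6

# Learning rate from the formula 0.01 * sqrt(256/6), without rounding 256/6.
exact_lr = ref_lr * math.sqrt(ref_batch / batch)
# The rounded form 0.01 * sqrt(42), which yields the base_lr used in the solver.
rounded_lr = ref_lr * math.sqrt(42)

# The scale factor is exactly 1/256, so a raw pixel value of 255 maps just below 1.
scale = 0.00390625
max_scaled = 255 * scale

print(round(exact_lr, 4), round(rounded_lr, 4))
print(scale == 1 / 256, max_scaled)
```

Note that 0.00390625 is 1/256 rather than 1/255, so the raw range [0;255] maps to slightly less than [0;1].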
train_val.prototxt
name: "ES VGG"
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TRAIN }
  transform_param {
    scale: 0.00390625
    mirror: true
    crop_size: 224
    mean_file: "/local/datasets/imagenet/ilsvrc12/imagenet_mean.binaryproto"
  }
  data_param {
    source: "/local/datasets/imagenet/ilsvrc12_train_lmdb"
    batch_size: 6
    backend: LMDB
  }
}
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TEST }
  transform_param {
    scale: 0.00390625
    mirror: false
    crop_size: 224
    mean_file: "/local/datasets/imagenet/ilsvrc12/imagenet_mean.binaryproto"
  }
  data_param {
    source: "/local/datasets/imagenet/ilsvrc12_val_lmdb"
    batch_size: 6
    backend: LMDB
  }
}
layer {
  name: "conv1_1"
  type: "Convolution"
  bottom: "data"
  top: "conv1_1"
  convolution_param {
    num_output: 64
    kernel_size: 3
    pad: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "relu1_1"
  type: "ReLU"
  bottom: "conv1_1"
  top: "conv1_1"
}
layer {
  name: "conv1_2"
  type: "Convolution"
  bottom: "conv1_1"
  top: "conv1_2"
  convolution_param {
    num_output: 64
    kernel_size: 3
    pad: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "relu1_2"
  type: "ReLU"
  bottom: "conv1_2"
  top: "conv1_2"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1_2"
  top: "pool1"
  pooling_param { pool: MAX kernel_size: 2 stride: 2 }
}
layer {
  name: "conv2_1"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2_1"
  convolution_param {
    num_output: 128
    kernel_size: 3
    pad: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "relu2_1"
  type: "ReLU"
  bottom: "conv2_1"
  top: "conv2_1"
}
layer {
  name: "conv2_2"
  type: "Convolution"
  bottom: "conv2_1"
  top: "conv2_2"
  convolution_param {
    num_output: 128
    kernel_size: 3
    pad: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "relu2_2"
  type: "ReLU"
  bottom: "conv2_2"
  top: "conv2_2"
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2_2"
  top: "pool2"
  pooling_param { pool: MAX kernel_size: 2 stride: 2 }
}
layer {
  name: "conv3_1"
  type: "Convolution"
  bottom: "pool2"
  top: "conv3_1"
  convolution_param {
    num_output: 256
    kernel_size: 3
    pad: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "relu3_1"
  type: "ReLU"
  bottom: "conv3_1"
  top: "conv3_1"
}
layer {
  name: "conv3_2"
  type: "Convolution"
  bottom: "conv3_1"
  top: "conv3_2"
  convolution_param {
    num_output: 256
    kernel_size: 3
    pad: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "relu3_2"
  type: "ReLU"
  bottom: "conv3_2"
  top: "conv3_2"
}
layer {
  name: "conv3_3"
  type: "Convolution"
  bottom: "conv3_2"
  top: "conv3_3"
  convolution_param {
    num_output: 256
    kernel_size: 3
    pad: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "relu3_3"
  type: "ReLU"
  bottom: "conv3_3"
  top: "conv3_3"
}
layer {
  name: "pool3"
  type: "Pooling"
  bottom: "conv3_3"
  top: "pool3"
  pooling_param { pool: MAX kernel_size: 2 stride: 2 }
}
layer {
  name: "conv4_1"
  type: "Convolution"
  bottom: "pool3"
  top: "conv4_1"
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "relu4_1"
  type: "ReLU"
  bottom: "conv4_1"
  top: "conv4_1"
}
layer {
  name: "conv4_2"
  type: "Convolution"
  bottom: "conv4_1"
  top: "conv4_2"
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "relu4_2"
  type: "ReLU"
  bottom: "conv4_2"
  top: "conv4_2"
}
layer {
  name: "conv4_3"
  type: "Convolution"
  bottom: "conv4_2"
  top: "conv4_3"
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "relu4_3"
  type: "ReLU"
  bottom: "conv4_3"
  top: "conv4_3"
}
layer {
  name: "pool4"
  type: "Pooling"
  bottom: "conv4_3"
  top: "pool4"
  pooling_param { pool: MAX kernel_size: 2 stride: 2 }
}
layer {
  name: "conv5_1"
  type: "Convolution"
  bottom: "pool4"
  top: "conv5_1"
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "relu5_1"
  type: "ReLU"
  bottom: "conv5_1"
  top: "conv5_1"
}
layer {
  name: "conv5_2"
  type: "Convolution"
  bottom: "conv5_1"
  top: "conv5_2"
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "relu5_2"
  type: "ReLU"
  bottom: "conv5_2"
  top: "conv5_2"
}
layer {
  name: "conv5_3"
  type: "Convolution"
  bottom: "conv5_2"
  top: "conv5_3"
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "pool5"
  type: "Pooling"
  bottom: "conv5_3"
  top: "pool5"
  pooling_param { pool: MAX kernel_size: 2 stride: 2 }
}
solver.prototxt
net: "models/es_vgg/train_val.prototxt"
test_iter: 1000
test_interval: 1000
base_lr: 0.0648
lr_policy: "step"
gamma: 0.1
stepsize: 100000
display: 20
max_iter: 18900000
momentum: 0.9
weight_decay: 0.0005
snapshot: 10000
snapshot_prefix: "models/es_vgg/es_vgg_train"
solver_mode: GPU
Answer:
If you divide the input by 255, you need to multiply the weights of the first convolutional layer, "conv1_1", by 255 to compensate for the change.
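The compensation works because a convolution is linear in both its input and its weights. The following is a minimal NumPy sketch of that property, not Caffe code: the input tensor, kernel shapes, and the helper conv2d_valid are hypothetical stand-ins for the mean-subtracted data and the conv1_1 kernels (bias is 0, as in the prototxt).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: a mean-subtracted 3x8x8 input and four 3x3 kernels.
x = rng.uniform(0.0, 255.0, size=(3, 8, 8)) - 120.0
w = rng.normal(0.0, 0.05, size=(4, 3, 3, 3))  # (out_channels, in_channels, kH, kW)


def conv2d_valid(x, w):
    """Naive cross-correlation without padding or bias."""
    out_c, _, kh, kw = w.shape
    _, h, wd = x.shape
    out = np.zeros((out_c, h - kh + 1, wd - kw + 1))
    for o in range(out_c):
        for i in range(h - kh + 1):
            for j in range(wd - kw + 1):
                out[o, i, j] = np.sum(x[:, i:i + kh, j:j + kw] * w[o])
    return out


scale = 0.00390625  # the value from transform_param, exactly 1/256

reference = conv2d_valid(x, w)                    # unscaled input, original weights
scaled_in = conv2d_valid(x * scale, w)            # scaled input, original weights
compensated = conv2d_valid(x * scale, w / scale)  # scaled input, weights enlarged by 1/scale

# Ratio of activation magnitudes with and without input scaling.
ratio = np.abs(scaled_in).max() / np.abs(reference).max()

print(np.allclose(compensated, reference))
print(ratio)
```

Without compensation, the conv1_1 activations shrink by the scale factor. Since 0.00390625 is 1/256, the exact inverse is 256; multiplying by 255 as described above differs from that by less than half a percent.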
See the net surgery example for how to do this.
For example (in Python):
import caffe

# No .caffemodel weights are given, so the weights are randomly initialized.
net = caffe.Net('models/es_vgg/train_val.prototxt', caffe.TEST)
# Scale the kernels of the first convolutional layer by 255.
net.params['conv1_1'][0].data[...] = 255. * net.params['conv1_1'][0].data
# Save the scaled weights.
net.save('models/es_vgg/init_scaled.caffemodel')
Now you need to start your training from 'models/es_vgg/init_scaled.caffemodel'.
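One way to start training from those weights is through pycaffe, sketched below. The function name train_from_init and its use_gpu parameter are hypothetical, and the solver path in the example call is an assumption, since the question only names the file solver.prototxt.

```python
def train_from_init(solver_path, weights_path, use_gpu=True):
    """Load a solver, copy initial weights into its training net, and train."""
    import caffe  # imported lazily so the function can be defined without pycaffe

    if use_gpu:
        caffe.set_mode_gpu()
    else:
        caffe.set_mode_cpu()

    solver = caffe.get_solver(solver_path)  # builds the solver described in the prototxt
    solver.net.copy_from(weights_path)      # layers are matched by name, including conv1_1
    solver.solve()                          # runs until max_iter is reached
    return solver


# Example call (the solver path is an assumption):
# train_from_init('models/es_vgg/solver.prototxt',
#                 'models/es_vgg/init_scaled.caffemodel')
```

The caffe command-line tool offers the same thing through its -weights option, for example caffe train -solver <solver file> -weights models/es_vgg/init_scaled.caffemodel.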