In Caffe, I created a simple network to classify face images, as follows:
myExampleNet.prototxt
name: "myExample"
layer {
  name: "example"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/myExample/myExample_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/myExample/myExample_test_lmdb"
    batch_size: 100
    backend: LMDB
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "data"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 50
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 155
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
myExampleSolver.prototxt
net: "examples/myExample/myExampleNet.prototxt"
test_iter: 15
test_interval: 500
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
lr_policy: "inv"
gamma: 0.0001
power: 0.75
display: 100
max_iter: 30000
snapshot: 5000
snapshot_prefix: "examples/myExample/myExample"
solver_mode: CPU
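I created the LMDB databases with Caffe's convert_imageset tool. I have about 40,000 training images and about 16,000 test images in 155 classes, with roughly 260 training images and 100 test images per class.
I used this command for the training data: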
build/tools/convert_imageset -resize_height=100 -resize_width=100 -shuffle examples/myExample/myData/data/ examples/myExample/myData/data/labels_train.txt examples/myExample/myExample_train_lmdb
and this command for the test data:
build/tools/convert_imageset -resize_height=100 -resize_width=100 -shuffle examples/myExample/myData/data/ examples/myExample/myData/data/labels_test.txt examples/myExample/myExample_test_lmdb
But after 30,000 iterations, my loss is still high and my accuracy is low:
...
I0127 09:25:55.602881 27305 solver.cpp:310] Iteration 30000, loss = 4.98317
I0127 09:25:55.602917 27305 solver.cpp:330] Iteration 30000, Testing net (#0)
I0127 09:25:55.602926 27305 net.cpp:676] Ignoring source layer example
I0127 09:25:55.827739 27305 solver.cpp:397] Test net output #0: accuracy = 0.0126667
I0127 09:25:55.827764 27305 solver.cpp:397] Test net output #1: loss = 5.02207 (* 1 = 5.02207 loss)
When I switch the dataset to MNIST and change num_output of the ip2 layer from 155 to 10, the loss drops significantly and the accuracy goes up!
What is going wrong?
Answer:
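There is not necessarily anything wrong with your code.
The good results you get on MNIST do suggest that your model is "correct" as code, in the sense that it runs without programming errors, but that is no guarantee that it will perform well on a different problem.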
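Keep in mind that, in principle, a 10-class problem such as MNIST is much easier than a 155-class one: the baseline accuracy (i.e. simple random guessing) is about 10% in the former case but only about 0.65% in the latter. Add that your dataset, comparable in size to MNIST, is not large for that many classes (are these also color images, i.e. 3-channel rather than single-channel as in MNIST?), and your results may look less puzzling and surprising.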
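The baseline figures above can be reproduced with a short calculation. The sketch below is only illustrative: the function names are made up for this example, it assumes roughly balanced classes (as the question describes), and the 3-channel case is an assumption, since the question does not say whether the 100x100 images are color. For a softmax loss, a network that predicts every class with equal probability has a loss of ln(number of classes).

```python
import math


def chance_accuracy(num_classes: int) -> float:
    """Expected accuracy of uniform random guessing over balanced classes."""
    return 1.0 / num_classes


def chance_loss(num_classes: int) -> float:
    """Softmax cross-entropy loss of a uniform prediction: ln(num_classes)."""
    return math.log(num_classes)


def input_features(height: int, width: int, channels: int) -> int:
    """Number of inputs seen by the first InnerProduct layer."""
    return height * width * channels


if __name__ == "__main__":
    for name, k in (("MNIST", 10), ("myExample", 155)):
        print(f"{name}: chance accuracy = {chance_accuracy(k):.4f}, "
              f"chance loss = {chance_loss(k):.4f}")

    # MNIST images are 28x28 with a single channel.
    print("MNIST inputs:", input_features(28, 28, 1))
    # Assumption: the 100x100 face images are stored with 3 color channels.
    print("myExample inputs (if color):", input_features(100, 100, 3))
```

For 155 classes this gives a chance-level loss of about 5.04, close to the 4.98 and 5.02 reported in the log, and a chance accuracy of about 0.0065, against a reported test accuracy of about 0.0127. In other words, after 30,000 iterations the network is still close to guessing uniformly.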
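Moreover, MNIST has turned out to be notoriously easy to fit (I have tried myself to build models that do not fit MNIST well, so far without success), which leads to a conclusion that is by now common knowledge in the community: good performance on MNIST says little about the quality of a model architecture.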