After some struggling, I decided to try the simplest possible task: training a network to classify whether a number is non-negative. And I failed…
I generated the data with the code below. I'm not sure whether the data is correct; I read it back from the file and it looked fine…
#pragma comment(lib, "hdf5")
#pragma comment(lib, "hdf5_cpp")

#include <cstdint>
#include <array>
#include <random>
#include <vector>

using namespace std;

#include <H5Cpp.h>
using namespace H5;

mt19937 rng;

float randf(float i_min, float i_max)
{
    return rng() * ((i_max - i_min) / 0x100000000) + i_min;
}

#define NAME "pos_neg"
#define TRAIN_SET_SIZE 0x100000
#define TEST_SET_SIZE 0x10000

void make(const string &i_cat, uint32_t i_count)
{
    H5File file(NAME "." + i_cat + ".h5", H5F_ACC_TRUNC);

    hsize_t dataDim[2] = { i_count, 1 };
    hsize_t labelDim = i_count;

    FloatType dataType(PredType::NATIVE_FLOAT);
    DataSpace dataSpace(2, dataDim);
    DataSet dataSet = file.createDataSet("data", dataType, dataSpace);

    IntType labelType(PredType::NATIVE_INT);
    DataSpace labelSpace(1, &labelDim);
    DataSet labelSet = file.createDataSet("label", labelType, labelSpace);

    vector<float> data(i_count);
    vector<int> labels(i_count);
    for (uint32_t i = 0; i < i_count / 2; ++i)
    {
        labels[i * 2] = 0;
        data[i * 2] = randf(0.f, 1.f);
        labels[i * 2 + 1] = 1;
        data[i * 2 + 1] = randf(-1.f, 0.f);
    }
    dataSet.write(&data[0], PredType::NATIVE_FLOAT);
    labelSet.write(&labels[0], PredType::NATIVE_INT);
}

int main()
{
    make("train", TRAIN_SET_SIZE);
    make("test", TEST_SET_SIZE);
}
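For reference, the list files that the prototxt below points at (pos_neg_train.txt / pos_neg_test.txt) are just plain text files listing the generated HDF5 files, one path per line, roughly like this:

pos_neg.train.h5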
The network structure is as follows:
name: "PosNegNet"layer { name: "data" type: "HDF5Data" top: "data" top: "label" include { phase: TRAIN } hdf5_data_param { source: "pos_neg_train.txt" batch_size: 64 }}layer { name: "data" type: "HDF5Data" top: "data" top: "label" include { phase: TEST } hdf5_data_param { source: "pos_neg_test.txt" batch_size: 65536 }}layer { name: "fc1" type: "InnerProduct" bottom: "data" top: "fc1" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } inner_product_param { num_output: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" value: 0 } }}layer { name: "loss" type: "SoftmaxWithLoss" bottom: "fc1" bottom: "label" top: "loss"}layer { name: "accuracy" type: "Accuracy" bottom: "fc1" bottom: "label" top: "accuracy" include { phase: TEST }}
And I tried this set of solver parameters:
net: "pos_neg.prototxt"test_iter: 1test_interval: 500base_lr: 0.001momentum: 0.9momentum2: 0.999lr_policy: "fixed"display: 100max_iter: 10000snapshot: 5000snapshot_prefix: "pos_neg"type: "Adam"solver_mode: GPU
I ran caffe.exe on Windows, and I always get loss = 0 and accuracy = 0.5.
I know I must be doing something wrong, but I don't know where to start looking, other than digging through the source code…
I also noticed that Caffe runs fairly slowly: on a 1080 Ti it only manages about 16 iterations per second when each batch is 1024 float[64] values. Is that normal, or is that something else I got wrong?
Answer:
Set num_output: 2 in your "fc1" layer: when using a "SoftmaxWithLoss" and/or "Accuracy" layer, Caffe expects your prediction to be a vector of class scores, one per class. In your case you have two classes, so this vector should have length 2, not 1 as it currently does. (This is also why you see loss = 0 and accuracy = 0.5: a softmax over a single output is identically 1, so the log-loss is always zero, and the predicted class is always 0, which is correct for half of your balanced data.)
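Concretely, only inner_product_param needs to change; the "fc1" layer would look something like this (everything except num_output copied from your prototxt):

layer {
  name: "fc1"
  type: "InnerProduct"
  bottom: "data"
  top: "fc1"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  inner_product_param {
    num_output: 2   # one score per class: non-negative vs. negative
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }
  }
}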
Alternatively, you can keep num_output: 1 and change the loss to a "SigmoidCrossEntropyLoss" layer. However, you will no longer be able to use an "Accuracy" layer…
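A rough sketch of that variant, assuming the rest of the net stays as-is ("SigmoidCrossEntropyLoss" compares the single logit in "fc1" against the 0/1 label, and requires the label blob to have the same count as the prediction, which your (N, 1) data / (N,) label layout should satisfy):

layer {
  name: "loss"
  type: "SigmoidCrossEntropyLoss"  # single logit vs. 0/1 target
  bottom: "fc1"
  bottom: "label"
  top: "loss"
}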