When to stop training in Caffe?

I am training with bvlc_reference_caffenet, running training and testing at the same time. Below is a sample of the log from training my network:

I0430 11:49:08.408740 23343 data_layer.cpp:73] Restarting data prefetching from start.
I0430 11:49:21.221074 23343 data_layer.cpp:73] Restarting data prefetching from start.
I0430 11:49:34.038710 23343 data_layer.cpp:73] Restarting data prefetching from start.
I0430 11:49:46.816813 23343 data_layer.cpp:73] Restarting data prefetching from start.
I0430 11:49:56.630870 23334 solver.cpp:397]     Test net output #0: accuracy = 0.932502
I0430 11:49:56.630940 23334 solver.cpp:397]     Test net output #1: loss = 0.388662 (* 1 = 0.388662 loss)
I0430 11:49:57.218236 23334 solver.cpp:218] Iteration 71000 (0.319361 iter/s, 62.625s/20 iters), loss = 0.00146191
I0430 11:49:57.218300 23334 solver.cpp:237]     Train net output #0: loss = 0.00146191 (* 1 = 0.00146191 loss)
I0430 11:49:57.218308 23334 sgd_solver.cpp:105] Iteration 71000, lr = 0.001
I0430 11:50:09.168726 23334 solver.cpp:218] Iteration 71020 (1.67357 iter/s, 11.9505s/20 iters), loss = 0.000806865
I0430 11:50:09.168778 23334 solver.cpp:237]     Train net output #0: loss = 0.000806868 (* 1 = 0.000806868 loss)
I0430 11:50:09.168787 23334 sgd_solver.cpp:105] Iteration 71020, lr = 0.001
I0430 11:50:21.127496 23334 solver.cpp:218] Iteration 71040 (1.67241 iter/s, 11.9588s/20 iters), loss = 0.000182312
I0430 11:50:21.127539 23334 solver.cpp:237]     Train net output #0: loss = 0.000182314 (* 1 = 0.000182314 loss)
I0430 11:50:21.127562 23334 sgd_solver.cpp:105] Iteration 71040, lr = 0.001
I0430 11:50:33.248086 23334 solver.cpp:218] Iteration 71060 (1.65009 iter/s, 12.1206s/20 iters), loss = 0.000428604
I0430 11:50:33.248260 23334 solver.cpp:237]     Train net output #0: loss = 0.000428607 (* 1 = 0.000428607 loss)
I0430 11:50:33.248272 23334 sgd_solver.cpp:105] Iteration 71060, lr = 0.001
I0430 11:50:45.518955 23334 solver.cpp:218] Iteration 71080 (1.62989 iter/s, 12.2707s/20 iters), loss = 0.00108446
I0430 11:50:45.519006 23334 solver.cpp:237]     Train net output #0: loss = 0.00108447 (* 1 = 0.00108447 loss)
I0430 11:50:45.519011 23334 sgd_solver.cpp:105] Iteration 71080, lr = 0.001
I0430 11:50:51.287315 23341 data_layer.cpp:73] Restarting data prefetching from start.
I0430 11:50:57.851781 23334 solver.cpp:218] Iteration 71100 (1.62169 iter/s, 12.3328s/20 iters), loss = 0.00150949
I0430 11:50:57.851828 23334 solver.cpp:237]     Train net output #0: loss = 0.0015095 (* 1 = 0.0015095 loss)
I0430 11:50:57.851837 23334 sgd_solver.cpp:105] Iteration 71100, lr = 0.001
I0430 11:51:09.912184 23334 solver.cpp:218] Iteration 71120 (1.65832 iter/s, 12.0604s/20 iters), loss = 0.00239335
I0430 11:51:09.912330 23334 solver.cpp:237]     Train net output #0: loss = 0.00239335 (* 1 = 0.00239335 loss)
I0430 11:51:09.912340 23334 sgd_solver.cpp:105] Iteration 71120, lr = 0.001
I0430 11:51:21.968586 23334 solver.cpp:218] Iteration 71140 (1.65888 iter/s, 12.0563s/20 iters), loss = 0.00161807
I0430 11:51:21.968646 23334 solver.cpp:237]     Train net output #0: loss = 0.00161808 (* 1 = 0.00161808 loss)
I0430 11:51:21.968654 23334 sgd_solver.cpp:105] Iteration 71140, lr = 0.001

What confuses me is the loss value. I had planned to stop training once the loss dropped below 0.0001, but there are two losses: the training loss and the test loss. The training loss seems to have settled around 0.0001, while the test loss is as high as 0.388, far above my threshold. Which loss should I use to decide when to stop training?


Answer:

Such a large gap between training and test performance likely indicates that you are overfitting your data.
The purpose of a validation set is to make sure you are not overfitting. You should use the performance on the validation set to decide whether to stop training or to keep going.
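
For example, you can implement this as early stopping with Caffe's Python interface: keep training, periodically average the loss of the test/validation net, and stop once it has not improved for several evaluations. The sketch below is only illustrative and makes assumptions not present in the original post: the solver file is called solver.prototxt, the validation net exposes a blob named 'loss', and the patience/interval values are arbitrary.

# Minimal early-stopping sketch with pycaffe (assumed file names and blob
# names; adjust to your own solver and net definitions).
import caffe

caffe.set_mode_gpu()
solver = caffe.get_solver('solver.prototxt')   # assumed solver file name

best_val_loss = float('inf')
patience = 5                 # stop after 5 evaluations without improvement
evals_without_improvement = 0
train_steps_per_eval = 1000  # training iterations between evaluations
val_batches = 100            # batches to average the validation loss over

while evals_without_improvement < patience:
    solver.step(train_steps_per_eval)           # run training iterations

    # Average the validation loss over a fixed number of batches.
    val_loss = 0.0
    for _ in range(val_batches):
        solver.test_nets[0].forward()
        val_loss += float(solver.test_nets[0].blobs['loss'].data)
    val_loss /= val_batches

    if val_loss < best_val_loss:
        best_val_loss = val_loss
        evals_without_improvement = 0
        solver.net.save('best_model.caffemodel')  # keep the best weights so far
    else:
        evals_without_improvement += 1

print('Stopped. Best validation loss:', best_val_loss)

The key design point is that the stopping criterion is driven by the validation loss, not the training loss: once the validation loss stops improving while the training loss keeps shrinking, further training only increases overfitting.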

