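I am using Keras for a multi-class classification problem. I use EarlyStopping(monitor='val_loss', patience=4) as the stopping criterion, i.e. training stops if the validation loss has not decreased for 4 epochs. Is it better to use val_acc or val_loss as the stopping criterion? I ask because I see val_loss increasing while val_acc is also increasing. Consider the output for epochs 8 and 13 below.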
Epoch 1/200
240703/240703 [==============================] - 4831s - loss: 0.8581 - acc: 0.7603 - val_loss: 0.6247 - val_acc: 0.8160
Epoch 2/200
240703/240703 [==============================] - 4855s - loss: 0.6099 - acc: 0.8166 - val_loss: 0.5742 - val_acc: 0.8300
Epoch 3/200
240703/240703 [==============================] - 4627s - loss: 0.5573 - acc: 0.8308 - val_loss: 0.5600 - val_acc: 0.8337
Epoch 4/200
240703/240703 [==============================] - 4624s - loss: 0.5265 - acc: 0.8395 - val_loss: 0.5550 - val_acc: 0.8347
Epoch 5/200
240703/240703 [==============================] - 4623s - loss: 0.5042 - acc: 0.8452 - val_loss: 0.5529 - val_acc: 0.8377
Epoch 6/200
240703/240703 [==============================] - 4624s - loss: 0.4879 - acc: 0.8507 - val_loss: 0.5521 - val_acc: 0.8378
Epoch 7/200
240703/240703 [==============================] - 4625s - loss: 0.4726 - acc: 0.8555 - val_loss: 0.5554 - val_acc: 0.8383
Epoch 8/200
240703/240703 [==============================] - 4621s - loss: 0.4604 - acc: 0.8585 - val_loss: 0.5513 - val_acc: 0.8383
Epoch 9/200
240703/240703 [==============================] - 4716s - loss: 0.4508 - acc: 0.8606 - val_loss: 0.5649 - val_acc: 0.8366
Epoch 10/200
240703/240703 [==============================] - 4602s - loss: 0.4409 - acc: 0.8637 - val_loss: 0.5626 - val_acc: 0.8389
Epoch 11/200
240703/240703 [==============================] - 4651s - loss: 0.4318 - acc: 0.8662 - val_loss: 0.5710 - val_acc: 0.8387
Epoch 12/200
240703/240703 [==============================] - 4706s - loss: 0.4239 - acc: 0.8687 - val_loss: 0.5737 - val_acc: 0.8384
Epoch 13/200
240703/240703 [==============================] - 4706s - loss: 0.4190 - acc: 0.8698 - val_loss: 0.5730 - val_acc: 0.8391
Answer:
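In general, loss is a better measure than accuracy because it has finer resolution. Accuracy can take only about as many distinct values as there are samples in the validation set, whereas loss varies continuously, so it lets you track what is happening more precisely. On the other hand, accuracy is easier to analyze because it is directly interpretable (it is just a percentage). A loss-based criterion is therefore harder to use without domain expertise, but when used correctly it can be slightly more precise.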
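To make the trade-off concrete, here is a minimal sketch that replays the val_loss and val_acc values from the log in the question through a simplified patience rule. The function name simulate_early_stopping and its stopping rule (stop once `patience` consecutive epochs pass without a strict improvement) are assumptions made for illustration. This is not Keras's own implementation, and the exact epoch at which Keras stops may differ by version.

```python
# Validation metrics copied from epochs 1-13 of the log in the question.
val_loss = [0.6247, 0.5742, 0.5600, 0.5550, 0.5529, 0.5521, 0.5554,
            0.5513, 0.5649, 0.5626, 0.5710, 0.5737, 0.5730]
val_acc = [0.8160, 0.8300, 0.8337, 0.8347, 0.8377, 0.8378, 0.8383,
           0.8383, 0.8366, 0.8389, 0.8387, 0.8384, 0.8391]


def simulate_early_stopping(values, patience, mode):
    """Replay a metric history through a simplified patience rule.

    Hypothetical helper, not part of Keras. Returns (stop_epoch, best_epoch),
    both 1-indexed; stop_epoch is None if the rule never triggers.
    """
    if mode not in ("min", "max"):
        raise ValueError("mode must be 'min' or 'max'")

    def improved(current, best):
        # Strict improvement: lower is better for 'min', higher for 'max'.
        return current < best if mode == "min" else current > best

    best = None
    best_epoch = None
    wait = 0  # consecutive epochs without improvement
    for epoch, value in enumerate(values, start=1):
        if best is None or improved(value, best):
            best, best_epoch, wait = value, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return None, best_epoch


loss_result = simulate_early_stopping(val_loss, patience=4, mode="min")
acc_result = simulate_early_stopping(val_acc, patience=4, mode="max")
print("val_loss criterion (stop_epoch, best_epoch):", loss_result)
print("val_acc criterion  (stop_epoch, best_epoch):", acc_result)
```

Under this simplified rule, the val_loss criterion stops at epoch 12 with its best value at epoch 8, while the val_acc criterion has not triggered by epoch 13, because val_acc is still reaching new highs at epochs 10 and 13. Those gains appear only in the third or fourth decimal place, while val_loss has risen from 0.5513 to 0.5730 over the same epochs.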