I have been training a convolutional neural network for a binary classification task: classifying mammogram image patches as normal vs. abnormal. Here are my training plots:

Although the results are fairly stable, for a binary classification task I would like the training/validation accuracy to reach 0.9 or higher. Looking at the training output, the network appears to be stuck at a saddle point. Here is a sample of the training output:
Epoch 48/400
134/134 [==============================] - ETA: 0s - loss: 0.2837 - binary_accuracy: 0.8762
Epoch 00048: val_loss did not improve from 0.37938
134/134 [==============================] - 39s 294ms/step - loss: 0.2837 - binary_accuracy: 0.8762 - val_loss: 0.3802 - val_binary_accuracy: 0.8358
Epoch 49/400
134/134 [==============================] - ETA: 0s - loss: 0.2820 - binary_accuracy: 0.8846
Epoch 00049: val_loss did not improve from 0.37938
134/134 [==============================] - 39s 294ms/step - loss: 0.2820 - binary_accuracy: 0.8846 - val_loss: 0.3844 - val_binary_accuracy: 0.8312
Epoch 50/400
134/134 [==============================] - ETA: 0s - loss: 0.2835 - binary_accuracy: 0.8806
Epoch 00050: val_loss did not improve from 0.37938
134/134 [==============================] - 39s 292ms/step - loss: 0.2835 - binary_accuracy: 0.8806 - val_loss: 0.3827 - val_binary_accuracy: 0.8293
Epoch 51/400
134/134 [==============================] - ETA: 0s - loss: 0.2754 - binary_accuracy: 0.8843
Epoch 00051: val_loss did not improve from 0.37938
134/134 [==============================] - 39s 293ms/step - loss: 0.2754 - binary_accuracy: 0.8843 - val_loss: 0.3847 - val_binary_accuracy: 0.8246
Epoch 52/400
134/134 [==============================] - ETA: 0s - loss: 0.2773 - binary_accuracy: 0.8832
Epoch 00052: val_loss did not improve from 0.37938
134/134 [==============================] - 39s 290ms/step - loss: 0.2773 - binary_accuracy: 0.8832 - val_loss: 0.4020 - val_binary_accuracy: 0.8293
Epoch 53/400
134/134 [==============================] - ETA: 0s - loss: 0.2762 - binary_accuracy: 0.8825
Epoch 00053: val_loss did not improve from 0.37938
134/134 [==============================] - 39s 290ms/step - loss: 0.2762 - binary_accuracy: 0.8825 - val_loss: 0.3918 - val_binary_accuracy: 0.8106
Epoch 54/400
134/134 [==============================] - ETA: 0s - loss: 0.2734 - binary_accuracy: 0.8881
Epoch 00054: val_loss did not improve from 0.37938
134/134 [==============================] - 39s 290ms/step - loss: 0.2734 - binary_accuracy: 0.8881 - val_loss: 0.4216 - val_binary_accuracy: 0.8181
Epoch 55/400
134/134 [==============================] - ETA: 0s - loss: 0.2902 - binary_accuracy: 0.8804
Epoch 00055: val_loss improved from 0.37938 to 0.36383, saving model to /content/drive/My Drive/Breast Mammography/Patch Classifier/Training/normal-abnormal_patch_classification_weights_clr-055-0.3638.hdf5
134/134 [==============================] - 40s 301ms/step - loss: 0.2902 - binary_accuracy: 0.8804 - val_loss: 0.3638 - val_binary_accuracy: 0.8396
I am considering the following options:
- Load the network from the epoch where learning stalled and continue training with a step learning-rate schedule (starting from a larger learning rate to escape the saddle point)
- Perform offline augmentation to increase the amount of training data, while keeping all of the transforms currently applied during online augmentation
- Change the network architecture (currently I am using a custom, somewhat shallow (I would call it medium-depth) model)
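The first option could be sketched in Keras as follows. The restart learning rate, drop factor, and drop interval below are hypothetical values for illustration, not taken from the question:

```python
# Sketch: resume from the best checkpoint and apply a step learning-rate
# schedule that restarts at a larger rate, then decays in discrete steps.
def step_schedule(epoch, lr):
    """Restart at a larger learning rate, then drop by 10x every 20 epochs."""
    initial_lr = 0.01    # assumed restart rate (larger than where SGD stalled)
    drop_factor = 0.1    # assumed decay factor per step
    drop_every = 20      # assumed epochs between drops
    return initial_lr * (drop_factor ** (epoch // drop_every))

# Keras wiring (commented out; the checkpoint path is the one from the log):
# import tensorflow as tf
# model = tf.keras.models.load_model(
#     ".../normal-abnormal_patch_classification_weights_clr-055-0.3638.hdf5")
# model.fit(train_data, epochs=400, validation_data=val_data,
#           callbacks=[tf.keras.callbacks.LearningRateScheduler(step_schedule)])
```

`LearningRateScheduler` calls the function at the start of every epoch, so resuming from epoch 55 with this schedule would jump the learning rate back up before decaying again.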
Besides the options above, does anyone have other suggestions? Also, I am using SGD with momentum=0.9. In practice, is there an optimizer that escapes saddle points more easily than SGD with momentum? And how does BATCH_SIZE (which I set to 32, with TRAINING_SIZE=4300 and no class imbalance) affect learning?
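For context, the stated hyper-parameters line up with the log: the "134/134" progress counter is simply TRAINING_SIZE floor-divided by BATCH_SIZE. The compile call below is a sketch of the described setup; the learning rate and the model itself are assumptions, not given in the question:

```python
# The step count in the log follows directly from the stated hyper-parameters.
TRAINING_SIZE = 4300
BATCH_SIZE = 32

steps_per_epoch = TRAINING_SIZE // BATCH_SIZE  # 4300 // 32 == 134
print(steps_per_epoch)  # matches the "134/134" progress counter in the log

# Keras compile call matching the described optimizer (commented sketch;
# the model definition and learning rate are assumed):
# import tensorflow as tf
# model.compile(
#     optimizer=tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9),
#     loss="binary_crossentropy",
#     metrics=["binary_accuracy"])
```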
Answer: