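I tried SGD, Adadelta, AdaBound, and Adam as optimizers, but the validation accuracy still fluctuates. I also tried every activation function available in Keras, and val_acc still fluctuates.
Training samples: 1352
Validation samples: 339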
[Figure: validation accuracy]
    # Imports assumed to be from tf.keras; the original post did not show them.
    from tensorflow.keras.layers import (Input, Conv2D, Activation, BatchNormalization,
                                         MaxPooling2D, Dropout, Flatten, Dense)
    from tensorflow.keras.models import Model

    # input_shape, channel_dim, and nc (number of classes) are defined elsewhere.

    # First (and only) CONV => RELU => POOL block
    inpt = Input(shape=input_shape)
    x = Conv2D(32, (3, 3), padding="same")(inpt)
    x = Activation("swish")(x)
    x = BatchNormalization(axis=channel_dim)(x)
    x = MaxPooling2D(pool_size=(3, 3))(x)
    # x = Dropout(0.25)(x)

    # First CONV => RELU => CONV => RELU => POOL block
    x = Conv2D(64, (3, 3), padding="same")(x)
    x = Activation("swish")(x)
    x = BatchNormalization(axis=channel_dim)(x)
    x = Conv2D(64, (3, 3), padding="same")(x)
    x = Activation("swish")(x)
    x = BatchNormalization(axis=channel_dim)(x)
    x = MaxPooling2D(pool_size=(2, 2))(x)
    # x = Dropout(0.25)(x)

    # Second CONV => RELU => CONV => RELU => POOL block
    x = Conv2D(128, (3, 3), padding="same")(x)
    x = Activation("swish")(x)
    x = BatchNormalization(axis=channel_dim)(x)
    x = Conv2D(128, (3, 3), padding="same")(x)
    x = Activation("swish")(x)
    x = BatchNormalization(axis=channel_dim)(x)
    x = MaxPooling2D(pool_size=(2, 2))(x)
    # x = Dropout(0.25)(x)

    # First (and only) set of fully connected layers
    x = Flatten()(x)  # change to GlobalMaxPooling2D
    x = Dense(256, activation="swish")(x)
    x = BatchNormalization(axis=channel_dim)(x)
    x = Dropout(0.4)(x)
    x = Dense(128, activation="swish")(x)
    x = BatchNormalization()(x)
    x = Dropout(0.4)(x)
    x = Dense(64, activation="swish")(x)
    x = BatchNormalization()(x)
    x = Dropout(0.3)(x)
    x = Dense(32, activation="swish")(x)
    x = BatchNormalization()(x)
    x = Dense(nc, activation="softmax")(x)

    model = Model(inputs=inpt, outputs=x)
    model.compile(loss="categorical_crossentropy", optimizer="sgd", metrics=["accuracy"])
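Answer:
Your model may be too sensitive to noise; see this answer.
Based on the linked answer and on what I can see of your model, your network is probably too complex for the amount of data you have (large model plus too little data ==> overfitting ==> sensitivity to noise). I suggest trying a much simpler model as a sanity check.
The learning rate may also be contributing to the fluctuations (as Neb mentioned). You are using SGD with its default learning rate, which is 0.01 and may be too high. Try a learning rate of 1e-3 or lower.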
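As a rough illustration of the sanity check suggested above, a much smaller network could look something like the sketch below. The function name `build_baseline`, the layer sizes, and the example `input_shape` and class count are all assumptions made for illustration; they are not taken from the question.

```python
from tensorflow.keras import layers, models


def build_baseline(input_shape, num_classes):
    """Small CNN: two conv layers, global average pooling, softmax classifier."""
    inputs = layers.Input(shape=input_shape)
    x = layers.Conv2D(16, (3, 3), padding="same", activation="relu")(inputs)
    x = layers.MaxPooling2D(pool_size=(2, 2))(x)
    x = layers.Conv2D(32, (3, 3), padding="same", activation="relu")(x)
    # Global pooling instead of Flatten + several Dense layers keeps the
    # parameter count small relative to ~1350 training samples.
    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return models.Model(inputs=inputs, outputs=outputs)


# Hypothetical input shape and class count, chosen only for this example.
baseline = build_baseline(input_shape=(64, 64, 3), num_classes=4)
baseline.compile(loss="categorical_crossentropy", optimizer="sgd",
                 metrics=["accuracy"])
print(baseline.count_params())
```

If the validation accuracy of a model this small is noticeably steadier, that would point toward model capacity rather than the optimizer or activation function as the source of the fluctuations.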
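To lower the learning rate, the optimizer can be passed as an object rather than as the string 'sgd'. The snippet below is a minimal sketch using tf.keras; the tiny stand-in model exists only so the example runs on its own and is not the model from the question.

```python
from tensorflow.keras import layers, models
from tensorflow.keras.optimizers import SGD

# Stand-in model (hypothetical) so the snippet is self-contained.
inputs = layers.Input(shape=(8,))
outputs = layers.Dense(3, activation="softmax")(inputs)
model = models.Model(inputs=inputs, outputs=outputs)

# Passing an optimizer instance lets us override the default rate of 0.01.
optimizer = SGD(learning_rate=1e-3)
model.compile(loss="categorical_crossentropy", optimizer=optimizer,
              metrics=["accuracy"])
```

The same `compile` call should work with the original model; only the `optimizer` argument changes.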