I am training a Keras model (TensorFlow-GPU backend) to detect buildings in satellite imagery. The loss is decreasing (which is good), but it is negative, and accuracy is decreasing as well. On the bright side, the model's predictions are visibly improving. My concern is: why is the loss negative? And how can the model keep improving while accuracy drops?
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import BatchNormalization
from tensorflow.keras.layers import Activation
from tensorflow.keras.layers import MaxPool2D as MaxPooling2D
from tensorflow.keras.layers import UpSampling2D
from tensorflow.keras.layers import concatenate
from tensorflow.keras.layers import Input
from tensorflow.keras import Model
from tensorflow.keras.optimizers import RMSprop

# LAYERS
inputs = Input(shape=(300, 300, 3))

# 300
down0 = Conv2D(32, (3, 3), padding='same')(inputs)
down0 = BatchNormalization()(down0)
down0 = Activation('relu')(down0)
down0 = Conv2D(32, (3, 3), padding='same')(down0)
down0 = BatchNormalization()(down0)
down0 = Activation('relu')(down0)
down0_pool = MaxPooling2D((2, 2), strides=(2, 2))(down0)

# 150
down1 = Conv2D(64, (3, 3), padding='same')(down0_pool)
down1 = BatchNormalization()(down1)
down1 = Activation('relu')(down1)
down1 = Conv2D(64, (3, 3), padding='same')(down1)
down1 = BatchNormalization()(down1)
down1 = Activation('relu')(down1)
down1_pool = MaxPooling2D((2, 2), strides=(2, 2))(down1)

# 75
center = Conv2D(1024, (3, 3), padding='same')(down1_pool)
center = BatchNormalization()(center)
center = Activation('relu')(center)
center = Conv2D(1024, (3, 3), padding='same')(center)
center = BatchNormalization()(center)
center = Activation('relu')(center)

# center
up1 = UpSampling2D((2, 2))(center)
up1 = concatenate([down1, up1], axis=3)
up1 = Conv2D(64, (3, 3), padding='same')(up1)
up1 = BatchNormalization()(up1)
up1 = Activation('relu')(up1)
up1 = Conv2D(64, (3, 3), padding='same')(up1)
up1 = BatchNormalization()(up1)
up1 = Activation('relu')(up1)
up1 = Conv2D(64, (3, 3), padding='same')(up1)
up1 = BatchNormalization()(up1)
up1 = Activation('relu')(up1)

# 150
up0 = UpSampling2D((2, 2))(up1)
up0 = concatenate([down0, up0], axis=3)
up0 = Conv2D(32, (3, 3), padding='same')(up0)
up0 = BatchNormalization()(up0)
up0 = Activation('relu')(up0)
up0 = Conv2D(32, (3, 3), padding='same')(up0)
up0 = BatchNormalization()(up0)
up0 = Activation('relu')(up0)
up0 = Conv2D(32, (3, 3), padding='same')(up0)
up0 = BatchNormalization()(up0)
up0 = Activation('relu')(up0)

# 300x300x3
classify = Conv2D(1, (1, 1), activation='sigmoid')(up0)

# 300x300x1
model = Model(inputs=inputs, outputs=classify)
model.compile(optimizer=RMSprop(lr=0.0001),
              loss='binary_crossentropy',
              metrics=[dice_coeff, 'accuracy'])

history = model.fit(sample_input, sample_target, batch_size=4, epochs=5)

OUTPUT:

Epoch 6/10
500/500 [==============================] - 76s 153ms/step - loss: -293.6920 - dice_coeff: 1.8607 - acc: 0.2653
Epoch 7/10
500/500 [==============================] - 75s 150ms/step - loss: -309.2504 - dice_coeff: 1.8730 - acc: 0.2618
Epoch 8/10
500/500 [==============================] - 75s 150ms/step - loss: -324.4123 - dice_coeff: 1.8810 - acc: 0.2659
Epoch 9/10
136/500 [=======>......................] - ETA: 55s - loss: -329.0757 - dice_coeff: 1.8940 - acc: 0.2757
What is going wrong here? (Please ignore dice_coeff; that is a custom loss function.)
Answer:
Your targets are not normalized for a binary-classification problem (and your input data may not be normalized either). If you load images directly, their pixel values can range from 0 to 255, or even 0 to 65535 for 16-bit images.

You should normalize y_train (divide it by y_train.max()) and use a 'sigmoid' activation at the end of the model.
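A minimal NumPy sketch of why unnormalized targets drive the loss negative (the bce helper below is my own, mirroring the per-pixel binary cross-entropy formula that Keras averages; the sample values are illustrative, not from the question's data):

```python
import numpy as np

def bce(y, p):
    # Per-pixel binary cross-entropy: -y*log(p) - (1-y)*log(1-p).
    # The formula only yields a non-negative value when y is in [0, 1].
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

p = 0.99  # sigmoid output: predicted probability of "building"

loss_ok = bce(1.0, p)    # normalized mask value -> small positive loss
loss_bad = bce(255.0, p) # raw 8-bit mask value  -> large NEGATIVE loss

print(loss_ok)   # ~0.01
print(loss_bad)  # large negative number

# The fix suggested above, in code:
# y_train = y_train / y_train.max()
```

With y = 255 the term -(1-y)*log(1-p) becomes -(-254)*log(0.01), a large negative contribution, so the optimizer can push the loss toward minus infinity indefinitely, which matches the ever-more-negative loss in your training log.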