I am working on multi-class segmentation with Keras and U-Net.
My network output has 12 classes and uses a softmax activation; the output shape is (N, 288, 288, 12).
To fit my model I use the sparse_categorical_crossentropy loss.
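For context, my understanding of the shapes this loss expects is shown below in a small check with dummy tensors; the values are made up and only the shapes matter (this snippet is just an illustration, not part of my training code):

import numpy as np
import tensorflow as tf

# Shape check for sparse_categorical_crossentropy with dummy data:
# predictions carry a class axis of size 12, labels are plain integer indices.
y_pred = tf.nn.softmax(tf.random.uniform((2, 288, 288, 12)), axis=-1)  # like the model output
y_true = np.random.randint(12, size=(2, 288, 288))                     # integer masks, no class axis

loss = tf.keras.losses.sparse_categorical_crossentropy(y_true, y_pred)
print(loss.shape)  # (2, 288, 288) -- one loss value per pixel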
I want to set class weights for my imbalanced dataset.
I found this useful link and tried to implement it; since class_weight in Keras does not work for more than 2 classes, I used sample weights instead.
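(As a side note, the per-class values I use below are hand-picked. One way such weights could be derived instead is the inverse-frequency "balanced" heuristic, n_pixels / (n_classes * count_per_class); the helper below is only my own sketch, not from the linked post, and assumes the label masks are integer arrays with values 0..11.)

import numpy as np

# Hypothetical helper: derive one weight per class from pixel frequencies.
def inverse_frequency_weights(y, num_classes=12):
    counts = np.bincount(np.asarray(y).ravel(), minlength=num_classes).astype(np.float64)
    counts = np.maximum(counts, 1.0)                  # guard against absent classes
    return counts.sum() / (num_classes * counts)      # rarer classes get larger weights

# Example with dummy masks standing in for Y_train:
dummy_masks = np.random.randint(12, size=(4, 288, 288))
print(inverse_frequency_weights(dummy_masks))         # roughly 1.0 everywhere for uniform dummy data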
My code is:
inputs = tf.keras.layers.Input((IMG_WIDHT, IMG_HEIGHT, IMG_CHANNELS))
smooth = 1.
s = tf.keras.layers.Lambda(lambda x: x / 255)(inputs)

c1 = tf.keras.layers.Conv2D(16, (3, 3), activation='relu', kernel_initializer='he_normal', padding='same')(s)  # Kernelsize : start with some weights initial value
c1 = tf.keras.layers.Dropout(0.1)(c1)
c1 = tf.keras.layers.Conv2D(16, (3, 3), activation='relu', kernel_initializer='he_normal', padding='same')(c1)  # Kernelsize : start with some weights initial value
p1 = tf.keras.layers.MaxPool2D((2, 2))(c1)

c2 = tf.keras.layers.Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_normal', padding='same')(p1)  # Kernelsize : start with some weights initial value
c2 = tf.keras.layers.Dropout(0.1)(c2)
c2 = tf.keras.layers.Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_normal', padding='same')(c2)  # Kernelsize : start with some weights initial value
p2 = tf.keras.layers.MaxPool2D((2, 2))(c2)

c3 = tf.keras.layers.Conv2D(64, (3, 3), activation='relu', kernel_initializer='he_normal', padding='same')(p2)  # Kernelsize : start with some weights initial value
c3 = tf.keras.layers.Dropout(0.1)(c3)
c3 = tf.keras.layers.Conv2D(64, (3, 3), activation='relu', kernel_initializer='he_normal', padding='same')(c3)  # Kernelsize : start with some weights initial value
p3 = tf.keras.layers.MaxPool2D((2, 2))(c3)

c4 = tf.keras.layers.Conv2D(128, (3, 3), activation='relu', kernel_initializer='he_normal', padding='same')(p3)  # Kernelsize : start with some weights initial value
c4 = tf.keras.layers.Dropout(0.1)(c4)
c4 = tf.keras.layers.Conv2D(128, (3, 3), activation='relu', kernel_initializer='he_normal', padding='same')(c4)  # Kernelsize : start with some weights initial value
p4 = tf.keras.layers.MaxPool2D((2, 2))(c4)

c5 = tf.keras.layers.Conv2D(256, (3, 3), activation='relu', kernel_initializer='he_normal', padding='same')(p4)  # Kernelsize : start with some weights initial value
c5 = tf.keras.layers.Dropout(0.1)(c5)
c5 = tf.keras.layers.Conv2D(256, (3, 3), activation='relu', kernel_initializer='he_normal', padding='same')(c5)  # Kernelsize : start with some weights initial value

u6 = tf.keras.layers.Conv2DTranspose(128, (2, 2), strides=(2, 2), padding='same')(c5)
u6 = tf.keras.layers.concatenate([u6, c4])
c6 = tf.keras.layers.Conv2D(128, (3, 3), activation='relu', kernel_initializer='he_normal', padding='same')(u6)
c6 = tf.keras.layers.Dropout(0.2)(c6)
c6 = tf.keras.layers.Conv2D(128, (3, 3), activation='relu', kernel_initializer='he_normal', padding='same')(c6)

u7 = tf.keras.layers.Conv2DTranspose(64, (2, 2), strides=(2, 2), padding='same')(c6)
u7 = tf.keras.layers.concatenate([u7, c3])
c7 = tf.keras.layers.Conv2D(64, (2, 2), activation='relu', kernel_initializer='he_normal', padding='same')(u7)
c7 = tf.keras.layers.Dropout(0.2)(c7)
c7 = tf.keras.layers.Conv2D(64, (3, 3), activation='relu', kernel_initializer='he_normal', padding='same')(c7)

u8 = tf.keras.layers.Conv2DTranspose(32, (2, 2), strides=(2, 2), padding='same')(c7)
u8 = tf.keras.layers.concatenate([u8, c2])
c8 = tf.keras.layers.Conv2D(32, (2, 2), activation='relu', kernel_initializer='he_normal', padding='same')(u8)
c8 = tf.keras.layers.Dropout(0.1)(c8)
c8 = tf.keras.layers.Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_normal', padding='same')(c8)

u9 = tf.keras.layers.Conv2DTranspose(16, (2, 2), strides=(2, 2), padding='same')(c8)
u9 = tf.keras.layers.concatenate([u9, c1], axis=3)
c9 = tf.keras.layers.Conv2D(16, (3, 3), activation='relu', kernel_initializer='he_normal', padding='same')(u9)
c9 = tf.keras.layers.Dropout(0.1)(c9)
c9 = tf.keras.layers.Conv2D(16, (3, 3), activation='relu', kernel_initializer='he_normal', padding='same')(c9)

outputs = tf.keras.layers.Conv2D(12, (1, 1), activation='softmax')(c9)
outputs = tf.keras.layers.Flatten(data_format=None)(outputs)

model = tf.keras.Model(inputs=[inputs], outputs=[outputs])
cc = tf.keras.optimizers.Adam(learning_rate=0.0001, beta_1=0.9, beta_2=0.999, amsgrad=False)
model.compile(optimizer=cc, loss='sparse_categorical_crossentropy', metrics=['sparse_categorical_accuracy'], sample_weight_mode="temporal")  # metrics=[dice_coeff]
model.summary()
model.summary()

checkpointer = tf.keras.callbacks.ModelCheckpoint('chek12class3.h5', verbose=1, save_best_only=True)
# print('############## Initial weights ############## : ', model.get_weights())
# callbacks = [
#     tf.keras.callbacks.EarlyStopping(patience=2, monitor='val_loss'),
#     tf.keras.callbacks.TensorBoard(log_dir='logs')]
# history = model.fit(train_generator, validation_split=0.1, batch_size=4, epochs=100, callbacks=callbacks)

class_weights = np.zeros((82944, 12))
class_weights[:, 0] += 7
class_weights[:, 1] += 10
class_weights[:, 2] += 2
class_weights[:, 3] += 3
class_weights[:, 4] += 4
class_weights[:, 5] += 5
class_weights[:, 6] += 6
class_weights[:, 7] += 50
class_weights[:, 8] += 8
class_weights[:, 9] += 9
class_weights[:, 10] += 50
class_weights[:, 11] += 11

history = model.fit(X_train, Y_train, validation_split=0.18, batch_size=1, epochs=60, sample_weight=class_weights)  # class_weight=clas
82944 is the height times the width of my samples (288 * 288), and 12 is the number of classes.
I got this error:
ValueError: Found a sample_weight array with shape (82944, 12) for an input with shape (481, 288, 288). sample_weight cannot be broadcast.
According to that link, sample weights should be passed as (number of training samples, shape of the training data).
I then added a Flatten layer before the output, but it still does not work.
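If I understand sample_weight_mode="temporal" correctly, it expects a 2D weight array of shape (num_samples, num_timesteps), which here would be (481, 288*288), i.e. one weight per pixel, with the labels flattened the same way. This is only my rough sketch of that interpretation (dummy_masks stands in for Y_train, assumed to hold integer labels 0..11):

import numpy as np

# One weight per class, indexed by class id (same values as above).
per_class_weight = np.array([7, 10, 2, 3, 4, 5, 6, 50, 8, 9, 50, 11], dtype=np.float64)

dummy_masks = np.random.randint(12, size=(4, 288, 288))      # stand-in for Y_train
labels_flat = dummy_masks.reshape(len(dummy_masks), -1)      # (4, 82944)
pixel_weights = per_class_weight[labels_flat]                # (4, 82944), one weight per pixel
# history = model.fit(X_train, labels_flat, ..., sample_weight=pixel_weights)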
My model architecture is as follows:
Model: "model"
__________________________________________________________________________________________________
Layer (type)                          Output Shape          Param #     Connected to
==================================================================================================
input_1 (InputLayer)                  [(None, 288, 288, 3)] 0
lambda (Lambda)                       (None, 288, 288, 3)   0           input_1[0][0]
conv2d (Conv2D)                       (None, 288, 288, 16)  448         lambda[0][0]
dropout (Dropout)                     (None, 288, 288, 16)  0           conv2d[0][0]
conv2d_1 (Conv2D)                     (None, 288, 288, 16)  2320        dropout[0][0]
max_pooling2d (MaxPooling2D)          (None, 144, 144, 16)  0           conv2d_1[0][0]
conv2d_2 (Conv2D)                     (None, 144, 144, 32)  4640        max_pooling2d[0][0]
dropout_1 (Dropout)                   (None, 144, 144, 32)  0           conv2d_2[0][0]
conv2d_3 (Conv2D)                     (None, 144, 144, 32)  9248        dropout_1[0][0]
max_pooling2d_1 (MaxPooling2D)        (None, 72, 72, 32)    0           conv2d_3[0][0]
conv2d_4 (Conv2D)                     (None, 72, 72, 64)    18496       max_pooling2d_1[0][0]
dropout_2 (Dropout)                   (None, 72, 72, 64)    0           conv2d_4[0][0]
conv2d_5 (Conv2D)                     (None, 72, 72, 64)    36928       dropout_2[0][0]
max_pooling2d_2 (MaxPooling2D)        (None, 36, 36, 64)    0           conv2d_5[0][0]
conv2d_6 (Conv2D)                     (None, 36, 36, 128)   73856       max_pooling2d_2[0][0]
dropout_3 (Dropout)                   (None, 36, 36, 128)   0           conv2d_6[0][0]
conv2d_7 (Conv2D)                     (None, 36, 36, 128)   147584      dropout_3[0][0]
max_pooling2d_3 (MaxPooling2D)        (None, 18, 18, 128)   0           conv2d_7[0][0]
conv2d_8 (Conv2D)                     (None, 18, 18, 256)   295168      max_pooling2d_3[0][0]
dropout_4 (Dropout)                   (None, 18, 18, 256)   0           conv2d_8[0][0]
conv2d_9 (Conv2D)                     (None, 18, 18, 256)   590080      dropout_4[0][0]
conv2d_transpose (Conv2DTranspose)    (None, 36, 36, 128)   131200      conv2d_9[0][0]
concatenate (Concatenate)             (None, 36, 36, 256)   0           conv2d_transpose[0][0]
                                                                        conv2d_7[0][0]
conv2d_10 (Conv2D)                    (None, 36, 36, 128)   295040      concatenate[0][0]
dropout_5 (Dropout)                   (None, 36, 36, 128)   0           conv2d_10[0][0]
conv2d_11 (Conv2D)                    (None, 36, 36, 128)   147584      dropout_5[0][0]
conv2d_transpose_1 (Conv2DTranspose)  (None, 72, 72, 64)    32832       conv2d_11[0][0]
concatenate_1 (Concatenate)           (None, 72, 72, 128)   0           conv2d_transpose_1[0][0]
                                                                        conv2d_5[0][0]
conv2d_12 (Conv2D)                    (None, 72, 72, 64)    32832       concatenate_1[0][0]
dropout_6 (Dropout)                   (None, 72, 72, 64)    0           conv2d_12[0][0]
conv2d_13 (Conv2D)                    (None, 72, 72, 64)    36928       dropout_6[0][0]
conv2d_transpose_2 (Conv2DTranspose)  (None, 144, 144, 32)  8224        conv2d_13[0][0]
concatenate_2 (Concatenate)           (None, 144, 144, 64)  0           conv2d_transpose_2[0][0]
                                                                        conv2d_3[0][0]
conv2d_14 (Conv2D)                    (None, 144, 144, 32)  8224        concatenate_2[0][0]
dropout_7 (Dropout)                   (None, 144, 144, 32)  0           conv2d_14[0][0]
conv2d_15 (Conv2D)                    (None, 144, 144, 32)  9248        dropout_7[0][0]
conv2d_transpose_3 (Conv2DTranspose)  (None, 288, 288, 16)  2064        conv2d_15[0][0]
concatenate_3 (Concatenate)           (None, 288, 288, 32)  0           conv2d_transpose_3[0][0]
                                                                        conv2d_1[0][0]
conv2d_16 (Conv2D)                    (None, 288, 288, 16)  4624        concatenate_3[0][0]
dropout_8 (Dropout)                   (None, 288, 288, 16)  0           conv2d_16[0][0]
conv2d_17 (Conv2D)                    (None, 288, 288, 16)  2320        dropout_8[0][0]
conv2d_18 (Conv2D)                    (None, 288, 288, 12)  204         conv2d_17[0][0]
==================================================================================================
I thought this solution might work:
sample_weights = np.zeros(len(Y_train))

# set your own weights here:
sample_weights[Y_train[Y_train==0]] = 7
sample_weights[Y_train[Y_train==1]] = 10
sample_weights[Y_train[Y_train==2]] = 2
sample_weights[Y_train[Y_train==3]] = 3
sample_weights[Y_train[Y_train==4]] = 4
sample_weights[Y_train[Y_train==5]] = 5
sample_weights[Y_train[Y_train==6]] = 6
sample_weights[Y_train[Y_train==7]] = 50
sample_weights[Y_train[Y_train==8]] = 8
sample_weights[Y_train[Y_train==9]] = 9
sample_weights[Y_train[Y_train==10]] = 50
sample_weights[Y_train[Y_train==11]] = 11
I got this error:
ValueError: Found a sample_weight array with shape (481,). In order to use timestep-wise sample weighting, you should pass a 2D sample_weight array.
Answer:
You are misusing sample_weight. As its name suggests, it assigns a weight to each sample; so, although you have only 481 samples, you are passing an object of length 82944 (and a 2D one, at that), hence the expected error:
ValueError: Found a sample_weight array with shape (82944, 12) for an input with shape (481, 288, 288). sample_weight cannot be broadcast.
So, what you actually need is a 1D sample_weight array with length equal to the number of your training samples, where each element is the weight of the corresponding sample, determined by that sample's class according to the per-class weights you have indicated.
Here is how you could do it, using some dummy data y of 12 classes and only 30 samples:
import numpy as np

y = np.random.randint(12, size=30)  # dummy data, 12 classes
y
# array([ 8,  0,  6,  8,  9,  9,  7, 11,  6,  4,  6,  3, 10,  8,  7,  7, 11,
#         2,  5,  8,  8,  1,  7,  2,  7,  9,  5,  2,  0,  0])

sample_weights = np.zeros(len(y))

# set your own weights here:
sample_weights[y==0] = 7
sample_weights[y==1] = 10
sample_weights[y==2] = 2
sample_weights[y==3] = 3
sample_weights[y==4] = 4
sample_weights[y==5] = 5
sample_weights[y==6] = 6
sample_weights[y==7] = 50
sample_weights[y==8] = 8
sample_weights[y==9] = 9
sample_weights[y==10] = 50
sample_weights[y==11] = 11

sample_weights
# result:
# array([ 8.,  7.,  6.,  8.,  9.,  9., 50., 11.,  6.,  4.,  6.,  3., 50.,
#         8., 50., 50., 11.,  2.,  5.,  8.,  8., 10., 50.,  2., 50.,  9.,
#         5.,  2.,  7.,  7.])
Let's put them into a nice dataframe for better inspection:
import pandas as pd

d = {'y': y, 'weight': sample_weights}
df = pd.DataFrame(d)
print(df.to_string(index=False))
# result:
#  y  weight
#  8     8.0
#  0     7.0
#  6     6.0
#  8     8.0
#  9     9.0
#  9     9.0
#  7    50.0
# 11    11.0
#  6     6.0
#  4     4.0
#  6     6.0
#  3     3.0
# 10    50.0
#  8     8.0
#  7    50.0
#  7    50.0
# 11    11.0
#  2     2.0
#  5     5.0
#  8     8.0
#  8     8.0
#  1    10.0
#  7    50.0
#  2     2.0
#  7    50.0
#  9     9.0
#  5     5.0
#  2     2.0
#  0     7.0
#  0     7.0
And of course, in your model.fit you should replace sample_weight=class_weights with sample_weight=sample_weights.
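For completeness, a minimal sketch of the adjusted call, assuming X_train, Y_train and the model from your post and a sample_weights array built per class as shown above, with len(sample_weights) == len(X_train). Note that sample_weight_mode="temporal" in model.compile() expects a 2D weight array (that is exactly what your second error message says), so it would likely have to be removed when passing a plain 1D per-sample array:

# Sketch only: assumes the model was compiled without sample_weight_mode="temporal",
# so that the 1D array is interpreted as one weight per sample.
history = model.fit(X_train, Y_train,
                    validation_split=0.18,
                    batch_size=1,
                    epochs=60,
                    sample_weight=sample_weights)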