I want to fine-tune Inception V3 to recognize the UC Merced land-use dataset. The dataset has 21 classes with 100 images each. I split it manually into 5 folds; within every fold, each class contributes 60 images for training, 20 for validation, and 20 for testing. For example, in the first fold images 0–59 of each class are used for training and images 60–79 for validation, and so on; in the second fold images 0–19 are used for testing and images 80–99 for validation, and so on. Since I apply cross-validation, every image in the dataset ends up being used to test the network.
With this fine-tuning I reach 93% accuracy; my target is 97%.
```python
# imports
import os
import random
import numpy as np
from matplotlib import pyplot as plt
from keras.preprocessing import image
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential, Model
from keras.layers import (Conv2D, MaxPooling2D, Activation, Dropout,
                          Flatten, Dense, GlobalAveragePooling2D)
from keras import backend as K
from keras import applications
from keras import utils
from keras import optimizers
from sklearn.model_selection import KFold

directory_train = ["./UCMerced_LandUse2/Images/Fold1/Training",
                   "./UCMerced_LandUse2/Images/Fold2/Training",
                   "./UCMerced_LandUse2/Images/Fold3/Training",
                   "./UCMerced_LandUse2/Images/Fold4/Training",
                   "./UCMerced_LandUse2/Images/Fold5/Training"]
directory_validation = ["./UCMerced_LandUse2/Images/Fold1/Validation",
                        "./UCMerced_LandUse2/Images/Fold2/Validation",
                        "./UCMerced_LandUse2/Images/Fold3/Validation",
                        "./UCMerced_LandUse2/Images/Fold4/Validation",
                        "./UCMerced_LandUse2/Images/Fold5/Validation"]
directory_test = ["./UCMerced_LandUse2/Images/Fold1/Test",
                  "./UCMerced_LandUse2/Images/Fold2/Test",
                  "./UCMerced_LandUse2/Images/Fold3/Test",
                  "./UCMerced_LandUse2/Images/Fold4/Test",
                  "./UCMerced_LandUse2/Images/Fold5/Test"]
```
```python
img_width, img_height = 256, 256
num_samples = 2100
batch_size = 10

# Augmentation only for training; validation/test images are just rescaled.
Datagen = ImageDataGenerator(rescale=1./255,
                             shear_range=0.2,
                             zoom_range=0.2,
                             horizontal_flip=True)
Gen = ImageDataGenerator(rescale=1./255)

train_generator = []
valid_generator = []
test_generator = []

print("Creating train generators")
for i in range(5):
    train_generator.append(Datagen.flow_from_directory(
        directory_train[i],
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode='categorical'))
print(train_generator[0].n // batch_size)

print("Creating validation generators")
for i in range(5):
    valid_generator.append(Gen.flow_from_directory(
        directory_validation[i],
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode='categorical'))
print(valid_generator[0].n // batch_size)

print("Creating test generators")
for i in range(5):
    test_generator.append(Gen.flow_from_directory(
        directory_test[i],
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode='categorical'))
```
```python
# Inception V3 with pre-trained weights
num_epochs = 100
risultati = []

for i in range(5):
    # Re-create the base model for every fold so that weights fine-tuned
    # on one fold do not leak into the next one.
    base_model = applications.InceptionV3(weights='imagenet',
                                          include_top=False,
                                          input_shape=(256, 256, 3))
    base_model.trainable = True

    model = Sequential()
    model.add(base_model)
    model.add(GlobalAveragePooling2D())
    model.add(Dense(1024, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(1024, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(100, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(21, activation='softmax'))
    model.summary()

    model.compile(loss='categorical_crossentropy',
                  optimizer=optimizers.SGD(1e-4, momentum=0.9),
                  metrics=['accuracy'])

    print(i)
    model.fit_generator(train_generator[i],
                        steps_per_epoch=train_generator[i].n // batch_size,
                        epochs=num_epochs,
                        validation_data=valid_generator[i],
                        validation_steps=valid_generator[i].n // batch_size,
                        shuffle=True)
    # Save one model per fold instead of overwriting the same file.
    model.save('sasaprova_fold{}.h5'.format(i))

    scores = model.evaluate_generator(test_generator[i])
    print(scores[1])
    risultati.append(scores[1] * 100)

print(np.mean(risultati))
```
Any suggestions?
Answer:
A few suggestions that may improve classification accuracy:
- Use an EfficientNet with noisy-student weights. It has fewer parameters to train and, thanks to its scalable architecture, tends to give better accuracy.
- Use test-time augmentation. In your test data generator, apply simple horizontal flips, vertical flips (if they make sense for your data), and affine transforms. This produces multiple views of each image and helps the model average out the most likely class.
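As a minimal sketch of that idea (the `tta_predict` helper and the particular set of views are assumptions, not part of your code), you can average predictions over a few deterministic flips:

```python
import numpy as np

def tta_predict(model, images):
    """Average predictions over simple test-time augmentations.

    `model` is any object with a `predict(batch)` method (e.g. a Keras
    model); `images` is a batch shaped (N, H, W, C). The views used here
    (identity, horizontal flip, vertical flip) are assumptions -- pick
    the ones that are plausible for your data.
    """
    views = [
        images,                  # original
        images[:, :, ::-1, :],   # horizontal flip
        images[:, ::-1, :, :],   # vertical flip
    ]
    preds = [np.asarray(model.predict(v)) for v in views]
    return np.mean(preds, axis=0)  # averaged class probabilities
```

At evaluation time you would call `tta_predict(model, batch)` instead of `model.predict(batch)` and take the argmax of the averaged probabilities.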
- Your training augmentation could be more comprehensive. Have a look at the imgaug library; random erasing and cut/mix strategies have also proven useful.
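As an illustration of one of those mixing strategies, here is a minimal NumPy sketch of MixUp; the `mixup_batch` name and the `alpha` default are assumptions to adapt:

```python
import numpy as np

def mixup_batch(x, y, alpha=0.2, rng=None):
    """Mix each sample with a randomly paired one from the same batch.

    x: images shaped (N, H, W, C); y: one-hot labels shaped (N, classes).
    `alpha` parameterizes the Beta distribution the mixing coefficient is
    drawn from; 0.2 is a common starting point, not a tuned value.
    """
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)        # single mixing coefficient
    perm = rng.permutation(len(x))      # random pairing of samples
    x_mix = lam * x + (1 - lam) * x[perm]
    y_mix = lam * y + (1 - lam) * y[perm]
    return x_mix, y_mix
```

You would apply this to each batch coming out of your training generator before feeding it to the model; the mixed one-hot targets still work with `categorical_crossentropy`.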
- Try label smoothing. It keeps the classifier from becoming over-confident about the training labels, which often generalizes better.
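Label smoothing only changes the targets, so it is easy to sketch; `smooth_labels` below is a hypothetical helper that takes `eps` of the probability mass and spreads it uniformly over all classes:

```python
import numpy as np

def smooth_labels(y_onehot, eps=0.1):
    """Replace hard 0/1 targets with eps-smoothed ones.

    With K classes, the true class gets 1 - eps + eps/K and every other
    class gets eps/K, so each row still sums to 1. eps=0.1 is a common
    default, not a tuned value.
    """
    k = y_onehot.shape[-1]
    return y_onehot * (1.0 - eps) + eps / k
```

With tf.keras you can get the same effect without touching your targets by passing `label_smoothing=0.1` to `tf.keras.losses.CategoricalCrossentropy`.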
- Try learning-rate warmup.
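A minimal warmup schedule for Keras might look like this (the `warmup_schedule` name and the linear ramp are assumptions; tune `base_lr` and `warmup_epochs` for your setup):

```python
def warmup_schedule(base_lr=1e-4, warmup_epochs=5):
    """Return an epoch -> learning-rate function with linear warmup.

    The learning rate ramps linearly from base_lr/warmup_epochs up to
    base_lr over the first `warmup_epochs` epochs, then stays constant.
    The returned function matches the (epoch, lr) signature expected by
    keras.callbacks.LearningRateScheduler.
    """
    def schedule(epoch, lr=None):
        if epoch < warmup_epochs:
            return base_lr * (epoch + 1) / warmup_epochs
        return base_lr
    return schedule
```

You would plug it into your training loop via `callbacks=[keras.callbacks.LearningRateScheduler(warmup_schedule())]`.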