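I have a trained model and want to run binary classification on the images in a directory. There are more than 100,000 images, so I want to predict in batches for efficiency. How can I run batch prediction on my images, get the predictions, and store the images in two separate folders according to their predicted class?
Here is my code so far: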
    model_filepath = r"C:\Users\model_200.h5"
    model = tf.keras.models.load_model(model_filepath)

    test_dir = r"C:\Users\image_testing_folder"
    batch_size = 64
    IMG_HEIGHT = 200
    IMG_WIDTH = 200

    test_image_generator = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255)
    test_image_gen = test_image_generator.flow_from_directory(
        directory=str(test_dir),
        batch_size=batch_size,
        shuffle=False,
        target_size=(IMG_HEIGHT, IMG_WIDTH),
    )

    predictions = (model.predict(test_image_gen) > 0.5).astype("int32")
    predictions
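One solution would be to associate each prediction with its image file path and then use shutil.move() to move the original image into the target folder. How can I do that? And is there a better way to run batch prediction than ImageDataGenerator with .flow_from_directory?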
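Answer:
You can build a custom dataset with tf.data that returns the file path alongside each image, so the file names are easy to retrieve: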
    import os
    from shutil import copy

    import numpy as np
    import tensorflow as tf
    from glob2 import glob
    from tensorflow.keras import Sequential
    from tensorflow.keras.layers import Conv2D, Dense, Dropout, Flatten, MaxPooling2D

    files = glob('group1\\*\\*.jpg')
    imsize = 64

    def load(file_path):
        # Return the path together with the image so each prediction
        # can be traced back to its file.
        img = tf.io.read_file(file_path)
        img = tf.image.decode_jpeg(img, channels=3)  # the files are .jpg
        img = tf.image.convert_image_dtype(img, tf.float32)  # scales to [0, 1]
        img = tf.image.resize(img, size=(imsize, imsize))
        return img, file_path

    # take(100) and shuffle(100) only keep this demo small;
    # drop them to process every file in order.
    ds = tf.data.Dataset.from_tensor_slices(files).\
        take(100).\
        shuffle(100).\
        map(load).batch(4)

    # Untrained stand-in model; replace it with your own loaded model.
    model = Sequential()
    model.add(Conv2D(8, (3, 3), input_shape=(imsize, imsize, 3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Flatten())
    model.add(Dense(units=32, activation='relu'))
    model.add(Dropout(0.5))
    # softmax pairs with categorical_crossentropy for two output units
    model.add(Dense(units=2, activation='softmax'))
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    model.build(input_shape=(None, imsize, imsize, 3))  # leading None is the batch dimension

    categories = np.array(['cats', 'dogs'])
    target_dir = 'newpics'

    for cat in categories:
        os.makedirs(os.path.join(target_dir, cat), exist_ok=True)

    for images, filenames in ds:
        preds = model(images)
        targets = categories[np.argmax(preds, axis=1)]
        for file, destination in zip(filenames, targets):
            src = file.numpy().decode()
            dst = os.path.join(target_dir, destination, os.path.basename(src))
            copy(src, dst)
            print(src, '-->', dst)
    group1\cats\cat.4051.jpg --> newpics\cats\cat.4051.jpg
    group1\cats\cat.4091.jpg --> newpics\dogs\cat.4091.jpg
    group1\cats\cat.4055.jpg --> newpics\cats\cat.4055.jpg
    group1\cats\cat.4041.jpg --> newpics\cats\cat.4041.jpg
    group1\cats\cat.4090.jpg --> newpics\cats\cat.4090.jpg
    group1\cats\cat.4071.jpg --> newpics\dogs\cat.4071.jpg
    group1\cats\cat.4082.jpg --> newpics\cats\cat.4082.jpg
    group1\cats\cat.4037.jpg --> newpics\cats\cat.4037.jpg
    group1\cats\cat.4005.jpg --> newpics\cats\cat.4005.jpg
To use this with your own data, you only need to change the glob pattern and the folder names.
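The model in the question has a single sigmoid output thresholded at 0.5, not two output units, so the argmax step does not carry over directly. The following is a minimal sketch of the sorting step for that case, under a few assumptions: `sort_files_by_prediction` and the class names `class_0`/`class_1` are hypothetical names chosen for illustration, and `probs` is assumed to hold one probability per file in the same order as `filepaths`.

```python
import os
import shutil


def sort_files_by_prediction(filepaths, probs, target_dir,
                             class_names=("class_0", "class_1"),
                             threshold=0.5, move=False):
    """Copy (or move) each file into target_dir/<class name>.

    probs holds one sigmoid probability per file, in the same order as
    filepaths; each entry may be a bare float or a length-1 row.
    """
    if len(filepaths) != len(probs):
        raise ValueError("filepaths and probs must have the same length")

    transfer = shutil.move if move else shutil.copy
    for name in class_names:
        os.makedirs(os.path.join(target_dir, name), exist_ok=True)

    destinations = []
    for path, prob in zip(filepaths, probs):
        # A (N, 1) prediction array yields length-1 rows; unwrap the value.
        value = float(prob[0]) if hasattr(prob, "__len__") else float(prob)
        label = class_names[1] if value > threshold else class_names[0]
        dest = os.path.join(target_dir, label, os.path.basename(path))
        transfer(path, dest)
        destinations.append(dest)
    return destinations
```

With the generator from the question, `probs` would be the raw output of `model.predict(test_image_gen)`, and the paths should be available as `test_image_gen.filepaths`, provided `shuffle=False` so that both stay in the same order. Files are placed by base name only, so two source files with the same name in different subfolders would overwrite each other. Pass `move=True` to use `shutil.move` instead of copying.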