How do I load a test set and evaluate a CNN on it in TensorFlow?

I am trying to train a CNN on a set of images. There are two folders, training_set and test_set, and each one contains two classes. They look like this:

    training_set/
        classA/
            img1.png
            img2.png
            ...
        classB/
            img1.png
            img2.png
            ...

    test_set/
        classA/
            img1.png
            img2.png
            ...
        classB/
            img1.png
            img2.png
            ...

The code looks like this, where the training set is split into training and validation subsets:

    import os
    import tensorflow as tf
    from tensorflow import keras
    from tensorflow.keras import layers
    from tensorflow.keras.preprocessing.image import ImageDataGenerator
    from tensorflow.python.client import device_lib
    import numpy as np
    import matplotlib.pyplot as plt

    print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
    print(device_lib.list_local_devices())

    # Set image properties
    img_height = 369
    img_width = 496
    batch_size = 32

    # Import data set from directory
    train_images = tf.keras.preprocessing.image_dataset_from_directory(
        "path_to_training_set",
        labels='inferred',
        label_mode="binary",  # not sure about this one though, as the classes are not called '0' and '1'
        class_names=['classA', 'classB'],
        color_mode='rgb',
        batch_size=batch_size,
        image_size=(img_height, img_width),
        shuffle=True,
        seed=123,
        validation_split=0.2,
        subset="training")

    val_images = tf.keras.preprocessing.image_dataset_from_directory(
        "path_to_training_set",
        labels='inferred',
        label_mode="binary",  # not sure about this one though, as the classes are not called '0' and '1'
        class_names=['classA', 'classB'],
        color_mode='rgb',
        batch_size=batch_size,
        image_size=(img_height, img_width),
        shuffle=True,
        seed=123,
        validation_split=0.2,
        subset="validation")

Then:

    img_height = 369
    img_width = 496
    epochs = 25

    model = tf.keras.Sequential()
    model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(img_height, img_width, 3)))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Conv2D(64, (3, 3), activation='relu'))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Conv2D(64, (3, 3), activation='relu'))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Flatten())
    model.add(layers.Dense(64, activation='relu'))
    # Since we have two classes:
    model.add(layers.Dense(1, activation='sigmoid'))

    # BinaryCrossentropy because there are 2 classes
    optimizer = tf.keras.optimizers.Adam(learning_rate=0.0001)
    model.compile(optimizer=optimizer,
                  loss=tf.keras.losses.BinaryCrossentropy(from_logits=False),
                  metrics=['accuracy'])

    # Feed the model (the datasets are already batched, so batch_size is not passed here)
    history = model.fit(train_images, epochs=epochs, verbose=1, validation_data=val_images)

    # Plot
    acc = history.history['accuracy']
    val_acc = history.history['val_accuracy']
    loss = history.history['loss']
    val_loss = history.history['val_loss']
    epochs_range = range(epochs)

    plt.figure(figsize=(8, 8))
    plt.subplot(1, 2, 1)
    plt.plot(epochs_range, acc, label='Training Accuracy')
    plt.plot(epochs_range, val_acc, label='Validation Accuracy')
    plt.legend(loc='lower right')
    plt.title('Training and Validation Accuracy')

    plt.subplot(1, 2, 2)
    plt.plot(epochs_range, loss, label='Training Loss')
    plt.plot(epochs_range, val_loss, label='Validation Loss')
    plt.legend(loc='upper right')
    plt.title('Training and Validation Loss')
    plt.show()

The model is now trained, and the plots show the training and validation accuracy and loss. I tried to load my test set with the following code:

    test_images = tf.keras.preprocessing.image_dataset_from_directory(
        "path_to_test_set",
        labels='inferred',
        label_mode="binary",
        class_names=['classA', 'classB'],
        color_mode='rgb',
        batch_size=batch_size,  # not really applicable as I want to use the whole set?
        image_size=(img_height, img_width),
        shuffle=True,
        seed=123,
        validation_split=None)

But is this the right way to do it? What should I do about batch_size? I think I would evaluate the model on my test set like this:

    test_loss, test_acc = model.evaluate(test_images, verbose=2)
    print('\nTest accuracy:', test_acc)

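But I don't think this is enough, because I also want accuracy, precision, recall, and the F1 score. I'm not even sure I'm doing this correctly (regarding how the test set is loaded).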

So basically: how do I load my test set and compute accuracy, precision, recall, and the F1 score?

Answer:

You need to iterate over the data; then you can collect the predictions and the true classes.

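    # Start from empty (0, 1) arrays so each batch of shape (batch, 1) can be appended
    predicted_probs = np.empty((0, 1))
    true_classes = np.empty((0, 1))

    for images, labels in test_images:
        # Run the model in inference mode and convert the output tensor to NumPy
        batch_probs = model(images, training=False).numpy()
        predicted_probs = np.concatenate([predicted_probs, batch_probs])
        true_classes = np.concatenate([true_classes, labels.numpy()])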

Since the outputs are sigmoid probabilities, you need a threshold to convert them into classes; here it is 0.5:

    predicted_classes = [1 * (x[0] >= 0.5) for x in predicted_probs]

After that you can get the confusion matrix, etc.:

    conf_matrix = tf.math.confusion_matrix(true_classes, predicted_classes)
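The question also asks for accuracy, precision, recall, and F1. As a rough sketch, these can be derived from the same true and predicted classes with plain NumPy. The helper name `binary_metrics` and the toy arrays are made up for illustration, and the code assumes that labels are 0 or 1 and that class 1 (the second entry in `class_names`, here `classB`) is treated as the positive class.

```python
import numpy as np

def binary_metrics(true_classes, predicted_classes):
    """Accuracy, precision, recall and F1 for 0/1 labels (class 1 = positive)."""
    # Flatten so that (N, 1) arrays and plain lists are handled the same way
    y_true = np.asarray(true_classes).astype(int).ravel()
    y_pred = np.asarray(predicted_classes).astype(int).ravel()

    # The four cells of the 2x2 confusion matrix
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))

    def safe_div(numerator, denominator):
        # Return 0.0 instead of dividing by zero (e.g. no positive predictions)
        return numerator / denominator if denominator else 0.0

    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "accuracy": safe_div(tp + tn, y_true.size),
        "precision": precision,
        "recall": recall,
        "f1": safe_div(2 * precision * recall, precision + recall),
    }

# Toy data standing in for true_classes / predicted_classes from the loop above
toy_true = np.array([[1.0], [0.0], [1.0], [1.0], [0.0]])
toy_pred = [1, 0, 0, 1, 1]
metrics = binary_metrics(toy_true, toy_pred)
print(metrics)
```

With the real data, the call would be `binary_metrics(true_classes, predicted_classes)`. If scikit-learn is installed, `sklearn.metrics.classification_report` reports the same quantities per class and may be more convenient.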
