How do I load a test set and evaluate a CNN on it in TensorFlow?

I am trying to train a CNN on a set of images. There are two folders, training_set and test_set, and each folder contains two classes. They look like this:

training_set/
    classA/
        img1.png
        img2.png
        ...
    classB/
        img1.png
        img2.png
        ...
test_set/
    classA/
        img1.png
        img2.png
        ...
    classB/
        img1.png
        img2.png
        ...

The code looks like this, with the training set split into training and validation subsets:

import os
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.python.client import device_lib
import numpy as np
import matplotlib.pyplot as plt

print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
print(device_lib.list_local_devices())

# Set image properties
img_height = 369
img_width = 496
batch_size = 32

# Import data set from directory
train_images = tf.keras.preprocessing.image_dataset_from_directory(
    "path_to_training_set",
    labels='inferred',
    label_mode="binary",  # not sure about this one though, as the classes are not called '0' and '1'
    class_names=['classA', 'classB'],
    color_mode='rgb',
    batch_size=batch_size,
    image_size=(img_height, img_width),
    shuffle=True,
    seed=123,
    validation_split=0.2,
    subset="training")

val_images = tf.keras.preprocessing.image_dataset_from_directory(
    "path_to_training_set",
    labels='inferred',
    label_mode="binary",  # not sure about this one though, as the classes are not called '0' and '1'
    class_names=['classA', 'classB'],
    color_mode='rgb',
    batch_size=batch_size,
    image_size=(img_height, img_width),
    shuffle=True,
    seed=123,
    validation_split=0.2,
    subset="validation")

Then:

from matplotlib import pyplot

img_height = 369
img_width = 496
epochs = 25

model = tf.keras.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(img_height, img_width, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
# Since we have two classes:
model.add(layers.Dense(1, activation='sigmoid'))

# BinaryCrossentropy because there are 2 classes
optimizer = tf.keras.optimizers.Adam(learning_rate=0.0001)
model.compile(optimizer=optimizer,
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=False),
              metrics=['accuracy'])

# Feed the model
# (train_images is already batched by image_dataset_from_directory,
# so batch_size is not passed to fit)
history = model.fit(train_images, epochs=epochs, verbose=1, validation_data=val_images)

# Plot
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs_range = range(epochs)

plt.figure(figsize=(8, 8))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()

The model is now trained, and the plots show the training and validation accuracy and loss. I tried to load my test set with the following code:

test_images = tf.keras.preprocessing.image_dataset_from_directory(
    "path_to_test_set",
    labels='inferred',
    label_mode="binary",
    class_names=['classA', 'classB'],
    color_mode='rgb',
    batch_size=batch_size,  # not really applicable as I want to use the whole set?
    image_size=(img_height, img_width),
    shuffle=True,
    seed=123,
    validation_split=None)

But is this the right way to do it? What should I do about batch_size? I think I would evaluate the model on my test set like this:

test_loss, test_acc = model.evaluate(test_images, verbose=2)
print('\nTest accuracy:', test_acc)

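But I don't think that is enough, because I also want accuracy, precision, recall, and F1 score. I'm not even sure I did this part correctly (regarding how the test set is loaded).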

So basically: how do I load my test set and compute accuracy, precision, recall, and F1 score?


Answer:

You need to iterate over the data; as you go, you can collect the predictions and the true classes.

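predicted_probs = []
true_classes = []
for images, labels in test_images:
    # model(images) returns a (batch, 1) tensor of sigmoid outputs
    predicted_probs.append(model(images, training=False).numpy())
    # with label_mode="binary" the labels also have shape (batch, 1)
    true_classes.append(labels.numpy())

predicted_probs = np.concatenate(predicted_probs)      # shape (n_samples, 1)
true_classes = np.concatenate(true_classes).ravel()    # shape (n_samples,)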
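On the batch_size question: the loop above walks the whole test set regardless of how it is batched, so the batch size should only affect memory use, not the collected results. And because predictions and labels come out of the same iteration, they stay paired even with shuffle=True. A minimal NumPy-only sketch of that pattern follows; fake_model, collect, and the synthetic data are hypothetical stand-ins for illustration, not part of the original code.

```python
import numpy as np

def fake_model(images):
    """Hypothetical stand-in for model(images): one score in [0, 1] per image, shape (batch, 1)."""
    return images.reshape(len(images), -1).mean(axis=1, keepdims=True)

def collect(images, labels, batch_size):
    """Iterate in batches and gather predictions and labels from the same loop."""
    probs, trues = [], []
    for start in range(0, len(images), batch_size):
        batch_x = images[start:start + batch_size]
        batch_y = labels[start:start + batch_size]
        probs.append(fake_model(batch_x))
        trues.append(batch_y)
    # Join the per-batch pieces into arrays covering the whole set
    return np.concatenate(probs), np.concatenate(trues).ravel()

rng = np.random.default_rng(0)
images = rng.random((10, 4, 4, 3))  # 10 tiny synthetic "images"
labels = rng.integers(0, 2, size=(10, 1)).astype("float32")

probs_small, true_small = collect(images, labels, batch_size=3)
probs_full, true_full = collect(images, labels, batch_size=10)

# The batch size changes how many steps the loop takes, not what is collected
print(np.allclose(probs_small, probs_full), np.array_equal(true_small, true_full))
```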

Since these are sigmoid outputs, you need a threshold to convert them into classes, here 0.5:

predicted_classes = [1 * (x[0]>=0.5) for x in predicted_probs]
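The same thresholding can also be written as a single vectorized NumPy expression. The sketch below assumes predicted_probs is an array of shape (n_samples, 1); the values are made up for illustration.

```python
import numpy as np

# Made-up sigmoid outputs, shape (n_samples, 1)
predicted_probs = np.array([[0.1], [0.5], [0.73], [0.49]])
threshold = 0.5

# Compare the single output column against the threshold and cast booleans to 0/1
predicted_classes = (predicted_probs[:, 0] >= threshold).astype(int)
print(predicted_classes)  # [0 1 1 0]
```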

After that you can get the confusion matrix and so on:

conf_matrix = tf.math.confusion_matrix(true_classes, predicted_classes)
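For two classes, tf.math.confusion_matrix puts the true labels on the rows and the predictions on the columns, so the matrix reads [[TN, FP], [FN, TP]] with classB (label 1) as the positive class. Accuracy, precision, recall, and F1 follow from those four counts. Here is a rough NumPy-only sketch working directly from the label arrays; binary_metrics is a hypothetical helper and the example labels are invented.

```python
import numpy as np

def binary_metrics(true_classes, predicted_classes):
    """Accuracy, precision, recall and F1 for 0/1 labels (class 1 is 'positive')."""
    y_true = np.asarray(true_classes).astype(int).ravel()
    y_pred = np.asarray(predicted_classes).astype(int).ravel()

    tp = int(np.sum((y_true == 1) & (y_pred == 1)))  # true positives
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))  # true negatives
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))  # false positives
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))  # false negatives

    accuracy = (tp + tn) / len(y_true)
    # Fall back to 0.0 when a denominator is zero (e.g. no positive predictions)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Invented labels: 2 TP, 2 TN, 1 FP, 1 FN
metrics = binary_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
print(metrics)
```

If scikit-learn is installed, sklearn.metrics.classification_report(true_classes, predicted_classes) reports the same quantities per class.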
