How to extract false positives and false negatives from a multi-class confusion matrix

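I am classifying the MNIST data with the Keras code below. Using confusion_matrix from sklearn.metrics I obtained the confusion matrix, and with TruePositive = sum(numpy.diag(cm1)) I can get the true positives. But I am confused about how to get the true negatives, false positives, and false negatives. I read the solution here, but the user comments confused me. Please help me write code to get these values.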

    from sklearn.metrics import confusion_matrix
    import keras
    from keras.datasets import mnist
    from keras.models import Sequential
    from keras.layers import Dense, Dropout, Flatten
    from keras.layers import Conv2D, MaxPooling2D
    from keras import backend as K
    import numpy as np

    (x_train, y_train), (x_test, y_test) = mnist.load_data()

    batch_size = 128
    num_classes = 10
    epochs = 1
    img_rows, img_cols = 28, 28
    y_test1 = y_test

    if K.image_data_format() == 'channels_first':
        x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
        x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
        input_shape = (1, img_rows, img_cols)
    else:
        x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
        x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
        input_shape = (img_rows, img_cols, 1)

    x_train = x_train.astype('float32')
    x_test = x_test.astype('float32')
    x_train /= 255
    x_test /= 255

    y_train = keras.utils.to_categorical(y_train, num_classes)
    y_test = keras.utils.to_categorical(y_test, num_classes)

    model = Sequential()
    model.add(Conv2D(32, kernel_size=(3, 3),
                     activation='relu',
                     input_shape=input_shape))
    model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))
    model.add(Flatten())
    #model.add(GlobalAveragePooling2D())
    #model.add(GlobalMaxPooling2D())
    model.add(Dense(128, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(num_classes, activation='softmax'))

    model.compile(loss=keras.losses.binary_crossentropy,
                  optimizer=keras.optimizers.Adadelta(),
                  metrics=['accuracy'])

    model.fit(x_train, y_train,
              batch_size=batch_size,
              epochs=epochs,
              verbose=1,
              validation_data=(x_test, y_test))

    pre_cls = model.predict_classes(x_test)
    cm1 = confusion_matrix(y_test1, pre_cls)
    print('Confusion Matrix : \n', cm1)
    TruePositive = sum(np.diag(cm1))

Answer:

First, your code has some omissions; to get it to run, I had to add the following commands:

    import keras
    (x_train, y_train), (x_test, y_test) = mnist.load_data()

With these in place, and given the confusion matrix cm1:

    array([[ 965,    0,    1,    0,    0,    2,    6,    1,    5,    0],
           [   0, 1113,    4,    2,    0,    0,    3,    0,   13,    0],
           [   8,    0,  963,   14,    5,    1,    7,    8,   21,    5],
           [   0,    0,    3,  978,    0,    7,    0,    6,   12,    4],
           [   1,    0,    4,    0,  922,    0,    9,    3,    3,   40],
           [   4,    1,    1,   27,    0,  824,    6,    1,   20,    8],
           [  11,    3,    1,    1,    5,    6,  925,    0,    6,    0],
           [   2,    6,   17,    8,    2,    0,    1,  961,    2,   29],
           [   5,    1,    2,   13,    4,    6,    2,    6,  929,    6],
           [   6,    5,    0,    7,    5,    6,    1,    6,   10,  963]])

here is how you can get the requested TP, FP, FN, and TN for each class.

The true positives are simply the diagonal elements:

    TruePositive = np.diag(cm1)
    TruePositive
    # array([ 965, 1113,  963,  978,  922,  824,  925,  961,  929,  963])

The false positives are the sum of the respective column, minus the diagonal element:

    FalsePositive = []
    for i in range(num_classes):
        FalsePositive.append(sum(cm1[:,i]) - cm1[i,i])
    FalsePositive
    # [37, 16, 33, 72, 21, 28, 35, 31, 92, 92]

Similarly, the false negatives are the sum of the respective row, minus the diagonal element:

    FalseNegative = []
    for i in range(num_classes):
        FalseNegative.append(sum(cm1[i,:]) - cm1[i,i])
    FalseNegative
    # [15, 22, 69, 32, 60, 68, 33, 67, 45, 46]
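The two loops above can also be written without explicit iteration. The following is a minimal sketch on a small, made-up 3-class matrix (the matrix `cm`, its values, and the lowercase names `tp`, `fp`, `fn` are illustrative assumptions, not part of the MNIST run). It relies on scikit-learn's convention that rows are true classes and columns are predicted classes.

```python
import numpy as np

# Hypothetical 3-class confusion matrix
# (rows = true class, columns = predicted class)
cm = np.array([[5, 1, 0],
               [2, 6, 1],
               [0, 3, 7]])

tp = np.diag(cm)            # correct predictions per class
fp = cm.sum(axis=0) - tp    # column sums minus the diagonal
fn = cm.sum(axis=1) - tp    # row sums minus the diagonal

print(fp)  # [2 4 1]
print(fn)  # [1 3 3]
```

Swapping `axis=0` and `axis=1` would silently exchange FP and FN, so it is worth checking the row/column convention of whatever produced the matrix.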

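Now, the true negatives are slightly trickier to compute. Let's first think about what a true negative actually means, say for class 0: it means all the samples that have been correctly identified as not being 0. So what we should do is remove the corresponding row and column from the confusion matrix, and then sum up all the remaining elements:

    TrueNegative = []
    for i in range(num_classes):
        temp = np.delete(cm1, i, 0)   # delete ith row
        temp = np.delete(temp, i, 1)  # delete ith column
        TrueNegative.append(sum(sum(temp)))
    TrueNegative
    # [8983, 8849, 8935, 8918, 8997, 9080, 9007, 8941, 8934, 8899]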
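Deleting row i and column i removes exactly TP + FN + FP for class i, so the true negatives can equivalently be obtained as the grand total minus those three counts. The sketch below applies this shortcut to the confusion matrix shown earlier and cross-checks it against the delete-based approach; the lowercase variable names are my own and not from the original code.

```python
import numpy as np

# Confusion matrix from the answer (rows = true class, columns = predicted class)
cm1 = np.array([
    [965,    0,   1,   0,   0,   2,   6,   1,   5,   0],
    [  0, 1113,   4,   2,   0,   0,   3,   0,  13,   0],
    [  8,    0, 963,  14,   5,   1,   7,   8,  21,   5],
    [  0,    0,   3, 978,   0,   7,   0,   6,  12,   4],
    [  1,    0,   4,   0, 922,   0,   9,   3,   3,  40],
    [  4,    1,   1,  27,   0, 824,   6,   1,  20,   8],
    [ 11,    3,   1,   1,   5,   6, 925,   0,   6,   0],
    [  2,    6,  17,   8,   2,   0,   1, 961,   2,  29],
    [  5,    1,   2,  13,   4,   6,   2,   6, 929,   6],
    [  6,    5,   0,   7,   5,   6,   1,   6,  10, 963],
])

tp = np.diag(cm1)
fp = cm1.sum(axis=0) - tp
fn = cm1.sum(axis=1) - tp
tn = cm1.sum() - (tp + fp + fn)   # everything outside row i and column i

# Cross-check against removing row i and column i explicitly
tn_by_deletion = np.array([
    np.delete(np.delete(cm1, i, axis=0), i, axis=1).sum()
    for i in range(cm1.shape[0])
])
print(np.array_equal(tn, tn_by_deletion))  # True
```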

Let's do a sanity check: for each class, the sum of TP, FP, FN, and TN must equal the size of the test set (here 10,000). Let's confirm that this is indeed the case:

    l = len(y_test)
    for i in range(num_classes):
        print(TruePositive[i] + FalsePositive[i] + FalseNegative[i] + TrueNegative[i] == l)

The result is

    True
    True
    True
    True
    True
    True
    True
    True
    True
    True
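As a further cross-check, scikit-learn (version 0.21 and later) provides multilabel_confusion_matrix, which treats multi-class labels one-vs-rest and returns one 2x2 matrix per class laid out as [[TN, FP], [FN, TP]]. The sketch below compares it with the manual counts on a handful of made-up labels; `y_true`, `y_pred`, and the lowercase variable names are illustrative assumptions, not data from the MNIST run.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, multilabel_confusion_matrix

# Made-up labels for a 3-class problem (illustrative only)
y_true = np.array([0, 0, 1, 1, 1, 2, 2, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 2, 2, 0, 2])

# Manual per-class counts from the ordinary confusion matrix
cm = confusion_matrix(y_true, y_pred)
tp = np.diag(cm)
fp = cm.sum(axis=0) - tp
fn = cm.sum(axis=1) - tp
tn = cm.sum() - (tp + fp + fn)

# mcm[i] is the one-vs-rest matrix [[TN, FP], [FN, TP]] for class i
mcm = multilabel_confusion_matrix(y_true, y_pred)

print(np.array_equal(mcm[:, 0, 0], tn))  # True
print(np.array_equal(mcm[:, 0, 1], fp))  # True
print(np.array_equal(mcm[:, 1, 0], fn))  # True
print(np.array_equal(mcm[:, 1, 1], tp))  # True
```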
