ValueError: bad input shape (2, 256, 3) when trying to compute a ROC curve

First of all, I am new to Python. While trying to build a ROC curve, I get an error on this line:

fpr_keras, tpr_keras, thresholds_keras = roc_curve(Y_test.argmax(axis=1), decoded_imgs.argmax(axis=1))

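The error is:

ValueError: bad input shape (2, 256, 3)

When I tried to fix the shape by reshaping, I got a second error:

TypeError: 'tuple' object is not callable

I looked at this link, but I don't understand what I should do, and I am stuck on this problem. Could someone edit my code? What I am trying to do is: link 2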

import keras
import numpy as np
from keras.datasets import mnist
from get_dataset import get_dataset
from stack import keras_model

X_train, X_test, Y_train, Y_test = get_dataset()

from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D, Dense
from keras.models import Model

input_img = Input(shape=(256, 256, 3))

x = Conv2D(32, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)

x = Conv2D(64, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(3, (3, 3), activation='sigmoid', padding='same')(x)

autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='rmsprop', loss='mae', metrics=['mse', 'accuracy'])

from keras.callbacks import ModelCheckpoint, TensorBoard
checkpoints = []

from keras.preprocessing.image import ImageDataGenerator
generated_data = ImageDataGenerator(
    featurewise_center=False, samplewise_center=False,
    featurewise_std_normalization=False, samplewise_std_normalization=False,
    zca_whitening=False, rotation_range=0,
    width_shift_range=0.1, height_shift_range=0.1,
    horizontal_flip=True, vertical_flip=False)
generated_data.fit(X_train)

epochs = 1
batch_size = 5

autoencoder.fit_generator(
    generated_data.flow(X_train, X_train, batch_size=batch_size),
    steps_per_epoch=X_train.shape[0]/batch_size, epochs=epochs,
    validation_data=(X_test, X_test),
    callbacks=[TensorBoard(log_dir='/tmp/autoencoder')])

autoencoder.fit(X_train, X_train, batch_size=batch_size, epochs=epochs,
                validation_data=(X_test, X_test), shuffle=True,
                callbacks=[TensorBoard(log_dir='/tmp/auti')])

decoded_imgs = autoencoder.predict(X_test)

from sklearn.metrics import roc_curve
# 2 256 3
print(decoded_imgs.argmax(axis=1))
print(decoded_imgs.argmax(axis=1).reshape(1,3))
fpr_keras, tpr_keras, thresholds_keras = roc_curve(Y_test.argmax(axis=1), decoded_imgs.argmax(axis=1))

ValueError: bad input shape (2, 256, 3)

After editing the line to:

fpr_keras, tpr_keras, thresholds_keras = roc_curve(Y_test.argmax(axis=1), decoded_imgs.reshape(6,256,1))

I got this error:

ValueError: Found input variables with inconsistent numbers of samples: [2, 4]

Answer:

You seem to be confused about the basics of both ROC curves and autoencoders...

Quoting from the scikit-learn documentation of roc_curve:

roc_curve (y_true, y_score, pos_label=None, sample_weight=None, drop_intermediate=True)

Parameters:

y_true : array, shape = [n_samples]

True binary labels. If labels are not either {-1, 1} or {0, 1}, then pos_label should be explicitly given.

y_score : array, shape = [n_samples]

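Target scores, can either be probability estimates of the positive class, confidence values, or non-thresholded measure of decisions (as returned by "decision_function" on some classifiers).

In other words, both inputs to roc_curve should be one-dimensional arrays of scalar numeric values, the first holding the true classes and the second holding the predicted scores.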
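As a minimal sketch of what a valid call looks like, the snippet below passes two one-dimensional arrays of equal length to roc_curve. The label and score values are made up for illustration and do not come from the question's data.

```python
import numpy as np
from sklearn.metrics import roc_curve

# Hypothetical data: four samples, binary labels, one score per sample.
y_true = np.array([0, 0, 1, 1])             # shape (4,), true classes
y_score = np.array([0.1, 0.4, 0.35, 0.8])   # shape (4,), predicted scores

fpr, tpr, thresholds = roc_curve(y_true, y_score)

# The three outputs are 1-D arrays of the same length, one entry per threshold.
print(fpr.shape, tpr.shape, thresholds.shape)
```

Both arguments have shape (n_samples,), which is exactly what the documentation quoted above asks for.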

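Now, although you do not show a sample of your own data, I do not doubt that your Y_test.argmax(axis=1) may comply with this specification; your decoded_imgs.argmax(axis=1), however, most certainly does not, no matter how you reshape it. Why? Because of the very nature of autoencoders.

Unlike the random forest classifier from the example you are trying to follow in your code, autoencoders are not classifiers: their job is to reconstruct (denoised, compressed, etc.) versions of their input, not to produce class predictions (see the tutorial on building autoencoders on the Keras blog for a quick overview). This means that your decoded_imgs are in fact transformed images (or whatever image-like data you feed in), not the class scores that roc_curve expects, hence the error (technically speaking, the error arises because decoded_imgs is not a one-dimensional array, but hopefully you get the idea).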
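To see where the shape (2, 256, 3) in the error message comes from, here is a small sketch that uses a random array as a stand-in for the autoencoder output. It assumes two test images of 256x256x3, which is inferred from the model's input shape and the error message, and it uses made-up labels in place of Y_test.argmax(axis=1).

```python
import numpy as np
from sklearn.metrics import roc_curve

# Stand-in for autoencoder.predict(X_test): 2 reconstructed RGB images (assumed).
decoded_imgs = np.random.rand(2, 256, 256, 3)

# argmax over axis 1 removes that axis but leaves a 3-D array of pixel-row indices.
scores = decoded_imgs.argmax(axis=1)
print(scores.shape)  # (2, 256, 3)

# Hypothetical labels, one per test image.
y_true = np.array([0, 1])

try:
    roc_curve(y_true, scores)
except ValueError as err:
    # The exact wording of the message depends on the scikit-learn version.
    print("roc_curve rejected the input:", err)
```

No reshape can turn these reconstructed pixel values into one score per sample, which is why the later attempts only change the error message.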

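Even if you had used a classifier here instead of an autoencoder, you would have run into another issue: ROC curves are meant for binary classification tasks, not for multi-class ones such as MNIST (there are in fact some approaches for applying them to multi-class data, but as far as I know they are not widely used). Admittedly, on the surface, scikit-learn's roc_curve seems to work even in a multi-class setting: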

import numpy as np
from sklearn import metrics
y = np.array([0, 1, 1, 2, 2])  # 3-class problem
scores = np.array([0.05, 0.1, 0.4, 0.35, 0.8])
fpr, tpr, thresholds = metrics.roc_curve(y, scores, pos_label=2)  # works fine, no error

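But in reality this works only because we have explicitly set pos_label=2; behind the scenes, scikit-learn therefore treats all labels other than 2 as negative and carries out the rest of the computation as if the problem were binary (i.e. class 2 versus all other classes).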
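A quick way to check this is to binarize the labels by hand and compare the two curves. The sketch below reuses the arrays from the 3-class example; the name y_binary is introduced here purely for illustration.

```python
import numpy as np
from sklearn import metrics

y = np.array([0, 1, 1, 2, 2])  # 3-class labels
scores = np.array([0.05, 0.1, 0.4, 0.35, 0.8])

# Multi-class labels with an explicit positive class.
fpr, tpr, _ = metrics.roc_curve(y, scores, pos_label=2)

# The same problem, binarized by hand: class 2 versus everything else.
y_binary = (y == 2).astype(int)
fpr_b, tpr_b, _ = metrics.roc_curve(y_binary, scores)

# Both calls describe the same curve.
print(np.allclose(fpr, fpr_b), np.allclose(tpr, tpr_b))
```

The curve therefore says nothing about how well classes 0 and 1 are separated from each other; it only describes class 2 against the rest.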

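In your case (MNIST), you should ask yourself: which exactly is the "positive" class in the 10-class MNIST dataset? And does the question even make sense? Hopefully you can convince yourself that the answer is not as straightforward as in the binary (0/1) case.

To summarize: there is no programming error to fix here; the root cause of your problem is simply that you are attempting something meaningless and invalid, since autoencoders do not produce class predictions and hence their output cannot be used to compute a ROC curve. I kindly suggest that you first get a solid grasp of the relevant concepts and notions before moving on to applications...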
