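I'm learning neural networks and have implemented object classification on the CIFAR-10 dataset using the Keras library. Here is the network as I defined it in Keras: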

# Define the model and train it
model = Sequential()
model.add(Dense(units = 60, input_dim = 1024, activation = 'relu'))
model.add(Dense(units = 50, activation = 'relu'))
model.add(Dense(units = 60, activation = 'relu'))
model.add(Dense(units = 70, activation = 'relu'))
model.add(Dense(units = 30, activation = 'relu'))
model.add(Dense(units = 10, activation = 'sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
model.fit(X_train, y_train, epochs=50, batch_size=10000)

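So I have an input layer with an input dimension of 1024, i.e. (1024,) (each 32*32*3 image is first converted to grayscale, which gives 32*32 values), 5 hidden layers, and 1 output layer, as defined in the code above.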

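When I train my model for 50 epochs, I get an accuracy of 0.9, or 90%. Likewise, when I evaluate it on the test dataset, I also get about 90% accuracy. Here is the line of code that evaluates the model: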

print (model.evaluate(X_test, y_test))

This prints the following loss and accuracy:

[1.611809492111206, 0.8999999761581421]

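However, when I compute the accuracy manually by running a prediction on each test image, I get about 11% accuracy (almost the same as random guessing). Here is my code for computing the accuracy: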

wrong = 0
for x, y in zip(X_test, y_test):
  if not (np.argmax(model.predict(x.reshape(1, -1))) == np.argmax(y)):
    wrong += 1
print (wrong)

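This prints that 9002 of the 10000 predictions are wrong. So what am I missing here? Why are the two accuracies exact opposites of each other (100 - 89 = 11%)? Any intuitive explanation would help! Thanks.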

Edit:

Here is my code for processing the dataset:

# Process the training and testing data and make it Neural Network comfortable
# convert given colored image to grayscale
def rgb2gray(rgb):
  return np.dot(rgb, [0.2989, 0.5870, 0.1140])

X_train, y_train, X_test, y_test = [], [], [], []

def process_batch(batch_path, is_test = False):
  batch = unpickle(batch_path)
  imgs = batch[b'data']
  labels = batch[b'labels']
  for img in imgs:
    img = img.reshape(3,32,32).transpose([1, 2, 0])
    img = rgb2gray(img)
    img = img.reshape(1, -1)
    if not is_test:
      X_train.append(img)
    else:
      X_test.append(img)
  for label in labels:
    if not is_test:
      y_train.append(label)
    else:
      y_test.append(label)

process_batch('cifar-10-batches-py/data_batch_1')
process_batch('cifar-10-batches-py/data_batch_2')
process_batch('cifar-10-batches-py/data_batch_3')
process_batch('cifar-10-batches-py/data_batch_4')
process_batch('cifar-10-batches-py/data_batch_5')
process_batch('cifar-10-batches-py/test_batch', True)

number_of_classes = 10
number_of_batches = 5
number_of_test_batch = 1

X_train = np.array(X_train).reshape(meta_data[b'num_cases_per_batch'] * number_of_batches, -1)
print ('Shape of training data: {0}'.format(X_train.shape))

# create labels to one hot format
y_train = np.array(y_train)
y_train = np.eye(number_of_classes)[y_train]
print ('Shape of training labels: {0}'.format(y_train.shape))

# Process testing data
X_test = np.array(X_test).reshape(meta_data[b'num_cases_per_batch'] * number_of_test_batch, -1)
print ('Shape of testing data: {0}'.format(X_test.shape))

# create labels to one hot format
y_test = np.array(y_test)
y_test = np.eye(number_of_classes)[y_test]
print ('Shape of testing labels: {0}'.format(y_test.shape))

Answer:

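This happens because of the loss function you're using. You are using binary cross-entropy, whereas you should be using categorical cross-entropy. Binary cross-entropy is meant for two-label problems, where each output is an independent yes/no decision. Here you have 10 mutually exclusive labels, because this is the CIFAR-10 dataset.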

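The accuracy metric you display is actually misleading you, because it reports binary classification performance. With binary_crossentropy as the loss, Keras resolves 'accuracy' to binary accuracy: each of the 10 outputs is thresholded at 0.5 and compared with the one-hot label independently. A model that outputs values near zero everywhere therefore gets 9 of every 10 entries right, which is consistent with the 0.9 you see, even though it rarely picks the correct class. The solution is to retrain your model with categorical_crossentropy as the loss function (a softmax output layer is the usual pairing for that loss).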
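To see where a figure like 90% can come from, here is a small NumPy sketch (it is not Keras itself). It assumes a hypothetical degenerate model whose ten sigmoid outputs are all 0, and it assumes predictions are clipped to [1e-7, 1 - 1e-7] before taking logs, which is Keras's default backend epsilon. The labels are random stand-ins, not the real CIFAR-10 test labels.

```python
import numpy as np

rng = np.random.default_rng(0)
num_samples, num_classes = 10000, 10

# One-hot labels, built the same way as in the question.
labels = rng.integers(0, num_classes, size=num_samples)
y_true = np.eye(num_classes)[labels]

# Hypothetical degenerate model: every output is 0 for every sample.
y_pred = np.zeros((num_samples, num_classes))

# Binary accuracy: threshold each output at 0.5 and compare element-wise.
binary_acc = np.mean((y_pred > 0.5) == (y_true > 0.5))

# Categorical accuracy: compare the argmax of each row.
categorical_acc = np.mean(np.argmax(y_pred, axis=1) == np.argmax(y_true, axis=1))

# Binary cross-entropy with predictions clipped away from 0 and 1 (assumed eps).
eps = 1e-7
p = np.clip(y_pred, eps, 1 - eps)
bce = -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

print(binary_acc, categorical_acc, bce)
```

Under those assumptions the binary accuracy is exactly 0.9 and the loss comes out near 1.6118, both in line with what model.evaluate printed in the question, while the argmax accuracy stays at roughly 10%.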

This post has more details: Keras binary_crossentropy vs categorical_crossentropy performance?

Related: this post answers a different question, but the answer is essentially the same issue you are running into: Keras: model.evaluate vs model.predict accuracy difference in multi-class NLP task

Edit

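You mentioned in the comments that your model's accuracy hovers around 10% and does not improve. After looking at your Colab notebook and changing the loss to categorical cross-entropy, it appears that you are not normalizing the data. The pixel values start out as unsigned 8-bit integers and are converted to floating point when you build the training set, but with a dynamic range of [0, 255] your network has a hard time learning the right weights. When it tries to update the weights, the gradients are very small, so there is essentially no update and the network behaves as if it were guessing at random. The solution is to divide your training and test datasets by 255 before continuing: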

X_train /= 255.0
X_test /= 255.0

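This converts your data from a dynamic range of [0, 255] to [0, 1]. With the smaller dynamic range your model will be easier to train, and gradients should propagate without vanishing because of the large scale the data had before normalization. Your original model has a large number of dense layers, so the gradient updates most likely vanished because of the data's dynamic range, which is why the performance was poor to begin with.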
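A minimal sketch of that preprocessing step, using random uint8 arrays as a stand-in for the CIFAR-10 pixels (the `images` array and its size are made up for illustration). It uses `astype('float32') / 255.0` rather than the in-place `/=`, since the in-place form raises an error if the array still has an integer dtype.

```python
import numpy as np

def rgb2gray(rgb):
    # Same grayscale weights as in the question.
    return np.dot(rgb, [0.2989, 0.5870, 0.1140])

# Stand-in for real data: 4 random 32x32 RGB images with uint8 pixels.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)

# Grayscale conversion yields floats that still span roughly [0, 255].
X = rgb2gray(images).reshape(len(images), -1)   # shape (4, 1024)

# Rescale to [0, 1]; apply the same step to training and test data.
X = X.astype('float32') / 255.0

print(X.shape, X.dtype, X.min(), X.max())
```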

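When I ran your notebook, I got 37% accuracy. That is not unexpected for CIFAR-10 with only a fully connected (dense) network. Also, when you run your notebook now, the reported accuracy and the fraction of wrong examples agree.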
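The manual check from the question can also be done in one pass instead of calling predict once per image. The sketch below is plain NumPy; `probs` stands in for the output of a single `model.predict(X_test)` call and is filled with made-up numbers here, and `count_wrong` is a hypothetical helper name.

```python
import numpy as np

def count_wrong(probs, y_onehot):
    """Number of rows where the predicted class differs from the true class."""
    predicted = np.argmax(probs, axis=1)
    actual = np.argmax(y_onehot, axis=1)
    return int(np.sum(predicted != actual))

# Made-up stand-in for model.predict(X_test): 4 samples, 3 classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.3, 0.4],
                  [0.6, 0.3, 0.1]])
y_onehot = np.eye(3)[[0, 1, 2, 1]]

wrong = count_wrong(probs, y_onehot)
accuracy = 1 - wrong / len(probs)
print(wrong, accuracy)   # prints: 1 0.75
```

With a categorical loss, an accuracy computed this way should line up with the one reported by model.evaluate.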

If you want to improve the accuracy, I have a couple of suggestions:

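Actually include the color information. Each object in CIFAR-10 has a distinctive color profile, which should help tell the classes apart.
Add convolutional layers. I'm not sure where you are in your learning, but convolutional layers help learn and extract the right features from the image, so that the best features are presented to the dense layers and classification based on those features becomes more accurate. Right now you are classifying raw pixels, which is not advisable given that they can be noisy or unconstrained due to rotation, translation, skew, scale, and so on.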
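For the first suggestion, here is a minimal sketch of keeping the three color channels instead of collapsing them to grayscale. The `raw` array is a random stand-in for `batch[b'data']`; the reshape and transpose mirror the ones in the question's process_batch, applied to the whole batch at once.

```python
import numpy as np

# Stand-in for batch[b'data']: 5 rows of 3072 uint8 values each.
rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(5, 3072), dtype=np.uint8)

# Each row holds 1024 red values, then 1024 green, then 1024 blue.
# Reshape to (N, 3, 32, 32), then move channels last: (N, 32, 32, 3).
images = raw.reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1)

# Scale to [0, 1] as float32.
X_color = images.astype('float32') / 255.0

# Flattened variant with all three channels kept: (N, 3072).
X_flat = X_color.reshape(len(X_color), -1)

print(X_color.shape, X_flat.shape)
```

`X_flat` would need `input_dim = 3072` in the first Dense layer, while `X_color` has the (32, 32, 3) per-image shape that a convolutional layer would typically take as input.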
