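I adapted a simple CNN model from an Analytics Vidhya tutorial.

The problem is that my accuracy on the holdout set is no better than random. I am training on roughly 8,600 images of cats and dogs, which should be enough for a decent model, yet accuracy on the test set is only 49%. Is there an obvious omission in my code?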

import os
import numpy as np
import keras
from keras.models import Sequential
from sklearn.model_selection import train_test_split
from datetime import datetime
from PIL import Image
from keras.utils.np_utils import to_categorical
from sklearn.utils import shuffle


def main():
    cat=os.listdir("train/cats")
    dog=os.listdir("train/dogs")
    filepath="train/cats/"
    filepath2="train/dogs/"

    print("[INFO] Loading images of cats and dogs each...", datetime.now().time())
    #print("[INFO] Loading {} images of cats and dogs each...".format(num_images), datetime.now().time())
    images=[]
    label = []
    for i in cat:
        image = Image.open(filepath+i)
        image_resized = image.resize((300,300))
        images.append(image_resized)
        label.append(0) #for cat images

    for i in dog:
        image = Image.open(filepath2+i)
        image_resized = image.resize((300,300))
        images.append(image_resized)
        label.append(1) #for dog images

    images_full = np.array([np.array(x) for x in images])

    label = np.array(label)
    label = to_categorical(label)

    images_full, label = shuffle(images_full, label)

    print("[INFO] Splitting into train and test", datetime.now().time())
    (trainX, testX, trainY, testY) = train_test_split(images_full, label, test_size=0.25)

    filters = 10
    filtersize = (5, 5)

    epochs = 5
    batchsize = 32

    input_shape=(300,300,3)
    #input_shape = (30, 30, 3)

    print("[INFO] Designing model architecture...", datetime.now().time())
    model = Sequential()
    model.add(keras.layers.InputLayer(input_shape=input_shape))
    model.add(keras.layers.convolutional.Conv2D(filters, filtersize, strides=(1, 1), padding='same',
                                                data_format="channels_last", activation='relu'))
    model.add(keras.layers.MaxPooling2D(pool_size=(2, 2)))
    model.add(keras.layers.Flatten())

    model.add(keras.layers.Dense(units=2, input_dim=50,activation='softmax'))
    #model.add(keras.layers.Dense(units=2, input_dim=5, activation='softmax'))

    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    print("[INFO] Fitting model...", datetime.now().time())
    model.fit(trainX, trainY, epochs=epochs, batch_size=batchsize, validation_split=0.3)

    model.summary()

    print("[INFO] Evaluating on test set...", datetime.now().time())
    eval_res = model.evaluate(testX, testY)
    print(eval_res)


if __name__== "__main__":
    main()

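Answer:

I think the problem is the size of your network: you have a single Conv2D layer with only 10 filters. That is far too small to learn a deep representation of your images.

Try increasing this considerably by using blocks from a common architecture such as VGGnet!
For example, one block looks like this: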

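from keras.layers import BatchNormalization, Conv2D, Dropout, LeakyReLU, MaxPooling2D

# model_input is the Input(...) tensor of a functional-API model
x = Conv2D(32, (3, 3), padding='same')(model_input)
x = LeakyReLU(alpha=0.3)(x)
x = BatchNormalization()(x)
x = Conv2D(32, (3, 3), padding='same')(x)
x = LeakyReLU(alpha=0.3)(x)
x = BatchNormalization()(x)
x = MaxPooling2D(pool_size=(2, 2))(x)
x = Dropout(0.25)(x)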

You need to stack several blocks like this, increasing the number of filters as you go, to capture deeper features.
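To get a feel for what stacking such blocks does, here is a rough, framework-free sketch. It assumes the 300x300x3 input from the question and three blocks with 32, 64 and 128 filters (the 64 and 128 are my choice for illustration, not a recommendation). It also assumes 3x3 'same' convolutions with stride 1, and it counts only convolution weights and biases, leaving out the BatchNormalization parameters. The helper names are made up for this example.

```python
def conv_params(kernel, channels_in, filters):
    """Weights plus biases of one Conv2D layer with a square kernel."""
    return (kernel * kernel * channels_in + 1) * filters


def trace_blocks(size, channels, block_filters, kernel=3):
    """Follow the feature map through blocks of two convs and one 2x2 max-pool."""
    rows = []
    for filters in block_filters:
        # First conv maps channels -> filters, second conv maps filters -> filters.
        params = conv_params(kernel, channels, filters) + conv_params(kernel, filters, filters)
        # 'same' convs with stride 1 keep the size; the 2x2 pool halves it (rounding down).
        size = size // 2
        channels = filters
        rows.append((filters, size, params))
    return rows


blocks = trace_blocks(size=300, channels=3, block_filters=[32, 64, 128])
for filters, size, params in blocks:
    print(f"{filters:>3} filters -> {size}x{size}x{filters} feature map, {params} conv parameters")
```

The spatial size shrinks with every block while the number of channels grows, which is the usual pattern in VGG-style networks.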

Another point: you do not need to specify input_dim for the Dense layer, Keras handles that automatically!
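As a quick check of why input_dim=50 is not needed (and does not match anyway), the size that Dense actually receives follows from the layers before it. The arithmetic below is only a sketch for the model in the question: one 'same' convolution with stride 1 and 10 filters on a 300x300 image, followed by a 2x2 max-pool and Flatten.

```python
height, width, filters = 300, 300, 10          # conv output: 'same' padding, stride 1
pooled_h, pooled_w = height // 2, width // 2   # after the 2x2 max-pool
flat_features = pooled_h * pooled_w * filters  # what Flatten hands to the Dense layer

dense_units = 2
dense_params = (flat_features + 1) * dense_units  # weights plus one bias per unit

print(flat_features)  # 225000
print(dense_params)   # 450002
```

So the final layer sees 225,000 inputs, not 50, and Keras works that number out from the preceding layer's output shape.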

Last but not least, to classify your images properly you need a fully connected network on top, not just a single layer.

For example:

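from keras.layers import Activation, Dense, Flatten, LeakyReLU

# x is the output of the last convolutional block
x = Flatten()(x)
x = Dense(256)(x)
x = LeakyReLU(alpha=0.3)(x)
x = Dense(128)(x)
x = LeakyReLU(alpha=0.3)(x)
x = Dense(2)(x)
x = Activation('softmax')(x)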

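Try these changes and let me know how it goes!

Update after the OP's question

Images are complex: they contain a lot of information, such as shapes, edges, colors, and so on.

To capture as much of that information as possible, you need to pass the image through several convolutional layers, each of which learns a different aspect of it. Imagine that the first convolutional layer learns to recognize squares, the second circles, the third edges, and so on.

As for my second point, the final fully connected layers act as a classifier. The convolutional network outputs a vector that "represents" a dog or a cat, and you then have to learn which class such a vector belongs to.
Feeding that vector directly into the final layer is not enough to learn this mapping.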
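To make the "classifier on top of a feature vector" idea concrete, here is a minimal NumPy sketch of such a head: one hidden layer followed by a two-way softmax. Everything in it is an assumption for illustration: the 64-dimensional feature vectors, the 16 hidden units, the plain ReLU (the blocks above use LeakyReLU), and the random, untrained weights.

```python
import numpy as np

rng = np.random.default_rng(0)


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def classifier_head(features, w_hidden, b_hidden, w_out, b_out):
    """Hidden layer with ReLU, then a softmax over the two classes."""
    hidden = np.maximum(features @ w_hidden + b_hidden, 0.0)
    return softmax(hidden @ w_out + b_out)


# Stand-in for the flattened convolutional output of 4 images.
features = rng.normal(size=(4, 64))
w_hidden = rng.normal(scale=0.1, size=(64, 16))
b_hidden = np.zeros(16)
w_out = rng.normal(scale=0.1, size=(16, 2))
b_out = np.zeros(2)

probs = classifier_head(features, w_hidden, b_hidden, w_out, b_out)
print(probs.shape)        # one row per image, one column per class
print(probs.sum(axis=1))  # each row sums to 1
```

With a hidden layer in between, the head can learn a non-linear decision boundary over the features, which a single softmax layer cannot.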

Is that clearer?

Final update after the OP's second comment

Here are two ways to define a Keras model; both give the same result!

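from keras.layers import Activation, Dense, Input
from keras.models import Model, Sequential

# Functional API
model_input = Input(shape=(200, 1))
x = Dense(32)(model_input)
x = Dense(16)(x)
x = Activation('relu')(x)
model = Model(inputs=model_input, outputs=x)

# Sequential API
model = Sequential()
model.add(Dense(32, input_shape=(200, 1)))
model.add(Dense(16, activation = 'relu'))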

Architecture example

# input_shape is (300, 300, 3), as defined in the question's code
model = Sequential()
model.add(keras.layers.InputLayer(input_shape=input_shape))

model.add(keras.layers.Conv2D(32, (3,3), strides=(2, 2), padding='same', activation='relu'))
model.add(keras.layers.Conv2D(32, (3,3), strides=(2, 2), padding='same', activation='relu'))
model.add(keras.layers.MaxPooling2D(pool_size=(2, 2)))

model.add(keras.layers.Conv2D(64, (3,3), strides=(2, 2), padding='same', activation='relu'))
model.add(keras.layers.Conv2D(64, (3,3), strides=(2, 2), padding='same', activation='relu'))
model.add(keras.layers.MaxPooling2D(pool_size=(2, 2)))

model.add(keras.layers.Flatten())

model.add(keras.layers.Dense(128, activation='relu'))
model.add(keras.layers.Dense(2, activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
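If you want to see how quickly this architecture shrinks a 300x300 image, the spatial size can be traced by hand. The sketch below uses the standard output-size rules ('same' convolution: ceil(size / stride); 2x2 pooling with its default 'valid' padding: floor(size / 2)). The helper names are made up for this example.

```python
import math


def conv_same(size, stride):
    """Output size of a 'same'-padded convolution."""
    return math.ceil(size / stride)


def pool_valid(size, pool):
    """Output size of max-pooling with stride equal to the pool size."""
    return size // pool


size = 300
trace = [size]
for layer in ["conv", "conv", "pool", "conv", "conv", "pool"]:
    size = conv_same(size, 2) if layer == "conv" else pool_valid(size, 2)
    trace.append(size)

flat_features = size * size * 64  # the last conv layers have 64 filters

print(trace)          # [300, 150, 75, 37, 19, 10, 5]
print(flat_features)  # 1600
```

Striding every convolution and pooling as well is fairly aggressive: the Dense layer ends up with only 1,600 inputs. If accuracy stalls, using strides=(1, 1) in some of the convolutions is one thing you could try; treat that as a judgement call rather than a rule.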

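Do not forget to normalize your data before feeding it to the network.

Simply applying images_full = images_full / 255.0 to your data can improve your accuracy substantially.
You could also try grayscale images, which are cheaper to compute.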
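A minimal sketch of both steps, using a random stand-in batch instead of your real images_full array. The grayscale weights are the common ITU-R 601 luma coefficients, which is an assumption on my part; any reasonable RGB-to-gray conversion would do.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for the real data: 8 RGB images of 300x300 pixels with values 0..255.
images_full = rng.integers(0, 256, size=(8, 300, 300, 3), dtype=np.uint8)

# Scale pixel values to [0, 1]; float32 keeps memory use moderate.
images_scaled = images_full.astype(np.float32) / 255.0

# Optional: collapse RGB to one luminance channel, keeping a channel axis
# so the array still has the 4 dimensions Conv2D expects.
luma = np.array([0.299, 0.587, 0.114], dtype=np.float32)
images_gray = (images_scaled @ luma)[..., np.newaxis]

print(images_scaled.dtype, images_scaled.min(), images_scaled.max())
print(images_gray.shape)  # (8, 300, 300, 1)
```

If you switch to grayscale, remember to change input_shape to (300, 300, 1) as well.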
