Suppose I have one or more tiles, each consisting of a single texture pattern (materials such as wood, concrete, gravel, and so on). I want to train my classifier on these tiles and then use the trained classifier to determine the class of each pixel in another image.
Here are two examples of the tiles I would like to use to train the classifier:
Now suppose I want to segment the image below to identify which pixels belong to the door and which to the wall. This is just an example; I am aware that the patterns in this image are not exactly the same as those in the tiles above:
Is a convolutional neural network necessary for this particular problem, or is there a way to achieve my goal with a shallow neural network or some other classifier combined with texture features?
I have already implemented a classifier with Scikit-learn that treats the tile pixels individually (see the code below, where training_data is a column vector of single pixel values), but I would like to train the classifier to recognize texture patterns instead.
# Train the classifier
classifier = SGDClassifier()
classifier.fit(training_data, training_target)

# Classify a given image
test_data = image_gray.flatten().reshape((-1, 1))
predictions = classifier.predict(test_data)
image_classified = predictions.reshape(image_gray.shape)
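One shallow alternative, shown here only as a hedged sketch (not part of the original post): keep the same SGDClassifier but train it on small flattened patches instead of single pixels, so that each sample carries local texture. The patch size, the helper function, and the tile variable names below are illustrative assumptions.

import numpy as np
from sklearn.feature_extraction.image import extract_patches_2d
from sklearn.linear_model import SGDClassifier

PATCH = 9  # illustrative patch size; tune for your textures

def tile_to_patches(tile_gray, label, max_patches=2000):
    """Sample flattened PATCH x PATCH patches from one single-texture tile."""
    patches = extract_patches_2d(tile_gray, (PATCH, PATCH),
                                 max_patches=max_patches, random_state=0)
    X = patches.reshape(len(patches), -1)
    y = np.full(len(patches), label)
    return X, y

# wood_tile / concrete_tile are grayscale arrays (assumed loaded elsewhere)
X0, y0 = tile_to_patches(wood_tile, 0)
X1, y1 = tile_to_patches(concrete_tile, 1)

classifier = SGDClassifier()
classifier.fit(np.vstack([X0, X1]), np.concatenate([y0, y1]))

# Classify every (non-border) pixel of a new image by its surrounding patch
patches = extract_patches_2d(image_gray, (PATCH, PATCH))
predictions = classifier.predict(patches.reshape(len(patches), -1))
out_shape = (image_gray.shape[0] - PATCH + 1, image_gray.shape[1] - PATCH + 1)
image_classified = predictions.reshape(out_shape)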
I have read this review of recent deep-learning approaches to image segmentation, and the results look accurate, but since I have never worked with CNNs I am a bit wary of them.
Answer:
You can use U-Net or SegNet for image segmentation. In fact, you can achieve this kind of result by adding residual layers to a plain CNN (a short sketch of the idea follows the references below):
About U-Net:
arXiv: U-Net: Convolutional Networks for Biomedical Image Segmentation
About SegNet:
arXiv: SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation
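To make the residual idea concrete first, here is a minimal sketch of a skip connection in the Keras 1.x functional API (matching the keras==1.1.0 examples below); the layer sizes are illustrative assumptions, not part of the models that follow.

from keras.layers import Input, Convolution2D, Activation, merge
from keras.models import Model

# A residual block: the merge adds the branch input x back to its output F(x).
inp = Input(shape=(60, 60, 1))
x = Convolution2D(20, 3, 3, border_mode='same')(inp)
x = Activation('relu')(x)
y = Convolution2D(20, 3, 3, border_mode='same')(x)
res = merge([x, y], mode='sum')  # skip connection: x + F(x)
out = Activation('relu')(res)
model = Model(input=inp, output=out)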
Here is a simple example of the code, for keras==1.1.0:
U-Net:
# Requires keras==1.1.0 (legacy API: Convolution2D, border_mode, Merge, nb_epoch)
import numpy as np
from keras import backend as K
from keras.models import Sequential
from keras.layers import (Activation, BatchNormalization, Convolution2D,
                          Lambda, MaxPooling2D, Merge, Reshape, UpSampling2D)
from keras.optimizers import SGD

shape = 60
batch_size = 30
nb_classes = 10
img_rows, img_cols = shape, shape
nb_filters = 32
pool_size = (2, 2)
kernel_size = (3, 3)
input_shape = (shape, shape, 1)
reg = 0.001
learning_rate = 0.013
decay_rate = 5e-5
momentum = 0.9
shape2 = 54  # output spatial size after the 'valid' convolutions below
sgd = SGD(lr=learning_rate, momentum=momentum, decay=decay_rate, nesterov=True)

recog0 = Sequential()
recog0.add(Convolution2D(20, 3, 3, border_mode='valid', input_shape=input_shape))
recog0.add(BatchNormalization(mode=2))

# Note: recog and recog_res below are aliases of recog0, so every .add()
# extends the same underlying model.
recog = recog0
recog.add(Activation('relu'))
recog.add(MaxPooling2D(pool_size=(2, 2)))
recog.add(UpSampling2D(size=(2, 2)))
recog.add(Convolution2D(20, 3, 3, init='glorot_uniform'))
recog.add(BatchNormalization(mode=2))
recog.add(Activation('relu'))

for i in range(0, 2):
    print(i, recog0.layers[i].name)

recog_res = recog0
part = 1
get_0_layer_output = K.function([recog0.layers[0].input, K.learning_phase()],
                                [recog0.layers[part].output])
# pred/loss are computed but unused in this identity-Lambda variant;
# x_train is the training image array, assumed defined elsewhere.
pred = [np.argmax(get_0_layer_output([x_train, 0])[0][i])
        for i in range(0, len(x_train))]
loss = x_train - pred
loss = loss.astype('float32')
recog_res.add(Lambda(lambda x: x,
                     input_shape=(56, 56, 20), output_shape=(56, 56, 20)))

# Merge the two branches and decode to a single-channel map
recog2 = Sequential()
recog2.add(Merge([recog, recog_res], mode='ave'))
recog2.add(Activation('relu'))
recog2.add(Convolution2D(20, 3, 3, init='glorot_uniform'))
recog2.add(BatchNormalization(mode=2))
recog2.add(Activation('relu'))
recog2.add(Convolution2D(1, 1, 1, init='glorot_uniform'))
recog2.add(Reshape((shape2, shape2, 1)))
recog2.add(Activation('relu'))

recog2.compile(loss='mean_squared_error', optimizer=sgd, metrics=['mae'])
recog2.summary()

# x_train2 holds the target map, assumed defined elsewhere
x_train3 = x_train2.reshape((1, shape2, shape2, 1))
recog2.fit(x_train, x_train3, nb_epoch=25, batch_size=30, verbose=1)
SegNet:
# Same imports as the U-Net example above; keras==1.1.0
shape = 60
batch_size = 30
nb_classes = 10
img_rows, img_cols = shape, shape
nb_filters = 32
pool_size = (2, 2)
kernel_size = (3, 3)
input_shape = (shape, shape, 1)
reg = 0.001
learning_rate = 0.012
decay_rate = 5e-5
momentum = 0.9
shape2 = 54  # output spatial size after upsampling and the final 'valid' convolution
sgd = SGD(lr=learning_rate, momentum=momentum, decay=decay_rate, nesterov=True)

recog0 = Sequential()
recog0.add(Convolution2D(20, 4, 4, border_mode='valid', input_shape=input_shape))
recog0.add(BatchNormalization(mode=2))
recog0.add(MaxPooling2D(pool_size=(2, 2)))

# As above, recog and recog_res are aliases of recog0.
recog = recog0
recog.add(Activation('relu'))
recog.add(MaxPooling2D(pool_size=(2, 2)))
recog.add(UpSampling2D(size=(2, 2)))
recog.add(Convolution2D(20, 1, 1, init='glorot_uniform'))
recog.add(BatchNormalization(mode=2))
recog.add(Activation('relu'))

for i in range(0, 8):
    print(i, recog0.layers[i].name)

recog_res = recog0
part = 8
get_0_layer_output = K.function([recog0.layers[0].input, K.learning_phase()],
                                [recog0.layers[part].output])
# Here the residual branch subtracts the mean reconstruction error;
# x_train is the training image array, assumed defined elsewhere.
pred = [np.argmax(get_0_layer_output([x_train, 0])[0][i])
        for i in range(0, len(x_train))]
loss = x_train - pred
loss = loss.astype('float32')
recog_res.add(Lambda(lambda x: x - np.mean(loss),
                     input_shape=(28, 28, 20), output_shape=(28, 28, 20)))

recog2 = Sequential()
recog2.add(Merge([recog, recog_res], mode='sum'))
recog2.add(UpSampling2D(size=(2, 2)))
recog2.add(Convolution2D(1, 3, 3, init='glorot_uniform'))
recog2.add(BatchNormalization(mode=2))
recog2.add(Reshape((shape2 * shape2,)))
recog2.add(Reshape((shape2, shape2, 1)))
recog2.add(Activation('relu'))

recog2.compile(loss='mean_squared_error', optimizer=sgd, metrics=['mae'])
recog2.summary()

# x_train2 holds the target map, assumed defined elsewhere
x_train3 = x_train2.reshape((1, shape2, shape2, 1))
recog2.fit(x_train, x_train3, nb_epoch=400, batch_size=30, verbose=1)
Then apply a threshold to the colors of the segmentation output.
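As a minimal sketch of that final step (the names model and image_batch, and the 0.5 cutoff, are illustrative assumptions):

import numpy as np

# Predict a single-channel segmentation map and binarize it.
# `model` is a trained network like recog2 above; `image_batch` is a
# (1, 60, 60, 1) input; 0.5 is an arbitrary cutoff to tune on validation data.
seg_map = model.predict(image_batch)[0, :, :, 0]
mask = (seg_map > 0.5).astype(np.uint8)  # 1 = target class, 0 = background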