Suppose I have one or more tiles, each consisting of a single texture pattern (materials such as wood, concrete, gravel, and so on). I want to train my classifier on these tiles and then use the trained classifier to determine the class of each pixel in another image.
Here are two examples of the tiles I would like to use to train the classifier:
Now suppose I want to segment the image below to identify which pixels belong to the door and which to the wall. This is just an example; I am aware that the patterns in this image are not exactly the same as those in the tiles above:
Is a convolutional neural network necessary for this particular problem, or is there a way to achieve my goal with a shallow neural network or some other classifier combined with texture features?
I have already implemented a classifier with Scikit-learn that treats the tile pixels individually (see the code below, where training_data is a column vector of single pixel values), but I would like to train the classifier to recognize texture patterns instead.
# Train the classifier
classifier = SGDClassifier()
classifier.fit(training_data, training_target)

# Classify a given image
test_data = image_gray.flatten().reshape((-1, 1))
predictions = classifier.predict(test_data)
image_classified = predictions.reshape(image_gray.shape)
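One shallow alternative, shown here only as a hedged sketch (not part of the original post): keep the same SGDClassifier but train it on small flattened patches instead of single pixels, so that each sample carries local texture. The patch size, the helper function, and the tile variable names below are illustrative assumptions.

import numpy as np
from sklearn.feature_extraction.image import extract_patches_2d
from sklearn.linear_model import SGDClassifier

PATCH = 9  # illustrative patch size; tune for your textures

def tile_to_patches(tile_gray, label, max_patches=2000):
    """Sample flattened PATCH x PATCH patches from one single-texture tile."""
    patches = extract_patches_2d(tile_gray, (PATCH, PATCH),
                                 max_patches=max_patches, random_state=0)
    X = patches.reshape(len(patches), -1)
    y = np.full(len(patches), label)
    return X, y

# wood_tile / concrete_tile are grayscale arrays (assumed loaded elsewhere)
X0, y0 = tile_to_patches(wood_tile, 0)
X1, y1 = tile_to_patches(concrete_tile, 1)

classifier = SGDClassifier()
classifier.fit(np.vstack([X0, X1]), np.concatenate([y0, y1]))

# Classify every (non-border) pixel of a new image by its surrounding patch
patches = extract_patches_2d(image_gray, (PATCH, PATCH))
predictions = classifier.predict(patches.reshape(len(patches), -1))
out_shape = (image_gray.shape[0] - PATCH + 1, image_gray.shape[1] - PATCH + 1)
image_classified = predictions.reshape(out_shape)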
I have read this review of recent deep-learning approaches to image segmentation, and the results look accurate, but since I have never worked with CNNs I am a bit wary of them.
Answer:
You can use U-Net or SegNet for image segmentation. In fact, you can achieve this kind of result by adding residual layers to a plain CNN (a short sketch of the idea follows the references below):
About U-Net:
arXiv: U-Net: Convolutional Networks for Biomedical Image Segmentation
About SegNet:
arXiv: SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation
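To make the residual idea concrete first, here is a minimal sketch of a skip connection in the Keras 1.x functional API (matching the keras==1.1.0 examples below); the layer sizes are illustrative assumptions, not part of the models that follow.

from keras.layers import Input, Convolution2D, Activation, merge
from keras.models import Model

# A residual block: the merge adds the branch input x back to its output F(x).
inp = Input(shape=(60, 60, 1))
x = Convolution2D(20, 3, 3, border_mode='same')(inp)
x = Activation('relu')(x)
y = Convolution2D(20, 3, 3, border_mode='same')(x)
res = merge([x, y], mode='sum')  # skip connection: x + F(x)
out = Activation('relu')(res)
model = Model(input=inp, output=out)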
Here is a simple example of the code, for keras==1.1.0:
U-Net:
# Requires keras==1.1.0 (legacy API: Convolution2D, border_mode, Merge, nb_epoch)
import numpy as np
from keras import backend as K
from keras.models import Sequential
from keras.layers import (Activation, BatchNormalization, Convolution2D,
                          Lambda, MaxPooling2D, Merge, Reshape, UpSampling2D)
from keras.optimizers import SGD

shape = 60
batch_size = 30
nb_classes = 10
img_rows, img_cols = shape, shape
nb_filters = 32
pool_size = (2, 2)
kernel_size = (3, 3)
input_shape = (shape, shape, 1)
reg = 0.001
learning_rate = 0.013
decay_rate = 5e-5
momentum = 0.9
shape2 = 54  # output spatial size after the 'valid' convolutions below
sgd = SGD(lr=learning_rate, momentum=momentum, decay=decay_rate, nesterov=True)

recog0 = Sequential()
recog0.add(Convolution2D(20, 3, 3, border_mode='valid', input_shape=input_shape))
recog0.add(BatchNormalization(mode=2))

# Note: recog and recog_res below are aliases of recog0, so every .add()
# extends the same underlying model.
recog = recog0
recog.add(Activation('relu'))
recog.add(MaxPooling2D(pool_size=(2, 2)))
recog.add(UpSampling2D(size=(2, 2)))
recog.add(Convolution2D(20, 3, 3, init='glorot_uniform'))
recog.add(BatchNormalization(mode=2))
recog.add(Activation('relu'))

for i in range(0, 2):
    print(i, recog0.layers[i].name)

recog_res = recog0
part = 1
get_0_layer_output = K.function([recog0.layers[0].input, K.learning_phase()],
                                [recog0.layers[part].output])
# pred/loss are computed but unused in this identity-Lambda variant;
# x_train is the training image array, assumed defined elsewhere.
pred = [np.argmax(get_0_layer_output([x_train, 0])[0][i])
        for i in range(0, len(x_train))]
loss = x_train - pred
loss = loss.astype('float32')
recog_res.add(Lambda(lambda x: x,
                     input_shape=(56, 56, 20), output_shape=(56, 56, 20)))

# Merge the two branches and decode to a single-channel map
recog2 = Sequential()
recog2.add(Merge([recog, recog_res], mode='ave'))
recog2.add(Activation('relu'))
recog2.add(Convolution2D(20, 3, 3, init='glorot_uniform'))
recog2.add(BatchNormalization(mode=2))
recog2.add(Activation('relu'))
recog2.add(Convolution2D(1, 1, 1, init='glorot_uniform'))
recog2.add(Reshape((shape2, shape2, 1)))
recog2.add(Activation('relu'))

recog2.compile(loss='mean_squared_error', optimizer=sgd, metrics=['mae'])
recog2.summary()

# x_train2 holds the target map, assumed defined elsewhere
x_train3 = x_train2.reshape((1, shape2, shape2, 1))
recog2.fit(x_train, x_train3, nb_epoch=25, batch_size=30, verbose=1)
SegNet:
# Same imports as the U-Net example above; keras==1.1.0
shape = 60
batch_size = 30
nb_classes = 10
img_rows, img_cols = shape, shape
nb_filters = 32
pool_size = (2, 2)
kernel_size = (3, 3)
input_shape = (shape, shape, 1)
reg = 0.001
learning_rate = 0.012
decay_rate = 5e-5
momentum = 0.9
shape2 = 54  # output spatial size after upsampling and the final 'valid' convolution
sgd = SGD(lr=learning_rate, momentum=momentum, decay=decay_rate, nesterov=True)

recog0 = Sequential()
recog0.add(Convolution2D(20, 4, 4, border_mode='valid', input_shape=input_shape))
recog0.add(BatchNormalization(mode=2))
recog0.add(MaxPooling2D(pool_size=(2, 2)))

# As above, recog and recog_res are aliases of recog0.
recog = recog0
recog.add(Activation('relu'))
recog.add(MaxPooling2D(pool_size=(2, 2)))
recog.add(UpSampling2D(size=(2, 2)))
recog.add(Convolution2D(20, 1, 1, init='glorot_uniform'))
recog.add(BatchNormalization(mode=2))
recog.add(Activation('relu'))

for i in range(0, 8):
    print(i, recog0.layers[i].name)

recog_res = recog0
part = 8
get_0_layer_output = K.function([recog0.layers[0].input, K.learning_phase()],
                                [recog0.layers[part].output])
# Here the residual branch subtracts the mean reconstruction error;
# x_train is the training image array, assumed defined elsewhere.
pred = [np.argmax(get_0_layer_output([x_train, 0])[0][i])
        for i in range(0, len(x_train))]
loss = x_train - pred
loss = loss.astype('float32')
recog_res.add(Lambda(lambda x: x - np.mean(loss),
                     input_shape=(28, 28, 20), output_shape=(28, 28, 20)))

recog2 = Sequential()
recog2.add(Merge([recog, recog_res], mode='sum'))
recog2.add(UpSampling2D(size=(2, 2)))
recog2.add(Convolution2D(1, 3, 3, init='glorot_uniform'))
recog2.add(BatchNormalization(mode=2))
recog2.add(Reshape((shape2 * shape2,)))
recog2.add(Reshape((shape2, shape2, 1)))
recog2.add(Activation('relu'))

recog2.compile(loss='mean_squared_error', optimizer=sgd, metrics=['mae'])
recog2.summary()

# x_train2 holds the target map, assumed defined elsewhere
x_train3 = x_train2.reshape((1, shape2, shape2, 1))
recog2.fit(x_train, x_train3, nb_epoch=400, batch_size=30, verbose=1)
Then apply a threshold to the colors of the segmentation output.
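As a minimal sketch of that final step (the names model and image_batch, and the 0.5 cutoff, are illustrative assumptions):

import numpy as np

# Predict a single-channel segmentation map and binarize it.
# `model` is a trained network like recog2 above; `image_batch` is a
# (1, 60, 60, 1) input; 0.5 is an arbitrary cutoff to tune on validation data.
seg_map = model.predict(image_batch)[0, :, :, 0]
mask = (seg_map > 0.5).astype(np.uint8)  # 1 = target class, 0 = background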