I am implementing SegNet in Python with Keras. Here is the code.
```python
# Imports added for completeness (assuming standalone Keras, where
# Convolution2D is a legacy alias of Conv2D).
from keras import models
from keras.layers import (Activation, BatchNormalization, Conv2D,
                          Convolution2D, MaxPooling2D, Permute, Reshape,
                          UpSampling2D)

img_w = 480
img_h = 360
pool_size = 2

def build_model(img_w, img_h, pool_size):
    n_labels = 12
    kernel = 3

    encoding_layers = [
        Conv2D(64, (kernel, kernel), input_shape=(img_h, img_w, 3), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        Convolution2D(64, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        MaxPooling2D(pool_size=(pool_size, pool_size)),

        Convolution2D(128, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        Convolution2D(128, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        MaxPooling2D(pool_size=(pool_size, pool_size)),

        Convolution2D(256, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        Convolution2D(256, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        Convolution2D(256, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        MaxPooling2D(pool_size=(pool_size, pool_size)),

        Convolution2D(512, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        Convolution2D(512, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        Convolution2D(512, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        MaxPooling2D(pool_size=(pool_size, pool_size)),

        Convolution2D(512, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        Convolution2D(512, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        Convolution2D(512, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        MaxPooling2D(pool_size=(pool_size, pool_size)),
    ]

    autoencoder = models.Sequential()
    autoencoder.encoding_layers = encoding_layers
    for l in autoencoder.encoding_layers:
        autoencoder.add(l)

    decoding_layers = [
        UpSampling2D(),
        Convolution2D(512, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        Convolution2D(512, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        Convolution2D(512, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),

        UpSampling2D(),
        Convolution2D(512, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        Convolution2D(512, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        Convolution2D(256, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),

        UpSampling2D(),
        Convolution2D(256, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        Convolution2D(256, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        Convolution2D(128, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),

        UpSampling2D(),
        Convolution2D(128, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        Convolution2D(64, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),

        UpSampling2D(),
        Convolution2D(64, (kernel, kernel), padding='same'),
        BatchNormalization(),
        Activation('relu'),
        Convolution2D(n_labels, (1, 1), padding='valid', activation="sigmoid"),
        BatchNormalization(),
    ]
    autoencoder.decoding_layers = decoding_layers
    for l in autoencoder.decoding_layers:
        autoencoder.add(l)

    autoencoder.add(Reshape((n_labels, img_h * img_w)))
    autoencoder.add(Permute((2, 1)))
    autoencoder.add(Activation('softmax'))
    return autoencoder

model = build_model(img_w, img_h, pool_size)
```
But it raises the following error:
```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-21-051f06a53a14> in <module>()
----> 1 model = build_model(img_w, img_h, pool_size)

<ipython-input-20-c37fd94c8641> in build_model(img_w, img_h, pool_size)
    119     autoencoder.add(l)
    120
--> 121     autoencoder.add(Reshape((n_labels, img_h * img_w)))
    122     autoencoder.add(Permute((2, 1)))
    123     autoencoder.add(Activation('softmax'))

ValueError: total size of new array must be unchanged
```
I can't see the cause of the error. When I change both img_w and img_h to 256, the error goes away, but 256 x 256 is not the image size of the original dataset, so I can't use that workaround. How can I solve this?
Answer:
The problem is that you downsample with (2, 2) pooling five times, so let's track the shapes:

(360, 480) -> (180, 240) -> (90, 120) -> (45, 60) -> (22, 30) -> (11, 15)

(MaxPooling2D floors an odd dimension, so 45 becomes 22.) Now the upsampling:

(11, 15) -> (22, 30) -> (44, 60) -> (88, 120) -> (176, 240) -> (352, 480)

So when you try to Reshape the output to the original (img_h, img_w) size, the total number of elements no longer matches and the error is raised.
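The shape bookkeeping above can be reproduced with plain integer arithmetic (a minimal sketch, no Keras needed; the floor division mirrors how MaxPooling2D drops the odd row):

```python
def downsample(shape, times=5, pool=2):
    """Mimic `times` rounds of MaxPooling2D((pool, pool)); odd sizes are floored."""
    for _ in range(times):
        shape = tuple(s // pool for s in shape)
    return shape

def upsample(shape, times=5, pool=2):
    """Mimic `times` rounds of UpSampling2D((pool, pool))."""
    for _ in range(times):
        shape = tuple(s * pool for s in shape)
    return shape

bottleneck = downsample((360, 480))
print(bottleneck)            # (11, 15)
print(upsample(bottleneck))  # (352, 480), not (360, 480) -> the Reshape fails
```

Only a dimension divisible by 2^5 = 32 survives the round trip unchanged, which is exactly why 256 (and 480) work while 360 does not.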
Possible solutions:

- Resize your images so that both input dimensions are divisible by 32 = 2^5 (e.g. (352, 480) or (384, 480)).
- Add ZeroPadding2D(((1, 0), (0, 0))) after the second upsampling (where the shape is (44, 60)) to change it to (45, 60), which makes your network end up with the expected output shape.
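To see why that single padding layer repairs the round trip, here is the same arithmetic sketch extended with the fix (the `i == 1` branch stands in for a ZeroPadding2D(((1, 0), (0, 0))) layer placed after the second UpSampling2D):

```python
def decoder_shapes(shape, pool=2, pad_after_second=True):
    """Trace the spatial shape through the decoder's 5 UpSampling2D layers."""
    shapes = [shape]
    for i in range(5):
        shape = (shape[0] * pool, shape[1] * pool)
        if pad_after_second and i == 1:
            # ZeroPadding2D(((1, 0), (0, 0))): one extra row, (44, 60) -> (45, 60)
            shape = (shape[0] + 1, shape[1])
        shapes.append(shape)
    return shapes

print(decoder_shapes((11, 15)))
# [(11, 15), (22, 30), (45, 60), (90, 120), (180, 240), (360, 480)]
```

The padded chain re-creates every shape the encoder produced on the way down, so the final Reshape to (n_labels, 360 * 480) succeeds.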
Other issues:

Note that the last MaxPooling2D is immediately followed by the first UpSampling2D. This may be a problem, since it turns that part of your network into a useless bottleneck.
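A small NumPy sketch of why that adjacency is wasteful (assuming 2x2 max pooling and nearest-neighbour upsampling, matching MaxPooling2D(2) and UpSampling2D(2)): pooling then immediately upsampling replaces each 2x2 block with four copies of its maximum, discarding three quarters of the activations at the bottleneck while no convolution in between gets a chance to use them.

```python
import numpy as np

def pool2(x):
    """2x2 max pooling on a 2D array (like MaxPooling2D((2, 2)))."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def up2(x):
    """2x nearest-neighbour upsampling (like UpSampling2D((2, 2)))."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

x = np.arange(16, dtype=float).reshape(4, 4)
y = up2(pool2(x))
print(y)  # each 2x2 block is filled with its max: 5, 7, 13, 15
```

Only 4 distinct values survive out of the 16 inputs; dropping the final pool/upsample pair (or putting convolutions between them) avoids this.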