How do I split an image into patches/sub-images in Keras/TensorFlow?

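I am trying to reproduce the logic from this paper. The logic can be summarized in the following diagram: [diagram not available]

To highlight my problem:

  • I have a 256×256 input image. It is passed through DenseNet (working example below).
  • The same image is split into 4 equal, non-overlapping 128×128 patches. Each of them is also passed through DenseNet, and the results are averaged.

Working code: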

```python
import tensorflow as tf
from keras.applications.densenet import DenseNet201
from keras.layers import Dense, Flatten, Concatenate
from keras.activations import relu

# Main image
in1 = tf.keras.Input(shape=(256, 256, 3))

# 4 sub-patches of the main image
patch1 = tf.keras.Input(shape=(128, 128, 3))
patch2 = tf.keras.Input(shape=(128, 128, 3))
patch3 = tf.keras.Input(shape=(128, 128, 3))
patch4 = tf.keras.Input(shape=(128, 128, 3))

# CNN
cnn = DenseNet201(include_top=False, pooling='avg')

# Output for the full 256x256 image
out1 = cnn(in1)

# Outputs for the 4 128x128 patches
path_out1 = cnn(patch1)
path_out2 = cnn(patch2)
path_out3 = cnn(patch3)
path_out4 = cnn(patch4)

# Average the patch outputs
patch_out_average = tf.keras.layers.Average()([path_out1, path_out2, path_out3, path_out4])

# Combine the features
out_combined = tf.stack([out1, patch_out_average])
```

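My question: is there a way to make this more elegant and less manual? I don't want to write out 16 input lines by hand for the 64×64 patches. Is there a way to "split" the image into parts and return an averaged tensor, or simply to make this process shorter?

Thanks.

Update (using the code from the answer below):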

```python
import tensorflow as tf
from keras.applications.densenet import DenseNet201
from keras.layers import Dense, Flatten, Concatenate
from keras.activations import relu


class CreatePatches(tf.keras.layers.Layer):

    def __init__(self, patch_size, cnn):
        super(CreatePatches, self).__init__()
        self.patch_size = patch_size
        self.cnn = cnn

    def call(self, inputs):
        patches = []
        # Only works for square images (because inputs.shape[1] = inputs.shape[2])
        input_image_size = inputs.shape[1]
        for i in range(0, input_image_size, self.patch_size):
            for j in range(0, input_image_size, self.patch_size):
                patches.append(self.cnn(inputs[:, i:i + self.patch_size, j:j + self.patch_size, :]))
        return patches


# Main image
in1 = tf.keras.Input(shape=(256, 256, 3))

# CNN
cnn = DenseNet201(include_top=False, pooling='avg')

# Output for the full 256x256 image
out256 = cnn(in1)

# Outputs for the 4 128x128 patches
out128 = CreatePatches(patch_size=128, cnn=cnn)(in1)

# Outputs for the 16 64x64 patches
out64 = CreatePatches(patch_size=64, cnn=cnn)(in1)

# Average the patch outputs
out128 = tf.keras.layers.Average()(out128)
out64 = tf.keras.layers.Average()(out64)

# Combine the features
out_combined = tf.stack([out256, out128, out64], axis=1)

# Average
out_averaged = tf.keras.layers.GlobalAveragePooling1D()(out_combined)
out_averaged
```

Answer:

Update (July 16, 2021)

I found this code in the Keras Vision Transformers tutorial. It implements a custom Keras layer that uses the tf.image.extract_patches function to create patches from images.

```python
import tensorflow as tf
from tensorflow.keras import layers


class Patches(layers.Layer):

    def __init__(self, patch_size):
        super(Patches, self).__init__()
        self.patch_size = patch_size

    def call(self, images):
        batch_size = tf.shape(images)[0]
        patches = tf.image.extract_patches(
            images=images,
            sizes=[1, self.patch_size, self.patch_size, 1],
            strides=[1, self.patch_size, self.patch_size, 1],
            rates=[1, 1, 1, 1],
            padding="VALID",
        )
        # Each patch is returned flattened along the last axis
        patch_dims = patches.shape[-1]
        patches = tf.reshape(patches, [batch_size, -1, patch_dims])
        return patches
```
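To see what the layer above returns, the following sketch calls tf.image.extract_patches directly and reshapes each flattened patch back into an image. It is only an illustration under stated assumptions: the random batch is made-up test data, the images are square 256×256 RGB, and the patch size is 128.

```python
import tensorflow as tf

patch_size = 128

# Made-up test batch: 2 random RGB images of 256x256 (an assumption for illustration).
images = tf.random.uniform((2, 256, 256, 3))

flat = tf.image.extract_patches(
    images=images,
    sizes=[1, patch_size, patch_size, 1],
    strides=[1, patch_size, patch_size, 1],
    rates=[1, 1, 1, 1],
    padding="VALID",
)
# flat has one flattened patch per grid cell: (batch, rows, cols, 128 * 128 * 3).
print(flat.shape)

# Restore the spatial layout of every patch: (batch, num_patches, height, width, channels).
patches = tf.reshape(flat, [2, -1, patch_size, patch_size, 3])
print(patches.shape)

# Patches are ordered row by row, so index 1 is the top-right quadrant.
top_right = images[:, :patch_size, patch_size:, :]
print(bool(tf.reduce_all(patches[:, 1] == top_right)))
```

Reshaping back to image form is what makes the patches usable as CNN inputs; the flattened form returned by the layer is what a Vision Transformer expects instead.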

Existing solution

You can create a custom Keras Layer that splits a given square image (width = height) into patches, as follows:

```python
import numpy as np
import tensorflow as tf


class CreatePatches(tf.keras.layers.Layer):

    def __init__(self, patch_size):
        super(CreatePatches, self).__init__()
        self.patch_size = patch_size

    def call(self, inputs):
        patches = []
        # Only works for square images (because inputs.shape[1] = inputs.shape[2])
        input_image_size = inputs.shape[1]
        for i in range(0, input_image_size, self.patch_size):
            for j in range(0, input_image_size, self.patch_size):
                patches.append(inputs[:, i:i + self.patch_size, j:j + self.patch_size, :])
        return patches


sample_image = np.random.rand(1, 256, 256, 3)
layer = CreatePatches(128)
layer(sample_image)
```

Make sure that inputs.shape[ 1 ] is divisible by patch_size.
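The reason this matters: slicing past the end of an axis does not raise an error, so with a non-divisible size (say 256 and a patch size of 100) the last row and column of patches would silently come out only 56 pixels wide. A minimal sketch of a guard is shown below; patch_offsets is a hypothetical helper name, not part of Keras, and it assumes square images as the layer above does.

```python
def patch_offsets(image_size, patch_size):
    """Return the (row, col) top-left corner of every non-overlapping patch.

    Hypothetical helper for illustration; assumes a square image.
    """
    if image_size % patch_size != 0:
        raise ValueError(
            f"image size {image_size} is not divisible by patch size {patch_size}"
        )
    steps = range(0, image_size, patch_size)
    return [(i, j) for i in steps for j in steps]


print(patch_offsets(256, 128))      # corners of the 4 patches of size 128
print(len(patch_offsets(256, 64)))  # number of patches of size 64
```

A check like this could be placed at the top of call so that a bad patch_size fails early instead of producing patches of unequal shape.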

You can also include this layer in a Model, as follows:

```python
inputs = tf.keras.layers.Input(shape=(256, 256, 3))
patches = CreatePatches(patch_size=128)(inputs)
model = tf.keras.models.Model(inputs, patches)
model.summary()
```

Output of the snippet above:

```
Model: "model_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_3 (InputLayer)         [(None, 256, 256, 3)]     0
_________________________________________________________________
create_patches_5 (CreatePatc [(None, 128, 128, 3), (No 0
=================================================================
Total params: 0
Trainable params: 0
Non-trainable params: 0
_________________________________________________________________
```

More details on the model's outputs:

```
>> model.outputs
[<KerasTensor: shape=(None, 128, 128, 3) dtype=float32 (created by layer 'create_patches_5')>,
 <KerasTensor: shape=(None, 128, 128, 3) dtype=float32 (created by layer 'create_patches_5')>,
 <KerasTensor: shape=(None, 128, 128, 3) dtype=float32 (created by layer 'create_patches_5')>,
 <KerasTensor: shape=(None, 128, 128, 3) dtype=float32 (created by layer 'create_patches_5')>]
```
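Since the question asks for a single averaged tensor rather than a list of patches, one possible variation is to fold the patches into the batch axis, run the CNN once, and average per image. The sketch below is an assumption-laden illustration, not the method from the paper: averaged_patch_features and toy_cnn are hypothetical names, toy_cnn is a tiny stand-in for DenseNet201(include_top=False, pooling='avg'), and the function assumes eager tensors with a static, square image size divisible by patch_size.

```python
import tensorflow as tf


def averaged_patch_features(images, cnn, patch_size):
    """Average `cnn` features over all non-overlapping patches of each image."""
    channels = images.shape[-1]
    flat = tf.image.extract_patches(
        images=images,
        sizes=[1, patch_size, patch_size, 1],
        strides=[1, patch_size, patch_size, 1],
        rates=[1, 1, 1, 1],
        padding="VALID",
    )
    num_patches = flat.shape[1] * flat.shape[2]
    # Fold the patches into the batch axis so the CNN runs only once.
    batch_of_patches = tf.reshape(flat, [-1, patch_size, patch_size, channels])
    feats = cnn(batch_of_patches)  # (batch * num_patches, features)
    # Regroup the features per image and average over the patch axis.
    feats = tf.reshape(feats, [-1, num_patches, feats.shape[-1]])
    return tf.reduce_mean(feats, axis=1)  # (batch, features)


# Tiny stand-in CNN (an assumption) that accepts any spatial size.
inp = tf.keras.Input(shape=(None, None, 3))
x = tf.keras.layers.Conv2D(8, 3, padding="same", activation="relu")(inp)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
toy_cnn = tf.keras.Model(inp, x)

images = tf.random.uniform((2, 256, 256, 3))  # made-up test batch
out128 = averaged_patch_features(images, toy_cnn, 128)
out64 = averaged_patch_features(images, toy_cnn, 64)
print(out128.shape, out64.shape)
```

Compared with calling the CNN once per patch in a Python loop, this keeps the graph small when the number of patches grows (16 patches at 64×64, 64 at 32×32), at the cost of a larger effective batch size.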
