Padding torch tensors of different sizes to be equal

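I am looking for a way to take an image/target batch for segmentation and return a batch in which every image has been padded to the same dimensions. I tried to do this with the code below: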

def collate_fn_padd(batch):
    '''
    Padds batch of variable length

    note: it converts things ToTensor manually here since the ToTensor transform
    assume it takes in images rather than arbitrary tensors.
    '''
    # separate the image and masks
    image_batch, mask_batch = zip(*batch)

    # pad the images and masks
    image_batch = torch.nn.utils.rnn.pad_sequence(image_batch, batch_first=True)
    mask_batch = torch.nn.utils.rnn.pad_sequence(mask_batch, batch_first=True)

    # rezip the batch
    batch = list(zip(image_batch, mask_batch))

    return batch

However, I get this error:

RuntimeError: The expanded size of the tensor (650) must match the existing size (439) at non-singleton dimension 2.  Target sizes: [3, 650, 650].  Tensor sizes: [3, 406, 439]

How can I efficiently pad the tensors so that they all have the same dimensions and avoid this error?

Answer:

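rnn.pad_sequence only pads the sequence dimension and requires all other dimensions to be equal. You cannot use it to pad images across two dimensions (height and width).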
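To make that restriction concrete, here is a minimal sketch with made-up tensor shapes (small stand-ins, not the asker's data). It shows pad_sequence succeeding when only the first dimension varies and raising a RuntimeError when height and width differ.

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# Sequences that differ only in their first (length) dimension pad fine.
seqs = [torch.ones(5, 8), torch.ones(3, 8), torch.ones(2, 8)]
padded = pad_sequence(seqs, batch_first=True)
print(padded.shape)  # torch.Size([3, 5, 8])

# Illustrative "images" in [channels, height, width] layout. Their first
# dimension (channels) matches, but height and width do not, so there is
# nothing for pad_sequence to pad and the copy into the output fails.
images = [torch.ones(3, 6, 6), torch.ones(3, 4, 5)]
try:
    pad_sequence(images, batch_first=True)
    failed = False
except RuntimeError as err:
    failed = True
    print(err)
```

The second call fails for the same reason as in the question: pad_sequence treats dimension 0 as the sequence length and expects every trailing dimension to match the first tensor.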

To pad an image you can use torch.nn.functional.pad, but you need to work out the required height and width padding yourself.

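import torch.nn.functional as F

# Determine the maximum height and width
# The masks have the same height and width as the images
max_height = max([img.size(1) for img in image_batch])
max_width = max([img.size(2) for img in image_batch])

image_batch = [
    # The required padding is the difference between the
    # max width/height and the image's actual width/height
    F.pad(img, [0, max_width - img.size(2), 0, max_height - img.size(1)])
    for img in image_batch
]
mask_batch = [
    # Same as for the images, but there is no channel dimension,
    # so the mask's width is dimension 1 instead of 2
    F.pad(mask, [0, max_width - mask.size(1), 0, max_height - mask.size(0)])
    for mask in mask_batch
]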

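The padding lengths are specified in reverse order of the dimensions, with two values per dimension: one for the padding at the beginning and one for the padding at the end. For an image with the dimensions [channels, height, width], the padding is given as [width_beginning, width_end, height_beginning, height_end], which can be reworded as [left, right, top, bottom]. The code above therefore pads the images on the right and at the bottom. The channels are left out because they are not being padded, which also means the same padding can be applied directly to the masks.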
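Putting the pieces together, a collate function along these lines might look like the sketch below. The name pad_collate is hypothetical, the masks are assumed to have shape [height, width], padding uses the default value of zero, and the padded tensors are stacked into one batch tensor (a design choice that differs from the list of pairs returned in the question).

```python
import torch
import torch.nn.functional as F


def pad_collate(batch):
    """Pad every image/mask pair to the largest height and width in the batch.

    Assumes images are [channels, height, width] and masks are [height, width].
    """
    images, masks = zip(*batch)
    max_h = max(img.size(1) for img in images)
    max_w = max(img.size(2) for img in images)

    # F.pad takes [left, right, top, bottom] for the last two dimensions,
    # so these calls only add zeros on the right and at the bottom.
    images = [
        F.pad(img, [0, max_w - img.size(2), 0, max_h - img.size(1)])
        for img in images
    ]
    masks = [
        F.pad(m, [0, max_w - m.size(1), 0, max_h - m.size(0)])
        for m in masks
    ]
    # All tensors now share a shape, so they can be stacked into a batch.
    return torch.stack(images), torch.stack(masks)


# Illustrative data: two image/mask pairs of different sizes.
batch = [
    (torch.ones(3, 6, 6), torch.ones(6, 6)),
    (torch.ones(3, 4, 5), torch.ones(4, 5)),
]
image_batch, mask_batch = pad_collate(batch)
print(image_batch.shape)  # torch.Size([2, 3, 6, 6])
print(mask_batch.shape)   # torch.Size([2, 6, 6])
```

Such a function could be passed to a DataLoader through its collate_fn argument. The original pixels of the smaller image end up in the top-left corner, with zeros to the right and below.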
