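Normalizing images passed to the torchvision.transforms.Compose function

How do I find the values to pass to the transforms.Normalize function in PyTorch? Also, where exactly in my code should I apply transforms.Normalize?

Since normalizing a dataset is a fairly well-known task, I expected there to be some script that does it automatically. At least I could not find one on the PyTorch forums.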

    transformed_dataset = MothLandmarksDataset(csv_file='moth_gt.csv',
                                               root_dir='.',
                                               transform=transforms.Compose([
                                                   Rescale(256),
                                                   RandomCrop(224),
                                                   transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                                                        std=[0.229, 0.224, 0.225]),
                                                   ToTensor()
                                               ]))

    for i in range(len(transformed_dataset)):
        sample = transformed_dataset[i]
        print(i, sample['image'].size(), sample['landmarks'].size())
        if i == 3:
            break

I know these values have nothing to do with my dataset and come from ImageNet, but using them actually gives me an error:

    TypeError                                 Traceback (most recent call last)
    <ipython-input-81-eb8dc46e0284> in <module>
         10
         11 for i in range(len(transformed_dataset)):
    ---> 12     sample = transformed_dataset[i]
         13
         14     print(i, sample['image'].size(), sample['landmarks'].size())

    <ipython-input-48-9d04158922fb> in __getitem__(self, idx)
         30
         31         if self.transform:
    ---> 32             sample = self.transform(sample)
         33
         34         return sample

    ~/anaconda3/lib/python3.7/site-packages/torchvision/transforms/transforms.py in __call__(self, img)
         59     def __call__(self, img):
         60         for t in self.transforms:
    ---> 61             img = t(img)
         62         return img
         63

    ~/anaconda3/lib/python3.7/site-packages/torchvision/transforms/transforms.py in __call__(self, tensor)
        210             Tensor: Normalized Tensor image.
        211         """
    --> 212         return F.normalize(tensor, self.mean, self.std, self.inplace)
        213
        214     def __repr__(self):

    ~/anaconda3/lib/python3.7/site-packages/torchvision/transforms/functional.py in normalize(tensor, mean, std, inplace)
        278     """
        279     if not torch.is_tensor(tensor):
    --> 280         raise TypeError('tensor should be a torch tensor. Got {}.'.format(type(tensor)))
        281
        282     if tensor.ndimension() != 3:

    TypeError: tensor should be a torch tensor. Got <class 'dict'>.

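So basically three questions:

  1. How can I find values like the ImageNet mean and standard deviation for my own custom dataset?
  2. How do I pass these values, and where? I assume it should be done in the transforms.Compose call, but I might be wrong.
  3. I assume normalization should be applied to the whole dataset, not just the training set, right?

Update:

Trying the solution provided here did not work for me: https://discuss.pytorch.org/t/about-normalization-using-pre-trained-vgg16-networks/23560/6?u=mona_jalal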

    mean = 0.
    std = 0.
    nb_samples = 0.
    for data in dataloader:
        print(type(data))
        batch_samples = data.size(0)

        data.shape(0)
        data = data.view(batch_samples, data.size(1), -1)
        mean += data.mean(2).sum(0)
        std += data.std(2).sum(0)
        nb_samples += batch_samples

    mean /= nb_samples
    std /= nb_samples

The error is:

    <class 'dict'>
    ---------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    <ipython-input-51-e8ba3c8718bb> in <module>
          5 for data in dataloader:
          6     print(type(data))
    ----> 7     batch_samples = data.size(0)
          8
          9     data.shape(0)

    AttributeError: 'dict' object has no attribute 'size'

This is the result of print(data):

    {'image': tensor([[[[0.2961, 0.2941, 0.2941,  ..., 0.2460, 0.2456, 0.2431],
              [0.2953, 0.2977, 0.2980,  ..., 0.2442, 0.2431, 0.2431],
              [0.2941, 0.2941, 0.2980,  ..., 0.2471, 0.2471, 0.2448],
              ...,
              [0.3216, 0.3216, 0.3216,  ..., 0.2482, 0.2471, 0.2471],
              [0.3216, 0.3241, 0.3253,  ..., 0.2471, 0.2471, 0.2450],
              [0.3216, 0.3216, 0.3216,  ..., 0.2471, 0.2452, 0.2431]],

             [[0.2961, 0.2941, 0.2941,  ..., 0.2460, 0.2456, 0.2431],
              [0.2953, 0.2977, 0.2980,  ..., 0.2442, 0.2431, 0.2431],
              [0.2941, 0.2941, 0.2980,  ..., 0.2471, 0.2471, 0.2448],
              ...,
              [0.3216, 0.3216, 0.3216,  ..., 0.2482, 0.2471, 0.2471],
              [0.3216, 0.3241, 0.3253,  ..., 0.2471, 0.2471, 0.2450],
              [0.3216, 0.3216, 0.3216,  ..., 0.2471, 0.2452, 0.2431]],

             [[0.2961, 0.2941, 0.2941,  ..., 0.2460, 0.2456, 0.2431],
              [0.2953, 0.2977, 0.2980,  ..., 0.2442, 0.2431, 0.2431],
              [0.2941, 0.2941, 0.2980,  ..., 0.2471, 0.2471, 0.2448],
              ...,
              [0.3216, 0.3216, 0.3216,  ..., 0.2482, 0.2471, 0.2471],
              [0.3216, 0.3241, 0.3253,  ..., 0.2471, 0.2471, 0.2450],
              [0.3216, 0.3216, 0.3216,  ..., 0.2471, 0.2452, 0.2431]]],
            dtype=torch.float64), 'landmarks': tensor([[[160.2964,  98.7339],
             [223.0788,  72.5067],
             [ 82.4163,  70.3733],
             [152.3213, 137.7867]],

            [[198.3194,  74.4341],
             [273.7188, 118.7733],
             [117.7113,  80.8000],
             [182.0750, 107.2533]],

            [[137.4789,  92.8523],
             [174.9463,  40.3467],
             [ 57.3013,  59.1200],
             [129.3375, 131.6533]]], dtype=torch.float64)}

    dataloader = DataLoader(transformed_dataset, batch_size=3,
                            shuffle=True, num_workers=4)

    transformed_dataset = MothLandmarksDataset(csv_file='moth_gt.csv',
                                               root_dir='.',
                                               transform=transforms.Compose(
                                                   [
                                                   Rescale(256),
                                                   RandomCrop(224),

                                                   ToTensor()#,
                                                   ##transforms.Normalize(mean = [ 0.485, 0.456, 0.406 ],
                                                   ##         std = [ 0.229, 0.224, 0.225 ])
                                                   ]
                                               )
                                               )

    class MothLandmarksDataset(Dataset):
        """Face Landmarks dataset."""

        def __init__(self, csv_file, root_dir, transform=None):
            """
            Args:
                csv_file (string): Path to the csv file with annotations.
                root_dir (string): Directory with all the images.
                transform (callable, optional): Optional transform to be applied
                    on a sample.
            """
            self.landmarks_frame = pd.read_csv(csv_file)
            self.root_dir = root_dir
            self.transform = transform

        def __len__(self):
            return len(self.landmarks_frame)

        def __getitem__(self, idx):
            if torch.is_tensor(idx):
                idx = idx.tolist()

            img_name = os.path.join(self.root_dir, self.landmarks_frame.iloc[idx, 0])
            image = io.imread(img_name)
            landmarks = self.landmarks_frame.iloc[idx, 1:]
            landmarks = np.array([landmarks])
            landmarks = landmarks.astype('float').reshape(-1, 2)
            sample = {'image': image, 'landmarks': landmarks}

            if self.transform:
                sample = self.transform(sample)

            return sample

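Answer:

Source code errors

"How do I pass these values, and where? I assume it should be done in the transforms.Compose call, but I might be wrong."

In MothLandmarksDataset, passing a dict (sample) to torchvision.transforms will not work, because those transforms expect a torch.Tensor or a PIL.Image as input. Specifically, here: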

    sample = {'image': image, 'landmarks': landmarks}
    if self.transform:
        sample = self.transform(sample)

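You could pass sample["image"] to it, although you should not. Applying geometric transforms such as resizing or cropping only to sample["image"] breaks its relationship with landmarks. What you should look at instead is something like the albumentations library, which can transform image and landmarks in the same way so that their relationship is preserved.

Also, there is no Rescale transform in torchvision; you probably mean Resize (unless Rescale is a class you defined yourself, as in the PyTorch data loading tutorial, in which case it already operates on the whole sample dict).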
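To illustrate what "transforming image and landmarks in the same way" means, here is a minimal sketch of a paired random crop written in plain NumPy. It is not the albumentations API; the function name `paired_random_crop` is hypothetical, and the sketch assumes the image is an (H, W, C) array and the landmarks are (x, y) pixel coordinates.

```python
import numpy as np


def paired_random_crop(image, landmarks, size, rng):
    """Crop `image` (H, W, C) and shift `landmarks` (N, 2), given as (x, y), by the same offset."""
    h, w = image.shape[:2]
    top = int(rng.integers(0, h - size + 1))
    left = int(rng.integers(0, w - size + 1))
    cropped = image[top:top + size, left:left + size]
    # Landmarks move by the same offset as the crop window's top-left corner.
    shifted = landmarks - np.array([left, top], dtype=landmarks.dtype)
    return cropped, shifted


rng = np.random.default_rng(0)
image = np.zeros((256, 256, 3), dtype=np.float32)
landmarks = np.array([[160.0, 98.0], [223.0, 72.0]])
# Mark the pixel under the first landmark so it can be checked after the crop.
image[98, 160] = 1.0
crop, moved = paired_random_crop(image, landmarks, 224, rng)
```

After the crop, the first landmark still points at the marked pixel, which is exactly the property that is lost when only the image is cropped.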

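Mean and standard deviation for normalization

The code you found is fine, but you have to unpack the batch into torch.Tensor objects first, like this: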

    mean = 0.0
    std = 0.0
    nb_samples = 0.0
    for data in dataloader:
        images, landmarks = data["image"], data["landmarks"]
        batch_samples = images.size(0)
        images_data = images.view(batch_samples, images.size(1), -1)
        mean += images_data.mean(2).sum(0)
        std += images_data.std(2).sum(0)
        nb_samples += batch_samples

    mean /= nb_samples
    std /= nb_samples
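Note that the loop above averages the per-image standard deviations, which in general is not the same as the standard deviation over all pixels of the dataset. If the pooled value is wanted, one option is to accumulate sums and sums of squares per channel. The sketch below assumes each batch is a dict with an "image" tensor of shape (B, C, H, W), as in the question; the function name `channel_stats` and the random stand-in batches are hypothetical.

```python
import torch


def channel_stats(batches):
    """Return per-channel mean and (population) std over all pixels of all images.

    `batches` is any iterable of dicts holding an "image" tensor of shape
    (B, C, H, W), for example a DataLoader over the dataset from the question.
    """
    n_pixels = 0
    total = 0.0
    total_sq = 0.0
    for batch in batches:
        images = batch["image"].double()
        b, c = images.shape[:2]
        flat = images.reshape(b, c, -1)
        n_pixels += b * flat.shape[2]
        total = total + flat.sum(dim=(0, 2))
        total_sq = total_sq + (flat ** 2).sum(dim=(0, 2))
    mean = total / n_pixels
    # Var[x] = E[x^2] - E[x]^2, clamped at zero to absorb rounding error.
    std = (total_sq / n_pixels - mean ** 2).clamp(min=0).sqrt()
    return mean, std


# Stand-in for a real DataLoader: two random batches of 3-channel images.
torch.manual_seed(0)
fake_batches = [{"image": torch.rand(3, 3, 8, 8)} for _ in range(2)]
mean, std = channel_stats(fake_batches)
```

With a real DataLoader, `channel_stats(dataloader)` would be called once on the training data, before Normalize is added to the transform pipeline.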

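"How do I pass these values, and where? I assume it should be done in the transforms.Compose call, but I might be wrong."

These values should be passed to torchvision.transforms.Normalize, applied only to sample["image"], not to sample["landmarks"].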

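"I assume normalization should be applied to the whole dataset, not just the training set, right?"

You should compute the normalization values on the training dataset only, and then apply those same values to the validation and test sets.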
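A minimal sketch of that workflow, using random tensors as a stand-in for real images and a hypothetical 80/20 split; the `normalize` helper here just spells out the per-channel arithmetic rather than calling torchvision:

```python
import torch

torch.manual_seed(0)
images = torch.rand(10, 3, 8, 8)      # stand-in for a dataset of (C, H, W) images
train, val = images[:8], images[8:]   # hypothetical 80/20 split

# Statistics come from the training images only.
mean = train.mean(dim=(0, 2, 3))
std = train.std(dim=(0, 2, 3))


def normalize(batch, mean, std):
    """Per-channel (x - mean) / std on a (B, C, H, W) batch."""
    return (batch - mean[None, :, None, None]) / std[None, :, None, None]


# The same training statistics are reused for every split.
train_norm = normalize(train, mean, std)
val_norm = normalize(val, mean, std)
```

The normalized training set ends up with roughly zero mean and unit standard deviation per channel, while the validation set will be close to that but not exact, which is expected.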
