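I'm trying to train an object detection model for a multi-class problem. In my training I'm using the Mosaic augmentation (paper) for this task.

In my training pipeline I'm a bit stuck on properly retrieving the class label for each box, because the augmentation randomly selects a sub-region of each sample. Below is the output of the Mosaic augmentation that we have achieved so far, with the relevant bounding boxes.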

[Image: Mosaic augmentation output with bounding boxes]

Dataset

I've created a dummy dataset. df.head() looks like this:

[Image: output of df.head()]

There are 4 classes in total, and df.object.value_counts() gives:

    human    23
    car      13
    cat       5
    dog       3
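Since the multi-class branch of the data loader expects integer labels, the four class names need to be mapped to ids at some point. The snippet below is only a minimal sketch of one way to do that; the names `class_to_id` and `encode_labels` are hypothetical, and starting the ids at 1 (leaving 0 free for a background class) is an assumption that depends on the detector being trained.

```python
# Class names as listed by df.object.value_counts()
class_names = ["human", "car", "cat", "dog"]

# Assumption: ids start at 1 so that 0 stays free for a background class
class_to_id = {name: i for i, name in enumerate(class_names, start=1)}


def encode_labels(names):
    """Map a sequence of class-name strings to integer ids."""
    return [class_to_id[name] for name in names]


print(encode_labels(["cat", "human", "dog"]))  # [3, 1, 4]
```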

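Data Loader and Mosaic Augmentation

The data loader is defined as follows. The Mosaic augmentation should be defined inside it, but for now I'll show it as a separate code snippet for better demonstration: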

    import random

    import numpy as np
    import torch
    from torch.utils.data import Dataset

    IMG_SIZE = 2000

    class DatasetRetriever(Dataset):
        def __init__(self, main_df, image_ids, transforms=None, test=False):
            super().__init__()
            self.image_ids = image_ids
            self.main_df = main_df
            self.transforms = transforms
            self.size_limit = 1
            self.test = test

        def __getitem__(self, index: int):
            image_id = self.image_ids[index]
            image, boxes, labels = self.load_mosaic_image_and_boxes(index)

            # labels = torch.tensor(labels, dtype=torch.int64) # for multi-class
            labels = torch.ones((boxes.shape[0],), dtype=torch.int64) # for single-class

            target = {}
            target['boxes'] = boxes
            target['cls'] = labels
            target['image_id'] = torch.tensor([index])

            if self.transforms:
                # retry up to 10 times until at least one box survives the transform
                for i in range(10):
                    sample = self.transforms(**{
                        'image' : image,
                        'bboxes': target['boxes'],
                        'labels': target['cls']
                    })

                    assert len(sample['bboxes']) == target['cls'].shape[0], 'not equal!'
                    if len(sample['bboxes']) > 0:
                        # image
                        image = sample['image']

                        # box: swap (x_min, y_min, x_max, y_max) to (y_min, x_min, y_max, x_max)
                        target['boxes'] = torch.tensor(sample['bboxes'])
                        target['boxes'][:,[0,1,2,3]] = target['boxes'][:,[1,0,3,2]]

                        # label
                        target['cls'] = torch.stack(sample['labels'])
                        break

            return image, target

        def __len__(self) -> int:
            return self.image_ids.shape[0]

Basic Transforms

    import albumentations as A
    from albumentations.pytorch import ToTensorV2

    def get_transforms():
        return A.Compose(
            [
                A.Resize(height=IMG_SIZE, width=IMG_SIZE, p=1.0),
                ToTensorV2(p=1.0),
            ],
            p=1.0,
            bbox_params=A.BboxParams(
                format='pascal_voc',
                min_area=0,
                min_visibility=0,
                label_fields=['labels']
            )
        )

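Mosaic Augmentation

Note that this should be defined inside the data loader. The main issue is that, while iterating over the 4 samples to build the mosaic, a region of each image is copied into the mosaic and its bounding boxes are shifted as follows: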

    mosaic_image[y1a:y2a, x1a:x2a] = image[y1b:y2b, x1b:x2b]

    offset_x = x1a - x1b
    offset_y = y1a - y1b
    boxes[:, 0] += offset_x
    boxes[:, 1] += offset_y
    boxes[:, 2] += offset_x
    boxes[:, 3] += offset_y
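To make the offset arithmetic concrete, here is a small worked example with made-up numbers. It is only a sketch: `shift_boxes` is a hypothetical helper, the boxes are assumed to be in [x_min, y_min, x_max, y_max] order, and the source image is assumed to be s x s pixels, as the slicing in the question implies.

```python
import numpy as np


def shift_boxes(boxes, x1a, y1a, x1b, y1b):
    """Move boxes from source-image coordinates into mosaic coordinates."""
    offset_x = x1a - x1b
    offset_y = y1a - y1b
    shifted = boxes.astype(np.float32)  # astype returns a copy, input stays untouched
    shifted[:, [0, 2]] += offset_x      # x_min, x_max
    shifted[:, [1, 3]] += offset_y      # y_min, y_max
    return shifted


# Hypothetical values: mosaic size s = 3000, mosaic centre (xc, yc) = (1200, 1500).
# Top-left tile (i == 0): the bottom-right part of the source image is pasted
# into the mosaic starting at (0, 0).
s, xc, yc = 3000, 1200, 1500
x1a, y1a = 0, 0
x1b, y1b = s - xc, s - yc  # (1800, 1500)

boxes = np.array([[2000, 1700, 2400, 2100]])
shifted = shift_boxes(boxes, x1a, y1a, x1b, y1b)
print(shifted)  # [[200. 200. 600. 600.]]
```

Boxes that lie outside the copied region end up with coordinates outside the mosaic after this shift, which is why the later clipping and filtering step can remove boxes.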

Given this, how do I select the relevant class labels for the bounding boxes that remain? Please see the full code below:

    def load_mosaic_image_and_boxes(self, index, s=3000,
                                    minfrac=0.25, maxfrac=0.75):
        self.mosaic_size = s
        xc, yc = np.random.randint(s * minfrac, s * maxfrac, (2,))

        # random other 3 sample
        indices = [index] + random.sample(range(len(self.image_ids)), 3)

        mosaic_image = np.zeros((s, s, 3), dtype=np.float32)
        final_boxes  = [] # box for the sub-region
        final_labels = [] # relevant class labels

        for i, index in enumerate(indices):
            image, boxes, labels = self.load_image_and_boxes(index)

            if i == 0:    # top left
                x1a, y1a, x2a, y2a =  0,  0, xc, yc
                x1b, y1b, x2b, y2b = s - xc, s - yc, s, s # from bottom right
            elif i == 1:  # top right
                x1a, y1a, x2a, y2a = xc, 0, s , yc
                x1b, y1b, x2b, y2b = 0, s - yc, s - xc, s # from bottom left
            elif i == 2:  # bottom left
                x1a, y1a, x2a, y2a = 0, yc, xc, s
                x1b, y1b, x2b, y2b = s - xc, 0, s, s-yc   # from top right
            elif i == 3:  # bottom right
                x1a, y1a, x2a, y2a = xc, yc,  s, s
                x1b, y1b, x2b, y2b = 0, 0, s-xc, s-yc    # from top left

            # calculate and apply box offsets due to replacement
            offset_x = x1a - x1b
            offset_y = y1a - y1b
            boxes[:, 0] += offset_x
            boxes[:, 1] += offset_y
            boxes[:, 2] += offset_x
            boxes[:, 3] += offset_y

            # cut image, save boxes
            mosaic_image[y1a:y2a, x1a:x2a] = image[y1b:y2b, x1b:x2b]
            final_boxes.append(boxes)

            '''
            ATTENTION:
            Need some mechanism to get relevant class labels
            '''
            final_labels.append(labels)

        # collect boxes
        final_boxes  = np.vstack(final_boxes)
        final_labels = np.hstack(final_labels)

        # clip boxes to the image area
        final_boxes[:, 0:] = np.clip(final_boxes[:, 0:], 0, s).astype(np.int32)
        w = (final_boxes[:,2] - final_boxes[:,0])
        h = (final_boxes[:,3] - final_boxes[:,1])

        # discard boxes where w or h < self.size_limit
        # (note: final_labels is not filtered here, so it can go out of sync)
        final_boxes = final_boxes[(w>=self.size_limit) & (h>=self.size_limit)]

        return mosaic_image, final_boxes, final_labels

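Answer:

I parsed the bounding box and the class label information together, so that every box that is kept still carries its own label.

Below is the output we achieved. To try it with your own dataset, you can use the linked resource as a starting point.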
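One way to keep boxes and labels in sync, sketched here under assumptions and not necessarily the exact code behind this answer, is to build a single boolean mask after clipping and apply it to both arrays. `clip_and_filter` is a hypothetical helper name, and the sample boxes and label ids are made up for illustration.

```python
import numpy as np


def clip_and_filter(boxes, labels, s, size_limit=1):
    """Clip boxes to the s x s mosaic and drop degenerate ones,
    applying the same mask to the labels so both stay aligned."""
    boxes = np.clip(boxes, 0, s).astype(np.int32)
    w = boxes[:, 2] - boxes[:, 0]
    h = boxes[:, 3] - boxes[:, 1]
    keep = (w >= size_limit) & (h >= size_limit)
    return boxes[keep], labels[keep]


# Hypothetical boxes, already shifted into mosaic coordinates
final_boxes = np.vstack([
    np.array([[100, 100, 500, 400]]),      # fully inside the mosaic
    np.array([[-800, -600, -50, -10]]),    # fully outside, collapses to zero size
    np.array([[2800, 2900, 3400, 3300]]),  # partly inside, gets clipped
])
final_labels = np.hstack([np.array([1]), np.array([3]), np.array([2])])

kept_boxes, kept_labels = clip_and_filter(final_boxes, final_labels, s=3000)
print(kept_boxes)
print(kept_labels)  # [1 2]
```

With the labels filtered this way, the multi-class line in `__getitem__` (`torch.tensor(labels, dtype=torch.int64)`) can replace the single-class `torch.ones(...)` placeholder.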

[Image: resulting Mosaic augmentation output]
