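I have two folders. One folder contains one image and the other folder contains another image. I need to compare these two images and find the difference between them, but the code picks the images from a random folder.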
import random

import numpy as np
import PIL.ImageOps
import torch
from PIL import Image
from torch.utils.data import Dataset


class InferenceSiameseNetworkDataset(Dataset):
    def __init__(self, imageFolderDataset, transform=None, should_invert=True):
        self.imageFolderDataset = imageFolderDataset
        self.transform = transform
        self.should_invert = should_invert

    def __getitem__(self, index):
        img0_tuple = random.choice(self.imageFolderDataset.imgs)
        img1_tuple = random.choice(self.imageFolderDataset.imgs)
        # we need to make sure approx 50% of images are in the same class
        should_get_same_class = random.randint(0, 1)
        if should_get_same_class:
            while True:
                # keep looping till the same class image is found
                img1_tuple = random.choice(self.imageFolderDataset.imgs)
                if img0_tuple[1] == img1_tuple[1]:
                    break
        else:
            while True:
                # keep looping till a different class image is found
                img1_tuple = random.choice(self.imageFolderDataset.imgs)
                if img0_tuple[1] != img1_tuple[1]:
                    break

        img0 = Image.open(img0_tuple[0])
        img1 = Image.open(img1_tuple[0])
        img0 = img0.convert("L")
        img1 = img1.convert("L")

        if self.should_invert:
            img0 = PIL.ImageOps.invert(img0)
            img1 = PIL.ImageOps.invert(img1)

        if self.transform is not None:
            img0 = self.transform(img0)
            img1 = self.transform(img1)

        # label is 1.0 when the two images come from different classes, else 0.0
        return img0, img1, torch.from_numpy(np.array([int(img1_tuple[1] != img0_tuple[1])], dtype=np.float32))

    def __len__(self):
        return len(self.imageFolderDataset.imgs)
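I got this code from GitHub. When I try to compare two images, it picks the images at random. There are two input folders: one image should come from one folder and the other image from the other folder. When I test, it sometimes tests an image against itself; that is, it does not check the image in the other folder.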
import torch
import torch.nn.functional as F
import torchvision
import torchvision.datasets as dset
import torchvision.transforms as transforms
from torch.autograd import Variable
from torch.utils.data import DataLoader

# net (the trained Siamese network) and imshow are defined elsewhere and not shown here

testing_dir1 = '/content/drive/My Drive/Signature Dissimilarity/Forged_Signature_Verification/processed_dataset/training1/'
folder_dataset_test = dset.ImageFolder(root=testing_dir1)
siamese_dataset = InferenceSiameseNetworkDataset(
    imageFolderDataset=folder_dataset_test,
    transform=transforms.Compose([transforms.Resize((100, 100)), transforms.ToTensor()]),
    should_invert=False,
)
test_dataloader = DataLoader(siamese_dataset, num_workers=6, batch_size=1, shuffle=False)
dataiter = iter(test_dataloader)
x0, _, _ = next(dataiter)

for i in range(2):
    _, x1, label2 = next(dataiter)
    concatenated = torch.cat((x0, x1), 0)
    output1, output2 = net(Variable(x0).cuda(), Variable(x1).cuda())
    euclidean_distance = F.pairwise_distance(output1, output2)
    imshow(torchvision.utils.make_grid(concatenated), 'Dissimilarity: {:.2f}'.format(euclidean_distance.item()))
    dis = 'Dissimilarity: {:.2f}'.format(euclidean_distance.item())
    dis1 = dis
    dis1 = dis1.replace("Dissimilarity:", "").replace(" ", "")
    print(dis)
    if float(dis1) < 0.5:
        print("It's Same Signature")
    else:
        print("It's Forged Signature")
Answer:
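Simply set should_get_same_class to 0 in the __getitem__ function of the custom dataset class InferenceSiameseNetworkDataset (that is, replace the random.randint(0, 1) call with 0). This guarantees that the two images always belong to different classes/folders.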
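As a rough illustration of that idea, here is a minimal sketch of the pair selection on its own, using plain (path, class_index) tuples of the kind ImageFolder keeps in its imgs list. The helper name pick_cross_folder_pair and the example paths are made up for this sketch and are not part of the original code. Instead of looping until a different class turns up, it filters the candidates first, so it fails loudly when only one folder is present rather than looping forever.

```python
import random
from typing import Optional, Sequence, Tuple

# (path, class_index): the shape of each entry in ImageFolder.imgs
Sample = Tuple[str, int]


def pick_cross_folder_pair(
    imgs: Sequence[Sample], rng: Optional[random.Random] = None
) -> Tuple[Sample, Sample]:
    """Hypothetical helper: return two samples that never share a class."""
    chooser = rng if rng is not None else random
    img0 = chooser.choice(imgs)
    # Keep only samples whose class index differs from the first pick.
    others = [sample for sample in imgs if sample[1] != img0[1]]
    if not others:
        raise ValueError("need images from at least two folders")
    img1 = chooser.choice(others)
    return img0, img1


# Made-up example paths: one image per folder, as in the question.
example_imgs = [("folder_a/img_a.png", 0), ("folder_b/img_b.png", 1)]
img0_tuple, img1_tuple = pick_cross_folder_pair(example_imgs, random.Random(0))
label = float(img0_tuple[1] != img1_tuple[1])  # 1.0 means "different folders"
```

Inside __getitem__, the two returned tuples would take the place of img0_tuple and img1_tuple, and the rest of the method could stay as it is.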
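Second, you should not concatenate samples taken from two different batches, since such a pair may not satisfy your condition. Instead, use x0, x1, label2 = next(dataiter) inside the loop, and only then do the concatenation.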
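The loop structure could look roughly like the sketch below. It is written against generic callables so it runs without a GPU or a trained model: score_pairs, embed and distance are names invented for this illustration, and the toy numbers merely stand in for image tensors. The 0.5 threshold is the one used in the question.

```python
def score_pairs(batches, embed, distance, threshold=0.5):
    """Hypothetical helper: score each (x0, x1, label) batch as one pair."""
    results = []
    for x0, x1, label in batches:
        # Both images are unpacked from the SAME batch, so they are the
        # pair the dataset actually built (and labelled) together.
        out0, out1 = embed(x0, x1)
        dist = float(distance(out0, out1))
        results.append((dist, dist < threshold, label))
    return results


# Toy stand-ins: scalars instead of image tensors, identity "network",
# absolute difference instead of a Euclidean distance.
toy_batches = [(0.0, 0.2, 0), (0.0, 1.5, 1)]
toy_results = score_pairs(
    toy_batches,
    embed=lambda a, b: (a, b),
    distance=lambda a, b: abs(a - b),
)
for dist, is_same, label in toy_results:
    verdict = "It's Same Signature" if is_same else "It's Forged Signature"
    print("Dissimilarity: {:.2f} -> {}".format(dist, verdict))
```

With the objects from the question, one would presumably pass test_dataloader as batches, something like lambda a, b: net(a.cuda(), b.cuda()) as embed, and F.pairwise_distance as distance; that wiring is an assumption and has not been checked against the asker's model.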