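I am working through the classic digits example. I want to build my first neural network to predict the label {0,1,2,3,4,5,6,7,8,9} of a digit image. In the train.txt file, the first column holds the label and all remaining columns hold the features of that sample. I defined a class to load my data: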
import numpy as np
import pandas as pd
import torch
from torch.utils.data import Dataset


class DigitDataset(Dataset):
    """Digit dataset."""

    def __init__(self, file_path, transform=None):
        """
        Args:
            file_path (string): Path to the space-separated text file; the first
                column is the label, the remaining columns are the features.
            transform (callable, optional): Optional transform to be applied
                on a sample.
        """
        self.data = pd.read_csv(file_path, header=None, sep=" ")
        self.transform = transform

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        if torch.is_tensor(idx):
            idx = idx.tolist()
        labels = self.data.iloc[idx, 0]
        images = self.data.iloc[idx, 1:-1].values.astype(np.uint8).reshape((1, 16, 16))
        if self.transform is not None:
            images = self.transform(images)
        return images, labels
Then I run these commands to split the dataset into batches and to define the model and the loss function:
from torch import nn
from torch.utils.data import DataLoader

train_dataset = DigitDataset("train.txt")
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True, num_workers=4)

# Model creation with the nn.Sequential container
model = nn.Sequential(
    nn.Linear(256, 128),   # layer 1: 256 inputs, 128 outputs
    nn.ReLU(),             # rectified linear unit activation
    nn.Linear(128, 64),    # layer 2: 128 inputs, 64 outputs
    nn.Tanh(),             # tanh activation
    nn.Linear(64, 10),     # layer 3: 64 inputs, 10 outputs (digits 0-9)
    nn.LogSoftmax(dim=1),  # log-probabilities over the 10 classes
)

# Negative log-likelihood loss
criterion = nn.NLLLoss()

images, labels = next(iter(train_loader))
images = images.view(images.shape[0], -1)

logps = model(images)  # log probabilities
loss = criterion(logps, labels)  # calculate the NLL loss
Then I get the following error:
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-2-7f4160c1f086> in <module>
     47 images = images.view(images.shape[0], -1)
     48
---> 49 logps = model(images) #log probabilities
     50 loss = criterion(logps, labels) #calculate the NLL-loss

~/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    725             result = self._slow_forward(*input, **kwargs)
    726         else:
--> 727             result = self.forward(*input, **kwargs)
    728         for hook in itertools.chain(
    729                 _global_forward_hooks.values(),

~/anaconda3/lib/python3.8/site-packages/torch/nn/modules/container.py in forward(self, input)
    115     def forward(self, input):
    116         for module in self:
--> 117             input = module(input)
    118         return input
    119

~/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    725             result = self._slow_forward(*input, **kwargs)
    726         else:
--> 727             result = self.forward(*input, **kwargs)
    728         for hook in itertools.chain(
    729                 _global_forward_hooks.values(),

~/anaconda3/lib/python3.8/site-packages/torch/nn/modules/linear.py in forward(self, input)
     91
     92     def forward(self, input: Tensor) -> Tensor:
---> 93         return F.linear(input, self.weight, self.bias)
     94
     95     def extra_repr(self) -> str:

~/anaconda3/lib/python3.8/site-packages/torch/nn/functional.py in linear(input, weight, bias)
   1688     if input.dim() == 2 and bias is not None:
   1689         # fused op is marginally faster
-> 1690         ret = torch.addmm(bias, input, weight.t())
   1691     else:
   1692         output = input.matmul(weight.t())

RuntimeError: expected scalar type Float but found Byte
Do you know what is going wrong? Thank you for your patience and help!
Answer:
The error is caused by this line:
images = self.data.iloc[idx, 1:-1].values.astype(np.uint8).reshape((1, 16, 16))
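Here images is of type uint8 (byte), while the network needs floating-point input: the weights of nn.Linear are float32, so the matrix multiplication rejects a byte tensor, and gradients can only be computed for floating-point tensors (integers are discrete, so functions of them are not differentiable).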
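The dtype mismatch can be reproduced in isolation. The snippet below is a minimal sketch, not the asker's code: the random batch tensor and the single layer are stand-ins introduced here for illustration. It shows that a uint8 tensor is rejected by nn.Linear, while the same data cast to float32 passes through.

```python
import torch
from torch import nn

# Stand-in for one batch of 64 flattened 16x16 images stored as bytes (hypothetical data).
batch = torch.randint(0, 256, (64, 256), dtype=torch.uint8)

# A single layer with the same input size as the first layer of the model in the question.
layer = nn.Linear(256, 128)

try:
    layer(batch)  # uint8 input against float32 weights
    failed = False
except RuntimeError as err:
    failed = True
    print("uint8 input rejected:", err)

# Casting to float32 (the dtype of the layer's weights) resolves the mismatch.
out = layer(batch.float())
print(out.dtype, out.shape)
```

The exact wording of the error message depends on the PyTorch version, so it may not match the traceback in the question word for word.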
You can use torchvision.transforms.functional.to_tensor to convert the image to float and rescale its values to [0, 1], as follows:
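import torchvision

# to_tensor expects a NumPy image laid out as (H, W, C) and returns a float
# tensor of shape (C, H, W), so reshape to (16, 16, 1) to end up with (1, 16, 16).
images = torchvision.transforms.functional.to_tensor(
    self.data.iloc[idx, 1:-1].values.astype(np.uint8).reshape((16, 16, 1))
)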
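Alternatively, simply cast the array to np.float32 and divide it by 255 to bring the values into the [0, 1] range (dividing the uint8 array directly yields float64, which the float32 layers reject as well).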
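The division approach might look like the sketch below. It assumes pixel values in the range 0 to 255, as the uint8 cast in the question implies; row_to_image is a hypothetical helper name, not part of the original code, and the random row stands in for one row of features from train.txt.

```python
import numpy as np
import torch


def row_to_image(features):
    """Convert 256 pixel values (assumed to lie in 0..255) to a float32 tensor in [0, 1]."""
    # Cast first: float32 divided by a Python float stays float32.
    pixels = np.asarray(features, dtype=np.float32) / 255.0
    return torch.from_numpy(pixels.reshape(1, 16, 16))


# Hypothetical stand-in for self.data.iloc[idx, 1:-1].values
row = np.random.randint(0, 256, size=256)
image = row_to_image(row)
print(image.dtype, image.shape, float(image.min()), float(image.max()))
```

Inside __getitem__, a call such as row_to_image(self.data.iloc[idx, 1:-1].values) could then take the place of the uint8 line, assuming the rest of the class stays unchanged.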