I built a custom neural network model, shown below:
import torch
import torch.nn.functional as F

class MyNNet(torch.nn.Module):
    def __init__(self, inp_dim, n_classes):
        super(MyNNet, self).__init__()
        self.flat = torch.nn.Flatten()
        self.l1 = torch.nn.Linear(inp_dim * inp_dim, 32)
        self.l2 = torch.nn.Linear(32, 16)
        self.l3 = torch.nn.Linear(16, n_classes)

    def forward(self, X):
        out = self.flat(X)
        out = F.relu(self.l1(out))
        out = F.relu(self.l2(out))
        return self.l3(out)
And a simple training script that updates the model parameters:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = MyNNet(28, 10)
model.to(device)
optimizer = torch.optim.Adam(model.parameters())
loss = torch.nn.CrossEntropyLoss()
epochs = 20
for e in range(epochs):
    train_l = 0.
    for i, (s, c) in enumerate(train_loader):
        s.to(device)
        c.to(device)
        y_hat = model(s)
        l = loss(y_hat, c)
        train_l += l
        l.backward()
        optimizer.step()
        optimizer.zero_grad()
    print(f'Epoch: {e}, AvgLoss: {train_l / len(train_loader)}')
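In the script, I move the model to cuda and do the same for every batch of the dataset (MNIST). However, I get the following error:

Expected all tensors to be on the same device, but found at least two devices

But when I comment out model.to(device), the script works fine. Does this mean that PyTorch automatically stores custom models on cuda?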
Thanks.
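Answer:
No, PyTorch does not move the model to cuda automatically. With model.to(device) commented out, the model and the batches all stay on the CPU, so the devices match. Unlike Module (where .to(...) works in place), Tensor.to(...) returns a new tensor and leaves the original untouched, so the result has to be reassigned when moving a Tensor to a device: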
s = s.to(device)
c = c.to(device)
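The difference between the two .to(...) behaviours can be checked with a small sketch. It is only an illustration, not part of the original script: it uses a dtype conversion as a stand-in for a device move so that it also runs on a machine without a GPU, and the tiny Linear layer and the variable names are assumptions made up for the example.

```python
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Module.to(...) converts the module's parameters in place and returns the module itself.
model = torch.nn.Linear(4, 2)          # hypothetical stand-in for MyNNet
returned = model.to(torch.float64)
module_same_object = returned is model
module_dtype = model.weight.dtype      # the original object was changed

# Tensor.to(...) returns a new tensor; the original tensor is left as it was.
s = torch.zeros(3, 4)                  # float32 by default
s.to(torch.float64)                    # result is discarded, s is unchanged
dtype_without_reassign = s.dtype
s = s.to(torch.float64)                # reassigning keeps the converted tensor
dtype_with_reassign = s.dtype

# The same pattern applies to devices: reassign tensors, call .to(...) on the module.
model.to(device)
s = s.to(device)
out = model(s)                         # model and input are now on the same device
```

In the training loop from the question, this means replacing the two bare s.to(device) and c.to(device) calls with the reassigned versions, while model.to(device) can stay as it is.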