I have a simple NN:
import torch
import torch.nn as nn
import torch.optim as optim

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.fc1 = nn.Linear(1, 5)
        self.fc2 = nn.Linear(5, 10)
        self.fc3 = nn.Linear(10, 1)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Model()
opt = optim.Adam(net.parameters())
I also have some input features:
features = torch.rand((3,1))
I can train it normally with a simple MSE loss:
for i in range(10):
    opt.zero_grad()
    out = net(features)
    loss = torch.mean(torch.square(torch.tensor(5) - torch.sum(out)))
    print('loss:', loss)
    loss.backward()
    opt.step()
However, I am trying to create a loss function that takes the actual weight values into account:
loss = 1 - torch.mean(torch.tensor([torch.sum(w_arr) for w_arr in net.parameters()]))
but I get an error:
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
The goal here is to push every weight as close to 1 (or any other value) as possible.
Answer:
A quick fix for the error is to pass requires_grad=True when creating the tensor, like this –
loss = 1 - torch.mean(torch.tensor([torch.sum(w_arr) for w_arr in net.parameters()], requires_grad=True))
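A minimal sketch of what that fix actually does (using a stand-in single nn.Linear layer of my own rather than the Model above): the rebuilt tensor is a brand-new leaf of the graph, so backward() runs, but the gradient lands on that leaf and never reaches the network's parameters:

```python
import torch
import torch.nn as nn

# Stand-in single layer (my own example, not the Model from the question)
net = nn.Linear(1, 1)

# Rebuilding the per-parameter sums as a brand-new tensor creates a fresh leaf
loss = 1 - torch.mean(torch.tensor([torch.sum(w) for w in net.parameters()],
                                   requires_grad=True))
loss.backward()  # no RuntimeError anymore...

# ...but the gradient lands on the new leaf, not on the network:
print(net.weight.grad)  # None
```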
But when the list of weights is converted into a tensor, torch has no record of where that tensor came from (the autograd graph is cut), so the loss will not decrease. One way around this is:
for i in range(500):
    opt.zero_grad()
    out = net(features)
    loss = torch.mean(torch.square(torch.tensor(5) - torch.sum(out)))
    # Penalty averaged over all parameter tensors, pulling each weight towards 1
    w_loss = 0.0
    len_w = 0
    for w_arr in net.parameters():
        w_loss += torch.mean(torch.abs(1 - w_arr))
        len_w += 1
    loss = loss + w_loss / len_w
    print('loss:', loss)
    loss.backward()
    opt.step()
Computing the loss this way ensures that all the weights are driven close to +1.
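As a side note, the inner accumulation loop can also be replaced with torch.stack, which builds the mean over parameter tensors while staying inside the autograd graph. A sketch, again with a stand-in nn.Sequential model of my own rather than the question's Model:

```python
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(1, 5), nn.ReLU(), nn.Linear(5, 1))

# torch.stack keeps each per-parameter term attached to the graph
# (no detached torch.tensor(...) copy), so gradients flow to the weights.
w_loss = torch.stack([torch.mean(torch.abs(1 - w)) for w in net.parameters()]).mean()
w_loss.backward()

# Gradients now reach the network's own parameters:
print(all(p.grad is not None for p in net.parameters()))  # True
```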