The code is as follows:
import torch
import numpy as np

predictors = np.array([[73,67,43],[91,88,64],[87,134,58],[102,43,37],[69,96,70]], dtype='float32')
outputs = np.array([[56,70],[81,101],[119,133],[22,37],[103,119]], dtype='float32')

inputs = torch.from_numpy(predictors)
targets = torch.from_numpy(outputs)

weights = torch.randn(2,3, requires_grad=True)
biases = torch.randn(2, requires_grad=True)

def loss_mse(x, y):
    d = x - y
    return torch.sum(d*d) / d.numel()

def model(w, b, x):
    return x @ w.t() + b

def train(x, y, w, b, lr, e):
    w = torch.tensor(w, requires_grad=True)
    b = torch.tensor(b, requires_grad=True)
    for epoch in range(e):
        preds = model(w, b, x)
        loss = loss_mse(y, preds)
        if epoch % 5 == 0:
            print("Loss at Epoch [{}/{}] is {}".format(epoch, e, loss))
        #loss.requires_grad=True
        loss.backward()
        with torch.no_grad():
            w = w - lr*w.grad
            b = b - lr*b.grad
            w.grad.zero_()
            b.grad.zero_()

train(inputs, targets, weights, biases, 1e-5, 100)
Running this code produces different errors on different runs. One time it is an error about the size of loss being 0. Another time it occurs at the update line w = w - lr*w.grad, with an error that a float cannot be subtracted from a NoneType (w.grad is None).
Answer:
First of all, why are you wrapping the weights and biases into tensors twice? You create them here:
weights = torch.randn(2,3, requires_grad=True)
biases = torch.randn(2, requires_grad=True)
and then inside the train function you wrap them again:
w = torch.tensor(w, requires_grad=True)
b = torch.tensor(b, requires_grad=True)
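The torch.tensor(w, requires_grad=True) call makes a detached copy, so the weights and biases you created outside train are never the tensors being trained. The real failure, though, is the assignment w = w - lr*w.grad inside torch.no_grad(): it rebinds the name w to a brand-new tensor that has requires_grad=False and whose .grad is None. A minimal sketch of this failure mode (illustrative variable names, not from your code):

import torch

w = torch.randn(3, requires_grad=True)   # leaf tensor: backward() fills w.grad
loss = (w * w).sum()
loss.backward()
print(w.grad is None)                    # False: the leaf received a gradient

with torch.no_grad():
    w = w - 0.1 * w.grad                 # rebinds w to a brand-new, untracked tensor
print(w.requires_grad, w.grad)           # False None

# On the next iteration this breaks in exactly your two ways:
# (w * w).sum().backward()  -> RuntimeError: element 0 of tensors does not
#                              require grad and does not have a grad_fn
# w - 0.1 * w.grad          -> TypeError: unsupported operand type(s)
#                              for *: 'float' and 'NoneType'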
Second, in the weight-update step, change it to:
with torch.no_grad():
    w_new = w - lr*w.grad
    b_new = b - lr*b.grad
    w.copy_(w_new)
    b.copy_(b_new)
    w.grad.zero_()
    b.grad.zero_()
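Putting both fixes together, train no longer needs the inner wrapping at all. A sketch of the corrected function, reusing your model and loss_mse from above (the in-place copy_ keeps w and b the same leaf tensors across epochs):

def train(x, y, w, b, lr, e):
    for epoch in range(e):
        preds = model(w, b, x)
        loss = loss_mse(y, preds)
        if epoch % 5 == 0:
            print("Loss at Epoch [{}/{}] is {}".format(epoch, e, loss))
        loss.backward()
        with torch.no_grad():
            w.copy_(w - lr * w.grad)   # update in place: w stays a leaf with requires_grad=True
            b.copy_(b - lr * b.grad)
            w.grad.zero_()             # reset accumulated gradients for the next epoch
            b.grad.zero_()

train(inputs, targets, weights, biases, 1e-5, 100)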
You can have a look at this discussion for a more thorough explanation: https://discuss.pytorch.org/t/updatation-of-parameters-without-using-optimizer-step/34244/20
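For completeness: once the manual version works, the idiomatic approach is to hand the same leaf tensors to an optimizer and let optimizer.step() perform exactly this update. A sketch using torch.optim.SGD:

opt = torch.optim.SGD([weights, biases], lr=1e-5)
for epoch in range(100):
    preds = model(weights, biases, inputs)
    loss = loss_mse(targets, preds)
    loss.backward()
    opt.step()        # same effect as w.copy_(w - lr * w.grad) for each parameter
    opt.zero_grad()   # same effect as grad.zero_() on each parameter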