I am learning to use PyTorch.
I have a question: how can I check the gradient of each layer's output in my code?
My code is below:
# Import the necessary libraries
import numpy as np
import torch
import time

# Loading the Fashion-MNIST dataset
from torchvision import datasets, transforms

# Get GPU device
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Define a transform to normalize the data
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5,), (0.5,))])

# Download and load the training data
trainset = datasets.FashionMNIST('MNIST_data/', download = True, train = True, transform = transform)
testset = datasets.FashionMNIST('MNIST_data/', download = True, train = False, transform = transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size = 32, shuffle = True, num_workers=4)
testloader = torch.utils.data.DataLoader(testset, batch_size = 32, shuffle = True, num_workers=4)

# Examine a sample (next(iterator) replaces the removed iterator.next() method)
dataiter = iter(trainloader)
images, labels = next(dataiter)

# Define the network architecture
from torch import nn, optim
import torch.nn.functional as F

model = nn.Sequential(nn.Linear(784, 128),
                      nn.ReLU(),
                      nn.Linear(128, 10),
                      nn.LogSoftmax(dim = 1))
model.to(device)

# Define the loss
criterion = nn.CrossEntropyLoss()

# Define the optimizer
optimizer = optim.Adam(model.parameters(), lr = 0.001)

# Define the epochs
epochs = 5
train_losses, test_losses = [], []

# start = time.time()
for e in range(epochs):
    running_loss = 0
    for images, labels in trainloader:
        # Move the batch to the device, then flatten each
        # Fashion-MNIST image into a 784 long vector
        images = images.to(device)
        labels = labels.to(device)
        images = images.view(images.shape[0], -1)

        # Training pass
        optimizer.zero_grad()
        output = model.forward(images)
        loss = criterion(output, labels)
        loss.backward()
        # print(loss.grad)
        optimizer.step()
        running_loss += loss.item()
    else:
        # Runs once per epoch, after the inner loop finishes
        print(model[0].grad)
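If I print model[0].grad after backpropagation, will it be the gradient of each layer's output in each epoch?
Or, if I want to know the gradient of each layer's output, what should I print, and where?
Thank you!! Thanks for reading.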
Answer:
This is a good question if you need to understand what your model computes internally. Let me explain.
First, when you print the model variable, you get this output:
Sequential(
  (0): Linear(in_features=784, out_features=128, bias=True)
  (1): ReLU()
  (2): Linear(in_features=128, out_features=10, bias=True)
  (3): LogSoftmax(dim=1)
)
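If you select model[0], you are selecting the first layer of the model, that is, Linear(in_features=784, out_features=128, bias=True). If you look at the documentation for torch.nn.Linear, you will see that this class has two attributes you can access: Linear.weight and Linear.bias, which give you the weights and the bias of the corresponding layer.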
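Keep in mind that you cannot use model.weight to look at the model's weights, because your linear layers are stored in a container called nn.Sequential, which has no weight attribute.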
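As a minimal sketch of the point above, the snippet below rebuilds the same nn.Sequential model on the CPU and reads the first layer's parameters. The variable names (first_layer, weight_shape, bias_shape, container_has_weight) are illustrative only and do not appear in the question.

```python
import torch
from torch import nn

# Same architecture as in the question, kept on the CPU for illustration.
model = nn.Sequential(
    nn.Linear(784, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
    nn.LogSoftmax(dim=1),
)

# Indexing the container returns the layer stored at that position.
first_layer = model[0]

# nn.Linear stores its weight as (out_features, in_features).
weight_shape = tuple(first_layer.weight.shape)
bias_shape = tuple(first_layer.bias.shape)
print(weight_shape, bias_shape)

# The nn.Sequential container itself exposes no `weight` attribute,
# so model.weight would raise AttributeError.
container_has_weight = hasattr(model, "weight")
print(container_has_weight)
```

Layers without parameters, such as model[1] (the ReLU), have no weight or bias attribute either, so only the two Linear layers can be inspected this way.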
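So, coming back to looking at the weights and biases: you access them layer by layer. model[0].weight and model[0].bias are the weights and the bias of the first layer. In the same way, the gradients of the first layer are model[0].weight.grad and model[0].bias.grad. These are the gradients of the loss with respect to that layer's parameters; they are None until loss.backward() has been called, and model[0].grad itself does not exist because a layer has no grad attribute.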
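The sketch below shows where these values could be read in practice. It is not the asker's training loop: it uses a random stand-in batch instead of Fashion-MNIST, and it runs the layers one at a time so that Tensor.retain_grad() can keep the gradient with respect to each layer's output, which is one possible way to get at what the question asks about. The names images, labels, activations, and grad_before are assumptions made for this example.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Same architecture and loss as in the question, on the CPU.
model = nn.Sequential(
    nn.Linear(784, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
    nn.LogSoftmax(dim=1),
)
criterion = nn.CrossEntropyLoss()

# Random stand-in batch (hypothetical data, not Fashion-MNIST).
images = torch.randn(32, 784)
labels = torch.randint(0, 10, (32,))

# Parameter gradients do not exist before the first backward pass.
grad_before = model[0].weight.grad

# Run the layers one by one and ask autograd to keep the gradient
# of each intermediate output (it is discarded by default).
activations = []
x = images
for layer in model:
    x = layer(x)
    x.retain_grad()
    activations.append(x)

loss = criterion(x, labels)
loss.backward()

# Gradients of the loss with respect to each layer's parameters.
for index, layer in enumerate(model):
    for name, param in layer.named_parameters():
        print(f"layer {index} {name}.grad shape: {tuple(param.grad.shape)}")

# Gradients of the loss with respect to each layer's output.
for index, activation in enumerate(activations):
    print(f"layer {index} output grad shape: {tuple(activation.grad.shape)}")
```

In the question's loop, the parameter gradients would be read right after loss.backward(). Reading them after optimizer.step() still works, but they are cleared by the next optimizer.zero_grad().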