The following example comes from:
https://github.com/jcjohnson/pytorch-examples
This code trains successfully:
# Code in file tensor/two_layer_net_tensor.py
import torch

device = torch.device('cpu')
# device = torch.device('cuda')  # Uncomment this to run on GPU

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random input and output data
x = torch.randn(N, D_in, device=device)
y = torch.randn(N, D_out, device=device)

# Randomly initialize weights
w1 = torch.randn(D_in, H, device=device)
w2 = torch.randn(H, D_out, device=device)

learning_rate = 1e-6
for t in range(500):
    # Forward pass: compute predicted y
    h = x.mm(w1)
    h_relu = h.clamp(min=0)
    y_pred = h_relu.mm(w2)

    # Compute and print loss; loss is a scalar, and is stored in a PyTorch Tensor
    # of shape (); we can get its value as a Python number with loss.item().
    loss = (y_pred - y).pow(2).sum()
    print(t, loss.item())

    # Backprop to compute gradients of w1 and w2 with respect to loss
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.t().mm(grad_y_pred)
    grad_h_relu = grad_y_pred.mm(w2.t())
    grad_h = grad_h_relu.clone()
    grad_h[h < 0] = 0
    grad_w1 = x.t().mm(grad_h)

    # Update weights using gradient descent
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2
How do I predict a single example? My previous experience is mostly with building feed-forward networks in numpy. After training a model there, I use forward propagation to predict a single example. Here is my numpy snippet, where new holds the sample I am trying to classify:
import numpy as np

# weight_layer_1, bias_1, weight_layer_2, bias_2 and sigmoid()
# come from the (not shown) training code
new = np.asarray(toclassify)
Z1 = np.dot(weight_layer_1, new.T) + bias_1
sigmoid_activation_1 = sigmoid(Z1)
Z2 = np.dot(weight_layer_2, sigmoid_activation_1) + bias_2
sigmoid_activation_2 = sigmoid(Z2)
sigmoid_activation_2 then contains the predicted output vector. Is a forward pass also the way to make a single prediction in PyTorch?
Answer:
The code you posted is a simple demo meant to reveal the inner mechanics of deep learning frameworks. Frameworks such as PyTorch, Keras, and TensorFlow handle the forward computation and the tracking and application of gradients automatically, as long as you define the network structure. The code you showed, however, still does all of this by hand. That is why predicting a single example feels cumbersome: you are still doing everything from scratch.
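To see the difference, here is a minimal sketch (not taken from the linked repository) of letting PyTorch's autograd compute the gradients that your demo derives by hand:

import torch

N, D_in, H, D_out = 64, 1000, 100, 10
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

# requires_grad=True asks autograd to record operations on these tensors
w1 = torch.randn(D_in, H, requires_grad=True)
w2 = torch.randn(H, D_out, requires_grad=True)

# Forward pass, written exactly as in your demo
y_pred = x.mm(w1).clamp(min=0).mm(w2)
loss = (y_pred - y).pow(2).sum()

# This one call replaces the entire manual backprop block:
# the gradients appear in w1.grad and w2.grad
loss.backward()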
In practice, we define a model class that inherits from torch.nn.Module, initialize all the network components (linear layers, GRU or LSTM layers, and so on) in the __init__ function, and define how those components interact with the network input in the forward function.
Taking an example from the page you linked:
# Code in file nn/two_layer_net_module.py
import torch

class TwoLayerNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        """
        In the constructor we instantiate two nn.Linear modules and assign
        them as member variables.
        """
        super(TwoLayerNet, self).__init__()
        self.linear1 = torch.nn.Linear(D_in, H)
        self.linear2 = torch.nn.Linear(H, D_out)

    def forward(self, x):
        """
        In the forward function we accept a Tensor of input data and we must
        return a Tensor of output data. We can use Modules defined in the
        constructor as well as arbitrary (differentiable) operations on Tensors.
        """
        h_relu = self.linear1(x).clamp(min=0)
        y_pred = self.linear2(h_relu)
        return y_pred

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random Tensors to hold inputs and outputs
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

# Construct our model by instantiating the class defined above.
model = TwoLayerNet(D_in, H, D_out)

# Construct our loss function and an Optimizer. The call to model.parameters()
# in the SGD constructor will contain the learnable parameters of the two
# nn.Linear modules which are members of the model.
# Note: the original used size_average=False, which is deprecated;
# reduction='sum' is the modern equivalent.
loss_fn = torch.nn.MSELoss(reduction='sum')
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)

for t in range(500):
    # Forward pass: Compute predicted y by passing x to the model
    y_pred = model(x)

    # Compute and print loss
    loss = loss_fn(y_pred, y)
    print(t, loss.item())

    # Zero gradients, perform a backward pass, and update the weights.
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
This code defines a model called TwoLayerNet. It initializes two linear layers in the __init__ function and defines, in the forward function, how those two layers interact with the input x. Note that calling model(x) invokes forward through nn.Module.__call__, so you never call forward directly.
With the model defined, we can run a single feed-forward pass as follows. Say xu holds a single unseen example:
xu = torch.randn(D_in)
Then this performs the prediction:
y_pred = model(torch.atleast_2d(xu))
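For real inference (as opposed to this demo), it is also common to switch the model to evaluation mode and disable gradient tracking. A minimal sketch of that pattern, where xu.unsqueeze(0) is an equivalent way to add the batch dimension:

model.eval()             # switch layers like dropout/batchnorm to eval behavior
with torch.no_grad():    # gradient tracking is not needed for prediction
    y_pred = model(xu.unsqueeze(0))   # result has shape (1, D_out)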