I'm running into a problem with the step() method while building a simple neural network with MXNet.
x1.shape = (64, 1, 1000)
y1.shape = (64, 1, 10)
net = nn.Sequential()
net.add(nn.Dense(H, activation='relu'),
        nn.Dense(90, activation='relu'),
        nn.Dense(D_out))
for t in range(500):
    #y_pred = net(x1)
    #loss = loss_fn(y_pred, y)
    #for i in range(len(x1)):
    with autograd.record():
        output = net(x1)
        loss = loss_fn(output, y1)
    loss.backward()
    trainer.step(64)
    if t % 100 == 99:
        print(t, loss)
    #optimizer.zero_grad()
UserWarning: Gradient of Parameter `dense30_weight` on context cpu(0) has not been updated by backward since last `step`. This could mean a bug in your model that made it only use a subset of the Parameters (Blocks) for this iteration. If you are intentionally only using a subset, call step with ignore_stale_grad=True to suppress this warning and skip updating of Parameters with stale gradient.
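For reference, the opt-out the warning mentions is a keyword argument on Trainer.step. A minimal sketch of what that call would look like, only appropriate if using a subset of parameters is actually intentional:

# Suppresses the stale-gradient warning; it does NOT fix the underlying
# bug the warning describes (an illustration, not a recommended fix here):
trainer.step(64, ignore_stale_grad=True)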
Answer:
The warning tells you that parameters passed to the trainer are not part of your computational graph. You need to initialize the model's parameters and define a trainer. Unlike in PyTorch, in MXNet you don't need to call zero_grad, because by default new gradients are written in place of the old ones rather than accumulated. The following code shows a simple neural network implemented with MXNet's Gluon API:
import mxnet as mx
from mxnet import autograd, gluon, nd

# Setup (these names were used but left undefined in the original snippet)
model_ctx = mx.cpu()
num_inputs = 2
num_examples = 10000

# Define model
net = gluon.nn.Dense(1)
net.collect_params().initialize(mx.init.Normal(sigma=1.), ctx=model_ctx)
square_loss = gluon.loss.L2Loss()
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.0001})

# Create random input and labels
def real_fn(X):
    return 2 * X[:, 0] - 3.4 * X[:, 1] + 4.2

X = nd.random_normal(shape=(num_examples, num_inputs))
noise = 0.01 * nd.random_normal(shape=(num_examples,))
y = real_fn(X) + noise

# Define DataLoader
batch_size = 4
train_data = gluon.data.DataLoader(gluon.data.ArrayDataset(X, y),
                                   batch_size=batch_size, shuffle=True)
num_batches = num_examples / batch_size

for e in range(10):
    cumulative_loss = 0  # was missing in the original snippet
    # Iterate over training batches
    for i, (data, label) in enumerate(train_data):
        # Load data on the CPU
        data = data.as_in_context(mx.cpu())
        label = label.as_in_context(mx.cpu())
        with autograd.record():
            output = net(data)
            loss = square_loss(output, label)
        # Backpropagation
        loss.backward()
        trainer.step(batch_size)
        cumulative_loss += nd.mean(loss).asscalar()
    print("Epoch %s, loss: %s" % (e, cumulative_loss / num_examples))
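To make the zero_grad point concrete: gradient accumulation is opt-in in Gluon. Below is a minimal sketch of my own (not part of the answer's code), assuming MXNet 1.x, where setting grad_req='add' turns accumulation on and an explicit zero_grad then becomes necessary:

import mxnet as mx
from mxnet import autograd, gluon, nd

net = gluon.nn.Dense(1)
net.initialize(mx.init.Normal(sigma=1.))
net.collect_params().setattr('grad_req', 'add')  # accumulate instead of overwrite
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.0001})
loss_fn = gluon.loss.L2Loss()

data = nd.random_normal(shape=(8, 2))
label = nd.random_normal(shape=(8,))
for micro_batch in range(4):        # gradients from 4 backward passes are summed
    with autograd.record():
        loss = loss_fn(net(data), label)
    loss.backward()
trainer.step(4 * data.shape[0])     # normalize by the total number of examples
net.collect_params().zero_grad()    # with grad_req='add', this IS needed

With the default grad_req='write', each loss.backward() overwrites the previous gradients, which is why the answer's training loop above needs no zeroing step at all.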