Machine learning model fails to load all batches

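I'm trying to build a machine learning model on the CIFAR-10 dataset, but I've run into a problem: training stops at i = 78 (the inner loop exits once i reaches 78; see the code below for details).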

import torch
import torchvision.transforms as transforms
from torchvision.datasets import CIFAR10
from torchvision.transforms import ToTensor
from torch.utils.data.dataloader import DataLoader

transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

train_dataset = CIFAR10(root = './data', train = True, download = True, transform = transform)
train_loader = DataLoader(train_dataset, batch_size = 4, shuffle = True, num_workers = 2)
test_dataset = CIFAR10(root = './data', train = False, download = True, transform = transform)
test_loader = DataLoader(test_dataset, batch_size = 128, shuffle = False, num_workers = 2)

import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

model = Net()
optimiser = torch.optim.SGD(model.parameters(), lr = 0.001, momentum=0.9)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(2):
    running_loss = 0
    for i, data in enumerate(test_loader, 0):
        images, labels = data
        outputs = model(images)
        loss = loss_fn(outputs, labels)
        optimiser.zero_grad()
        loss.backward()
        optimiser.step()
        running_loss += loss.item()
        print(i)
        if i % 2000 == 1999:    # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0

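Sorry for posting the entire code; I couldn't figure out where my mistake is. Also, since I couldn't get it to work, I tried copying the exact code from the tutorial, and it works as expected! I'm posting that code below as well.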

import torch
import torchvision
import torchvision.transforms as transforms

transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)
testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()

import torch.optim as optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

for epoch in range(2):  # loop over the dataset multiple times
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:    # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

print('Finished Training')

Please help me find the error!


Answer:

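Look at your main loop and you'll notice that you are using test_loader instead of train_loader. The CIFAR-10 test set has 10,000 images, so with batch_size = 128 the loader yields only 79 batches (i = 0 to 78), which is why the loop stops there. This: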

for epoch in range(2):
    running_loss = 0
    for i, data in enumerate(test_loader, 0):
        images, labels = data
        outputs = model(images)

should look like this:

for epoch in range(2):
    running_loss = 0
    for i, data in enumerate(train_loader, 0):
        images, labels = data
        outputs = model(images)
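
To see why the loop ends at i = 78, it helps to count batches. The following is a rough sketch, assuming the standard CIFAR-10 split sizes (50,000 training and 10,000 test images) and a loader that keeps the final partial batch (the DataLoader default, drop_last=False). num_batches is a hypothetical helper written for illustration, not a PyTorch function:

```python
def num_batches(dataset_size: int, batch_size: int, drop_last: bool = False) -> int:
    """Batches per epoch for a dataset of the given size (hypothetical helper)."""
    if drop_last:
        # The incomplete final batch is discarded.
        return dataset_size // batch_size
    # Integer ceiling division: the final partial batch still counts.
    return (dataset_size + batch_size - 1) // batch_size


# Assumed CIFAR-10 split sizes: 50,000 training images, 10,000 test images.
test_batches = num_batches(10_000, 128)   # the question's test_loader settings
train_batches = num_batches(50_000, 4)    # the question's train_loader settings

print(test_batches - 1)   # last value of i when iterating test_loader: 78
print(train_batches - 1)  # last value of i when iterating train_loader: 12499
```

With only 79 batches per epoch, the condition i % 2000 == 1999 is never true, so the loop over test_loader prints no loss at all. In a real script, len(train_loader) and len(test_loader) should report the same counts directly.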
