Problem training a PyTorch MLP on the MNIST dataset obtained from Keras

I have finished a PyTorch MLP model for the MNIST dataset, but training it gives two very different results: an accuracy of 0.90+ when using the MNIST dataset from PyTorch, but only about 0.10 when using the MNIST dataset from Keras. My code is below; the dependencies are PyTorch 0.3.0.post4, Keras 2.1.3, and the TensorFlow backend 1.4.1 (GPU version).

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np
import torch as pt
import torchvision as ptv
from keras.datasets import mnist
from torch.nn import functional as F
from torch.utils.data import Dataset, DataLoader

# Training and test data from PyTorch (torchvision).
train_set = ptv.datasets.MNIST("./data/mnist/train", train=True,
                               transform=ptv.transforms.ToTensor(), download=True)
test_set = ptv.datasets.MNIST("./data/mnist/test", train=False,
                              transform=ptv.transforms.ToTensor(), download=True)
train_dataset = DataLoader(train_set, batch_size=100, shuffle=True)
test_dataset = DataLoader(test_set, batch_size=10000, shuffle=True)


class MLP(pt.nn.Module):
    """The multi-layer perceptron."""

    def __init__(self):
        super(MLP, self).__init__()
        self.fc1 = pt.nn.Linear(784, 512)
        self.fc2 = pt.nn.Linear(512, 128)
        self.fc3 = pt.nn.Linear(128, 10)
        self.use_gpu = True

    def forward(self, din):
        din = din.view(-1, 28 * 28)
        dout = F.relu(self.fc1(din))
        dout = F.relu(self.fc2(dout))
        # return F.softmax(self.fc3(dout))
        return self.fc3(dout)


model = MLP().cuda()
print(model)

# Loss function and optimizer.
optimizer = pt.optim.SGD(model.parameters(), lr=1)
criterion = pt.nn.CrossEntropyLoss().cuda()


def evaluate_acc(pred, label):
    pred = pred.cpu().data.numpy()
    label = label.cpu().data.numpy()
    test_np = (np.argmax(pred, 1) == label)
    test_np = np.float32(test_np)
    return np.mean(test_np)


def evaluate_loader(loader):
    print("evaluating ...")
    accuracy_list = []
    for i, (inputs, labels) in enumerate(loader):
        inputs = pt.autograd.Variable(inputs).cuda()
        labels = pt.autograd.Variable(labels).cuda()
        outputs = model(inputs)
        accuracy_list.append(evaluate_acc(outputs, labels))
    print(sum(accuracy_list) / len(accuracy_list))


def training(d, epochs):
    for x in range(epochs):
        for i, data in enumerate(d):
            optimizer.zero_grad()
            (inputs, labels) = data
            inputs = pt.autograd.Variable(inputs).cuda()
            labels = pt.autograd.Variable(labels).cuda()
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()
            if i % 200 == 0:
                print(i, ":", evaluate_acc(outputs, labels))


# Train the MLP for 4 epochs on the MNIST dataset from PyTorch.
training(train_dataset, 4)
# The accuracy is ~0.96.
evaluate_loader(test_dataset)

print("###########################################################")


def load_mnist():
    (x, y), (x_test, y_test) = mnist.load_data()
    x = x.reshape((-1, 1, 28, 28)).astype(np.float32)
    x_test = x_test.reshape((-1, 1, 28, 28)).astype(np.float32)
    y = y.astype(np.int64)
    y_test = y_test.astype(np.int64)
    print("x.shape", x.shape, "y.shape", y.shape,
          "\nx_test.shape", x_test.shape, "y_test.shape", y_test.shape)
    return x, y, x_test, y_test


class TMPDataset(Dataset):
    """Dataset wrapping the Keras MNIST arrays."""

    def __init__(self, a, b):
        self.x = a
        self.y = b

    def __getitem__(self, item):
        return self.x[item], self.y[item]

    def __len__(self):
        return len(self.y)


x_train, y_train, x_test, y_test = load_mnist()

# Create data loaders for the MNIST dataset from Keras.
test_loader = DataLoader(TMPDataset(x_test, y_test), num_workers=1, batch_size=10000)
train_loader = DataLoader(TMPDataset(x_train, y_train), shuffle=True, batch_size=100)

# Evaluate the MLP trained on the PyTorch dataset using the Keras loaders; the accuracy is ~0.96.
evaluate_loader(test_loader)
evaluate_loader(train_loader)

# Re-initialize the model and optimizer.
model = MLP().cuda()
print(model)
optimizer = pt.optim.SGD(model.parameters(), lr=1)
criterion = pt.nn.CrossEntropyLoss().cuda()

# Now train on the MNIST dataset from Keras.
training(train_loader, 4)

# Evaluate the model trained on the MNIST dataset from Keras; the resulting accuracy is ~0.10...
evaluate_loader(test_loader)
evaluate_loader(train_loader)

I have checked some samples from the Keras MNIST dataset and found nothing wrong with them. I am wondering what the problem with that dataset is. The code runs without errors; run it to see the results.


Answer:

The MNIST data coming from Keras is not normalized; following the Keras MNIST MLP example, you should normalize it manually, i.e. your load_mnist() function should include the following:

x /= 255
x_test /= 255
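For concreteness, here is a minimal sketch of the question's load_mnist() helper with that normalization added (the names and reshaping are taken from the question's code; only the two division lines are new):

def load_mnist():
    # The raw Keras arrays hold uint8 pixel values in the range 0-255.
    (x, y), (x_test, y_test) = mnist.load_data()
    x = x.reshape((-1, 1, 28, 28)).astype(np.float32)
    x_test = x_test.reshape((-1, 1, 28, 28)).astype(np.float32)
    # Scale the pixels to [0, 1] so they match what ToTensor() produces.
    x /= 255
    x_test /= 255
    y = y.astype(np.int64)
    y_test = y_test.astype(np.int64)
    return x, y, x_test, y_test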

I am not sure about PyTorch, but it would seem that the MNIST data provided by their own utility functions come already normalized (as is the case with TensorFlow, see the third point in my answer here).
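If you want to verify this on your side, a quick illustrative check (a sketch using the loader names from the question's script, not part of the original code) is to print the pixel range of one batch from each pipeline:

# Compare the pixel ranges of the two data pipelines.
pt_images, _ = next(iter(train_dataset))    # loader built from torchvision's MNIST
keras_images, _ = next(iter(train_loader))  # loader built from the Keras arrays
print("torchvision batch range:", pt_images.min(), pt_images.max())   # expected roughly 0.0 to 1.0
print("Keras batch range:", keras_images.min(), keras_images.max())   # expected 0.0 to 255.0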

Given that, an accuracy of about 10% (i.e. equivalent to random guessing) with un-normalized input data is perfectly consistent.

