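I'm new to machine learning and know even less about PyTorch. Here is my problem. (I've skipped the parts that seem to work fine, such as random_split().)
I need to predict the quality of red wine. Quality is the last column of the dataset and has 6 classes.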
features = df.drop(['quality'], axis = 1)
targets = df.iloc[:, -1] # theres 6 classes
dataset = TensorDataset(torch.Tensor(np.array(features)).float(), torch.Tensor(targets).float())
# here's where I think the error might be, but I might be wrong
batch_size = 8

# Dataloader
train_loader = DataLoader(train_ds, batch_size, shuffle = True)
val_loader = DataLoader(val_ds, batch_size)
test_ds = DataLoader(test_ds, batch_size)

input_size = len(df.columns) - 1
output_size = 6
threshold = .5

class WineModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(input_size, output_size)

    def forward(self, xb):
        out = self.linear(xb)
        return out

model = WineModel()
n_iters = 2000
num_epochs = n_iters / (len(train_ds) / batch_size)
num_epochs = int(num_epochs)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

# the part below returns the error on running
iter = 0
for epoch in range(num_epochs):
    for i, (x, y) in enumerate(train_loader):
        optimizer.zero_grad()
        outputs = model(x)
        loss = criterion(outputs, y)
        loss.backward()
        optimizer.step()
RuntimeError: expected scalar type Long but found Float
I hope this is enough information.
Answer:
nn.CrossEntropyLoss expects its targets to be class indices. These must be integers; more precisely, they need to be of type torch.long, which is equivalent to torch.int64.
You converted the targets to float, but they should be converted to long:
dataset = TensorDataset(torch.Tensor(np.array(features)).float(), torch.Tensor(targets).long())
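As a minimal sketch of the dtype requirement, the snippet below builds a small batch and computes the loss with long targets. The data is a made-up stand-in (4 samples, 11 features, 6 classes), not the wine dataset, and the names x, y and the bare nn.Linear model are assumptions for illustration only.

```python
import numpy as np
import torch
import torch.nn as nn

# Hypothetical stand-in data: 4 samples, 11 features, 6 classes.
torch.manual_seed(0)
features = np.random.rand(4, 11).astype(np.float32)
targets = np.array([0, 2, 5, 3])  # class indices, already in [0, 5]

x = torch.from_numpy(features)        # float32 inputs for the linear layer
y = torch.from_numpy(targets).long()  # int64 class indices for the loss

model = nn.Linear(11, 6)              # one logit per class
criterion = nn.CrossEntropyLoss()
loss = criterion(model(x), y)         # scalar loss tensor

print(x.dtype, y.dtype, loss.shape)
```

Only the targets need to be long; the features stay float, since the linear layer's weights are float.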
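Since the targets are class indices, they must lie in the range [0, num_classes - 1]. With 6 classes, that is the range [0, 5]. A quick look at your data shows that quality uses values in the range [3, 8]. Even though that gives you 6 classes, these values cannot be used directly as class indices. If you list the classes as classes = [3, 4, 5, 6, 7, 8], you can see that the first class is 3, classes[0] == 3, up to the last class, classes[5] == 8.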
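You need to replace the class values with their indices, just as you would for named classes (for example, with the classes dog and cat, dog would be 0 and cat would be 1). Here you can skip the lookup, because the values are simply offset by 3, i.e. index = classes[index] - 3. So you can subtract 3 from the whole target tensor: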
torch.Tensor(targets).long() - 3
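The mapping from quality values to class indices can be sketched as follows. The sample quality values are invented for illustration. The second variant, using torch.unique with return_inverse=True, is an alternative not mentioned in the answer; it may be useful when the labels are not a simple contiguous offset.

```python
import torch

# Hypothetical sample of quality labels in the range [3, 8].
quality = torch.tensor([5, 3, 8, 6, 7, 4, 5])

# Variant 1: the labels are contiguous and offset by 3, so just subtract.
indices = quality - 3

# Variant 2 (alternative): derive the indices from the sorted unique labels.
# classes holds the sorted distinct values; inverse holds, for each label,
# its position in classes.
classes, inverse = torch.unique(quality, return_inverse=True)

print(indices.tolist())
print(classes.tolist())
print(inverse.tolist())
```

Both variants agree here because every value from 3 to 8 appears in the sample. If a quality value were missing from the data, the second variant would produce fewer classes than the fixed offset does, so output_size would have to match whichever mapping is used.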