Here x_dat and y_dat are just very long 1-dimensional tensors.
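data_product() is not shown here; a minimal stand-in consistent with the description (it only needs to return two equally long 1-D tensors; the size and target function below are arbitrary placeholders) would be:

import torch

def data_product():
    # Hypothetical stand-in for the real data-generating function:
    # returns two equally long 1-D tensors (inputs and targets).
    x = torch.linspace(-1.0, 1.0, 1_000_000)  # arbitrary placeholder size
    y = torch.sin(x)                          # arbitrary target function
    return x, y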
from torch.utils.data import Dataset, DataLoader, SubsetRandomSampler

class FunctionDataset(Dataset):
    def __init__(self):
        x_dat, y_dat = data_product()
        self.length = len(x_dat)
        self.y_dat = y_dat
        self.x_dat = x_dat

    def __getitem__(self, index):
        sample = self.x_dat[index]
        label = self.y_dat[index]
        return sample, label

    def __len__(self):
        return self.length

...

data_set = FunctionDataset()

...

training_sampler = SubsetRandomSampler(train_indices)
validation_sampler = SubsetRandomSampler(validation_indices)

training_loader = DataLoader(data_set, sampler=training_sampler,
                             batch_size=params['batch_size'], shuffle=False)
validation_loader = DataLoader(data_set, sampler=validation_sampler,
                               batch_size=valid_size, shuffle=False)
I have also tried pinning memory for both loaders. With num_workers set greater than 0 I get runtime errors between the processes (EOF errors and interruption errors). I fetch my batches with:
x_val, target = next(iter(training_loader))
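For completeness, the multi-worker variant that produced those errors looked roughly like this (num_workers=4 is just an example value; any value above 0 failed):

training_loader = DataLoader(data_set, sampler=training_sampler,
                             batch_size=params['batch_size'], shuffle=False,
                             pin_memory=True,  # also tried with pinned memory
                             num_workers=4)    # any value > 0 raised EOF/interruption errors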
The whole dataset fits into memory/GPU, but I want to emulate batches for this experiment. Profiling my process gives me the following:
         16276989 function calls (16254744 primitive calls) in 38.779 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   1745/1    0.028    0.000   38.780   38.780 {built-in method builtins.exec}
        1    0.052    0.052   38.780   38.780 simple aprox.py:3(<module>)
        1    0.000    0.000   36.900   36.900 simple aprox.py:519(exploreHeatmap)
        1    0.000    0.000   36.900   36.900 simple aprox.py:497(optFromSample)
        1    0.033    0.033   36.900   36.900 simple aprox.py:274(train)
  705/483    0.001    0.000   34.495    0.071 {built-in method builtins.next}
      222    1.525    0.007   34.493    0.155 dataloader.py:311(__next__)
      222    0.851    0.004   12.752    0.057 dataloader.py:314(<listcomp>)
  3016001   11.901    0.000   11.901    0.000 simple aprox.py:176(__getitem__)
       21    0.010    0.000   10.891    0.519 simple aprox.py:413(validationError)
      443    1.380    0.003    9.664    0.022 sampler.py:136(__iter__)
  663/221    2.209    0.003    8.652    0.039 dataloader.py:151(default_collate)
      221    0.070    0.000    6.441    0.029 dataloader.py:187(<listcomp>)
      442    6.369    0.014    6.369    0.014 {built-in method stack}
  3060221    2.799    0.000    5.890    0.000 sampler.py:68(<genexpr>)
  3060000    3.091    0.000    3.091    0.000 tensor.py:382(<lambda>)
      222    0.001    0.000    1.985    0.009 sampler.py:67(__iter__)
      222    1.982    0.009    1.982    0.009 {built-in method randperm}
  663/221    0.002    0.000    1.901    0.009 dataloader.py:192(pin_memory_batch)
      221    0.000    0.000    1.899    0.009 dataloader.py:200(<listcomp>)
      ....
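For reference, output in this format comes from cProfile, e.g. (the exact invocation is not shown above):

python -m cProfile -s cumulative "simple aprox.py"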
This suggests that the data loader is very slow compared to the rest of my experiment's activity (training the model and lots of other computations, etc.). What is going wrong here, and what is the best way to speed it up?
Answer:
When fetching batches with
x, y = next(iter(training_loader))
you actually create a new instance of the dataloader iterator on every call(!). See this thread for more information.
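You can check this yourself: each call to iter() builds a fresh iterator object, and with a SubsetRandomSampler every fresh iterator also re-runs randperm over the indices, which is exactly the 222 calls to {built-in method randperm} in your profile. A minimal sketch:

it1 = iter(training_loader)
it2 = iter(training_loader)
print(it1 is it2)  # False -- two distinct iterator objects were built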
What you should do instead is create the iterator once (per epoch):
training_loader_iter = iter(training_loader)
and then call next on it for each batch:
for i in range(num_batches_in_epoch):
    x, y = next(training_loader_iter)
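Put together, one epoch then looks like this (num_epochs, num_batches_in_epoch, and train_step are placeholders for your own loop bounds and training step):

for epoch in range(num_epochs):
    training_loader_iter = iter(training_loader)  # built once per epoch
    for i in range(num_batches_in_epoch):
        x, y = next(training_loader_iter)
        train_step(x, y)  # placeholder: forward pass, loss, backward, optimizer step

Equivalently, you can let the for statement manage the iterator, which also creates it exactly once per epoch:

for epoch in range(num_epochs):
    for x, y in training_loader:
        train_step(x, y)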
I had a similar problem in the past, and doing it this way also resolved the EOF errors you are seeing when using multiple workers.