While searching for information about mnist.train.next_batch(), I found this: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/learn/python/learn/datasets/mnist.py
In this code:
```python
def next_batch(self, batch_size, fake_data=False, shuffle=True):
  """Return the next `batch_size` examples from this data set."""
  if fake_data:
    fake_image = [1] * 784
    if self.one_hot:
      fake_label = [1] + [0] * 9
    else:
      fake_label = 0
    return [fake_image for _ in xrange(batch_size)], [
        fake_label for _ in xrange(batch_size)
    ]
  start = self._index_in_epoch
  # Shuffle for the first epoch
  if self._epochs_completed == 0 and start == 0 and shuffle:
    perm0 = numpy.arange(self._num_examples)
    numpy.random.shuffle(perm0)
    self._images = self.images[perm0]
    self._labels = self.labels[perm0]
  # Go to the next epoch
  if start + batch_size > self._num_examples:
    # Finished epoch
    self._epochs_completed += 1
    # Get the rest examples in this epoch
    rest_num_examples = self._num_examples - start
    images_rest_part = self._images[start:self._num_examples]
    labels_rest_part = self._labels[start:self._num_examples]
    # Shuffle the data
    if shuffle:
      perm = numpy.arange(self._num_examples)
      numpy.random.shuffle(perm)
      self._images = self.images[perm]
      self._labels = self.labels[perm]
    # Start next epoch
    start = 0
    self._index_in_epoch = batch_size - rest_num_examples
    end = self._index_in_epoch
    images_new_part = self._images[start:end]
    labels_new_part = self._labels[start:end]
    return numpy.concatenate(
        (images_rest_part, images_new_part), axis=0), numpy.concatenate(
            (labels_rest_part, labels_new_part), axis=0)
  else:
    self._index_in_epoch += batch_size
    end = self._index_in_epoch
    return self._images[start:end], self._labels[start:end]
```
I understand that mnist.train.next_batch(batch_size=100) returns 100 examples from the MNIST dataset, chosen at random. Now, my questions are:
- What does shuffle=True mean?
- If I call next_batch(batch_size=100, fake_data=False, shuffle=False), will it pick 100 examples sequentially from the beginning to the end of the MNIST dataset, rather than at random?
Answer:
Regarding the first question: when shuffle=True, the order of the examples in the dataset is randomized; as the code above shows, this happens once before the first epoch and again at every epoch boundary. Regarding the second question: yes, with shuffle=False it returns the examples sequentially, in the original order of the numpy arrays.
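To make this concrete, here is a minimal self-contained sketch of the same batching logic (MiniDataSet is a hypothetical stand-in for TensorFlow's DataSet class, written for illustration only). It shows that shuffle=False yields sequential batches that wrap around at the epoch boundary:

```python
import numpy as np

class MiniDataSet:
    """Stripped-down sketch of DataSet.next_batch (hypothetical, not TF's API)."""

    def __init__(self, images, labels):
        self._images = images
        self._labels = labels
        self._num_examples = len(images)
        self._epochs_completed = 0
        self._index_in_epoch = 0

    def next_batch(self, batch_size, shuffle=True):
        start = self._index_in_epoch
        # Shuffle once before the first epoch
        if self._epochs_completed == 0 and start == 0 and shuffle:
            perm = np.arange(self._num_examples)
            np.random.shuffle(perm)
            self._images = self._images[perm]
            self._labels = self._labels[perm]
        if start + batch_size > self._num_examples:
            # Epoch boundary: take the leftover examples, optionally
            # reshuffle, then fill the batch from the next epoch
            self._epochs_completed += 1
            rest_images = self._images[start:]
            rest_labels = self._labels[start:]
            if shuffle:
                perm = np.arange(self._num_examples)
                np.random.shuffle(perm)
                self._images = self._images[perm]
                self._labels = self._labels[perm]
            self._index_in_epoch = batch_size - len(rest_images)
            new_images = self._images[:self._index_in_epoch]
            new_labels = self._labels[:self._index_in_epoch]
            return (np.concatenate([rest_images, new_images]),
                    np.concatenate([rest_labels, new_labels]))
        self._index_in_epoch += batch_size
        return (self._images[start:self._index_in_epoch],
                self._labels[start:self._index_in_epoch])

# With shuffle=False, batches come back in the original array order:
ds = MiniDataSet(np.arange(10), np.arange(10) * 10)
print(ds.next_batch(4, shuffle=False)[0])  # [0 1 2 3]
print(ds.next_batch(4, shuffle=False)[0])  # [4 5 6 7]
print(ds.next_batch(4, shuffle=False)[0])  # [8 9 0 1] (wraps into next epoch)
```

Note the epoch-boundary behavior: a batch that runs past the end of the dataset is stitched together from the tail of one epoch and the head of the next, which is why the real next_batch concatenates a "rest part" and a "new part".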