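I have already downloaded the MNIST dataset from LeCun's website. I want to write Python code that unpacks the gzip files and reads the dataset directly from a local directory, so that I no longer need to download anything from, or even access, the MNIST website.
Desired flow: access the folder/directory -> unpack the gzip files -> read the dataset (one-hot encoded)
How can I do this? Almost every tutorial I have found needs to reach LeCun's site or the TensorFlow site to download and read the dataset. Thanks in advance!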
Answer:
This TensorFlow call

    from tensorflow.examples.tutorials.mnist import input_data
    input_data.read_data_sets('my/directory')

... will not download anything if the files are already in that directory.
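One way to confirm beforehand that nothing would need to be downloaded is to check that the four archives are present under their usual names. This is only a minimal sketch: `missing_mnist_files` is a hypothetical helper written for illustration, not part of TensorFlow, and it assumes the files kept their original names from the MNIST site.

```python
import os

# The four archive names as distributed on the MNIST site.
MNIST_FILES = (
    'train-images-idx3-ubyte.gz',
    'train-labels-idx1-ubyte.gz',
    't10k-images-idx3-ubyte.gz',
    't10k-labels-idx1-ubyte.gz',
)


def missing_mnist_files(directory):
    """Return the archive names that are not present in `directory`."""
    return [name for name in MNIST_FILES
            if not os.path.isfile(os.path.join(directory, name))]
```

An empty list from `missing_mnist_files('my/directory')` means all four files are already in place.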
But if for some reason you want to do the unpacking yourself, here is how:

    from tensorflow.contrib.learn.python.learn.datasets.mnist import extract_images, extract_labels

    # Each helper takes an open binary file object and handles the gzip decoding itself.
    # For one-hot labels, call extract_labels(f, one_hot=True).
    with open('my/directory/train-images-idx3-ubyte.gz', 'rb') as f:
        train_images = extract_images(f)
    with open('my/directory/train-labels-idx1-ubyte.gz', 'rb') as f:
        train_labels = extract_labels(f)
    with open('my/directory/t10k-images-idx3-ubyte.gz', 'rb') as f:
        test_images = extract_images(f)
    with open('my/directory/t10k-labels-idx1-ubyte.gz', 'rb') as f:
        test_labels = extract_labels(f)
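The `tensorflow.contrib` and `tensorflow.examples.tutorials` modules were removed in TensorFlow 2.x, so it may help to see the same flow (open directory, unpack gzip, read, one-hot encode) done with only the standard library and NumPy. The following is a rough sketch, not the TensorFlow implementation: the function names `read_idx_images`, `read_idx_labels`, `one_hot` and `load_mnist` are made up for this example, and it assumes the files are the unmodified IDX archives from the MNIST site.

```python
import gzip
import os
import struct

import numpy as np


def read_idx_images(path):
    """Read a gzipped IDX image file into a uint8 array of shape (n, rows, cols)."""
    with gzip.open(path, 'rb') as f:
        # Header: four big-endian 32-bit ints (magic, image count, rows, cols).
        magic, count, rows, cols = struct.unpack('>IIII', f.read(16))
        if magic != 2051:
            raise ValueError('unexpected magic number %d in %s' % (magic, path))
        pixels = np.frombuffer(f.read(), dtype=np.uint8)
    return pixels.reshape(count, rows, cols)


def read_idx_labels(path):
    """Read a gzipped IDX label file into a uint8 array of shape (n,)."""
    with gzip.open(path, 'rb') as f:
        # Header: two big-endian 32-bit ints (magic, label count).
        magic, count = struct.unpack('>II', f.read(8))
        if magic != 2049:
            raise ValueError('unexpected magic number %d in %s' % (magic, path))
        labels = np.frombuffer(f.read(), dtype=np.uint8)
    if labels.shape[0] != count:
        raise ValueError('label count does not match header in %s' % path)
    return labels


def one_hot(labels, num_classes=10):
    """Turn integer class labels into rows of a (n, num_classes) float32 matrix."""
    return np.eye(num_classes, dtype=np.float32)[labels]


def load_mnist(data_dir):
    """Load all four archives from data_dir; labels are returned one-hot encoded."""
    def path(name):
        return os.path.join(data_dir, name)

    train_images = read_idx_images(path('train-images-idx3-ubyte.gz'))
    train_labels = one_hot(read_idx_labels(path('train-labels-idx1-ubyte.gz')))
    test_images = read_idx_images(path('t10k-images-idx3-ubyte.gz'))
    test_labels = one_hot(read_idx_labels(path('t10k-labels-idx1-ubyte.gz')))
    return train_images, train_labels, test_images, test_labels
```

Calling `load_mnist('my/directory')` returns the four arrays without touching the network. Unlike `extract_images`, this sketch leaves the images as three-dimensional `uint8` arrays with no trailing channel axis and no rescaling.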