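I ran into an error while learning classification with the MNIST dataset, and I can't resolve it. I have searched Google extensively without making any progress. Perhaps you are an expert and can help me. Here is my code: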
>>> from sklearn.datasets import fetch_openml
>>> mnist = fetch_openml('mnist_784', version=1)
>>> mnist.keys()
Output: dict_keys(['data', 'target', 'frame', 'categories', 'feature_names', 'target_names', 'DESCR', 'details', 'url'])
>>> X, y = mnist["data"], mnist["target"]
>>> X.shape
Output: (70000, 784)
>>> y.shape
Output: (70000,)
>>> X[0]
Output:
KeyError                                  Traceback (most recent call last)
c:\users\khush\appdata\local\programs\python\python39\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2897             try:
-> 2898                 return self._engine.get_loc(casted_key)
   2899             except KeyError as err:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 0

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
<ipython-input-10-19c40ecbd036> in <module>
----> 1 X[0]

c:\users\khush\appdata\local\programs\python\python39\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
   2904             if self.columns.nlevels > 1:
   2905                 return self._getitem_multilevel(key)
-> 2906             indexer = self.columns.get_loc(key)
   2907             if is_integer(indexer):
   2908                 indexer = [indexer]

c:\users\khush\appdata\local\programs\python\python39\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2898                 return self._engine.get_loc(casted_key)
   2899             except KeyError as err:
-> 2900                 raise KeyError(key) from err
   2901
   2902         if tolerance is not None:

KeyError: 0
Answer:
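The fetch_openml API changed between scikit-learn versions. In earlier versions it returned a numpy.ndarray. Since version 0.24.0 (December 2020), the as_frame parameter of fetch_openml defaults to 'auto' (the previous default was False), which returns a pandas.DataFrame for the MNIST data. On a DataFrame, X[0] looks up a column labelled 0 rather than the first row, which is why the KeyError is raised. You can force the data to be read as a numpy.ndarray by setting as_frame=False. See the fetch_openml reference.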
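
To illustrate the difference without downloading MNIST, here is a minimal sketch that uses a tiny stand-in DataFrame. The column names and values are made up for illustration, and only numpy and pandas are assumed to be installed. It shows why X[0] fails on a DataFrame and two ways to get the first row; with the real dataset, passing as_frame=False to fetch_openml avoids the problem from the start.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the MNIST DataFrame: 3 rows, 4 "pixel" columns.
# (The real data has 70000 rows and 784 columns; these values are invented.)
X = pd.DataFrame(
    np.arange(12).reshape(3, 4),
    columns=[f"pixel{i}" for i in range(1, 5)],
)

# On a DataFrame, X[0] means "the column labelled 0", which does not exist.
try:
    X[0]
    raised_key_error = False
except KeyError:
    raised_key_error = True

# Option 1: positional row access with .iloc, converted to a NumPy array.
first_row = X.iloc[0].to_numpy()

# Option 2: convert the whole DataFrame once, then index as with an ndarray.
X_array = X.to_numpy()
first_row_from_array = X_array[0]

# Option 3 (real dataset, needs network access, so left as a comment):
# mnist = fetch_openml('mnist_784', version=1, as_frame=False)
# X, y = mnist["data"], mnist["target"]   # X[0] then works directly

print(raised_key_error)
print(first_row)
print(first_row_from_array)
```

Converting once with to_numpy() is probably the more convenient choice if the rest of the code was written against the older ndarray behaviour; .iloc is handy when you want to keep the DataFrame.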