
【Question title】: I'm doing a machine learning project using MNIST data set with jupyter notebook [closed]
【Posted】: 2021-09-27 06:16:21
【Question description】:

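For some reason I am getting this error and I don't know why. Any help is appreciated. I am doing MNIST handwritten digit recognition in a Jupyter notebook. The x_train array is a (70000, 784) array.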

~\AppData\Local\Temp/ipykernel_6700/1983129579.py in <module>
      1 shuffle_index = np.random.permutation(6000)
----> 2 x_train, y_train = x_train[shuffle_index], y_train[shuffle_index]

c:\users\hp\appdata\local\programs\python\python39\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
   3459             if is_iterator(key):
   3460                 key = list(key)
-> 3461             indexer = self.loc._get_listlike_indexer(key, axis=1)[1]
   3462 
   3463         # take() does not accept boolean indexers

c:\users\hp\appdata\local\programs\python\python39\lib\site-packages\pandas\core\indexing.py in _get_listlike_indexer(self, key, axis)
   1312             keyarr, indexer, new_indexer = ax._reindex_non_unique(keyarr)
   1313 
-> 1314         self._validate_read_indexer(keyarr, indexer, axis)
   1315 
   1316         if needs_i8_conversion(ax.dtype) or isinstance(

c:\users\hp\appdata\local\programs\python\python39\lib\site-packages\pandas\core\indexing.py in _validate_read_indexer(self, key, indexer, axis)
   1372                 if use_interval_msg:
   1373                     key = list(key)
-> 1374                 raise KeyError(f"None of [{key}] are in the [{axis_name}]")
   1375 
   1376             not_found = list(ensure_index(key)[missing_mask.nonzero()[0]].unique())

KeyError: "None of [Int64Index([4112, 3293,  403, 2579,  942,  987, 3778, 3831, 3053, 3412,\n            ...\n             642, 2789, 3410, 3946, 5883, 3439, 2029, 2776, 4626,  497],\n           dtype='int64', length=6000)] are in the [columns]"

【Comments on the question】:

Please add the error traceback and a minimal reproducible example: meta.stackoverflow.com/questions/285551/…
Please read the description of the machine-learning tag.

Tags: python numpy machine-learning scikit-learn

【Solution 1】:

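If x_train and y_train are DataFrames and you want the results to stay DataFrames, select rows by position with .iloc, using the same index for both so the labels stay aligned:

x_train, y_train = x_train.iloc[shuffle_index], y_train.iloc[shuffle_index]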
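
To illustrate the .iloc approach, here is a minimal sketch on a small stand-in frame. The names x_small and y_small, the pixel column names, and the sizes are assumptions made up for the example, not the real MNIST data, and np.random.default_rng is used only to make the run repeatable.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the MNIST frame: 6 rows, 3 string-named columns.
# Row i starts with the value 3 * i, and its label is i.
x_small = pd.DataFrame(
    np.arange(18).reshape(6, 3),
    columns=["pixel1", "pixel2", "pixel3"],
)
y_small = pd.Series(np.arange(6), name="class")

# One permutation of the row positions, seeded for repeatability.
rng = np.random.default_rng(0)
shuffle_index = rng.permutation(len(x_small))

# .iloc selects rows by position, so applying the same index to both
# objects keeps every row paired with its label.
x_shuffled = x_small.iloc[shuffle_index]
y_shuffled = y_small.iloc[shuffle_index]
```

Both results are still pandas objects, and their original row labels travel with them, so x_shuffled.index and y_shuffled.index remain identical.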

【Discussion】:

【Solution 2】:

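np.random.permutation can also be used with NumPy arrays. Calling it separately on x_train and y_train would shuffle them in two different orders and break the pairing between images and labels, so draw one permutation and apply it to both:

perm = np.random.permutation(len(x_train))
x_train = np.asarray(x_train)[perm]
y_train = np.asarray(y_train)[perm]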

You can also refer to the examples in the np.random.permutation documentation.
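
When features and labels are shuffled, they need to end up in the same order. The sketch below shows one way to check that on toy NumPy arrays. The arrays features and labels and their shapes are assumptions for illustration, and np.random.default_rng stands in for np.random.permutation only so the run is repeatable.

```python
import numpy as np

# Toy data (assumed shapes): 5 samples with 2 features each.
# Row i starts with the value 2 * i, and its label is i.
features = np.arange(10).reshape(5, 2)
labels = np.arange(5)

rng = np.random.default_rng(42)

# Draw one permutation of the row positions and apply it to both arrays.
perm = rng.permutation(len(features))
features_shuffled = features[perm]
labels_shuffled = labels[perm]

# Because both arrays used the same index, each row still sits next to
# its own label.
pairing_kept = bool(np.all(features_shuffled[:, 0] == 2 * labels_shuffled))
```

Two independent calls to a permutation function would each draw their own order, which is why a single shared index is used here.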

【Discussion】:

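This is what I did before: shuffle_index = np.random.permutation(6000), then x_train, y_train = x_train[shuffle_index], y_train[shuffle_index]. Is that still wrong?
I guess you are trying to shuffle the data, right? Judging by the traceback, which goes through pandas\core\frame.py, x_train is a pandas DataFrame rather than a NumPy array. Indexing a DataFrame with a list of integers looks those integers up as column labels instead of row positions, which is why the KeyError says none of them are in the columns.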
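
The KeyError from the question can be reproduced on a tiny frame. This is a sketch under assumptions: the frame, its pixel column names, and the index values are invented for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical 4-row frame with string column names, like a pixel table.
frame = pd.DataFrame(np.arange(8).reshape(4, 2), columns=["pixel1", "pixel2"])
index = np.array([2, 0, 3, 1])

# frame[index] treats the integers as column labels. No column is named
# 2, 0, 3 or 1, so pandas raises KeyError.
try:
    frame[index]
    raised = False
except KeyError:
    raised = True

# Selecting rows by position works with .iloc (the result stays a DataFrame) ...
rows_by_position = frame.iloc[index]

# ... or after converting to a NumPy array (the result is an ndarray).
as_array = frame.to_numpy()[index]
```

Here raised ends up True, while both alternatives return the four rows in the order given by index.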