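Confusion about using PCA in sklearn for image retrieval

【Question title】: Confusion about using PCA in sklearn for image retrieval
【Posted】: 2021-10-28 21:21:24
【Question】:

I need dimensionality reduction for image retrieval, so I tried using PCA in sklearn to reduce the feature dimension from 2048 to 512. Here is my sample code: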

from sklearn.decomposition import PCA
import numpy as np

dim = 512  # target dimensionality
x = np.random.random((32, 2048))  # shape: (batch, features)
pca = PCA(n_components=dim, copy=True)
pca.fit(x)  # raises the ValueError shown below

The code raises an error:

ValueError: n_components=512 must be between 0 and min(n_samples, n_features)=32 with svd_solver='full'

If I switch from per-batch processing to a single sample, pca(x[0]), a different error is raised:

ValueError: Expected 2D array, got 1D array instead:
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

So how can I use PCA to reduce the dimension from 2048 to 512? Thanks!

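【Comments】:

This might help: askpython.com/python/examples/…
@MithridatestheGreat Sorry, I didn't find anything useful in that article. Could you give a more detailed guide or a solution? Thanks.
I think you need to read more about the basics of PCA. With only 32 points you cannot reduce the dimensionality from 2048 to 512; the answer already says so. Apart from that, it will work fine. By the way, the link in my original comment does exactly what you are trying to do.
@MithridatestheGreat Oh, sorry for not describing it clearly. What I want is to reduce the dimension "per feature". The 32 here is the batch size, not the number of points, so the desired output should have shape (32, 512), meaning each feature vector is processed independently. How can I achieve that?

Tags: python scikit-learn pca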

【Solution 1】:

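This error means that the number of components you keep (n_components) cannot exceed min(n_samples, n_features). With only 32 samples, PCA can return at most 32 components, so you need more samples to perform this reduction (at least 512).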
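
As a rough illustration of that point, here is a minimal sketch of one way to get a (32, 512) output: fit PCA once on a larger collection of feature vectors, then reuse the fitted model to transform each batch. The `gallery` array, its size of 1024 samples, and the random data are assumptions made for this example; in practice you would fit on features extracted from your own image set.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Assumed stand-in for features extracted from many images:
# 1024 samples, each a 2048-dimensional vector.
gallery = rng.random((1024, 2048))

# Fit once on the large set; 512 <= min(1024, 2048), so this is valid.
pca = PCA(n_components=512)
pca.fit(gallery)

# A batch of 32 feature vectors, as in the question.
batch = rng.random((32, 2048))

# transform() projects each row independently onto the fitted components.
reduced = pca.transform(batch)
print(reduced.shape)  # (32, 512)
```

Fitting requires at least as many samples as components, but `transform` does not: once the model is fitted, it can be applied to a batch of any size, including a single sample reshaped to (1, 2048).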

【Comments】: