What is this error with sklearn's skpca.fit when doing a PCA analysis

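【Question title】: What is this error with sklearn's skpca.fit when doing a PCA analysis
【Posted】: 2021-05-01 14:22:55
【Question description】:

I am running a simple PCA analysis on some satellite data. All land points have been removed, and the mean and standard deviation of the scaled data are close to 0 and 1. However, the final step of the following code raises a KeyError: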

from sklearn import preprocessing
scaler  = preprocessing.StandardScaler()
scaler_sst = scaler.fit(sss_data)

import joblib
joblib.dump(scaler_sst, './scaler_sst.pkl', compress=9)
scaler_sst = joblib.load('./scaler_sst.pkl')

X = scaler_sst.transform(sss_data)

print(X.mean())
print(X.std())
#X.shape

5.7725416769826885e-15
0.9999999999999993

from sklearn.decomposition import pca
skpca=pca.PCA()
skpca.fit(X)

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
~/miniconda3/envs/py3_std_maps/lib/python3.8/site-packages/IPython/core/formatters.py in __call__(self, obj, include, exclude)
    968 
    969             if method is not None:
--> 970                 return method(include=include, exclude=exclude)
    971             return None
    972         else:

~/miniconda3/envs/py3_std_maps/lib/python3.8/site-packages/sklearn/base.py in _repr_mimebundle_(self, **kwargs)
    461     def _repr_mimebundle_(self, **kwargs):
    462         """Mime bundle used by jupyter kernels to display estimator"""
--> 463         output = {"text/plain": repr(self)}
    464         if get_config()["display"] == 'diagram':
    465             output["text/html"] = estimator_html_repr(self)

~/miniconda3/envs/py3_std_maps/lib/python3.8/site-packages/sklearn/base.py in __repr__(self, N_CHAR_MAX)
    273 
    274         # use ellipsis for sequences with a lot of elements
--> 275         pp = _EstimatorPrettyPrinter(
    276             compact=True, indent=1, indent_at_name=True,
    277             n_max_elements_to_show=N_MAX_ELEMENTS_TO_SHOW)

~/miniconda3/envs/py3_std_maps/lib/python3.8/site-packages/sklearn/utils/_pprint.py in __init__(self, indent, width, depth, stream, compact, indent_at_name, n_max_elements_to_show)
    162         if self._indent_at_name:
    163             self._indent_per_level = 1  # ignore indent param
--> 164         self._changed_only = get_config()['print_changed_only']
    165         # Max number of elements in a list, dict, tuple until we start using
    166         # ellipsis. This also affects the number of arguments of an estimators

KeyError: 'print_changed_only'
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
~/miniconda3/envs/py3_std_maps/lib/python3.8/site-packages/IPython/core/formatters.py in __call__(self, obj)
    700                 type_pprinters=self.type_printers,
    701                 deferred_pprinters=self.deferred_printers)
--> 702             printer.pretty(obj)
    703             printer.flush()
    704             return stream.getvalue()

~/miniconda3/envs/py3_std_maps/lib/python3.8/site-packages/IPython/lib/pretty.py in pretty(self, obj)
    392                         if cls is not object \
    393                                 and callable(cls.__dict__.get('__repr__')):
--> 394                             return _repr_pprint(obj, self, cycle)
    395 
    396             return _default_pprint(obj, self, cycle)

~/miniconda3/envs/py3_std_maps/lib/python3.8/site-packages/IPython/lib/pretty.py in _repr_pprint(obj, p, cycle)
    698     """A pprint that just redirects to the normal repr function."""
    699     # Find newlines and replace them with p.break_()
--> 700     output = repr(obj)
    701     lines = output.splitlines()
    702     with p.group():

~/miniconda3/envs/py3_std_maps/lib/python3.8/site-packages/sklearn/base.py in __repr__(self, N_CHAR_MAX)
    273 
    274         # use ellipsis for sequences with a lot of elements
--> 275         pp = _EstimatorPrettyPrinter(
    276             compact=True, indent=1, indent_at_name=True,
    277             n_max_elements_to_show=N_MAX_ELEMENTS_TO_SHOW)

~/miniconda3/envs/py3_std_maps/lib/python3.8/site-packages/sklearn/utils/_pprint.py in __init__(self, indent, width, depth, stream, compact, indent_at_name, n_max_elements_to_show)
    162         if self._indent_at_name:
    163             self._indent_per_level = 1  # ignore indent param
--> 164         self._changed_only = get_config()['print_changed_only']
    165         # Max number of elements in a list, dict, tuple until we start using
    166         # ellipsis. This also affects the number of arguments of an estimators

KeyError: 'print_changed_only'

The error occurs at the skpca.fit(X) step. I have reinstalled the sklearn and scikit packages. I have used PCA in sklearn before, and this never happened.
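Every frame in the traceback sits in IPython's display formatters or in sklearn's `__repr__`, not in the PCA computation. This suggests the exception is raised while the notebook tries to display the estimator that `fit` returns. A minimal sketch of how to check this, assuming random data as a stand-in for the real matrix (`X_demo`, `skpca_demo` and `fitted` are hypothetical names, not from the question):

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical stand-in for the standardized satellite matrix X
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((50, 5))

skpca_demo = PCA()

# fit() returns the estimator itself. Assigning the result means the
# notebook has nothing to display, so repr() is never called here.
fitted = skpca_demo.fit(X_demo)

# The fitted attributes can be read without printing the estimator
print(fitted is skpca_demo)
print(skpca_demo.explained_variance_ratio_.shape)
```

If this runs cleanly in the affected environment, the fit itself is fine and only the pretty-printing of the estimator is broken. That hides the symptom but does not repair the installation.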

【Comments on the question】:

Tags: python arrays numpy scikit-learn

【Solution 1】:

I don't know the answer, but this may be a bug in sklearn. Try:

import sklearn
sklearn.get_config()

In my case, it returns a dictionary:

{'assume_finite': False, 'working_memory': 1024, 'print_changed_only': False}

The error indicates that the key print_changed_only is missing from your configuration. My sklearn version is '0.21.2' on Python 3.6. Perhaps downgrading sklearn would help?
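The check above can be made explicit with a short diagnostic that prints the installed version and tests for the missing key. This is only a sketch: the variable names are illustrative, and the idea that a missing key points to a mixed or partially upgraded installation is an assumption that this thread does not confirm.

```python
import sklearn

# Report which scikit-learn version Python actually imports,
# and from where, in case several copies are installed
print("scikit-learn version:", sklearn.__version__)
print("loaded from:", sklearn.__file__)

# The pretty-printer in the traceback looks up this key
config = sklearn.get_config()
expected_key = "print_changed_only"
has_key = expected_key in config

print("config keys:", sorted(config))
if has_key:
    print(f"'{expected_key}' is present; repr() of estimators should work")
else:
    # Assumption: a missing key hints at inconsistent package files
    print(f"'{expected_key}' is missing; a clean reinstall may be worth trying")
```

Running this in the same environment as the notebook kernel matters, because the kernel may import a different copy of scikit-learn than the shell does.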

【Comments】: