New version of MinMaxScaler does not accept a range of max and min values anymore

【Question title】: New version of MinMaxScaler does not accept a range of max and min values anymore
【Posted】: 2020-02-23 01:15:57
【Question description】:

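In earlier versions of sklearn's MinMaxScaler, one could specify the minimum and maximum values the scaler would use to normalize the data. In other words, the following was possible: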

from sklearn import preprocessing
import numpy as np
x_data = np.array([[66,74,89], [1,44,53], [85,86,33], [30,23,80]])
scaler = preprocessing.MinMaxScaler()
scaler.fit ([-90, 90])
b = scaler.transform(x_data)

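This would scale the array above to the range (0, 1), with the minimum possible value of -90 becoming 0, the maximum possible value of 90 becoming 1, and all values in between scaled accordingly. With sklearn version 0.21, this raises an error: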

ValueError: Expected 2D array, got 1D array instead:
array=[-90.  90.].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

I changed scaler.fit ([-90, 90]) to scaler.fit ([[-90, 90]]), but then I got:

ValueError: operands could not be broadcast together with shapes (4,3) (2,) (4,3)

I know I can do scaler.fit (x_data), but that gives the following result after the transform:

[[0.77380952 0.80952381 1.        ]
 [0.         0.33333333 0.35714286]
 [1.         1.         0.        ]
 [0.3452381  0.         0.83928571]]

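My problem is twofold: 1) The numbers do not seem right. They should be scaled between 0 and 1, but I get many 0s and many 1s for values that should be higher and lower, respectively. 2) What if I want to scale every future array to the (0, 1) range based on a fixed range (e.g. (-90, 90))? That was a handy feature, but now I have to fit the scaler on a specific array. Worse, every scaling gives a different result, because I have to refit on each future array.

Am I missing something here? Is there a way to keep this nice feature? If not, how do I make sure my data is scaled correctly and consistently every time?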

【Question discussion】:

Tags: python numpy machine-learning scikit-learn

【Solution 1】:

It seems the problem is not the scikit-learn package version, but the shape of the input data passed to the fit() method of the MinMaxScaler object:

import numpy as np
import sklearn
from sklearn.preprocessing import MinMaxScaler

print('scikit-learn package version: {}'.format(sklearn.__version__))
# scikit-learn package version: 0.21.3

scaler = MinMaxScaler()
x_sample = [-90, 90]
scaler.fit(np.array(x_sample)[:, np.newaxis]) # reshape data to satisfy fit() method requirements
x_data = np.array([[66,74,89], [1,44,53], [85,86,33], [30,23,80]])

print(scaler.transform(x_data))

# [[0.86666667 0.91111111 0.99444444]
# [0.50555556 0.74444444 0.79444444]
# [0.97222222 0.97777778 0.68333333]
# [0.66666667 0.62777778 0.94444444]]
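
A caveat on the snippet above: it fits on a single feature and then transforms a three-column array, which works in 0.21.3 because the learned scale broadcasts across columns. As far as I know, more recent scikit-learn releases check the feature count in transform() and reject that mismatch. A minimal sketch that should not depend on this behaviour, assuming every entry of x_data shares the same (-90, 90) range, is to flatten the data to one column, transform it, and restore the shape:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

x_data = np.array([[66, 74, 89], [1, 44, 53], [85, 86, 33], [30, 23, 80]])

# Fit on a single feature whose min and max are the fixed bounds.
scaler = MinMaxScaler()
scaler.fit(np.array([[-90.0], [90.0]]))  # shape (2, 1): two samples, one feature

# Present the data as one long column so the feature count matches the fit,
# then put the scaled values back into the original (4, 3) layout.
flat = x_data.reshape(-1, 1)
scaled = scaler.transform(flat).reshape(x_data.shape)

print(scaled)
```

This should reproduce the same values as the output listed above, since each entry is still mapped with the -90 and 90 bounds.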

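To learn about the input data requirements of popular preprocessors such as StandardScaler and MinMaxScaler, you can check my answer to another question about the input of StandardScaler.fit().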
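
On the first part of the question (the many 0s and 1s): MinMaxScaler treats each column as a separate feature, so fit(x_data) learns one minimum and one maximum per column, and each column's own extremes map to exactly 0 and 1. The sketch below shows this through the fitted data_min_ and data_max_ attributes, then fits on a two-row array of bounds instead. The bounds array is my own illustration, assuming all three columns share the (-90, 90) range:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

x_data = np.array([[66, 74, 89], [1, 44, 53], [85, 86, 33], [30, 23, 80]])

# Fitting on the data itself learns one (min, max) pair per column.
per_column = MinMaxScaler().fit(x_data)
print(per_column.data_min_)  # column minima: 1, 23, 33
print(per_column.data_max_)  # column maxima: 85, 86, 89

# Fitting on a two-row array of bounds pins every column to [-90, 90],
# so the result no longer depends on the values in any particular array.
bounds = np.array([[-90, -90, -90], [90, 90, 90]])
fixed = MinMaxScaler().fit(bounds)
scaled = fixed.transform(x_data)
print(scaled)
```

Because the fitted scaler has three features here, it can be reused with transform() on any later array that also has three columns, without refitting.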

【Discussion】:

Thank you, it works great! I also found that MinMaxScaler effectively scales all values as follows: x_data = (x_data+abs(min)) / (2*abs(min)), where min is -90.
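
The formula in this comment is the special case of the usual min-max formula, (x - min) / (max - min), for a range that is symmetric around zero (max = -min). A small NumPy-only sketch comparing the two follows; min_max_scale is a hypothetical helper name, not part of scikit-learn:

```python
import numpy as np

def min_max_scale(x, lo, hi):
    """Map values from the fixed range [lo, hi] onto [0, 1]."""
    return (np.asarray(x, dtype=float) - lo) / (hi - lo)

x_data = np.array([[66, 74, 89], [1, 44, 53], [85, 86, 33], [30, 23, 80]])

# General form with explicit bounds.
general = min_max_scale(x_data, -90, 90)

# Shortcut from the comment; valid only because the range is symmetric around 0.
lo = -90
shortcut = (x_data + abs(lo)) / (2 * abs(lo))

print(np.allclose(general, shortcut))  # True
```

For a range that is not symmetric, such as (0, 90), the shortcut no longer applies and the general form is the one to use.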