Normalize numpy ndarray data: answers

【Question title】: Normalize numpy ndarray data
【Posted】: 2017-11-03 14:53:11
【Question description】:

My data is a numpy ndarray with shape (2,3,4), shown below. I tried to scale each column to the 0-1 range using sklearn's normalize.

import numpy as np
from sklearn.preprocessing import normalize

x = np.array([[[1, 2, 3, 4],
               [2, 2, 3, 4],
               [3, 2, 3, 4]],
              [[4, 2, 3, 4],
               [5, 2, 3, 4],
               [6, 2, 3, 4]]])

x.shape  # ==> (2, 3, 4)

x = normalize(x, norm='max', axis=0)  # raises the ValueError below

However, I got this error:

ValueError: Found array with dim 3. the normalize function expected <= 2.

How can I solve this?

Thanks.

【Question comments】:

Tags: numpy scikit-learn normalization

【Solution 1】:

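It seems scikit-learn expects ndarrays with at most two dims. So, to solve it, reshape the array to 2D and feed it to normalize, which returns a 2D array that can be reshaped back to the original shape: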

from sklearn.preprocessing import normalize  

normalize(x.reshape(x.shape[0],-1), norm='max', axis=0).reshape(x.shape)
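The one-liner above can be unpacked into its three steps to make the intermediate shapes visible. This is only a sketch; the names flat, flat_scaled and scaled are introduced here for illustration and are not part of the original answer.

```python
import numpy as np
from sklearn.preprocessing import normalize

x = np.array([[[1, 2, 3, 4], [2, 2, 3, 4], [3, 2, 3, 4]],
              [[4, 2, 3, 4], [5, 2, 3, 4], [6, 2, 3, 4]]])

# Step 1: keep the first axis and flatten the rest -> shape (2, 12).
flat = x.reshape(x.shape[0], -1)

# Step 2: with axis=0, each of the 12 columns is divided by its
# largest absolute value across the 2 rows.
flat_scaled = normalize(flat, norm='max', axis=0)

# Step 3: restore the original (2, 3, 4) shape.
scaled = flat_scaled.reshape(x.shape)

print(flat.shape, flat_scaled.shape, scaled.shape)
```

Note that with axis=0 every position (j, k) is scaled independently across the first axis, so for example the pair x[0, 0, 0] = 1 and x[1, 0, 0] = 4 becomes 0.25 and 1.0.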

Alternatively, it is simpler with NumPy, which works well with generic ndarrays:

x/np.linalg.norm(x, ord=np.inf, axis=0, keepdims=True)
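As a rough illustration of why this works: with ord=np.inf and a single axis, np.linalg.norm returns the largest absolute value along that axis, and keepdims=True keeps a length-1 axis so the division broadcasts against x. The variable names norms and scaled below are assumptions made for this sketch.

```python
import numpy as np

x = np.array([[[1, 2, 3, 4], [2, 2, 3, 4], [3, 2, 3, 4]],
              [[4, 2, 3, 4], [5, 2, 3, 4], [6, 2, 3, 4]]])

# Infinity norm along axis 0: the max of |x| over the first axis.
# keepdims=True gives shape (1, 3, 4) instead of (3, 4).
norms = np.linalg.norm(x, ord=np.inf, axis=0, keepdims=True)

# Same values as taking the max of the absolute values directly.
same_as_max = np.allclose(norms, np.abs(x).max(axis=0, keepdims=True))

# Broadcasting divides every (2, 3, 4) entry by its matching norm.
scaled = x / norms

print(norms.shape, same_as_max)
```

One caveat with the plain division: if a slice is entirely zero, its norm is zero and the result contains nan, so such inputs may need separate handling.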

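【Comments】:

Thank you very much!! However, the code above is not applied column by column but to the whole data. Which option should I use?
@ChrisJoo Not sure what you mean by column to column. Perhaps you mean using it along axis=1 rather than axis=0?
For example, the first column [ [1, 2, 3], [4, 5, 6] ] should become [ [ 0.1667, 0.3333, 0.5000], [ 0.6667 , 0.8333, 1.0000 ] ] and the second column ( 2, 2, 2, 2, 2, 2) should become [ 1, 1, 1, 1, 1, 1 ].
That doesn't look like an l2 norm but rather a max norm, also along the last axis. So use x/np.linalg.norm(x, ord=np.inf,axis=-1,keepdims=1), or norm='max' in the sklearn version.
I have solved the problem. You helped me a lot. Thanks!! normalize(x.reshape(x.shape[1],-1), norm='max', axis=0).reshape(x.shape)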
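For reference, the expected values quoted in the comments (first column scaled by 6, second column all ones) correspond to dividing each index of the last axis by its maximum over the first two axes. The sketch below is one possible way to get those numbers with NumPy alone; it is not the code posted in the thread, and col_max and scaled are names chosen here for illustration.

```python
import numpy as np

x = np.array([[[1, 2, 3, 4], [2, 2, 3, 4], [3, 2, 3, 4]],
              [[4, 2, 3, 4], [5, 2, 3, 4], [6, 2, 3, 4]]])

# One maximum per "column" (index along the last axis), taken over
# both leading axes -> shape (1, 1, 4), here [6, 2, 3, 4].
col_max = np.abs(x).max(axis=(0, 1), keepdims=True)

# Broadcast the division over the whole (2, 3, 4) array.
scaled = x / col_max

# First column: 1..6 divided by 6; second column: all ones.
print(np.round(scaled[:, :, 0], 4))
print(scaled[:, :, 1])
```

The same grouping can be expressed with sklearn by reshaping to (-1, x.shape[-1]) before calling normalize with norm='max', axis=0 and reshaping back. For this particular array, reshaping with x.shape[1] as the first dimension appears to group the values differently, so it may be worth checking that output against the expected numbers.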