How to use to_categorical to convert [[4,7,10],[10,20,30]] to one-hot encoding

【Question title】: How to use to_categorical to convert [[4,7,10],[10,20,30]] to one-hot encoding
【Posted】: 2018-07-10 17:31:07
【Question】:

I am working on an LSTM.

The output is categorical.

It has the format [[t11,t12,t13],[t21,t22,t23]].

I can do this for a 1D array, but I am finding it hard to do for a 2D array.

from keras.utils import to_categorical
print(to_categorical([[9,10,11],[10,11,12]]))

Output:

[[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.  0.  0.]
[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.  0.]
[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.]
[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.  0.]
[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.]
[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.]]

There are two separate inputs, each with 3 time steps, but in the output they are all merged together.

I need it as:

[[[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.  0.  0.]
[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.  0.]
[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.]],

[[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.  0.]
[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.]
[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.]]]

【Question comments】:

What exactly is the problem? In theory you should just call to_categorical and that's it.
Your reshape should use 13, not 12.
Got it, thanks Daniel.

Tags: machine-learning keras deep-learning

【Solution 1】:

If the shape is unusual, try flattening the data to 1D, applying the function, and then reshaping the result back:

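from keras.utils import to_categorical

# myData: NumPy integer array of class indices, any shape, e.g. (2, 3)
originalShape = myData.shape
totalFeatures = myData.max() + 1  # number of classes, e.g. max index 12 -> 13

categorical = myData.reshape((-1,))        # flatten to 1D
categorical = to_categorical(categorical)  # shape (N, totalFeatures)
categorical = categorical.reshape(originalShape + (totalFeatures,))  # restore original shape plus class axis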
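The flatten, encode, reshape steps above can be sketched as a self-contained script. This is only an illustration under stated assumptions: the helper name one_hot_keep_shape is hypothetical, and np.eye indexing is swapped in for to_categorical so the script runs with NumPy alone.

```python
import numpy as np

def one_hot_keep_shape(labels, num_classes=None):
    """One-hot encode an integer array of any shape, appending a class axis.

    Hypothetical helper; uses np.eye indexing as a stand-in for to_categorical.
    """
    labels = np.asarray(labels, dtype=int)
    if num_classes is None:
        # Highest index 12 means 13 classes (0..12).
        num_classes = int(labels.max()) + 1
    flat = labels.reshape(-1)                               # step 1: flatten to 1D
    encoded = np.eye(num_classes, dtype=np.float32)[flat]   # step 2: one row per label
    return encoded.reshape(labels.shape + (num_classes,))   # step 3: restore the shape

result = one_hot_keep_shape([[9, 10, 11], [10, 11, 12]])
print(result.shape)
```

For the labels from the question, the result has two sequences of three time steps each, with 13 classes on the last axis.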

【Discussion】:

【Solution 2】:

I realized I can get what I want by reshaping the to_categorical output (a below):

print(a.reshape(2,3,13))



[[[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.  0.  0.]
[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.  0.]
[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.]]

[[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.  0.]
[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.]
[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  1.]]]

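【Discussion】:

【Solution 3】:

Your reshape raised an error because the highest class index is 12, so there are 13 classes (0, 1, ..., 12), not 12. To avoid this kind of mistake in the future, you can let NumPy infer that dimension by calling one_hot.reshape(sparse.shape + (-1,)), where one_hot is the one-hot encoded array produced by to_categorical() and sparse is the original array of class indices. Note that sparse.shape is a tuple, so the extra axis must be appended as the tuple (-1,); appending the list [-1] raises a TypeError.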
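A minimal sketch of the inferred-dimension reshape, assuming sparse is a NumPy array of class indices. Here one_hot is built with np.eye as a stand-in for the flattened to_categorical() output shown in the question, so the example needs only NumPy.

```python
import numpy as np

# Original labels: 2 sequences, 3 time steps each.
sparse = np.array([[9, 10, 11], [10, 11, 12]])
num_classes = int(sparse.max()) + 1

# Flattened one-hot rows, shape (6, 13); stand-in for the to_categorical() result.
one_hot = np.eye(num_classes)[sparse.ravel()]

# sparse.shape is a tuple, so the inferred axis is appended as a tuple too.
restored = one_hot.reshape(sparse.shape + (-1,))
print(restored.shape)
```

Because the last axis is given as -1, NumPy works out the class count from the total number of elements, so a wrong hard-coded value such as 12 cannot slip in.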

【Discussion】: