LabelEncoder: How to keep a dictionary that shows original and converted variables

【Question title】: LabelEncoder: How to keep a dictionary that shows original and converted variables
【Posted】: 2020-10-21 02:25:24
【Question】:

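When using LabelEncoder to encode categorical variables as numbers,

how can I keep a dictionary that tracks the conversion?

That is, a dictionary in which I can see which value became what: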

{'A':1,'B':2,'C':3}

【Comments】:

Tags: python dictionary label categorical-data

【Solution 1】:

I built a dictionary from the classes_ attribute:

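from sklearn import preprocessing

le = preprocessing.LabelEncoder()
ids = le.fit_transform(labels)  # labels: the categorical values to encode
# classes_ is sorted, so each class's position is its integer code
mapping = dict(zip(le.classes_, range(len(le.classes_))))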

Test:

all(mapping[x] == i for x, i in zip(le.inverse_transform(ids), ids))

This should return True.

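This works because fit_transform uses numpy.unique to compute the label encoding and the classes_ attribute in a single call. Simplified, the implementation is roughly: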

def fit_transform(self, y):
    self.classes_, y = np.unique(y, return_inverse=True)
    return y
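
To see why the position in classes_ equals the code, here is a minimal sketch that calls numpy.unique directly. The sample labels are made up for illustration; only numpy.unique and its return_inverse parameter come from the answer above.

```python
import numpy as np

# Sample labels, made up for illustration.
labels = ["B", "A", "C", "A", "B"]

# np.unique returns the sorted unique values; with return_inverse=True it
# also returns, for each input element, its index into that sorted array.
classes, codes = np.unique(labels, return_inverse=True)

# Position in the sorted array == integer code, so enumerate gives the mapping.
mapping = {str(label): int(code) for code, label in enumerate(classes)}

print(classes.tolist())  # ['A', 'B', 'C']
print(codes.tolist())    # [1, 0, 2, 0, 1]
print(mapping)           # {'A': 0, 'B': 1, 'C': 2}
```

Note that the codes start at 0, not 1, so the result is {'A': 0, 'B': 1, 'C': 2} rather than the 1-based dictionary sketched in the question.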

【Comments】:

【Solution 2】:

You can do it in one line:

from sklearn import preprocessing

le = preprocessing.LabelEncoder()
my_encodings = {l: i for i, l in enumerate(le.fit(data["target"]).classes_)}
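
If the reverse lookup (code back to label) is also needed, the same idea extends naturally. The following is a sketch with made-up sample data; the names label_to_code and code_to_label are hypothetical and not part of scikit-learn.

```python
from sklearn.preprocessing import LabelEncoder

# Sample data, made up for illustration.
labels = ["C", "A", "B", "A"]

le = LabelEncoder().fit(labels)

# Forward mapping: original label -> integer code (codes start at 0).
label_to_code = {str(label): code for code, label in enumerate(le.classes_)}
# Reverse mapping: integer code -> original label.
code_to_label = {code: label for label, code in label_to_code.items()}

print(label_to_code)  # {'A': 0, 'B': 1, 'C': 2}
print(code_to_label)  # {0: 'A', 1: 'B', 2: 'C'}
```

For one-off lookups, le.inverse_transform does the same job without a dictionary; the dictionary is mainly useful for inspection or for saving alongside the encoded data.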

【Comments】: