How can I create a scikit-learn tree by hand? Answers

【Question title】: How can I create a scikit-learn tree by hand?
【Posted】: 2020-04-13 03:41:29
【Question description】:

To test some code, I would like to be able to create an sklearn.tree._tree.Tree by hand, rather than by fitting it to data.

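To be concrete, suppose I want a tree that classifies points on the real line into the intervals (-infinity, 5], (5, 6], or (6, infinity). I want the tree to have the shape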

----0----
|        |
|     ---2---
|     |      |
1     3      4

where node 0 splits the real line at 5 and node 2 splits it at 6.

How can I do this? I see that the tree has a __setstate__ method, and judging from the output of __getstate__, it looks like I need something like

import numpy as np

# Tentative state dict, modelled on the output of Tree.__getstate__()
state = {
        'n_features_': 1,
        'max_depth': 2,
        'node_count': 5,
        # One tuple per node, in node-index order (0..4)
        'nodes': np.array([(1 ,   2,  0,  5., 0.375, 3, 3.),
                           (-1,  -1,  0, -2., 0.   , 1, 1.),
                           (3 ,   4,  0,  6., 0.   , 2, 2.),
                           (-1,  -1,  0, -2., 0.   , 1, 1.),
                           (-1,  -1,  0, -2., 0.   , 1, 1.),
                           ],
                          dtype=[('left_child', '<i8'), ('right_child', '<i8'), ('feature', '<i8'),('threshold', '<f8'), ('impurity', '<f8'), ('n_node_samples', '<i8'), ('weighted_n_node_samples', '<f8')]),
}

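But I don't really understand what these parameters mean, and in any case I don't know how to initialize a tree with this state in the first place.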

【Comments】:

Tags: python scikit-learn decision-tree

【Solution 1】:

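After hours of trying to modify the nodes by hand, I found a solution. You are indeed right: by using __setstate__ you can customize the tree. The 'nodes' key must be as follows: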

A NumPy array of tuples.
Each tuple must look like this: (left_child[i], right_child[i], feature[i], threshold[i], impurity[i], n_node_samples[i], weighted_n_node_samples[i])

-1 (for the left/right child) and -2 (for the feature) denote a leaf.
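These leaf markers can be checked on any fitted tree. Here is a minimal sketch, assuming a throwaway two-sample dataset that I made up purely for illustration; it reads the public `children_left`, `children_right` and `feature` arrays of `clf.tree_`:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Toy data (made up for illustration): one feature, two samples, two classes
X = np.array([[0.0], [1.0]])
y = np.array([0, 1])
clf = DecisionTreeClassifier(random_state=0).fit(X, y)
t = clf.tree_

# A node is a leaf when it has no left child, i.e. children_left == -1
is_leaf = t.children_left == -1

print("children_left :", t.children_left)
print("children_right:", t.children_right)
print("feature       :", t.feature)
print("leaf mask     :", is_leaf)
```

Leaves should show -1 for both children and -2 for the feature, while the split node carries a real feature index.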

Once a classifier has been trained, the state has one more key: 'values'.
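Putting it together for the three-interval tree from the question, something like the following sketch should work. It relies on the private `sklearn.tree._tree.Tree` class, whose constructor arguments (`n_features`, `n_classes`, `n_outputs`) and node layout are not a stable API and have changed between scikit-learn versions, so treat it as an assumption-laden example rather than the official way. To avoid hard-coding the node dtype, it borrows the dtype from a throwaway fitted tree; the `donor` name, the class counts in `values`, and the zero impurities are my own illustrative choices.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.tree._tree import Tree  # private API, may change between versions

# Fit a throwaway tree only to borrow the node dtype of the installed version
donor = DecisionTreeClassifier().fit(np.array([[0.0], [1.0]]), np.array([0, 1]))
node_dtype = donor.tree_.__getstate__()["nodes"].dtype

n_classes = 3
nodes = np.zeros(5, dtype=node_dtype)  # any extra fields stay at 0
nodes["left_child"] = [1, -1, 3, -1, -1]    # -1 marks a leaf
nodes["right_child"] = [2, -1, 4, -1, -1]
nodes["feature"] = [0, -2, 0, -2, -2]       # -2 marks a leaf
nodes["threshold"] = [5.0, -2.0, 6.0, -2.0, -2.0]
nodes["n_node_samples"] = [3, 1, 2, 1, 1]
nodes["weighted_n_node_samples"] = [3.0, 1.0, 2.0, 1.0, 1.0]
# impurity is left at 0 here; prediction does not consult it

# values has shape (node_count, n_outputs, n_classes): class weights per node
values = np.zeros((5, 1, n_classes), dtype=np.float64)
values[0, 0] = [1, 1, 1]
values[1, 0] = [1, 0, 0]  # leaf for (-inf, 5]
values[2, 0] = [0, 1, 1]
values[3, 0] = [0, 1, 0]  # leaf for (5, 6]
values[4, 0] = [0, 0, 1]  # leaf for (6, inf)

tree = Tree(1, np.array([n_classes], dtype=np.intp), 1)
tree.__setstate__(
    {"max_depth": 2, "node_count": 5, "nodes": nodes, "values": values}
)

# Tree.predict expects float32 input and returns the per-class values of the leaf
X_test = np.array([[4.0], [5.5], [7.0]], dtype=np.float32)
pred = tree.predict(X_test).argmax(axis=1)
print(pred)
```

Samples with a feature value less than or equal to the threshold go to the left child, so 4.0, 5.5 and 7.0 should land in the leaves for classes 0, 1 and 2 respectively.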

【Comments】: