How to use cross_val_score() Sklearn?

[Question title]: How to use cross_val_score() Sklearn?
[Posted]: 2017-07-14 15:01:09
[Question description]:

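I am trying to do k-fold cross-validation in Python with Sklearn. I have now worked through two tutorials, but my code still fails to run the validation.

Every time I try to do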

cross_val_score(dt, x, y, cv=5)

I get the error:

Traceback (most recent call last):
File "C:/Users/djsg38/Documents/CS6001-SpatialTemporal/HW2/main.py", line 573, in <module>
  scores = cross_val_score(dt, x, y, cv=5)
File "C:\Python27\lib\site-packages\sklearn\model_selection\_validation.py", line 128, in cross_val_score
  X, y, groups = indexable(X, y, groups)
File "C:\Python27\lib\site-packages\sklearn\utils\validation.py", line 206, in indexable
  check_consistent_length(*result)
File "C:\Python27\lib\site-packages\sklearn\utils\validation.py", line 177, in check_consistent_length
  lengths = [_num_samples(X) for X in arrays if X is not None]
File "C:\Python27\lib\site-packages\sklearn\utils\validation.py", line 116, in _num_samples
  'estimator %s' % x)
TypeError: Expected sequence or array-like, got estimator     Its  official  US  President  Barack  Obama  wants  lawmakers  weigh  \
0     1         4  12          3       2     12      4          4      2   
1     0         0   1          0       0      0      0          0      0   
2     1         0   4          0       0      0      0          0      0   
3     0         0   0          0       0      0      4          0      0   
4     0         3  10          0       0      1      0          0      0   
5     0         0   0          0       0      0      0          0      0   
6     0         0   0          4       1      7      0          0      0   
7     3         0   0          0       0      0      0          0      0   
8     1         0   4          0       0      0      0          0      0   
9     0         0   0          0       0      0      0          0      0   
10    0         1   6          3       0      3      0          0      0   
11    0         0   0          1       0      0      0          0      0   
12    0         2   1          0       0      0      0          0      0   
13    0         0   0          0       0      0      0          0      0   
14    0         0   0          0       0      0      0          0      0   
15    0         0   0          0       0      0      0          0      0   
16    0         0   0          0       0      0      0          0      0   
17    0         0   5          4       1      9      1          0      0   
18    0         0   0          0       0      0      0          0      0   
19    0         0   0          0       0      0      0          0      0   
20    0         0   0          0       0      0      0          0      0   
21    0         0   3          2       1      1      0          0      1   
22    0         0   0          0       0      0      0          0      0   
23    0         0   1          0       0      0      0          0      0   
24    1         0   0          0       0      0      0          0      0   
25    0         0   0          1       0      0      0          0      0   
26    0         0   0          0       0      0      0          0      0   
27    0         0   1          0       0      0      0          0      0   
28    0         0   0          0       0      0      0          0      0   
29    0         1   0          0       0      0      0          0      0   
..  ...       ...  ..        ...     ...    ...    ...        ...    ...   
70    0         0   0          0       0      0      0          0      0   
71    0         0   0          2       0      5      0          0      0   
72    5         0   0          0       0      0      0          0      0   
73    0         0   0          0       0      0      0          0      0   
74    0         0   1          0       0      0      0          0      0   
75    1         0   1          0       0      0      1          0      0   
76    2         0   0          0       0      0      0          0      0   
77    1         0   0          0       0      0      0          0      0   
78    0         0   0          0       0      0      0          0      0   
79    1         0   0          0       0      0      0          0      0   
80    0         0   0          0       0      0      0          0      0   
81    0         0   1          0       0      0      0          0      0   
82    0         0   1          0       0      0      0          0      0   
83    0         0   0          0       0      0      0          1      0   
84    0         0   2          4       1      3      1          0      0   
85    0         0   0          1       0      0      0          0      0   
86    0         0   1          0       0      0      0          0      0   
87    0         0   0          0       0      0      0          0      0   
88    0         0   0          0       0      0      0          0      0   
89    0         0   0          0       0      0      0          0      0   
90    0         0   0          0       0      0      0          0      0   
91    0         0   2          1       0      0      0          0      0   
92    0         0   0          0       0      0      0          0      0   
93    0         0   0          0       0      0      0          0      0   
94    1         0   0          0       0      0      0          0      0   
95    0         2   1          0       0      0      0          0      0   
96    0         0   0          0       0      0      0          0      0   
97    0         0   4          1       0      0      0          0      0   
98    0         0  11          1       0      0      0          0      0   
99    0         0   0          0       0      0      0          0      0   

    whether     ...      Heh  heh  funny  disassociate  personWere  \
0         4     ...        0    0      0             0           0   
1         0     ...        0    0      0             0           0   
2         0     ...        0    0      0             0           0   
3         0     ...        0    0      0             0           0   
4         0     ...        0    0      0             0           0   
5         0     ...        0    0      0             0           0   
6         2     ...        0    0      0             0           0   
7         0     ...        0    0      0             0           0   
8         0     ...        0    0      0             0           0   
9         0     ...        0    0      0             0           0   
10        0     ...        0    0      0             0           0   
11        1     ...        0    0      0             0           0   
12        0     ...        0    0      0             0           0   
13        1     ...        0    0      0             0           0   
14        0     ...        0    0      0             0           0   
15        1     ...        0    0      0             0           0   
16        0     ...        0    0      0             0           0   
17        1     ...        0    0      0             0           0   
18        0     ...        0    0      0             0           0   
19        0     ...        0    0      0             0           0   
20        0     ...        0    0      0             0           0   
21        8     ...        0    0      0             0           0   
22        0     ...        0    0      0             0           0   
23        0     ...        0    0      0             0           0   
24        0     ...        0    0      0             0           0   
25        0     ...        0    0      0             0           0   
26        1     ...        0    0      0             0           0   
27        0     ...        0    0      0             0           0   
28        0     ...        0    0      0             0           0   
29        0     ...        0    0      0             0           0   
..      ...     ...      ...  ...    ...           ...         ...   
70        0     ...        0    0      0             0           0   
71        1     ...        0    0      0             0           0   
72        0     ...        0    0      0             0           0   
73        0     ...        0    0      0             0           0   
74        0     ...        0    0      0             0           0   
75        0     ...        0    0      0             0           0   
77        0     ...        0    0      0             0           0   
78        0     ...        0    0      0             0           0   
79        1     ...        0    0      0             0           0   
80        0     ...        0    0      0             0           0   
81        3     ...        0    0      0             0           0   
82        0     ...        0    0      0             0           0   
83        0     ...        0    0      0             0           0   
84        0     ...        0    0      0             0           0   
85        0     ...        0    0      0             0           0   
86        0     ...        0    0      0             0           0   
87        0     ...        0    0      0             0           0   
88        0     ...        0    0      0             0           0   
89        1     ...        0    0      0             0           0   
90        0     ...        0    0      0             0           0   
91        0     ...        0    0      0             0           0   
92        0     ...        0    0      0             0           0   
93        0     ...        0    0      0             0           0   
94        1     ...        0    0      0             0           0   
95        0     ...        0    0      0             0           0   
96        0     ...        0    0      0             0           0   
97        0     ...        0    0      0             0           0   
98        1     ...        0    0      0             0           0   
99        0     ...        1    1      1             1           1   

   therehighlightAs  indepth  umpireshighlightThe  headhighlightTwo  \
0                  0        0                    0                 0   
1                  0        0                    0                 0   
2                  0        0                    0                 0   
3                  0        0                    0                 0   
4                  0        0                    0                 0   
5                  0        0                    0                 0   
6                  0        0                    0                 0   
7                  0        0                    0                 0   
8                  0        0                    0                 0   
9                  0        0                    0                 0   
10                 0        0                    0                 0   
11                 0        0                    0                 0   
12                 0        0                    0                 0   
13                 0        0                    0                 0   
14                 0        0                    0                 0   
15                 0        0                    0                 0   
16                 0        0                    0                 0   
17                 0        0                    0                 0   
18                 0        0                    0                 0   
19                 0        0                    0                 0   
20                 0        0                    0                 0   
21                 0        0                    0                 0   
22                 0        0                    0                 0   
23                 0        0                    0                 0   
24                 0        0                    0                 0   
25                 0        0                    0                 0   
26                 0        0                    0                 0   
27                 0        0                    0                 0   
28                 0        0                    0                 0   
29                 0        0                    0                 0   
..               ...      ...                  ...               ...   
70                 0        0                    0                 0   
71                 0        0                    0                 0   
72                 0        0                    0                 0   
73                 0        0                    0                 0   
74                 0        0                    0                 0   
75                 0        0                    0                 0   
76                 0        0                    0                 0   
77                 0        0                    0                 0   
78                 0        0                    0                 0   
79                 0        0                    0                 0   
80                 0        0                    0                 0   
81                 0        0                    0                 0   
82                 0        0                    0                 0   
83                 0        0                    0                 0   
84                 0        0                    0                 0   
85                 0        0                    0                 0   
86                 0        0                    0                 0   
87                 0        0                    0                 0   
88                 0        0                    0                 0   
89                 0        0                    0                 0   
90                 0        0                    0                 0   
91                 0        0                    0                 0   
92                 0        0                    0                 0   
93                 0        0                    0                 0   
94                 0        0                    0                 0   
95                 0        0                    0                 0   
96                 0        0                    0                 0   
97                 0        0                    0                 0   
98                 0        0                    0                 0   
99                 1        1                    1                 1   

    disrespect  
0            0  
1            0  
2            0  
3            0  
4            0  
5            0  
6            0  
7            0  
8            0  
9            0  
10           0  
11           0  
12           0  
13           0  
14           0  
15           0  
16           0  
17           0  
18           0  
19           0  
20           0  
21           0  
22           0  
23           0  
24           0  
25           0  
26           0  
27           0  
28           0  
29           0  
..         ...  
70           0  
71           0  
72           0  
73           0  
74           0  
75           0  
76           0  
77           0  
78           0  
79           0  
80           0  
81           0  
82           0  
83           0  
84           0  
85           0  
86           0  
87           0  
88           0  
89           0  
90           0  
91           0  
92           0  
93           0  
94           0  
95           0  
96           0  
97           0  
98           0  
99           1  

[100 rows x 12993 columns]

Here is my code:

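import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier


def encode_target(df, target_column):
    # Copy the frame and add an integer-coded "Target" column,
    # one integer per unique label found in target_column.
    df_mod = df.copy()
    targets = df_mod[target_column].unique()
    map_to_int = {name: n for n, name in enumerate(targets)}
    df_mod["Target"] = df_mod[target_column].replace(map_to_int)

    return (df_mod, targets)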

df = pd.read_csv("C:/Users/djsg38/Documents/CS6001-SpatialTemporal/HW2/finalCounts.csv")
df2, targets = encode_target(df, "MYLABEL")

features = list(df2.columns[:12338])

y = df2["Target"]
x = df2[features]

dt = DecisionTreeClassifier()
dt.fit(x, y)

scores = cross_val_score(dt, x, y, cv=5)

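My DecisionTreeClassifier seems to work fine, and it looks good when I export it as an image, but the problem here is the last line.

P.S. I'm not sure whether there is a limit on the number of columns. The classic example I followed used the Iris dataset, which has only four feature columns. I, however, have 12,338 columns of data (the count of each unique word across 100 articles).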

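[Comments]:

First, post the full stack trace of the error. Second, I can't follow your code. What does encode_target do? You pass df2[features] as x to dt, but pass it as y in cross_val_score.
encode_target takes the DataFrame's values based on target_column, which for me is 'MYLABEL', the label I gave it. It then grabs all the UNIQUE values in that column and puts them in a list. Then it enumerates them and gives them integer values, since apparently the classifier can only handle integers. I'm not sure how much help I can be; as I said in the question, I'm following an online tutorial for doing this classification. But supposedly the y value is the DataFrame column at 'Target', i.e. the integer mapping. The x values are everything else.
Having seen the function, I understand now. You should use the same X, y in cross_val_score that you used in dt.fit.
You know, reading your comments and being asked about it, I realize it does seem odd that I wasn't using the same things. I'll try it today! I was just following the tutorials as closely as I could, and that seems to be how they did it. I'll update if it still doesn't work. Thank you so far, Vivek.
Also, please mention the source of the tutorial you are using.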

Tags: python scikit-learn cross-validation

[Solution 1]:

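Contrary to the tutorials I was following, I could not pass in my x values without getting the error. The reason is probably the string headers in it, but I'm not positive.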
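The thread never confirms why the headers would matter, so what follows is an assumption. The TypeError is raised in `_num_samples`, and scikit-learn releases of that period appear to have rejected any input that had a `fit` attribute. pandas exposes columns as attributes, so a word-count column literally named `fit` could make the DataFrame look like an estimator. Under that assumption, a minimal sketch of one way around it is to hand `cross_val_score` plain NumPy arrays, using the same X and y throughout. The toy data and column names below are made up for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Toy stand-in for the word-count table: 100 rows of word columns plus a label.
# The column names, including "fit", are invented for this sketch.
rng = np.random.RandomState(0)
features = ["Its", "official", "fit", "weigh"]
df = pd.DataFrame(rng.randint(0, 5, size=(100, 4)), columns=features)
df["Target"] = rng.randint(0, 2, size=100)

# Plain NumPy arrays carry no column names, so nothing in the header row
# can be picked up as an attribute of the object passed to scikit-learn.
X = df[features].values
y = df["Target"].values

# cross_val_score clones and fits the estimator itself on each fold,
# so a prior dt.fit(X, y) call is not needed for the scores.
dt = DecisionTreeClassifier(random_state=0)
scores = cross_val_score(dt, X, y, cv=5)
print(scores.shape)
```

If the headers were not the cause, this changes nothing about the result; it only removes the column names from what scikit-learn sees.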

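My workaround was simply to do the 5-fold split by hand: I divided my data into 5 parts and fit 5 different decision trees on it, each time holding out 1 part for testing.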
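The answer does not show its code, so the following is only a sketch of what a manual 5-fold split might look like, not the author's actual implementation. The helper name `k_fold_indices` is hypothetical. It uses contiguous, unshuffled folds; if the rows are sorted by label, they would need to be shuffled first.

```python
def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k contiguous (train, test) index lists."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train_idx, test_idx))
        start += size
    return folds


folds = k_fold_indices(100, k=5)
for train_idx, test_idx in folds:
    # With the question's data, a fresh DecisionTreeClassifier would be fit on
    # the train_idx rows and scored on the test_idx rows at this point.
    print(len(train_idx), len(test_idx))
```

For the 100 articles in the question, each fold trains on 80 rows and tests on the remaining 20, and every row is used for testing exactly once.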
