只需将标签编码为整数,一切都会运行良好。浮点数表示回归。
特别是你可以使用 LabelEncoder http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.LabelEncoder.html
>>> from sklearn.ensemble import RandomForestClassifier as RF
>>> import numpy as np
>>> X = np.array([[0], [1], [1.2]])
>>> y = [0.5, 1.2, -0.1]
>>> clf = RF()
>>> clf.fit(X, y)
RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
max_depth=None, max_features='auto', max_leaf_nodes=None,
min_samples_leaf=1, min_samples_split=2,
min_weight_fraction_leaf=0.0, n_estimators=10, n_jobs=1,
oob_score=False, random_state=None, verbose=0,
warm_start=False)
>>> print clf.score(y, X)
Traceback (most recent call last):
[.....]
ValueError: continuous is not supported
>>> y = [0, 1, 2]
>>> clf.fit(X, y)
RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
max_depth=None, max_features='auto', max_leaf_nodes=None,
min_samples_leaf=1, min_samples_split=2,
min_weight_fraction_leaf=0.0, n_estimators=10, n_jobs=1,
oob_score=False, random_state=None, verbose=0,
warm_start=False)
>>> print clf.score(X, y)
1.0
或自己计算.score,因为这是非常微不足道的函数
print np.mean(clf.predict(X) == y)