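First, I don't really understand why you chose gamma='auto' as one of your hyperparameters. If you drop it and let the model use its default gamma ('scale' since scikit-learn 0.22), you may get better performance.
Also, a small C and a small epsilon can work against each other, so I think it is a good idea to balance these two hyperparameters.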
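To make that interplay concrete: the data-fit part of the SVR objective is C times the sum of epsilon-insensitive losses, where an error smaller than epsilon costs nothing. The following is only a dependency-free sketch; the helper names `eps_insensitive_loss` and `penalty` and the sample numbers are my own assumptions for illustration, not part of scikit-learn.

```python
def eps_insensitive_loss(y_true, y_pred, epsilon):
    """Epsilon-insensitive loss: errors inside the epsilon tube cost nothing."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def penalty(y_true, y_pred, C, epsilon):
    """Data-fit term of the SVR objective: C times the summed tube violations."""
    return C * sum(eps_insensitive_loss(t, p, epsilon) for t, p in zip(y_true, y_pred))


# Hypothetical targets on the scale of a few thousand, and a constant prediction.
y_true = [3690.0, 3355.0, 3916.0]
y_flat = [3346.0, 3346.0, 3346.0]

# Compare how much a flat prediction is penalized for different C and epsilon.
for C in (0.1, 100):
    for epsilon in (0.2, 200.0):
        print("C =", C, "epsilon =", epsilon, "penalty =", penalty(y_true, y_flat, C, epsilon))
```

With targets in the thousands, an epsilon of 0.2 is a very narrow tube, so almost every error is penalized and C alone decides how strongly the model is pushed to fit the data.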
Here I generated some random data to work out how to handle this; I hope it helps you solve your problem.
Code:
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

# Make random data: 10 samples, 12 monthly values each
month_rain = np.random.randint(1000, 5000, size=(10, 12))
X = month_rain
y = np.random.randint(3000, 4000, size=(10, 1))

X_train, X_test, y_train, y_test = train_test_split(X, y)

svm_reg = SVR(gamma='auto', C=0.1, epsilon=0.2)   # original model
svm_reg2 = SVR(C=0.1, epsilon=0.2)                # drop gamma (use the default)
svm_reg3 = SVR(C=100, epsilon=0.2)                # drop gamma and use a larger C
svm_reg4 = SVR(gamma='auto', C=100, epsilon=0.2)  # keep gamma='auto', use a larger C

# SVR expects a 1-D target, so flatten y_train when fitting
svm_reg.fit(X_train, y_train.ravel())
svm_reg2.fit(X_train, y_train.ravel())
svm_reg3.fit(X_train, y_train.ravel())
svm_reg4.fit(X_train, y_train.ravel())

# Model score (R^2) on the training set
print(svm_reg.score(X_train, y_train))
print(svm_reg2.score(X_train, y_train))
print(svm_reg3.score(X_train, y_train))
print(svm_reg4.score(X_train, y_train))

# Predictions on the test set, printed next to the true values
y_pred = svm_reg.predict(X_test)
y_pred2 = svm_reg2.predict(X_test)
y_pred3 = svm_reg3.predict(X_test)
y_pred4 = svm_reg4.predict(X_test)
print(y_test)
print(y_pred.reshape(-1, 1))
print(y_pred2.reshape(-1, 1))
print(y_pred3.reshape(-1, 1))
print(y_pred4.reshape(-1, 1))
Output:
score:
-0.05514476528005918
-0.055253731765687375
0.40714376538337693
0.47055666976833854
result:
y_test (actual values):
[[3690]
[3355]
[3916]]
model 1:
[[3346.]
[3346.]
[3346.]]
model 2:
[[3345.95909456]
[3345.99648151]
[3345.933001 ]]
model 3:
[[3305.09456122]
[3342.48150808]
[3279.00100083]]
model 4:
[[3346.]
[3346.]
[3346.]]
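A likely reason models 1 and 4 predict the same 3346 for every test sample: gamma='auto' means 1 / n_features, while the default gamma='scale' means 1 / (n_features * X.var()). On unscaled features in the thousands, the 'auto' value makes the RBF kernel between two different samples underflow to zero, so an unseen sample gets little more than the intercept. The sketch below is dependency-free and uses made-up data; the helper `rbf` is my own name for illustration.

```python
import math
import random

random.seed(0)
n_samples, n_features = 10, 12
# Made-up monthly values on the same scale as the example data.
X = [[random.randint(1000, 4999) for _ in range(n_features)] for _ in range(n_samples)]

# Variance over all entries of X, which is what gamma='scale' is based on.
flat = [v for row in X for v in row]
mean = sum(flat) / len(flat)
var = sum((v - mean) ** 2 for v in flat) / len(flat)

gamma_auto = 1.0 / n_features
gamma_scale = 1.0 / (n_features * var)


def rbf(a, b, gamma):
    """RBF kernel exp(-gamma * ||a - b||^2) between two samples."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)


print("gamma auto :", gamma_auto, "kernel:", rbf(X[0], X[1], gamma_auto))
print("gamma scale:", gamma_scale, "kernel:", rbf(X[0], X[1], gamma_scale))
```

With 'auto' the kernel value is numerically zero, so no training sample influences a new point; with 'scale' it stays strictly between 0 and 1, which is why models 2 and 3 produce predictions that differ from sample to sample.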
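So I suggest using a larger C. A larger C penalizes training errors more heavily, which lets the model fit the data more closely: in this example it raised the training score from about -0.06 to above 0.4. Note that only model 3 (default gamma, C=100) also gave predictions that vary across the test samples; with gamma='auto' the predictions stayed constant even with the larger C.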
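Rather than picking C and epsilon by hand, one option is a small cross-validated grid search. This is only a sketch under my own assumptions: the grid values, the 5-fold split, the mean-absolute-error scoring, the random data, and the added StandardScaler step (RBF kernels are sensitive to feature scale) are placeholders, not something taken from your setup.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Random placeholder data with the same value ranges as the example above.
rng = np.random.default_rng(0)
X = rng.integers(1000, 5000, size=(40, 12)).astype(float)
y = rng.integers(3000, 4000, size=40).astype(float)

# Standardize the features, then fit an SVR with the default gamma.
pipe = make_pipeline(StandardScaler(), SVR())

# make_pipeline names the SVR step "svr", hence the "svr__" prefix.
param_grid = {
    "svr__C": [0.1, 1, 10, 100, 1000],
    "svr__epsilon": [0.2, 10, 50, 100],
}

search = GridSearchCV(pipe, param_grid, cv=5, scoring="neg_mean_absolute_error")
search.fit(X, y)

print(search.best_params_)
print(-search.best_score_)  # mean absolute error of the best combination
```

On purely random data like this, no combination will predict well; the point is only the mechanics of searching C and epsilon together instead of fixing one and tuning the other.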