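【Posted】: 2019-12-22 10:04:19
【Question】:
This is what I found here...
I use the same logic as the original author, but I still don't get good accuracy. The mean reciprocal rank is close (mine: 52.79, example notebook: 48.04).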
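# Imports added for completeness; to_categorical is assumed to come from Keras
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import label_ranking_average_precision_score
from sklearn.model_selection import train_test_split
from keras.utils import to_categorical

# Binary bag-of-words features, ignoring terms found in more than 95% of documents
cv = CountVectorizer(binary=True, max_df=0.95)
feature_set = cv.fit_transform(df["short_description"])
X_train, X_test, y_train, y_test = train_test_split(
    feature_set, df["category"].values, random_state=2000)
scikit_log_reg = LogisticRegression(
    verbose=1, solver="liblinear", random_state=0, C=5, penalty="l2", max_iter=1000)
model = scikit_log_reg.fit(X_train, y_train)
# One-hot encode the true labels (to_categorical expects integer-encoded categories)
target = to_categorical(y_test)
y_pred = model.predict_proba(X_test)
label_ranking_average_precision_score(target, y_pred)
>> 0.5279108613021547
model.score(X_test, y_test)
>> 0.38620071684587814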
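For context, when every sample has exactly one true label, label ranking average precision reduces to the mean reciprocal rank, while model.score reports top-1 accuracy. The two measure different things, which is one way an MRR of 52.79 can sit next to a top-1 accuracy of 38.62. Below is a minimal pure-Python sketch on made-up scores (not the data from the question); the helper names and toy values are hypothetical.

```python
def reciprocal_rank(scores, true_index):
    """Return 1 / rank of the true class, where rank 1 is the highest score."""
    true_score = scores[true_index]
    # Rank = number of classes scoring at least as high as the true class
    rank = sum(1 for s in scores if s >= true_score)
    return 1.0 / rank


def mean_reciprocal_rank(score_rows, true_indices):
    """Average the reciprocal rank of the true class over all samples."""
    total = sum(reciprocal_rank(row, t) for row, t in zip(score_rows, true_indices))
    return total / len(true_indices)


def top1_accuracy(score_rows, true_indices):
    """Fraction of samples whose highest-scoring class is the true class."""
    hits = 0
    for row, t in zip(score_rows, true_indices):
        predicted = max(range(len(row)), key=lambda i: row[i])
        if predicted == t:
            hits += 1
    return hits / len(true_indices)


# Hypothetical class probabilities for four samples and three classes
toy_scores = [
    [0.6, 0.3, 0.1],  # true class 0 is ranked 1st
    [0.5, 0.4, 0.1],  # true class 1 is ranked 2nd
    [0.5, 0.3, 0.2],  # true class 2 is ranked 3rd
    [0.2, 0.7, 0.1],  # true class 1 is ranked 1st
]
toy_true = [0, 1, 2, 1]

mrr = mean_reciprocal_rank(toy_scores, toy_true)
acc = top1_accuracy(toy_scores, toy_true)
print(f"MRR: {mrr:.4f}, top-1 accuracy: {acc:.4f}")
```

A sample whose true class is ranked second or third still adds partial credit to the MRR but nothing to top-1 accuracy, so the MRR is never below the top-1 accuracy.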
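However, the accuracy reported in the sample notebook (59.80) does not match the accuracy from my code (38.62).
Does the following function, used in the example notebook, compute accuracy correctly?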
def compute_accuracy(eval_items:list):
    correct=0
    total=0
    for item in eval_items:
        true_pred=item[0]
        machine_pred=set(item[1])
        for cat in true_pred:
            if cat in machine_pred:
                correct+=1
                break
    accuracy=correct/float(len(eval_items))
    return accuracy
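Read as written, the function counts an item as correct if any of its true categories appears anywhere in the machine's prediction list. If that list holds more than one prediction per item (for example, the top 3), the result is a top-k hit rate rather than the top-1 accuracy that model.score returns. Whether the notebook actually passes several predictions per item is an assumption here. A minimal sketch with made-up labels; top_k_items and the toy data are hypothetical:

```python
def compute_accuracy(eval_items):
    """Condensed restatement of the notebook function: an item is a hit
    if any true label appears among the machine predictions."""
    correct = 0
    for true_labels, predicted_labels in eval_items:
        if set(true_labels) & set(predicted_labels):
            correct += 1
    return correct / float(len(eval_items))


def top_k_items(true_labels, ranked_predictions, k):
    """Hypothetical helper: pair each true label with the k best predictions."""
    return [([t], ranked[:k]) for t, ranked in zip(true_labels, ranked_predictions)]


# Made-up true categories and ranked predictions (best first)
toy_true = ["SPORTS", "POLITICS", "TRAVEL", "FOOD"]
toy_ranked = [
    ["SPORTS", "TRAVEL", "FOOD"],      # true label ranked 1st
    ["TRAVEL", "POLITICS", "FOOD"],    # true label ranked 2nd
    ["FOOD", "SPORTS", "TRAVEL"],      # true label ranked 3rd
    ["POLITICS", "SPORTS", "TRAVEL"],  # true label not in the top 3
]

top1 = compute_accuracy(top_k_items(toy_true, toy_ranked, 1))
top3 = compute_accuracy(top_k_items(toy_true, toy_ranked, 3))
print(f"top-1: {top1:.2f}, top-3: {top3:.2f}")
```

With the same rankings, the score rises as k grows, so a figure computed from several predictions per item is not directly comparable to model.score.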
【Discussion】:
Tags: machine-learning scikit-learn logistic-regression