【Posted】: 2015-04-03 10:29:35
【Question】:
I am learning text classification and am classifying my own corpus with logistic regression, as follows:
from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression(penalty='l2', C=7)
classifier.fit(training_matrix, y_train)
prediction = classifier.predict(testing_matrix)
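I would like to use the restricted Boltzmann machine provided by scikit-learn to improve the classification report. From the documentation I read that it can be used to improve classification recall, F1 score, accuracy, and so on. Could someone help me improve these scores? This is what I have tried so far. Thanks in advance: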
from sklearn.feature_extraction.text import TfidfVectorizer
vectorizer = TfidfVectorizer(max_df=0.5,
                             max_features=None,
                             ngram_range=(1, 1),
                             norm='l2',
                             use_idf=True)
X_train = vectorizer.fit_transform(X_train_r)
X_test = vectorizer.transform(X_test_r)
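from sklearn.pipeline import Pipeline
from sklearn.neural_network import BernoulliRBM
logistic = LogisticRegression()
# The RBM learns features from the TF-IDF matrix without using the labels
rbm = BernoulliRBM(random_state=0, verbose=True)
# Chain the RBM feature extractor with the logistic regression classifier
classifier = Pipeline(steps=[('rbm', rbm), ('logistic', logistic)])
classifier.fit(X_train, y_train)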
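To check whether the RBM step helps at all, the pipeline can be scored next to the plain logistic regression on the same test split. The following is only a minimal sketch of that comparison: the toy corpus, the labels, the `evaluate` helper, and the RBM settings (`n_components=8`, `learning_rate=0.05`, `n_iter=20`) are hypothetical placeholders and are not taken from my real data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline

# Hypothetical toy corpus standing in for X_train_r / X_test_r.
train_docs = [
    "cheap pills buy now", "win money fast now", "buy cheap watches",
    "meeting agenda for monday", "project report attached",
    "lunch meeting on friday",
]
train_labels = [1, 1, 1, 0, 0, 0]
test_docs = ["buy pills now", "monday project meeting"]
test_labels = [1, 0]

# L2-normalised TF-IDF values lie in [0, 1], the range BernoulliRBM expects.
vectorizer = TfidfVectorizer(norm="l2", use_idf=True)
X_train = vectorizer.fit_transform(train_docs)
X_test = vectorizer.transform(test_docs)


def evaluate(model):
    """Fit on the training split and return the report for the test split."""
    model.fit(X_train, train_labels)
    predicted = model.predict(X_test)
    return classification_report(test_labels, predicted, zero_division=0)


# Baseline: logistic regression directly on the TF-IDF features.
baseline_report = evaluate(LogisticRegression(C=7))

# Candidate: RBM features feeding the same classifier (assumed settings).
rbm_pipeline = Pipeline(steps=[
    ("rbm", BernoulliRBM(n_components=8, learning_rate=0.05,
                         n_iter=20, random_state=0)),
    ("logistic", LogisticRegression(C=7)),
])
rbm_report = evaluate(rbm_pipeline)

print(baseline_report)
print(rbm_report)
```

Comparing the two reports side by side shows whether the RBM features improve on the baseline for a given corpus. The RBM hyperparameters above are guesses and would likely need tuning, for example with `GridSearchCV` over the pipeline.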
【Discussion】:
Tags: python python-2.7 machine-learning nlp scikit-learn