【Posted】: 2023-03-03 03:15:02
【Question】:
from sklearn.feature_extraction.text import CountVectorizer
vectorizer = CountVectorizer()
train_matrix = vectorizer.fit_transform(train_data['review'])
test_matrix = vectorizer.fit_transform(test_data['review'])
Training the LogisticRegression model
from sklearn.linear_model import LogisticRegression
sentiment_model = LogisticRegression()
sentiment_model = sentiment_model.fit(train_matrix,train_data['sentiment'])
Checking on sample data
sample_test_data = test_data[10:13]
sample_test_matrix = vectorizer.fit_transform(sample_test_data['review'])
predict = sentiment_model.predict(sample_test_matrix)
Error:
X has 85 features per sample; expecting 121676
ValueError                                Traceback (most recent call last)
in ()
----> 1 predict = model.predict(sample_test_matrix)
~\Anaconda3\lib\site-packages\sklearn\linear_model\base.py in predict(self, X)
        Predicted class label per sample.
---------> scores = self.decision_function(X)
        if len(scores.shape) == 1:
            indices = (scores > 0).astype(np.int)
decision_function(self, X)
        if X.shape[1] != n_features:
            raise ValueError("X has %d features per sample; expecting %d"
------------>                % (X.shape[1], n_features))
        scores = safe_sparse_dot(X, self.coef_.T,
ValueError: X has 85 features per sample; expecting 121676
【Discussion】:
-
Please include the full stack trace so we can see which line raised the error.
-
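This is the Amazon product reviews data from the Coursera classification assignment. I even wrote the same code as the help desk, but I still get this error.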
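-
The numbers in the message suggest a likely cause: each call to `fit_transform` makes `CountVectorizer` learn a new vocabulary, so the three sample reviews produce a matrix with only 85 columns, while the model was trained on 121676. A minimal sketch of the usual fix is below: fit the vectorizer once on the training text and call `transform` on everything else. The toy reviews and labels are hypothetical stand-ins for `train_data` and `test_data`, which are not shown in the question.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data standing in for train_data / test_data from the question.
train_reviews = [
    "great product love it",
    "terrible waste of money",
    "love this works great",
    "broke quickly terrible quality",
]
train_sentiment = [1, -1, 1, -1]
test_reviews = ["great quality", "terrible product broke", "love it"]

vectorizer = CountVectorizer()
# Learn the vocabulary from the training text only.
train_matrix = vectorizer.fit_transform(train_reviews)
# Reuse that vocabulary for later text: transform, not fit_transform.
test_matrix = vectorizer.transform(test_reviews)

sentiment_model = LogisticRegression()
sentiment_model.fit(train_matrix, train_sentiment)

# A slice of the test set is vectorized with the same fitted vectorizer.
sample_test_matrix = vectorizer.transform(test_reviews[0:2])
predictions = sentiment_model.predict(sample_test_matrix)

# The column counts match, so predict does not raise the ValueError.
print(train_matrix.shape[1], sample_test_matrix.shape[1])
print(predictions)
```

Words in the test reviews that never appeared in the training text are simply ignored by `transform`, which is why the column count stays fixed.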
Tags: python machine-learning scikit-learn logistic-regression sentiment-analysis