How do I make fit work in logistic regression? Answers

【Question title】: How do I make fit work in logistic regression?
【Posted】: 2021-06-29 21:36:28
【Question description】:

I am trying to fit data with logistic regression, but I get a ValueError.

I am using the iris dataset from sklearn:

# The data is in iris["data"] and target in iris["target"]
# For this section, we will work with a single feature 'petal width'
# which is the last (fourth) feature in iris["data"]
# We will assign class y=1 if the target's value is 2 and 0 otherwise

from sklearn.datasets import load_iris
import numpy as np

iris = load_iris()

# petal width
X = np.array([len(iris["data"]),1]).reshape(-1,1)
# 1 if Iris virginica, else 0
y = []
for x in iris["target"]:
    if x == 2.0:
        y.append(1)
    else:
        y.append(0)
y = np.array(y)

# Import the LogisticRegression class from scikit learn
from sklearn.linear_model import LogisticRegression

# Initialize the LogisticRegression class, use lbfgs solver and random state of 42
log_reg = LogisticRegression(solver='lbfgs', random_state=42)

# Fit the data
log_reg.fit(X, y)

This is the error I get:

ValueError: Found input variables with inconsistent numbers of samples: [2, 150]

I am not sure whether it is my X or my y that is set up incorrectly.

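【Comments】:

Could you print the shapes of X and y after y = np.array(y) and add them to your question? In my experience, this error usually comes from mismatched dimensions.
Welcome to SO; if the answer solved your problem, please accept it - see What should I do when someone answers my question?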

Tags: python scikit-learn logistic-regression

【Solution 1】:

The cause is the incorrect reshaping of X that you attempt here:

X = np.array([len(iris["data"]),1]).reshape(-1,1)

This produces an array with the following shape:

X.shape
# (2,1)

Hence the inconsistent numbers of samples, since y has 150 entries:

y.shape
# (150,)
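
To see where the (2, 1) shape comes from, it may help to look at what the original expression actually builds. This is a minimal sketch; `n_samples` and `wrong_X` are illustrative names that do not appear in the question, and `n_samples = 150` simply stands in for `len(iris["data"])`:

```python
import numpy as np

n_samples = 150  # stands in for len(iris["data"]); illustrative name only

# np.array([n_samples, 1]) does not create an array OF that shape.
# It creates a 1-D array that holds the two numbers 150 and 1.
wrong_X = np.array([n_samples, 1])
print(wrong_X.tolist())  # [150, 1]

# Reshaping those two values into a column yields 2 rows,
# which scikit-learn then interprets as 2 samples.
wrong_X = wrong_X.reshape(-1, 1)
print(wrong_X.shape)     # (2, 1)
```

In other words, the list passed to `np.array` is treated as data, not as a shape, so none of the iris measurements ever end up in X.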

This reshape is wrong. Judging from the comments in your code, you only want the fourth feature (petal width), so you should change it to:

X = iris['data'][:,3].reshape(-1,1)

This does give the correct shape:

X.shape
# (150, 1)

With this change, your model will fit without any problems (tested).
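
For reference, here is one way the whole script could look with the corrected X. It is a sketch rather than the only way to write it: the vectorized construction of y replaces the loop from the question, and the two petal widths passed to `predict` are made-up values chosen only to exercise the fitted model.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

iris = load_iris()

# Petal width is the fourth column; reshape it to (n_samples, 1),
# the 2-D layout that scikit-learn estimators expect for X.
X = iris["data"][:, 3].reshape(-1, 1)

# 1 if Iris virginica (target == 2), else 0.
# Vectorized alternative to the for loop in the question.
y = (iris["target"] == 2).astype(int)

print(X.shape, y.shape)  # (150, 1) (150,)

log_reg = LogisticRegression(solver="lbfgs", random_state=42)
log_reg.fit(X, y)

# Predict the class for two hypothetical petal widths (in cm).
preds = log_reg.predict(np.array([[1.0], [2.5]]))
print(preds)
```

Both arrays now have 150 rows, so `fit` no longer raises the inconsistent-samples error.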
