What is causing this ValueError: could not convert string to float: '' while using logisticRegr.fit(x_train, y_train)

【Question title】: What is causing this ValueError: could not convert string to float: '' while using logisticRegr.fit(x_train, y_train)
【Posted】: 2021-11-26 03:12:53
【Question】:

I am trying to run train_test_split and then use the resulting split to fit a logistic regression.

from sklearn.model_selection import train_test_split

x = hotel_data.iloc[:,:-1]
y = hotel_data.iloc[:,-1]

x_train, x_test, y_train, y_test = train_test_split(x,y, test_size=0.20, random_state=0)

When I run the code

logisticRegr.fit(x_train, y_train)

I get

ValueError: could not convert string to float: 'CityHotel'

"CityHotel" is one of the hotel types listed in the "hotel" column of the "hotel_data" dataset I am using. The other value in that column is "ResortHotel".

【Comments】:

Tags: python machine-learning scikit-learn linear-regression

【Solution 1】:

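Your data contains categorical variables. "CityHotel" is a value in one of your feature columns, and that column is categorical: its entries are categories stored as strings (like "cat", "dog", "fish") rather than numbers. The model cannot accept them as input, hence the complaint could not convert string to float: the estimator expects numeric input and has no way of turning arbitrary strings into floats.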
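To see which columns trigger the error, one option is to list the non-numeric feature columns before fitting. This is only a minimal sketch: the small hotel_data frame below is a made-up stand-in, and the column names lead_time and is_canceled are assumptions, since the question only mentions the hotel column and its two values.

```python
import pandas as pd

# Hypothetical stand-in for hotel_data. Only the "hotel" column and its
# "CityHotel"/"ResortHotel" values come from the question; the other
# column names and all numbers are invented for illustration.
hotel_data = pd.DataFrame({
    "hotel": ["CityHotel", "ResortHotel", "CityHotel", "ResortHotel"],
    "lead_time": [10, 45, 3, 120],
    "is_canceled": [0, 1, 0, 1],
})

# Same feature selection as in the question: every column except the last.
x = hotel_data.iloc[:, :-1]

# Columns that are not numeric are the ones fit() cannot convert to float.
non_numeric_cols = x.select_dtypes(exclude="number").columns.tolist()
print(non_numeric_cols)
```

Any column reported here needs to be encoded (or dropped) before calling fit.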

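The fix is straightforward: encode the categorical variable. There are many encoding methods (search online for an overview), and you should pick the one that best fits your data. I usually start by computing the cardinality, i.e. the number of unique values in the hotel column, and then decide which encoding is the most appropriate.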
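As an illustration of that workflow, the sketch below checks the cardinality of the hotel column and then one-hot encodes it with pandas.get_dummies before splitting and fitting. It is one possible approach, not the only one: the hotel_data frame is again a made-up stand-in, the lead_time and is_canceled columns are assumptions, and the max_iter value is an arbitrary choice.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for hotel_data; only the "hotel" column and its two
# values come from the question, everything else is invented.
hotel_data = pd.DataFrame({
    "hotel": ["CityHotel", "ResortHotel"] * 5,
    "lead_time": [10, 45, 3, 120, 7, 60, 15, 90, 1, 200],
    "is_canceled": [0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
})

x = hotel_data.iloc[:, :-1]
y = hotel_data.iloc[:, -1]

# Cardinality: the number of distinct values in the categorical column.
cardinality = x["hotel"].nunique()
print(cardinality)

# With only two categories, one-hot encoding is a reasonable choice.
# drop_first=True keeps a single indicator column for the binary category.
x_encoded = pd.get_dummies(x, columns=["hotel"], drop_first=True, dtype=float)

x_train, x_test, y_train, y_test = train_test_split(
    x_encoded, y, test_size=0.20, random_state=0
)

# All features are numeric now, so fit() no longer hits the string column.
logisticRegr = LogisticRegression(max_iter=1000)
logisticRegr.fit(x_train, y_train)
```

For columns with many categories, or when new categories may appear at prediction time, scikit-learn's OneHotEncoder with handle_unknown="ignore" inside a pipeline may be a better fit than get_dummies.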

【Comments】: