【发布时间】:2019-11-06 09:35:46
【问题描述】:
我正在通过这个演练解决这个多元回归问题,但是从
开始的代码section : #Treating categorical variables with One-hot-encoding at website: https://towardsdatascience.com/what-makes-a-movie-hit-a-jackpot-learning-from-data-with-multiple-linear-regression-339f6c1a7022
到目前为止,我运行了代码,但它不适用于 (X)
实际代码:
from sklearn import preprocessing
le = preprocessing.LabelEncoder()
# LabelEncoder for a number of columns
class MultiColumnLabelEncoder:
def __init__(self, columns = None):
self.columns = columns # list of column to encode
def fit(self, X, y=None):
return self
def transform(self, X):
'''
Transforms columns of X specified in self.columns using
LabelEncoder(). If no columns specified, transforms all
columns in X.
'''
output = X.copy()
if self.columns is not None:
for col in self.columns:
output[col] = LabelEncoder().fit_transform(output[col])
else:
for colname, col in output.iteritems():
output[colname] = LabelEncoder().fit_transform(col)
return output
def fit_transform(self, X, y=None):
return self.fit(X, y).transform(X)
le = MultiColumnLabelEncoder()
X_train_le = le.fit_transform(X)
这是我得到的错误:
Traceback (most recent call last):
File "<ipython-input-63-581cea150670>", line 34, in <module>
X_train_le = le.fit_transform(X)
NameError: name 'X' is not defined
【问题讨论】:
-
X_train_le = le.fit_transform(X)该行不在您发布的代码中。请向我们展示您正在使用的实际代码。
标签: python scikit-learn data-science