【Question title】: Infinity error when preprocessing a dataset in Python
【Posted】: 2021-03-21 08:10:23
【Question】:

How can I fix this error? I have already imported numpy, pandas, and matplotlib.pyplot.

I load the projectDataset.csv dataset:

import numpy as np
import pandas as pd

dataset = pd.read_csv('projectDataset.csv')
x = dataset.iloc[:, 7:54].values
y = dataset.iloc[:, 83].values

then split it into training and test sets:

from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x,y,test_size = 0.2, random_state = 1)

and apply feature scaling:

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
x_train = sc.fit_transform(x_train)
x_test = sc.fit_transform(x_test)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-291-3ad5b591b7f2> in <module>()
      1 from sklearn.preprocessing import StandardScaler
      2 sc = StandardScaler()
----> 3 x_train = sc.fit_transform(x_train)
      4 x_test = sc.fit_transform(x_test)

4 frames
/usr/local/lib/python3.6/dist-packages/sklearn/utils/validation.py in _assert_all_finite(X, allow_nan, msg_dtype)
     58                     msg_err.format
     59                     (type_err,
---> 60                      msg_dtype if msg_dtype is not None else X.dtype)
     61             )
     62     # for object dtype data, we only check for NaNs (GH-13254)

ValueError: Input contains infinity or a value too large for dtype('float64').

To me, this error looks as if my dataset contains no data.

【Comments】:

    Tags: python numpy dataset preprocessor infinity


    【Solution 1】:

    It looks like the data contains a few np.inf values. Replacing them with np.nan should help:

    dataset = dataset.replace([np.inf,-np.inf], np.nan)
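
A minimal sketch of how that replacement might fit into the whole pipeline. The small DataFrame and its column names (`f1`, `f2`, `target`) are made up to stand in for projectDataset.csv, and dropping the affected rows is only one option (imputing the NaNs is another). Note that the replacement has to happen before `x` and `y` are extracted with `.values`, otherwise the arrays still hold the infinite entries. The sketch also calls `transform` rather than `fit_transform` on the test set, which is a change from the question's code: the usual practice is to fit the scaler on the training data only.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for projectDataset.csv: two features and a target.
dataset = pd.DataFrame({
    "f1": [1.0, 2.0, np.inf, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
    "f2": [0.5, -np.inf, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0],
    "target": [0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
})

# Count +/-inf entries per numeric column to confirm the diagnosis.
numeric = dataset.select_dtypes(include=[np.number])
inf_counts = pd.Series(
    np.isinf(numeric.to_numpy(dtype=float)).sum(axis=0),
    index=numeric.columns,
)
print(inf_counts)

# Replace infinities with NaN, then drop the rows that contain NaN.
# (Assumption: losing those rows is acceptable; imputation is an alternative.)
clean = dataset.replace([np.inf, -np.inf], np.nan).dropna()

# Extract the arrays only after cleaning.
x = clean[["f1", "f2"]].values
y = clean["target"].values

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=1
)

# Fit the scaler on the training set and reuse its statistics for the test set.
sc = StandardScaler()
x_train = sc.fit_transform(x_train)
x_test = sc.transform(x_test)
```

If `inf_counts` is zero everywhere but the error persists, the other case named in the message may apply: a value too large for float64.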
    

    【Discussion】:
