【Question title】: How to solve AttributeError: 'DataFrame' object has no attribute 'as_matrix' (using Python 3.8)
【Posted】: 2020-10-24 00:44:09
【Question】:

Hi everyone. When I run the code below in a Jupyter notebook, I get AttributeError: 'DataFrame' object has no attribute 'as_matrix', raised by these two lines (under the comment "#create x & y variables"):

X = features_df.as_matrix()
y = df['Price'].as_matrix()

My full code is below:

#developing a model to predict house prices in Australia
#importing needed libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn import ensemble
from sklearn.metrics import mean_absolute_error
import joblib
#loading the data file
df = pd.read_csv('~/mypython/machine_learning/machine_learning/housing/Melbourne_housing_FULL.csv')
#removing less relevant columns
del df['Address']
del df['Method']
del df['SellerG']
del df['Date']
del df['Postcode']
del df['Lattitude']
del df['Longtitude']
del df['Regionname']
del df['Propertycount']

#delete rows with any empty value
df.dropna(axis=0, how='any', inplace=True)

#converting non-numerical values to numerical values using pandas
features_df = pd.get_dummies(df, columns=['Suburb', 'CouncilArea', 'Type'])

# delete price because it's the dependent variable
del features_df['Price']

#create x & y variables 
X = features_df.as_matrix()
y = df['Price'].as_matrix()

X_train, X_test, y_train, y_test=train_test_split(X, y, test_size=0.3,random_state=0)

model = ensemble.GradientBoostingRegressor(
    n_estimators=150,
    learning_rate=0.1,
    max_depth=30,
    min_samples_split=4,
    min_samples_leaf=6,
    max_features=0.6,
    loss="huber")

model.fit(X_train,y_train)

joblib.dump(model, "house_train_model.pkl")

mae = mean_absolute_error(y_train, model.predict(X_train))
print("Training set mean absolute error: %.2f" % mae)

【Comments】:

  • What happens if you just use X = features_df and y = df['Price']?
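
The suggestion in this comment can be tried directly, since scikit-learn's train_test_split accepts pandas objects as well as NumPy arrays. A minimal sketch, assuming a small made-up DataFrame in place of the Melbourne housing file (the column values are illustrative, not taken from the real data):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the housing data; all values are made up.
df = pd.DataFrame({
    "Rooms": [2, 3, 4, 3, 5, 2, 4, 3, 2, 5],
    "Distance": [2.5, 7.0, 11.2, 5.1, 14.0, 3.3, 9.8, 6.4, 1.9, 12.7],
    "Price": [850e3, 1.2e6, 990e3, 1.05e6, 1.4e6, 780e3, 1.1e6, 960e3, 820e3, 1.3e6],
})

# Pass the DataFrame and Series as they are, with no array conversion.
X = df.drop(columns=["Price"])
y = df["Price"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# The splits keep their pandas types and column names.
print(type(X_train).__name__, X_train.shape)
print(type(y_train).__name__, y_train.shape)
```

Keeping the data as a DataFrame preserves the column names, which can make later inspection of the features easier.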

Tags: pandas error-handling scikit-learn jupyter-notebook python-3.8


【Solution 1】:

You should use this instead:

X = features_df.values

y = df['Price'].values
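
For context, as_matrix() was deprecated in pandas 0.23 and removed in pandas 1.0, which is why the call fails on a recent install. Below is a minimal sketch of the replacement, assuming a small made-up DataFrame in place of the Melbourne housing data (the values are illustrative). It uses .to_numpy(), which the pandas documentation recommends over .values, and checks that both give the same arrays:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the housing data; all values are made up.
df = pd.DataFrame({
    "Rooms": [2, 3, 4],
    "Distance": [2.5, 7.0, 11.2],
    "Price": [850000.0, 1200000.0, 990000.0],
})

features_df = df.drop(columns=["Price"])

# Replacement for features_df.as_matrix() and df['Price'].as_matrix()
X = features_df.to_numpy()
y = df["Price"].to_numpy()

# .values returns the same NumPy arrays
assert np.array_equal(X, features_df.values)
assert np.array_equal(y, df["Price"].values)

print(type(X).__name__, X.shape)
print(type(y).__name__, y.shape)
```

Either accessor should work as a drop-in replacement in the question's code.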
