Matplotlib: shape mismatch: objects cannot be broadcast to a single shape

【Question title】: Matplotlib: shape mismatch: objects cannot be broadcast to a single shape
【Posted】: 2021-09-12 17:41:10
【Question】:

I have a dataframe that looks like this (the real one is obviously much larger):

id     points isAvailable frequency   Score
abc1   325    0           93          0.01
def2   467    1           80          0.59
ghi3   122    1           90          1 
jkl4   546    1           84          0
mno5   355    0           93          0.99

I want to see how strongly points, isAvailable and frequency affect Score. I want to use a random forest, as in this example:

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
#from sklearn.inspection import permutation_importance
#import shap
from matplotlib import pyplot as plt

plt.rcParams.update({'figure.figsize': (12.0, 8.0)})
plt.rcParams.update({'font.size': 14})

list_of_columns = ['points','isAvailable', 'frequency']
X = df[list_of_columns]
target_column = 'Score'
y = df[target_column]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=12)

rf = RandomForestRegressor(n_estimators=100)
rf.fit(X_train, y_train)
rf.feature_importances_ #the array below is the output 
>>> array([0.44326132, 0.01666047, 0.        , 0.5400782 ])

plt.barh(df.columns, rf.feature_importances_)

On the last line I get the following error: ValueError: shape mismatch: objects cannot be broadcast to a single shape. Should I have created these columns at the start? Is there a problem with the (larger) data?

【Comments】:

Tags: python pandas dataframe matplotlib scikit-learn

【Solution 1】:

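The rf model was trained on X, which contains only a subset of df's columns, so rf.feature_importances_ has one entry per column of X, not per column of df. Plot the feature importances against X.columns (or list_of_columns) rather than df.columns: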

plt.barh(X.columns, rf.feature_importances_)
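To make the length mismatch concrete, here is a minimal, self-contained sketch. It assumes a small made-up dataframe with the same column names as in the question (the values are random and purely illustrative), and it fits on the full data for brevity instead of a train/test split. Wrapping the importances in a pandas Series indexed by X.columns is just one convenient way to keep names and values aligned; it is not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from sklearn.ensemble import RandomForestRegressor

# Made-up data: same column names as the question, random values (assumption).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "id": [f"row{i}" for i in range(n)],
    "points": rng.integers(100, 600, size=n),
    "isAvailable": rng.integers(0, 2, size=n),
    "frequency": rng.integers(80, 95, size=n),
})
df["Score"] = rng.random(n)

list_of_columns = ["points", "isAvailable", "frequency"]
X = df[list_of_columns]
y = df["Score"]

rf = RandomForestRegressor(n_estimators=100, random_state=12)
rf.fit(X, y)

# df has 5 columns, but the model only saw the 3 columns of X,
# which is why plt.barh(df.columns, ...) cannot broadcast.
print(len(df.columns), len(rf.feature_importances_))  # prints: 5 3

# Pair each importance with the name of the feature it belongs to.
importances = pd.Series(rf.feature_importances_, index=X.columns).sort_values()

fig, ax = plt.subplots()
ax.barh(importances.index, importances.values)
ax.set_xlabel("Feature importance")
fig.tight_layout()
```

Checking len(X.columns) == len(rf.feature_importances_) before plotting is a cheap way to catch this kind of mismatch early.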

【Comments】: