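【Posted】:2021-07-23 01:11:57
【Question】:
I have a df with two columns, ACQUISITION_CHANNEL and HOURS_WORKED_CUMULATIVE, which I plot as a box plot with jittered points using the code below. I want to sort the categories on the x-axis by median, so that the category with the highest median comes first.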
| ACQUISITION_CHANNEL | HOURS_WORKED_CUMULATIVE |
|---|---|
| Referral | 34 |
| Job Platform | 42 |
| Referral | 34 |
| Offline | 42 |
| Referral | 34 |
| Digital | 42 |
...
import numpy as np
import matplotlib.pyplot as plt

group = 'ACQUISITION_CHANNEL'
column = 'HOURS_WORKED_CUMULATIVE'
grouped = df.groupby(group)
names, vals, xs = [], [], []
# groupby iterates over the categories in sorted (alphabetical) key order
for i, (name, subdf) in enumerate(grouped):
    names.append(name)
    vals.append(subdf[column].tolist())
    # jittered x positions centered on the box at position i + 1
    xs.append(np.random.normal(i + 1, 0.1, subdf.shape[0]))
plt.boxplot(vals, labels=names, showfliers=False)
ngroup = len(vals)
clevels = np.linspace(0., 1., ngroup)  # computed but not used for coloring below
for x, val, clevel in zip(xs, vals, clevels):
    plt.scatter(x, val, alpha=0.4, c='#1f77b4')
plt.title('Hours Worked by Acquisition Channel')
plt.xlabel('Acquisition Channel')
plt.ylabel('Total Hours Worked')
【Comments】:
- Do you have sample data?
- @TomMcLean Just added some sample data.
Tags: python pandas matplotlib jupyter-notebook
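
One possible way to get the ordering asked for is to compute the median per category first, sort it in descending order, and then build the box plot data in that order instead of in the default groupby order. The snippet below is only a sketch: the sample values in `df` are made up for illustration and are not the asker's data, and the names `medians`, `order`, and `rng` are introduced here. Tick labels are set with `set_xticklabels` rather than the `labels` argument of `boxplot`.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical sample data, invented for illustration only.
df = pd.DataFrame({
    "ACQUISITION_CHANNEL": ["Referral", "Job Platform", "Referral", "Offline",
                            "Digital", "Digital", "Offline", "Job Platform"],
    "HOURS_WORKED_CUMULATIVE": [34, 42, 30, 55, 12, 20, 61, 40],
})

group = "ACQUISITION_CHANNEL"
column = "HOURS_WORKED_CUMULATIVE"

# Median per category, sorted from highest to lowest.
medians = df.groupby(group)[column].median().sort_values(ascending=False)
order = medians.index.tolist()

# Collect values and jittered x positions in median order.
grouped = df.groupby(group)
rng = np.random.default_rng(0)  # seeded so the jitter is reproducible
vals, xs = [], []
for i, name in enumerate(order):
    values = grouped.get_group(name)[column].to_numpy()
    vals.append(values)
    xs.append(rng.normal(i + 1, 0.1, len(values)))

fig, ax = plt.subplots()
ax.boxplot(vals, showfliers=False)  # boxes are drawn at positions 1..N
ax.set_xticks(list(range(1, len(order) + 1)))
ax.set_xticklabels(order)
for x, val in zip(xs, vals):
    ax.scatter(x, val, alpha=0.4, c="#1f77b4")
ax.set_title("Hours Worked by Acquisition Channel")
ax.set_xlabel("Acquisition Channel")
ax.set_ylabel("Total Hours Worked")
```

With this made-up data the boxes appear left to right as Offline, Job Platform, Referral, Digital. Passing `ascending=True` to `sort_values` would reverse the order.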