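【Posted】: 2022-01-21 15:34:18
【Question】:
I would like to know whether I can shape a DataFrame into a multi-index, multi-header (multi-level column, i.e. pivoted) DataFrame without aggregating, because the aggregated values are already present in the columns of my DataFrame.
I have the following DataFrame: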
card_type         payment_status  airbnb                                    paid  revenue - sum  revenue - min  debit - sum
American Express  Checked Out     Premium Queen Ensuite                     No    591.49         0.0            2
American Express  Checked Out     Queen Room w. Shared Facilities           No    255.52         0.0            2
American Express  Checked Out     Single Room w. Shared Facilities          No    1602.02        0.0            5
American Express  Confirmed       Compact Double Room w. Shared Facilities  No    189.05         0.0            1
American Express  Confirmed       Premium Queen Ensuite                     No    350.0          0.0            1
American Express  Confirmed       Queen Room w. Shared Facilities           Yes   110.53         0.0            1
American Express  Confirmed       Single Room w. Shared Facilities          No    4258.48        0.0            3
Mastercard        Cancelled       Queen Room w. Shared Facilities           No    28.5           0.0            3
Mastercard        Cancelled       Single Room w. Shared Facilities          Yes   578.55         0.0            2
Mastercard        Checked Out     Compact Double Room w. Shared Facilities  No    4637.71        0.0            22
...
import pandas as pd

df = pd.DataFrame.from_dict({
'card_type': {0: 'American Express', 1: 'American Express', 2: 'American Express', 3: 'American Express', 4: 'American Express', 5: 'American Express', 6: 'American Express', 7: 'Mastercard', 8: 'Mastercard', 9: 'Mastercard'},
'payment_status': {0: 'Checked Out', 1: 'Checked Out', 2: 'Checked Out', 3: 'Confirmed', 4: 'Confirmed', 5: 'Confirmed', 6: 'Confirmed', 7: 'Cancelled', 8: 'Cancelled', 9: 'Checked Out'},
'airbnb': {0: 'Premium Queen Ensuite ', 1: 'Queen Room w. Shared Facilities ', 2: 'Single Room w. Shared Facilities ', 3: 'Compact Double Room w. Shared Facilities ', 4: 'Premium Queen Ensuite ', 5: 'Queen Room w. Shared Facilities ', 6: 'Single Room w. Shared Facilities ', 7: 'Queen Room w. Shared Facilities ', 8: 'Single Room w. Shared Facilities ', 9: 'Compact Double Room w. Shared Facilities '},
'paid': {0: 'No', 1: 'No', 2: 'No', 3: 'No', 4: 'No', 5: 'Yes', 6: 'No', 7: 'No', 8: 'Yes', 9: 'No'},
'revenue - sum': {0: 591.49, 1: 255.52, 2: 1602.02, 3: 189.05, 4: 350.0, 5: 110.53, 6: 4258.48, 7: 28.5, 8: 578.55, 9: 4637.71},
'revenue - min': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0, 6: 0.0, 7: 0.0, 8: 0.0, 9: 0.0},
'debit - sum': {0: 2, 1: 2, 2: 5, 3: 1, 4: 1, 5: 1, 6: 3, 7: 3, 8: 2, 9: 22}})
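I have used the approach below (based on "Pandas Pivot table without aggregating") to get part of the shape I am looking for. However, I would like to move the aggfunc labels to the bottom column level (possibly with https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.swaplevel.html). The approach also feels wrong, because my values have already been computed and there is no need to aggregate them again: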
df.pivot_table(index=["card_type", "payment_status"], columns=["airbnb", "paid"], values=["revenue - sum", "revenue - min", "debit - sum"], aggfunc={"revenue - sum": ["sum"], "revenue - min": ["max"], "debit - sum": ["mean"]}, fill_value="-")
Is there any way to solve this? Thanks!
【Discussion】:
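One possible way to get this shape without any aggfunc is set_index followed by unstack, which only moves existing values into columns. The sketch below is not from the question: it uses a small stand-in frame with a subset of the question's rows and columns, and it assumes each (card_type, payment_status, airbnb, paid) combination occurs at most once. The level name "measure" is an arbitrary label chosen for illustration.

```python
import pandas as pd

# Stand-in for the question's frame: a subset of its rows and columns.
df = pd.DataFrame({
    "card_type": ["American Express", "American Express", "Mastercard"],
    "payment_status": ["Checked Out", "Confirmed", "Cancelled"],
    "airbnb": ["Premium Queen Ensuite", "Premium Queen Ensuite",
               "Queen Room w. Shared Facilities"],
    "paid": ["No", "No", "No"],
    "revenue - sum": [591.49, 350.0, 28.5],
    "debit - sum": [2, 1, 3],
})

keys = ["card_type", "payment_status", "airbnb", "paid"]

# set_index + unstack reshapes without aggregating anything.
# unstack raises ValueError if a key combination appears more than once.
wide = df.set_index(keys).unstack(["airbnb", "paid"])

# Columns are now (value name, airbnb, paid); move the value name to the
# bottom level and sort so related columns sit next to each other.
wide = wide.reorder_levels([1, 2, 0], axis=1).sort_index(axis=1)
wide.columns.names = ["airbnb", "paid", "measure"]  # "measure" is a made-up label

print(wide)
```

Combinations that do not exist in the input show up as NaN. If the key columns can contain duplicates, this approach does not apply as written and some aggregation (or de-duplication) would still be needed.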