【发布时间】:2021-12-27 21:13:47
【问题描述】:
我有一个看起来像这样的熊猫数据框:
capacity_gw marginal_cost chained_capacity
Case Category
CES - No Storage Hydro 4.277016 0.000000 NaN
Solar 9.774715 0.000000 NaN
Wind 11.881870 0.000000 4.277016
Nuclear 5.242805 12.689066 14.051731
NGCC 2.101907 25.109150 25.933600
NGGT 4.638107 32.703513 31.176405
Overflow 35.000000 169.679554 33.278312
CES - Storage Hydro 4.277016 0.000000 37.916419
Solar 9.774715 0.000000 72.916419
Wind 11.881869 0.000000 4.277016
Nuclear 5.242805 12.689066 14.051731
NGCC 2.101907 25.109150 25.933600
NGGT 2.101907 32.703513 31.176405
Overflow 35.000000 169.679554 33.278312
Reference - No Storage Hydro 4.277016 0.000000 35.380219
Solar 14.289311 0.000000 70.380219
Wind 10.435570 0.000000 4.277016
Nuclear 1.143500 12.689066 18.566327
NGCC 4.533380 25.109150 29.001897
NGGT 17.224408 32.703513 30.145397
Overflow 35.000000 169.679554 34.678777
Reference - Storage Hydro 4.277016 0.000000 51.903185
Solar 14.894274 0.000000 86.903185
Wind 10.435570 0.000000 4.277016
Nuclear 1.143500 12.689066 19.171290
NGCC 4.533380 25.109150 29.606860
NGGT 14.524706 32.703513 30.750360
Overflow 35.000000 169.679554 35.283740
我使用以下方法创建了chained_capacity 变量:
stack['chained_capacity'] = stack.groupby('Case')['capacity_gw'].cumsum().shift(2)
但这不是我想要的结果。如您所见,它仍然以列中的第一个初始值开始求和。我希望总和从第三个值开始。所以预期的输出是:
capacity_gw marginal_cost chained_capacity
Case Category
CES - No Storage Hydro 4.277016 0.000000 NaN
Solar 9.774715 0.000000 NaN
Wind 11.881870 0.000000 11.881870
Nuclear 5.242805 12.689066 17.124674
NGCC 2.101907 25.109150 17.12 + 2.10
NGGT 4.638107 32.703513 ...
Overflow 35.000000 169.679554 ...
...
这里是df.to_dict() 能够完全重现数据:
{'capacity_gw': {('Reference - No Storage', 'Solar'): 14.289311043873823, ('Reference - No Storage', 'Wind'): 10.43556981658827, ('Reference - No Storage', 'Hydro'): 4.277016, ('Reference - No Storage', 'Nuclear'): 1.1435, ('Reference - No Storage', 'NGCC'): 4.533380090390558, ('Reference - No Storage', 'NGGT'): 17.22440836569597, ('Reference - No Storage', 'Overflow'): 35.0, ('Reference - Storage', 'Solar'): 14.894274398144354, ('Reference - Storage', 'Wind'): 10.435569838806854, ('Reference - Storage', 'Hydro'): 4.277016, ('Reference - Storage', 'Nuclear'): 1.1435, ('Reference - Storage', 'NGCC'): 4.533380082818851, ('Reference - Storage', 'NGGT'): 14.524706430121823, ('Reference - Storage', 'Overflow'): 35.0, ('CES - No Storage', 'Solar'): 9.774714739869358, ('CES - No Storage', 'Wind'): 11.881869635856951, ('CES - No Storage', 'Hydro'): 4.277016, ('CES - No Storage', 'Nuclear'): 5.242805, ('CES - No Storage', 'NGCC'): 2.1019069999999997, ('CES - No Storage', 'NGGT'): 4.638107074198996, ('CES - No Storage', 'Overflow'): 35.0, ('CES - Storage', 'Solar'): 9.774714538236491, ('CES - Storage', 'Wind'): 11.881869305881622, ('CES - Storage', 'Hydro'): 4.277016, ('CES - Storage', 'Nuclear'): 5.242805, ('CES - Storage', 'NGCC'): 2.1019069999999997, ('CES - Storage', 'NGGT'): 2.1019069999999997, ('CES - Storage', 'Overflow'): 35.0}, 'marginal_cost': {('Reference - No Storage', 'Solar'): 0.0, ('Reference - No Storage', 'Wind'): 0.0, ('Reference - No Storage', 'Hydro'): 0.0, ('Reference - No Storage', 'Nuclear'): 12.68906562274404, ('Reference - No Storage', 'NGCC'): 25.10914978408783, ('Reference - No Storage', 'NGGT'): 32.703513055654646, ('Reference - No Storage', 'Overflow'): 169.6795540944021, ('Reference - Storage', 'Solar'): 0.0, ('Reference - Storage', 'Wind'): 0.0, ('Reference - Storage', 'Hydro'): 0.0, ('Reference - Storage', 'Nuclear'): 12.68906562274404, ('Reference - Storage', 'NGCC'): 25.10914978408783, ('Reference - Storage', 'NGGT'): 32.703513055654646, ('Reference - Storage', 'Overflow'): 169.6795540944021, ('CES - No Storage', 'Solar'): 0.0, ('CES - No Storage', 'Wind'): 0.0, ('CES - No Storage', 'Hydro'): 0.0, ('CES - No Storage', 'Nuclear'): 12.68906562274404, ('CES - No Storage', 'NGCC'): 25.10914978408783, ('CES - No Storage', 'NGGT'): 32.703513055654646, ('CES - No Storage', 'Overflow'): 169.6795540944021, ('CES - Storage', 'Solar'): 0.0, ('CES - Storage', 'Wind'): 0.0, ('CES - Storage', 'Hydro'): 0.0, ('CES - Storage', 'Nuclear'): 12.68906562274404, ('CES - Storage', 'NGCC'): 25.10914978408783, ('CES - Storage', 'NGGT'): 32.703513055654646, ('CES - Storage', 'Overflow'): 169.6795540944021}}
【问题讨论】:
-
如果你在
cumsum之前shift喜欢.shift(2).cumsum()会发生什么? -
好像这样可以回答你的问题:stackoverflow.com/questions/54993050/…
-
@JoshuaVoskamp 这适用于
Case,但它不能解决仅从第 3 个条目开始的原始问题。
标签: python pandas cumulative-sum