【Posted】: 2021-11-07 23:54:43
【Question】:
Suppose I have a DataFrame like this:
eff_date,mdl_cd,ast_cd,prop_cd,value
2021-09-22,Comm,Agri,Car,-0.1234
2021-09-22,Comm,Agri,Fund,0.5123
2021-09-22,Comm,Agri,Mmt,-0.7612
2021-09-22,Comm,Engy,Car,0.1212
2021-09-22,Comm,Engy,Fund,-0.1234
2021-09-22,Comm,Engy,Mmt,0.5123
2021-09-22,Comm,Industry,Car,-0.7612
2021-09-22,Comm,Industry,Fund,0.1212
2021-09-22,Comm,Industry,Mmt,-0.1234
2021-09-22,Comm,Metal,Car,0.5123
2021-09-22,Comm,Metal,Fund,-0.7612
2021-09-22,Comm,Metal,Mmt,0.1212
2021-09-23,Equity,Agri,Car,0.6541
2021-09-23,Equity,Agri,Fund,0.5123
2021-09-23,Equity,Agri,Mmt,-0.1874
2021-09-23,Equity,Engy,Car,0.1212
2021-09-23,Equity,Engy,Fund,-0.6234
2021-09-23,Equity,Engy,Mmt,0.5123
2021-09-23,Equity,Industry,Car,-0.1612
2021-09-23,Equity,Industry,Fund,0.1212
2021-09-23,Equity,Industry,Mmt,-0.1934
2021-09-23,Equity,Metal,Car,0.5123
2021-09-23,Equity,Metal,Fund,0.5412
2021-09-23,Equity,Metal,Mmt,0.1212
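I want to add a new row to each group of groupby(by=['eff_date','mdl_cd','ast_cd']).
In the new row, eff_date, mdl_cd and ast_cd keep the same values as the group, prop_cd becomes Hlds, and value becomes the sum of value over that group. For example, for the first group the value would be (-0.1234 + 0.5123 + -0.7612), i.e. -0.3723.
So the output would look like this: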
eff_date,mdl_cd,ast_cd,prop_cd,value
2021-09-22,Comm,Agri,Car,-0.1234
2021-09-22,Comm,Agri,Fund,0.5123
2021-09-22,Comm,Agri,Mmt,-0.7612
2021-09-22,Comm,Agri,Hlds,-0.3723 +row added (sum of value in that group)
2021-09-22,Comm,Engy,Car,0.1212
2021-09-22,Comm,Engy,Fund,-0.1234
2021-09-22,Comm,Engy,Mmt,0.5123
2021-09-22,Comm,Engy,Hlds,0.5101 +row added (sum of value in that group)
2021-09-22,Comm,Industry,Car,-0.7612
2021-09-22,Comm,Industry,Fund,0.1212
2021-09-22,Comm,Industry,Mmt,-0.1234
2021-09-22,Comm,Industry,Hlds,-0.7634 +row added (sum of value in that group)
2021-09-22,Comm,Metal,Car,0.5123
2021-09-22,Comm,Metal,Fund,-0.7612
2021-09-22,Comm,Metal,Mmt,0.1212
2021-09-22,Comm,Metal,Hlds,-0.1277 +row added (sum of value in that group)
2021-09-23,Equity,Agri,Car,0.6541
2021-09-23,Equity,Agri,Fund,0.5123
2021-09-23,Equity,Agri,Mmt,-0.1874
2021-09-23,Equity,Agri,Hlds,0.979 +row added (sum of value in that group)
2021-09-23,Equity,Engy,Car,0.1212
2021-09-23,Equity,Engy,Fund,-0.6234
2021-09-23,Equity,Engy,Mmt,0.5123
2021-09-23,Equity,Engy,Hlds,0.0101 +row added (sum of value in that group)
2021-09-23,Equity,Industry,Car,-0.1612
2021-09-23,Equity,Industry,Fund,0.1212
2021-09-23,Equity,Industry,Mmt,-0.1934
2021-09-23,Equity,Industry,Hlds,-0.2334 +row added (sum of value in that group)
2021-09-23,Equity,Metal,Car,0.5123
2021-09-23,Equity,Metal,Fund,0.5412
2021-09-23,Equity,Metal,Mmt,0.1212
2021-09-23,Equity,Metal,Hlds,1.1747 +row added (sum of value in that group)
How can I perform this calculation using pandas?
【Discussion】:
- Can't you do the groupby and sum in one go, concatenate the result back onto the original df, and then sort?
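A minimal sketch of the approach suggested in the comment above (group and sum, concatenate, then sort), not a definitive answer. It uses only the first two groups of the question's sample data. The `_pos` helper column and the names `totals`, `combined` and `result` are introduced here for illustration. It assumes that sorting the groups by their key values is acceptable, which holds for the sample data because it is already in that order.

```python
import io
import pandas as pd

# First two groups of the question's sample data.
csv_text = """eff_date,mdl_cd,ast_cd,prop_cd,value
2021-09-22,Comm,Agri,Car,-0.1234
2021-09-22,Comm,Agri,Fund,0.5123
2021-09-22,Comm,Agri,Mmt,-0.7612
2021-09-22,Comm,Engy,Car,0.1212
2021-09-22,Comm,Engy,Fund,-0.1234
2021-09-22,Comm,Engy,Mmt,0.5123
"""
df = pd.read_csv(io.StringIO(csv_text))

keys = ["eff_date", "mdl_cd", "ast_cd"]

# One row per group: the sum of `value`, labelled with prop_cd = "Hlds".
# Rounding to 4 decimals hides floating-point noise in the sums.
totals = (
    df.groupby(keys, as_index=False)["value"]
      .sum()
      .assign(prop_cd="Hlds", value=lambda t: t["value"].round(4))
)

# Append the total rows after the original rows.
combined = pd.concat([df, totals], ignore_index=True)

# Sort by the group keys, using the row position as a tiebreaker
# (hypothetical helper column `_pos`). The total rows were appended last,
# so each one lands after the original rows of its group.
result = (
    combined.rename_axis("_pos")
            .reset_index()
            .sort_values(keys + ["_pos"])
            .drop(columns="_pos")
            .reset_index(drop=True)
)

print(result.to_csv(index=False))
```

With this input, the Hlds row for Agri carries -0.3723 and the one for Engy carries 0.5101, matching the expected output in the question. The explicit position tiebreaker avoids relying on the sort being stable when several columns are used as sort keys.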
Tags: python pandas pandas-groupby