【问题标题】:Adding level 2 index as a sum of other indexes with a condition将 2 级索引添加为具有条件的其他索引的总和
【发布时间】:2021-09-29 15:09:27
【问题描述】:

我有一个df

df = pd.DataFrame.from_dict({('group', ''): {0: 'A',
  1: 'A',
  2: 'A',
  3: 'A',
  4: 'A',
  5: 'A',
  6: 'A',
  7: 'A',
  8: 'A',
  9: 'B',
  10: 'B',
  11: 'B',
  12: 'B',
  13: 'B',
  14: 'B',
  15: 'B',
  16: 'B',
  17: 'B',
  18: 'all',
  19: 'all'},
 ('category', ''): {0: 'Amazon',
  1: 'Apple',
  2: 'Facebook',
  3: 'Google',
  4: 'Netflix',
  5: 'Tesla',
  6: 'Total',
  7: 'Uber',
  8: 'total',
  9: 'Amazon',
  10: 'Apple',
  11: 'Facebook',
  12: 'Google',
  13: 'Netflix',
  14: 'Tesla',
  15: 'Total',
  16: 'Uber',
  17: 'total',
  18: 'Total',
  19: 'total'},
 (pd.Timestamp('2020-06-29 00:00:00'), 'last_sales'): {0: 195.0,
  1: 61.0,
  2: 106.0,
  3: 61.0,
  4: 37.0,
  5: 13.0,
  6: 954.0,
  7: 4.0,
  8: 477.0,
  9: 50.0,
  10: 50.0,
  11: 75.0,
  12: 43.0,
  13: 17.0,
  14: 14.0,
  15: 504.0,
  16: 3.0,
  17: 252.0,
  18: 2916.0,
  19: 2916.0},
 (pd.Timestamp('2020-06-29 00:00:00'), 'sales'): {0: 1268.85,
  1: 18274.385000000002,
  2: 19722.65,
  3: 55547.255,
  4: 15323.800000000001,
  5: 1688.6749999999997,
  6: 227463.23,
  7: 1906.0,
  8: 113731.615,
  9: 3219.6499999999996,
  10: 15852.060000000001,
  11: 17743.7,
  12: 37795.15,
  13: 5918.5,
  14: 1708.75,
  15: 166349.64,
  16: 937.01,
  17: 83174.82,
  18: 787625.7400000001,
  19: 787625.7400000001},
 (pd.Timestamp('2020-06-29 00:00:00'), 'difference'): {0: 0.0,
  1: 0.0,
  2: 0.0,
  3: 0.0,
  4: 0.0,
  5: 0.0,
  6: 0.0,
  7: 0.0,
  8: 0.0,
  9: 0.0,
  10: 0.0,
  11: 0.0,
  12: 0.0,
  13: 0.0,
  14: 0.0,
  15: 0.0,
  16: 0.0,
  17: 0.0,
  18: 0.0,
  19: 0.0},
 (pd.Timestamp('2020-07-06 00:00:00'), 'last_sales'): {0: 26.0,
  1: 39.0,
  2: 79.0,
  3: 49.0,
  4: 10.0,
  5: 10.0,
  6: 436.0,
  7: 5.0,
  8: 218.0,
  9: 89.0,
  10: 34.0,
  11: 133.0,
  12: 66.0,
  13: 21.0,
  14: 20.0,
  15: 732.0,
  16: 3.0,
  17: 366.0,
  18: 2336.0,
  19: 2336.0},
 (pd.Timestamp('2020-07-06 00:00:00'), 'sales'): {0: 3978.15,
  1: 12138.96,
  2: 19084.175,
  3: 40033.46000000001,
  4: 4280.15,
  5: 1495.1,
  6: 165548.29,
  7: 1764.15,
  8: 82774.145,
  9: 8314.92,
  10: 12776.649999999996,
  11: 28048.075,
  12: 55104.21000000002,
  13: 6962.844999999999,
  14: 3053.2000000000003,
  15: 231049.11000000002,
  16: 1264.655,
  17: 115524.55500000001,
  18: 793194.8000000002,
  19: 793194.8000000002},
 (pd.Timestamp('2020-07-06 00:00:00'), 'difference'): {0: 0.0,
  1: 0.0,
  2: 0.0,
  3: 0.0,
  4: 0.0,
  5: 0.0,
  6: 0.0,
  7: 0.0,
  8: 0.0,
  9: 0.0,
  10: 0.0,
  11: 0.0,
  12: 0.0,
  13: 0.0,
  14: 0.0,
  15: 0.0,
  16: 0.0,
  17: 0.0,
  18: 0.0,
  19: 0.0},
 (pd.Timestamp('2021-06-28 00:00:00'), 'last_sales'): {0: 96.0,
  1: 56.0,
  2: 106.0,
  3: 44.0,
  4: 34.0,
  5: 13.0,
  6: 716.0,
  7: 9.0,
  8: 358.0,
  9: 101.0,
  10: 22.0,
  11: 120.0,
  12: 40.0,
  13: 13.0,
  14: 8.0,
  15: 610.0,
  16: 1.0,
  17: 305.0,
  18: 2652.0,
  19: 2652.0},
 (pd.Timestamp('2021-06-28 00:00:00'), 'sales'): {0: 5194.95,
  1: 19102.219999999994,
  2: 22796.420000000002,
  3: 30853.115,
  4: 11461.25,
  5: 992.6,
  6: 188143.41,
  7: 3671.15,
  8: 94071.705,
  9: 6022.299999999998,
  10: 7373.6,
  11: 33514.0,
  12: 35943.45,
  13: 4749.000000000001,
  14: 902.01,
  15: 177707.32,
  16: 349.3,
  17: 88853.66,
  18: 731701.46,
  19: 731701.46},
 (pd.Timestamp('2021-06-28 00:00:00'), 'difference'): {0: 0.0,
  1: 0.0,
  2: 0.0,
  3: 0.0,
  4: 0.0,
  5: 0.0,
  6: 0.0,
  7: 0.0,
  8: 0.0,
  9: 0.0,
  10: 0.0,
  11: 0.0,
  12: 0.0,
  13: 0.0,
  14: 0.0,
  15: 0.0,
  16: 0.0,
  17: 0.0,
  18: 0.0,
  19: 0.0},
 (pd.Timestamp('2021-07-07 00:00:00'), 'last_sales'): {0: 45.0,
  1: 47.0,
  2: 87.0,
  3: 45.0,
  4: 13.0,
  5: 8.0,
  6: 494.0,
  7: 2.0,
  8: 247.0,
  9: 81.0,
  10: 36.0,
  11: 143.0,
  12: 56.0,
  13: 9.0,
  14: 9.0,
  15: 670.0,
  16: 1.0,
  17: 335.0,
  18: 2328.0,
  19: 2328.0},
 (pd.Timestamp('2021-07-07 00:00:00'), 'sales'): {0: 7556.414999999998,
  1: 14985.05,
  2: 16790.899999999998,
  3: 36202.729999999996,
  4: 4024.97,
  5: 1034.45,
  6: 163960.32999999996,
  7: 1385.65,
  8: 81980.16499999998,
  9: 5600.544999999999,
  10: 11209.92,
  11: 32832.61,
  12: 42137.44500000001,
  13: 3885.1499999999996,
  14: 1191.5,
  15: 194912.34000000003,
  16: 599.0,
  17: 97456.17000000001,
  18: 717745.3400000001,
  19: 717745.3400000001},
 (pd.Timestamp('2021-07-07 00:00:00'), 'difference'): {0: 0.0,
  1: 0.0,
  2: 0.0,
  3: 0.0,
  4: 0.0,
  5: 0.0,
  6: 0.0,
  7: 0.0,
  8: 0.0,
  9: 0.0,
  10: 0.0,
  11: 0.0,
  12: 0.0,
  13: 0.0,
  14: 0.0,
  15: 0.0,
  16: 0.0,
  17: 0.0,
  18: 0.0,
  19: 0.0}}).set_index(['group','category'])

我正在尝试创建一个名为 combined 的级别 2 index,它将是除 categories 之外的所有 sales & last_sales 的总和,Facebooktotal / Total 除外。 所以df 看起来像这样:

我尝试使用.loc 进行操作,但没有成功:

s = df_out.stack(0)

s['combined'] = 0
s.loc[(slice(None),[x for x in s.loc[(slice(None),:) if x != 'Facebook']].sum()

【问题讨论】:

  • 您要排除level=1 中的Totaltotal 类别吗?
  • 是的,会更新我的问题表述。
  • this 能回答你的问题吗?

标签: python pandas indexing


【解决方案1】:

解决方案

  • all 放入level=0,同样将drop 其他不需要的电平值放入level=1
  • 计算level=0 上的sum 以聚合帧
  • 创建Multindex 以在聚合框架中添加附加级别combined
  • 追加和排序索引以保持顺序
s = df.drop('all').drop(['Facebook', 'total', 'Total'], level=1).sum(level=0)
s.index = pd.MultiIndex.from_product([s.index, ['combined']])
df_out = df.append(s).sort_index()

结果

                        2020-06-29 00:00:00                        2020-07-06 00:00:00                        2021-06-28 00:00:00                        2021-07-07 00:00:00                       
                        last_sales       sales difference          last_sales       sales difference          last_sales       sales difference          last_sales       sales difference
group category                                                                                                                                                                            
A     Amazon                 195.0    1268.850        0.0                26.0    3978.150        0.0                96.0    5194.950        0.0                45.0    7556.415        0.0
      Apple                   61.0   18274.385        0.0                39.0   12138.960        0.0                56.0   19102.220        0.0                47.0   14985.050        0.0
      Facebook               106.0   19722.650        0.0                79.0   19084.175        0.0               106.0   22796.420        0.0                87.0   16790.900        0.0
      Google                  61.0   55547.255        0.0                49.0   40033.460        0.0                44.0   30853.115        0.0                45.0   36202.730        0.0
      Netflix                 37.0   15323.800        0.0                10.0    4280.150        0.0                34.0   11461.250        0.0                13.0    4024.970        0.0
      Tesla                   13.0    1688.675        0.0                10.0    1495.100        0.0                13.0     992.600        0.0                 8.0    1034.450        0.0
      Total                  954.0  227463.230        0.0               436.0  165548.290        0.0               716.0  188143.410        0.0               494.0  163960.330        0.0
      Uber                     4.0    1906.000        0.0                 5.0    1764.150        0.0                 9.0    3671.150        0.0                 2.0    1385.650        0.0
      combined               371.0   94008.965        0.0               139.0   63689.970        0.0               252.0   71275.285        0.0               160.0   65189.265        0.0
      total                  477.0  113731.615        0.0               218.0   82774.145        0.0               358.0   94071.705        0.0               247.0   81980.165        0.0
B     Amazon                  50.0    3219.650        0.0                89.0    8314.920        0.0               101.0    6022.300        0.0                81.0    5600.545        0.0
      Apple                   50.0   15852.060        0.0                34.0   12776.650        0.0                22.0    7373.600        0.0                36.0   11209.920        0.0
      Facebook                75.0   17743.700        0.0               133.0   28048.075        0.0               120.0   33514.000        0.0               143.0   32832.610        0.0
      Google                  43.0   37795.150        0.0                66.0   55104.210        0.0                40.0   35943.450        0.0                56.0   42137.445        0.0
      Netflix                 17.0    5918.500        0.0                21.0    6962.845        0.0                13.0    4749.000        0.0                 9.0    3885.150        0.0
      Tesla                   14.0    1708.750        0.0                20.0    3053.200        0.0                 8.0     902.010        0.0                 9.0    1191.500        0.0
      Total                  504.0  166349.640        0.0               732.0  231049.110        0.0               610.0  177707.320        0.0               670.0  194912.340        0.0
      Uber                     3.0     937.010        0.0                 3.0    1264.655        0.0                 1.0     349.300        0.0                 1.0     599.000        0.0
      combined               177.0   65431.120        0.0               233.0   87476.480        0.0               185.0   55339.660        0.0               192.0   64623.560        0.0
      total                  252.0   83174.820        0.0               366.0  115524.555        0.0               305.0   88853.660        0.0               335.0   97456.170        0.0
all   Total                 2916.0  787625.740        0.0              2336.0  793194.800        0.0              2652.0  731701.460        0.0              2328.0  717745.340        0.0
      total                 2916.0  787625.740        0.0              2336.0  793194.800        0.0              2652.0  731701.460        0.0              2328.0  717745.340        0.0

【讨论】:

    【解决方案2】:

    您可以在级别 0 上进行分组,如果 name 不是“全部”,则 concat 是每个组的总和,同时添加索引。然后将此输出与原始数据和sort_index 结合level=0

    pd.concat([df,
               pd.concat({(name, 'combined'): d.drop(['Facebook', 'total', 'Total'], level=1).sum()
                          for name,d in df.groupby(level=0) if name != 'all'}, axis=1).T
              ]).sort_index(level=0)
    

    输出:

                   2020-06-29 00:00:00                        2020-07-06 00:00:00                        2021-06-28 00:00:00                        2021-07-07 00:00:00                       
                            last_sales       sales difference          last_sales       sales difference          last_sales       sales difference          last_sales       sales difference
    group category                                                                                                                                                                            
    A     Amazon                 195.0    1268.850        0.0                26.0    3978.150        0.0                96.0    5194.950        0.0                45.0    7556.415        0.0
          Apple                   61.0   18274.385        0.0                39.0   12138.960        0.0                56.0   19102.220        0.0                47.0   14985.050        0.0
          Facebook               106.0   19722.650        0.0                79.0   19084.175        0.0               106.0   22796.420        0.0                87.0   16790.900        0.0
          Google                  61.0   55547.255        0.0                49.0   40033.460        0.0                44.0   30853.115        0.0                45.0   36202.730        0.0
          Netflix                 37.0   15323.800        0.0                10.0    4280.150        0.0                34.0   11461.250        0.0                13.0    4024.970        0.0
          Tesla                   13.0    1688.675        0.0                10.0    1495.100        0.0                13.0     992.600        0.0                 8.0    1034.450        0.0
          Total                  954.0  227463.230        0.0               436.0  165548.290        0.0               716.0  188143.410        0.0               494.0  163960.330        0.0
          Uber                     4.0    1906.000        0.0                 5.0    1764.150        0.0                 9.0    3671.150        0.0                 2.0    1385.650        0.0
          combined               371.0   94008.965        0.0               139.0   63689.970        0.0               252.0   71275.285        0.0               160.0   65189.265        0.0
          total                  477.0  113731.615        0.0               218.0   82774.145        0.0               358.0   94071.705        0.0               247.0   81980.165        0.0
    B     Amazon                  50.0    3219.650        0.0                89.0    8314.920        0.0               101.0    6022.300        0.0                81.0    5600.545        0.0
          Apple                   50.0   15852.060        0.0                34.0   12776.650        0.0                22.0    7373.600        0.0                36.0   11209.920        0.0
          Facebook                75.0   17743.700        0.0               133.0   28048.075        0.0               120.0   33514.000        0.0               143.0   32832.610        0.0
          Google                  43.0   37795.150        0.0                66.0   55104.210        0.0                40.0   35943.450        0.0                56.0   42137.445        0.0
          Netflix                 17.0    5918.500        0.0                21.0    6962.845        0.0                13.0    4749.000        0.0                 9.0    3885.150        0.0
          Tesla                   14.0    1708.750        0.0                20.0    3053.200        0.0                 8.0     902.010        0.0                 9.0    1191.500        0.0
          Total                  504.0  166349.640        0.0               732.0  231049.110        0.0               610.0  177707.320        0.0               670.0  194912.340        0.0
          Uber                     3.0     937.010        0.0                 3.0    1264.655        0.0                 1.0     349.300        0.0                 1.0     599.000        0.0
          combined               177.0   65431.120        0.0               233.0   87476.480        0.0               185.0   55339.660        0.0               192.0   64623.560        0.0
          total                  252.0   83174.820        0.0               366.0  115524.555        0.0               305.0   88853.660        0.0               335.0   97456.170        0.0
    all   Total                 2916.0  787625.740        0.0              2336.0  793194.800        0.0              2652.0  731701.460        0.0              2328.0  717745.340        0.0
          total                 2916.0  787625.740        0.0              2336.0  793194.800        0.0              2652.0  731701.460        0.0              2328.0  717745.340        0.0
    

    【讨论】:

      猜你喜欢
      • 1970-01-01
      • 1970-01-01
      • 1970-01-01
      • 2021-09-27
      • 1970-01-01
      • 2016-11-22
      • 1970-01-01
      • 1970-01-01
      • 2023-03-05
      相关资源
      最近更新 更多