【Question title】: Partition pandas .diff() in multi-index level
【Posted】: 2015-07-08 19:02:13
【Question】:

My question is about calling .diff() within the partitions defined by one level of a multi-index.

In the example below, the output of the first df.diff() is:

               values
Greek English        
alpha a           NaN
      b             2
      c             2
      d             2
beta  e            11
      f             1
      g             1
      h             1

But I want it to be:

               values
Greek English        
alpha a           NaN
      b             2
      c             2
      d             2
beta  e            NaN
      f             1
      g             1
      h             1

Here is a solution that uses a loop, but I suspect the loop can be avoided:

import pandas as pd
import numpy as np

df = pd.DataFrame({'values' : [1.,3.,5.,7.,18.,19.,20.,21.],
   'Greek' : ['alpha', 'alpha', 'alpha', 'alpha','beta','beta','beta','beta'],
   'English' : ['a', 'b', 'c', 'd','e','f','g','h']})

df.set_index(['Greek','English'],inplace =True)
print(df)

# (1.) This is not the type of .diff() i want.
# I need it to respect the level='Greek' and restart   
print(df.diff())


# this is one way to achieve my desired result but i have to think
# there is a way that does not involve the need to loop.
idx = pd.IndexSlice
for greek_letter in df.index.get_level_values('Greek').unique():
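    # A single .loc call with the column label avoids chained assignment,
    # which would write to a temporary copy instead of df.
    df.loc[idx[greek_letter, :], 'values'] = df.loc[idx[greek_letter, :], 'values'].diff()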

print(df)

【Comments】:

    Tags: pandas multi-index


    【Solution 1】:

    Just group by level=0 (or by 'Greek', if you prefer the level name) and then call diff on the 'values' column:

    In [179]:
    
    df.groupby(level=0)['values'].diff()
    Out[179]:
    Greek  English
    alpha  a         NaN
           b           2
           c           2
           d           2
    beta   e         NaN
           f           1
           g           1
           h           1
    dtype: float64
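
For reference, here is a self-contained sketch of the same approach that can be run as a script. It rebuilds the DataFrame from the question and stores the per-group difference in a new column; the column name `values_diff` is only an illustrative choice and does not appear in the original post.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    'values': [1., 3., 5., 7., 18., 19., 20., 21.],
    'Greek': ['alpha'] * 4 + ['beta'] * 4,
    'English': list('abcdefgh'),
}).set_index(['Greek', 'English'])

# Difference within each 'Greek' group: the first row of every
# group becomes NaN because diff restarts at each group boundary.
# 'values_diff' is a hypothetical column name chosen for this sketch.
df['values_diff'] = df.groupby(level='Greek')['values'].diff()

print(df)
```

Grouping by the level name rather than by position (level=0) should give the same result here and keeps working if the index levels are reordered.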
    

    【Discussion】:
