【Question Title】: Pandas iterate max value of a variable length slice in a series
【Posted】: 2017-10-20 00:19:56
【Question Description】:

Suppose I have a Pandas DataFrame like this:

import pandas as pd
idx = ['2003-01-02', '2003-01-03', '2003-01-06', '2003-01-07',
       '2003-01-08', '2003-01-09', '2003-01-10', '2003-01-13',
       '2003-01-14', '2003-01-15', '2003-01-16', '2003-01-17',
       '2003-01-21', '2003-01-22', '2003-01-23', '2003-01-24',
       '2003-01-27']

a = pd.DataFrame([1,2,0,0,1,2,3,0,0,0,1,2,3,4,5,0,1],
                  columns = ['original'], index = pd.to_datetime(idx))

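I am trying to get the maximum value of each slice of this DataFrame that lies between two zeros, placed at the row where that maximum occurs, with zero everywhere else. For this example, I would get: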

a['result'] = [0,2,0,0,0,0,3,0,0,0,0,0,0,0,5,0,1]

That is:

            original  result
2003-01-02         1       0
2003-01-03         2       2
2003-01-06         0       0
2003-01-07         0       0
2003-01-08         1       0
2003-01-09         2       0
2003-01-10         3       3
2003-01-13         0       0
2003-01-14         0       0
2003-01-15         0       0
2003-01-16         1       0
2003-01-17         2       0
2003-01-21         3       0
2003-01-22         4       0
2003-01-23         5       5
2003-01-24         0       0
2003-01-27         1       1
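
One way to read the requirement is as a plain-Python loop. This is only a sketch of the expected behaviour, not a pandas solution. The helper name `mark_run_maxima` is hypothetical, and it assumes that a run of non-zero values touching the start or end of the data counts as a slice too, as the first and last rows of the table above suggest.

```python
def mark_run_maxima(values):
    """For each run of consecutive non-zero values, keep the run's maximum
    at the position where it first occurs and write 0 everywhere else."""
    result = [0] * len(values)
    best_pos = None  # position of the largest value seen in the current run
    for pos, value in enumerate(values):
        if value == 0:
            # A zero closes the current run, if there is one.
            if best_pos is not None:
                result[best_pos] = values[best_pos]
            best_pos = None
        elif best_pos is None or value > values[best_pos]:
            best_pos = pos
    # Close a run that reaches the end of the data.
    if best_pos is not None:
        result[best_pos] = values[best_pos]
    return result


original = [1, 2, 0, 0, 1, 2, 3, 0, 0, 0, 1, 2, 3, 4, 5, 0, 1]
expected = mark_run_maxima(original)
```

A loop like this is useful mainly as a reference to check a vectorized answer against.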

【Question Comments】:

    Tags: pandas python-3.5


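    【Solution 1】:
    • Find the zeros
    • Use cumsum to build group labels
    • mask the zeros into a group of their own, -1
    • Find the position of the maximum within each group with idxmax
    • Drop the entry for group -1; it is zero anyway
    • Take a.original at the maximum positions found, reindex to the full index, and fill with zeros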

    m = a.original.eq(0)
    g = a.original.groupby(m.cumsum().mask(m, -1))
    i = g.idxmax().drop(-1)
    a.assign(result=a.loc[i, 'original'].reindex(a.index, fill_value=0))
    
                original  result
    2003-01-02         1       0
    2003-01-03         2       2
    2003-01-06         0       0
    2003-01-07         0       0
    2003-01-08         1       0
    2003-01-09         2       0
    2003-01-10         3       3
    2003-01-13         0       0
    2003-01-14         0       0
    2003-01-15         0       0
    2003-01-16         1       0
    2003-01-17         2       0
    2003-01-21         3       0
    2003-01-22         4       0
    2003-01-23         5       5
    2003-01-24         0       0
    2003-01-27         1       1
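
To see what each step produces, the intermediate values can be kept as columns next to the data. The following is a self-contained sketch of the same approach; the names `key`, `result`, and `steps` are introduced here for illustration and are not part of the answer above.

```python
import pandas as pd

idx = pd.to_datetime([
    '2003-01-02', '2003-01-03', '2003-01-06', '2003-01-07',
    '2003-01-08', '2003-01-09', '2003-01-10', '2003-01-13',
    '2003-01-14', '2003-01-15', '2003-01-16', '2003-01-17',
    '2003-01-21', '2003-01-22', '2003-01-23', '2003-01-24',
    '2003-01-27'])
a = pd.DataFrame({'original': [1, 2, 0, 0, 1, 2, 3, 0, 0, 0, 1, 2, 3, 4, 5, 0, 1]},
                 index=idx)

m = a['original'].eq(0)        # True on the rows that hold a zero
key = m.cumsum().mask(m, -1)   # one label per non-zero run; every zero row gets -1
# Index label of the first maximum in each group, without the all-zero group -1.
i = a['original'].groupby(key).idxmax().drop(-1)
# Keep the values at those labels and fill every other row with 0.
result = a.loc[i, 'original'].reindex(a.index, fill_value=0)

# Side-by-side view of the intermediate steps.
steps = a.assign(is_zero=m, group=key, result=result)
print(steps)
```

Two properties of this approach are worth keeping in mind. The label -1 cannot collide with a real group, because the cumulative sum never goes below zero. If a run contains its maximum more than once, `idxmax` reports only the first occurrence. If the data contains no zeros at all, there is no -1 group and `drop(-1)` would raise a `KeyError`; passing `errors='ignore'` to `drop` is one way to guard against that case.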
    

    【Comments】:
