【Question title】: Using pandas to calculate over-the-month and over-the-year change
【Posted】: 2020-09-09 00:30:19
【Question】:

I'm not sure how to do this, but I want to start from this DataFrame:

Date    Value
Jan-15  300
Feb-15  302
Mar-15  303
Apr-15  305
May-15  307
Jun-15  307
Jul-15  305
Aug-15  306
Sep-15  308
Oct-15  310
Nov-15  309
Dec-15  312
Jan-16  315
Feb-16  317
Mar-16  315
Apr-16  315
May-16  312
Jun-16  314
Jul-16  312
Aug-16  313
Sep-16  316
Oct-16  316
Nov-16  316
Dec-16  312

and end up with this one by calculating the over-the-month (otm) and over-the-year (oty) change:

Date    Value  otm  oty
Jan-15  300    na   na
Feb-15  302    2    na
Mar-15  303    1    na
Apr-15  305    2    na
May-15  307    2    na
Jun-15  307    0    na
Jul-15  305    -2   na
Aug-15  306    1    na
Sep-15  308    2    na
Oct-15  310    2    na
Nov-15  309    -1   na
Dec-15  312    3    na
Jan-16  315    3    15
Feb-16  317    2    15
Mar-16  315    -2   12
Apr-16  315    0    10
May-16  312    -3   5
Jun-16  314    2    7
Jul-16  312    -2   7
Aug-16  313    1    7
Sep-16  316    3    8
Oct-16  316    0    6
Nov-16  316    0    7
Dec-16  312    -4   0

So otm is the current value minus the value one row above, and oty is the current value minus the value 12 rows above.

【Question comments】:

    Tags: python pandas


    【Solution 1】:

    I think you need diff, but it only gives the right answer if no months are missing from the index:

    df['otm'] = df.Value.diff()
    df['oty'] = df.Value.diff(12)
    print (df)
          Date  Value  otm   oty
    0   Jan-15    300  NaN   NaN
    1   Feb-15    302  2.0   NaN
    2   Mar-15    303  1.0   NaN
    3   Apr-15    305  2.0   NaN
    4   May-15    307  2.0   NaN
    5   Jun-15    307  0.0   NaN
    6   Jul-15    305 -2.0   NaN
    7   Aug-15    306  1.0   NaN
    8   Sep-15    308  2.0   NaN
    9   Oct-15    310  2.0   NaN
    10  Nov-15    309 -1.0   NaN
    11  Dec-15    312  3.0   NaN
    12  Jan-16    315  3.0  15.0
    13  Feb-16    317  2.0  15.0
    14  Mar-16    315 -2.0  12.0
    15  Apr-16    315  0.0  10.0
    16  May-16    312 -3.0   5.0
    17  Jun-16    314  2.0   7.0
    18  Jul-16    312 -2.0   7.0
    19  Aug-16    313  1.0   7.0
    20  Sep-16    316  3.0   8.0
    21  Oct-16    316  0.0   6.0
    22  Nov-16    316  0.0   7.0
    23  Dec-16    312 -4.0   0.0
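
For anyone who wants to reproduce the table above without the original file, here is a minimal self-contained sketch. It assumes the dates can be regenerated with `pd.date_range` (the question does not say how the frame was loaded) and copies the values from the question. The `%b` month abbreviations assume an English locale.

```python
import pandas as pd

# Values copied from the question, Jan-15 through Dec-16.
values = [300, 302, 303, 305, 307, 307, 305, 306, 308, 310, 309, 312,
          315, 317, 315, 315, 312, 314, 312, 313, 316, 316, 316, 312]

# Assumption: rebuild the 'Jan-15'-style labels from a month-start date range.
dates = pd.date_range("2015-01-01", periods=len(values), freq="MS").strftime("%b-%y")
df = pd.DataFrame({"Date": dates, "Value": values})

# diff(n) subtracts the value n rows earlier; the first n rows have no
# predecessor, so they come out as NaN.
df["otm"] = df["Value"].diff()     # current row minus the previous row
df["oty"] = df["Value"].diff(12)   # current row minus the row 12 positions back

print(df.tail(3))
```

Note that `diff` is purely positional: it knows nothing about the `Date` column, which is why the rows must be sorted and complete.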
    

    If some months are missing, it gets a bit more complicated, because the gaps have to be filled with NaN rows first:


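    import pandas as pd

    # Parse 'Jan-15'-style labels into monthly periods and use them as the index
    df['Date'] = pd.to_datetime(df['Date'], format='%b-%y').dt.to_period('M')
    df = df.set_index('Date')
    # Add a NaN row for every month missing between the first and last date
    df = df.reindex(pd.period_range(df.index.min(), df.index.max(), freq='M'))
    # Turn the index back into 'Jan-15'-style strings and restore it as a column
    df.index = df.index.strftime('%b-%y')
    df = df.rename_axis('date').reset_index()
    
    df['otm'] = df.Value.diff()
    df['oty'] = df.Value.diff(12)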
    
    print (df)
          date  Value  otm   oty
    0   Jan-15  300.0  NaN   NaN
    1   Feb-15  302.0  2.0   NaN
    2   Mar-15    NaN  NaN   NaN
    3   Apr-15    NaN  NaN   NaN
    4   May-15  307.0  NaN   NaN
    5   Jun-15  307.0  0.0   NaN
    6   Jul-15  305.0 -2.0   NaN
    7   Aug-15  306.0  1.0   NaN
    8   Sep-15  308.0  2.0   NaN
    9   Oct-15  310.0  2.0   NaN
    10  Nov-15  309.0 -1.0   NaN
    11  Dec-15  312.0  3.0   NaN
    12  Jan-16  315.0  3.0  15.0
    13  Feb-16  317.0  2.0  15.0
    14  Mar-16  315.0 -2.0   NaN
    15  Apr-16  315.0  0.0   NaN
    16  May-16  312.0 -3.0   5.0
    17  Jun-16  314.0  2.0   7.0
    18  Jul-16  312.0 -2.0   7.0
    19  Aug-16  313.0  1.0   7.0
    20  Sep-16  316.0  3.0   8.0
    21  Oct-16  316.0  0.0   6.0
    22  Nov-16  316.0  0.0   7.0
    23  Dec-16  312.0 -4.0   0.0
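
To see why the reindex step matters, the following sketch compares a plain positional `diff` with the reindexed version on data that has a gap. It is an illustration only: the frame is rebuilt from the question's values, and Mar-15 and Apr-15 are dropped by hand to mimic the output above. The names `naive_otm`, `fixed_otm` and `fixed_oty` are made up for this example.

```python
import pandas as pd

values = [300, 302, 303, 305, 307, 307, 305, 306, 308, 310, 309, 312,
          315, 317, 315, 315, 312, 314, 312, 313, 316, 316, 316, 312]
dates = pd.date_range("2015-01-01", periods=len(values), freq="MS").strftime("%b-%y")
full = pd.DataFrame({"Date": dates, "Value": values})

# Simulate the gap: remove Mar-15 and Apr-15.
df = full[~full["Date"].isin(["Mar-15", "Apr-15"])].reset_index(drop=True)

# Positional diff silently compares May-15 with Feb-15 (307 - 302).
naive_otm = df["Value"].diff()

# Calendar-aware version: index by monthly period, then restore the
# missing months as NaN rows before differencing.
periods = pd.to_datetime(df["Date"], format="%b-%y").dt.to_period("M")
s = df.set_index(periods)["Value"]
s = s.reindex(pd.period_range(s.index.min(), s.index.max(), freq="M"))
fixed_otm = s.diff()
fixed_oty = s.diff(12)

print(naive_otm.iloc[2])                            # misleading May-15 change
print(fixed_otm[pd.Period("2015-05", freq="M")])    # NaN, no Apr-15 to compare with
```

With the gap restored, any month whose comparison month is missing gets NaN instead of a number computed against the wrong row.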
    

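    【Comments】:

    • This assumes the rows are always in order and every month is present.
    • Right, otherwise a reindex is needed.
    • Wow, that's so simple. pandas is great! Thanks for the tip about reindexing. I will run into cases where the data is not in order.
    • @flyingmeatball - Thanks, I added a reindex solution.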
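    【Solution 2】:

    A more robust solution is to shift by month frequency, so values are matched by date rather than by row position: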

    #Create datetime column
    df['DateTime'] = pd.to_datetime(df['Date'], format='%b-%y')
    
    #Set it as index
    df.set_index('DateTime', inplace=True)
    
    #Then shift by month frequency:
    df['otm'] = df['Value'] - df['Value'].shift(1, freq='MS')
    df['oty'] = df['Value'] - df['Value'].shift(12, freq='MS')
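
As a rough check of the frequency-based shift, the sketch below runs it on a series with two months removed. It assumes the dates fall on month starts (which is what parsing 'Jan-15' with `format='%b-%y'` produces); the dropped months and the trailing `reindex` are additions for illustration, not part of the answer.

```python
import pandas as pd

values = [300, 302, 303, 305, 307, 307, 305, 306, 308, 310, 309, 312,
          315, 317, 315, 315, 312, 314, 312, 313, 316, 316, 316, 312]
idx = pd.date_range("2015-01-01", periods=len(values), freq="MS")
s = pd.Series(values, index=idx, name="Value")

# Simulate missing months (Mar-15 and Apr-15).
s = s.drop(pd.to_datetime(["2015-03-01", "2015-04-01"]))

# shift(n, freq='MS') moves the index labels forward by n month starts and
# leaves the values untouched; the subtraction then aligns on dates.
# reindex(s.index) trims the extra future labels created by the shift.
otm = (s - s.shift(1, freq="MS")).reindex(s.index)
oty = (s - s.shift(12, freq="MS")).reindex(s.index)

print(otm.head(4))
```

Because the subtraction aligns on the date index, no NaN rows have to be inserted first: May-15 simply has no April value to align with and comes out as NaN.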
    

    【Comments】:

      【Solution 3】:
      df['otm'] = df['Value'] - df['Value'].shift(1)
      df['oty'] = df['Value'] - df['Value'].shift(12)
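
A positional `shift` followed by a subtraction appears to be equivalent to `diff`, so it carries the same requirement that rows be sorted with no months missing. A small sketch on the first four values from the question (the names `by_shift` and `by_diff` are made up here):

```python
import pandas as pd

s = pd.Series([300, 302, 303, 305], name="Value")

by_shift = s - s.shift(1)   # subtract the previous row explicitly
by_diff = s.diff()          # built-in first difference

# Both give NaN for the first row and the same differences afterwards.
print(by_shift.equals(by_diff))
```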
      

      【Comments】:
