【Question title】: Pandas: upsampling a DataFrame and taking a cumulative sum over a specific window
【Posted】: 2021-12-22 16:49:10
【Question】:

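I have a DataFrame with one value every 3 hours, which I upsample to 1 hour. The new bins are left empty (NaN). I want to fill these NaNs with values whose sum equals the value of the original (non-upsampled) bin, and "de-sum" the value of the original bin itself.

For example: I have 3 bins. The 3rd bin has a value of 3, and bins 1 and 2 hold NaN. I would like to fill bins 1, 2 and 3 with 1 each. In the end, if I take a cumulative sum over every 3 bins, the result equals the value my bin had before upsampling.

I wrote an example to illustrate what I mean (sorry, I find this hard to explain clearly). Is there a better way to do this?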

import numpy as np
import pandas as pd

# Create df with a datetime index every 3 hours
# (periods must be an integer, hence the floor division)
rng = pd.date_range('2000-01-01', periods=365 * (24 // 3), freq='3h')
df = pd.DataFrame({'Val': np.random.randn(len(rng))}, index=rng)

# Upsample to 1 hour but keep the new bins empty (NaN)
df = df.resample('1h').asfreq()

# Create a copy of df to verify that the sum went well
df_summed_every_3_bins = df.copy()

# Create a counter to the next bin holding a value
to_full_bin = 2

# Work on a writable copy of the values; writing through df.Val.values
# is not guaranteed to modify df
vals = df['Val'].to_numpy().copy()

# We de-sum the first value
vals[0] = vals[0] / 3
for i in range(1, len(vals)):

    # Take the value from the next bin holding a value, divide it by 3 and store it
    vals[i] = vals[i + to_full_bin] / 3

    # We move forward in df, so the bin holding a value gets closer: reduce the counter by 1
    to_full_bin -= 1

    # When the counter reaches -1, it needs to be reinitialized
    if to_full_bin == -1:
        to_full_bin = 2

# Write the de-summed values back into the DataFrame
df['Val'] = vals
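
The copy df_summed_every_3_bins is created for verification but never compared against the result. One possible check is sketched below on a small hand-made series. The helper name window_sums_match and the sample values are assumptions for illustration, not part of the original code.

```python
import numpy as np
import pandas as pd

def window_sums_match(sparse, filled, window=3):
    """Return True if summing `filled` over `window` consecutive bins
    reproduces every known value of `sparse` that has a full window behind it."""
    sums = filled.rolling(window).sum()
    # Compare only where the sparse series holds a value and the window is complete
    mask = sparse.notna() & sums.notna()
    return bool(np.allclose(sums[mask], sparse[mask]))

idx = pd.date_range("2000-01-01", periods=7, freq="1h")
# Hourly series as left by asfreq(): values only every 3rd bin
sparse = pd.Series([3.0, np.nan, np.nan, 6.0, np.nan, np.nan, 9.0], index=idx)
# The de-summed series the loop is meant to produce for those values
filled = pd.Series([1.0, 2.0, 2.0, 2.0, 3.0, 3.0, 3.0], index=idx)

print(window_sums_match(sparse, filled))
```

The very first bin is skipped by the check because the two hourly bins before it are not in the index, so its window of 3 is incomplete.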

【Comments】:

    Tags: python pandas dataframe sum sampling


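    【Solution 1】:

    Resample the DataFrame, then backfill and divide by 3. Here df is the original 3-hourly frame, before the asfreq() step from the question:

    df.resample('1h').bfill().div(3)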
    

                              Val
    2000-01-01 00:00:00 -0.747733
    2000-01-01 01:00:00 -0.057699
    2000-01-01 02:00:00 -0.057699
    2000-01-01 03:00:00 -0.057699
    2000-01-01 04:00:00 -0.409512
    2000-01-01 05:00:00 -0.409512
    2000-01-01 06:00:00 -0.409512
    2000-01-01 07:00:00 -0.108856
    2000-01-01 08:00:00 -0.108856
    2000-01-01 09:00:00 -0.108856
    ...
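
As a rough, self-contained sketch of the same idea, the snippet below uses round illustrative values (an assumption, not the question's random data) so the result can be checked by eye. It also covers the case where the frame has already been upsampled with asfreq(), as in the question, where calling bfill() directly on the frame should give the same values.

```python
import numpy as np
import pandas as pd

# Illustrative 3-hourly data with round values (not the question's random data)
rng = pd.date_range("2000-01-01", periods=4, freq="3h")
coarse = pd.DataFrame({"Val": [3.0, 6.0, 9.0, 12.0]}, index=rng)

# Upsample, back-fill each new bin from the next known value, then split by 3
hourly = coarse.resample("1h").bfill().div(3)

# Variant for a frame already upsampled with asfreq(), where the new bins are NaN
sparse = coarse.resample("1h").asfreq()
hourly_from_sparse = sparse.bfill().div(3)

# Skip the first row (its two preceding hours are not in the index) and sum
# consecutive blocks of 3 hourly bins; each block ends on an original timestamp
block_sums = hourly["Val"].to_numpy()[1:].reshape(-1, 3).sum(axis=1)

print(block_sums.tolist())                      # [6.0, 9.0, 12.0]
print(np.allclose(hourly, hourly_from_sparse))  # True
```

Note that the first timestamp ends up with its own value divided by 3 and nothing before it, which matches the first row of the output above.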
    
