【Posted】: 2021-12-22 16:49:10
【Question】:
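I have a DataFrame with one value every 3 hours, which I upsampled to 1 hour. The new bins are left empty (NaN). I want to fill those NaNs so that the filled values sum to the value of the original (pre-upsampling) bin, and to "de-sum" the value of the original bin itself.
For example: I have 3 bins. Bin 3 has the value 3, and bins 1 and 2 are NaN. I want to fill bins 1, 2 and 3 with 1 each. In the end, if I take the sum over every 3 bins, the result should equal the value of my bins before upsampling.
I wrote an example to illustrate what I mean (sorry, I find it hard to explain clearly). Is there a better way to do this?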
import numpy as np
import pandas as pd

# Create df with a datetime index every 3 hours (periods must be an integer)
rng = pd.date_range('2000-01-01', periods=365 * (24 // 3), freq='3h')
df = pd.DataFrame({'Val': np.random.randn(len(rng))}, index=rng)
# Upsample to 1 hour but keep the new bins empty (NaN)
df = df.resample('1h').asfreq()
# Create a copy of df to verify that the sum went well
df_summed_every_3_bins = df.copy()
# Work on a writable NumPy copy of the column and write it back after the loop
vals = df['Val'].to_numpy().copy()
# Create a counter to the next bin holding a value
to_full_bin = 2
# We de-sum the first value
vals[0] = vals[0] / 3
for i in range(1, len(vals)):
    # Take the value from the next bin holding a value, divide it by 3 and insert it
    vals[i] = vals[i + to_full_bin] / 3
    # We move forward in df, so the bin with a value is approaching: reduce the counter by 1
    to_full_bin = to_full_bin - 1
    # When the counter reaches -1, we need to reinitialize it
    if to_full_bin == -1:
        to_full_bin = 2
df['Val'] = vals
【Discussion】:
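One possible vectorized alternative to the loop, offered as a sketch rather than a definitive answer: back-fill each NaN with the next known value and divide by the number of fine bins per coarse bin. The small input values, and the names `factor`, `up`, `desummed` and `resummed`, are made up for illustration. It assumes the upsampling ratio is exactly 3 and that each 3-hour value is the sum of the three 1-hour bins ending at its timestamp.

```python
import numpy as np
import pandas as pd

# Hypothetical small input: one value every 3 hours
rng = pd.date_range("2000-01-01", periods=4, freq="3h")
df = pd.DataFrame({"Val": [3.0, 6.0, 9.0, 12.0]}, index=rng)

# Upsample to 1 hour; the new bins are NaN
up = df.resample("1h").asfreq()

# Assumed ratio: number of 1-hour bins per original 3-hour bin
factor = 3

# Back-fill every NaN with the next known value, then split it evenly
desummed = up["Val"].bfill() / factor
# desummed values: 1, 2, 2, 2, 3, 3, 3, 4, 4, 4

# Check: summing right-closed, right-labelled 3-hour windows gives back
# the original values. The very first bin is skipped because two of its
# three hourly bins lie before the start of the index.
resummed = desummed.resample("3h", closed="right", label="right").sum()
ok = np.allclose(resummed.to_numpy()[1:], df["Val"].to_numpy()[1:])
print(ok)
```

This should give the same result as the loop in the question, including dividing the first value by 3, without indexing element by element. If the ratio is not fixed, `factor` would have to be derived from the two frequencies instead of being hard-coded.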
Tags: python pandas dataframe sum sampling