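【Posted】: 2018-04-24 04:43:57
【Question】:
I have a DataFrame that aggregates data over several days, and I want to fill in the missing days.
I am following another post, "Add missing dates to pandas dataframe", but unfortunately it overwrites my results (maybe the behavior has changed slightly?). The code is below: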
import random
import datetime as dt
import numpy as np
import pandas as pd
def generate_row(year, month, day):
    while True:
        date = dt.datetime(year=year, month=month, day=day)
        data = np.random.random(size=4)
        yield [date] + list(data)
# days I have data for
dates = [(2000, 1, 1), (2000, 1, 2), (2000, 2, 4)]
generators = [generate_row(*date) for date in dates]
# get 5 data points for each
data = [next(generator) for generator in generators for _ in range(5)]
df = pd.DataFrame(data, columns=['date'] + ['f'+str(i) for i in range(1,5)])
# df
groupby_day = df.groupby(pd.PeriodIndex(data=df.date, freq='D'))
results = groupby_day.sum()
idx = pd.date_range(min(df.date), max(df.date))
results.reindex(idx, fill_value=0)
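A likely cause of the overwrite is that `results` is indexed by daily periods (from `pd.PeriodIndex`), while `pd.date_range` produces timestamps, so `reindex` may find no matching labels and fall back to `fill_value` for every row. The sketch below is one way to keep both sides as periods. The constant-valued frame and the names `value_cols`, `key`, and `filled` are made up for illustration and are not part of the original post.

```python
import pandas as pd

# Hypothetical stand-in for the question's data: 5 rows for each of 3 days,
# with every value set to 1.0 so the daily sums are easy to check.
days = pd.to_datetime(["2000-01-01", "2000-01-02", "2000-02-04"])
df = pd.DataFrame({"date": days.repeat(5)})
value_cols = ["f1", "f2", "f3", "f4"]
for col in value_cols:
    df[col] = 1.0

# Group by daily period and sum only the numeric columns.
key = df["date"].dt.to_period("D")
results = df.groupby(key)[value_cols].sum()

# Build the target index from periods as well, so its labels are the same
# type as the labels in results.index.
idx = pd.period_range(key.min(), key.max(), freq="D")
filled = results.reindex(idx, fill_value=0)

print(filled.head())
```

Note that `reindex` returns a new frame rather than modifying `results` in place, so the result has to be assigned to a name.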
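【Comments】:
-
Maybe you are looking for resample?
-
That looks promising, but I am having trouble applying it from the documentation.
-
I think I got it...
df.set_index(df.date, inplace=True) followed by df = df.resample('D').sum() did the trick.
-
Exactly. If it works, write it up as an answer and I will upvote it.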
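
For reference, the resample approach mentioned in the comments could look roughly like the sketch below. It uses a small made-up frame (the constant columns `f1` and `f2` are assumptions for illustration, not the poster's random data) and the non-in-place form of `set_index`.

```python
import pandas as pd

# Hypothetical data: 5 rows for each of 3 days, with constant values.
days = pd.to_datetime(["2000-01-01", "2000-01-02", "2000-02-04"])
df = pd.DataFrame({"date": days.repeat(5), "f1": 1.0, "f2": 2.0})

# Index by date, then resample to daily frequency and sum each day.
# Days with no rows appear in the result with a sum of 0.
daily = df.set_index("date").resample("D").sum()

print(daily.head())
```

Because `resample('D')` spans every calendar day between the first and last timestamp, no separate `reindex` step is needed here.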