【Question title】: How to add new 5-minute intervals
【Posted】: 2020-08-22 07:22:28
【Question】:

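I want every date to contain the same set of time intervals: any interval that appears on only some dates should be added to all of the others. Here is a sample of my data. No other time intervals exist in this sample.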

Data:

Row   Date     Hour Minute  Open    Close

0   2006-12-11  10  0       736.0   778.0
1   2006-12-11  10  5       775.0   775.0
2   2006-12-11  10  10      778.8   780.0
3   2006-12-11  10  30      780.0   780.0
4   2006-12-11  10  45      780.0   780.0
5   2006-12-11  10  50      781.0   799.0
6   2006-12-12   9  0       736.0   778.0
7   2006-12-12   9  5       775.0   775.0
8   2006-12-12   9  10      778.8   780.0
9   2006-12-12  10  0       780.0   780.0
10  2006-12-12  10  5       780.0   780.0
11  2006-12-12  10  10      781.0   799.0
12  2006-12-12  10  15      780.0   780.0
13  2006-12-12  10  45      780.0   780.0
14  2006-12-12  10  50      781.0   799.0
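
For anyone who wants to reproduce the problem, the sample above can be rebuilt as a DataFrame along these lines. This is only a sketch: the name `df` and the choice to keep `Date` as plain strings are assumptions, since the question does not say how the data was loaded.

```python
import pandas as pd

# Rebuild the 15 sample rows from the question (Date kept as strings here,
# which is an assumption; the real data may use datetime values).
df = pd.DataFrame({
    "Date": ["2006-12-11"] * 6 + ["2006-12-12"] * 9,
    "Hour": [10, 10, 10, 10, 10, 10, 9, 9, 9, 10, 10, 10, 10, 10, 10],
    "Minute": [0, 5, 10, 30, 45, 50, 0, 5, 10, 0, 5, 10, 15, 45, 50],
    "Open": [736.0, 775.0, 778.8, 780.0, 780.0, 781.0,
             736.0, 775.0, 778.8, 780.0, 780.0, 781.0, 780.0, 780.0, 781.0],
    "Close": [778.0, 775.0, 780.0, 780.0, 780.0, 799.0,
              778.0, 775.0, 780.0, 780.0, 780.0, 799.0, 780.0, 780.0, 799.0],
})

print(df)
```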


Expected Output:
Row   Date     Hour Minute  Open    Close

0   2006-12-11   9  0       null    null
1   2006-12-11   9  5       null    null
2   2006-12-11   9  10      null    null
3   2006-12-11  10  0       736.0   778.0
4   2006-12-11  10  5       775.0   775.0
5   2006-12-11  10  10      778.8   780.0
6   2006-12-11  10  15      null    null
7   2006-12-11  10  30      780.0   780.0
8   2006-12-11  10  45      780.0   780.0
9   2006-12-11  10  50      781.0   799.0
10  2006-12-12   9  0       736.0   778.0
11  2006-12-12   9  5       775.0   775.0
12  2006-12-12   9  10      778.8   780.0
13  2006-12-12  10  0       780.0   780.0
14  2006-12-12  10  5       780.0   780.0
15  2006-12-12  10  10      781.0   799.0
16  2006-12-12  10  15      780.0   780.0
17  2006-12-12  10  30      null    null
18  2006-12-12  10  45      780.0   780.0
19  2006-12-12  10  50      781.0   799.0

【Comments】:

    Tags: pandas date pandas-groupby sklearn-pandas


    【Solution 1】:

    You can use DataFrame.unstack followed by DataFrame.stack to add the missing combinations:

    df1 = (df.set_index(['Date','Hour','Minute'])
             .unstack([1,2])
             .stack([1,2],dropna=False)
             .reset_index())
    

    Or use DataFrame.reindex with MultiIndex.from_product:

    df1 = df.set_index(['Date','Hour','Minute'])
    mux = pd.MultiIndex.from_product(df1.index.levels)
    df1 = df1.reindex(mux).reset_index()
    

    print(df1)
              Date  Hour  Minute   Open  Close
    0   2006-12-11     9       0    NaN    NaN
    1   2006-12-11     9       5    NaN    NaN
    2   2006-12-11     9      10    NaN    NaN
    3   2006-12-11     9      15    NaN    NaN
    4   2006-12-11     9      30    NaN    NaN
    5   2006-12-11     9      45    NaN    NaN
    6   2006-12-11     9      50    NaN    NaN
    7   2006-12-11    10       0  736.0  778.0
    8   2006-12-11    10       5  775.0  775.0
    9   2006-12-11    10      10  778.8  780.0
    10  2006-12-11    10      15    NaN    NaN
    11  2006-12-11    10      30  780.0  780.0
    12  2006-12-11    10      45  780.0  780.0
    13  2006-12-11    10      50  781.0  799.0
    14  2006-12-12     9       0  736.0  778.0
    15  2006-12-12     9       5  775.0  775.0
    16  2006-12-12     9      10  778.8  780.0
    17  2006-12-12     9      15    NaN    NaN
    18  2006-12-12     9      30    NaN    NaN
    19  2006-12-12     9      45    NaN    NaN
    20  2006-12-12     9      50    NaN    NaN
    21  2006-12-12    10       0  780.0  780.0
    22  2006-12-12    10       5  780.0  780.0
    23  2006-12-12    10      10  781.0  799.0
    24  2006-12-12    10      15  780.0  780.0
    25  2006-12-12    10      30    NaN    NaN
    26  2006-12-12    10      45  780.0  780.0
    27  2006-12-12    10      50  781.0  799.0
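
Note that the output above contains 28 rows (2 dates × 2 hours × 7 minute values), including combinations such as 9:15 that never occur on any date and that are absent from the question's expected output. If only the (Hour, Minute) pairs actually observed on at least one date should be filled in, one possible variation is to build the target index from those observed pairs instead of the full product. The following is a sketch under that assumption; the names `slots`, `target` and `df2` are made up for illustration, and the sample data is rebuilt so the snippet runs on its own.

```python
import pandas as pd

# Sample data from the question (Date kept as strings, an assumption).
df = pd.DataFrame({
    "Date": ["2006-12-11"] * 6 + ["2006-12-12"] * 9,
    "Hour": [10, 10, 10, 10, 10, 10, 9, 9, 9, 10, 10, 10, 10, 10, 10],
    "Minute": [0, 5, 10, 30, 45, 50, 0, 5, 10, 0, 5, 10, 15, 45, 50],
    "Open": [736.0, 775.0, 778.8, 780.0, 780.0, 781.0,
             736.0, 775.0, 778.8, 780.0, 780.0, 781.0, 780.0, 780.0, 781.0],
    "Close": [778.0, 775.0, 780.0, 780.0, 780.0, 799.0,
              778.0, 775.0, 780.0, 780.0, 780.0, 799.0, 780.0, 780.0, 799.0],
})

keys = ["Date", "Hour", "Minute"]

# (Hour, Minute) pairs that occur on at least one date, in time order.
slots = (df[["Hour", "Minute"]]
         .drop_duplicates()
         .sort_values(["Hour", "Minute"]))

# Pair every date with every observed slot (not every Hour x Minute combination).
target = pd.MultiIndex.from_tuples(
    [(d, h, m)
     for d in df["Date"].unique()
     for h, m in slots.itertuples(index=False)],
    names=keys,
)

# Reindexing inserts NaN rows for slots a date is missing.
df2 = df.set_index(keys).reindex(target).reset_index()

print(df2)
```

On the sample data this yields 10 slots per date, so 20 rows in total, which is the row count of the expected output in the question. The reindex step requires the (Date, Hour, Minute) combinations in `df` to be unique.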
    

    【Comments】:
