【Question title】: Fill down a column date value until another date value is reached, then continue filling with the newly reached value
【Posted】: 2018-06-10 10:28:35
【Question】:

I have the following DataFrame:

         Date                 Team 1                Team 2  Score1  Score2
0    1-Oct-17                      1                   NaN       2     NaN
1       21:20          Chicago Cubs        Cincinnati Reds       1     3.0
2       21:15    Kansas City Royals   Arizona Diamondbacks       2    14.0
3       21:15    St.Louis Cardinals      Milwaukee Brewers       1     6.0
4   30-Sep-17                      1                   NaN       2     NaN
5       22:15     St.Louis Cardinals     Milwaukee Brewers       7     6.0
6       22:05           Chicago Cubs       Cincinnati Reds       9     0.0
7       22:05  San Francisco Giants       San Diego Padres       2     3.0
8       19:05         Boston Red Sox        Houston Astros       6     3.0
9   29-Sep-17                      1                   NaN       2     NaN
10      20:20           Chicago Cubs       Cincinnati Reds       5     4.0
11      19:05       New York Yankees     Toronto Blue Jays       4     0.0
12       2:15    Kansas City Royals         Detroit Tigers       1     4.0
13       2:10      Chicago White Sox    Los Angeles Angels       5     4.0

I need to fill the date values down the column, replacing the time values, to get this result:

         Date                 Team 1                Team 2  Score1  Score2
0    1-Oct-17                      1                   NaN       2     NaN
1    1-Oct-17          Chicago Cubs        Cincinnati Reds       1     3.0
2    1-Oct-17    Kansas City Royals   Arizona Diamondbacks       2    14.0
3    1-Oct-17    St.Louis Cardinals      Milwaukee Brewers       1     6.0
4   30-Sep-17                      1                   NaN       2     NaN
5   30-Sep-17     St.Louis Cardinals     Milwaukee Brewers       7     6.0
6   30-Sep-17           Chicago Cubs       Cincinnati Reds       9     0.0
7   30-Sep-17  San Francisco Giants       San Diego Padres       2     3.0
8   30-Sep-17         Boston Red Sox        Houston Astros       6     3.0
9   29-Sep-17                      1                   NaN       2     NaN
10  29-Sep-17           Chicago Cubs       Cincinnati Reds       5     4.0
11  29-Sep-17       New York Yankees     Toronto Blue Jays       4     0.0
12  29-Sep-17    Kansas City Royals         Detroit Tigers       1     4.0
13  29-Sep-17      Chicago White Sox    Los Angeles Angels       5     4.0

【Comments】:

    Tags: python pandas dataframe autofill


    【Solution 1】:

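    You can check the length of the values in the Date column. Use where to keep the values longer than 7 characters (the dates) and replace the rest (the times) with NaN, then forward-fill the missing values with ffill (equivalent to fillna with method ffill):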

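    # Keep only the date strings (longer than 7 characters), then forward-fill.
    df['Date'] = df['Date'].where(df['Date'].str.len() > 7).ffill()
    # Similar idea: mask the time strings (4 or 5 characters) instead.
    # df['Date'] = df['Date'].mask(df['Date'].str.len().isin([4, 5])).ffill()
    print(df)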
             Date                Team 1                Team 2  Score1  Score2
    0    1-Oct-17                     1                   NaN       2     NaN
    1    1-Oct-17          Chicago Cubs       Cincinnati Reds       1     3.0
    2    1-Oct-17    Kansas City Royals  Arizona Diamondbacks       2    14.0
    3    1-Oct-17    St.Louis Cardinals     Milwaukee Brewers       1     6.0
    4   30-Sep-17                     1                   NaN       2     NaN
    5   30-Sep-17    St.Louis Cardinals     Milwaukee Brewers       7     6.0
    6   30-Sep-17          Chicago Cubs       Cincinnati Reds       9     0.0
    7   30-Sep-17  San Francisco Giants      San Diego Padres       2     3.0
    8   30-Sep-17        Boston Red Sox        Houston Astros       6     3.0
    9   29-Sep-17                     1                   NaN       2     NaN
    10  29-Sep-17          Chicago Cubs       Cincinnati Reds       5     4.0
    11  29-Sep-17      New York Yankees     Toronto Blue Jays       4     0.0
    12  29-Sep-17    Kansas City Royals        Detroit Tigers       1     4.0
    13  29-Sep-17     Chicago White Sox    Los Angeles Angels       5     4.0
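
    For anyone who wants to try the length check without the full table, here is a minimal self-contained sketch. The six-row frame is a hand-typed subset of the question's data, and the name is_date is only illustrative. It assumes, as in the question, that every date string is longer than 7 characters and every time string is shorter.

```python
import pandas as pd

# Hand-typed subset of the question's data (an assumption for illustration).
df = pd.DataFrame({
    "Date": ["1-Oct-17", "21:20", "21:15", "30-Sep-17", "22:15", "2:15"],
    "Team 1": ["1", "Chicago Cubs", "Kansas City Royals",
               "1", "St.Louis Cardinals", "Kansas City Royals"],
})

# Dates such as "1-Oct-17" have 8 or 9 characters; times have 4 or 5.
is_date = df["Date"].str.len() > 7

# where() keeps values where the condition is True and sets the rest to NaN;
# ffill() then copies the last seen date down over those NaN values.
df["Date"] = df["Date"].where(is_date).ffill()
print(df)
```

    The check is a heuristic on string length, so it only holds while the two kinds of value stay clearly different in length.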
    

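    Another idea is to convert the values to datetimes and compare the time part with 0:00. Date-only strings parse to midnight, while time-only strings keep their own time: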

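    from datetime import time
    import pandas as pd

    # Date-only strings parse to midnight; time-only strings keep their time.
    # On pandas 2.0+ the mixed strings may need pd.to_datetime(df['Date'], format='mixed').
    df['Date'] = pd.to_datetime(df['Date'])
    # Keep only the midnight rows (the real dates), then forward-fill them.
    df['Date'] = df['Date'].where(df['Date'].dt.time == time(0, 0)).ffill()
    print(df)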
             Date                Team 1                Team 2  Score1  Score2
    0  2017-10-01                     1                   NaN       2     NaN
    1  2017-10-01          Chicago Cubs       Cincinnati Reds       1     3.0
    2  2017-10-01    Kansas City Royals  Arizona Diamondbacks       2    14.0
    3  2017-10-01    St.Louis Cardinals     Milwaukee Brewers       1     6.0
    4  2017-09-30                     1                   NaN       2     NaN
    5  2017-09-30    St.Louis Cardinals     Milwaukee Brewers       7     6.0
    6  2017-09-30          Chicago Cubs       Cincinnati Reds       9     0.0
    7  2017-09-30  San Francisco Giants      San Diego Padres       2     3.0
    8  2017-09-30        Boston Red Sox        Houston Astros       6     3.0
    9  2017-09-29                     1                   NaN       2     NaN
    10 2017-09-29          Chicago Cubs       Cincinnati Reds       5     4.0
    11 2017-09-29      New York Yankees     Toronto Blue Jays       4     0.0
    12 2017-09-29    Kansas City Royals        Detroit Tigers       1     4.0
    13 2017-09-29     Chicago White Sox    Los Angeles Angels       5     4.0
    
