【Question title】: Pandas column operation with dates
【Posted】: 2015-05-23 12:02:13
【Question description】:

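I have a dataframe that consists mainly of dates. Here is what I want to do.

From the old date variable (DTDate), I want to create a new date variable: if the old date is a Monday, the new date stays the same, but if the old date is any other day, the new date should be the date of the following Monday. In the end, every item in the new date column will be a Monday.

I have been trying to use a function with apply. Here are my dataset and code.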

    Date call   DTDate      weekday     weekdayNo
0   31/12/2014  2014-12-31  Wednesday   3
1   29/10/2014  2014-10-29  Wednesday   3
2   28/10/2014  2014-10-28  Tuesday     2
3   27/3/2015   2015-03-27  Friday      5
4   27/2/2015   2015-02-27  Friday      5
5   27/11/2014  2014-11-27  Thursday    4
6   27/10/2014  2014-10-27  Monday      1
7   26/3/2015   2015-03-26  Thursday    4
8   26/2/2015   2015-02-26  Thursday    4
9   26/12/2014  2014-12-26  Friday      5
10  26/11/2014  2014-11-26  Wednesday   3
11  26/10/2014  2014-10-26  Sunday      0
12  25/3/2015   2015-03-25  Wednesday   3
13  25/12/2014  2014-12-25  Thursday    4
14  24/3/2015   2015-03-24  Tuesday     2
15  24/2/2015   2015-02-24  Tuesday     2
16  24/12/2014  2014-12-24  Wednesday   3
17  24/11/2014  2014-11-24  Monday      1
18  23/3/2015   2015-03-23  Monday      1

The code is:

from datetime import date, timedelta

def AddDate(row):
    if row['weekdayNo'] == 0:
        return row['DTDate'] + timedelta(days=1)
    elif row['weekdayNo'] == 2:
        return row['DTDate'] + timedelta(days=6)
    elif row['weekdayNo'] == 3:
        return row['DTDate'] + timedelta(days=5)
    elif row['weekdayNo'] == 4:
        return row['DTDate'] + timedelta(days=4)
    elif row['weekdayNo'] == 5:
        return row['DTDate'] + timedelta(days=3)
    elif row['weekdayNo'] == 6:
        return row['DTDate'] + timedelta(days=2)
    else:
        return row['DTDate']

DF['newDate'] = DF.apply(AddDate, axis=1)

I get the following, exactly the same as the input, with nothing changed:

     Date call  DTDate       weekday    weekdayNo   newDate
 0  31/12/2014  2014-12-31  Wednesday      3        2014-12-31
 1  29/10/2014  2014-10-29  Wednesday      3        2014-10-29
 2  28/10/2014  2014-10-28  Tuesday        2        2014-10-28
 3  27/3/2015   2015-03-27  Friday         5        2015-03-27
 4  27/2/2015   2015-02-27  Friday         5        2015-02-27
 5  27/11/2014  2014-11-27  Thursday       4        2014-11-27
 6  27/10/2014  2014-10-27  Monday         1        2014-10-27
 7  26/3/2015   2015-03-26  Thursday       4        2015-03-26
 8  26/2/2015   2015-02-26  Thursday       4        2015-02-26
 9  26/12/2014  2014-12-26  Friday         5        2014-12-26
 10 26/11/2014  2014-11-26  Wednesday      3        2014-11-26
 11 26/10/2014  2014-10-26  Sunday         0        2014-10-26
 12 25/3/2015   2015-03-25  Wednesday      3        2015-03-25
 13 25/12/2014  2014-12-25  Thursday       4        2014-12-25
 14 24/3/2015   2015-03-24  Tuesday        2        2015-03-24
 15 24/2/2015   2015-02-24  Tuesday        2        2015-02-24
 16 24/12/2014  2014-12-24  Wednesday      3        2014-12-24
 17 24/11/2014  2014-11-24  Monday         1        2014-11-24
 18 23/3/2015   2015-03-23  Monday         1        2015-03-23

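I also feel that this approach is not a good one. If there is a better way, would anyone be willing to suggest what it might be? Thanks in advance.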

【Question comments】:

  • Is DF.DTDate of dtype datetime? Can you try again after converting it to datetime with DF.DTDate = pd.to_datetime(DF.DTDate)?

Tags: python date pandas


【Solution 1】:

You don't need to import datetime or timedelta to do this.

df['DTDate'] = pd.to_datetime(df['DTDate'])  # can skip this if column 'DTDate' is already of the right type

x.weekday() returns the day of the week as an integer, with Monday=0 and Sunday=6.
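As a quick check of that convention, here is a minimal sketch using two dates taken from the question's table. Note that it differs from the question's own weekdayNo column, which appears to count from Sunday=0.

```python
import pandas as pd

monday = pd.Timestamp("2014-10-27")  # listed as Monday in the question's table
sunday = pd.Timestamp("2014-10-26")  # listed as Sunday in the question's table

# Timestamp.weekday() counts from Monday=0 up to Sunday=6
print(monday.day_name(), monday.weekday())
print(sunday.day_name(), sunday.weekday())
```

Because Monday maps to 0, a plain truthiness test on x.weekday() is enough to tell Mondays apart from every other day.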

df['newDate'] = df.DTDate.apply(lambda x: x + pd.DateOffset(days=7-x.weekday()) if  x.weekday() else x)

This yields:

    Date_call     DTDate    weekday  weekdayNo    newDate
0  2014-12-31 2014-12-31  Wednesday          3 2015-01-05
1  2014-10-29 2014-10-29  Wednesday          3 2014-11-03
2  2014-10-28 2014-10-28    Tuesday          2 2014-11-03
3  2015-03-27 2015-03-27     Friday          5 2015-03-30
4  2015-02-27 2015-02-27     Friday          5 2015-03-02
5  2014-11-27 2014-11-27   Thursday          4 2014-12-01
6  2014-10-27 2014-10-27     Monday          1 2014-10-27
7  2015-03-26 2015-03-26   Thursday          4 2015-03-30
8  2015-02-26 2015-02-26   Thursday          4 2015-03-02
9  2014-12-26 2014-12-26     Friday          5 2014-12-29
10 2014-11-26 2014-11-26  Wednesday          3 2014-12-01
11 2014-10-26 2014-10-26     Sunday          0 2014-10-27
12 2015-03-25 2015-03-25  Wednesday          3 2015-03-30
13 2014-12-25 2014-12-25   Thursday          4 2014-12-29
14 2015-03-24 2015-03-24    Tuesday          2 2015-03-30
15 2015-02-24 2015-02-24    Tuesday          2 2015-03-02
16 2014-12-24 2014-12-24  Wednesday          3 2014-12-29
17 2014-11-24 2014-11-24     Monday          1 2014-11-24
18 2015-03-23 2015-03-23     Monday          1 2015-03-23

【Comments】:

    【Solution 2】:

    The AddDate function can be made much simpler; in fact, it can be a one-liner:

    In [34]: df['newDate'] = df['DTDate'].apply(lambda x: x + timedelta(days=7-x.dayofweek)
                                                if x.dayofweek else x)
    

    Here, the lambda function lambda x: x + timedelta(days=7-x.dayofweek) if x.dayofweek else x
    adds delta = 7 - x.dayofweek days when the date is not a Monday, and leaves Mondays unchanged.
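The arithmetic behind that delta can be sketched with the standard library alone. The helper name next_monday is hypothetical and not part of the answer; it folds the "leave Mondays alone" case into a modulo instead of a conditional.

```python
from datetime import date, timedelta

def next_monday(d):
    """Return d if it is a Monday, otherwise the following Monday (hypothetical helper)."""
    # weekday(): Monday=0 ... Sunday=6, so Monday -> 0 days, Tuesday -> 6, ..., Sunday -> 1
    days_to_add = (7 - d.weekday()) % 7
    return d + timedelta(days=days_to_add)

print(next_monday(date(2014, 12, 31)))  # a Wednesday in the question's table
print(next_monday(date(2014, 10, 27)))  # a Monday in the question's table
print(next_monday(date(2014, 10, 26)))  # a Sunday in the question's table
```

For the three sample dates this should give 2015-01-05, 2014-10-27 and 2014-10-27, matching the newDate column shown in the answers.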

    To verify the new weekday, let's create a new column, newdayofweek:

    In [35]: df['newdayofweek'] = df['newDate'].apply(lambda x: x.dayofweek)
    
    In [36]: df
    Out[36]:
        Date        call     DTDate    weekday  weekdayNo    newDate  newdayofweek
    0      0  31/12/2014 2014-12-31  Wednesday          3 2015-01-05             0
    1      1  29/10/2014 2014-10-29  Wednesday          3 2014-11-03             0
    2      2  28/10/2014 2014-10-28    Tuesday          2 2014-11-03             0
    3      3   27/3/2015 2015-03-27     Friday          5 2015-03-30             0
    4      4   27/2/2015 2015-02-27     Friday          5 2015-03-02             0
    5      5  27/11/2014 2014-11-27   Thursday          4 2014-12-01             0
    6      6  27/10/2014 2014-10-27     Monday          1 2014-10-27             0
    7      7   26/3/2015 2015-03-26   Thursday          4 2015-03-30             0
    8      8   26/2/2015 2015-02-26   Thursday          4 2015-03-02             0
    9      9  26/12/2014 2014-12-26     Friday          5 2014-12-29             0
    10    10  26/11/2014 2014-11-26  Wednesday          3 2014-12-01             0
    11    11  26/10/2014 2014-10-26     Sunday          0 2014-10-27             0
    12    12   25/3/2015 2015-03-25  Wednesday          3 2015-03-30             0
    13    13  25/12/2014 2014-12-25   Thursday          4 2014-12-29             0
    14    14   24/3/2015 2015-03-24    Tuesday          2 2015-03-30             0
    15    15   24/2/2015 2015-02-24    Tuesday          2 2015-03-02             0
    16    16  24/12/2014 2014-12-24  Wednesday          3 2014-12-29             0
    17    17  24/11/2014 2014-11-24     Monday          1 2014-11-24             0
    18    18   23/3/2015 2015-03-23     Monday          1 2015-03-23             0
    

    Note: for the day of the week, Monday=0 and Sunday=6.

    【Comments】:

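    • Thanks mate, excellent solution, but for some reason I am getting an error: 'datetime.date' object has no attribute 'dayofweek'. Any suggestions? Sorry, my previous comment was wrong.
    • Is type(df['DTDate'][0]) == pandas.tslib.Timestamp? If not, convert the column with df['DTDate'] = pd.to_datetime(df['DTDate'])
    • OK, now it works. Very nice, thank you very much! Could you say something about to_datetime, or about when I need it? Or, if you have time, could you explain it in a few words?
    【Solution 3】: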
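As a rough illustration of the conversion discussed in these comments, the sketch below builds a small hypothetical frame whose DTDate column holds plain datetime.date objects (an assumption about the asker's data, based on the error message) and shows what pd.to_datetime changes.

```python
from datetime import date
import pandas as pd

# Hypothetical frame: the column holds plain datetime.date objects, stored as dtype object
df = pd.DataFrame({"DTDate": [date(2014, 12, 31), date(2014, 10, 27)]})

print(df["DTDate"].dtype)
has_attr_before = hasattr(df["DTDate"][0], "dayofweek")
print(has_attr_before)  # datetime.date has weekday() but no dayofweek attribute

# Convert the column so each element becomes a pandas Timestamp
df["DTDate"] = pd.to_datetime(df["DTDate"])

print(df["DTDate"].dtype)
print(type(df["DTDate"][0]).__name__, df["DTDate"][0].dayofweek)
```

In other words, the conversion is worth doing whenever the column's dtype is object rather than a datetime64 dtype, since attributes such as dayofweek and the .dt accessor are only available on pandas datetime values.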

    Here is a more performant way.

    In [50]: s = pd.Series(pd.date_range('20000101',freq='D',periods=10000))
    
    In [51]: result = s.where(s.dt.weekday==0,pd.TimedeltaIndex(7-s.dt.weekday,unit='d')+s)
    
    In [52]: expected = s.apply(lambda x: x + pd.DateOffset(days=7-x.weekday()) if  x.weekday() else x)
    
    In [53]: (result==expected).all()
    Out[53]: True
    

    This is essentially looping in Python space.

    In [54]: %timeit s.apply(lambda x: x + pd.DateOffset(days=7-x.weekday()) if  x.weekday() else x)
    1 loops, best of 3: 244 ms per loop
    

    Here, we build a TimedeltaIndex of the number of days to add. .where is the equivalent of an if-then, but it is a vectorized expression.

    In [55]: %timeit s.where(s.dt.weekday==0,pd.TimedeltaIndex(7-s.dt.weekday,unit='d')+s)
    100 loops, best of 3: 9.69 ms per loop
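A variation on the same vectorized idea, offered as a sketch rather than as this answer's method: it swaps in pd.to_timedelta and a modulo so that Mondays get an offset of zero days and no .where is needed. The 14-day range is an arbitrary choice for illustration.

```python
import pandas as pd

s = pd.Series(pd.date_range("2000-01-01", freq="D", periods=14))

# Days until the next Monday: 0 for Mondays, otherwise 7 - weekday
days_to_add = (7 - s.dt.weekday) % 7
result = s + pd.to_timedelta(days_to_add, unit="D")

# Compare against the element-wise apply used in the earlier answers
expected = s.apply(lambda x: x + pd.DateOffset(days=7 - x.weekday()) if x.weekday() else x)

print((result == expected).all())
print((result.dt.weekday == 0).all())
```

Both checks should print True: every resulting date is a Monday, and the values agree with the apply-based version.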
    

    【Comments】:
