【发布时间】:2021-07-11 05:52:42
【问题描述】:
假设我有如下数据框
date,ent_id,val
2021-03-23,101,61
2021-03-12,103,64
2021-03-15,101,32
2021-04-01,103,39
2021-04-02,101,71
2021-04-02,103,79
2021-04-30,101,51
2021-04-30,103,53
2021-05-31,101,28
2021-05-31,103,26
2021-05-31,101,47
2021-05-31,103,61
2021-06-06,101,45
2021-06-06,103,78
2021-06-07,101,23
2021-06-07,103,31
2021-07-31,101,14
2021-07-31,103,02
2021-07-31,101,82
2021-07-31,103,15
我想在包含月底日期的数据框中创建一个附加列 基于以下条件
case
when DAYNAME('date')='Sunday' then days_add(date,-2)
when DAYNAME('date')='Saturday' then days_add(date,-1)
else date
所以输出会是这样的
date,ent_id,val,month_end
2021-03-23,101,61,2021-03-31
2021-03-12,103,64,2021-03-31
2021-03-15,101,32,2021-03-31
2021-04-01,103,39,2021-04-30
2021-04-02,101,71,2021-04-30
2021-04-02,103,79,2021-04-30
2021-04-30,101,51,2021-04-30
2021-04-30,103,53,2021-04-30
2021-05-31,101,28,2021-05-31
2021-05-31,103,26,2021-05-31
2021-05-31,101,47,2021-05-31
2021-05-31,103,61,2021-05-31
2021-06-06,101,45,2021-06-30
2021-06-06,103,78,2021-06-30
2021-06-07,101,23,2021-06-30
2021-06-07,103,31,2021-06-30
2021-07-31,101,14,2021-07-31
2021-07-31,103,02,2021-07-31
2021-07-31,101,82,2021-07-31
2021-07-31,103,15,2021-07-31
我的努力
import pandas as pd
from datetime import timedelta
from pandas.tseries.offsets import MonthEnd
import numpy as np
df.loc[(df['date']+MonthEnd(0)).dt.day_name()=='Sunday','month_end'] =[df.loc[(df['date']+MonthEnd(0)).dt.day_name()=='Sunday']['date']+timedelta(days=-2)]
df.loc[(df['date']+MonthEnd(0)).dt.day_name()=='Saturday','month_end'] =[df.loc[(df['date']+MonthEnd(0)).dt.day_name()=='Saturday']['date']+timedelta(days=-1)]
但出现此错误
ValueError: Must have equal len keys and value when setting with an ndarray
欢迎任何其他更好的解决方案
【问题讨论】:
-
如果您觉得有帮助,请考虑为答案投票。 :-)
-
查看我仅使用 Pandas 日期偏移函数的编辑。
标签: python pandas numpy data-science