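【Question title】: Performing a mod function on time data column pandas python
【Posted】: 2021-08-04 08:30:22
【Question description】:

Hello, I want to apply a modulo operation (% 24) to the time column.

I believe the time column is stored as strings.

I would like to know how I should go about this operation.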

sales_id,date,time,shopping_cart,price,parcel_size,Customer_lat,Customer_long,isLoyaltyProgram,nearest_storehouse_id,nearest_storehouse,dist_to_nearest_storehouse,delivery_cost
ORD0056604,24/03/2021,45:13:45,"[('bed', 3), ('Chair', 1), ('wardrobe', 4), ('side_table', 2), ('Dining_table', 2), ('mattress', 1)]",3152.77,medium,-38.246,145.61984,1,4,Sunshine,78.43,5.8725000000000005
ORD0096594,13/12/2018,54:22:20,"[('Study_table', 4), ('wardrobe', 4), ('side_table', 1), ('Dining_table', 2), ('sofa', 4), ('Chair', 3), ('mattress', 1)]",3781.38,large,-38.15718,145.05072,1,4,Sunshine,40.09,5.8725000000000005
ORD0046310,16/02/2018,17:23:36,"[('mattress', 2), ('wardrobe', 1), ('side_table', 2), ('sofa', 1), ('Chair', 3), ('Study_table', 4)]",2219.09,medium,144.69623,-38.00731,0,2,Footscray,34.2,16.9875
ORD0031675,25/06/2018,17:38:48,"[('bed', 4), ('side_table', 1), ('Chair', 1), ('mattress', 3), ('Dining_table', 2), ('sofa', 2), ('wardrobe', 2)]",4542.1,large,144.65506,-38.40669,1,2,Footscray,72.72,18.274500000000003
ORD0019799,05/01/2021,18:37:16,"[('wardrobe', 1), ('Study_table', 3), ('sofa', 4), ('side_table', 2), ('Chair', 4), ('Dining_table', 4), ('bed', 1)]",3132.71,L,-37.66022,144.94286,1,0,Clayton,17.77,14.931
ORD0041462,25/12/2018,07:29:33,"[('Chair', 3), ('bed', 1), ('mattress', 3), ('side_table', 3), ('wardrobe', 3), ('sofa', 4)]",4416.42,medium,-38.39154,145.87448,0,6,Sunshine,105.91,6.151500000000001
ORD0047848,30/07/2021,34:18:01,"[('Chair', 3), ('bed', 3), ('wardrobe', 4)]",2541.04,small,-37.4654,144.45832,1,2,Footscray,60.85,18.4635

【Question comments】:

  • What is the expected output of 45:13:45 % 24?
  • 21:13:45. Sorry, I should have edited the question: the mod applies only to the hours.

Tags: pandas dataframe csv datetime data-cleaning


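【Solution 1】:

Convert the values to timedeltas with to_timedelta, then drop the days part by converting to strings and slicing off the last 8 characters (HH:MM:SS):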

print (df)
     sales_id        date      time
0  ORD0056604  24/03/2021  45:13:45
1  ORD0096594  13/12/2018  54:22:20

print (pd.to_timedelta(df['time']))
0   1 days 21:13:45
1   2 days 06:22:20
Name: time, dtype: timedelta64[ns]

df['time'] = pd.to_timedelta(df['time']).astype(str).str[-8:]

print (df)
     sales_id        date      time
0  ORD0056604  24/03/2021  21:13:45
1  ORD0096594  13/12/2018  06:22:20
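
As an alternative to slicing the timedelta's string form, the modulo can also be applied to the hour field directly. The following is a minimal sketch, not part of the original answer; it assumes every value in the time column is a well-formed H:M:S string, and the sample frame and the names parts, hours and wrapped are made up for illustration.

```python
import pandas as pd

# Illustrative sample built from three rows of the question's data.
df = pd.DataFrame({
    "sales_id": ["ORD0056604", "ORD0096594", "ORD0046310"],
    "date": ["24/03/2021", "13/12/2018", "16/02/2018"],
    "time": ["45:13:45", "54:22:20", "17:23:36"],
})

# Split "HH:MM:SS" into three string columns (0 = hours, 1 = minutes, 2 = seconds).
parts = df["time"].str.split(":", expand=True)

# Apply % 24 to the hours only, then zero-pad back to two digits.
hours = parts[0].astype(int) % 24
wrapped = hours.astype(str).str.zfill(2) + ":" + parts[1] + ":" + parts[2]

df["time"] = wrapped
print(df)
```

This version leaves minutes and seconds untouched as text, so it only makes sense when they are already valid (below 60); to_timedelta is the safer choice if they might overflow as well.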

If the overflowed days should also be carried over into the date column, add the timedeltas to the parsed dates and then extract both values with Series.dt.strftime:

dates = pd.to_datetime(df['date'], dayfirst=True) + pd.to_timedelta(df['time'])
df['time'] = dates.dt.strftime('%H:%M:%S')
df['date'] = dates.dt.strftime('%d/%m/%Y')
print (df)
     sales_id        date      time
0  ORD0056604  25/03/2021  21:13:45
1  ORD0096594  15/12/2018  06:22:20
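
The same rollover logic can be wrapped in a small helper and checked against a row whose time is already below 24 hours. This is only a sketch under stated assumptions: the function name normalize_date_time is hypothetical, the sample frame is illustrative, and the dates are parsed with an explicit format="%d/%m/%Y" instead of dayfirst=True, on the assumption that every date follows that layout.

```python
import pandas as pd


def normalize_date_time(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy where hours >= 24 in 'time' are rolled over into 'date'.

    Hypothetical helper; assumes 'date' is DD/MM/YYYY and 'time' is H:M:S.
    """
    out = df.copy()
    stamps = (
        pd.to_datetime(out["date"], format="%d/%m/%Y")
        + pd.to_timedelta(out["time"])
    )
    out["date"] = stamps.dt.strftime("%d/%m/%Y")
    out["time"] = stamps.dt.strftime("%H:%M:%S")
    return out


# Two overflowing rows and one row that is already below 24 hours.
sample = pd.DataFrame({
    "sales_id": ["ORD0056604", "ORD0096594", "ORD0046310"],
    "date": ["24/03/2021", "13/12/2018", "16/02/2018"],
    "time": ["45:13:45", "54:22:20", "17:23:36"],
})

result = normalize_date_time(sample)
print(result)
```

A time below 24 hours converts to a timedelta of zero days, so that row's date and time come back unchanged.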

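【Comments】:

  • Does this code also work for records with less than 24 hours?
  • I will edit the question, please take a look.
  • @jixubi - I think so. Did you run into a problem when you tried it?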