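【Question title】: How can I extract day-of-week information from a timestamp in Python pandas?
【Posted】: 2020-05-09 21:22:14
【Question】:

I need to add more columns to a dataset. Because a week has seven days, I added seven extra columns named "day_1", "day_2", ... "day_7", and now I want to fill them from the day of the week of each timestamp. For example, for a row that falls on a Tuesday, only the "day_2" column should be 1, and the other columns (day_1, day_3, day_4, day_5, day_6, day_7) should all be 0. I would like to use the technique known as one-hot encoding for this. How can I write the code for it?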

#dataset['day_7'] = dataset.insert(0, 'day_7', 0)
#dataset['day_6'] = dataset.insert(0, 'day_6', 0)
#dataset['day_5'] = dataset.insert(0, 'day_5', 0)
#dataset['day_4'] = dataset.insert(0, 'day_4', 0)
#dataset['day_3'] = dataset.insert(0, 'day_3', 0)
#dataset['day_2'] = dataset.insert(0, 'day_2', 0)
#dataset['day_1'] = dataset.insert(0, 'day_1', 0)

This is the date column in my dataset:

#0  2016-01-01 05:00:00
#1  2016-01-01 06:00:00
#2  2016-01-01 07:00:00
#3  2016-01-01 08:00:00
#4  2016-01-01 09:00:00
#5  2016-01-01 10:00:00
#6  2016-01-01 11:00:00
#7  2016-01-01 12:00:00
#8  2016-01-01 13:00:00
#9  2016-01-01 14:00:00
#10 2016-01-01 15:00:00
#11 2016-01-01 16:00:00
#12 2016-01-01 17:00:00
#13 2016-01-01 18:00:00
#14 2016-01-01 19:00:00
#15 2016-01-01 20:00:00
#16 2016-01-01 21:00:00
#17 2016-01-01 22:00:00
#18 2016-01-01 23:00:00
#19 2016-01-02 00:00:00
#20 2016-01-02 01:00:00
#21 2016-01-02 02:00:00
#22 2016-01-02 03:00:00
#23 2016-01-02 04:00:00
#24 2016-01-02 05:00:00
#25 2016-01-02 06:00:00
#26 2016-01-02 07:00:00
#27 2016-01-02 08:00:00
#28 2016-01-02 09:00:00
#29 2016-01-02 10:00:00
#30 2016-01-02 11:00:00
#31 2016-01-02 12:00:00
#32 2016-01-02 13:00:00
#33 2016-01-02 14:00:00
#34 2016-01-02 15:00:00
#35 2016-01-02 16:00:00
#36 2016-01-02 17:00:00
#37 2016-01-02 18:00:00
#38 2016-01-02 19:00:00
#39 2016-01-02 20:00:00
#40 2016-01-02 21:00:00
#41 2016-01-02 22:00:00
#42 2016-01-02 23:00:00
#43 2016-01-03 00:00:00
#44 2016-01-03 01:00:00
#45 2016-01-03 02:00:00
#46 2016-01-03 03:00:00
#47 2016-01-03 04:00:00
#48 2016-01-03 05:00:00
#49 2016-01-03 06:00:00
#50 2016-01-03 07:00:00
#51 2016-01-03 08:00:00
#52 2016-01-03 09:00:00
#53 2016-01-03 10:00:00
#54 2016-01-03 11:00:00
#55 2016-01-03 12:00:00
#56 2016-01-03 13:00:00
#57 2016-01-03 14:00:00
#58 2016-01-03 15:00:00
#59 2016-01-03 16:00:00
#60 2016-01-03 17:00:00
#61 2016-01-03 18:00:00
#62 2016-01-03 19:00:00
#63 2016-01-03 20:00:00
#64 2016-01-03 21:00:00
#65 2016-01-03 22:00:00
#66 2016-01-03 23:00:00

【Comments on the question】:

Tags: python pandas datetime


【Solution 1】:

You can use pd.get_dummies:

import pandas as pd
df = pd.DataFrame({'timestamp': [
    '2016-01-01 05:00:00', '2016-01-01 06:00:00', '2016-01-01 07:00:00', '2016-01-01 08:00:00',
    '2016-01-02 00:00:00', '2016-01-02 06:00:00', '2016-01-02 07:00:00', '2016-01-02 08:00:00',
    '2016-01-03 00:00:00', '2016-01-03 06:00:00', '2016-01-03 07:00:00', '2016-01-03 08:00:00',
    '2016-01-04 00:00:00', '2016-01-04 06:00:00', '2016-01-04 07:00:00', '2016-01-04 08:00:00',
]})
# convert to datetime
df.timestamp = pd.to_datetime(df.timestamp)
# extract the day and add 1
df['day'] = df.timestamp.dt.dayofweek + 1 # Thanks @Mark Wang
# create one-hot encoding; dtype=int gives 0/1 columns (pandas 2.x defaults to True/False)
df_onehot = pd.get_dummies(df.day, prefix='day', dtype=int)
# merge back
df = pd.concat([df,df_onehot], axis=1)
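
One thing to watch for: get_dummies only creates columns for values that actually occur, and the sample data above covers Friday through Monday only, so day_2, day_3 and day_4 would be missing from the result. A minimal sketch of one way around that, assuming you always want all seven columns, is to convert the day numbers to a categorical with the seven categories declared up front (the shortened sample data here is only for illustration):

```python
import pandas as pd

# Shortened sample: one timestamp each for Friday, Saturday, Sunday, Monday.
df = pd.DataFrame({'timestamp': pd.to_datetime([
    '2016-01-01 05:00:00', '2016-01-02 06:00:00',
    '2016-01-03 07:00:00', '2016-01-04 08:00:00',
])})

# Monday=1 ... Sunday=7
day = df['timestamp'].dt.dayofweek + 1

# Declare all seven categories so get_dummies emits a column for every
# weekday, including weekdays that never occur in the data.
all_days = pd.CategoricalDtype(categories=list(range(1, 8)))
df_onehot = pd.get_dummies(day.astype(all_days), prefix='day', dtype=int)

df = pd.concat([df, df_onehot], axis=1)
print(df)
```

Columns for weekdays that never appear (here day_2, day_3 and day_4) are then present and filled with 0.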

【Comments】:

  • dayofweek starts at 0 (Monday), so you may need to add 1
  • Thanks @MarkWang for the reminder. I have edited the answer
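
To illustrate the point made in the comments, here is a small sketch comparing the zero-based value that dayofweek returns with the shifted value used in the answer. The three dates are picked only for illustration:

```python
import pandas as pd

# A Friday, a Monday and a Tuesday.
ts = pd.Series(pd.to_datetime(['2016-01-01', '2016-01-04', '2016-01-05']))

zero_based = ts.dt.dayofweek      # Monday=0 ... Sunday=6
one_based = ts.dt.dayofweek + 1   # Monday=1 ... Sunday=7

print(pd.DataFrame({
    'date': ts,
    'name': ts.dt.day_name(),
    'dayofweek': zero_based,
    'dayofweek + 1': one_based,
}))
```

With the shift applied, Tuesday maps to 2, which matches the "day_2" naming in the question.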
【Solution 2】:
### Import libraries

import pandas as pd

### Get some data: 24 consecutive daily timestamps
date_list = pd.date_range('2012-04-27 05:00:00', periods=24)
df = pd.DataFrame(data=date_list, columns=['Date'])

### Get the day name
df['Day'] = df['Date'].dt.day_name()

### Generate the day columns (dtype=int gives 0/1 values rather than True/False)
Days = pd.get_dummies(df.Day, dtype=int)

### Stitch the original and the Days DataFrames together
df = pd.concat([df.drop(['Day'], axis=1), Days], axis=1)

df

This is the output (first five rows):

                  Date  Friday  Monday  Saturday  Sunday  Thursday  Tuesday  Wednesday
0  2012-04-27 05:00:00       1       0         0       0         0        0          0
1  2012-04-28 05:00:00       0       0         1       0         0        0          0
2  2012-04-29 05:00:00       0       1         0       1         0        0          0
3  2012-04-30 05:00:00       0       1         0       0         0        0          0
4  2012-05-01 05:00:00       0       0         0       0         0        1          0
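
Note that the columns above come out in alphabetical order. If you would rather have them in calendar order, one possible approach (a sketch, not part of the original answer) is to reindex the dummy columns against an explicit weekday list; the weekday_order name is just a helper introduced here:

```python
import pandas as pd

# Hypothetical helper list giving the desired column order.
weekday_order = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
                 'Friday', 'Saturday', 'Sunday']

# Five daily timestamps starting on a Friday.
df = pd.DataFrame({'Date': pd.date_range('2012-04-27 05:00:00', periods=5)})

days = pd.get_dummies(df['Date'].dt.day_name(), dtype=int)

# reindex puts the columns in calendar order and adds any weekday
# that is absent from the data as a column of zeros.
days = days.reindex(columns=weekday_order, fill_value=0)

df = pd.concat([df, days], axis=1)
print(df)
```

This also guards against missing columns when the data does not span a full week.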

【Comments】:
