【Question title】: Converting a string which looks like a date into a date
【Posted】: 2018-05-19 20:26:30
【Question】:

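I have read many questions about converting strings to datetime, but I have not found one that covers date strings like the ones in this DataFrame.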

    idAviso                     timestamp idpostulante
 1111413600  2018-02-28T20:40:28.079-0500      0z5VvGv
 1112368499  2018-02-28T20:51:02.844-0500      0z5VvGv
 1112369554  2018-02-28T20:43:50.396-0500      0z5VvGv
 1112358250  2018-02-27T16:02:19.303-0500      0zB026d
 1112358250  2018-02-27T16:02:30.036-0500      0zB026d

My goal is to convert the timestamp column into something like the following, so that I can use it for some analysis:

    idAviso   timestamp idpostulante
 1111413600  2018-02-28      0z5VvGv
 1112368499  2018-02-28      0z5VvGv
 1112369554  2018-02-28      0z5VvGv
 1112358250  2018-02-27      0zB026d
 1112358250  2018-02-27      0zB026d

timestamp should now be a datetime variable.

【Comments】:

  • But you could just read the first 4+1+2+1+2 = 10 characters.

Tags: python pandas date datetime


【Answer 1】:

Just trim the string with .str[:10], then use pd.to_datetime() like this:

df['timestamp'] = pd.to_datetime(df['timestamp'].str[:10])

Other options:

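# Parse each full string, including its -0500 offset
df['timestamp'] = df['timestamp'].apply(pd.Timestamp)
# Parse the whole column; depending on the pandas version, the values may be
# converted to UTC, i.e. offset by 5 hours
df['timestamp'] = pd.to_datetime(df['timestamp'])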
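The choice between slicing and full parsing matters near midnight, because the local date and the UTC date can differ. The following is a minimal sketch, assuming a recent pandas version; the names `s`, `local_dates` and `utc_dates` are illustrative only and do not come from the original answer.

```python
import pandas as pd

s = pd.Series([
    "2018-02-28T20:40:28.079-0500",
    "2018-02-27T16:02:19.303-0500",
])

# Slice off the date part, then parse: keeps the date as written (local wall-clock date)
local_dates = pd.to_datetime(s.str[:10])

# Parse the full string as UTC, then drop the time of day: gives the UTC calendar date
utc_dates = pd.to_datetime(s, utc=True).dt.normalize()

print(local_dates.dt.strftime("%Y-%m-%d").tolist())
print(utc_dates.dt.strftime("%Y-%m-%d").tolist())
```

For the first row, 20:40 at -0500 is already 01:40 on the next day in UTC, so the two approaches give different dates. Slicing is the one that matches the output requested in the question.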

Complete example:

from io import StringIO

import pandas as pd

data = '''\
idAviso     timestamp                         idpostulante
1111413600  2018-02-28T20:40:28.079-0500      0z5VvGv
1112368499  2018-02-28T20:51:02.844-0500      0z5VvGv
1112369554  2018-02-28T20:43:50.396-0500      0z5VvGv
1112358250  2018-02-27T16:02:19.303-0500      0zB026d
1112358250  2018-02-27T16:02:30.036-0500      0zB026d'''

# pd.compat.StringIO is gone from newer pandas versions; io.StringIO does the same job
file = StringIO(data)
# Raw string so that \s is passed to the regex engine unchanged
df = pd.read_csv(file, sep=r'\s+')

df['timestamp'] = pd.to_datetime(df['timestamp'].str[:10])

print(df)

Output:

      idAviso  timestamp idpostulante
0  1111413600 2018-02-28      0z5VvGv
1  1112368499 2018-02-28      0z5VvGv
2  1112369554 2018-02-28      0z5VvGv
3  1112358250 2018-02-27      0zB026d
4  1112358250 2018-02-27      0zB026d

See also:

https://github.com/pandas-dev/pandas/issues/16898

【Comments】:

    【Answer 2】:

    This is the ISO 8601 time format with a UTC offset.

    In [1]: x= '2018-02-28T20:40:28.079-0500'
    
    In [2]: from dateutil.parser import parse
    
    In [3]: parse(x)
    Out[3]: datetime.datetime(2018, 2, 28, 20, 40, 28, 79000, tzinfo=tzoffset(None, -18000))
    

    To use it in a pandas DataFrame:

    In [7]: df = pd.DataFrame([x])
    
    In [8]: df
    Out[8]: 
                                  0
    0  2018-02-28T20:40:28.079-0500
    
    In [9]: df[0]
    Out[9]: 
    0    2018-02-28T20:40:28.079-0500
    Name: 0, dtype: object
    
    In [10]: df[0].apply(parse)
    Out[10]: 
    0   2018-02-28 20:40:28.079000-05:00
    Name: 0, dtype: datetime64[ns, tzoffset(None, -18000)]
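If dateutil is not available, the same string can probably be handled with the standard library alone. This is a sketch, assuming every value has exactly this layout (fractional seconds followed by a `-0500` style offset); the format string and the variable names are assumptions for illustration, not part of the answer above.

```python
from datetime import datetime, timezone

x = "2018-02-28T20:40:28.079-0500"

# %f reads the fractional seconds, %z reads the -0500 offset
parsed = datetime.strptime(x, "%Y-%m-%dT%H:%M:%S.%f%z")

# Calendar date as written in the string (in the -05:00 offset)
local_date = parsed.date()
# Calendar date after converting the same instant to UTC
utc_date = parsed.astimezone(timezone.utc).date()

print(parsed.isoformat())
print(local_date, utc_date)
```

An explicit format string fails loudly on values that do not match, whereas `dateutil.parser.parse` is more forgiving about layout.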
    

    【Comments】:
