【Posted】: 2020-05-13 03:26:28
【Question】:
Given a column of UTC timestamps in a dataframe, I want to convert them to a format like 2018-10-07 06:59:05.162000:
_source.@timestamp
0 2018-10-07T06:59:05.162Z
1 2018-10-07T06:59:05.075Z
2 2018-10-07T06:59:05.103Z
3 2018-10-07T06:59:05.093Z
4 2018-10-07T06:59:05.108Z
5 2018-10-07T06:59:05.110Z
6 2018-10-07T06:59:07.148Z
7 2018-10-07T06:59:09.164Z
8 2018-10-07T06:59:09.214Z
I applied the following code:
df['_source.@timestamp'] = pd.to_datetime(df['_source.@timestamp'], format='%Y-%m-%dT%H:%M:%S.%fZ')
But it raises an error: ValueError: time data '-27' does not match format '%Y-%m-%dT%H:%M:%S.%fZ' (match)
After adding errors='coerce':
df['_source.@timestamp'] = pd.to_datetime(df['_source.@timestamp'],
format='%Y-%m-%dT%H:%M:%S.%fZ',
errors='coerce')
I get the following result, which does not look right:
2018-10-07T06:59:05.162Z NaT
2018-10-07T06:59:05.075Z NaT
2018-10-07T06:59:05.103Z NaT
2018-10-07T06:59:05.093Z NaT
2018-10-07T06:59:05.108Z NaT
..
2018-10-07T09:55:33.596Z NaT
2018-10-07T09:55:33.647Z NaT
2018-10-07T09:55:33.581Z NaT
2018-10-07T09:55:33.655Z NaT
2018-10-07T09:55:35.593Z NaT
Name: _source.@timestamp, Length: 10000, dtype: datetime64[ns]
This code may help solve the problem:
import datetime
utc = "2018-10-07T06:59:05.162Z"
UTC_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"
utcTime = datetime.datetime.strptime(utc, UTC_FORMAT)
print(utcTime)
Output:
2018-10-07 06:59:05.162000
How can I convert the column correctly? Thanks.
【Comments】:
-
I can't reproduce your error, so just asking: have you tried not specifying a format, e.g. pd.to_datetime(df['_source.@timestamp'])?
-
The sample data also parses correctly for me.
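Since the sample data parses fine, the '-27' in the error message suggests that the real column holds at least one value that is not a timestamp string. A minimal sketch for locating such rows, assuming a small made-up frame (the '-27' entry here is invented for illustration and only mimics the error message):

```python
import pandas as pd

# Hypothetical data: two valid ISO 8601 strings plus one stray value
df = pd.DataFrame({
    "_source.@timestamp": [
        "2018-10-07T06:59:05.162Z",
        "-27",
        "2018-10-07T06:59:05.075Z",
    ]
})

col = df["_source.@timestamp"]

# errors='coerce' turns anything that does not match the format into NaT
parsed = pd.to_datetime(col, format="%Y-%m-%dT%H:%M:%S.%fZ", errors="coerce")

# Rows that had a value but failed to parse are the ones to inspect
bad_rows = df[parsed.isna() & col.notna()]
print(bad_rows)
```

If every row ends up in bad_rows, it may be worth checking that the column really contains the timestamp strings and not some other field.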
-
Side note: since you have ISO 8601 compliant strings, e.g.
pd.to_datetime('2018-10-07T06:59:05.162Z') parses correctly to Timestamp('2018-10-07 06:59:05.162000+0000', tz='UTC'). So you don't have to supply a format.
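Building on that side note, here is a rough sketch of converting the whole column without a format string and then getting the representation asked for in the question. The column names ts_naive and ts_text are made up for this example, and the three rows are copied from the sample data:

```python
import pandas as pd

df = pd.DataFrame({
    "_source.@timestamp": [
        "2018-10-07T06:59:05.162Z",
        "2018-10-07T06:59:05.075Z",
        "2018-10-07T06:59:07.148Z",
    ]
})

# utc=True yields timezone-aware UTC timestamps
ts = pd.to_datetime(df["_source.@timestamp"], utc=True)

# Drop the timezone but keep the UTC wall-clock time (dtype datetime64[ns])
df["ts_naive"] = ts.dt.tz_localize(None)

# Or render as text with six fractional digits, as in the question
df["ts_text"] = ts.dt.strftime("%Y-%m-%d %H:%M:%S.%f")

print(df[["ts_naive", "ts_text"]])
```

Keeping the datetime column (ts_naive) is usually more convenient for further filtering or resampling; the string column is only needed for display or export.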
Tags: python-3.x pandas datetime