【Question title】: Quick way of transforming a datetime column in Pandas
【Posted】: 2016-11-29 20:13:51
【Question】:

I have a large batch of CSVs in which the date column looks like this:

print df
           Date
0          20090501 00:00:00.831
1          20090501 00:00:00.832
2          20090501 00:00:01.078
3          20090501 00:00:01.337
4          20090501 00:00:01.580
5          20090501 00:00:01.581
6          20090501 00:00:01.582
7          20090501 00:00:01.602

From here I want to parse it using the format '%Y%m%d %H:%M:%S.%f', so:

df['Date'] = pd.to_datetime(df['Date'], format='%Y%m%d %H:%M:%S.%f')
print df
          Date
0         2009-05-01 00:00:00.831
1         2009-05-01 00:00:00.832
2         2009-05-01 00:00:01.078
3         2009-05-01 00:00:01.337
4         2009-05-01 00:00:01.580
5         2009-05-01 00:00:01.581

Finally, I split it into separate date and time columns with:

df['Time'] = df['Date'].apply(lambda x:x.time())
df['Date1']= df['Date'].apply(lambda x:x.date())
print df
         Time             Date1   
0        00:00:00.831000  2009-05-01
1        00:00:00.832000  2009-05-01
2        00:00:01.078000  2009-05-01
3        00:00:01.337000  2009-05-01
4        00:00:01.580000  2009-05-01
5        00:00:01.581000  2009-05-01
6        00:00:01.582000  2009-05-01

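The problem is that the lambda functions take about a minute to complete, and I have to do this for around 30,000 CSVs of roughly 2 million rows each. If anyone could suggest a faster solution, it would help a lot.

Thanks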

【Question comments】:

    Tags: python python-2.7 pandas lambda


    【Solution 1】:

    Use dt.time and dt.date:

    df['Time'] = df['Date'].dt.time
    df['Date1']= df['Date'].dt.date
    print (df)
                         Date             Time       Date1
    0 2009-05-01 00:00:00.831  00:00:00.831000  2009-05-01
    1 2009-05-01 00:00:00.832  00:00:00.832000  2009-05-01
    2 2009-05-01 00:00:01.078  00:00:01.078000  2009-05-01
    3 2009-05-01 00:00:01.337  00:00:01.337000  2009-05-01
    4 2009-05-01 00:00:01.580  00:00:01.580000  2009-05-01
    5 2009-05-01 00:00:01.581  00:00:01.581000  2009-05-01
    6 2009-05-01 00:00:01.582  00:00:01.582000  2009-05-01
    7 2009-05-01 00:00:01.602  00:00:01.602000  2009-05-01
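
For reference, here is a self-contained sketch of the accessor approach using a few of the sample values from the question. The variable name `raw` and the comparison against the `apply` version are illustrative additions, not part of the original answer.

```python
import pandas as pd

# Sample values copied from the question.
raw = [
    "20090501 00:00:00.831",
    "20090501 00:00:00.832",
    "20090501 00:00:01.078",
]
df = pd.DataFrame({"Date": raw})

# Passing an explicit format means pandas does not have to guess it.
df["Date"] = pd.to_datetime(df["Date"], format="%Y%m%d %H:%M:%S.%f")

# The .dt accessor operates on the whole column in one call.
df["Time"] = df["Date"].dt.time
df["Date1"] = df["Date"].dt.date

# Sanity check: same values as the row-by-row lambda version.
assert df["Time"].tolist() == df["Date"].apply(lambda x: x.time()).tolist()
assert df["Date1"].tolist() == df["Date"].apply(lambda x: x.date()).tolist()

print(df)
```

Both new columns hold plain Python `datetime.time` and `datetime.date` objects, exactly as the lambda version produced, so code downstream should not need to change.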
    

    【Comments】:

    • Shaved off about 15 seconds, not bad at all.
    • Super ;) These accessors are vectorized, so they are faster than apply.
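
A possible further step, not proposed anywhere in this thread: `dt.date` and `dt.time` return object-dtype columns of Python objects, so part of the cost is creating one object per row. If the rest of the pipeline can work with native pandas dtypes (an assumption about the asker's use case), a sketch along these lines keeps both columns in datetime64/timedelta64 form:

```python
import pandas as pd

df = pd.DataFrame({
    "Date": pd.to_datetime(
        ["20090501 00:00:00.831", "20090501 00:00:01.602"],
        format="%Y%m%d %H:%M:%S.%f",
    )
})

# Midnight of each day; the column stays datetime64 instead of
# becoming Python date objects.
df["Date1"] = df["Date"].dt.normalize()

# Time of day expressed as an offset from midnight (timedelta64).
df["Time"] = df["Date"] - df["Date1"]

print(df.dtypes)
print(df)
```

The trade-off is that `Time` becomes a timedelta rather than a `datetime.time`, so anything that expects real time objects would still need `dt.time`.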