【Question title】: Convert dates / times collected in two different formats within the same variable into a consistent format [duplicate]
【Posted】: 2018-07-07 06:33:57
【Question】:
     Request.id Pickup.point Driver.id            Status   Request.timestamp      Drop.timestamp
1           619      Airport         1    Trip Completed     11/7/2016 11:51     11/7/2016 13:00
2           867      Airport         1    Trip Completed     11/7/2016 17:57     11/7/2016 18:47
3          1807         City         1    Trip Completed      12/7/2016 9:17      12/7/2016 9:58
4          2532      Airport         1    Trip Completed     12/7/2016 21:08     12/7/2016 22:03
5          3112         City         1    Trip Completed 13-07-2016 08:33:16 13-07-2016 09:25:47
6          3879      Airport         1    Trip Completed 13-07-2016 21:57:28 13-07-2016 22:28:59
7          4270      Airport         1    Trip Completed 14-07-2016 06:15:32 14-07-2016 07:13:15
8          5510      Airport         1    Trip Completed 15-07-2016 05:11:52 15-07-2016 06:07:52
9          6248         City         1    Trip Completed 15-07-2016 17:57:27 15-07-2016 18:50:51
10          267         City         2    Trip Completed      11/7/2016 6:46      11/7/2016 7:25
11         1467      Airport         2    Trip Completed      12/7/2016 5:08      12/7/2016 6:02
12         1983         City         2    Trip Completed     12/7/2016 12:30     12/7/2016 12:57
13         2784      Airport         2    Trip Completed 13-07-2016 04:49:20 13-07-2016 05:23:03
14         3075         City         2    Trip Completed 13-07-2016 08:02:53 13-07-2016 09:16:19
15         3379         City         2    Trip Completed 13-07-2016 14:23:02 13-07-2016 15:35:18
16         3482      Airport         2    Trip Completed 13-07-2016 17:23:18 13-07-2016 18:20:51
17         4652         City         2    Trip Completed 14-07-2016 12:01:02 14-07-2016 12:36:46
18         5335      Airport         2    Trip Completed 14-07-2016 22:24:13 14-07-2016 23:18:52
19          535      Airport         3    Trip Completed     11/7/2016 10:00     11/7/2016 10:31
20          960      Airport         3    Trip Completed     11/7/2016 18:45     11/7/2016 19:23
21         1934      Airport         3    Trip Completed     12/7/2016 11:17     12/7/2016 12:23
22         2083      Airport         3    Trip Completed     12/7/2016 15:46     12/7/2016 16:40
23         2211      Airport         3    Trip Completed     12/7/2016 18:00     12/7/2016 18:28
24         3096      Airport         3    Trip Completed 13-07-2016 08:17:29 13-07-2016 09:22:37
25         3881      Airport         3    Trip Completed 13-07-2016 21:54:18 13-07-2016 22:51:23
26         5254         City         3    Trip Completed 14-07-2016 21:23:03 14-07-2016 22:25:19
27         5434         City         3    Trip Completed 15-07-2016 02:41:38 15-07-2016 03:24:43
28         5916         City         3    Trip Completed 15-07-2016 10:00:43 15-07-2016 10:53:06
29          669         City         4    Trip Completed     11/7/2016 13:08     11/7/2016 13:49
30         1567      Airport         4    Trip Completed      12/7/2016 6:21      12/7/2016 7:10

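In the dataset above, the Request.timestamp and Drop.timestamp columns contain date values in two different formats. How can I convert the dates in both columns to a single format, and how can I extract the date and the time separately?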

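【Comments】:

    Tags: r date datetime posixct


    【Solution 1】:

    To convert timestamps stored in two formats, we first need to determine which format each value uses. I used the lubridate package for this because it is easier to work with than the base R date-time functions.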

    rawData <- "Request.id|Pickup.point|Driver.id|Status      |Request.timestamp  |Drop.timestamp
             619   |  Airport   |     1   |Trip Completed|    11/7/2016 11:51|    11/7/2016 13:00
    867   |  Airport   |     1   |Trip Completed|    11/7/2016 17:57|    11/7/2016 18:47
    1807   |     City   |     1   |Trip Completed|     12/7/2016 9:17|     12/7/2016 9:58
    2532   |  Airport   |     1   |Trip Completed|    12/7/2016 21:08|    12/7/2016 22:03
    3112   |     City   |     1   |Trip Completed|13-07-2016 08:33:16|13-07-2016 09:25:47
    3879   |  Airport   |     1   |Trip Completed|13-07-2016 21:57:28|13-07-2016 22:28:59
    4270   |  Airport   |     1   |Trip Completed|14-07-2016 06:15:32|14-07-2016 07:13:15
    5510   |  Airport   |     1   |Trip Completed|15-07-2016 05:11:52|15-07-2016 06:07:52
    6248   |     City   |     1   |Trip Completed|15-07-2016 17:57:27|15-07-2016 18:50:51
    267   |     City   |     2   |Trip Completed|     11/7/2016 6:46|     11/7/2016 7:25
    1467   |  Airport   |     2   |Trip Completed|     12/7/2016 5:08|     12/7/2016 6:02
    1983   |     City   |     2   |Trip Completed|    12/7/2016 12:30|    12/7/2016 12:57
    2784   |  Airport   |     2   |Trip Completed|13-07-2016 04:49:20|13-07-2016 05:23:03
    3075   |     City   |     2   |Trip Completed|13-07-2016 08:02:53|13-07-2016 09:16:19
    3379   |     City   |     2   |Trip Completed|13-07-2016 14:23:02|13-07-2016 15:35:18
    3482   |  Airport   |     2   |Trip Completed|13-07-2016 17:23:18|13-07-2016 18:20:51
    4652   |     City   |     2   |Trip Completed|14-07-2016 12:01:02|14-07-2016 12:36:46
    5335   |  Airport   |     2   |Trip Completed|14-07-2016 22:24:13|14-07-2016 23:18:52
    535   |  Airport   |     3   |Trip Completed|    11/7/2016 10:00|    11/7/2016 10:31
    960   |  Airport   |     3   |Trip Completed|    11/7/2016 18:45|    11/7/2016 19:23
    1934   |  Airport   |     3   |Trip Completed|    12/7/2016 11:17|    12/7/2016 12:23
    2083   |  Airport   |     3   |Trip Completed|    12/7/2016 15:46|    12/7/2016 16:40
    2211   |  Airport   |     3   |Trip Completed|    12/7/2016 18:00|    12/7/2016 18:28
    3096   |  Airport   |     3   |Trip Completed|13-07-2016 08:17:29|13-07-2016 09:22:37
    3881   |  Airport   |     3   |Trip Completed|13-07-2016 21:54:18|13-07-2016 22:51:23
    5254   |     City   |     3   |Trip Completed|14-07-2016 21:23:03|14-07-2016 22:25:19
    5434   |     City   |     3   |Trip Completed|15-07-2016 02:41:38|15-07-2016 03:24:43
    5916   |     City   |     3   |Trip Completed|15-07-2016 10:00:43|15-07-2016 10:53:06
    669   |     City   |     4   |Trip Completed|    11/7/2016 13:08|    11/7/2016 13:49
    1567   |  Airport   |     4   |Trip Completed|     12/7/2016 6:21|     12/7/2016 7:10"
    library(lubridate)
    data <- read.csv(text=rawData,header=TRUE,
                     sep="|",
                     strip.white=TRUE,   # drop the padding around each field
                     stringsAsFactors=FALSE)
    
    # Parse each value with the parser that matches its layout. Both layouts are
    # day-first: "11/7/2016 11:51" is 11 July, just like "13-07-2016 08:33:16".
    convertTime <- function(aVector){
    
        # ifelse() drops the POSIXct class, so this returns seconds since the epoch
        unlist(lapply(aVector,function(x){
              ifelse(grepl("/",x),
                     dmy_hm(x),
                     dmy_hms(x))
    
         }))
    }
    requestTime <- convertTime(data$Request.timestamp)
    dropTime <- convertTime(data$Drop.timestamp)
    # as_datetime() turns the numeric seconds back into POSIXct (UTC)
    as_datetime(requestTime)
    

    ...and the output:

    > as_datetime(requestTime)
     [1] "2016-07-11 11:51:00 UTC" "2016-07-11 17:57:00 UTC" "2016-07-12 09:17:00 UTC"
     [4] "2016-07-12 21:08:00 UTC" "2016-07-13 08:33:16 UTC" "2016-07-13 21:57:28 UTC"
     [7] "2016-07-14 06:15:32 UTC" "2016-07-15 05:11:52 UTC" "2016-07-15 17:57:27 UTC"
    [10] "2016-07-11 06:46:00 UTC" "2016-07-12 05:08:00 UTC" "2016-07-12 12:30:00 UTC"
    [13] "2016-07-13 04:49:20 UTC" "2016-07-13 08:02:53 UTC" "2016-07-13 14:23:02 UTC"
    [16] "2016-07-13 17:23:18 UTC" "2016-07-14 12:01:02 UTC" "2016-07-14 22:24:13 UTC"
    [19] "2016-07-11 10:00:00 UTC" "2016-07-11 18:45:00 UTC" "2016-07-12 11:17:00 UTC"
    [22] "2016-07-12 15:46:00 UTC" "2016-07-12 18:00:00 UTC" "2016-07-13 08:17:29 UTC"
    [25] "2016-07-13 21:54:18 UTC" "2016-07-14 21:23:03 UTC" "2016-07-15 02:41:38 UTC"
    [28] "2016-07-15 10:00:43 UTC" "2016-07-11 13:08:00 UTC" "2016-07-12 06:21:00 UTC"
    > 
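The same idea (detect the layout, then parse with the matching format) can also be written without a per-element loop. The following is only a sketch in base R, assuming that every value containing "/" is day/month/year with hours and minutes, and every other value is day-month-year with seconds. The helper name `parse_mixed` is hypothetical and not part of the original answer.

```r
# Hypothetical helper: parse a character vector that mixes two timestamp layouts.
# Assumes "/" marks "d/m/Y H:M" and everything else is "d-m-Y H:M:S".
parse_mixed <- function(x, tz = "UTC") {
  x <- trimws(x)                                  # remove padding around the values
  has_slash <- grepl("/", x, fixed = TRUE)        # TRUE for the slash layout
  out <- .POSIXct(rep(NA_real_, length(x)), tz = tz)  # all-NA POSIXct to fill in
  out[has_slash]  <- as.POSIXct(x[has_slash],  format = "%d/%m/%Y %H:%M",    tz = tz)
  out[!has_slash] <- as.POSIXct(x[!has_slash], format = "%d-%m-%Y %H:%M:%S", tz = tz)
  out
}

res <- parse_mixed(c("11/7/2016 11:51", "13-07-2016 08:33:16"))
print(res)
```

Values that match neither layout come back as NA, so `anyNA(res)` is a quick check that nothing was silently misparsed.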
    

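    【Comments】:

    • Sorry. Our solutions use similar techniques and were posted at the same time.
    • @MKR - No problem. Thanks for posting the parse_date_time() solution with multiple formats. I had not used that function before.
    【Solution 2】:

    The OP has dates/times in heterogeneous formats within the data frame. lubridate is very handy in this situation.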

    library(lubridate)
    df <- read.table(text = "Request.id Pickup.point Driver.id            Status   Request.timestamp      Drop.timestamp
    1           619      Airport         1    'Trip Completed'     '11/7/2016 11:51'     '11/7/2016 13:00'
    2           867      Airport         1    'Trip Completed'     '11/7/2016 17:57'     '11/7/2016 18:47'
    3          1807         City         1    'Trip Completed'      '12/7/2016 9:17'      '12/7/2016 9:58'
    4          2532      Airport         1    'Trip Completed'     '12/7/2016 21:08'     '12/7/2016 22:03'
    5          3112         City         1    'Trip Completed' '13-07-2016 08:33:16' '13-07-2016 09:25:47'
    6          3879      Airport         1    'Trip Completed' '13-07-2016 21:57:28' '13-07-2016 22:28:59'
    7          4270      Airport         1    'Trip Completed' '14-07-2016 06:15:32' '14-07-2016 07:13:15'
    8          5510      Airport         1    'Trip Completed' '15-07-2016 05:11:52' '15-07-2016 06:07:52'
    9          6248         City         1    'Trip Completed' '15-07-2016 17:57:27' '15-07-2016 18:50:51'", header = TRUE, stringsAsFactors = FALSE)
    
    # Use parse_date_time() to convert heterogeneous date-times; each value is tried against both orders
    df$Request.timestamp <- parse_date_time(df$Request.timestamp, c("dmY HM", "dmY HMS"))
    df$Drop.timestamp <- parse_date_time(df$Drop.timestamp, c("dmY HM", "dmY HMS"))
    
    df
    

    The converted date/time data is:

     Request.id Pickup.point Driver.id         Status   Request.timestamp      Drop.timestamp
    1        619      Airport         1 Trip Completed 2016-07-11 11:51:00 2016-07-11 13:00:00
    2        867      Airport         1 Trip Completed 2016-07-11 17:57:00 2016-07-11 18:47:00
    3       1807         City         1 Trip Completed 2016-07-12 09:17:00 2016-07-12 09:58:00
    4       2532      Airport         1 Trip Completed 2016-07-12 21:08:00 2016-07-12 22:03:00
    5       3112         City         1 Trip Completed 2016-07-13 08:33:16 2016-07-13 09:25:47
    6       3879      Airport         1 Trip Completed 2016-07-13 21:57:28 2016-07-13 22:28:59
    7       4270      Airport         1 Trip Completed 2016-07-14 06:15:32 2016-07-14 07:13:15
    8       5510      Airport         1 Trip Completed 2016-07-15 05:11:52 2016-07-15 06:07:52
    9       6248         City         1 Trip Completed 2016-07-15 17:57:27 2016-07-15 18:50:51
    

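    Additional code to separate the date and the time:

    # format() is used rather than as.character(x, "<format>"), which warns in R >= 4.3
    df$Request.timestamp_date <- format(df$Request.timestamp, "%Y-%m-%d")
    df$Request.timestamp_time <- format(df$Request.timestamp, "%H:%M:%S")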
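For plotting, character columns are often less convenient than typed values. As a rough sketch in base R, and assuming the timestamps have already been converted to POSIXct in UTC, the date can be kept as a `Date` and the hour as an integer. The variable names below (`ts`, `request_date`, `request_hour`, `trips_per_hour`) are illustrative only.

```r
# Two already-converted timestamps standing in for df$Request.timestamp
ts <- as.POSIXct(c("2016-07-11 11:51:00", "2016-07-13 08:33:16"), tz = "UTC")

request_date <- as.Date(ts, tz = "UTC")          # calendar date as a Date object
request_hour <- as.integer(format(ts, "%H"))     # hour of day, 0 to 23

# Count of requests per hour, e.g. as input to barplot()
trips_per_hour <- table(request_hour)
print(trips_per_hour)
```

Keeping the date as a `Date` lets plotting functions order and space the axis chronologically instead of alphabetically.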
    

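    【Comments】:

    • How can we extract the date and the time separately from the table above for plotting?
    • @user57370 Let me add that as part of the example. It is simple. I think Florian has already shown one way.
    • @user57370 The answer has been modified to add the date and time in separate columns.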