【Question title】: Erase minutes and seconds from character dates in R
【Posted】: 2019-03-21 01:02:12
【Question】:

I have this vector of timestamps:

c("01/09/2019 9:51:03", "01/09/2019 9:51:39", "01/09/2019 9:57:04", 
"01/09/2019 10:01:41", "01/09/2019 10:06:06", "01/09/2019 10:09:36", 
"01/09/2019 10:11:55", "01/09/2019 10:21:15", "01/09/2019 10:21:39", 
"01/09/2019 10:52:20")

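I want to strip the minutes and seconds from the character vector, so that I am left with only values such as 01/09/2019 9 and 01/09/2019 10.

What is the most efficient way to do this?

【Comments】:

  • Interesting choice of accepted answer. What is your definition of efficient?
  • I am very partial to tidyverse packages :)
  • Ah, I see, I don't blame you :) Next time you should probably put that in the question...

Tags: r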


【Solution 1】:

Here is one way.

datevec <- c("01/09/2019 9:51:03", "01/09/2019 9:51:39", "01/09/2019 9:57:04", 
      "01/09/2019 10:01:41", "01/09/2019 10:06:06", "01/09/2019 10:09:36", 
      "01/09/2019 10:11:55", "01/09/2019 10:21:15", "01/09/2019 10:21:39", 
      "01/09/2019 10:52:20")

format(as.POSIXct(datevec, format = "%d/%m/%Y %H:%M:%OS"), "%d/%m/%Y %H")

# Result
 [1] "01/09/2019 09" "01/09/2019 09" "01/09/2019 09" "01/09/2019 10" "01/09/2019 10" "01/09/2019 10"
 [7] "01/09/2019 10" "01/09/2019 10" "01/09/2019 10" "01/09/2019 10"
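
Note that the result above zero-pads the hour ("09"), whereas the question asks for "9". A minimal sketch of one way to drop the padding after formatting is shown here. The shortened input, the tz = "UTC" argument, and the use of %S instead of %OS are assumptions added for illustration and are not part of the original answer.

```r
# Shortened sample of the question's data (illustrative only).
datevec <- c("01/09/2019 9:51:03", "01/09/2019 10:52:20")

# Parse with an explicit time zone so the result does not depend on the
# local settings of the machine (an assumption, not in the original answer).
parsed <- as.POSIXct(datevec, format = "%d/%m/%Y %H:%M:%S", tz = "UTC")

# %H always yields a two-digit hour.
hourly <- format(parsed, "%d/%m/%Y %H")

# Remove the leading zero of the hour: the only " 0" sequence in these
# strings is the one between the date and a one-digit hour.
unpadded <- sub(" 0", " ", hourly, fixed = TRUE)
print(unpadded)
```

Round-tripping through a date-time class validates the input, but as the benchmark further down suggests, it is slower than pure string manipulation.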

【Comments】:

    【Solution 2】:

    What class do you want the output to be? How about this:

    v <- c("01/09/2019 9:51:03", "01/09/2019 9:51:39", "01/09/2019 9:57:04", 
      "01/09/2019 10:01:41", "01/09/2019 10:06:06", "01/09/2019 10:09:36", 
      "01/09/2019 10:11:55", "01/09/2019 10:21:15", "01/09/2019 10:21:39", 
      "01/09/2019 10:52:20")
    
    
    strptime(v, "%m/%d/%Y %H")
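
To make the point about the output class concrete, here is a small sketch of what strptime returns for this format. The shortened input and tz = "UTC" are assumptions for illustration. Note also that this answer reads the date as month/day (%m/%d), whereas Solution 1 reads it as day/month (%d/%m); the question does not say which is intended.

```r
# Shortened sample of the question's data (illustrative only).
v <- c("01/09/2019 9:51:03", "01/09/2019 10:52:20")

# The format stops after the hour; strptime ignores the trailing ":51:03".
parsed <- strptime(v, "%m/%d/%Y %H", tz = "UTC")

# The result is a date-time object (POSIXlt), not a character vector.
print(class(parsed))

# Minutes and seconds are zero because they were never parsed.
print(parsed$hour)
print(parsed$min)

# With %m/%d the month is January (mon is zero-based) and the day is 9.
print(parsed$mon + 1)
print(parsed$mday)
```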
    

    【Comments】:

      【Solution 3】:

      This seems to work well:

      unlist(strsplit(mystring, split = ":", fixed=TRUE))[c(TRUE, FALSE,FALSE)]
      

      (Made with the help of the post linked here.)

      Alternatively,

      sapply(strsplit(mystring, split=':', fixed=TRUE), `[`, 1)
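
The two variants above are not interchangeable in general. The logical index c(TRUE, FALSE, FALSE) is recycled over the flattened pieces, so it only picks the right elements when every string splits into exactly three parts. The sketch below checks that assumption on a shortened sample; vapply is used in place of sapply here purely as a type-safe illustration.

```r
# Shortened sample of the question's data (illustrative only).
mystring <- c("01/09/2019 9:51:03", "01/09/2019 10:52:20")

# Split once and reuse the pieces for both variants.
parts <- strsplit(mystring, split = ":", fixed = TRUE)

# The recycling trick is only valid if every element has three pieces.
stopifnot(all(lengths(parts) == 3))

# Variant 1: flatten, then keep every third element starting at the first.
by_recycling <- unlist(parts)[c(TRUE, FALSE, FALSE)]

# Variant 2: take the first piece of each element individually.
by_element <- vapply(parts, `[`, character(1), 1)

print(identical(by_recycling, by_element))
```

If some timestamps could lack seconds (for example "01/09/2019 9:51"), the per-element variant would still return the first piece of each string, while the recycling variant would silently pick the wrong elements.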
      

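      Using some of Ronak's benchmarks and following the recent comments (fixed=TRUE makes the split-based methods faster), we see that method four (the unlist approach above) is the fastest: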

      mystring <- c("01/09/2019 9:51:03", "01/09/2019 9:51:39", "01/09/2019 9:57:04", 
                    "01/09/2019 10:01:41", "01/09/2019 10:06:06", "01/09/2019 10:09:36", 
                    "01/09/2019 10:11:55", "01/09/2019 10:21:15", "01/09/2019 10:21:39", 
                    "01/09/2019 10:52:20")
      
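      library(microbenchmark)  # provides microbenchmark()
      library(stringr)         # provides str_extract(), used by `seven`

      microbenchmark(one = sapply(strsplit(mystring, split=':', fixed=TRUE), `[`, 1),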
                 two = unlist(lapply(mystring,function(x) strsplit(x,":", fixed=TRUE)[[1]][1])),
                 three = strptime(mystring, "%m/%d/%Y %H"),
                 four = unlist(strsplit(mystring, split = ":", fixed=TRUE))[c(TRUE, FALSE,FALSE)],
                 five = format(as.POSIXct(mystring, format = "%d/%m/%Y %H:%M:%OS"), "%d/%m/%Y %H"), 
                 six = gsub("(.*?):.*", "\\1", mystring),
                 seven = str_extract(mystring, ".+(?=:.+:)"),
                 times = 100000)
      
      
      
          Unit: microseconds
        expr     min      lq      mean  median       uq        max neval
         one  42.792  49.471  85.63742  52.572  57.1310  669280.96 1e+05
         two  64.637  70.618 114.16364  73.252  77.6840  582466.94 1e+05
       three 129.456 134.771 156.82308 136.188 139.2030  339715.94 1e+05
        four  12.860  15.641  22.75699  17.254  18.5440  305703.52 1e+05
        five 482.888 505.647 633.15388 512.880 552.1155  551274.28 1e+05
         six  37.889  43.121  52.79030  45.567  49.1880   32954.59 1e+05
       seven  53.432  59.051  88.05015  62.326  69.9320 1180361.17 1e+05
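
When reading these timings, keep in mind that the benchmarked expressions do not all return the same thing. The sketch below compares a few of them on a shortened sample using base R only; tz = "UTC" is an assumption added so the result does not depend on local settings.

```r
# Shortened sample of the question's data (illustrative only).
mystring <- c("01/09/2019 9:51:03", "01/09/2019 10:52:20")

# String-based methods from the benchmark.
one  <- sapply(strsplit(mystring, split = ":", fixed = TRUE), `[`, 1)
four <- unlist(strsplit(mystring, split = ":", fixed = TRUE))[c(TRUE, FALSE, FALSE)]
six  <- gsub("(.*?):.*", "\\1", mystring)

# Date-based methods from the benchmark.
three <- strptime(mystring, "%m/%d/%Y %H", tz = "UTC")
five  <- format(as.POSIXct(mystring, format = "%d/%m/%Y %H:%M:%OS", tz = "UTC"),
                "%d/%m/%Y %H")

# The string-based methods agree with each other.
print(identical(one, four) && identical(one, six))

# `five` is character but zero-pads the hour, so it differs from `one`.
print(identical(one, five))

# `three` is not a character vector at all.
print(class(three))
```

So the fastest entry is only the best choice if a character result without zero-padding is what is actually needed.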
      

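      【Comments】:

      • Yes. I think fixed = TRUE makes it faster.
      • Very interesting. In fact, number four with fixed=TRUE is the fastest; I have updated the benchmark to show this.
      【Solution 4】:

      You can also use str_extract from stringr: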

      date_strings <- c("01/09/2019 9:51:03", "01/09/2019 9:51:39", "01/09/2019 9:57:04", 
      "01/09/2019 10:01:41", "01/09/2019 10:06:06", "01/09/2019 10:09:36", 
      "01/09/2019 10:11:55", "01/09/2019 10:21:15", "01/09/2019 10:21:39", 
      "01/09/2019 10:52:20")
      
      library(stringr)  # provides str_extract()

      str_extract(date_strings, ".+(?=:.+:)")
      
       [1] "01/09/2019 9"  "01/09/2019 9"  "01/09/2019 9"  "01/09/2019 10"
       [5] "01/09/2019 10" "01/09/2019 10" "01/09/2019 10" "01/09/2019 10"
       [9] "01/09/2019 10" "01/09/2019 10"
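
The pattern works because the greedy .+ backtracks until the lookahead (?=:.+:) succeeds, that is, until the match ends at a position that is followed by a colon, some characters, and another colon. For readers without stringr, a rough base R counterpart is sketched below. It uses regexpr with perl = TRUE and is an illustration, not part of the original answer; it also assumes every string contains a match, since regmatches drops non-matching elements.

```r
# Shortened sample of the question's data (illustrative only).
date_strings <- c("01/09/2019 9:51:03", "01/09/2019 10:52:20")

# Lookaheads need the PCRE engine, hence perl = TRUE.
hits <- regexpr(".+(?=:.+:)", date_strings, perl = TRUE)

# Extract the matched substrings; non-matching inputs would be dropped.
extracted <- regmatches(date_strings, hits)
print(extracted)
```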
      

      【Comments】:

        【Solution 5】:

        Another one:

        dates <- c("01/09/2019 9:51:03", "01/09/2019 9:51:39", "01/09/2019 9:57:04", 
                          "01/09/2019 10:01:41", "01/09/2019 10:06:06", "01/09/2019 10:09:36", 
                          "01/09/2019 10:11:55", "01/09/2019 10:21:15", "01/09/2019 10:21:39", 
                          "01/09/2019 10:52:20")
        unlist(lapply(dates,function(x) strsplit(x,":")[[1]][1]))
        

        which gives

         [1] "01/09/2019 9"  "01/09/2019 9"  "01/09/2019 9"  "01/09/2019 10" "01/09/2019 10"
         [6] "01/09/2019 10" "01/09/2019 10" "01/09/2019 10" "01/09/2019 10" "01/09/2019 10"
        

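        【Comments】:

          【Solution 6】:

          Here is another one using gsub.

          The pattern is captured with () and the captured group is referenced with \\1. The ? is needed to make the regex lazy, because each string contains more than one colon: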

          gsub("(.*?):.*", "\\1", dates)
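
To illustrate why the lazy quantifier matters, the sketch below compares the lazy pattern with a greedy one on a shortened sample. The last line shows a simpler alternative that avoids the capture group altogether; it is a suggestion added here, not part of the original answer.

```r
# Shortened sample of the question's data (illustrative only).
dates <- c("01/09/2019 9:51:03", "01/09/2019 10:52:20")

# Lazy: the group stops at the first colon.
lazy <- gsub("(.*?):.*", "\\1", dates)

# Greedy: the group runs up to the last colon, so the minutes survive.
greedy <- gsub("(.*):.*", "\\1", dates)

# Alternative without a capture group: delete from the first colon onward.
simple <- sub(":.*", "", dates)

print(lazy)
print(greedy)
print(simple)
```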
          

          【Comments】:
