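【Question title】:Mutating character to date/time in R
【Posted】:2021-10-26 22:27:37
【Question】:

I'm new to R and am having trouble converting character columns to date/time. I'm trying to convert the started_at and ended_at columns from character to date/time, but whatever I try I get the errors "NAs introduced by coercion" or "character string is not in a standard unambiguous format". The goal is to create a new column, ride_length, holding the difference between ended_at and started_at in minutes.

My df is named sep_2021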

str(sep_2021)

spec_tbl_df [804,352 × 14] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ ride_id           : chr [1:804352] "9DC7B962304CBFD8" "F930E2C6872D6B32" "6EF72137900BB910" "78D1DE133B3DBF55" ...
 $ rideable_type     : chr [1:804352] "electric_bike" "electric_bike" "electric_bike" "electric_bike" ...
 $ started_at        : chr [1:804352] "9/28/21 16:07" "9/28/21 14:24" "9/28/21 00:20" "9/28/21 14:51" ...
 $ ended_at          : chr [1:804352] "9/28/21 16:09" "9/28/21 14:40" "9/28/21 00:23" "9/28/21 15:00" ...
 $ day_of_week       : num [1:804352] 3 3 3 3 3 3 3 3 2 3 ...
 $ start_station_name: chr [1:804352] NA NA NA NA ...
 $ start_station_id  : chr [1:804352] NA NA NA NA ...
 $ end_station_name  : chr [1:804352] NA NA NA NA ...
 $ end_station_id    : chr [1:804352] NA NA NA NA ...
 $ start_lat         : num [1:804352] 41.9 41.9 41.8 41.8 41.9 ...
 $ start_lng         : num [1:804352] -87.7 -87.6 -87.7 -87.7 -87.7 ...
 $ end_lat           : num [1:804352] 41.9 42 41.8 41.8 41.9 ...
 $ end_lng           : num [1:804352] -87.7 -87.7 -87.7 -87.7 -87.7 ...
 $ member_casual     : chr [1:804352] "casual" "casual" "casual" "casual" ...
 - attr(*, "spec")=
  .. cols(
  ..   ride_id = col_character(),
  ..   rideable_type = col_character(),
  ..   started_at = col_character(),
  ..   ended_at = col_character(),
  ..   day_of_week = col_double(),
  ..   start_station_name = col_character(),
  ..   start_station_id = col_character(),
  ..   end_station_name = col_character(),
  ..   end_station_id = col_character(),
  ..   start_lat = col_double(),
  ..   start_lng = col_double(),
  ..   end_lat = col_double(),
  ..   end_lng = col_double(),
  ..   member_casual = col_character()
  .. )
 - attr(*, "problems")=<externalptr> 

I tried the following:

sep_2021 <- mutate(sep_2021, started_at = as.Date(started_at))

Result: character string is not in a standard unambiguous format

sep_2021 <- mutate(sep_2021, started_at = as.Date.POSIXct(started_at, tz = "", tryFormats = c("%Y-%m-%d %H:%M:%OS","%Y/%m/%d %H:%M:%OS")))

Result: character string is not in a standard unambiguous format

sep_2021 <- mutate(sep_2021, started_at = lubridate::as_datetime(started_at))

Result: All formats failed to parse. No formats found.

sep_2021 <- mutate(sep_2021, started_at = as.Date(started_at, "%m-%d-%y %H:%M:%OS"))

Result: NAs introduced by coercion

Any and all advice or suggestions are greatly appreciated!

【Comments】:

    Tags: r datetime


    【Solution 1】:

    We can use parse_date from the parsedate package

    library(dplyr)
    library(parsedate)
    sep_2021 <- sep_2021 %>%
        mutate(across(c(started_at, ended_at), parse_date))
    

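    The format you passed does not match the format in the columns: the values use slashes and have no seconds, so it should be %m/%d/%y %H:%M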

    sep_2021 <- sep_2021 %>%
          mutate(across(c(started_at, ended_at), as.POSIXct,
          format = "%m/%d/%y %H:%M"))
    
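    To see why the earlier attempt returned NA, here is a minimal base R check using two sample values from the question. The tz = "UTC" argument is an assumption made only so the result is the same on every machine; the question does not state a time zone.

```r
# Sample values copied from the question's str() output
x <- c("9/28/21 16:07", "9/28/21 14:24")

# Dashes and a seconds field do not match the strings, so every value parses to NA
bad <- as.POSIXct(x, format = "%m-%d-%y %H:%M:%OS", tz = "UTC")

# This format mirrors the strings: month/day/two-digit year, then hour:minute
good <- as.POSIXct(x, format = "%m/%d/%y %H:%M", tz = "UTC")

print(bad)
print(good)
```

    The format string has to match the input character for character, including the separators; any mismatch makes strptime-style parsing return NA rather than guess.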

    【Comments】:

    • This works great, thanks akrun! A follow-up question: what is the best way to calculate a duration from POSIXct values? When I try difftime I get all 0s, and when I use as.duration I get the error "as.duration is not defined for class 'POSIXct'".
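    【Solution 2】:

    You can use mdy_hm from lubridate to change the class from character to POSIXct. To calculate the difference, use difftime and pass it the units you want.

    For example, to get the difference in seconds, you can do: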

    library(dplyr)
    library(lubridate)
    
    sep_2021 <- sep_2021 %>%
      mutate(across(c(started_at, ended_at), mdy_hm), 
             diff = difftime(ended_at, started_at, units = 'secs'))
    
    sep_2021
    
    #           started_at            ended_at     diff
    #1 2021-09-28 16:07:00 2021-09-28 16:09:00 120 secs
    #2 2021-09-28 14:24:00 2021-09-28 14:40:00 960 secs
    #3 2021-09-28 00:20:00 2021-09-28 00:23:00 180 secs
    #4 2021-09-28 14:51:00 2021-09-28 15:00:00 540 secs
    
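    Since the question asks for ride_length in minutes, the same idea can be sketched in base R with units = "mins". This is only an illustration on two sample rows from the question; tz = "UTC" and the plain-numeric ride_length column are assumptions, not something the question specifies.

```r
# Two sample rows taken from the question
sep_2021 <- data.frame(
  started_at = c("9/28/21 16:07", "9/28/21 14:24"),
  ended_at   = c("9/28/21 16:09", "9/28/21 14:40")
)

# Parse both columns with a format that matches the strings
fmt <- "%m/%d/%y %H:%M"
sep_2021$started_at <- as.POSIXct(sep_2021$started_at, format = fmt, tz = "UTC")
sep_2021$ended_at   <- as.POSIXct(sep_2021$ended_at, format = fmt, tz = "UTC")

# difftime returns a difftime object; as.numeric() drops the units label
sep_2021$ride_length <- as.numeric(
  difftime(sep_2021$ended_at, sep_2021$started_at, units = "mins")
)

print(sep_2021)
```

    Converting to numeric is optional; keeping the difftime object preserves the units label when printing, while a plain number is often easier to summarise or plot.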

    Data

    It would be easier to help if you provided the data in a reproducible format

    sep_2021 <- data.frame(started_at = c("9/28/21 16:07", "9/28/21 14:24" ,"9/28/21 00:20" ,"9/28/21 14:51"), 
                      ended_at = c("9/28/21 16:09" ,"9/28/21 14:40", "9/28/21 00:23", "9/28/21 15:00"))
    

    【Comments】:
