【Question title】: How do I split my time data into intervals in R?
【Posted】: 2020-12-10 16:19:24
【Question description】:

I have some data that looks like this:

time                 author             text               day     times     timeblock  dayblock
2019-08-02 12:16:40|"ab5c9c0a"|"This message was deleted"  |2|   "12:16:40"| "Cycle 1"|  "No"
2019-08-02 12:36:40|"ab5c9c0a"|"Please take a survey"      |2|   "12:36:40"| "Cycle 1"|  "No"
2019-08-02 13:29:40|"43cd8b94"|"Done :D"                   |2|   "13:29:40"| "Cycle 1"|  "No"
2019-08-02 17:41:40|"083fa508"|"<Media omitted>"           |2|   "17:41:40"| "Cycle 1"|  "No"
str(chat)

Classes ‘data.table’ and 'data.frame':  16111 obs. of  7 variables:
 $ time     : POSIXct, format: "2019-08-02 12:16:40" "2019-08-02 12:35:40" "2019-08-02 12:36:40" ...
 $ author   : chr  "ab5c9c0a" "ab5c9c0a" "ab5c9c0a" "43cd8b94" ...
 $ text     : chr  "This message was deleted" "https://docs.google.com/forms/d/e/1FAIpQLSf4hE" "Please take a survey" "Done :D" ...
 $ day      : int  2 2 2 2 2 3 3 3 3 3 ...
 $ times    : chr  "12:16:40" "12:35:40" "12:36:40" "13:29:40" ...
 $ timeblock: Factor w/ 13 levels "Cycle 1","Cycle 2",..:

I wrote this to classify the times into blocks such as 7 AM, 10 PM, and so on:

chat <- chat %>% 
mutate(
 # Time Segements
 dayblock = case_when(
 time >= hms(070000) & time <= hms(080000) ~ "7 AM",
 time >= hms(080000) & time <= hms(090000) ~ "8 AM",
 time >= hms(090000) & time <= hms(100000) ~ "9 AM",
 time >= hms(100000) & time <= hms(110000) ~ "10 AM",
 time >= hms(110000) & time <= hms(120000) ~ "11 AM",
 time >= hms(120000) & time <= hms(130000) ~ "12 PM",
 time >= hms(130000) & time <= hms(140000) ~ "1 PM",
 time >= hms(140000) & time <= hms(150000) ~ "2 PM",
 time >= hms(150000) & time <= hms(160000) ~ "3 PM",
 time >= hms(160000) & time <= hms(170000) ~ "4 PM",
 time >= hms(170000) & time <= hms(180000) ~ "5 PM",
 time >= hms(180000) & time <= hms(190000) ~ "6 PM",
 time >= hms(190000) & time <= hms(200000) ~ "7 PM",
 time >= hms(200000) & time <= hms(210000) ~ "8 PM",
 time >= hms(210000) & time <= hms(220000) ~ "9 PM",
 time >= hms(220000) & time <= hms(230000) ~ "10 PM",
 time >= hms(230000) & time <= hms(000000) ~ "11 PM",
 time >= hms(000000) & time <= hms(010000) ~ "12 AM",
 time >= hms(010000) & time <= hms(020000) ~ "1 AM",
 time >= hms(020000) & time <= hms(030000) ~ "2 AM",
 time >= hms(030000) & time <= hms(040000) ~ "3 AM",
 time >= hms(040000) & time <= hms(050000) ~ "4 AM",
 time >= hms(050000) & time <= hms(060000) ~ "5 AM",
 time >= hms(060000) & time <= hms(070000) ~ "6 AM",
 T ~ "No")) %>% 
  mutate(dayblock = factor(dayblock))

The expected output is:

time                 author             text               day     times     timeblock  dayblock
2019-08-02 12:16:40|"ab5c9c0a"|"This message was deleted"  |2|   "12:16:40"| "Cycle 1"|  12 PM
2019-08-02 12:36:40|"ab5c9c0a"|"Please take a survey"      |2|   "12:36:40"| "Cycle 1"|  12 PM
2019-08-02 13:29:40|"43cd8b94"|"Done :D"                   |2|   "13:29:40"| "Cycle 1"|  1 PM
2019-08-02 17:41:40|"083fa508"|"<Media omitted>"           |2|   "17:41:40"| "Cycle 1"|  5 PM

But when I run it, every row is filled with the value No. What am I doing wrong?

The current error is:

Problem with `mutate()` input `dayblock`.
i Some strings failed to parse, or all strings are NAs
i Input `dayblock` is `case_when(...)`.

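Edit: Although the accepted answer solves the problem, @Istrel's answer is a more elegant solution, and I recommend that users try it.

【Comments】:

  • The reason is that hms(070000) is returning NA (# [1] NA).
  • I'm not sure how to fix this. What should I change?
  • @namban Can you post a dput example?
  • @A5C1D2H2I1M1N2O1R2T1 I'd prefer to use this approach, because I'm not very familiar with cut.
  • @namnban You may need to change the format to hms("07:00:00").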
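
The two points raised in the comments can be checked in isolation. The following is a minimal sketch, assuming lubridate is installed; the variable names (seven, stamp, in_noon_block) are made up for illustration. It shows that hms() is meant to receive "HH:MM:SS" strings, and that the time column in the question is a full date-time, so the hour has to be extracted (or the times strings parsed) before comparing against clock times.

```r
library(lubridate)

# hms() parses character strings written as "HH:MM:SS" into a Period.
seven <- hms("07:00:00")
hour(seven)

# `time` in the question is a POSIXct date-time, not a time of day, so the
# hour has to be pulled out of it before comparing against clock times.
stamp <- ymd_hms("2019-08-02 12:16:40")
hour(stamp)

# Comparing like with like: a time-of-day Period against Period boundaries.
in_noon_block <- hms("12:16:40") >= hms("12:00:00") &
  hms("12:16:40") < hms("13:00:00")
in_noon_block
```

Using < on the upper boundary keeps the blocks from overlapping, so 13:00:00 falls into the 1 PM block only.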

Tags: r time lubridate


【Solution 1】:

It looks like you can achieve the same thing with the format function.

library(tidyverse)
library(lubridate)

chat <- tibble(time = ymd_hms(c("2019-08-02 12:16:40", "2019-08-02 12:36:40", 
    "2019-08-02 13:29:40", "2019-08-02 3:29:40", "2019-08-02 2:01:40")))

chat <- chat %>%
  mutate(dayblock =  format(time, "%I %p"))

#   time                dayblock
#   <dttm>              <chr>   
# 1 2019-08-02 12:16:40 12 PM   
# 2 2019-08-02 12:36:40 12 PM   
# 3 2019-08-02 13:29:40 01 PM   
# 4 2019-08-02 03:29:40 03 AM   
# 5 2019-08-02 02:01:40 02 AM

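【Comments】:

  • What does hour(time) >= 6 do? This is really nice and very fast, but some times such as 02:01:40 and 00:21:40 are labeled No.
  • hour(time) is a function from lubridate that returns the hour as a number. So any entry whose hour is at least 6 and less than 23 (11 PM) is converted to its string representation. Everything else gets No.
  • Right! I have changed it to hour(time) >= 0 to cover the times between midnight and 6 AM. Thank you for the very elegant solution. What I was attempting now looks rather caveman-like.
  • Sorry, I misunderstood your task. You don't need ifelse at all. I have changed the code; please try it now.
  • That makes sense! Honestly, it looks like black magic to me! Where can I read more documentation about the %I %p bit? I'd like to understand what is happening so that I can use it for other things.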
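
On the last question: the %I and %p codes are conversion specifications described on the ?strptime help page in base R. A small sketch of what they do, using only base R; the variable names are illustrative, and the text produced by %p depends on the session locale (in an English locale it is AM or PM).

```r
# One timestamp, with an explicit time zone so the result is reproducible.
x <- as.POSIXct("2019-08-02 13:29:40", tz = "UTC")

hour24 <- format(x, "%H")   # hour on the 24-hour clock, zero-padded
hour12 <- format(x, "%I")   # hour on the 12-hour clock, zero-padded
block  <- format(x, "%I %p") # 12-hour clock plus the locale's AM/PM marker

# The expected output in the question has no leading zero ("1 PM"),
# so strip a single leading "0" if one is present.
label <- sub("^0", "", block)

hour24
hour12
label
```

The same format strings work with strftime, since format on a date-time and strftime share these conversion specifications.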
【Solution 2】:

One option is to change the format of the values passed to hms:

library(dplyr)
library(lubridate)
chat %>% 
  mutate(
    # parse the "HH:MM:SS" strings into lubridate Period objects
    times = hms(times),
    # compare against Periods built from strings, not bare numbers
    dayblock = factor(case_when(
      times >= hms('07:00:00') & times <= hms('08:00:00') ~ "7 AM",
      times >= hms('08:00:00') & times <= hms('09:00:00') ~ "8 AM",
      times >= hms('12:00:00') & times <= hms('13:00:00') ~ "12 PM",
      TRUE ~ "No"))
  )
   

Output:

#            time   author                     text day       times timeblock dayblock
#1 2019-08-02 12:16:40 ab5c9c0a This message was deleted   2 12H 16M 40S   Cycle 1    12 PM
#2 2019-08-02 12:36:40 ab5c9c0a     Please take a survey   2 12H 36M 40S   Cycle 1    12 PM
#3 2019-08-02 13:29:40 43cd8b94                  Done :D   2 13H 29M 40S   Cycle 1       No
#4 2019-08-02 17:41:40 083fa508          <Media omitted>   2 17H 41M 40S   Cycle 1       No

Data:

chat <- structure(list(time = c("2019-08-02 12:16:40", "2019-08-02 12:36:40", 
"2019-08-02 13:29:40", "2019-08-02 17:41:40"), author = c("ab5c9c0a", 
"ab5c9c0a", "43cd8b94", "083fa508"), text = c("This message was deleted", 
"Please take a survey", "Done :D", "<Media omitted>"), day = c(2L, 
2L, 2L, 2L), times = c("12:16:40", "12:36:40", "13:29:40", "17:41:40"
), timeblock = c("Cycle 1", "Cycle 1", "Cycle 1", "Cycle 1"), 
    dayblock = c("No", "No", "No", "No")), class = "data.frame", 
 row.names = c(NA, 
-4L))       

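【Comments】:

  • Wow, this is great! I use the same code to label dates, so I assumed it would work here too. Thank you very much.
  • @namnban Here I used only a few conditions to check that it works. It works fine. You can add the other expressions as well.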
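
Instead of adding the remaining hourly conditions by hand, the label can be computed from the hour. This is a sketch of one possible alternative, not the answerer's code: it assumes lubridate is installed, uses a made-up vector of times (the last value is added to exercise the midnight case), and the names h, hour12 and dayblock are illustrative.

```r
library(lubridate)

# Time-of-day strings, as in the `times` column of the question.
times <- c("12:16:40", "12:36:40", "13:29:40", "17:41:40", "00:21:40")

# Hour of day (0-23) taken from the parsed Periods.
h <- as.integer(hour(hms(times)))

# Map 0-23 onto the 12-hour clock: 0 -> 12, 13 -> 1, 23 -> 11.
hour12 <- (h + 11L) %% 12L + 1L

# Attach the AM/PM suffix.
dayblock <- paste(hour12, ifelse(h < 12L, "AM", "PM"))
dayblock
# → "12 PM" "12 PM" "1 PM" "5 PM" "12 AM"
```

Because every hour maps to exactly one label, no fallback such as "No" is needed, and boundary values like 08:00:00 cannot match two blocks.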
【Solution 3】:

We can also use strftime from base R:

library(tibble)

# time is kept as character here; strftime converts it to a date-time internally
chat <- tibble(time = c("2019-08-02 12:16:40", "2019-08-02 12:36:40", 
                        "2019-08-02 13:29:40", "2019-08-02 17:41:40"))
chat$dayblock <- strftime(chat$time, "%I %p")

 time                dayblock
  <chr>               <chr>   
1 2019-08-02 12:16:40 12 PM   
2 2019-08-02 12:36:40 12 PM   
3 2019-08-02 13:29:40 01 PM   
4 2019-08-02 17:41:40 05 PM 
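
A base R variant with cut, which was mentioned in the comments on the question, is sketched below. It is an illustration under assumptions rather than part of this answer: the time zone is fixed to UTC so the parsing does not depend on the session, and the names parsed, hour24, labels and dayblock are made up. Unlike the character column above, the result is a factor whose levels run in clock order from 12 AM to 11 PM.

```r
time <- c("2019-08-02 12:16:40", "2019-08-02 12:36:40",
          "2019-08-02 13:29:40", "2019-08-02 17:41:40")

# Parse explicitly instead of relying on implicit character conversion.
parsed <- as.POSIXct(time, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Hour of day as an integer in 0-23.
hour24 <- as.integer(format(parsed, "%H"))

# 24 labels in clock order: "12 AM", "1 AM", ..., "11 AM", "12 PM", ..., "11 PM".
labels <- paste(c(12, 1:11), rep(c("AM", "PM"), each = 12))

# Half-open bins [0,1), [1,2), ..., [23,24), one per hour.
dayblock <- cut(hour24, breaks = 0:24, labels = labels, right = FALSE)
as.character(dayblock)
# → "12 PM" "12 PM" "1 PM" "5 PM"
```

Keeping the levels in clock order means tables and plots of dayblock sort chronologically rather than alphabetically.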

【Comments】:
