[Question title]: Collapse rows from 0 to 0
[Posted]: 2019-06-22 20:24:20
[Question]:

For a dataset like this

    Incident.ID..                date           product
    INCFI0000029582     2014-09-25 08:39:45     foo
    INCFI0000029582     2014-09-25 08:39:48     bar 
    INCFI0000029582     2014-09-25 08:40:44     foo
    INCFI0000029582     2014-10-10 23:04:00     foo
    INCFI0000029587     2014-09-25 08:33:32     bar
    INCFI0000029587     2014-09-25 08:34:41     bar
    INCFI0000029587     2014-09-25 08:35:24     bar
    INCFI0000029587     2014-10-10 23:04:00     foo


df <- structure(list(Incident.ID.. = c("INCFI0000029582", "INCFI0000029582", 
"INCFI0000029582", "INCFI0000029582", "INCFI0000029587", "INCFI0000029587", 
"INCFI0000029587", "INCFI0000029587"), date = c("2014-09-25 08:39:45", 
"2014-09-25 08:39:48", "2014-09-25 08:40:44", "2014-10-10 23:04:00", 
"2014-09-25 08:33:32", "2014-09-25 08:34:41", "2014-09-25 08:35:24", 
"2014-10-10 23:04:00"), product = 
c("foo","bar","foo","foo","bar","bar","bar","foo")), 
class = "data.frame", row.names = c(NA, 
-8L))

I am using the mutate function to compute the rolling time difference by ID, as follows

library(dplyr)
library(lubridate)
df1 <- df %>%
  group_by(Incident.ID..) %>%
  mutate(diff = c(0, diff(ymd_hms(date))))

This creates a column diff (time differences in seconds), as shown below

  Incident.ID..   date                 product    diff
  INCFI0000029582 2014-09-25 08:39:45  foo        0
  INCFI0000029582 2014-09-25 08:39:48  bar        3
  INCFI0000029582 2014-09-25 08:40:44  foo        56
  INCFI0000029582 2014-10-10 23:04:00  foo        1347796
  INCFI0000029587 2014-09-25 08:33:32  bar        0
  INCFI0000029587 2014-09-25 08:34:41  bar        69
  INCFI0000029587 2014-09-25 08:35:24  bar        43
  INCFI0000029587 2014-10-10 23:04:00  foo        1348116
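
A note on the diff column above: `diff()` on date-times returns a `difftime` whose unit R picks automatically from the smallest gap, and `c(0, ...)` then drops that unit. The values are seconds here only because every incident contains a gap shorter than one minute. A minimal base R sketch of pinning the unit explicitly follows; the timestamps in `ts` are made up for illustration and are not part of the question's data.

```r
# Hypothetical timestamps whose smallest gap is two minutes.
ts <- as.POSIXct(c("2014-09-25 08:00:00",
                   "2014-09-25 08:02:00",
                   "2014-09-25 08:05:00"), tz = "UTC")

gaps <- diff(ts)          # difftime; unit chosen automatically
units(gaps)               # "mins", not "secs"

# Convert with an explicit unit before prepending the leading 0.
secs <- as.numeric(gaps, units = "secs")
rolling <- c(0, secs)     # 0 120 180
```

Inside the `mutate()` call, the same idea would read `c(0, as.numeric(diff(ymd_hms(date)), units = "secs"))`.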

Now my goal is to aggregate/collapse the rows from one zero to the next zero. The expected final dataset looks like this

 Incident.ID..     DateMin              DateMax              product
 INCFI0000029582   2014-09-25 08:39:45  2014-10-10 23:04:00  foo,bar,foo,foo
 INCFI0000029587   2014-09-25 08:33:32  2014-10-10 23:04:00  bar,bar,bar,foo

I am not sure how to collapse the rows as shown above, with min and max date columns, and would appreciate help. Thanks in advance.

[Comments]:

    Tags: r dplyr tidyr collapse


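    [Answer 1]:

    The grouping set by group_by is kept after mutate, so we can summarise by group: take the min and max of 'date', and collapse 'product' by pasteing the elements together (toString is a convenient wrapper for paste(., collapse=", ")). The 'date' column is still character here, so min and max compare strings; that gives the right result because the timestamps are in the sortable "YYYY-MM-DD HH:MM:SS" format. The diff column is not used by the summary.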

    df %>%
       group_by(Incident.ID..) %>%
       mutate(diff = c(0, diff(ymd_hms(date)))) %>% 
       summarise(DateMin = min(date), 
                 DateMax = max(date), 
                 product = toString(product))
    # A tibble: 2 x 4
    #  Incident.ID..   DateMin             DateMax             product           
    #  <chr>           <chr>               <chr>               <chr>             
    #1 INCFI0000029582 2014-09-25 08:39:45 2014-10-10 23:04:00 foo, bar, foo, foo
    #2 INCFI0000029587 2014-09-25 08:33:32 2014-10-10 23:04:00 bar, bar, bar, foo
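
If "from zero to zero" is meant literally, i.e. a new block starts at every row where diff is 0 rather than one block per ID, a running counter can serve as the block key. The following is only a sketch under that assumption: the column name `run` is introduced here for illustration, the dates are parsed so that DateMin and DateMax are real date-times, and `paste(collapse = ",")` is used instead of `toString` to match the comma-only format in the question.

```r
library(dplyr)
library(lubridate)

df <- data.frame(
  Incident.ID.. = rep(c("INCFI0000029582", "INCFI0000029587"), each = 4),
  date = c("2014-09-25 08:39:45", "2014-09-25 08:39:48",
           "2014-09-25 08:40:44", "2014-10-10 23:04:00",
           "2014-09-25 08:33:32", "2014-09-25 08:34:41",
           "2014-09-25 08:35:24", "2014-10-10 23:04:00"),
  product = c("foo", "bar", "foo", "foo", "bar", "bar", "bar", "foo"),
  stringsAsFactors = FALSE
)

res <- df %>%
  mutate(date = ymd_hms(date)) %>%            # parse once, as UTC
  group_by(Incident.ID..) %>%
  mutate(diff = c(0, as.numeric(diff(date), units = "secs"))) %>%
  ungroup() %>%
  mutate(run = cumsum(diff == 0)) %>%         # `run` (hypothetical name): +1 at each 0
  group_by(run) %>%
  summarise(Incident.ID.. = first(Incident.ID..),
            DateMin = min(date),
            DateMax = max(date),
            product = paste(product, collapse = ","))
```

For the sample data every ID contains exactly one zero, so this gives the same two rows as grouping by ID. The two approaches differ if an incident contains two identical timestamps, since that also produces a diff of 0 and would start a new block.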
    

    [Comments]:
