【Question title】: Calculate min and max (range) by group
【Posted】: 2015-10-12 21:53:59
【Question】:

I have something like this in a data frame:

PersonId Date_Withdrawal
       A      2012-05-01   
       A      2012-06-01
       B      2012-05-01
       C      2012-05-01
       A      2012-07-01
       A      2012-10-01
       B      2012-08-01
       B      2012-12-01
       C      2012-07-01

I want to get the minimum and maximum dates for each "PersonId".
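
For anyone who wants to reproduce this, the data shown above could be rebuilt roughly as follows. The object name `df` and the choice to store the dates as character strings are assumptions, since the question shows only the printed output.

```r
# Assumed reconstruction of the question's data; `df` is a guessed name
# and the dates are kept as plain strings until converted later.
df <- data.frame(
  PersonId = c("A", "A", "B", "C", "A", "A", "B", "B", "C"),
  Date_Withdrawal = c("2012-05-01", "2012-06-01", "2012-05-01",
                      "2012-05-01", "2012-07-01", "2012-10-01",
                      "2012-08-01", "2012-12-01", "2012-07-01"),
  stringsAsFactors = FALSE
)

# Inspect the column types before any conversion
str(df)
```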

【Comments】:

    Tags: r range aggregate


    【Solution 1】:

    First, convert the column to a proper date class (always good practice); then you can simply run range by group. Here is an attempt:

    library(data.table)
    setDT(df)[, Date_Withdrawal := as.IDate(Date_Withdrawal)]
    df[, as.list(range(Date_Withdrawal)), by = PersonId]
    #    PersonId         V1         V2
    # 1:        A 2012-05-01 2012-10-01
    # 2:        B 2012-05-01 2012-12-01
    # 3:        C 2012-05-01 2012-07-01
    

    Or

    library(dplyr)
    df %>%
      mutate(Date_Withdrawal = as.Date(Date_Withdrawal)) %>%
      group_by(PersonId) %>%
      summarise(Min = min(Date_Withdrawal), Max = max(Date_Withdrawal))
    # Source: local data frame [3 x 3]
    # 
    #  PersonId        Min        Max
    #    (fctr)     (date)     (date)
    # 1        A 2012-05-01 2012-10-01
    # 2        B 2012-05-01 2012-12-01
    # 3        C 2012-05-01 2012-07-01
    

    P.S. A base R aggregate would look like aggregate(as.Date(Date_Withdrawal) ~ PersonId, df, range), but it does not preserve the Date class.
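
If a base R result that keeps the Date class is wanted, one possible workaround (not part of the original answer) is to split the dates by group and recombine the per-group minima and maxima with `c()`, which keeps the class. This is only a sketch; the data frame is rebuilt here under the assumption that it matches the question's data, and the names `parts` and `res` are made up for illustration.

```r
# Assumed copy of the question's data, with dates already converted
df <- data.frame(
  PersonId = c("A", "A", "B", "C", "A", "A", "B", "B", "C"),
  Date_Withdrawal = as.Date(c("2012-05-01", "2012-06-01", "2012-05-01",
                              "2012-05-01", "2012-07-01", "2012-10-01",
                              "2012-08-01", "2012-12-01", "2012-07-01")),
  stringsAsFactors = FALSE
)

# One Date vector per person
parts <- split(df$Date_Withdrawal, df$PersonId)

# do.call(c, ...) dispatches to c.Date, so the Date class is kept
res <- data.frame(
  PersonId = names(parts),
  Min = do.call(c, unname(lapply(parts, min))),
  Max = do.call(c, unname(lapply(parts, max))),
  stringsAsFactors = FALSE
)
print(res)
```

Both `Min` and `Max` come back as Date columns, so they can be compared or subtracted directly.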

    【Discussion】: