Create an index column by groups of date and id: answers

【Question title】: Create an index column by groups of date and id
【Posted】: 2020-04-07 04:49:10
【Question description】:

R / Stack Overflow newbie here.

I have a dataset that looks like this:

    year mother_id   nest_seq incubation_date
   <int> <chr>          <int> <date>         
 1  1994 543762-MMM         1 1994-10-16     
 2  1994 543762-MMM         3 1994-11-06     
 3  1994 543762-MMM         4 1994-11-24     
 4  1994 543762-MMM         6 1994-12-05     
 5  1995 583809-mGMW        4 1994-10-24     
 6  1995 583809-mGMW        7 1994-11-21     
 7  1995 583809-mGMW        8 1994-12-22     
 8  1996 596751-BWM         1 1994-11-20     
 9  1996 596751-BWM         2 1994-12-23     
10  1996 626691-GBW         2 1994-11-08

I simply want to generate a new nest_seq based on incubation_date, numbered within each year and mother_id, for example:

    year mother_id   nest_seq incubation_date new_nest_seq
   <int> <chr>          <int> <date>                 <int>
 1  1994 543762-MMM         1 1994-10-16                 1
 2  1994 543762-MMM         3 1994-11-06                 2
 3  1994 543762-MMM         4 1994-11-24                 3
 4  1994 543762-MMM         6 1994-12-05                 4

I have been trying to do this with if_else(), but I'm stuck:

group_by(year, mother_id) %>%
mutate(new_nest_seq = if_else(min(incubation_date), 1, ?)))

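Any suggestions are greatly appreciated.

【Question comments】:

R for Data Science (available online as a website) has, from memory, a chapter on the lubridate package; that might be a good place to start.

Tags: r indexing group-by dplyr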

【Solution 1】:

Try this:

library(dplyr)

df %>%
  # Just in case: order 
  arrange(year, mother_id, incubation_date) %>% 
  group_by(year, mother_id) %>%
  # Create new index
  mutate(
    new_nest_seq = 1, 
    new_nest_seq = cumsum(new_nest_seq)) %>% 
  ungroup()
#> # A tibble: 10 x 5
#>     year mother_id   nest_seq incubation_date new_nest_seq
#>    <int> <fct>          <int> <fct>                  <dbl>
#>  1  1994 543762-MMM         1 1994-10-16                 1
#>  2  1994 543762-MMM         3 1994-11-06                 2
#>  3  1994 543762-MMM         4 1994-11-24                 3
#>  4  1994 543762-MMM         6 1994-12-05                 4
#>  5  1995 583809-mGMW        4 1994-10-24                 1
#>  6  1995 583809-mGMW        7 1994-11-21                 2
#>  7  1995 583809-mGMW        8 1994-12-22                 3
#>  8  1996 596751-BWM         1 1994-11-20                 1
#>  9  1996 596751-BWM         2 1994-12-23                 2
#> 10  1996 626691-GBW         2 1994-11-08                 1
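The `cumsum()` step above works because a running sum of ones is simply a counter that restarts in every group. A minimal base R sketch of that idea, using a hypothetical toy vector `g` of group labels (not the question's data) and `ave()` to apply a function within each group:

```r
# Hypothetical group labels, assumed already ordered within each group.
g <- c("a", "a", "a", "b", "b", "c")

# A running sum of ones counts 1, 2, 3, ... and restarts for each group.
idx_cumsum <- ave(rep(1, length(g)), g, FUN = cumsum)

# seq_along() produces the same per-group counter directly.
idx_seq <- ave(seq_along(g), g, FUN = seq_along)

print(idx_cumsum)
print(idx_seq)
```

Both vectors hold the same within-group positions, which is why the two-step `mutate()` and a direct row counter are interchangeable here.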

Or via `dplyr::row_number`:

df %>%
  # Just in case: order 
  arrange(year, mother_id, incubation_date) %>% 
  group_by(year, mother_id) %>%
  # Create new index
  mutate(new_nest_seq = dplyr::row_number()) %>% 
  ungroup()

Created on 2020-04-07 by the reprex package (v0.3.0)
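For anyone who wants to run this end to end, here is a self-contained sketch that rebuilds the question's sample data and numbers the nests without sorting first. It is a variation on the answer, not the answerer's exact code: it passes the date to `row_number()`, which ranks the values within each group, so the `arrange()` step is only there to make the printed result easy to read. The shuffled copy `shuffled` and the result name `ranked` are illustrative names.

```r
library(dplyr)

# Sample data copied from the question.
df <- data.frame(
  year = c(1994L, 1994L, 1994L, 1994L, 1995L, 1995L, 1995L, 1996L, 1996L, 1996L),
  mother_id = c(rep("543762-MMM", 4), rep("583809-mGMW", 3),
                rep("596751-BWM", 2), "626691-GBW"),
  nest_seq = c(1L, 3L, 4L, 6L, 4L, 7L, 8L, 1L, 2L, 2L),
  incubation_date = as.Date(c(
    "1994-10-16", "1994-11-06", "1994-11-24", "1994-12-05",
    "1994-10-24", "1994-11-21", "1994-12-22",
    "1994-11-20", "1994-12-23", "1994-11-08"
  )),
  stringsAsFactors = FALSE
)

# Shuffle the rows to show that the ranking does not rely on row order.
set.seed(1)
shuffled <- df[sample(nrow(df)), ]

ranked <- shuffled %>%
  group_by(year, mother_id) %>%
  # Rank each nest by its date within the group; ties are broken by row order.
  mutate(new_nest_seq = row_number(incubation_date)) %>%
  ungroup() %>%
  # Sorting here is only for display.
  arrange(year, mother_id, incubation_date)

print(ranked)
```

If two nests of the same mother could share a date and should then share a number, `min_rank()` or `dense_rank()` could be swapped in for `row_number()`.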

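【Comments】:

Thanks @stefan, this is exactly what I needed.
My pleasure. By the way, you could also use mutate(new_nest_seq = dplyr::row_number()); I have added this as a second solution. If you want to do me a favor, mark the question as answered. Besides giving me some kudos, it shows others with a similar problem that the solution works, and it removes the question from the queue of questions still waiting for an answer.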
