IF else by group - check value in next row: answers

【Question title】: IF else by group - check value in next row
【Posted】: 2021-02-01 15:31:01
【Question description】:

Here is an example df:

test_df <- structure(list(plant_sp = c("plant_1", "plant_1", "plant_2", "plant_2", "plant_3",
                                       "plant_3", "plant_3", "plant_3", "plant_3", "plant_4", 
                                       "plant_4", "plant_4", "plant_4", "plant_4", "plant_4",
                                       "plant_5", "plant_5", "plant_5", "plant_5", "plant_5"), 
                          sp_rich = c(1, 1, NA, 1, NA, 
                                      1, 0, 0, NA, 0,
                                      0, 1, 0, 0, 1, 
                                      0, NA, NA, 0,NA)), 
                     row.names = c(NA, -20L), class = "data.frame", 
                     .Names = c("plant_sp", "sp_rich"))

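I am trying to create an ifelse statement with the tidyverse that does the following: for each group defined by the "plant_sp" column, check whether the value in the "sp_rich" column is NA and, if it is, set the value "1" in a new column named "is_na_nest_row". I have got this far: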

test_df %>%
group_by(plant_sp) %>%
mutate(is_na_nest_row = ifelse(???,1,0))

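But I don't know how to refer to the value of the sp_rich column in the next row (within a group).

For example:

If sp_rich is NA in row 4, I want the value "1" in row 3 of is_na_nest_row.

Thanks a lot, Ido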

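【Question discussion】:

(1) If you are using dplyr, use if_else instead, since base::ifelse has some issues. (2) Frankly, you don't need it here, because you can write mutate(is_na_nest_row = +is.na(sp_rich)) for the same effect (though the result is integer, not the floating point you have here).
Wouldn't that check whether the same row has an NA value? For example - if sp_rich is NA in row 4, I want the value "1" in row 3. I added an example because I wasn't clear before, thanks!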
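
To illustrate the point about types in the first comment, here is a minimal base R sketch. The vector `x` and the names `flag_ifelse` and `flag_plus` are made up for illustration and are not part of the question's data. Note that both variants flag the current row, not the next one, which is what the follow-up comment objects to.

```r
# Small illustrative vector (hypothetical, not the question's data).
x <- c(1, NA, 0)

# ifelse() with numeric literals 1 and 0 returns a double vector.
flag_ifelse <- ifelse(is.na(x), 1, 0)

# Unary plus coerces the logical result of is.na() to integer.
flag_plus <- +is.na(x)

typeof(flag_ifelse)  # "double"
typeof(flag_plus)    # "integer"
```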

Tags: r tidyverse

【Solution 1】:

Edit: it should work correctly now.

You can reach the next row inside mutate by indexing with row_number() + 1. So I think this is the correct solution.

test_df %>%
  group_by(plant_sp) %>%
  mutate(Test = ifelse(is.na(sp_rich[row_number() + 1]), 1, 0),  # look at the next row within the group
         Test = c(Test[-n()], 0))                                # the last row of a group has no next row, so force 0

With the output

# A tibble: 20 x 3
# Groups:   plant_sp [5]
   plant_sp sp_rich  Test
   <chr>      <dbl> <dbl>
 1 plant_1        1     0
 2 plant_1        1     0
 3 plant_2       NA     0
 4 plant_2        1     0
 5 plant_3       NA     0
 6 plant_3        1     0
 7 plant_3        0     0
 8 plant_3        0     1
 9 plant_3       NA     0
10 plant_4        0     0
11 plant_4        0     0
12 plant_4        1     0
13 plant_4        0     0
14 plant_4        0     0
15 plant_4        1     0
16 plant_5        0     1
17 plant_5       NA     1
18 plant_5       NA     0
19 plant_5        0     1
20 plant_5       NA     0
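
The second `Test = ...` step is needed because indexing one position past the end of a vector returns NA, so the last row of every group would otherwise be flagged. A minimal base R sketch of that behaviour for a single hypothetical group follows; `seq_along(x)` stands in for what `row_number()` produces inside a grouped `mutate`, and the names `x`, `idx`, `raw`, and `fixed` are illustrative only.

```r
# One hypothetical group of sp_rich values.
x <- c(0, 0, NA)

# Positions of the "next" element; the last one (4) is past the end of x.
idx <- seq_along(x) + 1

# Out-of-range indexing yields NA, so the last element looks like
# "next row is NA" even though there is no next row.
raw <- ifelse(is.na(x[idx]), 1, 0)   # 0 1 1

# Drop the spurious last flag and put 0 in its place.
fixed <- c(raw[-length(raw)], 0)     # 0 1 0
```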

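【Discussion】:

Thanks! The problem is that I do need the grouping. For example, in row 2 the value is 1 even though the NA row belongs to another group, "plant_2". Is there a way to do it by group?
Yes, now I think it should work! See the new code and output.

【Solution 2】: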

test_df %>%
  group_by(plant_sp) %>%
  mutate(is_na_nest_row = +any(is.na(sp_rich)))
# # A tibble: 20 x 3
# # Groups:   plant_sp [5]
#    plant_sp sp_rich is_na_nest_row
#    <chr>      <dbl>          <int>
#  1 plant_1        1              0
#  2 plant_1        1              0
#  3 plant_2       NA              1
#  4 plant_2        1              1
#  5 plant_3       NA              1
#  6 plant_3        1              1
#  7 plant_3        0              1
#  8 plant_3        0              1
#  9 plant_3       NA              1
# 10 plant_4        0              0
# 11 plant_4        0              0
# 12 plant_4        1              0
# 13 plant_4        0              0
# 14 plant_4        0              0
# 15 plant_4        1              0
# 16 plant_5        0              1
# 17 plant_5       NA              1
# 18 plant_5       NA              1
# 19 plant_5        0              1
# 20 plant_5       NA              1

Or, if it is just the next row,

test_df %>%
  group_by(plant_sp) %>%
  mutate(is_na_nest_row = +(lead(is.na(sp_rich), default = FALSE)))
# # A tibble: 20 x 3
# # Groups:   plant_sp [5]
#    plant_sp sp_rich is_na_nest_row
#    <chr>      <dbl>          <int>
#  1 plant_1        1              0
#  2 plant_1        1              0
#  3 plant_2       NA              0
#  4 plant_2        1              0
#  5 plant_3       NA              0
#  6 plant_3        1              0
#  7 plant_3        0              0
#  8 plant_3        0              1
#  9 plant_3       NA              0
# 10 plant_4        0              0
# 11 plant_4        0              0
# 12 plant_4        1              0
# 13 plant_4        0              0
# 14 plant_4        0              0
# 15 plant_4        1              0
# 16 plant_5        0              1
# 17 plant_5       NA              1
# 18 plant_5       NA              0
# 19 plant_5        0              1
# 20 plant_5       NA              0
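
As a cross-check that does not depend on dplyr, the same "next row within the group is NA" flag can be sketched in base R with `ave()`. This is an alternative formulation, not the answerer's code; the helper name `next_is_na` is made up, and the data frame is rebuilt compactly with `rep()` under the assumption that it matches the question's `test_df`.

```r
# Compact rebuild of the question's data (same 20 rows, same order).
test_df <- data.frame(
  plant_sp = rep(paste0("plant_", 1:5), times = c(2, 2, 5, 6, 5)),
  sp_rich  = c(1, 1, NA, 1, NA, 1, 0, 0, NA, 0,
               0, 1, 0, 0, 1, 0, NA, NA, 0, NA)
)

# Hypothetical helper: shift a logical vector up by one position and
# pad with FALSE, because the last row of a group has no next row.
next_is_na <- function(v) c(v[-1], FALSE)

# ave() applies the helper within each plant_sp group; unary plus
# turns the logical result into 0/1 integers.
test_df$is_na_nest_row <- +ave(is.na(test_df$sp_rich),
                               test_df$plant_sp,
                               FUN = next_is_na)

which(test_df$is_na_nest_row == 1)  # 8 16 17 19
```

The flagged rows agree with the lead() output above: rows 8, 16, 17 and 19.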

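【Discussion】:

Wouldn't that check whether the same row has an NA value? For example - if sp_rich is NA in row 4, I want the value "1" in row 3. I added an example because I wasn't clear before, thanks!
You know, it would be really helpful if you provided the actual expected output for the sample data. It removes most of the ambiguity.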