Does mutate work with the rep function?

【Question title】: Does mutate work with the rep function?
【Posted】: 2018-05-30 10:08:17
【Question】:

I have a tibble called "confidence_table". Does anyone know why adding a new column with the mutate verb does not work?

# A tibble: 12 x 3
# Groups:   Age [2]
  Age   Condition       Prop
<fctr>    <fctr>      <dbl>
   0       old      0.73993056
   1       old      0.75590278
   0       old      0.15069444
   1       old      0.13090278
   0       new      0.06388889
   1       new      0.04965278
   0       new      0.05902778
   1       new      0.05416667
   0      lure      0.23055556
   1      lure      0.23645833
   0      lure      0.13819444
   1      lure      0.12013889

I used this base R code, and it does work:

confidence_table$Confidence <- as.factor(rep(c("HC", "LC"), times = 3, each = 2))

# A tibble: 12 x 4
# Groups:   Age [2]
 Age   Condition     Prop Confidence
<fctr>    <fctr>      <dbl>     <fctr>
  0       old      0.73993056     HC
  1       old      0.75590278     HC      
  0       old      0.15069444     LC
  1       old      0.13090278     LC
  0       new      0.06388889     HC
  1       new      0.04965278     HC
  0       new      0.05902778     LC
  1       new      0.05416667     LC
  0      lure      0.23055556     HC
  1      lure      0.23645833     HC
  0      lure      0.13819444     LC
  1      lure      0.12013889     LC

This is the expected output, produced with base R code. However, if I use:

confidence_table <- confidence_table %>%
                    mutate(Confidence = rep(c("HC", "LC"), times = 3, each = 2))

it says: Error in mutate_impl(.data, dots) : Column `Confidence` must be length 6 (the group size) or one, not 12

What is wrong here?

【Comments】:

Tags: r dplyr rep

【Solution 1】:

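In this case the error message actually tells you what is wrong. Note that 2 x 3 x 2 = 12: calling rep on a two-element vector with each = 2 and times = 3 returns 12 values. The tibble is grouped by Age, however, so mutate evaluates the expression once per group, and each group has only 6 rows.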

confidence_table %>%
  mutate(Confidence = rep(c("HC", "LC"), times = 3, each = 2))
# Error in mutate_impl(.data, dots) : 
#   Column `Confidence` must be length 6 (the group size) or one, not 12
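To make the arithmetic concrete, here is a small base R sketch (no dplyr needed) that checks the two lengths involved. The variable names `full`, `age`, and `group_sizes` are illustrative and do not appear in the original post.

```r
# Length of the vector built in the question:
# 2 values, each repeated 2 times, the whole pattern repeated 3 times
full <- rep(c("HC", "LC"), times = 3, each = 2)
length(full)

# Rows per Age group: in the sample data Age alternates 0, 1 over 12 rows
age <- rep(c(0L, 1L), times = 6)
group_sizes <- table(age)
group_sizes
```

The vector has 12 elements while each Age group has 6 rows, which is the mismatch the error message reports.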

As pointed out in the comments, one way to fix this is to call ungroup first.

confidence_table %>%
  ungroup() %>%
  mutate(Confidence = rep(c("HC", "LC"), times = 3, each = 2))
# # A tibble: 12 x 4
#      Age Condition       Prop Confidence
#    <int>     <chr>      <dbl>      <chr>
#  1     0       old 0.73993056         HC
#  2     1       old 0.75590278         HC
#  3     0       old 0.15069444         LC
#  4     1       old 0.13090278         LC
#  5     0       new 0.06388889         HC
#  6     1       new 0.04965278         HC
#  7     0       new 0.05902778         LC
#  8     1       new 0.05416667         LC
#  9     0      lure 0.23055556         HC
# 10     1      lure 0.23645833         HC
# 11     0      lure 0.13819444         LC
# 12     1      lure 0.12013889         LC

You can also do this without ungrouping first:

confidence_table %>% 
  mutate(Confidence = rep(c("HC", "LC"), times = 3)) # 2x3 = 6
# # A tibble: 12 x 4
# # Groups:   Age [2]
#      Age Condition       Prop Confidence
#    <int>     <chr>      <dbl>      <chr>
#  1     0       old 0.73993056         HC
#  2     1       old 0.75590278         HC
#  3     0       old 0.15069444         LC
#  4     1       old 0.13090278         LC
#  5     0       new 0.06388889         HC
#  6     1       new 0.04965278         HC
#  7     0       new 0.05902778         LC
#  8     1       new 0.05416667         LC
#  9     0      lure 0.23055556         HC
# 10     1      lure 0.23645833         HC
# 11     0      lure 0.13819444         LC
# 12     1      lure 0.12013889         LC
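Why a length-6 vector yields the same column is easier to see by imitating the per-group assignment by hand. The following is a rough base R sketch of that idea, not dplyr's actual implementation; `age`, `per_group`, and `confidence` are made-up names.

```r
age <- rep(c(0L, 1L), times = 6)             # Age column of the sample data
per_group <- rep(c("HC", "LC"), times = 3)   # length 6: one value per row of a group

# Fill each group's rows, in order, with the same length-6 vector
confidence <- character(length(age))
for (g in unique(age)) {
  confidence[age == g] <- per_group
}

# Compare with the length-12 vector used on the ungrouped table
same <- identical(confidence, rep(c("HC", "LC"), times = 3, each = 2))
same
```

Because the rows of the two Age groups are interleaved, filling each group with HC, LC, HC, LC, HC, LC gives the HC, HC, LC, LC pattern over the full table.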

Another option is to group by "Condition", perhaps like this:

confidence_table %>% 
  group_by(Condition) %>% 
  mutate(Confidence = c("HC", "LC")[cumsum(Age == 0)])
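The indexing trick in that last line can be checked in isolation. This is a minimal base R sketch for a single Condition group, assuming its four rows are ordered as in the sample data (Age 0, 1, 0, 1); `age`, `idx`, and `conf_labels` are illustrative names.

```r
age <- c(0L, 1L, 0L, 1L)              # Age values within one Condition group
idx <- cumsum(age == 0)               # running count of Age == 0 rows
conf_labels <- c("HC", "LC")[idx]     # use the count as an index into the labels
conf_labels
```

The approach depends on row order and on each group containing exactly two rows with Age equal to 0; a third such row would index past the end of the label vector and give NA.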

Sample data:

confidence_table <- structure(list(Age = c(0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 
    0L, 1L), Condition = c("old", "old", "old", "old", "new", "new", 
    "new", "new", "lure", "lure", "lure", "lure"), Prop = c(0.73993056, 
    0.75590278, 0.15069444, 0.13090278, 0.06388889, 0.04965278, 0.05902778, 
    0.05416667, 0.23055556, 0.23645833, 0.13819444, 0.12013889)), .Names = c("Age", 
    "Condition", "Prop"), row.names = c(NA, -12L), class = c("grouped_df", 
    "tbl_df", "tbl", "data.frame"), vars = "Age", drop = TRUE, indices = list(
        c(0L, 2L, 4L, 6L, 8L, 10L), c(1L, 3L, 5L, 7L, 9L, 11L)), group_sizes = c(6L, 
    6L), biggest_group_size = 6L, labels = structure(list(Age = 0:1), row.names = c(NA, 
    -2L), class = "data.frame", vars = "Age", drop = TRUE, .Names = "Age"))

【Comments】: