mutate() returns an 'Object not found' error

【Question title】: mutate() returns error of 'Object not found'
【Posted】: 2021-08-13 13:10:18
【Question】:

I am trying to use mutate() to clean my data and add a new column called Volume.

Here is the data I read into R:

> df1 <- file.choose()
> data1 <- read_excel(df1)
> head(data1)
# A tibble: 5 x 3
  `product id` amount `total sales`
  <chr>         <dbl>         <dbl>
1 X180             20           200
2 X109             30           300
3 X918             20           200
4 X273             15           150
5 X988             12           120

Next, I rename the product id and total sales columns to Product Code and Net Sales, then call my own function on Net Sales inside mutate() to create a new Volume column.

> data2 <- data1 %>% 
+   select(`Product Code` = `product id`, `Net Sales` = `total sales`) %>%
+   replace_na(list(`Net Sales` = 0))%>%
+   arrange(desc(`Net Sales`))%>%
+   mutate(Volume = rank_volume(data1, `Net Sales`))

Here is the error message I get:

 Error: Problem with `mutate()` column `Volume`.
ℹ `Volume = rank_volume(data1, `Net Sales`)`.
x arrange() failed at implicit mutate() step. 
* Problem with `mutate()` column `..1`.
ℹ `..1 = Net Sales`.
x object 'Net Sales' not found

Here is the function rank_volume that I wrote:

### a function to label the products that are top one third in total sales as "H", products with the lowest third in sales as "L", and the rest as "M"
rank_volume <- function(data, column) {
  
  column <- ensym(column)
  colstr <- as_string(column)
  data <- arrange(data, desc(!!column))
  size <- length(data[[colstr]])
  first_third <- data[[colstr]][round(size / 3)]
  last_third <- data[[colstr]][round(size - (size / 3))]
  
  case_when(data[[colstr]] > first_third ~ "H",
            data[[colstr]] < last_third ~ "L",
            TRUE ~ "M")
}

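When I run my function on its own with a simple data frame, it works perfectly. When I run it inside mutate(), however, I get the error above. I cannot find the problem. Can anyone help?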

Edit: dput(head(data))

> dput(head(data1))
structure(list(`product id` = c("X180", "X109", "X918", "X273", 
"X988"), amount = c(20, 30, 20, 15, 12), `total sales` = c(200, 
300, 200, 150, 120)), row.names = c(NA, -5L), class = c("tbl_df", 
"tbl", "data.frame"))

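【Comments】:

Could you share reproducible data via dput(head(data))?
You are calling data1 inside mutate, and data1 has no Net Sales column.
@BastienDucreux I renamed total sales to Net Sales during the cleaning step. Does mutate() use the original data1 instead? It works when I change the call to mutate(Volume = rank_volume(data1, `total sales`)).
@AnoushiravanR I have now added dput(head(data)) in the edit.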

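Tags: r dplyr

【Answer 1】:

data1 has no Net Sales column; that column exists only in the transformed data flowing through the pipeline. You can use . to refer to the current data frame in the pipe.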

library(dplyr)
library(tidyr)  # provides replace_na()

data1 %>% 
     select(`Product Code` = `product id`, `Net Sales` = `total sales`) %>%
     replace_na(list(`Net Sales` = 0))%>%
     arrange(desc(`Net Sales`)) %>%
     mutate(Volume = rank_volume(., `Net Sales`))

# `Product Code` `Net Sales` Volume
#  <chr>                <dbl> <chr> 
#1 X109                   300 H     
#2 X180                   200 M     
#3 X918                   200 M     
#4 X273                   150 L     
#5 X988                   120 L
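
To see why . works where data1 does not, it can help to compare the column names of the two objects. The following is a minimal sketch that rebuilds the sample data from the dput() output in the question; the names `renamed`, `original_has_col`, `renamed_has_col`, and `dot_names` are introduced here for illustration only.

```r
library(dplyr)

# Sample data rebuilt from the dput() output in the question.
data1 <- tibble(
  `product id` = c("X180", "X109", "X918", "X273", "X988"),
  amount = c(20, 30, 20, 15, 12),
  `total sales` = c(200, 300, 200, 150, 120)
)

# select() returns a new tibble; it does not modify data1.
renamed <- data1 %>%
  select(`Product Code` = `product id`, `Net Sales` = `total sales`)

# The original object still has its original column names ...
original_has_col <- "Net Sales" %in% names(data1)

# ... while the intermediate result carries the new name.
renamed_has_col <- "Net Sales" %in% names(renamed)

# Inside a magrittr pipe step, `.` is bound to the left-hand side.
dot_names <- renamed %>% { names(.) }
```

Because rank_volume() calls arrange() on whatever data frame it receives, handing it data1 makes it look for `Net Sales` among the original columns, which is where the "object not found" message comes from.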

Alternatively, you can use cur_data():

data1 %>% 
     select(`Product Code` = `product id`, `Net Sales` = `total sales`) %>%
     replace_na(list(`Net Sales` = 0))%>%
     arrange(desc(`Net Sales`)) %>%
     mutate(Volume = rank_volume(cur_data(), `Net Sales`))
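
Note that cur_data() is deprecated as of dplyr 1.1.0, where pick() is the suggested replacement. The sketch below assumes dplyr 1.1.0 or later and is not part of the original answer. It reuses rank_volume() from the question, loads rlang for as_string(), and leaves out the replace_na() step because the sample data has no missing values.

```r
library(dplyr)
library(rlang)  # as_string() is not re-exported by dplyr

# Sample data rebuilt from the dput() output in the question.
data1 <- tibble(
  `product id` = c("X180", "X109", "X918", "X273", "X988"),
  amount = c(20, 30, 20, 15, 12),
  `total sales` = c(200, 300, 200, 150, 120)
)

# rank_volume() as defined in the question.
rank_volume <- function(data, column) {
  column <- ensym(column)
  colstr <- as_string(column)
  data <- arrange(data, desc(!!column))
  size <- length(data[[colstr]])
  first_third <- data[[colstr]][round(size / 3)]
  last_third <- data[[colstr]][round(size - (size / 3))]

  case_when(data[[colstr]] > first_third ~ "H",
            data[[colstr]] < last_third ~ "L",
            TRUE ~ "M")
}

# pick(everything()) returns the current data as a tibble,
# so rank_volume() sees the renamed columns.
result <- data1 %>%
  select(`Product Code` = `product id`, `Net Sales` = `total sales`) %>%
  arrange(desc(`Net Sales`)) %>%
  mutate(Volume = rank_volume(pick(everything()), `Net Sales`))
```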

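【Comments】:

【Answer 2】:

You can add the new column after the initial cleaning is finished, so that data2 already contains the Net Sales column when rank_volume() is called.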

data2 <- data1 %>%
  select("Product Code" = "product id", "Net Sales" = "total sales") %>%
  replace_na(list("Net Sales" = 0)) %>%
  # A quoted string inside desc() is a constant, not the column, so use backticks here
  arrange(desc(`Net Sales`))

data2 <- data2 %>%
  mutate(Volume = rank_volume(data2, "Net Sales"))
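
Another option, not taken from either answer, is to avoid passing a data frame at all and let the helper work on the column vector that mutate() supplies. The sketch below uses a hypothetical name, `rank_volume_vec`, and keeps the same cut-off logic as rank_volume() in the question. It assumes the column has no missing values (for example, after replace_na()) and at least two rows.

```r
library(dplyr)

# Hypothetical vector-based variant of rank_volume() from the question.
# Assumes x has no NA values and length(x) >= 2.
rank_volume_vec <- function(x) {
  sorted <- sort(x, decreasing = TRUE)
  size <- length(sorted)
  first_third <- sorted[round(size / 3)]
  last_third <- sorted[round(size - (size / 3))]

  # Labels are computed on x itself, so they line up with the
  # rows in their current order, sorted or not.
  case_when(x > first_third ~ "H",
            x < last_third ~ "L",
            TRUE ~ "M")
}

# Deliberately unsorted input, using the values from the question.
sales <- tibble(`Net Sales` = c(200, 300, 200, 150, 120))

labelled <- sales %>%
  mutate(Volume = rank_volume_vec(`Net Sales`))
```

With this shape there is no data frame argument that could point at a stale object, and the labels no longer depend on the rows having been sorted beforehand.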

【Comments】: