【Question title】: mutate() returns error of 'Object not found'
【Posted】: 2021-08-13 13:10:18
【Question】:

I'm trying to clean my data and use mutate() to add a new column named Volume.

Here is the data I read into R:

> df1 <- file.choose()
> data1 <- read_excel(df1)
> head(data1)
# A tibble: 5 x 3
  `product id` amount `total sales`
  <chr>         <dbl>         <dbl>
1 X180             20           200
2 X109             30           300
3 X918             20           200
4 X273             15           150
5 X988             12           120

Next, I rename the `product id` and `total sales` columns to `Product Code` and `Net Sales` respectively, then call my own function on `Net Sales` inside mutate() to create a new Volume column.

> data2 <- data1 %>% 
+   select(`Product Code` = `product id`, `Net Sales` = `total sales`) %>%
+   replace_na(list(`Net Sales` = 0))%>%
+   arrange(desc(`Net Sales`))%>%
+   mutate(Volume = rank_volume(data1, `Net Sales`))

This is the error message I get:

 Error: Problem with `mutate()` column `Volume`.
ℹ `Volume = rank_volume(data1, `Net Sales`)`.
x arrange() failed at implicit mutate() step. 
* Problem with `mutate()` column `..1`.
ℹ `..1 = Net Sales`.
x object 'Net Sales' not found

Here is the rank_volume function I wrote:

### a function to label the products that are top one third in total sales as "H", products with the lowest third in sales as "L", and the rest as "M"
rank_volume <- function(data, column) {
  
  column <- ensym(column)
  colstr <- as_string(column)
  data <- arrange(data, desc(!!column))
  size <- length(data[[colstr]])
  first_third <- data[[colstr]][round(size / 3)]
  last_third <- data[[colstr]][round(size - (size / 3))]
  
  case_when(data[[colstr]] > first_third ~ "H",
            data[[colstr]] < last_third ~ "L",
            TRUE ~ "M")
}

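When I run my function on its own with a simple data frame, it works perfectly. When I run it inside mutate(), however, I get the error above. I can't find the problem. Can anyone help?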

Edit: dput(head(data))

> dput(head(data1))
structure(list(`product id` = c("X180", "X109", "X918", "X273", 
"X988"), amount = c(20, 30, 20, 15, 12), `total sales` = c(200, 
300, 200, 150, 120)), row.names = c(NA, -5L), class = c("tbl_df", 
"tbl", "data.frame"))

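【Comments】:

  • Could you share your reproducible data via dput(head(data))?
  • You are passing data1 inside mutate, and data1 has no Net Sales column.
  • @BastienDucreux I renamed `total sales` to `Net Sales` during the cleaning step. Does mutate() use the original data1 instead? It works when I change the call to mutate(Volume = rank_volume(data1, `total sales`))
  • @AnoushiravanR I've now added dput(head(data)) in the edit.

Tags: r dplyr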


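【Solution 1】:

data1 has no `Net Sales` column; that column exists only in the transformed data flowing through the pipe. You can use . to refer to the current data frame in the pipeline.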

library(dplyr)
library(tidyr)  # provides replace_na()

data1 %>% 
     select(`Product Code` = `product id`, `Net Sales` = `total sales`) %>%
     replace_na(list(`Net Sales` = 0))%>%
     arrange(desc(`Net Sales`)) %>%
     mutate(Volume = rank_volume(., `Net Sales`))

# `Product Code` `Net Sales` Volume
#  <chr>                <dbl> <chr> 
#1 X109                   300 H     
#2 X180                   200 M     
#3 X918                   200 M     
#4 X273                   150 L     
#5 X988                   120 L     

Alternatively, you can use cur_data():

data1 %>% 
     select(`Product Code` = `product id`, `Net Sales` = `total sales`) %>%
     replace_na(list(`Net Sales` = 0))%>%
     arrange(desc(`Net Sales`)) %>%
     mutate(Volume = rank_volume(cur_data(), `Net Sales`))
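
To see why the original call fails, it may help to compare the column names of data1 with those of the renamed data inside the pipe. This is only a minimal sketch: it rebuilds the five sample rows from the question with data.frame(), and `renamed` is a hypothetical name introduced here for illustration.

```r
library(dplyr)

# Sample rows from the question; check.names = FALSE keeps the spaces in the names.
data1 <- data.frame(
  `product id`  = c("X180", "X109", "X918", "X273", "X988"),
  amount        = c(20, 30, 20, 15, 12),
  `total sales` = c(200, 300, 200, 150, 120),
  check.names   = FALSE
)

# `renamed` is a hypothetical name for the intermediate result of the pipe.
renamed <- data1 %>%
  select(`Product Code` = `product id`, `Net Sales` = `total sales`)

names(data1)    # "product id" "amount" "total sales"
names(renamed)  # "Product Code" "Net Sales"

# The rename never touches data1, so looking up `Net Sales` there fails.
"Net Sales" %in% names(data1)    # FALSE
"Net Sales" %in% names(renamed)  # TRUE
```

Because rank_volume() sorts whatever data frame it is given, passing data1 makes it search for `Net Sales` in an object that still carries the original names, which matches the "object 'Net Sales' not found" message.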

【Comments】:

    【Solution 2】:

    You can add the new column after the initial cleaning is finished.

      data2 <- data1 %>% 
      select("Product Code" = "product id", "Net Sales" = "total sales") %>%
      replace_na(list("Net Sales" = 0))%>%
      # backticks, not quotes: desc("Net Sales") would sort by a constant string
      arrange(desc(`Net Sales`))
      
      data2 <- data2 %>%
      mutate(Volume = rank_volume(data2, "Net Sales"))
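
Another way to sidestep the problem, sketched here under assumptions, is to have the helper take the column vector itself rather than a data frame plus a column name, so there is no data frame to get out of sync with the pipe. `rank_volume_vec` is a hypothetical name, not part of the question or of any package. It uses base R's ifelse() in place of case_when(), and it assumes the vector has at least two values and no NA (the replace_na() step upstream covers the latter).

```r
# Hypothetical vector-based variant of rank_volume(); base R only.
rank_volume_vec <- function(x) {
  sorted <- sort(x, decreasing = TRUE)            # highest sales first
  size <- length(sorted)
  first_third <- sorted[round(size / 3)]          # same cut-offs as the original
  last_third  <- sorted[round(size - size / 3)]

  # Labels are returned in the order of x, so the rows need not be pre-sorted.
  ifelse(x > first_third, "H",
         ifelse(x < last_third, "L", "M"))
}

sales <- c(200, 300, 200, 150, 120)  # `total sales` from the question
rank_volume_vec(sales)
# "M" "H" "M" "L" "L"
```

Inside the pipe this would be called as mutate(Volume = rank_volume_vec(`Net Sales`)), where mutate() resolves `Net Sales` against the current data on its own.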
    

    【Comments】: