【问题标题】:Column name not identified in function using ggplot inR使用 R 中的 ggplot 在函数中未识别列名
【发布时间】:2018-06-26 16:44:11
【问题描述】:

我正在尝试编写一个通用函数,用于对 R 进行分类变量的单变量分析。我可以将变量传递给 dplyr 但它不适用于 ggplot 代码。

这是我的代码 -

 univariate_catogrical <- function(dataset,variable){
  variable <- enquo(variable)

  percentage <- dataset %>%
    select(!!variable) %>%
    group_by(!!variable) %>%
    summarise(n = n()) %>%
    mutate(percantage = (n / sum(n)) * 100)
  print(percentage)

  dataset %>%
    count(!!variable) %>%
    ggplot(mapping = aes_(x = rlang::quo_expr(!!variable), 
                          y = n, fill = rlang::quo_expr(!!variable))) +
    geom_bar(stat = 'identity',
             colour = 'white') +
    labs(x = "Reason.for.absence" , y = "count") + 
    ggtitle(" Count of Reason for absence") +
    theme(legend.position = "bottom") -> p
  plot(p)

}

当我执行上述功能时,我得到了

> univariate_catogrical(employee_data_Imputed,Reason.for.absence)
# A tibble: 28 x 3
   Reason.for.absence     n percantage
   <fct>              <int>      <dbl>
 1 1                     16      2.23 
 2 2                      1      0.139
 3 3                      1      0.139
 4 4                      2      0.279
 5 5                      3      0.418
 6 6                      7      0.975
 7 7                     15      2.09 
 8 8                      6      0.836
 9 9                      4      0.557
10 10                    23      3.20 
# ... with 18 more rows
 Hide Traceback

 Rerun with Debug
 Error in grouped_df_impl(data, unname(vars), drop) : 
  Column `variable` is unknown 

有人可以建议如何解决它。我使用 ase_ 函数来传递参数。

请找出可重现的例子。

dput(head(employee_data_Imputed,8))
structure(list(ID = structure(c(11L, 36L, 3L, 7L, 11L, 10L, 20L, 
14L), .Label = c("1", "2", "3", "4", "5", "6", "7", "8", "9", 
"10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", 
"21", "22", "23", "24", "25", "26", "27", "28", "29", "30", "31", 
"32", "33", "34", "35", "36"), class = "factor"), Reason.for.absence = structure(c(26L, 
20L, 23L, 7L, 23L, 22L, 23L, 19L), .Label = c("1", "2", "3", 
"4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", 
"16", "17", "18", "19", "20", "21", "22", "23", "24", "25", "26", 
"27", "28"), class = "factor"), Month.of.absence = structure(c(7L, 
7L, 7L, 7L, 7L, 7L, 7L, 7L), .Label = c("1", "2", "3", "4", "5", 
"6", "7", "8", "9", "10", "11", "12"), class = "factor"), Day.of.the.week = structure(c(2L, 
2L, 3L, 4L, 4L, 5L, 5L, 1L), .Label = c("2", "3", "4", "5", "6"
), class = "factor"), Seasons = structure(c(1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L), .Label = c("1", "2", "3", "4"), class = "factor"), 
    Transportation.expense = c(289, 118, 179, 279, 289, 361, 
    260, 155), Distance.from.Residence.to.Work = c(36, 13, 51, 
    5, 36, 52, 50, 12), Service.time = c(13, 18, 18, 14, 13, 
    3, 11, 14), Age = c(33, 50, 38, 39, 33, 28, 36, 34), Work.load.Average.day = c(239554, 
    239554, 239554, 239554, 239554, 239554, 239554, 239554), 
    Hit.target = c(97, 97, 97, 97, 97, 97, 97, 97), Disciplinary.failure = structure(c(1L, 
    2L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("0", "1"), class = "factor"), 
    Education = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("1", 
    "2", "3", "4"), class = "factor"), Son = c(2, 1, 0, 2, 2, 
    1, 4, 2), Social.drinker = structure(c(2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L), .Label = c("0", "1"), class = "factor"), Social.smoker = structure(c(1L, 
    1L, 1L, 2L, 1L, 1L, 1L, 1L), .Label = c("0", "1"), class = "factor"), 
    Pet = c(1, 0, 0, 0, 1, 4, 0, 0), Weight = c(90, 98, 89, 68, 
    90, 80, 65, 95), Height = c(172, 178, 170, 168, 172, 172, 
    168, 196), Body.mass.index = c(30, 31, 31, 24, 30, 27, 23, 
    25), Absenteeism.time.in.hours = c(4, 0, 2, 4, 2, 8, 4, 40
    )), .Names = c("ID", "Reason.for.absence", "Month.of.absence", 
"Day.of.the.week", "Seasons", "Transportation.expense", "Distance.from.Residence.to.Work", 
"Service.time", "Age", "Work.load.Average.day", "Hit.target", 
"Disciplinary.failure", "Education", "Son", "Social.drinker", 
"Social.smoker", "Pet", "Weight", "Height", "Body.mass.index", 
"Absenteeism.time.in.hours"), row.names = c(NA, 8L), class = "data.frame")

【问题讨论】:

  • 尝试使用来自 github 的 ggplot2 版本。这支持 quosures,但 cran 上的那个没有(尽管这可能在过去 1-2 天内发生了变化 - 您可以先尝试从 cran 更新 ggplot2)
  • 寻求帮助时,您应该包含一个简单的reproducible example,其中包含可用于测试和验证可能解决方案的示例输入和所需输出。提供一个示例dataset,以便我们实际运行代码。但正如梅丽莎指出的那样,ggplot2_2.2.1 一样不喜欢 quosures。有一个新版本已提交给 CRAN。现在,试试aes_(x = rlang::quo_smash(variable), ...)
  • 您在count(variable) %&gt;% 也有问题,我认为应该是count(!!variable) %&gt;%,因为打印没有显示名为变量的列。
  • @MrFlick 我已经用 dput 更新了这个问题。我尝试使用 aes_(x = rlang::quo_smash(variable), ...) 但变量仍然未知。谢谢
  • 该错误与ggplot 无关,您需要在@MrFlick 所写的count() 调用中将!! 添加到variable

标签: r function ggplot2 dplyr


【解决方案1】:

这里是需要改变的部分

dataset %>%
    count(!!variable) %>%
    ggplot(mapping = aes_(x = rlang::quo_expr(variable), y = quote(n), fill = rlang::quo_expr(variable))) +
    ...

您需要对count() 中的“变量”进行转义,并且不要将!!quo_expr 一起使用,并且在使用aes_() 时需要引用所有参数。

这个代码和测试数据,这个图是做的。

【讨论】:

    猜你喜欢
    • 1970-01-01
    • 2011-10-24
    • 2021-11-17
    • 1970-01-01
    • 2011-07-03
    • 2021-01-17
    • 2021-03-01
    • 2021-12-07
    • 1970-01-01
    相关资源
    最近更新 更多