【Question title】: Can variables in a dataset be used as arguments in a function?
【Posted】: 2017-04-21 07:25:32
【Question】:

Let the data be:

> dput(df)
structure(list(NAME.x = c("ANNE", "BOB", "CATHY", "DIANNE", "EMILY"
), NAME.y = c(NA, "BOB", "CATHY", "DIANNE", NA), AGE.x = c("81", 
"47", "47", "47", "37"), AGE.y = c(NA, "47", "47", "47", NA), 
    ADMISSIONDATE.x = structure(c(1380751296, 1382088000, 1382088000, 
    1382088000, 1383207720), class = c("POSIXct", "POSIXt"), tzone = "UTC"), 
    ADMISSIONDATE.y = structure(c(NA, 1382088000, 1382088000, 
    1382088000, NA), class = c("POSIXct", "POSIXt"), tzone = "UTC"), 
    DISCHARGEDDATE.x = structure(c(1381172735, 1382189165, 1382189165, 
    1382189165, 1383250549), class = c("POSIXct", "POSIXt"), tzone = "UTC"), 
    DISCHARGEDDATE.y = structure(c(NA, 1382189165, 1382189165, 
    1382189165, NA), class = c("POSIXct", "POSIXt"), tzone = "UTC")), row.names = c(NA, 
-5L), .Names = c("NAME.x", "NAME.y", "AGE.x", "AGE.y", "ADMISSIONDATE.x", 
"ADMISSIONDATE.y", "DISCHARGEDDATE.x", "DISCHARGEDDATE.y"), class = "data.frame")

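I want to check the similarities and differences between the common variables in this dataset. I tried to write a function that takes three arguments: the dataset and two variables from that dataset.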

  check<-function(data,var1,var2){
    # X1: x and y are equal
    # X2: x and y are not equal
    # Y1: x and y are non-empty
    # Y2: x and y are empty
    # Z1: x is non-empty and y is empty
    # Z2: x is empty and y is non-empty
    cnt_each<-data %>% 
      mutate(X1 = (var1==var2),
             X2 = (var1!=var2),
             Y1 = (!is.na(var1) & !is.na(var2)),
             Y2 = (is.na(var1) & is.na(var2)),
             Z1 = (!is.na(var1) & is.na(var2)),
             Z2 = (is.na(var1) & !is.na(var2))) %>%
      summarise_at("X1:Z2",funs(sum(.))) %>%
      mutate(sum_all=sum(.,na.rm=TRUE))
    return(cnt_each)
  }

However, it throws an error when I call it. The same code runs without error outside the function.

check(df,NAME.x,NAME.y)

Error in mutate_impl(.data, dots) : object 'NAME.x' not found

【Comments】:

  • See mutate_ for passing strings as column names.
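
As a rough illustration of the string-based route this comment points to, the sketch below passes the column names as character strings. It does not use mutate_ itself (which was later deprecated); it swaps in the .data pronoun instead, assuming dplyr 0.7.0 or later. The helper name check_str and the toy data frame are hypothetical, and only three of the six counts are computed to keep it short.

```r
library(dplyr)

# Hypothetical helper: var1 and var2 are column names given as strings.
# .data[[name]] looks the column up inside the data frame being summarised.
check_str <- function(data, var1, var2) {
  data %>%
    summarise(
      X1 = sum(.data[[var1]] == .data[[var2]], na.rm = TRUE),         # equal
      Y1 = sum(!is.na(.data[[var1]]) & !is.na(.data[[var2]])),        # both non-empty
      Z1 = sum(!is.na(.data[[var1]]) & is.na(.data[[var2]]))          # only var2 empty
    )
}

# Toy data (not from the question)
toy <- data.frame(a = c("p", "q", "r"),
                  b = c("p", NA, "r"),
                  stringsAsFactors = FALSE)

res_str <- check_str(toy, "a", "b")
```

With the question's data the call would look like check_str(df, "NAME.x", "NAME.y"), with the names quoted.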

Tags: r function dplyr


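【Solution 1】:

We can do this with the development version of dplyr (soon to be released as 0.6.0). enquo takes the input argument and converts it to a quosure. Inside mutate/summarise/group_by, the quosures are unquoted (with !! or UQ) so that they are evaluated in the context of the data.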

library(dplyr)

check <- function(data, var1, var2) {
  # capture the bare column names as quosures
  var1 <- enquo(var1)
  var2 <- enquo(var2)
  data %>%
    mutate(X1 = UQ(var1) == UQ(var2),
           X2 = UQ(var1) != UQ(var2),
           Y1 = !is.na(UQ(var1)) & !is.na(UQ(var2)),
           Y2 = is.na(UQ(var1)) & is.na(UQ(var2)),
           # Z1: x is non-empty and y is empty
           Z1 = !is.na(UQ(var1)) & is.na(UQ(var2)),
           Z2 = is.na(UQ(var1)) & !is.na(UQ(var2))) %>%
    summarise_at(vars(X1:Z2), funs(sum(., na.rm = TRUE))) %>%
    mutate(sum_all = rowSums(., na.rm = TRUE))
}


check(df, NAME.x, NAME.y)
#   X1 X2 Y1 Y2 Z1 Z2 sum_all
#1  3  0  3  0  2  0       8
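
For readers on a newer dplyr, the same idea can be sketched with the embrace operator {{ }}, which captures and unquotes an argument in one step, together with across() in place of summarise_at() and funs(). This assumes dplyr 1.0.0 or later and is not part of the original answer; the function name check_embrace and the toy data frame are hypothetical.

```r
library(dplyr)

# Hypothetical variant of check(): {{ var }} stands in for enquo() + UQ().
check_embrace <- function(data, var1, var2) {
  data %>%
    mutate(
      X1 = {{ var1 }} == {{ var2 }},                     # equal
      X2 = {{ var1 }} != {{ var2 }},                     # not equal
      Y1 = !is.na({{ var1 }}) & !is.na({{ var2 }}),      # both non-empty
      Y2 = is.na({{ var1 }}) & is.na({{ var2 }}),        # both empty
      Z1 = !is.na({{ var1 }}) & is.na({{ var2 }}),       # only var2 empty
      Z2 = is.na({{ var1 }}) & !is.na({{ var2 }})        # only var1 empty
    ) %>%
    # count TRUE values per flag, ignoring NA comparisons
    summarise(across(X1:Z2, ~ sum(.x, na.rm = TRUE))) %>%
    mutate(sum_all = rowSums(across(X1:Z2)))
}

# Toy data (not from the question)
toy2 <- data.frame(a = c("p", "q", NA),
                   b = c("p", NA, NA),
                   stringsAsFactors = FALSE)

res_embrace <- check_embrace(toy2, a, b)
```

The call site stays the same as in the answer: bare column names, for example check_embrace(df, NAME.x, NAME.y).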

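【Discussion】:

  • So enquo and UQ will both be in the new version of dplyr?
  • @HNSKD Yes, along with others such as quo_name, !!!, quo, and quos.
  • It looks like the xxx_ type functions will be retired in the next version?
  • @zx8754, those functions have been marked as "Deprecated SE versions of main verbs" in the development version 0.5.0.9004.
  • @zx8754 It seems so.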