【Question title】: Writing a function that converts variable modes and classes
【Posted】: 2019-04-17 15:09:09
【Question】:

I use the following code (LINK) to clean up potentially troublesome aspects of the data in a hypothetical data frame called dataframe:

library(data.table)  # provides fread()

dataframe <- fread(
    "A   B  B.x  C  D   E   iso   year   
     0   3   NA  1  NA  NA  NLD   2009   
     1   4   NA  2  NA  NA  NLD   2009   
     0   5   NA  3  NA  NA  AUS   2011   
     1   5   NA  4  NA  NA  AUS   2011   
     0   0   NA  7  NA  NA  NLD   2008   
     1   1   NA  1  NA  NA  NLD   2008   
     0   1   NA  3  NA  NA  AUS   2012   
     0   NA  1   NA  1  NA  ECU   2009   
     1   NA  0   NA  2  0   ECU   2009   
     0   NA  0   NA  3  0   BRA   2011   
     1   NA  0   NA  4  0   BRA   2011   
     0   NA  1   NA  7  NA  ECU   2008   
     1   NA  0   NA  1  0   ECU   2008   
     0   NA  0   NA  3  2   BRA   2012   
     1   NA  0   NA  4  NA  BRA   2012",
   header = TRUE
)

dataframe <- as.data.frame(dataframe)
## get mode of all vars
var_mode <- sapply(dataframe, mode)
## produce error if complex or raw is found
if (any(var_mode %in% c("complex", "raw"))) stop("complex or raw not allowed!")
## get class of all vars
var_class <- sapply(dataframe, class)
## produce error if an "AsIs" object has "logical" or "character" mode
if (any(var_mode[var_class == "AsIs"] %in% c("logical", "character"))) {
  stop("matrix variables with 'AsIs' class must be 'numeric'")
}
## identify columns that need to be coerced to factors
ind1 <- which(var_mode %in% c("logical", "character"))
## coerce logical / character to factor with `as.factor`
dataframe[ind1] <- lapply(dataframe[ind1], as.factor)

Because I use this often, I would prefer to have it in a function, so I tried the following:

cleanfunction <- function(dataframe) {
dataframe <- as.data.frame(dataframe)
## get mode of all vars
var_mode <- sapply(dataframe, mode)
## produce error if complex or raw is found
if (any(var_mode %in% c("complex", "raw"))) stop("complex or raw not allowed!")
## get class of all vars
var_class <- sapply(dataframe, class)
## produce error if an "AsIs" object has "logical" or "character" mode
if (any(var_mode[var_class == "AsIs"] %in% c("logical", "character"))) {
  stop("matrix variables with 'AsIs' class must be 'numeric'")
}
## identify columns that need to be coerced to factors
ind1 <- which(var_mode %in% c("logical", "character"))
## coerce logical / character to factor with `as.factor`
dataframe[ind1] <- lapply(dataframe[ind1], as.factor)
}

dfclean <- cleanfunction(dataframe)

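However, this returns a list of the variables that were converted to factors, rather than a data frame in which those variables have been converted to factors.

How can I fix this?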

【Comments】:

    Tags: r function class data-cleaning


    【Solution 1】:

    A function returns the value of the last expression it evaluates. In this case, the last expression evaluated is

    dataframe[ind1] <- lapply(dataframe[ind1], as.factor)
    

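    The <- operation always evaluates to the value on its right-hand side only. So your function returns just the result of lapply, not the updated dataframe.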
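To see this behaviour in isolation, here is a minimal sketch. The functions f and g are hypothetical names used only for illustration; they are not part of the question's code.

```r
# f ends with an assignment, so it returns only the right-hand side (99)
f <- function(x) {
  x[1] <- 99
}

# g ends with the object itself, so it returns the modified vector
g <- function(x) {
  x[1] <- 99
  x
}

v <- c(1, 2, 3)
res_f <- f(v)  # the assigned value only
res_g <- g(v)  # the whole updated vector
```

Here res_f holds the single value 99, while res_g holds the full vector with its first element replaced.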

    You just need to add one more line,

    return(dataframe)
    

    or simply

    dataframe
    

    at the end of your function.
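Putting it together, the function from the question might look like the sketch below. The body is unchanged apart from the final line. The small toy data frame is an assumed stand-in for the question's data, built with data.frame() so the example runs without data.table.

```r
cleanfunction <- function(dataframe) {
  dataframe <- as.data.frame(dataframe)
  ## get mode of all vars
  var_mode <- sapply(dataframe, mode)
  ## produce error if complex or raw is found
  if (any(var_mode %in% c("complex", "raw"))) stop("complex or raw not allowed!")
  ## get class of all vars
  var_class <- sapply(dataframe, class)
  ## produce error if an "AsIs" object has "logical" or "character" mode
  if (any(var_mode[var_class == "AsIs"] %in% c("logical", "character"))) {
    stop("matrix variables with 'AsIs' class must be 'numeric'")
  }
  ## identify columns that need to be coerced to factors
  ind1 <- which(var_mode %in% c("logical", "character"))
  ## coerce logical / character to factor with `as.factor`
  dataframe[ind1] <- lapply(dataframe[ind1], as.factor)
  ## return the updated data frame, not the value of the assignment above
  dataframe
}

# Assumed toy input: one numeric, one all-NA (logical) and one character column
toy <- data.frame(
  A = c(0, 1, 0),
  E = c(NA, NA, NA),
  iso = c("NLD", "AUS", "ECU"),
  stringsAsFactors = FALSE
)

dfclean <- cleanfunction(toy)
str(dfclean)
```

With this version dfclean is a data frame: A stays numeric, while E and iso come back as factors.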
