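【Question title】:Add NA as a factor level for the whole Dataframe
【Posted】:2018-08-21 18:11:22
【Question】:

I have a dataset in which the columns of one table are all factors. The only data in them is "yes" or NA. Each column has only one factor level, yes. I would like NA to be a factor level as well. Unfortunately, my understanding of the addNA() function is poor. Could someone help me add NA as a factor level across the whole dataset in a more concise way, so that I don't have to type it out for each column separately? Thanks.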

xl<- structure(list(G = structure(c(1L, NA, NA, NA, NA, 
NA, NA, NA, NA, NA), .Label = "yes", class = "factor"), A = structure(c(1L, 1L, NA, NA, NA, 1L, NA, NA, 1L, 1L), .Label = "yes", class = "factor"), 
L = structure(c(2L, 2L, NA, NA, 2L, 2L, 2L, 
NA, 2L, 2L), .Label = c("no", "yes"), class = "factor"), 
P = structure(c(NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_), .Label = "yes", class = "factor"), 
C = structure(c(NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_), .Label = "yes", class = "factor"), 
S = structure(c(NA, NA, NA, NA, NA, NA, 1L, NA, NA, 
NA), .Label = "yes", class = "factor"), M = structure(c(NA, 
NA, 1L, NA, NA, NA, 1L, NA, NA, NA), .Label = "yes", class = "factor"), 
F = structure(c(NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_), .Label = "yes", class = "factor")), .Names =    c("G", "A", "L", "P", "C", "S", "M", "F"), row.names = c("row_1", "row_2", "row_3", "row_4", "row_5", "row_6", "row_7", "row_8", "row_10", "row_11"), class = "data.frame")
xl <- addNA(xl)  # my attempt on the whole data frame; not the per-column result I want

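【Question comments】:

  • You need to apply addNA to the columns, not to the whole data.frame. Try xl[] <- lapply(xl, addNA) or, with dplyr, xl %>% mutate_all(addNA)
  • @MrFlick Ah, thank you!!! I was trying to do xl[,1:8]
  • It's lapply, not apply. Look at the code given again..
  • @Onyambu Autocorrect changed my lapply to apply in the previous comment, but in R I did type lapply. Thanks for the catch, though.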
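
The base R suggestion from the comments can be sketched on a small stand-in data frame. The name df and its values are made up for illustration and are not the asker's data. addNA() works on one factor at a time, so lapply() applies it column by column, and assigning into df[] keeps the data.frame structure and row names:

```r
# Toy data frame standing in for `xl` (hypothetical values)
df <- data.frame(
  G = factor(c("yes", NA, NA)),
  A = factor(c("yes", "yes", NA)),
  row.names = c("row_1", "row_2", "row_3")
)

# Apply addNA() to every column; `df[] <-` keeps df a data.frame
df[] <- lapply(df, addNA)

levels(df$G)   # "yes" NA  (NA is now a real level)
table(df$A)    # NA is tallied as its own category
str(df)
```

Because addNA() stores NA itself as a level (not the string "NA"), is.na() on the resulting factor no longer flags those entries, which is worth keeping in mind for later filtering.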

Tags: r dataframe


【Solution 1】:

purrr to the rescue:

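library(tidyverse)

# Caveat: this declares a level spelled "NA" but leaves missing values as NA,
# and it drops any level not listed (e.g. "no" in column L).
xl_new <- xl %>%
  map_df(factor, levels = c("yes", "NA"))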

Or you can also use forcats:

xl_new <- xl %>% 
  map_df(fct_explicit_na, "NA")
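
One detail worth checking with either snippet: passing levels = c("yes", "NA") to factor() only declares a level spelled "NA", and the existing missing values stay missing, whereas recoding the missing entries to the string "NA" (the effect fct_explicit_na is designed for) turns them into ordinary values. A base R sketch on a made-up factor f, recoding by hand so that no extra package is assumed:

```r
# Hypothetical factor with two missing values
f <- factor(c("yes", NA, "yes", NA))

# Declaring an extra level does not recode the missing values
f_levels_only <- factor(f, levels = c("yes", "NA"))
sum(is.na(f_levels_only))   # still 2 missing values

# Recoding by hand: turn NA into the string "NA", then rebuild the factor
chr <- as.character(f)
chr[is.na(chr)] <- "NA"
f_recoded <- factor(chr, levels = c(levels(f), "NA"))
sum(is.na(f_recoded))       # 0 missing values
table(f_recoded)
```

Which behaviour is wanted depends on whether downstream code should still treat those entries as missing.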

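【Comments】:

    【Solution 2】:

    I really like @FMM's approach with forcats::fct_explicit_na. You can use it with dplyr::mutate_all, because these columns are all factors. If you have columns of different types but only want to do this for the factor columns, you can use dplyr::mutate_if with is.factor as the predicate instead.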

    library(tidyverse)
    
    xl %>%
      mutate_all(fct_explicit_na, "NA")
    #>      G   A   L  P  C   S   M  F
    #> 1  yes yes yes NA NA  NA  NA NA
    #> 2   NA yes yes NA NA  NA  NA NA
    #> 3   NA  NA  NA NA NA  NA yes NA
    #> 4   NA  NA  NA NA NA  NA  NA NA
    #> 5   NA  NA yes NA NA  NA  NA NA
    #> 6   NA yes yes NA NA  NA  NA NA
    #> 7   NA  NA yes NA NA yes yes NA
    #> 8   NA  NA  NA NA NA  NA  NA NA
    #> 9   NA yes yes NA NA  NA  NA NA
    #> 10  NA yes yes NA NA  NA  NA NA
    
    xl %>%
      mutate_all(fct_explicit_na, "NA") %>%
      str()
    #> 'data.frame':    10 obs. of  8 variables:
    #>  $ G: Factor w/ 2 levels "yes","NA": 1 2 2 2 2 2 2 2 2 2
    #>  $ A: Factor w/ 2 levels "yes","NA": 1 1 2 2 2 1 2 2 1 1
    #>  $ L: Factor w/ 3 levels "no","yes","NA": 2 2 3 3 2 2 2 3 2 2
    #>  $ P: Factor w/ 2 levels "yes","NA": 2 2 2 2 2 2 2 2 2 2
    #>  $ C: Factor w/ 2 levels "yes","NA": 2 2 2 2 2 2 2 2 2 2
    #>  $ S: Factor w/ 2 levels "yes","NA": 2 2 2 2 2 2 1 2 2 2
    #>  $ M: Factor w/ 2 levels "yes","NA": 2 2 1 2 2 2 1 2 2 2
    #>  $ F: Factor w/ 2 levels "yes","NA": 2 2 2 2 2 2 2 2 2 2
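
For the mixed-type case mentioned above, a minimal sketch with dplyr::mutate_if might look like this. The data frame mixed and its columns are invented for illustration. Note that forcats 1.0.0 and later deprecate fct_explicit_na in favour of fct_na_value_to_level, so the call may emit a deprecation warning depending on the installed version:

```r
library(dplyr)
library(forcats)

# Hypothetical data frame with one integer column and two factor columns
mixed <- data.frame(
  id = 1:3,
  G  = factor(c("yes", NA, NA)),
  A  = factor(c("yes", "yes", NA))
)

# Only columns for which is.factor() is TRUE are passed to fct_explicit_na()
mixed_new <- mixed %>%
  mutate_if(is.factor, fct_explicit_na, na_level = "NA")

str(mixed_new)
```

The id column passes through unchanged, while G and A each gain an explicit "NA" level.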
    

    【Comments】:
