【Question title】: Map over and change column names from a data frame
【Posted】: 2020-02-09 18:58:10
【Question】:

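I have a data frame and a list. The list contains 12 tibbles, and the data frame has 12 rows and 2 columns. How can I map over the list and rename the columns of each element using the corresponding row of the data frame?

For example, the data frame looks like this (first three rows):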

           Var1         Var2
1   Sepal.Width Sepal.Length
2  Petal.Length Sepal.Length
3   Petal.Width Sepal.Length

The list looks like this:

[[1]]
# A tibble: 2 x 2
      x     y
  <dbl> <dbl>
1   2     2  
2   3.8   3.8

[[2]]
# A tibble: 2 x 2
      x     y
  <dbl> <dbl>
1   3     3  
2   6.9   6.9

[[3]]
# A tibble: 2 x 2
      x     y
  <dbl> <dbl>
1   1     1  
2   2.5   2.5

Expected output:

[[1]]
# A tibble: 2 x 2
  Sepal.Width Sepal.Length
        <dbl>        <dbl>
1         2            2  
2         3.8          3.8

[[2]]
# A tibble: 2 x 2
  Petal.Length Sepal.Length
         <dbl>        <dbl>
1          3            3  
2          6.9          6.9

[[3]]
# A tibble: 2 x 2
  Petal.Width Sepal.Length
        <dbl>        <dbl>
1         1            1  
2         2.5          2.5

The columns of each list element are now named after the corresponding row of the data frame.

Data:

var_lists <- list(structure(list(x = c(2, 3.8), y = c(2, 3.8)), row.names = c(NA, 
-2L), class = c("tbl_df", "tbl", "data.frame")), structure(list(
    x = c(3, 6.9), y = c(3, 6.9)), row.names = c(NA, -2L), class = c("tbl_df", 
"tbl", "data.frame")), structure(list(x = c(1, 2.5), y = c(1, 
2.5)), row.names = c(NA, -2L), class = c("tbl_df", "tbl", "data.frame"
)), structure(list(x = c(4.9, 7.9), y = c(4.9, 7.9)), row.names = c(NA, 
-2L), class = c("tbl_df", "tbl", "data.frame")), structure(list(
    x = c(3, 6.9), y = c(3, 6.9)), row.names = c(NA, -2L), class = c("tbl_df", 
"tbl", "data.frame")), structure(list(x = c(1, 2.5), y = c(1, 
2.5)), row.names = c(NA, -2L), class = c("tbl_df", "tbl", "data.frame"
)), structure(list(x = c(4.9, 7.9), y = c(4.9, 7.9)), row.names = c(NA, 
-2L), class = c("tbl_df", "tbl", "data.frame")), structure(list(
    x = c(2, 3.8), y = c(2, 3.8)), row.names = c(NA, -2L), class = c("tbl_df", 
"tbl", "data.frame")), structure(list(x = c(1, 2.5), y = c(1, 
2.5)), row.names = c(NA, -2L), class = c("tbl_df", "tbl", "data.frame"
)), structure(list(x = c(4.9, 7.9), y = c(4.9, 7.9)), row.names = c(NA, 
-2L), class = c("tbl_df", "tbl", "data.frame")), structure(list(
    x = c(2, 3.8), y = c(2, 3.8)), row.names = c(NA, -2L), class = c("tbl_df", 
"tbl", "data.frame")), structure(list(x = c(3, 6.9), y = c(3, 
6.9)), row.names = c(NA, -2L), class = c("tbl_df", "tbl", "data.frame"
)))

var_combos <- structure(list(Var1 = structure(c(2L, 3L, 4L, 1L, 3L, 4L, 1L, 
2L, 4L, 1L, 2L, 3L), .Label = c("Sepal.Length", "Sepal.Width", 
"Petal.Length", "Petal.Width"), class = "factor"), Var2 = structure(c(1L, 
1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L), .Label = c("Sepal.Length", 
"Sepal.Width", "Petal.Length", "Petal.Width"), class = "factor")), out.attrs = list(
    dim = c(4L, 4L), dimnames = list(Var1 = c("Var1=Sepal.Length", 
    "Var1=Sepal.Width", "Var1=Petal.Length", "Var1=Petal.Width"
    ), Var2 = c("Var2=Sepal.Length", "Var2=Sepal.Width", "Var2=Petal.Length", 
    "Var2=Petal.Width"))), class = "data.frame", row.names = c(NA, 
-12L))

【Comments】:

    Tags: r purrr


    【Solution 1】:

    We can split 'var_combos' by row and pass the pieces to map2, so that each tibble is paired with its own row of names:

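    library(dplyr)  # provides %>%
    library(purrr)  # provides map2() and set_names()
    # pair each tibble (.x) with one row of var_combos (.y) and rename it
    map2(var_lists, asplit(var_combos, 1), ~ .x %>% set_names(.y))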
    #[[1]]
    # A tibble: 2 x 2
    #  Sepal.Width Sepal.Length
    #        <dbl>        <dbl>
    #1         2            2  
    #2         3.8          3.8
    
    #[[2]]
    # A tibble: 2 x 2
    #  Petal.Length Sepal.Length
    #         <dbl>        <dbl>
    #1          3            3  
    #2          6.9          6.9
    
    #[[3]]
    # A tibble: 2 x 2
    #  Petal.Width Sepal.Length
    #        <dbl>        <dbl>
    #1         1            1  
    #2         2.5          2.5
    #...
    

    Or, without the anonymous function:

    map2(var_lists, asplit(var_combos, 1), set_names)
    

    In base R, the equivalent can be done with setNames and Map:

    Map(setNames, var_lists, asplit(var_combos, 1))
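
To make the row-splitting step concrete, here is a minimal base R sketch on toy data. The objects `combos`, `tbls`, `new_names`, and `renamed` are hypothetical stand-ins for illustration, not the question's `var_combos` and `var_lists`. It relies on `asplit()` converting the data frame to a character matrix first, which is why the factor columns come out as plain strings.

```r
# Toy stand-ins for var_combos and var_lists (hypothetical data)
combos <- data.frame(
  Var1 = factor(c("Sepal.Width", "Petal.Length")),
  Var2 = factor(c("Sepal.Length", "Sepal.Length"))
)
tbls <- list(
  data.frame(x = c(2, 3.8), y = c(2, 3.8)),
  data.frame(x = c(3, 6.9), y = c(3, 6.9))
)

# One element per row of combos; each element holds that row's two names
new_names <- asplit(combos, 1)

# Guard against a silent mismatch between the list and the name table
stopifnot(length(tbls) == length(new_names))

# Pair the i-th data frame with the i-th row of names
renamed <- Map(setNames, tbls, new_names)
names(renamed[[1]])
names(renamed[[2]])
```

The length check is an addition for safety: `Map` recycles shorter arguments, so a name table with the wrong number of rows would otherwise not raise an error.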
    

    【Comments】:

      【Solution 2】:

      Using a loop:

      for (i in seq_along(var_lists)) {
        names(var_lists[[i]]) <- as.character(unlist(var_combos[i, ]))
      }
      

      If the columns of var_combos are character vectors rather than factors, you can omit the as.character part.

      【Comments】:
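
As a small illustration of why the conversion appears in the loop, the sketch below (using a hypothetical one-row data frame `combos`) shows that `unlist()` on a row of factor columns returns a factor, and that `as.character()` recovers the labels.

```r
# Hypothetical one-row name table with factor columns
combos <- data.frame(
  Var1 = factor("Sepal.Width"),
  Var2 = factor("Sepal.Length")
)

row1 <- unlist(combos[1, ])
is.factor(row1)               # TRUE: unlist() keeps factors as a factor

# Convert to plain strings before using them as column names
labels <- as.character(row1)
labels
```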

        【Solution 3】:

        I was also looking for a solution to this problem and came across the rename_with() function!

        Suppose you have a data frame:

        library(tidyverse)
        
        df <- tibble(V0 = runif(10), V1 = runif(10), V2 = runif(10), key=letters[1:10])
        

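        and you want to rename all the "V" columns. My mapping for such columns usually comes from a JSON file, which is read into R as a named list. Here a named character vector plays that role, with the old names as its names and the new names as its values. For example,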

        colmapping <- c("newcol1", "newcol2", "newcol3")
        names(colmapping) <- paste0("V",0:2)
        

        You can then rename the columns of df to the strings stored in colmapping with:

        df <- rename_with(.data = df, .cols = starts_with("V"), .fn = function(x){colmapping[x]})
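
For a deterministic check of the same idea, here is a minimal sketch with a small fixed tibble. The names `parsed`, `df_small`, and `renamed` are hypothetical, and `parsed` merely imitates what a JSON parser might return; no JSON file is read here. Wrapping the lookup in `unname()` is an optional tidy-up so that the function returns a plain character vector.

```r
library(dplyr)

# Imitation of a parsed JSON mapping: old name -> new name (hypothetical)
parsed <- list(V0 = "newcol1", V1 = "newcol2", V2 = "newcol3")
colmapping <- unlist(parsed)  # named character vector

df_small <- tibble(V0 = 1:2, V1 = 3:4, V2 = 5:6, key = c("a", "b"))

# Look up each selected column name in the mapping; "key" is left untouched
renamed <- rename_with(
  df_small,
  .fn = function(x) unname(colmapping[x]),
  .cols = starts_with("V")
)
names(renamed)
```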
        

        【Comments】:
