【Question title】:How to add new columns to nested lists using lapply via a string recognition function
【Posted】:2019-04-05 16:41:41
【Question】:

I am trying to use the %in% operator to add specific columns to data frames that are nested in a list of lists. Below is a toy example of my data.

dput(head(list)):

 list(FEB_games = list(GAME1 = structure(list(GAME1_Class = c("paladin", 
"fighter", "wizard", "sorcerer", "rouge"), GAME1_Race = c("human", 
"elf", "orc", "human", "gnome"), GAME1_Alignment = c("NE", "CG", 
"CE", "NN", "LG"), GAME1_Level = c(6, 7, 6, 7, 7), GAME1_Alive = c("y", 
"y", "y", "y", "y")), row.names = c("m.Stan", "m.Kenny", "m.Cartman", 
"m.Kyle", "m.Butters"), class = "data.frame"), GAME2 = structure(list(
GAME2_Class = c("wizard", "cleric", "monk", "bard"), GAME2_Race = c("half-elf", 
"elf", "human", "dwarf"), GAME2_Alignment = c("CG", "CE", 
"NN", "LG"), GAME2_Level = c(5, 5, 5, 5), GAME2_Alive = c("y", 
"y", "y", "y")), row.names = c("m.Kenny", "m.Cartman", "m.Kyle", 
"m.Butters"), class = "data.frame")), MAR_games = list(GAME3 = structure(list(
GAME3_Class = c("cleric", "barbarian", "warlock", "monk"), 
GAME3_Race = c("elf", "half-elf", "elf", "dwarf"), GAME3_Alignment = c("LG", 
"LG", "CE", "LG"), GAME3_Level = c(1, 1, 1, 1), GAME3_Alive = c("y", 
"y", "y", "y")), row.names = c("l.Stan", "l.Kenny", "l.Cartman", 
"l.Butters"), class = "data.frame"), GAME4 = structure(list(GAME4_Class = c("fighter", 
"wizard", "sorcerer", "rouge"), GAME4_Race = c("half-elf", "elf", 
"human", "dwarf"), GAME4_Alignment = c("CG", "CE", "LN", "LG"
), GAME4_Level = c(5, 5, 5, 5), GAME4_Alive = c("y", "y", "y", 
"y")), row.names = c("l.Kenny", "l.Cartman", "l.Kyle", "l.Butters"), class = "data.frame")))

I have two different sets of columns (data frames) to add: Feb_detentions goes to FEB_games and Mar_detentions goes to MAR_games.

dput(head(Feb_detentions)):

structure(list(Pupil = c("m.Stan", "m.Stan", "m.Kenny", "m.Cartman", 
"m.Kyle", "Butters"), Detention = c("y", "y", "y", "n", "n", "y"
)), row.names = c(NA, 6L), class = "data.frame")

dput(head(Mar_detentions)):

structure(list(Pupil = c("l.Stan", "l.Kenny", "l.Cartman", "l.Kyle"), 
Detention = c("n", "y", "y", "n")), row.names = c(NA, 4L), class = "data.frame")

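I have successfully added the columns of interest to a single data frame (not nested in a list) using the steps below. The duplicates had to be removed first, which I could not manage to do inside the function.

Feb_detentions[!duplicated(Feb_detentions$Pupil),] -> Feb_detentions_dup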

addDetentions <- function(df, df_namecol, detentions,  detention_namecol){
df[which(df_namecol %in% detention_namecol == T),] -> df_v1
detentions[which(detention_namecol %in% df_namecol == T),] -> det_v1
cbind(df_v1, det_v1) -> df_edit
return(df_edit)
}

addDetentions(df = GAME1, df_namecol = rownames(GAME1),
          detentions = Feb_detentions_dup, detention_namecol = Feb_detentions_dup$Pupil) -> output

dput(head(output)):

structure(list(GAME1_Class = c("paladin", "fighter", "wizard", 
"sorcerer", "rouge"), GAME1_Race = c("human", "elf", "orc", "human", 
"gnome"), GAME1_Alignment = c("NE", "CG", "CE", "NN", "LG"), 
GAME1_Level = c(6, 7, 6, 7, 7), GAME1_Alive = c("y", "y", 
"y", "y", "y"), Pupil = c("m.Stan", "m.Kenny", "m.Cartman", "m.Kyle", 
"m.Butters"), Detention = c("y", "y", "n", "n", "y")), row.names =  c("m.Stan", "m.Kenny", "m.Cartman", "m.Kyle", "m.Butters"), class = "data.frame")

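I would like to apply this function (or any other that works) to the whole list. However, because there are two different sets of columns to add to two different nested lists inside a single list, I am a bit stuck.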

lapply(Chars_alive, function(x) {addDetentions(x, rownames(x), Feb_detentions, Feb_detentions$Pupil)})
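The de-duplication step that is done outside the function above could also be folded into it. The following is only a minimal sketch under stated assumptions: the helper name add_detentions_matched and the small data frames are made up for illustration, and rows are aligned by name with match() instead of relying on row order. Unlike the original function, it keeps pupils without a detention record and fills them with NA.

```r
# Hypothetical variant of addDetentions(): de-duplicates inside the function
# and aligns rows by name with match() rather than by position.
add_detentions_matched <- function(df, detentions) {
  # keep only the first record for each pupil
  det <- detentions[!duplicated(detentions$Pupil), ]
  # position of each row name of df in det$Pupil (NA when the pupil is absent)
  idx <- match(rownames(df), det$Pupil)
  df$Detention <- det$Detention[idx]
  df
}

# Small made-up data with the same shape as the toy example
game <- data.frame(Level = c(6, 7, 6),
                   row.names = c("m.Stan", "m.Kenny", "m.Kyle"))
det <- data.frame(Pupil = c("m.Stan", "m.Stan", "m.Kyle"),
                  Detention = c("y", "y", "n"),
                  stringsAsFactors = FALSE)

res <- add_detentions_matched(game, det)
print(res)
```

Because the function takes only the data frame and the detentions table, it can be passed to lapply over one nested list without preparing a de-duplicated copy first.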

Any help would be greatly appreciated.


【Comments】:

  • You need Map(function(x, y) lapply(x, function(dat) {dat$Pupil <- row.names(dat); merge(dat, y)}), lst1, list(Feb_detentions, Mar_detentions))
  • @akrun I think that works well, thanks. Could I ask what x and y are doing?
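On the question of what x and y do: Map calls the function once per position, passing the matching element of each input as x and y. A small illustration, using made-up character vectors in place of the real data frames:

```r
# x receives one element of `games`, y the element of `dets` at the same position
games <- list(FEB_games = c("GAME1", "GAME2"), MAR_games = c("GAME3", "GAME4"))
dets  <- list("Feb_detentions", "Mar_detentions")

paired <- Map(function(x, y) paste(x, "<-", y), games, dets)
print(paired)
```

So in the Map call from the comment, x is one month's nested list of games and y is that month's detentions data frame. The result keeps the names of the first input.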

Tags: r list lapply


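【Solution 1】:

One option is to merge each data.frame nested in the list with the matching element of a second list, which is built in the same order as the names of the first list (the month names). Map loops over the corresponding elements of the two lists.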

Map(function(x, y) 
   # x is the first list which is a nested one
   # so loop through it
   lapply(x, function(dat) {
      # create a Pupil column from the row names
      dat$Pupil <- row.names(dat)
      # merge with the corresponding 'detentions' dataset
      merge(dat, y)
      }),
      # first list, created list
      lst1, list(Feb_detentions, Mar_detentions)) 
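The approach can be tried on a cut-down dataset. This is a sketch rather than the answer's exact code: the wrapper name merge_nested and the small data frames are made up, the join column is spelled out with by = "Pupil", and duplicated pupils are dropped first, because merge would otherwise return one row per duplicate.

```r
# Made-up, cut-down version of the nested list and the detentions tables
lst1 <- list(
  FEB_games = list(
    GAME1 = data.frame(Level = c(6, 7), row.names = c("m.Stan", "m.Kenny")),
    GAME2 = data.frame(Level = c(5, 5), row.names = c("m.Kenny", "m.Kyle"))
  ),
  MAR_games = list(
    GAME3 = data.frame(Level = c(1, 1), row.names = c("l.Stan", "l.Kenny"))
  )
)
Feb_detentions <- data.frame(Pupil = c("m.Stan", "m.Stan", "m.Kenny", "m.Kyle"),
                             Detention = c("y", "y", "y", "n"),
                             stringsAsFactors = FALSE)
Mar_detentions <- data.frame(Pupil = c("l.Stan", "l.Kenny"),
                             Detention = c("n", "y"),
                             stringsAsFactors = FALSE)

# Hypothetical wrapper around the Map/lapply/merge pattern
merge_nested <- function(games, detentions) {
  Map(function(x, y) {
    y <- y[!duplicated(y$Pupil), ]      # one record per pupil
    lapply(x, function(dat) {
      dat$Pupil <- row.names(dat)       # row names become the join key
      merge(dat, y, by = "Pupil")       # inner join on Pupil
    })
  }, games, detentions)
}

out <- merge_nested(lst1, list(Feb_detentions, Mar_detentions))
print(out$FEB_games$GAME1)
```

Note that merge returns a data frame with default integer row names, so the pupil names survive only in the Pupil column.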

With the tidyverse, the same can be done using map2

library(tidyverse)
map2(lst1, list(Feb_detentions, Mar_detentions),
       ~ {
         ydat <- .y
         map(.x, ~ .x %>%
                    rownames_to_column("Pupil") %>% 
                    inner_join(ydat))
         })

Update

If we only need to update the second nested list element of 'lst1', just extract that list element and do the merge

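Map(function(x, y) {
      # add a Pupil column to the second nested data.frame
      x[[2]]$Pupil <- row.names(x[[2]])
      # replace it with the merged result
      x[[2]] <- merge(x[[2]], y)
      # return the whole nested list, with only element 2 changed
      x
      }, lst1, list(Feb_detentions, Mar_detentions))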
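To check that only one nested data.frame per month is changed, the update can be run on a small made-up list. This is a sketch under assumptions: the helper name update_one and its index argument i are hypothetical, added here so the position is not hard-coded.

```r
# Hypothetical helper: merge detentions into the i-th data frame of each month only
update_one <- function(games, detentions, i = 2) {
  Map(function(x, y) {
    x[[i]]$Pupil <- row.names(x[[i]])          # join key from row names
    x[[i]] <- merge(x[[i]], y, by = "Pupil")   # replace element i with the merge
    x                                          # return the whole nested list
  }, games, detentions)
}

# Made-up data: two months, two games each, one pupil per game
lst1 <- list(
  FEB_games = list(
    GAME1 = data.frame(Level = 6, row.names = "m.Stan"),
    GAME2 = data.frame(Level = 5, row.names = "m.Kenny")
  ),
  MAR_games = list(
    GAME3 = data.frame(Level = 1, row.names = "l.Stan"),
    GAME4 = data.frame(Level = 5, row.names = "l.Kenny")
  )
)
dets <- list(
  data.frame(Pupil = c("m.Stan", "m.Kenny"), Detention = c("y", "n"),
             stringsAsFactors = FALSE),
  data.frame(Pupil = c("l.Stan", "l.Kenny"), Detention = c("n", "y"),
             stringsAsFactors = FALSE)
)

res <- update_one(lst1, dets)
print(names(res$FEB_games$GAME1))  # first game is left as it was
print(names(res$FEB_games$GAME2))  # second game gains Pupil and Detention
```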

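【Comments】:

  • Both functions work on my toy dataset, but on my real data the new columns are added to the second list rather than the first. I think this is because the row.names in my two nested lists are different. I will edit the question to make this clearer.
  • @Krutik In that case you don't need the second loop, i.e. lapply/map; instead use x[[2]] <- {x[[2]]$Pupil <- row.names(x[[2]]); merge(x[[2]], y)}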