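【Question title】: Recursive indexing of lists with variable index value per recursion step
【Posted】: 2020-02-15 08:39:51
【Question】:

Phew... even trying to frame the title correctly already gave me a headache.

I have a config.yml with nested values, and I would like to define an indexing function get_config() that accepts "path-like" value strings.

The "path entities" of the value string match the nested entity structure of the config file. Based on the path-like value, the function should then fetch the corresponding hierarchy entity (either a "branch" or a "leaf") from the config file.

Example

Suppose this is the structure of config.yml: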

default:
  column_names:
    col_id: "id"
    col_value: "value"
  column_orders:
    data_structure_a: [
      column_names/col_id,
      column_names/col_value
    ]
    data_structure_b: [
      column_names/col_value,
      column_names/col_id
    ]

Here is a parsed version for you to work with:

x <- yaml::yaml.load(
'default:
  column_names:
    col_id: "id"
    col_value: "value"
  column_orders:
    data_structure_a: [
      column_names/col_id,
      column_names/col_value
    ]
    data_structure_b: [
      column_names/col_value,
      column_names/col_id
    ]'
)

Accessing top-level entities is easy with config::get(value):

config::get("column_names")
# $col_id
# [1] "id"
# 
# $col_value
# [1] "value"

config::get("column_orders")
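# $data_structure_a
# [1] "column_names/col_id"    "column_names/col_value"
#
# $data_structure_b
# [1] "column_names/col_value" "column_names/col_id"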

But I would also like to access deeper entities, e.g. column_names: col_id.

In pseudo code:

config::get("column_names:col_id")

config::get("column_orders/data_structure_a")

The best I could come up with so far: relying on unlist()

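library(magrittr)  # provides the %>% pipe used below

get_config <- function(value, sep = ":") {
  if (value %>% stringr::str_detect(sep)) {
    # Path-like value: translate it into the dotted names produced by unlist()
    value <- value %>% stringr::str_replace(sep, ".")
    configs <- config::get() %>% unlist()
    configs[value]
  } else {
    # Top-level entity: config::get() handles it directly
    config::get(value)
  }
}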

get_config("column_names")
# $col_id
# [1] "id"
#
# $col_value
# [1] "value"

get_config("column_names:col_id")
# column_names.col_id 
# "id" 

It is not elegant, but it works for most use cases. It fails, however, for unnamed list entities in the config file:

get_config("column_orders:data_structure_a")
# <NA> 
#   NA 

because my indexing approach does not line up with the result of unlist() on unnamed lists:

config::get() %>% unlist()
#             column_names.col_id          column_names.col_value
#                            "id"                         "value"
# column_orders.data_structure_a1 column_orders.data_structure_a2
#           "column_names/col_id"        "column_names/col_value"
# column_orders.data_structure_b1 column_orders.data_structure_b2
#        "column_names/col_value"           "column_names/col_id"
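
The mismatch can be reproduced in isolation. The following is a minimal base-R sketch, where `cfg` is a hand-built stand-in (an assumption, not the real config::get() result): unlist() appends a running number to the name of each element of an unnamed vector, so the un-numbered name never exists as a key.

```r
# `cfg` is a hypothetical stand-in for the relevant part of the parsed config
cfg <- list(
  column_orders = list(
    data_structure_a = c("column_names/col_id", "column_names/col_value")
  )
)

flat <- unlist(cfg)

# Each element of the unnamed vector gets a numeric suffix
names(flat)
# [1] "column_orders.data_structure_a1" "column_orders.data_structure_a2"

# The un-numbered name is therefore not a valid key ...
"column_orders.data_structure_a" %in% names(flat)
# [1] FALSE

# ... and indexing by it yields NA instead of the two values
unname(is.na(flat["column_orders.data_structure_a"]))
# [1] TRUE
```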

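Hence, I thought "recursion", but my brain said: "no way, dude".

Due diligence

This solution comes close (I guess).

But I keep thinking that I need something like purrr::map2_if() or purrr::pmap_if() (which don't exist, AFAIK) instead of purrr::map_if(), as I need to recursively traverse not only the list behind config::get(), but also the list version of value (e.g. via stringr::str_split(value, sep) %>% unlist() %>% as.list())?

【Question discussion】:

    Tags: r recursion indexing purrr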


    【Solution 1】:

    You can also use purrr::pluck to index into a nested list by name, if that is what you are after:

    x <- yaml::yaml.load('
      column_names:
        col_id: "id"
        col_value: "value"
      column_orders:
        data_structure_a: [
          column_names/col_id,
          column_names/col_value
        ]
        data_structure_b: [
          column_names/col_value,
          column_names/col_id
        ]
      nested_list:
        element_1:
          element_2:
            value: "hello world"
      ')
    
    purrr::pluck(x, "column_names", "col_id")
    #> [1] "id"
    
    purrr::pluck(x, "column_names")
    #> $col_id
    #> [1] "id"
    #> 
    #> $col_value
    #> [1] "value"
    
    purrr::pluck(x, "column_orders", "data_structure_a")
    #> [1] "column_names/col_id"    "column_names/col_value"
    
    purrr::pluck(x, "column_names", "col_notthere")
    #> NULL
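
To accept a single path string as the question asks, the path can be split and spliced into pluck(). This is only a sketch: `get_by_path()` and the hand-built `cfg` list are hypothetical names introduced here, and it relies on pluck() supporting dynamic dots (`!!!`), as its documentation describes.

```r
# Hypothetical helper (not part of purrr): split a path string on `sep`
# and pluck along the resulting keys.
get_by_path <- function(lst, path, sep = "/") {
  keys <- strsplit(path, sep, fixed = TRUE)[[1]]
  # `!!!` splices the list of keys into pluck()'s dots
  purrr::pluck(lst, !!!as.list(keys))
}

# Hand-built stand-in for the parsed config
cfg <- list(
  column_names = list(col_id = "id", col_value = "value"),
  column_orders = list(
    data_structure_a = c("column_names/col_id", "column_names/col_value")
  )
)

get_by_path(cfg, "column_names/col_id")
get_by_path(cfg, "column_orders/data_structure_a")
get_by_path(cfg, "column_names/col_notthere")  # NULL, same as pluck()
```

Like pluck() itself, this returns NULL for a missing element rather than raising an error; purrr::chuck() is the stricter variant if an error is preferred.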
    


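      【Solution 2】:

      I came up with a solution based on Recall().

      However, while digging through the internet trying to get here, I remember reading somewhere that Recall() is generally not a very (memory-)efficient way of doing recursion in R? I would also appreciate additional hints on how to do recursion in a tidy manner with purrr and friends.

      Config file content

      Being able to call get_config() implies that you have a config.yml file with the content above in your project's root directory as given by here::here(), but you can test get_list_element_recursively() with this workaround: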

      x <- yaml::yaml.load('
        column_names:
          col_id: "id"
          col_value: "value"
        column_orders:
          data_structure_a: [
            column_names/col_id,
            column_names/col_value
          ]
          data_structure_b: [
            column_names/col_value,
            column_names/col_id
          ]
        nested_list:
          element_1:
            element_2:
              value: "hello world"
        ')
      

      Function definition

      get_config <- function(value, sep = "/") {
        get_list_element_recursively(
          config::get(),
          stringr::str_split(value, sep, simplify = TRUE)
        )
      }
      
      get_list_element_recursively <- function(
        lst,
        el,
        .el_trace = el,
        .level_trace = 1
      ) {
        # Reached leaf:
        if (!is.list(lst)) {
          return(lst)
        }
      
        # Element not in list:
        if (!(el[1] %in% names(lst))) {
          message("Current list branch:")
          # print(lst)
          message(str(lst))
          message("Trace of indexing vec (last element is invalid):")
          message(stringr::str_c(.el_trace[.level_trace], collapse = "/"))
          stop(stringr::str_glue("No such element in list: {el[1]}"))
        }
      
        lst <- lst[[ el[1] ]]
      
        if (!is.na(el[2])) {
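          # Continue if there are additional elements in `el` vec.
          # Grow the trace index by one level; seq_len(length(...)) stays
          # correct once `.level_trace` is itself a vector (depth > 2).
          Recall(lst, el[-1], .el_trace, .level_trace = seq_len(length(.level_trace) + 1))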
        } else {
          # Otherwise return last indexing result:
          lst
        }
      }
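
Since the path is a flat vector of keys, the same lookup can also be written without Recall() or any recursion. The following is a sketch under stated assumptions, not a drop-in replacement: `get_list_element_iteratively()` and the list `cfg` are hypothetical names, and unlike the function above it raises an error when keys remain after a leaf has been reached.

```r
# Hypothetical iterative variant: walk the list one key at a time.
get_list_element_iteratively <- function(lst, el) {
  for (i in seq_along(el)) {
    if (!is.list(lst) || !(el[i] %in% names(lst))) {
      stop(
        "No such element in list: ", el[i],
        " (path so far: ", paste(el[seq_len(i)], collapse = "/"), ")"
      )
    }
    lst <- lst[[el[i]]]
  }
  lst
}

# Hand-built stand-in for the parsed config
cfg <- list(
  column_names = list(col_id = "id", col_value = "value"),
  nested_list = list(element_1 = list(element_2 = list(value = "hello world")))
)

get_list_element_iteratively(cfg, c("column_names", "col_id"))
get_list_element_iteratively(cfg, c("nested_list", "element_1", "element_2", "value"))

# Without the error handling, base R's Reduce() expresses the same fold
Reduce(function(acc, key) acc[[key]], c("column_names", "col_value"), cfg)
```

The loop keeps only the current branch in scope, so the depth of the path does not add call-stack frames.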
      

      Testing get_config()

      get_config("column_names")
      # $col_id
      # [1] "id"
      #
      # $col_value
      # [1] "value"
      
      get_config("column_names/col_id")
      # [1] "id"
      
      get_config("column_names/col_nonexisting")
      # Current list branch:
      #   List of 6
      # $ col_id                    : chr "id"
      # $ col_value                 : chr "value"
      #
      # Trace of indexing vec (last element is invalid):
      #   column_names/col_nonexisting
      # Error in get_list_element_recursively(config::get(), stringr::str_split(value,  :
      #     No such element in list: col_nonexisting
      
      get_config("column_orders")
      # $data_structure_a
      # [1] "column_names/col_id"    "column_names/col_value"
      #
      # $data_structure_b
      # [1] "column_names/col_value" "column_names/col_id"
      
      get_config("column_orders/data_structure_a")
      # [1] "column_names/col_id"    "column_names/col_value"
      

      Testing get_list_element_recursively()

      get_list_element_recursively(x, c("column_names"))
      # $col_id
      # [1] "id"
      #
      # $col_value
      # [1] "value"
      
      get_list_element_recursively(x, c("column_names", "col_id"))
      # [1] "id"
      
      get_list_element_recursively(x, c("column_names", "col_notthere"))
      # Current list branch:
      #   List of 2
      # $ col_id   : chr "id"
      # $ col_value: chr "value"
      #
      # Trace of indexing vec (last element is invalid):
      #   column_names/col_notthere
      # Error in get_list_element_recursively(x$default, c("column_names", "col_notthere")) :
      #   No such element in list: col_notthere
      
