【Question title】: R: How to read a text file into a list of lists recursively?
【Posted】: 2018-11-04 00:15:10
【Question description】:

I have a text file in the following format:

date=1638.1.16
player=\"BYZ\"
savegame_version={
\tfirst=1
\tsecond=25
\tthird=1
\tforth=0
\tname=\"England\"
}
mod_enabled={
\t\"Large Font\"
\t\"Large Tooltips\"
}

I want to read this into R as a list of character vectors, where each { ... } pair creates another (nested) list. The result should look like this:

[[1]]
[1] "date=1638.1.16"

[[2]]
[1] "player=\"BYZ\""

[[3]]
[[3]][[1]]
[1] "savegame_version={"

[[3]][[2]]
[1] "\tfirst=1"

[[3]][[3]]
[1] "\tsecond=25"

[[3]][[4]]
[1] "\tthird=1"

[[3]][[5]]
[1] "\tforth=0"

[[3]][[6]]
[1] "\tname=\"England\""

[[3]][[7]]
[1] "}"

[[4]]
[[4]][[1]]
[1] "mod_enabled={"

[[4]][[2]]
[1] "\t\"Large Font\""

[[4]][[3]]
[1] "\t\"Large Tooltips\""

[[4]][[4]]
[1] "}"

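I tried looping over the lines with a function that builds a list and calls itself recursively whenever it meets a { symbol. The problem is that the result is a single flat list rather than the nested list shown above.

The function is currently written as: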

list_create <- function(vector){
  temp_list <- list()
  for(i in 1:length(vector)){
    if(str_detect(vector[i], pattern = "\\{")) {
      list_create(vector[i+1:length(vector)])
    }
    if(str_detect(vector[i], pattern = "\\}")) {
      return(temp_list)
    }
    temp_list <- append(temp_list, vector[i])
  }
}

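Is there any way to get the result I want?

【Question discussion】:

    Tags: r list recursion nested


    【Solution 1】:

    How many levels of sublists do you have? For the example you provided (only 2 levels of lists), this should work: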

    # read the file in
    txt <- readLines("listtext.txt")
    
    # create an empty list
    main.list <- list()
    
    # indicator that we are within sublist
    sub=FALSE
    
    # loop through each line
    for( i in seq_along(txt) ){
    
      # check if the string opens a new sublist
      if ( grepl("\\{", txt[i]) ){
        sub.list <- list()   # start a new sublist
        sub.list <- c(sub.list, txt[i])  # add the line as the first line in the new list
        sub = TRUE                       # inside the sublist
    
      # check if we need to close sublist
      } else if(grepl("\\}", txt[i]) ){
        sub.list <- c(sub.list, txt[i])  # add the last line to sublist
        main.list <- c(main.list, list(sub.list))   # add sublist to the main list
        sub=FALSE                        # no longer inside sublist
    
      # if we are within sublist    
      } else if(sub) {
        sub.list <- c(sub.list, txt[i])
    
      # regular record    
      } else {
        main.list <- c(main.list, txt[i] )
      }
    }
    
    main.list
    # [[1]]
    # [1] "date=1638.1.16"
    # 
    # [[2]]
    # [1] "player=\\\"BYZ\\\""
    # 
    # [[3]]
    # [[3]][[1]]
    # [1] "savegame_version={"
    # 
    # [[3]][[2]]
    # [1] "\\tfirst=1"
    # 
    # [[3]][[3]]
    # [1] "\\tsecond=25"
    # 
    # [[3]][[4]]
    # [1] "\\tthird=1"
    # 
    # [[3]][[5]]
    # [1] "\\tforth=0"
    # 
    # [[3]][[6]]
    # [1] "\\tname=\\\"England\\\""
    # 
    # [[3]][[7]]
    # [1] "}"
    # 
    # 
    # [[4]]
    # [[4]][[1]]
    # [1] "mod_enabled={"
    # 
    # [[4]][[2]]
    # [1] "\\t\\\"Large Font\\\""
    # 
    # [[4]][[3]]
    # [1] "\\t\\\"Large Tooltips\\\""
    # 
    # [[4]][[4]]
    # [1] "}"
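
The step that makes the result nested is `c(main.list, list(sub.list))`: wrapping the sublist in `list()` adds it as a single element, whereas combining two lists directly splices their elements together. A minimal sketch of that difference, using made-up toy values (`main`, `sub`, `flat` and `nested` are illustrative names, not part of the code above):

```r
# Toy values for illustration only; not read from the save file.
main <- list("date=1638.1.16")
sub  <- list("mod_enabled={", "}")

flat   <- c(main, sub)        # elements of `sub` are spliced into the result
nested <- c(main, list(sub))  # `sub` is added as one element (a sublist)

length(flat)    # 3
length(nested)  # 2
```
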
    

    If you have many nested levels of sublists, you can write a recursive function:

    main.list <- list()
    subfun <- function(istart, txt){
    
      sub.list <- list()
      sub.list <- c(sub.list, txt[istart])
      j = istart + 1
      while( !grepl("\\}", txt[j]) ){
    
        if ( grepl("\\{", txt[j]) ){
          x <- subfun(j, txt)
          sub.list <- c(sub.list, list(x$sub) )  # add sublist to the main list
          j=x$iend
    
          # regular record    
        } else {
          sub.list <- c(sub.list, txt[j] )
        }    
        j <- j+1
      }
      sub.list <- c(sub.list, txt[j])
      return(list(sub=sub.list, iend=j))
    }
    
    # loop through each line
    i=1
    while( i <= length(txt) ){
    
      # check if the string opens a new sublist
      if ( grepl("\\{", txt[i]) ){
        x <- subfun(i, txt)
        main.list <- c(main.list, list(x$sub) )  # add sublist to the main list
        i=x$iend
    
        # regular record    
      } else {
        main.list <- c(main.list, txt[i] )
      }
      i <- i+1
    }
    

    For your example, it produces the same result as the first approach.
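
The same idea can also be sketched as a single function that avoids the global `main.list` and stops with an error on unbalanced braces instead of looping past the end of the input. This is only one possible variant, not the code from the answer above: `parse_block`, its arguments, and `sample_txt` are hypothetical names, and it assumes each line contains at most one brace.

```r
# Hypothetical helper: parses lines from `start` until the matching "}".
# Returns the parsed items and the index of the last line consumed.
parse_block <- function(txt, start = 1L, nested = FALSE) {
  out <- list()
  i <- start
  while (i <= length(txt)) {
    line <- txt[i]
    if (grepl("{", line, fixed = TRUE)) {
      # Opening line: parse the block body recursively and keep the
      # opening line as the first element of the new sublist.
      child <- parse_block(txt, i + 1L, nested = TRUE)
      out <- c(out, list(c(list(line), child$items)))
      i <- child$last
    } else if (grepl("}", line, fixed = TRUE)) {
      if (!nested) stop("unmatched '}' at line ", i)
      # Closing line: it becomes the last element of the current sublist.
      return(list(items = c(out, line), last = i))
    } else {
      out <- c(out, line)  # regular record
    }
    i <- i + 1L
  }
  if (nested) stop("missing '}' for the block starting at line ", start - 1L)
  list(items = out, last = length(txt))
}

# Made-up input with two levels of nesting.
sample_txt <- c(
  "date=1638.1.16",
  "player=\"BYZ\"",
  "savegame_version={",
  "\tfirst=1",
  "}",
  "outer={",
  "\tinner={",
  "\t\tx=1",
  "\t}",
  "}"
)

res <- parse_block(sample_txt)$items
str(res)
```

With real data, `sample_txt` would be replaced by the vector returned from `readLines()`.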

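    【Discussion】:

    • Thanks! The actual text file has more than 5 levels of sublists, so the recursive function you provided gives the correct result for the whole list.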