【Question title】: Restructuring a nested list
【Posted】: 2020-04-30 07:46:59
【Question】:

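I am working with a nested list: a list of ground sensors, each holding a list of measurement depths, each holding a list of data frames (one per year, 2014-2018). 'SE' denotes the sensor and its number, and 'd' denotes the depth at which the sensor is placed in the soil. It looks like this: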

str(GRP3_OUT_gwFERN)

List of 9
 $ SE10:List of 3
  ..$ d20:List of 5
  .. ..$ 2014:'data.frame': 8760 obs. of  2 variables:
  .. .. ..$ Date: Date[1:8760], format: "2014-01-01" "2014-01-01" "2014-01-01" ...
  .. .. ..$ SWC : logi [1:8760] NA NA NA NA NA NA ...
  .. ..$ 2015:'data.frame': 8760 obs. of  2 variables:
  .. .. ..$ Date: Date[1:8760], format: "2015-01-01" "2015-01-01" "2015-01-01" ...
  .. .. ..$ SWC : logi [1:8760] NA NA NA NA NA NA ...
  .. ..$ 2016:'data.frame': 8784 obs. of  2 variables:
  .. .. ..$ Date: Date[1:8784], format: "2016-01-01" "2016-01-01" "2016-01-01" ...
  .. .. ..$ SWC : logi [1:8784] NA NA NA NA NA NA ...
  .. ..$ 2017:'data.frame': 8760 obs. of  2 variables:
  .. .. ..$ Date: Date[1:8760], format: "2017-01-01" "2017-01-01" "2017-01-01" ...
  .. .. ..$ SWC : num [1:8760] NA NA NA NA NA NA NA NA NA NA ...
  .. ..$ 2018:'data.frame': 8760 obs. of  2 variables:
  .. .. ..$ Date: Date[1:8760], format: "2018-01-01" "2018-01-01" "2018-01-01" ...
  .. .. ..$ SWC : logi [1:8760] NA NA NA NA NA NA ...
  ..$ d50:List of 5
  .. ..$ 2014:'data.frame': 8760 obs. of  2 variables:
  .. .. ..$ Date: Date[1:8760], format: "2014-01-01" "2014-01-01" "2014-01-01" ...
  .. .. ..$ SWC : num [1:8760] 39.8 39.7 39.8 39.7 39.7 ...
  .. ..$ 2015:'data.frame': 8760 obs. of  2 variables:
  .. .. ..$ Date: Date[1:8760], format: "2015-01-01" "2015-01-01" "2015-01-01" ...
  .. .. ..$ SWC : num [1:8760] 39.7 39.7 39.7 39.7 39.7 ...
  .. ..$ 2016:'data.frame': 8784 obs. of  2 variables:
  .. .. ..$ Date: Date[1:8784], format: "2016-01-01" "2016-01-01" "2016-01-01" ...
  .. .. ..$ SWC : num [1:8784] 39 39.1 39.1 39 39 ...
  .. ..$ 2017:'data.frame': 8760 obs. of  2 variables:
  .. .. ..$ Date: Date[1:8760], format: "2017-01-01" "2017-01-01" "2017-01-01" ...
  .. .. ..$ SWC : num [1:8760] 37.9 38 37.9 37.9 37.9 ...
  .. ..$ 2018:'data.frame': 8760 obs. of  2 variables:
  .. .. ..$ Date: Date[1:8760], format: "2018-01-01" "2018-01-01" "2018-01-01" ...
  .. .. ..$ SWC : num [1:8760] 39.1 39 39.1 39 39 ...
  ..$ d5 :List of 5
  .. ..$ 2014:'data.frame': 8760 obs. of  2 variables:
  .. .. ..$ Date: Date[1:8760], format: "2014-01-01" "2014-01-01" "2014-01-01" ...
  .. .. ..$ SWC : num [1:8760] 41 41 40.9 41 40.9 ...
  .. ..$ 2015:'data.frame': 8760 obs. of  2 variables:
  .. .. ..$ Date: Date[1:8760], format: "2015-01-01" "2015-01-01" "2015-01-01" ...
  .. .. ..$ SWC : num [1:8760] 42 42.1 42.1 42 42.1 ...
  .. ..$ 2016:'data.frame': 8784 obs. of  2 variables:
  .. .. ..$ Date: Date[1:8784], format: "2016-01-01" "2016-01-01" "2016-01-01" ...
  .. .. ..$ SWC : num [1:8784] 43.3 43.4 43.4 43.3 43.3 ...
  .. ..$ 2017:'data.frame': 8760 obs. of  2 variables:
  .. .. ..$ Date: Date[1:8760], format: "2017-01-01" "2017-01-01" "2017-01-01" ...
  .. .. ..$ SWC : num [1:8760] 42.1 42.1 42.2 42.1 42.1 ...
  .. ..$ 2018:'data.frame': 8760 obs. of  2 variables:
  .. .. ..$ Date: Date[1:8760], format: "2018-01-01" "2018-01-01" "2018-01-01" ...
  .. .. ..$ SWC : num [1:8760] 44 44.1 44.1 44.1 44.1 ...
 $ SE11:List of 3
  ..$ d20:List of 5
  .. ..$ 2014:'data.frame': 8760 obs. of  2 variables:
  .. .. ..$ Date: Date[1:8760], format: "2014-01-01" "2014-01-01" "2014-01-01" ...
  .. .. ..$ SWC : num [1:8760] 46.6 46.5 46.4 46.4 46.4 ...
  .. ..$ 2015:'data.frame': 8760 obs. of  2 variables:
  .. .. ..$ Date: Date[1:8760], format: "2015-01-01" "2015-01-01" "2015-01-01" ...
  .. .. ..$ SWC : num [1:8760] 46.6 46.5 46.6 46.6 46.6 ...
  .. ..$ 2016:'data.frame': 8784 obs. of  2 variables:
  .. .. ..$ Date: Date[1:8784], format: "2016-01-01" "2016-01-01" "2016-01-01" ...
  .. .. ..$ SWC : num [1:8784] 45.1 45.1 45.1 45.1 45.1 ...
  .. ..$ 2017:'data.frame': 8760 obs. of  2 variables:
  .. .. ..$ Date: Date[1:8760], format: "2017-01-01" "2017-01-01" "2017-01-01" ...
  .. .. ..$ SWC : num [1:8760] 40.2 40.2 40.2 40.2 40.2 ...
  .. ..$ 2018:'data.frame': 8760 obs. of  2 variables:
  .. .. ..$ Date: Date[1:8760], format: "2018-01-01" "2018-01-01" "2018-01-01" ...
  .. .. ..$ SWC : num [1:8760] 49.1 49.2 49.3 49.2 49.3 ...
  ..$ d50:List of 5
  .. ..$ 2014:'data.frame': 8760 obs. of  2 variables:
  .. .. ..$ Date: Date[1:8760], format: "2014-01-01" "2014-01-01" "2014-01-01" ...
  .. .. ..$ SWC : num [1:8760] 34.1 34 34.1 34 34 ...
  .. ..$ 2015:'data.frame': 8760 obs. of  2 variables:
  .. .. ..$ Date: Date[1:8760], format: "2015-01-01" "2015-01-01" "2015-01-01" ...
  .. .. ..$ SWC : num [1:8760] 32.8 32.8 32.8 32.7 32.7 ...
  .. ..$ 2016:'data.frame': 8784 obs. of  2 variables:
  .. .. ..$ Date: Date[1:8784], format: "2016-01-01" "2016-01-01" "2016-01-01" ...
  .. .. ..$ SWC : logi [1:8784] NA NA NA NA NA NA ...
  .. ..$ 2017:'data.frame': 8760 obs. of  2 variables:
  .. .. ..$ Date: Date[1:8760], format: "2017-01-01" "2017-01-01" "2017-01-01" ...
  .. .. ..$ SWC : logi [1:8760] NA NA NA NA NA NA ...
  .. ..$ 2018:'data.frame': 8760 obs. of  2 variables:
  .. .. ..$ Date: Date[1:8760], format: "2018-01-01" "2018-01-01" "2018-01-01" ...
  .. .. ..$ SWC : logi [1:8760] NA NA NA NA NA NA ...
  ..$ d5 :List of 5
  .. ..$ 2014:'data.frame': 8760 obs. of  2 variables:
  .. .. ..$ Date: Date[1:8760], format: "2014-01-01" "2014-01-01" "2014-01-01" ...
  .. .. ..$ SWC : num [1:8760] 33.8 33.8 33.8 33.8 33.7 ...
  .. ..$ 2015:'data.frame': 8760 obs. of  2 variables:
  .. .. ..$ Date: Date[1:8760], format: "2015-01-01" "2015-01-01" "2015-01-01" ...
  .. .. ..$ SWC : num [1:8760] 35.7 35.7 35.7 35.7 35.7 ...
  .. ..$ 2016:'data.frame': 8784 obs. of  2 variables:
  .. .. ..$ Date: Date[1:8784], format: "2016-01-01" "2016-01-01" "2016-01-01" ...
  .. .. ..$ SWC : num [1:8784] 31.5 31.5 31.5 31.5 31.5 ...
  .. ..$ 2017:'data.frame': 8760 obs. of  2 variables:
  .. .. ..$ Date: Date[1:8760], format: "2017-01-01" "2017-01-01" "2017-01-01" ...
  .. .. ..$ SWC : num [1:8760] 28.3 28.3 28.3 28.2 28.2 ...
  .. ..$ 2018:'data.frame': 8760 obs. of  2 variables:
  .. .. ..$ Date: Date[1:8760], format: "2018-01-01" "2018-01-01" "2018-01-01" ...
  .. .. ..$ SWC : num [1:8760] 35.4 35.5 35.6 35.5 35.4 ..

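Since all data frames contain NA values, I want to run a linear regression to fill the NA gaps. I think that to do this I need to restructure my list so that I get one data frame for 2014 and depth 20 that contains all sensors for that year and depth, then the same for 2015 in the next data frame, the same for 2016 in the one after that, and so on.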

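Why do I want to do this? To run the linear regression, I want to create an indicator variable in a new column for SE11 (for example) and fill its NA gaps using the values of the other sensor that has the highest correlation coefficient with it. This is what it would look like for 2014, for example: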

 SE11_d20_2014_SWC SE12_d20_2014_SWC SE_44_d20_2014_SWC
             2            NA              4
             3             2              6
             NA            3             NA
             6            NA              3

 SE11_d50_2014_SWC SE12_d50_2014_SWC SE_44_d50_2014_SWC
             2            NA              4
             3             2              6
             4             5              4
            NA             3             NA
             6            NA              3


 SE11_d5_2014_SWC SE12_d5_2014_SWC SE_44_d5_2014_SWC
             2            NA              4
             3             2              6
             4             5              4
            NA             3             NA
             6            NA              3
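
The gap-filling idea described above (regress one sensor on the other sensor it correlates with best, then predict the missing values) could be sketched as follows. This is only an illustration under assumptions: the table `wide`, its values, and the helper name `fill_from_best` are made up, and the strongest correlation is chosen by absolute value.

```r
# Made-up wide table for one depth and year: one SWC column per sensor.
wide <- data.frame(
  SE11_d20_2014_SWC = c(2, 3, NA, 6, 5, NA, 8),
  SE12_d20_2014_SWC = c(1, 2, 3, 5, 4, 6, 7),
  SE44_d20_2014_SWC = c(9, 1, 7, 2, 8, 3, 4)
)

# Hypothetical helper: fill NA values of `target` from the best-correlated column.
fill_from_best <- function(df, target) {
  others <- setdiff(names(df), target)
  # Correlation of the target with every other sensor, using only rows
  # where both values are observed.
  r <- sapply(others, function(o) {
    cor(df[[target]], df[[o]], use = "pairwise.complete.obs")
  })
  # Assumption: "highest correlation" is taken as the largest absolute value.
  best <- others[which.max(abs(r))]
  # Fit target ~ best; lm() drops rows with NA in either column by default.
  fit <- lm(reformulate(best, response = target), data = df)
  # Only rows where the target is missing and the predictor is present can be filled.
  gap <- is.na(df[[target]]) & !is.na(df[[best]])
  df[[target]][gap] <- predict(fit, newdata = df[gap, , drop = FALSE])
  df
}

filled <- fill_from_best(wide, "SE11_d20_2014_SWC")
filled
```

In this toy table SE11 equals SE12 + 1 on every complete row, so SE12 is selected as the predictor and the two gaps are filled from that fitted line. Rows where the chosen predictor is also NA stay NA and would need a second-best sensor.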

I have done some research on restructuring lists, but unfortunately I could not find anything that helps. Can anyone help?

Edit, toy data (a copy of my list and its structure):

# Toy data with the same nesting: sensor > depth > year > data frame (hourly rows).
dat <- setNames(replicate(3, setNames(replicate(3, setNames(lapply(2014:2018, function(y) {
    d <- expand.grid(date=as.Date(as.character(seq(ISOdate(y, 1, 1, 0), ISOdate(y, 12, 31, 0), by="day"))), 
                     hour=1:24)
    d$swc <- rnorm(nrow(d))
    d[order(d$date), -2]  # sort by date and drop the helper 'hour' column
}), 2014:2018), simplify=FALSE), c("d20", "d50", "d5")), simplify=FALSE), c("SE104", "SE105", "SE106"))

Phil

【Question discussion】:

    Tags: r list regression linear-regression


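    【Solution 1】:

    I guess you just need to convert the list into a data.frame while keeping all the nesting information. You can then filter it as needed.

    You can do this with a for() loop.

    I created some data to demonstrate it: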

    vec_a <- c(1,2,3)
    vec_b <- c(10,20,30)
    
    l <- list(
      top1 = list(
        middle1 = list(
          a=vec_a,
          b=vec_b
        ),
        middle2 = list(
          a=vec_a,
          b=vec_b
        )
      ),
      top2 = list(
        middle1 = list(
          a=vec_a,
          b=vec_b
        ),
        middle2 = list(
          a=vec_a,
          b=vec_b
        )
      ),
      top3 = list(
        middle1 = list(
          a=vec_a,
          b=vec_b
        ),
        middle2 = list(
          a=vec_a,
          b=vec_b
        )
      )
    )
    
    str(l)
    #> List of 3
    #>  $ top1:List of 2
    #>   ..$ middle1:List of 2
    #>   .. ..$ a: num [1:3] 1 2 3
    #>   .. ..$ b: num [1:3] 10 20 30
    #>   ..$ middle2:List of 2
    #>   .. ..$ a: num [1:3] 1 2 3
    #>   .. ..$ b: num [1:3] 10 20 30
    #>  $ top2:List of 2
    #>   ..$ middle1:List of 2
    #>   .. ..$ a: num [1:3] 1 2 3
    #>   .. ..$ b: num [1:3] 10 20 30
    #>   ..$ middle2:List of 2
    #>   .. ..$ a: num [1:3] 1 2 3
    #>   .. ..$ b: num [1:3] 10 20 30
    #>  $ top3:List of 2
    #>   ..$ middle1:List of 2
    #>   .. ..$ a: num [1:3] 1 2 3
    #>   .. ..$ b: num [1:3] 10 20 30
    #>   ..$ middle2:List of 2
    #>   .. ..$ a: num [1:3] 1 2 3
    #>   .. ..$ b: num [1:3] 10 20 30
    
    library(tidyverse)
    
    d <- data.frame()
    
    for (i in seq_along(l)){
      df <- l[[i]] %>% bind_rows(.id = "level2") %>% 
        mutate(level1 = names(l)[i])
      d <- d %>% bind_rows(df)
    }
    
    d
    #>     level2 a  b level1
    #> 1  middle1 1 10   top1
    #> 2  middle1 2 20   top1
    #> 3  middle1 3 30   top1
    #> 4  middle2 1 10   top1
    #> 5  middle2 2 20   top1
    #> 6  middle2 3 30   top1
    #> 7  middle1 1 10   top2
    #> 8  middle1 2 20   top2
    #> 9  middle1 3 30   top2
    #> 10 middle2 1 10   top2
    #> 11 middle2 2 20   top2
    #> 12 middle2 3 30   top2
    #> 13 middle1 1 10   top3
    #> 14 middle1 2 20   top3
    #> 15 middle1 3 30   top3
    #> 16 middle2 1 10   top3
    #> 17 middle2 2 20   top3
    #> 18 middle2 3 30   top3
    

    Created on 2020-04-30 by the reprex package (v0.3.0)
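
The same idea can be carried over to the sensor > depth > year nesting from the question. The following is a minimal base R sketch, not a tested solution for the real data: it assumes every sensor has the same rows in the same order for a given depth and year, and the names `make_year`, `combine_sensors`, and `by_depth_year`, as well as the tiny stand-in data, are made up for illustration.

```r
# Small stand-in for the nested list: sensor > depth > year > data frame.
# Sensor names, sizes, and values are invented for this example.
set.seed(1)
years   <- 2014:2015
depths  <- c("d20", "d50", "d5")
sensors <- c("SE104", "SE105", "SE106")

make_year <- function(y) {
  data.frame(Date = as.Date(paste0(y, "-01-01")) + 0:3,
             SWC  = round(rnorm(4, mean = 40), 1))
}

dat <- setNames(lapply(sensors, function(s) {
  setNames(lapply(depths, function(d) {
    setNames(lapply(years, make_year), years)
  }), depths)
}), sensors)

# For one depth and one year, take the SWC column of every sensor and put
# the columns side by side, named <sensor>_<depth>_<year>_SWC.
combine_sensors <- function(x, depth, year) {
  year <- as.character(year)  # year names such as "2014" are character strings
  cols <- lapply(x, function(sensor) sensor[[depth]][[year]]$SWC)
  names(cols) <- paste(names(x), depth, year, "SWC", sep = "_")
  # Assumption: the Date column of the first sensor applies to all sensors.
  data.frame(Date = x[[1]][[depth]][[year]]$Date, cols)
}

# New nesting: depth > year > wide data frame with one column per sensor.
by_depth_year <- setNames(lapply(depths, function(d) {
  setNames(lapply(years, function(y) combine_sensors(dat, d, y)), years)
}), depths)

names(by_depth_year$d20[["2014"]])
```

Because the function loops over `names(x)`, it does not matter whether the list holds 9 or 146 sensors; nothing has to be defined per sensor. If the sensors do not share identical rows, merging on `Date` would be safer than binding columns by position.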

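    【Discussion】:

    • That means I first have to define all the vectors I want to put into the data frame, right? (Like you did with vec_a <- and vec_b <-.) The problem is that I have 146 sensors in total, so this might be an option for the list that contains only 9 sensors, but it would take a very long time for the full set of sensors.
    • I only made vec_a and vec_b to create the example list. In real code, these two vectors would not be named. If you post part of your data using dput(data), you can get an exact answer for your specific data.
    • Edited the post so that you have some toy data. I used dput() for two sensors at the same depth and in the same year, so you have two vectors in total.
    • Sorry, I just realized that this does not help you. Edited again; now you have a list with the proper structure.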