【Question title】: Extracting elements from a list to create a matrix
【Posted】: 2021-04-28 16:50:01
【Question description】:

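I have a list of countries, and each country holds its own list.

As an example, here is the list object for one country, which contains entries for two counterpart countries (df_DOTS):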

df_DOTS <- list(BR = structure(list(`@FREQ` = "M", `@REF_AREA` = "AU", `@INDICATOR` = "TXG_FOB_USD", 
    `@COUNTERPART_AREA` = "BR", `@UNIT_MULT` = "6", `@TIME_FORMAT` = "P1M", 
    Obs = list(structure(list(`@TIME_PERIOD` = c("2019-07", "2019-08", 
    "2019-09"), `@OBS_VALUE` = c("55.687747", "36.076581", "57.764474"
    )), class = "data.frame", row.names = c(NA, 3L)))), row.names = 2L, class = "data.frame"), 
    US = structure(list(`@FREQ` = "M", `@REF_AREA` = "AU", `@INDICATOR` = "TXG_FOB_USD", 
        `@COUNTERPART_AREA` = "US", `@UNIT_MULT` = "6", `@TIME_FORMAT` = "P1M", 
        Obs = list(structure(list(`@TIME_PERIOD` = c("2019-07", 
        "2019-08", "2019-09"), `@OBS_VALUE` = c("876.025841", 
        "872.02118", "787.272851")), class = "data.frame", row.names = c(NA, 
        3L)))), row.names = 1L, class = "data.frame"))

I can get to the matrix I am looking for (matrix_DOTS) with these lines of code:

library(dplyr)
library(rlist)
library(magrittr)

BR <- df_DOTS[["BR"]][["Obs"]] %>%
  list.select(.$`@OBS_VALUE`) %>%
  unlist() %>%
  sapply(function(x) as.numeric(as.character(x))) %>%
  mean()

US <- df_DOTS[["US"]][["Obs"]] %>%
  list.select(.$`@OBS_VALUE`) %>%
  unlist() %>%
  sapply(function(x) as.numeric(as.character(x))) %>%
  mean()

matrix_DOTS <- matrix(c(BR, US), nrow = 1, dimnames = list(c("AU"), c("BR", "US")))

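Since I have lists for several countries, each containing lists for several other countries, I am looking for a more practical way to build matrix_DOTS. Any help is much appreciated!

PS: This is the dput of the final matrix for this example: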

matrix_DOTS <- structure(c(49.842934, 845.106624), .Dim = 1:2, .Dimnames = list(
    "AU", c("BR", "US")))

Edit

This is the process used to obtain df_DOTS:

library(IMFData)

databaseID <- "DOT"
startdate = "2019-07-01"
enddate = "2019-09-01"
checkquery = FALSE

queryfilter <- list(CL_FREQ = "M", CL_AREA_DOT = "AU",
                    CL_INDICATOR_DOT = "TXG_FOB_USD",
                    CL_COUNTERPART_AREA_DOT = c("BR", "US"))
df_DOTS <- CompactDataMethod(databaseID, queryfilter, startdate, enddate, checkquery) %>%
  split(.$`@COUNTERPART_AREA`) 

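【Comments】:

  • How did you get this list? There may be a smarter way.
  • Thanks for your comment. I am editing my question to add the process used to collect the data.

Tags: r list loops matrix dplyr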


【Solution 1】:

Just add tidy = TRUE to the CompactDataMethod call:

library(IMFData)

databaseID <- "DOT"
startdate = "2019-07-01"
enddate = "2019-09-01"
checkquery = FALSE

queryfilter <- list(CL_FREQ = "M", CL_AREA_DOT = "AU",
                    CL_INDICATOR_DOT = "TXG_FOB_USD",
                    CL_COUNTERPART_AREA_DOT = c("BR", "US"))
df_DOTS <- CompactDataMethod(databaseID,
                             queryfilter,
                             startdate,
                             enddate,
                             checkquery,
                             tidy = TRUE)



df_DOTS
  @TIME_PERIOD @OBS_VALUE @FREQ @REF_AREA  @INDICATOR @COUNTERPART_AREA @UNIT_MULT @TIME_FORMAT
1      2019-07 876.025841     M        AU TXG_FOB_USD                US          6          P1M
2      2019-08  872.02118     M        AU TXG_FOB_USD                US          6          P1M
3      2019-09 787.272851     M        AU TXG_FOB_USD                US          6          P1M
4      2019-07  55.687747     M        AU TXG_FOB_USD                BR          6          P1M
5      2019-08  36.076581     M        AU TXG_FOB_USD                BR          6          P1M
6      2019-09  57.764474     M        AU TXG_FOB_USD                BR          6          P1M

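After that, all you need is a group_by(`@COUNTERPART_AREA`) %>% summarise(mean = mean(as.numeric(`@OBS_VALUE`))). The as.numeric() call is required because @OBS_VALUE is returned as character: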

library(tidyverse)
df_DOTS %>%
  group_by(`@COUNTERPART_AREA`, `@REF_AREA`) %>%
  summarise(mean = mean(as.numeric(`@OBS_VALUE`))) %>%
  spread( `@COUNTERPART_AREA`, mean) 
#output
  `@REF_AREA`    BR    US
  <chr>       <dbl> <dbl>
1 AU           49.8  845.
  

Or, if you insist on a matrix:

df_DOTS %>%
  group_by(`@COUNTERPART_AREA`, `@REF_AREA`) %>%
  summarise(mean = mean(as.numeric(`@OBS_VALUE`))) %>%
  spread( `@COUNTERPART_AREA`, mean) %>%
  column_to_rownames("@REF_AREA") %>%
  as.matrix
#output
         BR       US
AU 49.84293 845.1066
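
As a side note, spread() is superseded in current tidyr releases, and pivot_wider() is its documented replacement. The following is a minimal sketch of the same pipeline with pivot_wider(). It assumes a hand-built stand-in (df_tidy, a hypothetical name) with only the four columns needed, copied from the tidy = TRUE output printed above, so that it runs without querying the IMF API:

```r
library(dplyr)
library(tidyr)
library(tibble)

# Hand-built stand-in for the tidy = TRUE result (values copied from the output above).
# check.names = FALSE keeps the "@" column names untouched.
df_tidy <- data.frame(
  `@TIME_PERIOD`      = rep(c("2019-07", "2019-08", "2019-09"), times = 2),
  `@OBS_VALUE`        = c("876.025841", "872.02118", "787.272851",
                          "55.687747", "36.076581", "57.764474"),
  `@REF_AREA`         = "AU",
  `@COUNTERPART_AREA` = rep(c("US", "BR"), each = 3),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

matrix_DOTS <- df_tidy %>%
  group_by(`@REF_AREA`, `@COUNTERPART_AREA`) %>%
  # @OBS_VALUE is character, so convert before averaging
  summarise(mean = mean(as.numeric(`@OBS_VALUE`)), .groups = "drop") %>%
  # one column per counterpart country, one row per reporting country
  pivot_wider(names_from = `@COUNTERPART_AREA`, values_from = mean) %>%
  column_to_rownames("@REF_AREA") %>%
  as.matrix()

matrix_DOTS
```

Because the grouping includes @REF_AREA, the same pipeline should yield one row per reporting country if the query returns more than one.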

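【Comments】:

  • This is perfect because it keeps "AU" in the first column. Thanks a lot!

【Solution 2】:

Another option, working on the original split list, is something like this: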

tmp <- df_DOTS %>% 
  as_tibble() %>% 
  summarise(across(everything(), ~mean(as.numeric(.x$Obs[[1]]$`@OBS_VALUE`))))
tmp
# # A tibble: 1 x 2
#       BR    US
#     <dbl> <dbl>
#   1  49.8  845.  

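【Comments】:

    【Solution 3】:

    From the input data, we can loop over the list with map, pluck the required element, convert it to numeric, take the mean, and then convert the result to a two-column tibble with enframe: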

    library(purrr)
    library(tidyr)
    library(tibble)  # provides enframe()
    map(df_DOTS, ~ .x %>% 
           pluck("Obs", 1, "@OBS_VALUE") %>%
            as.numeric %>%
            mean) %>%
       enframe %>%
       unnest(c(value))
    # A tibble: 2 x 2
    #   name  value
    #  <chr> <dbl>
    #1 BR     49.8
    #2 US    845. 
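
The question mentions several reporting countries, each holding a list of counterpart countries. A base R sketch of how the per-country mean could be extended to that nested case follows. It is an illustration under stated assumptions, not part of the original answers: make_entry, mean_obs and all_DOTS are hypothetical names, the outer list is assumed to have one element per reporting country, and every reporting country is assumed to have the same counterparts in the same order.

```r
# Hypothetical helper: builds one counterpart entry with the same access path
# as the dput in the question (a one-row data frame with a list column Obs).
make_entry <- function(values) {
  obs <- data.frame(
    `@TIME_PERIOD` = c("2019-07", "2019-08", "2019-09"),
    `@OBS_VALUE`   = values,
    check.names = FALSE,
    stringsAsFactors = FALSE
  )
  structure(list(`@REF_AREA` = "AU", Obs = list(obs)),
            row.names = 1L, class = "data.frame")
}

df_DOTS <- list(
  BR = make_entry(c("55.687747", "36.076581", "57.764474")),
  US = make_entry(c("876.025841", "872.02118", "787.272851"))
)

# Mean of the observation values for a single counterpart entry
mean_obs <- function(entry) mean(as.numeric(entry$Obs[[1]][["@OBS_VALUE"]]))

# Assumed outer structure: one element per reporting country
all_DOTS <- list(AU = df_DOTS)

# Inner vapply: one mean per counterpart; outer sapply: one column per reporter.
# t() turns reporters into rows and counterparts into columns.
matrix_DOTS <- t(sapply(all_DOTS, function(reporter) {
  vapply(reporter, mean_obs, numeric(1))
}))

matrix_DOTS
```

If the counterparts differ between reporting countries, sapply() would return a list instead of a matrix, so the tidy-data route in Solution 1 is likely the safer choice in that situation.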
    

    【Comments】:
