[Question title]: R data.table reshape data
[Posted]: 2017-11-04 18:33:04
[Question]:

I am reshaping data with data.table.

library(data.table)
market <- data.table(
  stkcd=c(1,2),
  type =c(1,0),
  roa2013=c(2,3),
  roa2014=c(4,5),
  lev2013=c(6,7),
  lev2016=c(8,9))
market
#     stkcd type roa2013 roa2014 lev2013 lev2016
# 1:     1    1       2       4       6       8
# 2:     2    0       3       5       7       9
melt(market,
     measure.vars = patterns("^roa", "^lev"),
     variable.name = "year", 
     value.name = c("roa","lev"))
#     stkcd type year roa lev
# 1:     1    1    1   2   6
# 2:     2    0    1   3   7
# 3:     1    1    2   4   8
# 4:     2    0    2   5   9

This is what the final data should look like:

#     stkcd type year roa lev
# 1     1    1 2013   2   6
# 2     1    1 2014   4  NA
# 3     1    1 2016  NA   8
# 4     2    0 2013   3   7
# 5     2    0 2014   5  NA
# 6     2    0 2016  NA   9

Does anyone know a good way to do this? Thanks.

[Comments]:

Tags: r data.table


[Solution 1]:

1. Using reshape {stats}:

library(data.table)
market <- data.table(
  stkcd=c(1,2),
  type =c(1,0),
  roa2013=c(2,3),
  roa2014=c(4,5),
  lev2013=c(6,7),
  lev2016=c(8,9))

# Pad the missing stub/year combinations so that roa and lev cover the same years
market[,`:=`(roa2016=NA_real_,lev2014=NA_real_)]
long <- reshape(market, 
        idvar = "stkcd", 
        varying = c("roa2013","lev2013",
                    "roa2014","lev2014",
                    "roa2016","lev2016"),
        sep = "",
        timevar = "year",
        direction = "long")
setorder(long,stkcd,year)
long
#     stkcd type year roa lev
# 1:     1    1 2013   2   6
# 2:     1    1 2014   4  NA
# 3:     1    1 2016  NA   8
# 4:     2    0 2013   3   7
# 5:     2    0 2014   5  NA
# 6:     2    0 2016  NA   9
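
The padding step above hardcodes roa2016 and lev2014. As a rough sketch of the same idea, the missing columns can be derived from the column names and the padded table passed to melt() with one explicit column vector per stub. The names `stubs`, `years`, `padded`, `wanted` and `absent` are illustrative only, and the sketch assumes every measure column is a lowercase stub followed by a four-digit year.

```r
library(data.table)

market <- data.table(
  stkcd = c(1, 2), type = c(1, 0),
  roa2013 = c(2, 3), roa2014 = c(4, 5),
  lev2013 = c(6, 7), lev2016 = c(8, 9))

stubs <- c("roa", "lev")

# Collect every year that appears after any stub
measure_cols <- grep("^(roa|lev)[0-9]{4}$", names(market), value = TRUE)
years <- sort(unique(sub("^[a-z]+", "", measure_cols)))

# Add an all-NA numeric column for each stub/year combination that is absent
padded <- copy(market)
wanted <- as.vector(outer(stubs, years, paste0))
absent <- setdiff(wanted, names(padded))
if (length(absent) > 0) padded[, (absent) := NA_real_]

# Melt with one explicit, year-ordered column vector per stub
long <- melt(padded,
             id.vars = c("stkcd", "type"),
             measure.vars = lapply(stubs, paste0, years),
             variable.name = "year",
             value.name = stubs)

# melt() numbers the column groups 1, 2, 3; map those indices back to years
long[, year := as.integer(years)[as.integer(as.character(year))]]
setorder(long, stkcd, year)
long
```

Passing the columns as a list, rather than through patterns(), keeps roa and lev aligned by year even though the padded columns are appended at the end of the table.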

2. Using str_extract {stringr}:

library(data.table)
library(stringr)
market <- data.table(
  stkcd=c(1,2),
  type =c(1,0),
  roa2013=c(2,3),
  roa2014=c(4,5),
  lev2013=c(6,7),
  lev2016=c(8,9))
market
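# Melt every measure column into variable/value pairs
long <- melt(market,
             id.vars = c("stkcd","type"))
# Split each name such as "roa2013" into its year and its stub
long[,`:=`(year=str_extract(variable,pattern = "[0-9]{4}"),
           vars=str_extract(variable,pattern = "[a-zA-Z]{1,}"))][,variable:=NULL]
# Spread the stubs back out into one column each; absent combinations become NA
long <- dcast(long, stkcd + type + year ~ vars, value.var = "value")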
long
#     stkcd type year lev roa
# 1:     1    1 2013   6   2
# 2:     1    1 2014  NA   4
# 3:     1    1 2016   8  NA
# 4:     2    0 2013   7   3
# 5:     2    0 2014  NA   5
# 6:     2    0 2016   9  NA
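
If an extra package is not wanted, the same melt, split and dcast steps can be sketched with base sub() in place of stringr. This is only an illustration under the assumption that each measure name is letters followed by digits. The column name `measure` and the object `wide` are made up for this example.

```r
library(data.table)

market <- data.table(
  stkcd = c(1, 2), type = c(1, 0),
  roa2013 = c(2, 3), roa2014 = c(4, 5),
  lev2013 = c(6, 7), lev2016 = c(8, 9))

# Melt all measure columns, keeping the old column names as character
long <- melt(market, id.vars = c("stkcd", "type"), variable.factor = FALSE)

# Strip the leading letters to get the year, and the trailing digits to get the stub
long[, year := as.integer(sub("^[A-Za-z]+", "", variable))]
long[, measure := sub("[0-9]+$", "", variable)]

# One row per stkcd/type/year, one column per stub; absent combinations become NA
wide <- dcast(long, stkcd + type + year ~ measure, value.var = "value")
setcolorder(wide, c("stkcd", "type", "year", "roa", "lev"))
wide
```

Here year ends up as an integer column rather than a character one, which may be more convenient for later sorting or joins.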

...

[Comments]:

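    [Solution 2]:

    We can do this easily with splitstackshape. Insert a delimiter between the non-numeric and numeric parts of the column names of interest, then use merged.stack to reshape to "long" format and rename the ".time_1" column to "year".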

    library(splitstackshape)
    names(market) <- sub("(\\d+)", "_\\1", names(market))
    res <- merged.stack(market, var.stubs = c("roa", "lev"), sep="_")
    setnames(res, ".time_1", "year")
    res
    #   stkcd type year roa lev
    #1:     1    1 2013   2   6
    #2:     1    1 2014   4  NA
    #3:     1    1 2016  NA   8
    #4:     2    0 2013   3   7
    #5:     2    0 2014   5  NA
    #6:     2    0 2016  NA   9
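
To see what the renaming step does on its own, here is a small base R sketch applied to a plain character vector of the column names from the question. The names `old_names` and `new_names` are illustrative only.

```r
# Column names from the question
old_names <- c("stkcd", "type", "roa2013", "roa2014", "lev2013", "lev2016")

# sub() replaces only the first run of digits, prefixing it with "_";
# names without digits (stkcd, type) are returned unchanged
new_names <- sub("(\\d+)", "_\\1", old_names)
print(new_names)
```

The resulting "_" is what sep = "_" in merged.stack splits on. On a data.table, setnames(market, new_names) should apply the same renaming by reference instead of going through names<-.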
    

    [Comments]:

    • Thanks. This is a good approach.