【Question title】: Melting multiple rows
【Posted】: 2013-08-03 18:25:00
【Question】:

I have a wide-format table in which the first 3 rows describe the data shown in the table. For example:

Company:               |  Company A  |  Company B  |  Company C  |       |  Company N
Data source:           |  Budget     |  Actual     |  Budget     |  ...  |    ...
Currency:              |  USD        |  EUR        |  USD        |       |    ...
Indicator:
 Sales                    500            1000         1500        ...       ...
 Gross Income             200            300           400        ...       ...
 ...                      ...            ...           ...        ...       ...
 Indicator J              ...            ...           ...        ...

I would like to reshape it into long format with the following layout:

Indicator | Company   | Currency | Data Source | Value
 Sales    | Company A |   USD    | Budget      | 500
 Sales    | Company B |   EUR    | Actual      | 1000
 ...      |    ...    |    ...   |    ...      |  ...

I tried to melt it with the reshape2 package, but I could not turn rows 2 and 3 into variables.

dput(AAA)
structure(list(V1 = structure(c(1L, 8L, 2L, 5L, 7L, 4L, 3L, 6L
), .Label = c("Company:", "Currency:", "EBITDA", "Gross Income", 
"Indicator:", "Net Income", "Sales", "Source:"), class = "factor"), 
    V2 = structure(c(7L, 6L, 8L, 1L, 2L, 5L, 3L, 4L), .Label = c("", 
    "1000", "150", "25", "300", "Budget", "Company A", "USD"), class = "factor"), 
    V3 = structure(c(7L, 6L, 8L, 1L, 2L, 5L, 3L, 4L), .Label = c("", 
    "1500", "175", "30", "400", "Actual", "Company B", "USD"), class = "factor"), 
    V4 = structure(c(7L, 6L, 8L, 1L, 3L, 5L, 2L, 4L), .Label = c("", 
    "185", "2000", "45", "500", "Budget", "Company C", "EUR"), class = "factor"), 
    V5 = structure(c(7L, 6L, 8L, 1L, 3L, 5L, 2L, 4L), .Label = c("", 
    "195", "2500", "50", "700", "Actual", "Company D", "EUR"), class = "factor")), .Names = c("V1", 
"V2", "V3", "V4", "V5"), class = "data.frame", row.names = c(NA, 
-8L))

【Comments】:

  • Please dput the data so that it can be reproduced.

Tags: r reshape reshape2


【Solution 1】:

Here is a solution that involves transposing the data and doing some cleaning. The rest is done by 'melt':

AAA <- structure(list(
    V1 = structure(c(1L, 8L, 2L, 5L, 7L, 4L, 3L, 6L), .Label = c("Company:",
        "Currency:", "EBITDA", "Gross Income", "Indicator:", "Net Income",
        "Sales", "Source:"), class = "factor"),
    V2 = structure(c(7L, 6L, 8L, 1L, 2L, 5L, 3L, 4L), .Label = c("",
        "1000", "150", "25", "300", "Budget", "Company A", "USD"), class = "factor"),
    V3 = structure(c(7L, 6L, 8L, 1L, 2L, 5L, 3L, 4L), .Label = c("",
        "1500", "175", "30", "400", "Actual", "Company B", "USD"), class = "factor"),
    V4 = structure(c(7L, 6L, 8L, 1L, 3L, 5L, 2L, 4L), .Label = c("",
        "185", "2000", "45", "500", "Budget", "Company C", "EUR"), class = "factor"),
    V5 = structure(c(7L, 6L, 8L, 1L, 3L, 5L, 2L, 4L), .Label = c("",
        "195", "2500", "50", "700", "Actual", "Company D", "EUR"), class = "factor")),
    .Names = c("V1", "V2", "V3", "V4", "V5"), class = "data.frame",
    row.names = c(NA, -8L))
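# transpose the data; t() returns a character matrix, so every column ends up as character
dft <- data.frame(t(AAA), stringsAsFactors=FALSE)

require(reshape2)
# use the first row (the former V1 labels) as column names, then drop that row
# unlist() turns the one-row data frame into a plain character vector
colnames(dft) <- unlist(dft[1, ])
dft <- dft[-1, ]

# remove the empty "Indicator:" column
dft[ , 4] <- NULL

# melt the indicator columns into long format
melt(dft, id.vars=c('Company:', 'Source:', 'Currency:'), variable.name='Indicator:')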

#     Company: Source: Currency:   Indicator: value
# 1  Company A  Budget       USD        Sales  1000
# 2  Company B  Actual       USD        Sales  1500
# 3  Company C  Budget       EUR        Sales  2000
# 4  Company D  Actual       EUR        Sales  2500
# ... (rows for the remaining indicators follow)

You may need some more cleaning (every column is now character, and you could perhaps also set the colnames before transposing...).
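
The extra cleaning mentioned above could look roughly like the following. This is only a sketch, not part of the original answer: it rebuilds the same values as `AAA` as plain character columns for brevity, and the helper names `keep`, `m`, `id_cols` and `ind_cols` are made up for illustration. It uses base R only, sets the column names before building the data frame, and converts the indicator columns to numeric.

```r
# Same values as AAA in the question, rebuilt as character columns (assumption: for brevity only)
AAA <- data.frame(
  V1 = c("Company:", "Source:", "Currency:", "Indicator:",
         "Sales", "Gross Income", "EBITDA", "Net Income"),
  V2 = c("Company A", "Budget", "USD", "", "1000", "300", "150", "25"),
  V3 = c("Company B", "Actual", "USD", "", "1500", "400", "175", "30"),
  V4 = c("Company C", "Budget", "EUR", "", "2000", "500", "185", "45"),
  V5 = c("Company D", "Actual", "EUR", "", "2500", "700", "195", "50"),
  stringsAsFactors = FALSE
)

# Drop the empty "Indicator:" separator row before transposing
keep <- AAA$V1 != "Indicator:"

# Transpose only the data columns and use the V1 labels as column names
m <- t(as.matrix(AAA[keep, -1]))
colnames(m) <- AAA$V1[keep]

# check.names = FALSE keeps names such as "Company:" and "Gross Income" unchanged
dft <- data.frame(m, stringsAsFactors = FALSE, check.names = FALSE)

# Convert the indicator columns from character to numeric
id_cols  <- c("Company:", "Source:", "Currency:")
ind_cols <- setdiff(names(dft), id_cols)
dft[ind_cols] <- lapply(dft[ind_cols], as.numeric)

str(dft)
```

Because the column names are kept as they are, a `dft` built this way should work with the same `melt` call as above, and the `value` column should then come out numeric instead of character.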

【Comments】:

  • +1. Something like the following might also be useful: data.frame(lapply(x1, function(x) type.convert(as.character(x)))) (where "x1" is the result of your melt command).
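
The suggestion in this comment can be tried on a small stand-in for the melted result. This is a minimal sketch: `x1` here is a hand-made two-row data frame, not the real melt output, and `x2` is a hypothetical name for the converted copy. `as.is = TRUE` is passed explicitly because recent R versions warn when it is omitted, and it keeps text columns as character instead of turning them into factors.

```r
# Hand-made stand-in for the melted result (assumption: every column is character)
x1 <- data.frame(
  Company   = c("Company A", "Company B"),
  Indicator = c("Sales", "Sales"),
  value     = c("1000", "1500"),
  stringsAsFactors = FALSE
)

# Let type.convert() pick a suitable type for each column
x2 <- data.frame(
  lapply(x1, function(x) type.convert(as.character(x), as.is = TRUE)),
  stringsAsFactors = FALSE
)

str(x2)
```

In this sketch `value` becomes an integer column, while `Company` and `Indicator` stay character.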