How can I melt columns while keeping two together? (Answers)

【Question title】: How can I melt columns while keeping two together?
【Posted】: 2015-01-23 07:49:23
【Question description】:

I have data in this wide format that I want to convert to long format:

    Cond    Construct   Line    Plant   Tube_shoot  weight_shoot    Tube_root   weight_root
1   Standard            NA      NA      2           199.95          -           -
2   Cd0     IIF         43.1    1       3           51.87           4           10.39
3   Cd0     IIF         43.1    2       5           81.80           6           15.05
4   Cd0     IIF         43.1    3       7           101.56          8           16.70

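What I basically want is to keep Tube_shoot and weight_shoot together (and likewise Tube_root and weight_root), i.e. treat each Tube/weight pair as a single unit when melting. But because I can only use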

id.vars=c("Cond","Construct","Line","Plant")

the result is not what I want.

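So far I have come up with these (ugly) workarounds:

I melted twice, first with measure.vars=c("Tube_shoot", "Tube_root") and then with the weights, and afterwards deleted the half of the resulting rows that are completely wrong. This is not workable for me because my datasets have different lengths, and I would always have to check that I removed the correct rows.
I pasted "Tube" and "weight" together into a new column, dropped the originals, melted, and then split them apart again.
Copying the values one by one in Excel. But with hundreds of rows I would rather learn how to do this in R.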

I am sure there is a better way.

What I want in the end:

    Cond    Construct   Line    Plant   Tube        weight
1   Standard            NA      NA      2           199.95
2   Cd0     IIF         43.1    1       3           51.87
3   Cd0     IIF         43.1    2       5           81.80
4   Cd0     IIF         43.1    3       7           101.56
2   Cd0     IIF         43.1    1       4           10.39
3   Cd0     IIF         43.1    2       6           15.05
4   Cd0     IIF         43.1    3       8           16.70

【Question discussion】:

Tags: r melt

【Solution 1】:

You could try

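 # Split the four measure columns on "_" into value names (Tube, weight)
 # and a time variable (shoot, root), so each Tube/weight pair stays together.
 res <- reshape(df1, idvar=c('Cond', 'Construct', 'Line', 'Plant'),
               varying=5:8, direction='long', sep="_")

 # Drop the "-" placeholder rows and the generated time column (column 5).
 res1 <-  res[res$weight!='-', -5]
 row.names(res1) <- NULL

 res1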
 #      Cond Construct Line Plant Tube       weight
 #1 Standard             NA    NA    2       199.95
 #2      Cd0       IIF 43.1     1    3        51.87
 #3      Cd0       IIF 43.1     2    5         81.8
 #4      Cd0       IIF 43.1     3    7       101.56
 #5      Cd0       IIF 43.1     1    4        10.39
 #6      Cd0       IIF 43.1     2    6        15.05
 #7      Cd0       IIF 43.1     3    8        16.70

Data

 df1 <- structure(list(Cond = c("Standard", "Cd0", "Cd0", "Cd0"), 
  Construct = c("", "IIF", "IIF", "IIF"), Line = c(NA, 43.1, 43.1, 43.1),
  Plant = c(NA, 1L, 2L, 3L), Tube_shoot = c(2L, 3L, 5L, 7L), weight_shoot = 
  c(199.95,51.87, 81.8, 101.56), Tube_root = c("-", "4", "6", "8"), 
  weight_root = c("-", "10.39", "15.05", "16.70")), .Names = c("Cond",
  "Construct", "Line", "Plant", "Tube_shoot", "weight_shoot", "Tube_root",
  "weight_root"), class = "data.frame", row.names = c("1", "2", "3", "4"))
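Because Tube_root and weight_root hold the "-" placeholder, the stacked Tube and weight columns come out as character. A minimal sketch of one way to get numeric columns afterwards, assuming the same df1 as above (rebuilt here so the example is self-contained); the name res_num is chosen only for this illustration, and the time column is kept so each row still says which plant part it describes:

```r
# Same sample data as df1 above.
df1 <- data.frame(
  Cond = c("Standard", "Cd0", "Cd0", "Cd0"),
  Construct = c("", "IIF", "IIF", "IIF"),
  Line = c(NA, 43.1, 43.1, 43.1),
  Plant = c(NA, 1L, 2L, 3L),
  Tube_shoot = c(2L, 3L, 5L, 7L),
  weight_shoot = c(199.95, 51.87, 81.8, 101.56),
  Tube_root = c("-", "4", "6", "8"),
  weight_root = c("-", "10.39", "15.05", "16.70"),
  stringsAsFactors = FALSE
)

res <- reshape(df1, idvar = c("Cond", "Construct", "Line", "Plant"),
               varying = 5:8, direction = "long", sep = "_")

# Drop the "-" placeholder rows first, then convert the character columns.
res_num <- res[res$weight != "-", ]
res_num$Tube <- as.integer(res_num$Tube)
res_num$weight <- as.numeric(res_num$weight)
row.names(res_num) <- NULL

str(res_num[, c("time", "Tube", "weight")])
```

Filtering before converting avoids the "NAs introduced by coercion" warning that as.numeric() would raise on "-".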

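【Discussion】:

Thanks, this is absolutely perfect. I had not tried reshape because I was following this link: cookbook-r.com/Manipulating_data/…
@Raphael Iselin Glad to know it works. reshape is a base R function that can be a bit tricky to use at times, but for multiple columns it works better than dcast. For example, check this link: stackoverflow.com/questions/27118314/…
@akrun, dcast.data.table will work with this kind of data starting from version 1.9.8.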

【Solution 2】:

Another option, using dplyr and tidyr:

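library(dplyr)
library(tidyr)

# Stack the two Tube columns; x records which source column each value came from.
gather(df1, x, Tube, c(Tube_shoot, Tube_root)) %>%
   # Pick the weight that belongs to the same plant part as the Tube value.
   mutate(weight = ifelse(grepl("root$", x), weight_root, weight_shoot)) %>%
   select(-c(weight_shoot, weight_root, x))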

#      Cond Construct Line Plant Tube weight
#1 Standard             NA    NA    2 199.95
#2      Cd0       IIF 43.1     1    3  51.87
#3      Cd0       IIF 43.1     2    5   81.8
#4      Cd0       IIF 43.1     3    7 101.56
#5 Standard             NA    NA    -      -
#6      Cd0       IIF 43.1     1    4  10.39
#7      Cd0       IIF 43.1     2    6  15.05
#8      Cd0       IIF 43.1     3    8  16.70
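The same pairing can also be expressed in a single reshaping step. The sketch below is not part of the original answer: it assumes tidyr 1.0 or later (for pivot_longer and its ".value" sentinel) and dplyr 1.0 or later (for across), and the column name part is made up for this example. It also drops the "-" row and converts the result to numeric:

```r
library(dplyr)
library(tidyr)

# Same sample data as df1 above.
df1 <- data.frame(
  Cond = c("Standard", "Cd0", "Cd0", "Cd0"),
  Construct = c("", "IIF", "IIF", "IIF"),
  Line = c(NA, 43.1, 43.1, 43.1),
  Plant = c(NA, 1L, 2L, 3L),
  Tube_shoot = c(2L, 3L, 5L, 7L),
  weight_shoot = c(199.95, 51.87, 81.8, 101.56),
  Tube_root = c("-", "4", "6", "8"),
  weight_root = c("-", "10.39", "15.05", "16.70"),
  stringsAsFactors = FALSE
)

long <- df1 %>%
  # The shoot columns are numeric and the root columns character,
  # so bring all four to one type before stacking them.
  mutate(across(Tube_shoot:weight_root, as.character)) %>%
  pivot_longer(
    cols = Tube_shoot:weight_root,
    names_to = c(".value", "part"),  # text before "_" names the output column
    names_sep = "_"
  ) %>%
  filter(weight != "-") %>%
  mutate(Tube = as.integer(Tube), weight = as.numeric(weight))

long
```

Unlike the gather() version, the rows come out grouped by original row (shoot, then root) rather than all shoots followed by all roots.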

【Discussion】:

【Solution 3】:

You may want to consider merged.stack from my "splitstackshape" package, with which you can do the following:

library(splitstackshape)
merged.stack(as.data.table(df1, keep.rownames = TRUE), 
             var.stubs = c("Tube", "weight"), sep = "_")
#    rn     Cond Construct Line Plant .time_1 Tube weight
# 1:  1 Standard             NA    NA    root    -      -
# 2:  1 Standard             NA    NA   shoot    2 199.95
# 3:  2      Cd0       IIF 43.1     1    root    4  10.39
# 4:  2      Cd0       IIF 43.1     1   shoot    3  51.87
# 5:  3      Cd0       IIF 43.1     2    root    6  15.05
# 6:  3      Cd0       IIF 43.1     2   shoot    5   81.8
# 7:  4      Cd0       IIF 43.1     3    root    8  16.70
# 8:  4      Cd0       IIF 43.1     3   shoot    7 101.56

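Of course, you can also append [Tube != "-" | weight != "-"] at the end to drop the rows in which "Tube" and "weight" are both "-"... but note that doing so will not magically convert those columns to numeric :-)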
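One way around the numeric conversion problem is to clean the placeholders before stacking. The sketch below does not use splitstackshape: it assumes data.table 1.9.6 or later, where melt() accepts patterns() for several measure groups, and the names cols, long, and part are chosen only for this illustration:

```r
library(data.table)

# Same sample data as df1 above.
df1 <- data.frame(
  Cond = c("Standard", "Cd0", "Cd0", "Cd0"),
  Construct = c("", "IIF", "IIF", "IIF"),
  Line = c(NA, 43.1, 43.1, 43.1),
  Plant = c(NA, 1L, 2L, 3L),
  Tube_shoot = c(2L, 3L, 5L, 7L),
  weight_shoot = c(199.95, 51.87, 81.8, 101.56),
  Tube_root = c("-", "4", "6", "8"),
  weight_root = c("-", "10.39", "15.05", "16.70"),
  stringsAsFactors = FALSE
)
dt <- as.data.table(df1)

# Replace the "-" placeholders with NA and make all measure columns numeric.
cols <- c("Tube_shoot", "Tube_root", "weight_shoot", "weight_root")
dt[, (cols) := lapply(.SD, function(x) {
  as.numeric(replace(as.character(x), x == "-", NA))
}), .SDcols = cols]

# Melt both groups at once so each Tube stays next to its weight.
long <- melt(dt,
             id.vars = c("Cond", "Construct", "Line", "Plant"),
             measure.vars = patterns("^Tube_", "^weight_"),
             value.name = c("Tube", "weight"),
             variable.name = "part")

# part is a factor with levels "1" and "2", following the column order
# (shoot first, then root); relabel it and drop the empty placeholder row.
long[, part := c("shoot", "root")[as.integer(part)]]
long <- long[!is.na(weight)]
long
```

Converting first means both columns in each pair share a type, so melt() has nothing to coerce.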

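【Discussion】:

I wonder whether it would be easier if merged.stack converted data frames to data.table by default, or can it handle data frames as well?
@DavidArenburg, it automatically converts to data.tables. I used as.data.table because I wanted to keep the row names for ordering, in case they are needed later.
Oh, OK, nice solution.
This solution did not work on my original data, and not even with the df1 provided by akrun. I had to add id.vars=c('Cond', 'Construct', 'Line', 'Plant') to make it work.
@riselin, which version of "splitstackshape" are you using? In 1.4 and later, it tries to guess the "id.vars".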