【Question title】: Tricky reshaping of a data frame
【Posted】: 2016-07-08 14:10:05
【Question】:

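I have a data frame containing inventory data for the past 12 months. Below I have created a mock data frame covering a three-month period that is similar to my data set.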

inventory <- data.frame(ID=c(1,1,1,1,2,2,3,3,3,3,4,4,4),
                        SKU=c("375F","375F","375F","375F","QX51","QX51","AEC","AEC","AEC","AEC","115332H","115332H","115332H"), 
                        inventory=c(3,4,14,5,18,5,4,13,4,10,3,2,2), 
                        sold=c(3,2,0,1,4,0,0,3,1,5,0,2,1), 
                        returned=c(1,0,2,0,0,0,1,0,1,1,0,2,0), 
                        month=c(0,1,2,3,0,2,3,0,1,2,3,2,3))

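I am trying to manipulate the data frame to produce a report that shows each variable alongside its ID and SKU, with one column per month, as in the image below.

I have tried to reshape the data frame with the dplyr and data.table libraries, but without any success. How can I transform the data so that there is one column for each month, as in the image I posted? I am still very new to R, so please go easy on me. Thanks.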

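【Comments】:

  • You should include your image in the question itself. Inserting a link to an image is a bad idea, because such links eventually go dead and future users will no longer benefit from your question. You can edit your post to fix this.
  • Thanks for the tip, Andre. Updated.
  • Even better, don't use images at all. There are better ways to show the expected result.
  • First post, lesson learned. Will do that next time. Thanks @Sotos
  • No problem. For example, you can use dput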
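To illustrate the dput suggestion in the last comment, here is a minimal sketch using the mock data from the question. `dput()` prints R code that recreates an object, so it can be pasted into a post as text instead of an image. The variable names `txt` and `rebuilt` are only illustrative.

```r
# Mock data from the question
inventory <- data.frame(ID = c(1,1,1,1,2,2,3,3,3,3,4,4,4),
                        SKU = c("375F","375F","375F","375F","QX51","QX51",
                                "AEC","AEC","AEC","AEC",
                                "115332H","115332H","115332H"),
                        inventory = c(3,4,14,5,18,5,4,13,4,10,3,2,2),
                        sold = c(3,2,0,1,4,0,0,3,1,5,0,2,1),
                        returned = c(1,0,2,0,0,0,1,0,1,1,0,2,0),
                        month = c(0,1,2,3,0,2,3,0,1,2,3,2,3))

# Print a text representation that can be pasted into a question
dput(inventory)

# Round trip: evaluating the dput() text rebuilds the same data frame
txt <- paste(capture.output(dput(inventory)), collapse = "\n")
rebuilt <- eval(parse(text = txt))
all.equal(rebuilt, inventory)
```

For a large data set, `dput(head(inventory, 10))` keeps the pasted text short.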

Tags: r dataframe dplyr reshape


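【Solution 1】:

There is a duplicate for ID = 4 and SKU = 115332H (month 3 appears twice), so I had to change a value to remove the duplicate: the last three month values are 1, 2, 3 here instead of 3, 2, 3.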
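As a minimal sketch of how the duplicate mentioned above could be located in the question's original data, the following base R snippet flags every row whose ID, SKU and month combination occurs more than once. The names `key` and `dup_rows` are illustrative, not part of the original answer.

```r
# Original mock data from the question (month 3 occurs twice for ID 4)
inventory <- data.frame(ID = c(1,1,1,1,2,2,3,3,3,3,4,4,4),
                        SKU = c("375F","375F","375F","375F","QX51","QX51",
                                "AEC","AEC","AEC","AEC",
                                "115332H","115332H","115332H"),
                        inventory = c(3,4,14,5,18,5,4,13,4,10,3,2,2),
                        sold = c(3,2,0,1,4,0,0,3,1,5,0,2,1),
                        returned = c(1,0,2,0,0,0,1,0,1,1,0,2,0),
                        month = c(0,1,2,3,0,2,3,0,1,2,3,2,3))

# Columns that should identify a row uniquely in the wide report
key <- inventory[, c("ID", "SKU", "month")]

# Flag both the first and the later occurrence of each repeated key
dup_rows <- duplicated(key) | duplicated(key, fromLast = TRUE)

# Show the offending rows
inventory[dup_rows, ]
```

Running a check like this before reshaping makes it clear which rows need to be corrected or aggregated.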

# Creating the data frame
inventory <- data.frame(ID=c(1,1,1,1,2,2,3,3,3,3,4,4,4), 
                        SKU=c("375F","375F","375F","375F","QX51","QX51","AEC","AEC","AEC","AEC","115332H","115332H","115332H"), 
                        inventory=c(3,4,14,5,18,5,4,13,4,10,3,2,2), sold=c(3,2,0,1,4,0,0,3,1,5,0,2,1), 
                        returned=c(1,0,2,0,0,0,1,0,1,1,0,2,0), 
                        month=c(0,1,2,3,0,2,3,0,1,2,1,2,3))

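# Reshaping the data
# melt() is assumed to come from the reshape2 package
library(reshape2)
# Melting the data frame to long format (one row per ID, SKU, month and variable)
inv2 <- melt(inventory, id.vars = c("ID", "SKU", "month"))
# Reshaping to wide format (one value column per month)
inv2_wide <- reshape(inv2, v.names = "value", idvar = c("ID", "SKU", "variable"),
                     timevar = "month", direction = "wide")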

# Ordering by ID variables
inv2_wide <- inv2_wide[order(inv2_wide$ID,inv2_wide$SKU),]

# Renaming the variables: value.0, value.1, ... become Month0, Month1, ...
names(inv2_wide) <- gsub("value\\.", "Month", names(inv2_wide))


   ID     SKU  variable Month0 Month1 Month2 Month3
1   1    375F inventory      3      4     14      5
14  1    375F      sold      3      2      0      1
27  1    375F  returned      1      0      2      0
5   2    QX51 inventory     18     NA      5     NA
18  2    QX51      sold      4     NA      0     NA
31  2    QX51  returned      0     NA      0     NA
7   3     AEC inventory     13      4     10      4
20  3     AEC      sold      3      1      5      0
33  3     AEC  returned      0      1      1      1
11  4 115332H inventory     NA      3      2      2
24  4 115332H      sold     NA      0      2      1
37  4 115332H  returned     NA      0      2      0
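As a possible variation on the approach above, the wide step can also be sketched with `dcast()` from reshape2 instead of base `reshape()`. This assumes reshape2 is installed and uses the de-duplicated month values from this answer. The names `inv_long` and `inv_wide` are illustrative.

```r
library(reshape2)

# De-duplicated data frame from this answer
inventory <- data.frame(ID = c(1,1,1,1,2,2,3,3,3,3,4,4,4),
                        SKU = c("375F","375F","375F","375F","QX51","QX51",
                                "AEC","AEC","AEC","AEC",
                                "115332H","115332H","115332H"),
                        inventory = c(3,4,14,5,18,5,4,13,4,10,3,2,2),
                        sold = c(3,2,0,1,4,0,0,3,1,5,0,2,1),
                        returned = c(1,0,2,0,0,0,1,0,1,1,0,2,0),
                        month = c(0,1,2,3,0,2,3,0,1,2,1,2,3))

# Long format: one row per ID, SKU, month and measured variable
inv_long <- melt(inventory, id.vars = c("ID", "SKU", "month"))

# Wide format: rows keyed by ID, SKU and variable, one column per month
inv_wide <- dcast(inv_long, ID + SKU + variable ~ month, value.var = "value")

# Prefix the month columns ("0", "1", ...) with "Month"
names(inv_wide)[-(1:3)] <- paste0("Month", names(inv_wide)[-(1:3)])

inv_wide
```

Combinations that have no observation for a month are left as NA, which matches the table above.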

【Comments】:

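    【Solution 2】:

    We can use tidyr: gather() to long format, then spread() back to wide. The output below corresponds to the de-duplicated data frame from Solution 1.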

    library(dplyr)
    library(tidyr)
    gather(inventory, Variable, value, inventory:returned) %>%  # reshape to long
      mutate(month = paste0("Month", month)) %>%                # prefix month with "Month"
      spread(month, value)                                      # reshape to wide
    #   ID     SKU  Variable Month0 Month1 Month2 Month3
    #1   1    375F inventory      3      4     14      5
    #2   1    375F  returned      1      0      2      0
    #3   1    375F      sold      3      2      0      1
    #4   2    QX51 inventory     18     NA      5     NA
    #5   2    QX51  returned      0     NA      0     NA
    #6   2    QX51      sold      4     NA      0     NA
    #7   3     AEC inventory     13      4     10      4
    #8   3     AEC  returned      0      1      1      1
    #9   3     AEC      sold      3      1      5      0
    #10  4 115332H inventory     NA      3      2      2
    #11  4 115332H  returned     NA      0      2      0
    #12  4 115332H      sold     NA      0      2      1
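In tidyr 1.0.0 and later, `gather()` and `spread()` are superseded by `pivot_longer()` and `pivot_wider()`. The following is a minimal sketch of the same reshaping with the newer functions, assuming a recent tidyr and the de-duplicated data frame. The names `inv_long` and `inv_wide` are illustrative.

```r
library(tidyr)

# De-duplicated data frame (month values for ID 4 are 1, 2, 3)
inventory <- data.frame(ID = c(1,1,1,1,2,2,3,3,3,3,4,4,4),
                        SKU = c("375F","375F","375F","375F","QX51","QX51",
                                "AEC","AEC","AEC","AEC",
                                "115332H","115332H","115332H"),
                        inventory = c(3,4,14,5,18,5,4,13,4,10,3,2,2),
                        sold = c(3,2,0,1,4,0,0,3,1,5,0,2,1),
                        returned = c(1,0,2,0,0,0,1,0,1,1,0,2,0),
                        month = c(0,1,2,3,0,2,3,0,1,2,1,2,3))

# Long format: stack inventory, sold and returned into one value column
inv_long <- pivot_longer(inventory, inventory:returned,
                         names_to = "Variable", values_to = "value")

# Wide format: one column per month, prefixed with "Month"
inv_wide <- pivot_wider(inv_long, names_from = month, values_from = value,
                        names_prefix = "Month")

inv_wide
```

The row order may differ from the `spread()` output above, because `pivot_wider()` keeps rows in order of first appearance rather than sorting them.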
    

    【Comments】:
