Pivot a data frame and exclude blank cells in R: answers

【Question title】: Pivot a data frame and exclude blank cells in R
【Posted】: 2018-12-13 21:32:14
【Question】:

Given a data frame dat of the following form:

    property_id                      tenant count
1              1     Burlington Coat Factory     1
2              1                      Macy's     2
3              1                       Sears     3
4              1                AMC Theatres     4
5              1                 Macy's Home     5
6              2     Burlington Coat Factory     1
7              2                    JCPenney     2
8              2                  Value City     3
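
For reproducibility, the frame printed above can be rebuilt roughly as follows. This is a minimal sketch: the column types are an assumption (numeric ids and counts, tenant stored as character rather than factor), since the question only shows the printed output.

```r
# Rebuild the example data shown above.
# Assumption: tenant is character, so stringsAsFactors = FALSE is set explicitly.
dat <- data.frame(
  property_id = c(1, 1, 1, 1, 1, 2, 2, 2),
  tenant = c("Burlington Coat Factory", "Macy's", "Sears", "AMC Theatres",
             "Macy's Home", "Burlington Coat Factory", "JCPenney", "Value City"),
  count = c(1, 2, 3, 4, 5, 1, 2, 3),
  stringsAsFactors = FALSE
)

# Inspect the structure: 8 rows, 3 columns
str(dat)
```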

How can we get the following result?

property_id                       X1        X2          X3            X4           X5
1            Burlington Coat Factory    Macy's       Sears  AMC Theatres  Macy's Home
2            Burlington Coat Factory  JCPenney  Value City          <NA>         <NA>

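melt/reshape seems to produce a mostly sparse matrix.

I have managed it, very laboriously, with the following approach, but it is ugly and I am looking for a better way: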

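# Preallocate a wide frame: 1167 properties, up to 20 tenants each
df <- data.frame(matrix(NA, 1167, 20))
df['id'] <- unique(dat$property_id)
for (i in seq_len(nrow(df))) {
  # Tenants of the current property, written left to right into row i
  tenants <- subset(dat, dat$property_id == df[i, 'id'])$tenant
  df[i, 1:length(tenants)] <- t(tenants)
}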

【Comments】:

Tags: r dataframe reshape

【Solution 1】:

spread seems to be exactly what you need:

library(tidyverse)
spread(dat, count, tenant)
# A tibble: 2 x 6
#   property_id `1`                     `2`      `3`        `4`          `5`        
#         <dbl> <chr>                   <chr>    <chr>      <chr>        <chr>      
# 1           1 Burlington Coat Factory Macy's   Sears      AMC Theatres Macy's Home
# 2           2 Burlington Coat Factory JCPenney Value City NA           NA
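
In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(). The following is a sketch of the equivalent call, not part of the original answer; the sample data is rebuilt inline with assumed column types, and names_prefix = "X" is added only to produce the X1 to X5 headers requested in the question.

```r
library(tidyr)  # pivot_wider() requires tidyr >= 1.0.0

# Example data (column types are an assumption)
dat <- data.frame(
  property_id = c(1, 1, 1, 1, 1, 2, 2, 2),
  tenant = c("Burlington Coat Factory", "Macy's", "Sears", "AMC Theatres",
             "Macy's Home", "Burlington Coat Factory", "JCPenney", "Value City"),
  count = c(1, 2, 3, 4, 5, 1, 2, 3),
  stringsAsFactors = FALSE
)

# names_from supplies the new column names, values_from fills the cells.
# Combinations missing from the input (property 2, counts 4 and 5) become NA.
wide <- pivot_wider(dat, names_from = count, values_from = tenant,
                    names_prefix = "X")
print(wide)
```

Because each property gets exactly one row, the result is dense apart from the trailing NA cells for properties with fewer tenants.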

Another option:

library(reshape2)
dcast(dat, property_id ~ count, value.var = "tenant")
#   property_id                       1        2          3            4           5
# 1           1 Burlington Coat Factory   Macy's      Sears AMC Theatres Macy's Home
# 2           2 Burlington Coat Factory JCPenney Value City         <NA>        <NA>

Finally:

reshape(dat, v.names = "tenant", idvar = "property_id", timevar = "count", direction = "wide")
#   property_id                tenant.1 tenant.2   tenant.3     tenant.4    tenant.5
# 1           1 Burlington Coat Factory   Macy's      Sears AMC Theatres Macy's Home
# 6           2 Burlington Coat Factory JCPenney Value City         <NA>        <NA>
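
All three approaches rely on count numbering the tenants within each property. If that column were missing, it could be derived first. The sketch below uses base R only and is an illustration rather than part of the original answer: it assumes the rows are already in the desired tenant order, and the final renaming to X1 through X5 is optional.

```r
# Example data without the count column (column types are an assumption)
dat <- data.frame(
  property_id = c(1, 1, 1, 1, 1, 2, 2, 2),
  tenant = c("Burlington Coat Factory", "Macy's", "Sears", "AMC Theatres",
             "Macy's Home", "Burlington Coat Factory", "JCPenney", "Value City"),
  stringsAsFactors = FALSE
)

# ave() applies seq_along() per property_id group, so numbering restarts at 1
dat$count <- ave(seq_along(dat$property_id), dat$property_id, FUN = seq_along)

# Same reshape() call as above
wide <- reshape(dat, v.names = "tenant", idvar = "property_id",
                timevar = "count", direction = "wide")

# Rename tenant.1 ... tenant.5 to X1 ... X5
names(wide) <- sub("^tenant\\.", "X", names(wide))
print(wide)
```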

【Discussion】: