【Question title】: Pivot a data frame and exclude blank cells in R
【Posted】: 2018-12-13 21:32:14
【Question description】:

Given a data frame dat of the following form:

    property_id                      tenant count
1              1     Burlington Coat Factory     1
2              1                      Macy's     2
3              1                       Sears     3
4              1                AMC Theatres     4
5              1                 Macy's Home     5
6              2     Burlington Coat Factory     1
7              2                    JCPenney     2
8              2                  Value City     3
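
To try the answers on this sample, dat can be rebuilt by hand. This is only a sketch: it assumes tenant is stored as character rather than factor, and that count is simply the tenant's running position within each property. The name count_check is introduced here for illustration and is not part of the original question.

```r
# Rebuild the sample data shown above.
# Assumption: tenant is character, not factor.
dat <- data.frame(
  property_id = c(1, 1, 1, 1, 1, 2, 2, 2),
  tenant = c("Burlington Coat Factory", "Macy's", "Sears", "AMC Theatres",
             "Macy's Home", "Burlington Coat Factory", "JCPenney", "Value City"),
  count = c(1, 2, 3, 4, 5, 1, 2, 3),
  stringsAsFactors = FALSE
)

# If count were missing, a per-property running index could be derived like this
# (count_check is a hypothetical helper name used only for this illustration).
count_check <- ave(dat$property_id, dat$property_id, FUN = seq_along)
print(dat)
```

If the real data has no count column, the ave() line shows one way such an index might be created before pivoting.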

How can we get the following result?

property_id                      X1       X2         X3           X4          X5
          1 Burlington Coat Factory   Macy's      Sears AMC Theatres Macy's Home
          2 Burlington Coat Factory JCPenney Value City         <NA>        <NA>

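Melt/reshape seems to produce a mostly sparse matrix.

I have managed it, very laboriously, with the following approach, but it is ugly and I am looking for a better way: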

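# Pre-allocate one row per property and up to 20 tenant columns
# (1167 and 20 are hard-coded for the full data set)
df <- data.frame(matrix(NA, 1167, 20))
df['id'] <- unique(dat$property_id)
for (i in seq_len(nrow(df))) {
  # Write the tenants of the i-th property into the first columns of row i
  tenants <- subset(dat, property_id == df[i, 'id'])$tenant
  df[i, seq_along(tenants)] <- t(tenants)
}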

【Question discussion】:

    Tags: r dataframe reshape


    【Solution 1】:

    spread seems to be exactly what you need:

    library(tidyverse)
    spread(dat, count, tenant)
    # A tibble: 2 x 6
    #   property_id `1`                     `2`      `3`        `4`          `5`        
    #         <dbl> <chr>                   <chr>    <chr>      <chr>        <chr>      
    # 1           1 Burlington Coat Factory Macy's   Sears      AMC Theatres Macy's Home
    # 2           2 Burlington Coat Factory JCPenney Value City NA           NA         
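
In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(). The following is a sketch rather than part of the original answer: it assumes tidyr >= 1.0.0 is installed, rebuilds the sample dat so that it runs on its own, and uses names_prefix to produce the X1 ... X5 column names asked for in the question.

```r
library(tidyr)

# Sample data from the question (assumption: tenant is character)
dat <- data.frame(
  property_id = c(1, 1, 1, 1, 1, 2, 2, 2),
  tenant = c("Burlington Coat Factory", "Macy's", "Sears", "AMC Theatres",
             "Macy's Home", "Burlington Coat Factory", "JCPenney", "Value City"),
  count = c(1, 2, 3, 4, 5, 1, 2, 3),
  stringsAsFactors = FALSE
)

# One column per value of count; properties with fewer tenants get NA
wide <- pivot_wider(dat,
                    names_from = count,
                    values_from = tenant,
                    names_prefix = "X")
print(wide)
```

Positions that a property does not have (the fourth and fifth tenant of property 2 here) are filled with NA, so no sparse matrix results as long as count is a running index within each property.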
    

    Another option:

    library(reshape2)
    dcast(dat, property_id ~ count, value.var = "tenant")
    #   property_id                       1        2          3            4           5
    # 1           1 Burlington Coat Factory   Macy's      Sears AMC Theatres Macy's Home
    # 2           2 Burlington Coat Factory JCPenney Value City         <NA>        <NA>
    

    And finally, in base R:

    reshape(dat, v.names = "tenant", idvar = "property_id", timevar = "count", direction = "wide")
    #   property_id                tenant.1 tenant.2   tenant.3     tenant.4    tenant.5
    # 1           1 Burlington Coat Factory   Macy's      Sears AMC Theatres Macy's Home
    # 6           2 Burlington Coat Factory JCPenney Value City         <NA>        <NA>
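
The base R result uses tenant.1 ... tenant.5 as column names and keeps the original row names (1 and 6). If the X1 ... X5 layout from the question is wanted, a small cleanup step might look like the sketch below. It rebuilds the sample dat so that it runs on its own, and the name wide is introduced here for illustration.

```r
# Sample data from the question (assumption: tenant is character)
dat <- data.frame(
  property_id = c(1, 1, 1, 1, 1, 2, 2, 2),
  tenant = c("Burlington Coat Factory", "Macy's", "Sears", "AMC Theatres",
             "Macy's Home", "Burlington Coat Factory", "JCPenney", "Value City"),
  count = c(1, 2, 3, 4, 5, 1, 2, 3),
  stringsAsFactors = FALSE
)

wide <- reshape(dat, v.names = "tenant", idvar = "property_id",
                timevar = "count", direction = "wide")

# Rename tenant.1, tenant.2, ... to X1, X2, ...
names(wide) <- sub("^tenant\\.", "X", names(wide))

# reshape() keeps the row name of the first row per id; reset to 1, 2, ...
rownames(wide) <- NULL
print(wide)
```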
    

    【Discussion】:
