【Question title】: data.table - select rows by multiple group columns
【Posted】: 2018-11-27 18:17:24
【Question】:

Some data (taken from https://www.r-bloggers.com/two-of-my-favorite-data-table-features/):

# generate a small dataset
set.seed(1234)
smalldat <- data.frame(group1 = rep(1:2, each = 5), 
                       group2 = rep(c('a','b'), times = 5), 
                       x = rnorm(10))

# convert the data.frame to a data.table
library(data.table)
smalldat <- data.table(smalldat)

# add the per-group1 mean of x as a column on the raw data
smalldat[, aggGroup1 := mean(x), by = group1]

# add the mean of x per (group1, group2) combination
smalldat[, aggGroup1.2 := mean(x), by = list(group1, group2)]



Output

##     group1 group2       x aggGroup1 aggGroup1.2
##  1:      1      a -1.2071   -0.3524      0.1022
##  2:      1      b  0.2774   -0.3524     -1.0341
##  3:      1      a  1.0844   -0.3524      0.1022
##  4:      1      b -2.3457   -0.3524     -1.0341
##  5:      1      a  0.4291   -0.3524      0.1022
##  6:      2      b  0.5061   -0.4140     -0.3102
##  7:      2      a -0.5747   -0.4140     -0.5696
##  8:      2      b -0.5466   -0.4140     -0.3102
##  9:      2      a -0.5645   -0.4140     -0.5696
## 10:      2      b -0.8900   -0.4140     -0.3102

How can I select, for each group1, the row where aggGroup1.2 is at its minimum, while keeping the group2 information?

The result should look like this:

group1   group2   aggGroup1.2
1        b        -1.0341
2        a        -0.5696

I tried using data.table syntax, but failed...

【Discussion】:

    Tags: r data.table subset


    【Solution 1】:

    Here is one way:

    smalldat[, .(group2 = group2[which.min(aggGroup1.2)], aggGroup1.2 = min(aggGroup1.2)), by = group1]
    #    group1 group2 aggGroup1.2
    # 1:      1      b   -1.034134
    # 2:      2      a   -0.569596
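
A possible variant, offered only as a sketch: aggregate to one row per (group1, group2) pair first, then keep the minimum row within each group1. The names `dt`, `agg` and `res` are illustrative and not from the original post, and the `x` values are hard-coded from the rounded output printed in the question so that the example does not depend on the random seed.

```r
library(data.table)

# x values copied (rounded) from the output shown in the question
dt <- data.table(group1 = rep(1:2, each = 5),
                 group2 = rep(c("a", "b"), times = 5),
                 x = c(-1.2071, 0.2774, 1.0844, -2.3457, 0.4291,
                       0.5061, -0.5747, -0.5466, -0.5645, -0.8900))

# one row per (group1, group2) pair, holding the mean of x
agg <- dt[, .(aggGroup1.2 = mean(x)), by = .(group1, group2)]

# within each group1, keep the row whose mean is smallest
res <- agg[, .SD[which.min(aggGroup1.2)], by = group1]
print(res)
```

Because `agg` has a single row per pair, no duplicated rows need to be handled; `which.min` returns the first minimum if several pairs tie.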
    

    【Discussion】:

      【Solution 2】:

      In addition to Gregor's answer, you can also try the following to get the entire row:

      smalldat[smalldat[, .I[which.min(aggGroup1.2)], by = group1][, V1]]
      
         group1 group2          x  aggGroup1 aggGroup1.2
      1:      1      b  0.2774292 -0.3523537   -1.034134
      2:      2      a -0.5747400 -0.4139612   -0.569596
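
To make the one-liner above easier to follow, here is a sketch that splits it into two steps. Inside `j`, `.I` holds the row numbers of the current group within the full table, so `.I[which.min(...)]` gives the position of each group's minimum row. The names `dt`, `idx` and `rows` are illustrative, and the `x` values are again hard-coded from the rounded output in the question.

```r
library(data.table)

# x values copied (rounded) from the output shown in the question
dt <- data.table(group1 = rep(1:2, each = 5),
                 group2 = rep(c("a", "b"), times = 5),
                 x = c(-1.2071, 0.2774, 1.0844, -2.3457, 0.4291,
                       0.5061, -0.5747, -0.5466, -0.5645, -0.8900))
dt[, aggGroup1.2 := mean(x), by = .(group1, group2)]

# step 1: per group1, the full-table row number of the minimum aggGroup1.2
# (the unnamed result column is auto-named V1)
idx <- dt[, .I[which.min(aggGroup1.2)], by = group1]

# step 2: subset the table by those row numbers to get complete rows
rows <- dt[idx$V1]
print(rows)
```

Here `idx$V1` is `c(2, 7)`, which matches the two rows shown in the answer. When several rows tie for the minimum, `which.min` keeps only the first of them.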
      

      【Discussion】:

      • If you have many columns and want all of them, this approach is more efficient!