【Question title】: dplyr - How to obtain the order of one column within a group?
【Posted】: 2018-08-17 11:23:10
【Question】:

Sample data:

tibbly = tibble(age = c(10,30,50,10,30,50,10,30,50,10,30,50),
              grouping1 = c("A","A","A","A","A","A","B","B","B","B","B","B"),
              grouping2 = c("X", "X", "X","Y","Y","Y","X","X","X","Y","Y","Y"),
              value = c(1,2,3,4,4,6,2,5,3,6,3,2))
> tibbly
# A tibble: 12 x 4
     age grouping1 grouping2 value
   <dbl> <chr>     <chr>     <dbl>
 1    10 A         X             1
 2    30 A         X             2
 3    50 A         X             3
 4    10 A         Y             4
 5    30 A         Y             4
 6    50 A         Y             6
 7    10 B         X             2
 8    30 B         X             5
 9    50 B         X             3
10    10 B         Y             6
11    30 B         Y             3
12    50 B         Y             2

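Question: How can I get the order of the rows within each group of the data frame? I can use dplyr to arrange the data in a form that shows what I am interested in: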

> tibbly %>% 
     group_by(grouping1, grouping2) %>%
     arrange(grouping1, grouping2, desc(value))
# A tibble: 12 x 4
# Groups:   grouping1, grouping2 [4]
     age grouping1 grouping2 value
   <dbl> <chr>     <chr>     <dbl>
 1    50 A         X             3
 2    30 A         X             2
 3    10 A         X             1
 4    50 A         Y             6
 5    10 A         Y             4
 6    30 A         Y             4
 7    30 B         X             5
 8    50 B         X             3
 9    10 B         X             2
10    10 B         Y             6
11    30 B         Y             3
12    50 B         Y             2

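What I ultimately want is, for each group, the order of the age column when the rows are sorted by the value column. Is there an elegant way to do this with dplyr? Something like summarise(), but based on the order of the rows rather than on the actual values.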

【Comments】:

  • Are you looking for row_number? As in tibbly %>% group_by(grouping1, grouping2) %>% arrange(grouping1, grouping2, desc(value)) %>% mutate(RowNum=row_number())
  • ...or something like this? tibbly %>% group_by(grouping1, grouping2) %>% arrange(grouping1, grouping2, desc(value)) %>% summarise(order = paste0(age, collapse = ",")) %>% ungroup()
  • @AntoniosK That is exactly what I wanted! The order of the age column within each group. I did not know about using paste with collapse in this situation. Very neat. Thanks!
  • @AntoniosK Please post your comment as an answer so that the question no longer shows up as unanswered.
  • @Mojoesque Note that arrange(grouping1, grouping2, desc(value)) is useful for visualization purposes, but arrange(desc(value)) after grouping is enough to do what you want.
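
The point made in the last comment can be checked with a small sketch. It assumes a reasonably recent dplyr, where arrange() ignores the grouping by default and summarise() keeps the current row order inside each group; the names full_sort and short_sort are made up for this illustration.

```r
library(dplyr)

# Same data as in the question, built with rep() for brevity
tibbly <- tibble(
  age = rep(c(10, 30, 50), times = 4),
  grouping1 = rep(c("A", "B"), each = 6),
  grouping2 = rep(rep(c("X", "Y"), each = 3), times = 2),
  value = c(1, 2, 3, 4, 4, 6, 2, 5, 3, 6, 3, 2)
)

# Sort by the grouping columns and then by value (the "visual" ordering)
full_sort <- tibbly %>%
  group_by(grouping1, grouping2) %>%
  arrange(grouping1, grouping2, desc(value)) %>%
  summarise(order = paste0(age, collapse = ",")) %>%
  ungroup()

# Sort by value only; summarise() still collapses age group by group
short_sort <- tibbly %>%
  group_by(grouping1, grouping2) %>%
  arrange(desc(value)) %>%
  summarise(order = paste0(age, collapse = ",")) %>%
  ungroup()

# Both pipelines give the same per-group order strings
identical(full_sort$order, short_sort$order)
```

Sorting by the grouping columns only changes how the intermediate table looks; the within-group order that summarise() sees is the same in both cases.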

Tags: r dplyr


【Solution 1】:
library(dplyr)

tibbly = tibble(age = c(10,30,50,10,30,50,10,30,50,10,30,50),
                grouping1 = c("A","A","A","A","A","A","B","B","B","B","B","B"),
                grouping2 = c("X", "X", "X","Y","Y","Y","X","X","X","Y","Y","Y"),
                value = c(1,2,3,4,4,6,2,5,3,6,3,2))


tibbly %>% 
  group_by(grouping1, grouping2) %>%                  # for each group
  arrange(desc(value)) %>%                            # arrange value descending
  summarise(order = paste0(age, collapse = ",")) %>%  # get the order of age as a string
  ungroup()                                           # forget the grouping

# # A tibble: 4 x 3
#   grouping1 grouping2 order   
#   <chr>     <chr>     <chr>   
# 1 A         X         50,30,10
# 2 A         Y         50,10,30
# 3 B         X         30,50,10
# 4 B         Y         10,30,50
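
As a possible variation on the answer above (not part of the original answer), the sorting can be done inside summarise() by indexing age with order(-value), so no separate arrange() step is needed. This is a minimal sketch; the result name res is hypothetical, and ties in value are assumed to keep their original row order, which is how base R's order() behaves.

```r
library(dplyr)

# Same data as in the question
tibbly <- tibble(
  age = c(10, 30, 50, 10, 30, 50, 10, 30, 50, 10, 30, 50),
  grouping1 = c("A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B"),
  grouping2 = c("X", "X", "X", "Y", "Y", "Y", "X", "X", "X", "Y", "Y", "Y"),
  value = c(1, 2, 3, 4, 4, 6, 2, 5, 3, 6, 3, 2)
)

res <- tibbly %>%
  group_by(grouping1, grouping2) %>%
  # order(-value) gives the within-group row positions sorted by descending value
  summarise(order = paste(age[order(-value)], collapse = ",")) %>%
  ungroup()

res$order
# "50,30,10" "50,10,30" "30,50,10" "10,30,50"
```

This keeps the original row order of tibbly untouched, which may matter if later steps in a longer pipeline rely on it.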

【Comments】:

    【Solution 2】:

    data.table

    library(data.table)
    setDT(tibbly)[order(-value), .(order = toString(age)),.(grouping1, grouping2)]
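
Two details of the one-liner above are worth keeping in mind: toString() joins with ", " (comma plus space), and by returns groups in order of first appearance after sorting. The sketch below is one way to get output matching the dplyr answer, assuming that is what is wanted; it swaps in paste(collapse = ",") and keyby, and builds a fresh data.table (the names dt and res are hypothetical) instead of converting tibbly by reference.

```r
library(data.table)

# Same data as in the question, created directly as a data.table
dt <- data.table(
  age = c(10, 30, 50, 10, 30, 50, 10, 30, 50, 10, 30, 50),
  grouping1 = c("A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B"),
  grouping2 = c("X", "X", "X", "Y", "Y", "Y", "X", "X", "X", "Y", "Y", "Y"),
  value = c(1, 2, 3, 4, 4, 6, 2, 5, 3, 6, 3, 2)
)

# Rows are sorted by descending value first, then collapsed per group;
# keyby sorts the result by the grouping columns
res <- dt[order(-value),
          .(order = paste(age, collapse = ",")),
          keyby = .(grouping1, grouping2)]

res$order
# "50,30,10" "50,10,30" "30,50,10" "10,30,50"
```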
    

    【Comments】:
