【Question title】: R - build unique groups based on consecutive rows and factor level
【Posted】: 2020-04-21 19:12:04
【Question】:

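In general, how can I group rows by the same factor level only when they come from consecutive rows in the data frame? For example, I would like to get the desired good_output below from test.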

library(dplyr)
test <- data.frame(time = 1:10, letter = c("a","a","a","b","a","a","a","b","b","b"))
bad_output <- test %>% group_by(letter) %>% summarize(mean_time = mean(time))
bad_output
# A tibble: 2 x 2
  letter mean_time
  <fct>      <dbl>
1 a           4   
2 b           7.75

good_output <- data.frame(letter=c("a","b","a","b"), id=c(1,1,2,2), mean_time=c(2,4,6,9))
good_output
  letter id mean_time
1      a  1         2
2      b  1         4
3      a  2         6
4      b  2         9

【Comments】:

    Tags: r group-by unique


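    【Solution 1】:

    We can group by 'letter' and by the run-length ID of 'letter' (rleid from data.table), summarise to get the mean of 'time', create a sequence column with row_number(), and then drop the 'grp' column. Because summarise removes only the last grouping level (grp), the result is still grouped by 'letter', so row_number() numbers the runs within each letter.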

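    library(dplyr)
    library(data.table)  # for rleid()
    test %>% 
        # grp increments each time letter differs from the previous row
        group_by(letter, grp = rleid(letter))  %>%
        summarise(mean_time = mean(time)) %>%   # result is still grouped by letter
        mutate(id = row_number()) %>%           # number the runs within each letter
        ungroup() %>%
        select(-grp)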
    # A tibble: 4 x 3
    #  letter mean_time    id
    #  <fct>      <dbl> <int>
    #1 a              2     1
    #2 a              6     2
    #3 b              4     1
    #4 b              9     2
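
To make the run-length ID idea more concrete, here is a minimal base R sketch that does not need data.table or dplyr. It is not the answer's method: it uses rle() as a stand-in for rleid(), and the names runs and out are only illustrative. Unlike the pipeline above, it keeps the rows in order of appearance, which matches the row order of good_output in the question.

```r
# Same data as in the question.
test <- data.frame(time = 1:10,
                   letter = c("a","a","a","b","a","a","a","b","b","b"))

# Run-length encoding of the letter column: one entry per consecutive run.
runs <- rle(as.character(test$letter))

# Run ID per row: increases by 1 each time letter changes (base R stand-in for rleid()).
test$grp <- rep(seq_along(runs$lengths), runs$lengths)
test$grp
# 1 1 1 2 3 3 3 4 4 4

# Mean of time per run, in run order.
mean_time <- as.vector(tapply(test$time, test$grp, mean))

# Letter of each run, and a counter of runs within each letter.
letter <- runs$values
id <- ave(seq_along(letter), letter, FUN = seq_along)

out <- data.frame(letter = letter, id = id, mean_time = mean_time)
out
#   letter id mean_time
# 1      a  1         2
# 2      b  1         4
# 3      a  2         6
# 4      b  2         9
```

If staying within dplyr is preferred, recent versions (1.1.0 and later, as far as I know) also provide consecutive_id(), which could replace rleid() in the grouping step.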
    

    【Comments】:

    • Excellent. rleid!