Count rows with the same pattern in R [duplicate]: answers

【Question title】: Count rows with the same pattern in R [duplicate]
【Posted】: 2021-12-30 17:13:33
【Question description】:

I have a data frame like the one below, and I need to count how many times each identical row pattern is repeated in it.

start_id | end_id | type | id
1        | 2      | a    | 1
2        | 5      | a    | 2
1        | 3      | b    | 3
2        | 5      | a    | 4
1        | 3      | b    | 5

The result I want looks like this:

start_id | end_id | type | n
1        | 2      | a    | 1
2        | 5      | a    | 2
1        | 3      | b    | 2

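I tried the following code, but it does not collapse the records: it returns the same rows as the input and only adds a new column with the count, which does not work for my analysis: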

Sumary <- clear_filt_trip  %>%
    group_by(start_id, end_id, type) %>% 
    add_count(across(everything()))

I also tried summarize, but it only repeated the columns.

What should I do?

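【Question discussion】:

See the duplicate link, and mentally replace every mention of "mean by group" with "length" or "number of rows" or similar; that leads to the same possible solutions (within dplyr or not).

Tags: r dataframe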

【Solution 1】:

dplyr

library(dplyr)
dat %>%
  group_by(start_id, end_id, type) %>%
  tally() %>%
  ungroup()
# # A tibble: 3 x 4
#   start_id end_id type      n
#      <dbl>  <dbl> <chr> <int>
# 1        1      2 a         1
# 2        1      3 b         2
# 3        2      5 a         2
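The group_by() + tally() + ungroup() pipeline above can likely be shortened with dplyr's count(), which groups, counts, and returns an ungrouped result in one call. A minimal sketch, rebuilding the same example data as a plain data frame (the name `res` is only illustrative):

```r
library(dplyr)

# Same example data as in the "Data" section of this answer
dat <- data.frame(
  start_id = c(1, 2, 1, 2, 1),
  end_id   = c(2, 5, 3, 5, 3),
  type     = c("a", "a", "b", "a", "b"),
  id       = 1:5
)

# count() collapses each unique (start_id, end_id, type) combination
# into a single row and stores the group size in a column named n
res <- dat %>% count(start_id, end_id, type)
print(res)
```

Because id is not listed in count(), it is dropped from the result, which is what allows the duplicated patterns to collapse.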

Base R

aggregate(. ~ start_id + end_id + type, data = dat, FUN = length)
#   start_id end_id type id
# 1        1      2    a  1
# 2        2      5    a  2
# 3        1      3    b  2
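Note that in the aggregate() output the counts end up in the column named id, since that is the column length() was applied to. If a column named n is preferred, as in the requested output, one possible sketch is to aggregate id explicitly and rename it afterwards (the name `res` is only illustrative):

```r
# Same example data as in the "Data" section of this answer
dat <- data.frame(
  start_id = c(1, 2, 1, 2, 1),
  end_id   = c(2, 5, 3, 5, 3),
  type     = c("a", "a", "b", "a", "b"),
  id       = 1:5
)

# Count the id values within each (start_id, end_id, type) group
res <- aggregate(id ~ start_id + end_id + type, data = dat, FUN = length)

# The counts are stored under "id"; rename that column to "n"
names(res)[names(res) == "id"] <- "n"
print(res)
```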

Data

dat <- structure(list(start_id = c(1, 2, 1, 2, 1), end_id = c(2, 5, 3, 5, 3), type = c("a", "a", "b", "a", "b"), id = 1:5), row.names = c(NA, -5L), class = "data.frame")


【Solution 2】:

One more option, in addition to r2evans's answer:

data.table

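library(data.table)

# The dput() data below lacks data.table's internal self-reference;
# setDT() restores it so that := does not trigger a warning
setDT(df)

# Drop the unique id column by reference
df[, id := NULL]

# Count rows (.N) for each unique combination of the remaining columns
df[, .N, by = names(df)]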

   start_id end_id type N
1:        1      2    a 1
2:        2      5    a 2
3:        1      3    b 2

Data:

df = structure(list(start_id = c(1L, 2L, 1L, 2L, 1L), end_id = c(2L, 
5L, 3L, 5L, 3L), type = c("a", "a", "b", "a", "b"), id = 1:5), row.names = c(NA, 
-5L), class = c("data.table", "data.frame"))
