How can I count unique values by year in R? [duplicate]

【Question title】: How can I count unique values by year in R? [duplicate]
【Posted】: 2020-02-28 05:07:56
【Question description】:

I have a dataset like this:

Year Month Day Location Target Perpetrator
1970  5     1  Place1   x      A
1970  7     5  Place2   y      A
1971  2     3  Place3   x      B
1972  10    8  Place4   x      C
1972  12   13  Place2   y      C
1973  1     3  Place5   z      B

I have no idea how to approach this. I tried:

data <- data %>%
  distinct() %>%
  count(Perpetrator)

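But of course this only gives me the count of each unique value in "Perpetrator" across the whole dataset.

The output I want is the count of each unique value in "Perpetrator" by year. How can I do this?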

【Discussion】:

Try data %>% group_by(Year) %>% distinct() %>% count(Perpetrator)
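This is exactly what I wanted! I had been trying for hours, thank you so much.
Another way (one that does not produce a tibble) is ddply(data, .(Year), summarise, n = n_distinct(Perpetrator)) from the plyr package. I personally prefer this approach because I hate tibbles :P
Does this answer your question? How to add count of unique values by group to R data.frame
You can count over multiple variables: data %>% count(Year, Perpetrator)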
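
The dplyr suggestions above can be sketched as follows. This is a minimal illustration, assuming dplyr is installed and using only the two relevant columns from the question's sample data. The names `per_pair` and `per_year` are made up for this example. The first result counts rows for each Year/Perpetrator combination. The second counts how many distinct perpetrators appear in each year, which may or may not be what the asker meant.

```r
library(dplyr)

# Sample data from the question, reduced to the columns used here
data <- data.frame(
  Year = c(1970L, 1970L, 1971L, 1972L, 1972L, 1973L),
  Perpetrator = c("A", "A", "B", "C", "C", "B"),
  stringsAsFactors = FALSE
)

# Number of rows for each Year/Perpetrator combination
per_pair <- data %>% count(Year, Perpetrator)

# Number of distinct perpetrators within each year
per_year <- data %>%
  group_by(Year) %>%
  summarise(n = n_distinct(Perpetrator))

print(per_pair)
print(per_year)
```

With this sample every year has a single perpetrator, so `per_year$n` is 1 throughout, while `per_pair$n` reflects how many incidents each perpetrator has in that year.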

Tags: r grouping unique

【Solution 1】:

In base R, we can use tapply.

with(dat, tapply(Perpetrator, Year, FUN=length))
# 1970 1971 1972 1973 
#    2    1    2    1

Data:

dat <- structure(list(Year = c(1970L, 1970L, 1971L, 1972L, 1972L, 1973L
), Month = c(5L, 7L, 2L, 10L, 12L, 1L), Day = c(1L, 5L, 3L, 8L, 
13L, 3L), Location = c("Place1", "Place2", "Place3", "Place4", 
"Place2", "Place5"), Target = c("x", "y", "x", "x", "y", "z"), 
    Perpetrator = c("A", "A", "B", "C", "C", "B")), row.names = c(NA, 
-6L), class = "data.frame")
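
Note that `FUN=length` counts rows per year, not distinct values. The two happen to differ here only because 1970 and 1972 each contain the same perpetrator twice. As a hedged sketch of the alternatives in base R (the variable names below are made up for this example, and the data is rebuilt with just the two relevant columns), one could compare the row count, the distinct count, and a Year-by-Perpetrator contingency table:

```r
# Sample data from the question, reduced to the columns used here
dat <- data.frame(
  Year = c(1970L, 1970L, 1971L, 1972L, 1972L, 1973L),
  Perpetrator = c("A", "A", "B", "C", "C", "B"),
  stringsAsFactors = FALSE
)

# Rows per year (what FUN = length returns)
rows_per_year <- with(dat, tapply(Perpetrator, Year, FUN = length))

# Distinct perpetrators per year
distinct_per_year <- with(
  dat,
  tapply(Perpetrator, Year, FUN = function(x) length(unique(x)))
)

# Rows for each Year/Perpetrator combination as a contingency table
year_by_perp <- table(dat$Year, dat$Perpetrator)

print(rows_per_year)
print(distinct_per_year)
print(year_by_perp)
```

Which of the three is appropriate depends on how "count of each unique value by year" is read. The contingency table is the closest base R counterpart to `count(Year, Perpetrator)`.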

【Discussion】: