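【Posted】: 2020-10-08 12:01:07
【Question】:
Given some random data such as x = 10.08506, 10.32809, ..., how can I efficiently build a table of class frequencies? The result (see the reproducible example below) should look like this: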
classes n
(10,10.1] 3
(10.1,10.2] 1
(10.2,10.3] 0
(10.3,10.4] 2
(10.4,10.5] 3
(10.5,10.6] 0
(10.6,10.7] 0
(10.7,10.8] 1
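Here is a reproducible example showing the simplest approach I have found so far. Can I get rid of the data.frame df and the full_join? Perhaps I can also get rid of br and h?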
library(dplyr)
set.seed(1)
number_of_observations <- 10
nbr <- 10  # classes per unit, so each class has width 1/nbr
x <- rnorm(n = number_of_observations, mean = 10.273, sd = 0.3)
# Break points on a 1/nbr grid covering the range of x
br <- seq(from = ceiling(min(nbr * x) - 1) / nbr,
          to = floor(max(nbr * x) + 1) / nbr, by = 1 / nbr)
h <- hist(x, breaks = br)
# One row per class, built from the histogram midpoints
df <- tibble(
  classes = h$mids)
df <- df %>%
  mutate(classes = cut(classes, breaks = br)) %>%
  group_by(classes) %>%
  mutate(n = n()) %>%
  ungroup() %>%
  mutate(freq = n / sum(n)) %>%
  arrange(classes)
# Observed counts per class; classes without observations are absent here
df2 <- tibble(
  classes = x)
df2 <- df2 %>%
  mutate(classes = cut(classes, breaks = br)) %>%
  group_by(classes) %>%
  mutate(n = n()) %>%
  ungroup() %>%
  mutate(freq = n / sum(n)) %>%
  arrange(classes) %>%
  distinct()
# Join so that empty classes appear too, then set their count to 0
df <- df %>% full_join(df2, by = "classes")
df$n.y[is.na(df$n.y)] <- 0
result <- df[, c("classes", "n.y")]
colnames(result) <- c("classes", "n")
result
【Discussion】:
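One possible simplification, as a minimal sketch rather than a definitive answer: cut() returns a factor that carries every class as a level, and table() counts all levels of a factor, including empty ones. That should make df, h and the full_join unnecessary. The sketch keeps br exactly as defined in the question; only the last two statements are new.

```r
set.seed(1)
number_of_observations <- 10
nbr <- 10
x <- rnorm(n = number_of_observations, mean = 10.273, sd = 0.3)

# Same break points as in the question
br <- seq(from = ceiling(min(nbr * x) - 1) / nbr,
          to = floor(max(nbr * x) + 1) / nbr, by = 1 / nbr)

# cut() assigns each value to a class; table() counts every factor level,
# so classes without observations show up with a count of 0
counts <- table(classes = cut(x, breaks = br))

# Turn the one-dimensional table into a data.frame with columns classes and n
result <- as.data.frame(counts, responseName = "n")
result
```

With the seed from the question this should reproduce the table shown above. If a dplyr pipeline is preferred, tibble(classes = cut(x, breaks = br)) %>% count(classes, .drop = FALSE) is likely an equivalent route, since .drop = FALSE keeps empty factor levels.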