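As far as I can tell, weight is not among the aesthetics documented for geom_histogram itself (weights are handled by the underlying stat_bin), so I don't understand how you want tonne.km to be treated. But if you want to superimpose a CDF on a histogram, here is a way.
First of all, realize that the empirical histogram density and the ECDF are on different scales: the ECDF runs from 0 to 1, while the density is often many times smaller, especially when the distribution is continuous and the sample is large. The main trick is then to scale the ECDF by the maximum density y value.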
library(ggplot2)
library(scales)

set.seed(2022)  # make the simulated data reproducible
distance <- rnorm(1000000, mean = 1000, sd = 500)
tonne.km <- rnorm(1000000, mean = 25000, sd = 500)
dist.tk.test <- data.frame(distance, tonne.km)

bins <- 50L
x_breaks <- 10L
# peak of the estimated density, used to rescale the ECDF
max_y <- max(density(dist.tk.test$distance)$y)

ggplot(dist.tk.test) +
  geom_histogram(
    # after_stat(density) replaces the deprecated ..density.. notation
    aes(x = distance, y = after_stat(density)), bins = bins
  ) +
  # ECDF, rescaled so that 100% lines up with max_y
  geom_line(
    aes(
      x = sort(distance),
      y = max_y * seq_along(distance) / length(distance)
    ),
    color = "red"
  ) +
  scale_x_continuous(
    labels = comma,
    breaks = extended_breaks(x_breaks)
  ) +
  scale_y_continuous(
    name = "Density",
    # undo the rescaling so the secondary axis reads as a share
    sec.axis = sec_axis(~ .x / max_y,
                        labels = scales::percent,
                        name = "Cumulative Share (%)")
  )
Created on 2022-08-17 by the reprex package (v2.0.1)
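To make the scale mismatch concrete, here is a minimal sketch on a smaller simulated sample. The names x, raw_ecdf and ecdf_scaled are illustrative, and the sample size and seed are arbitrary choices, not taken from the question. The estimated density peaks far below 1, while the raw ECDF climbs to 1, so multiplying the ECDF by max_y puts both curves on the same axis.

```r
# Assumption: a smaller sample than in the answer, just to keep this quick
set.seed(1)
x <- rnorm(1e4, mean = 1000, sd = 500)

max_y <- max(density(x)$y)       # peak of the kernel density estimate, far below 1
raw_ecdf <- ecdf(x)(sort(x))     # empirical CDF at the sorted points, ends at 1
ecdf_scaled <- max_y * raw_ecdf  # rescaled so that it tops out at max_y

# Compare the three maxima side by side
c(density_peak = max_y, ecdf_top = max(raw_ecdf), scaled_top = max(ecdf_scaled))
```

Dividing by the same max_y on the secondary axis, as sec_axis does in the plot, maps the rescaled curve back to a 0 to 100% share.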
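Edit
Following the comment below, here is another solution.
First compute the total tonne.km per distance.
To do this, the distances must be binned. I use findInterval to bin them and then sum tonne.km within each bin (variable breaks) with aggregate. The result is the data.frame used in the plot.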
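The binning step is easier to follow on a tiny example. This is only a sketch with made-up numbers: the data frame toy, the vector edges and the column share are hypothetical names introduced here for illustration.

```r
# Toy data: five distances with made-up tonne.km values
toy <- data.frame(
  distance = c(12, 48, 55, 97, 140),
  tonne.km = c(10, 20, 30, 40, 50)
)

edges <- seq(0, 150, by = 50)             # bin edges: 0, 50, 100, 150
idx <- findInterval(toy$distance, edges)  # bin index per row: 1 1 2 2 3
toy$breaks <- edges[idx]                  # left edge of each row's bin: 0 0 50 50 100

# Sum tonne.km within each bin, then compute the cumulative share
binned <- aggregate(tonne.km ~ breaks, toy, sum)
binned$share <- cumsum(binned$tonne.km) / sum(binned$tonne.km)
binned
#>   breaks tonne.km     share
#> 1      0       30 0.2000000
#> 2     50       70 0.6666667
#> 3    100       50 1.0000000
```

Note that findInterval returns 0 for values below the first edge, so the edges need to cover the smallest distance.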
library(ggplot2)
library(scales)

set.seed(2022)
distance <- rnorm(1000000, mean = 1000, sd = 500)
tonne.km <- rnorm(1000000, mean = 25000, sd = 500)
dist.tk.test <- data.frame(distance, tonne.km)

# bin edges every 50 units; floor/ceiling guarantee the edges cover the
# whole range, so findInterval never returns 0 for the smallest distances
rng <- range(dist.tk.test$distance)
breaks <- seq(floor(rng[1] / 100) * 100, ceiling(rng[2] / 100) * 100, by = 50)
bins <- findInterval(dist.tk.test$distance, breaks)
# store the left edge of each observation's bin alongside the data
dist.tk.test$breaks <- breaks[bins]

# total tonne.km per distance bin
new_df <- aggregate(tonne.km ~ breaks, dist.tk.test, sum, na.rm = TRUE)

y_max <- max(new_df$tonne.km, na.rm = TRUE)
x_axis_breaks <- 10L

ggplot(new_df, aes(breaks, tonne.km)) +
  # width matches the bin size, so the bars do not overlap
  geom_col(width = 50) +
  # cumulative share of tonne.km, rescaled so that 100% lines up with y_max
  geom_line(
    aes(
      y = y_max * cumsum(tonne.km) / sum(tonne.km)
    ),
    color = "red"
  ) +
  scale_x_continuous(
    name = "Distance",
    labels = comma,
    breaks = extended_breaks(x_axis_breaks)
  ) +
  scale_y_continuous(
    name = "Tonne/Km",
    sec.axis = sec_axis(~ .x / y_max,
                        labels = scales::percent,
                        name = "Cumulative Share (%)")
  )
Created on 2022-08-17 by the reprex package (v2.0.1)