【Question title】: Change count to probability in this histogram
【Posted】: 2020-04-10 06:12:47
【Question description】:

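I am trying to change the counts in the following histogram to probabilities without messing up the red region. Also, how can I align the 1 and 10 with the rest of the numbers on the x-axis?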

library(dplyr)
library(tibble)
library(ggplot2)
nrep = 10000
scientific <- function(x){
  ifelse(x==1e+0, "1", ifelse(x==1e+1,"10",parse(text=gsub("[+]", "", gsub("1e+", "10^", scales::scientific_format()(x))))))
}
bw <- 0.05
mx=rf(nrep,5,2)
df = tibble(x = mx)
ggplot(df,aes(x)) + 
geom_histogram(binwidth=bw, color="white", fill = "#1380A1") + 
geom_histogram(data=df %>% filter(x < 10^(-1) + 1.15*bw), binwidth=bw, color="white", fill = "red") +
geom_density(aes(y = bw*after_stat(count)), color="blue") +
scale_x_continuous(trans="log10", breaks = 10^seq(-1, 5, by = 1), labels = scientific)

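【Question discussion】:

  • Unfortunately, I can't reproduce your code. To convert geom_histogram from count to density, you need to add y = after_stat(density) to aes(). You can then use a fill condition inside aes() to highlight part of the histogram. See github.com/yutannihilation/gghighlight/issues/…
  • @atsyplenkov Thanks. I edited the code. It should work now.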
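
The single-layer approach suggested in the first comment can be sketched as follows. This is a minimal illustration, not the commenter's own code: the seed, the `cutoff` variable and the manual fill scale are assumptions added here. Because the x-axis is log10-transformed, the bin centers that `after_stat(x)` exposes are in log10 units, so the cutoff is expressed in log10 units as well. Unlike the two-layer plot, this colors whole bins only.

```r
library(ggplot2)

set.seed(1)                       # assumption: seed added for reproducibility
nrep <- 10000
bw   <- 0.05
df   <- data.frame(x = rf(nrep, 5, 2))

# Assumption: same threshold as in the question, moved to the log10 scale
cutoff <- log10(10^(-1) + 1.15 * bw)

p <- ggplot(df, aes(x)) +
  # One layer only: density is computed from all the data, and the fill is
  # decided per bin after the statistic has been computed
  geom_histogram(aes(y = after_stat(density),
                     fill = after_stat(x < cutoff)),
                 binwidth = bw, color = "white") +
  scale_fill_manual(values = c("FALSE" = "#1380A1", "TRUE" = "red"),
                    guide = "none") +
  geom_density(color = "blue") +
  scale_x_continuous(trans = "log10", breaks = 10^seq(-1, 5, by = 1))

# Layer data of the histogram, useful for inspecting the computed densities
hist_data <- ggplot_build(p)$data[[1]]
```

Since every bar comes from the same layer, all bars share one denominator and no separate rescaling of the highlighted region is needed.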

Tags: r ggplot2 histogram


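【Solution 1】:

You need to change the scale of the geom_histogram layers from count to density. In the first histogram you can do this with after_stat(density), which is equivalent to after_stat(count/sum(count))/bw. The same approach does not work for the second histogram, however, because sum(count) is different once you subset the data. If you do it anyway, the second histogram ends up on a different scale: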

library(dplyr)
library(tibble)
library(ggplot2)
nrep = 10000
scientific <- function(x){
  ifelse(x==1e+0, "1", ifelse(x==1e+1,"10",parse(text=gsub("[+]", "", gsub("1e+", "10^", scales::scientific_format()(x))))))
}
bw <- 0.05
mx=rf(nrep,5,2)

df = tibble(x = mx) 
pdf <- df %>% filter(x < 10^(-1) + 1.15*bw)
ggplot() + 
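  geom_histogram(data = df,
                 aes(x = x, y = after_stat(count/sum(count)/bw)),
                 binwidth=bw, color="white", fill = "#1380A1") + 
  # sum(count) here is the size of the subset, not nrep,
  # so this layer ends up on a different scale than the first one
  geom_histogram(data = pdf, 
                 aes(x = x, y = after_stat(count/sum(count)/bw)),
                 binwidth=bw, color="white", fill = "red") +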
  geom_density(data = df, 
               aes(x = x), color="blue") +
  scale_x_continuous(trans="log10", breaks = 10^seq(-1, 5, by = 1), 
                     labels = scientific)

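Therefore, the density of the second histogram has to be computed with the same denominator as the first histogram, which is nrep: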

library(dplyr)
library(tibble)
library(ggplot2)
nrep = 10000
scientific <- function(x){
  ifelse(x==1e+0, "1", ifelse(x==1e+1,"10",parse(text=gsub("[+]", "", gsub("1e+", "10^", scales::scientific_format()(x))))))
}
bw <- 0.05
mx=rf(nrep,5,2)

df = tibble(x = mx) 
pdf <- df %>% filter(x < 10^(-1) + 1.15*bw)
ggplot() + 
  geom_histogram(data = df,
                 aes(x = x, y = after_stat(density)),
                 binwidth=bw, color="white", fill = "#1380A1") + 
  # divide by nrep (the full sample size) instead of the subset's own sum(count)
  geom_histogram(data = pdf, 
                 aes(x = x, y = after_stat(count/nrep)/bw),
                 binwidth=bw, color="white", fill = "red") +
  geom_density(data = df, 
               aes(x = x), color="blue") +
  scale_x_continuous(trans="log10", breaks = 10^seq(-1, 5, by = 1), 
                     labels = scientific)
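
The effect of the denominator can also be checked numerically outside ggplot2. The following is a minimal sketch using base R's hist() on log10-transformed data; the seed, the bin grid and all variable names are assumptions made for illustration, and the grid is not meant to reproduce ggplot2's exact bin placement.

```r
set.seed(42)                      # assumption: seed added for reproducibility
nrep <- 10000
bw   <- 0.05
lx   <- log10(rf(nrep, 5, 2))     # work on the log10 scale, as the plot does

# One shared grid of bin edges, padded by one bin on each side
breaks <- seq(floor(min(lx) / bw) - 1, ceiling(max(lx) / bw) + 1) * bw

# Subset corresponding to the red region of the plot
cutoff <- log10(10^(-1) + 1.15 * bw)
sub    <- lx[lx < cutoff]

all_counts <- hist(lx,  breaks = breaks, plot = FALSE)$counts
sub_counts <- hist(sub, breaks = breaks, plot = FALSE)$counts

dens_all       <- all_counts / nrep / bw             # same as after_stat(density)
dens_sub_wrong <- sub_counts / sum(sub_counts) / bw  # rescaled by the subset size
dens_sub_right <- sub_counts / nrep / bw             # rescaled by the full sample size

# Bins that lie entirely below the cutoff
full <- which(breaks[-1] <= cutoff)

# Side-by-side comparison for the fully covered bins that contain data
print(cbind(all = dens_all, wrong = dens_sub_wrong, right = dens_sub_right)[
  full[all_counts[full] > 0], ])
```

In the bins that lie entirely below the cutoff, the `right` column matches the full-data density, while the `wrong` column is inflated by a factor of nrep divided by the subset size.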

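【Discussion】:

  • Thanks. Do you know how to align the 1 and 10 on the x-axis with the rest of the numbers? When I change 10^0 and 10^1 to 1 and 10, they sit slightly higher than the other labels.
  • I don't think it is feasible to give selected tick labels a different alignment from the overall alignment of the x-axis labels.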