【Question title】: Generating a histogram and density plot from binned data
【Posted】: 2015-04-29 16:07:41
【Question】:

I have already binned some data and now have a data frame with two columns, one giving the bin range and the other the frequency, like this:

> head(data)
      binRange Frequency
1    (0,0.025]        88
2 (0.025,0.05]        72
3 (0.05,0.075]        92
4  (0.075,0.1]        38
5  (0.1,0.125]        20
6 (0.125,0.15]        16

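I would like to plot a histogram and a density plot from this, but I can't seem to find a way to do so without having to generate new bins. Following the solution here, I tried the following: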

p <- ggplot(data, aes(x= binRange, y=Frequency)) + geom_histogram(stat="identity")

But it crashes. Does anyone know how to deal with this?

Thanks

【Comments】:

  • Take a look at this post
  • Thank you, I have just updated my post. I tried to do this for my data, so I ran p <- ggplot(data, aes(x= binRange, y=Frequency)) + geom_histogram(stat="identity"), but it just crashes
  • What error message do you get?
  • Replace geom_histogram with geom_bar
  • Well, mainly when I go to print it, I get the following error: Error in withCallingHandlers (tryCatch (evalq((function (i) : object '.rcpp_warning_recorder' not found

Tags: r ggplot2 histogram density-plot


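【Solution 1】:

The problem is that ggplot does not understand the data in the form you supplied it, so you need to reshape it like this (I am not a regex guru, so there is surely a better way):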

df <- read.table(header = TRUE, text = "
                 binRange Frequency
1    (0,0.025]        88
2 (0.025,0.05]        72
3 (0.05,0.075]        92
4  (0.075,0.1]        38
5  (0.1,0.125]        20
6 (0.125,0.15]        16")

library(stringr)
library(splitstackshape)
library(ggplot2)
# extract the "lower,upper" part of each label, dropping "(" and "]"
df$binRange <- str_extract(df$binRange, "[0-9].*[0-9]+")

# split on the comma into two columns:
# binRange_1 (start point) and binRange_2 (end point)
df <- cSplit(df, "binRange", sep = ",")

# plot it; you don't actually need the second column
# (adding half the bin width centers each bar on its bin)
ggplot(df, aes(x = binRange_1 + 0.0125, y = Frequency)) +
    geom_bar(stat = "identity", width = 0.025) +
    scale_x_continuous(breaks = seq(0, 0.15, by = 0.025))
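As one possible alternative to the stringr/splitstackshape step above, the bin edges can probably be pulled out with base R alone. This is only a sketch: the column names `lower`, `upper` and `mid` are made up for illustration, and it assumes every label has the exact form "(a,b]".

```r
# Same binned data as in the question, built directly
df <- data.frame(
  binRange  = c("(0,0.025]", "(0.025,0.05]", "(0.05,0.075]",
                "(0.075,0.1]", "(0.1,0.125]", "(0.125,0.15]"),
  Frequency = c(88, 72, 92, 38, 20, 16),
  stringsAsFactors = FALSE
)

# Strip the leading "(" and trailing "]", then split each label on the comma
edges <- strsplit(gsub("^\\(|\\]$", "", df$binRange), ",", fixed = TRUE)

# Hypothetical column names: lower/upper bin edges and the bin midpoint
df$lower <- as.numeric(vapply(edges, `[`, character(1), 1))
df$upper <- as.numeric(vapply(edges, `[`, character(1), 2))
df$mid   <- (df$lower + df$upper) / 2

print(df)
```

The `mid` column can then be used as the x aesthetic, with `upper - lower` as the bar width, instead of hard-coding 0.025.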

Or, if you don't want the data to be interpreted numerically, you can simply do the following:

df <- read.table(header = TRUE, text = "
                 binRange Frequency
1    (0,0.025]        88
2 (0.025,0.05]        72
3 (0.05,0.075]        92
4  (0.075,0.1]        38
5  (0.1,0.125]        20
6 (0.125,0.15]        16")

library(ggplot2)
ggplot(df, aes(x = binRange, y = Frequency)) + geom_bar(stat = "identity")
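One thing to watch with the categorical version: when binRange is a plain character column, ggplot2 orders the bars alphabetically by label. That happens to match the numeric order for these six bins, but it may not for other label sets (for example bins above 10). A small sketch that pins the order, assuming the rows are already sorted by bin:

```r
library(ggplot2)

df <- data.frame(
  binRange  = c("(0,0.025]", "(0.025,0.05]", "(0.05,0.075]",
                "(0.075,0.1]", "(0.1,0.125]", "(0.125,0.15]"),
  Frequency = c(88, 72, 92, 38, 20, 16),
  stringsAsFactors = FALSE
)

# Fix the factor levels to the row order (assumes rows are sorted by bin)
df$binRange <- factor(df$binRange, levels = unique(df$binRange))

p <- ggplot(df, aes(x = binRange, y = Frequency)) +
  geom_bar(stat = "identity")
print(p)
```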

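You won't be able to draw a density plot from your data, because it is categorical rather than continuous, which is why I actually prefer the second way of displaying it.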
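A smoothed kernel density does need the raw observations. If a histogram drawn on the density scale is enough, though, the counts can be rescaled so that the bar areas sum to 1. The sketch below is only that: it assumes the six bins shown are the whole data set and that the bin edges have already been parsed into numeric columns (the names `lower`, `upper`, `width` and `density` are made up for illustration).

```r
library(ggplot2)

# Bin edges and counts from the question, with edges already numeric
df <- data.frame(
  lower     = c(0, 0.025, 0.05, 0.075, 0.1, 0.125),
  upper     = c(0.025, 0.05, 0.075, 0.1, 0.125, 0.15),
  Frequency = c(88, 72, 92, 38, 20, 16)
)

# density = count / (total count * bin width), so bar areas sum to 1
df$width   <- df$upper - df$lower
df$density <- df$Frequency / (sum(df$Frequency) * df$width)

# Draw each bin as a rectangle spanning its own edges
p <- ggplot(df, aes(xmin = lower, xmax = upper, ymin = 0, ymax = density)) +
  geom_rect(colour = "white") +
  labs(x = "value", y = "density")
print(p)
```

Drawing with geom_rect keeps each bar on its own bin edges, so this would also cope with unequal bin widths.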

【Comments】:

    【Solution 2】:

    You can try

    library(ggplot2)
    ggplot(df, aes(x = binRange, y = Frequency)) + geom_col()
    

    【Comments】:
