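【Question title】: Plotting average monthly counts per decade on a plot
【Posted】: 2020-02-08 05:58:17
【Question】:

I have a dataset of monthly "flows" spanning more than 68 years. I am trying to compare the flow distribution by decade, using a plot that has the seasonal distribution (the months) on the x-axis and shows the average for each decade.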

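【Comments】:

  • Hi ckc, please include a subset of your data to make a minimal reproducible example that we can cut and paste into our R session to help you find a solution. Thanks :)
  • See here on making an R question that folks can help with. That includes a sample of data, all necessary code, and a clear explanation of what you're trying to do and what hasn't worked.
  • Sorry! Very new to R. I've added some data.

Tags: r ggplot2 plot distribution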


【Solution 1】:

Using your sample data and the tidyverse package, the following code calculates the mean for each decade and month:

library(tidyverse)

x <- "Year    Jan     Feb     Mar     Apr     May     Jun     Jul     Aug     Sep
1948    29550   47330   64940   61140   20320   17540   37850   29250   17100   
1949    45700   53200   37870   36310   39200   23040   31170   23640   19720   
1950    16050   17950   27040   21610   15510   16090   12010   11360   14390   
1951    14280   13210   16260   24280   13570   9547    9921    8129    7304    
1952    19030   29250   58860   31780   19940   16930   9268    9862    9708    
1953    24340   28020   31830   29700   44980   15630   22660   14190   13430   
1954    34660   23260   24390   21500   13250   10860   10700   8188    6092    
1955    14050   19430   12780   19330   12210   7892    12450   10920   6850    
1956    7262    20800   27680   24110   13560   8594    10150   7721    10540   
1957    14470   13350   22720   39860   23980   12630   10230   7008    8567"

d <- read_table(x) %>% 
  mutate(
    decade = (Year %/% 10)*10 # add column for decade
  ) %>% 
  select(-Year) %>%  # remove the year
  pivot_longer(  # convert to a 'tidy' (long) format
    cols = Jan:Sep,
    names_to = "month",
    values_to = "count"
  ) %>% 
  mutate(
    month = factor(month, levels = month.abb, ordered = TRUE)  # make sure months are ordered
  ) %>% 
  group_by(decade, month) %>% 
  summarise(
    mean = mean(count)
  )

If you print that data frame, you get:

> d
# A tibble: 18 x 3
# Groups:   decade [2]
   decade month   mean
    <dbl> <ord>  <dbl>
 1   1940 Jan   37625 
 2   1940 Feb   50265 
 3   1940 Mar   51405 
 4   1940 Apr   48725 
 5   1940 May   29760 
 6   1940 Jun   20290 
 7   1940 Jul   34510 
 8   1940 Aug   26445 
 9   1940 Sep   18410 
10   1950 Jan   18018.
11   1950 Feb   20659.
12   1950 Mar   27695 
13   1950 Apr   26521.
14   1950 May   19625 
15   1950 Jun   12272.
16   1950 Jul   12174.
17   1950 Aug    9672.
18   1950 Sep    9610.
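
As a quick sanity check on the table above, the decade bucketing and one of the means can be reproduced in base R. This is a minimal sketch rather than part of the original answer; the variable names `years`, `decade`, and `jan_1940s` are made up for illustration, and the January values are copied from the 1948 and 1949 rows of the sample data.

```r
# Integer division by 10 drops the last digit of the year;
# multiplying by 10 again gives the first year of the decade.
years  <- c(1948, 1949, 1950, 1957)
decade <- (years %/% 10) * 10
decade

# The sample has only two years in the 1940s (1948 and 1949),
# so the 1940 January mean is the average of two values.
jan_1940s <- c(29550, 45700)
mean(jan_1940s)
```

Note that with this sample the 1940 row averages two years while the 1950 row averages eight (1950 to 1957), so the two decades are not based on the same amount of data.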

If you need it in wide format:

d2 <- d %>% 
  pivot_wider(
    id_cols = decade,
    names_from = month,
    values_from = mean
  )

> d2
# A tibble: 2 x 10
# Groups:   decade [2]
  decade    Jan    Feb   Mar    Apr   May    Jun    Jul    Aug    Sep
   <dbl>  <dbl>  <dbl> <dbl>  <dbl> <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
1   1940 37625  50265  51405 48725  29760 20290  34510  26445  18410 
2   1950 18018. 20659. 27695 26521. 19625 12272. 12174.  9672.  9610.
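
The question asks for a plot, and the long-format summary is the shape ggplot2 expects. The following is a minimal sketch of one way to draw it, not part of the original answer. It assumes a line per decade is acceptable, and `d_small` is a hypothetical data frame holding only the first three months of the means printed above (rounded as printed) so that the example runs on its own.

```r
library(ggplot2)

# Decade-by-month means copied from the printed output (Jan to Mar only).
d_small <- data.frame(
  decade = rep(c(1940, 1950), each = 3),
  month  = factor(rep(c("Jan", "Feb", "Mar"), times = 2), levels = month.abb),
  mean   = c(37625, 50265, 51405, 18018, 20659, 27695)
)

# Months on the x-axis, one coloured line per decade.
# group = decade is needed because the x-axis is discrete.
p <- ggplot(d_small, aes(x = month, y = mean,
                         colour = factor(decade), group = decade)) +
  geom_line() +
  geom_point() +
  labs(x = "Month", y = "Mean flow", colour = "Decade")

print(p)
```

With the full summary, passing `d` in place of `d_small` should give the same kind of plot across all months.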

【Comments】:

    【Solution 2】:

    (Edit: changed from a line chart to a dodged bar chart to better align with the OP's code.)

    Here's an approach using dplyr, tidyr, and ggplot2 from the tidyverse.

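    library(tidyverse)

    # M is not defined in this answer; presumably it is the OP's data frame
    # (one row per Year, one column per month).
    M %>%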
      group_by(Decade = floor(Year/10)*10) %>%
      summarize_at(vars(Jan:Sep), mean) %>%
    
      # This uses tidyr::pivot_longer to reshape the data longer, which gives us the
      #  ability to map decade to color.
      pivot_longer(-Decade, names_to = "Month", values_to = "Avg") %>%
    
      # This step to get the months to be an ordered factor in order of appearance, 
      #   which is necessary to avoid the months showing up in alphabetical order.
      mutate(Month = fct_inorder(Month)) %>%
      # Alternatively, we could have aligned these thusly
      # mutate(Month_order = match(Month, month.abb)) %>%
      # mutate(Month = fct_reorder(Month, Month_order)) %>%
    
      ggplot(aes(Month, Avg, fill = as.factor(Decade))) +
      geom_col(position = position_dodge()) +
      scale_fill_discrete(name = "Decade")
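
The comment in the code above about months showing up in alphabetical order can be checked in base R. This is a small illustrative sketch; the vector names `months`, `alphabetical`, and `calendar` are made up for the example. It uses `factor(..., levels = month.abb)` as in Solution 1 instead of `fct_inorder()`; both yield calendar order as long as the months appear in calendar order in the data.

```r
months <- c("Jan", "Feb", "Mar", "Apr")

# By default, factor levels are the sorted unique values,
# so a plot axis built from this factor would be alphabetical.
alphabetical <- factor(months)
levels(alphabetical)

# Supplying month.abb as the levels fixes calendar order;
# droplevels() removes the months that do not occur in the data.
calendar <- factor(months, levels = month.abb)
levels(droplevels(calendar))
```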
    

    【Comments】:

    • The OP may want a dodged bar chart, based on his/her code.