【Posted】: 2020-02-29 15:20:52
【Question】:
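I am trying to answer this question:
Using the nycflights13 package and the flights data frame, answer the following: which month had the highest proportion of canceled flights? Which month had the lowest? Interpret any seasonal patterns.
I have technically answered the question, but I am trying to produce a more concise tibble than the one I have now.
Here is what I have so far: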
#Load packages
library(nycflights13)
library(tidyverse)
#Data frame "cancprop" with three new variables ("canc" = flights that were canceled, "notc" = flights that were not canceled, and "canp" = proportion of all flights that were canceled)
cancprop <- flights %>%
  mutate(
    canc = is.na(dep_time),
    notc = !is.na(dep_time),
    canp = canc / (canc + notc)
  )
#A tibble showing the average proportion of all flights that were canceled by month sorted by descending average proportion.
cancprop %>%
  group_by(month) %>%
  summarize(mcanp = mean(canp)) %>%
  arrange(desc(mcanp))
# A tibble: 12 x 2
   month   mcanp
   <int>   <dbl>
 1     2 0.0505
 2    12 0.0364
 3     6 0.0357
 4     7 0.0319
 5     3 0.0299
 6     4 0.0236
 7     5 0.0196
 8     1 0.0193
 9     8 0.0166
10     9 0.0164
11    11 0.00854
12    10 0.00817
#Data frame "seas" with a new variable ("season" = the season corresponding with the month)
seas <- cancprop %>%
  group_by(month) %>%
  summarize(mcanp = mean(canp)) %>%
  mutate(
    season = case_when(
      month %in% 3:5  ~ "Spring",
      month %in% 6:8  ~ "Summer",
      month %in% 9:11 ~ "Fall",
      TRUE            ~ "Winter"
    )
  )
seas
# A tibble: 12 x 3
   month   mcanp season
   <int>   <dbl> <chr>
 1     1 0.0193  Winter
 2     2 0.0505  Winter
 3     3 0.0299  Spring
 4     4 0.0236  Spring
 5     5 0.0196  Spring
 6     6 0.0357  Summer
 7     7 0.0319  Summer
 8     8 0.0166  Summer
 9     9 0.0164  Fall
10    10 0.00817 Fall
11    11 0.00854 Fall
12    12 0.0364  Winter
#A plot showing the proportion of flights canceled
ggplot(seas, aes(x = factor(month), y = mcanp, fill = season)) +
  geom_bar(stat = "identity") +
  # The bars are mapped to `fill`, so the legend title must be set via `fill`, not `color`
  labs(x = "Month", y = "Proportion of Flights Canceled", fill = "Season")
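What I want to create is a tibble showing the average proportion of canceled flights for each season, something like this (random, not calculated, proportions, since I am not sure how to actually get the result):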
# A tibble: 4 x 2
  season  mcanp
  <chr>   <dbl>
1 Winter 0.0433
2 Spring 0.0235
3 Summer 0.0109
4 Fall   0.0246
Thanks for your help!
【Comments】:
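-
I think you need
seas %>% group_by(season) %>% summarise(mcanp = mean(mcanp))?
-
That works well in this case, but it is not quite what I am looking for, because it takes the average of the monthly averages within each season rather than the average over all flights in the season.
-
Using
seas %>% group_by(season) %>% summarise(mcanp = mean(mcanp))
gives 1 Winter 0.0354, 2 Summer 0.0281, 3 Spring 0.0243, 4 Fall 0.0110, whereas the answer I am looking for is 1 Winter 0.0350, 2 Summer 0.0280, 3 Spring 0.0243, 4 Fall 0.0110.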
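-
The gap described in the last comment is what you would expect when months contain different numbers of flights: averaging twelve monthly proportions gives every month equal weight, while a seasonal proportion should give every flight equal weight. A minimal sketch of one way to get that, assuming the same season boundaries as in the question: label each flight with its season first, then summarize once. The name `season_canp` and the extra column `n` are illustrative and not from the original post, and I have not checked that the output matches the target figures quoted above digit for digit.

```r
library(nycflights13)
library(dplyr)

# Assumption: a flight counts as canceled when dep_time is NA (as in the question).
season_canp <- flights %>%
  mutate(
    canc = is.na(dep_time),
    # Attach the season to each flight BEFORE aggregating,
    # so every flight carries equal weight in the seasonal mean.
    season = case_when(
      month %in% 3:5  ~ "Spring",
      month %in% 6:8  ~ "Summer",
      month %in% 9:11 ~ "Fall",
      TRUE            ~ "Winter"
    )
  ) %>%
  group_by(season) %>%
  summarize(
    n     = n(),         # number of scheduled flights in the season
    mcanp = mean(canc)   # mean of a logical = proportion of TRUE values
  ) %>%
  arrange(desc(mcanp))

season_canp
```

If you would rather keep the monthly table, an equivalent route should be to carry a per-month flight count through the first `summarize()` and then use `weighted.mean(mcanp, n)` within each season.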