【Question title】: How to predict values into the future using either gam or forecast
【Posted】: 2021-10-24 05:42:02
【Question description】:

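I have some fish data and would like to make predictions into the future. I want to predict "fishcount" from two variables (airtemp_f and watertemp_f). Ideally, I would like to use the R package forecast to predict fishcount 2 or 3 periods ahead, but I don't know how to include airtemp_f and watertemp_f in the model. Below is a small dataset: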

 library(forecast)
 library(ggfortify)
 library(ggplot2)
 library(xts)

fish <- structure(list(year = c(2011, 2011, 2011, 2011, 2011, 2011, 2011, 
2011, 2011, 2011, 2012, 2012, 2012, 2012, 2012, 2012, 2012, 2012, 
2012, 2012, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 
2011, 2012, 2012, 2012, 2012, 2012, 2012, 2012, 2012, 2012, 2012
), period = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 
7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 
7, 8, 9, 10), district = c("221", "221", "221", "221", "221", 
"221", "221", "221", "221", "221", "221", "221", "221", "221", 
"221", "221", "221", "221", "221", "221", "222", "222", "222", 
"222", "222", "222", "222", "222", "222", "222", "222", "222", 
"222", "222", "222", "222", "222", "222", "222", "222"), date = structure(c(15158, 
15160, 15162, 15164, 15165, 15167, 15168, 15169, 15172, 15174, 
15512, 15519, 15525, 15529, 15531, 15533, 15535, 15536, 15537, 
15538, 15187, 15190, 15192, 15194, 15197, 15199, 15201, 15203, 
15205, 15207, 15903, 15905, 15908, 15911, 15914, 15916, 15918, 
15919, 15920, 15921), class = "Date"), fishcount = c(2101, 16031, 
13498, 7024, 42569, 36288, 101565, 204305, 235376, 39851, 14879, 
24512, 97382, 109688, 164938, 182892, 115047, 203842, 247499, 
33879, 89551, 25576, 61377, 4517, 0, 11739, 22318, 69831, 2895, 
16720, 349586, 136904, 365634, 369484, 1054650, 1009362, 1080558, 
671706, 631603, 1007896), airtemp_f = c(54.95, 56.15, 54.1325, 
53.3975, 54.1775, 53.945, 54.62, 54.0773913043478, 56.63, 54.7625, 
50.8025, 49.6625, 49.8575, 49.3775, 49.55, 49.49, 50.0525, 49.775, 
49.6775, 50.795, 57.8675, 53.9225, 53.5475, 51.905, 58.8875, 
55.0475, 54.185, 56.24, 53.915, 54.1325, 56.8154545454545, 58.6021052631579, 
60.5381818181818, 58.084347826087, 57.6930434782609, 56.9808695652174, 
59.3109090909091, 57.8136363636364, 174.548, 56.1623529411765
), watertemp_f = c(56.735, 57.2225, 56.4125, 55.5275, 54.6575, 
54.7625, 54.4475, 53.7095652173913, 55.6925, 53.09, 50, 51.635, 
52.61, 51.0425, 51.095, 50.63, 50.825, 51.065, 50.8625, 52.25, 
59.7425, 55.9325, 55.67, 54.6575, 55.2575, 54.8375, 55.7525, 
56.78, 55.985, 55.595, 59.09, 59.4263157894737, 59.7690909090909, 
58.7417391304348, 59.7513043478261, 60.424347826087, 61.2172727272727, 
59.9163636363636, 58.676, 58.2588235294118)), row.names = c(NA, 
-40L), class = c("tbl_df", "tbl", "data.frame"))

 head(fish)
   year period district date       fishcount airtemp_f watertemp_f
  <dbl>  <dbl> <chr>    <date>         <dbl>     <dbl>       <dbl>
1  2011      1 221      2011-07-03      2101      55.0        56.7
2  2011      2 221      2011-07-05     16031      56.2        57.2
3  2011      3 221      2011-07-07     13498      54.1        56.4
4  2011      4 221      2011-07-09      7024      53.4        55.5
5  2011      5 221      2011-07-10     42569      54.2        54.7
6  2011      6 221      2011-07-12     36288      53.9        54.8

Here is my attempt:

#convert fish to xts or ts?
count <- as.xts(fish$fishcount,order.by=seq(as.Date("2011-07-03"),by=2,len=40))
d.arima <- auto.arima(count)
d.forecast <- forecast(d.arima, level = c(95), h = 3)
d.forecast

Question: how do I include airtemp_f and watertemp_f in the model so that it forecasts by period, and how do I plot the result in ggplot?

Thanks in advance for your help.

【Comments】:

    Tags: r ggplot2 xts forecast


    【Solution 1】:

    How to include airtemp_f and watertemp_f in the model

    The function has an xreg argument:

    d.arima <- auto.arima(
      count, 
      xreg = as.matrix(fish[, c("airtemp_f", "watertemp_f")])
    )
    

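    Forecasting by period

    I am not familiar with this package, but h seems to become useless once the forecast is given xreg values; the number of rows in xreg is used as the horizon instead.

    I assume it starts forecasting from the end of the training period, so let's create a function that forces it to start from there every time: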
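
    To make the point about h concrete, here is a minimal sketch on synthetic data. The series, the regressor values, and the names y, x_train and x_future are made up for illustration and are not taken from the question. It fits a regression with ARIMA errors through Arima() with xreg, then forecasts by supplying three rows of future regressor values, so the horizon follows the number of rows in xreg.

```r
library(forecast)

set.seed(1)
n <- 40

# Hypothetical training regressors (column names mirror the question's).
x_train <- cbind(
  airtemp_f   = rnorm(n, mean = 55, sd = 2),
  watertemp_f = rnorm(n, mean = 56, sd = 2)
)

# Synthetic response that depends on both regressors plus noise.
y <- ts(1000 + 50 * x_train[, "airtemp_f"] - 30 * x_train[, "watertemp_f"] +
          rnorm(n, sd = 20))

# Regression with AR(1) errors; auto.arima() accepts xreg in the same way.
fit <- Arima(y, order = c(1, 0, 0), xreg = x_train)

# Future regressor values have to be supplied by the user (assumed here).
x_future <- cbind(
  airtemp_f   = c(54, 55, 56),
  watertemp_f = c(55, 56, 57)
)

fc <- forecast(fit, xreg = x_future, level = 95)

# The number of forecasts equals the number of rows in x_future.
length(fc$mean)
as.data.frame(fc)
```

    In practice the future temperatures are unknown, so they would themselves have to come from a forecast or from a set of chosen scenarios.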

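    # Forecast h periods ahead while holding both temperatures constant.
    fcst <- function(airtemp_f, watertemp_f, h = 3) {
      tibble::rownames_to_column(
        as.data.frame(forecast(
          d.arima, 
          level = 95, 
          # One row per forecast period; both regressors are repeated h times.
          xreg = cbind(
            airtemp_f = rep(airtemp_f, h),
            watertemp_f = rep(watertemp_f, h)
          )
        )),
        var = "period"
      )
    }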
    

    How to plot it in ggplot?

    There are many ways, but you have to decide on some slices of the regressors you are using, for example:

    library(magrittr)  # provides %>%

    # Get the median and the 5% and 95% quantiles of each temperature.
    tidyr::crossing(
      airtemp_f = quantile(fish$airtemp_f, c(0.05, 0.5, 0.95)),
      watertemp_f = quantile(fish$watertemp_f, c(0.05, 0.5, 0.95))
    ) %>%
      # Run the forecast with our defined function.
      dplyr::mutate(fcst = purrr::map2(airtemp_f, watertemp_f, fcst)) %>%
      # Flatten the data frame we got from the forecast.
      tidyr::unnest(fcst) %>%
      # Plot the results in facets, one panel per temperature combination.
      ggplot(aes(period, group = 1)) +
      facet_grid(airtemp_f ~ watertemp_f) +
      geom_ribbon(aes(ymin = `Lo 95`, ymax = `Hi 95`), alpha = 0.5) +
      geom_line(aes(y = `Point Forecast`))
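
    If the goal is fish count over time rather than a grid of temperature scenarios, one alternative is to draw the observed counts and the forecast on a shared date axis. The following is only a sketch under stated assumptions: a single series, a regular 2-day spacing for the future dates, and future temperatures chosen by the user. All values and the names hist_df and future_df are synthetic and hypothetical.

```r
library(forecast)
library(ggplot2)

set.seed(42)
n <- 30

# Synthetic history: dates, temperatures and counts are made up.
hist_df <- data.frame(
  date        = seq(as.Date("2011-07-03"), by = 2, length.out = n),
  airtemp_f   = rnorm(n, mean = 55, sd = 2),
  watertemp_f = rnorm(n, mean = 56, sd = 2)
)
hist_df$fishcount <- 5000 + 300 * hist_df$airtemp_f +
  200 * hist_df$watertemp_f + rnorm(n, sd = 500)

xreg_cols <- c("airtemp_f", "watertemp_f")
fit <- Arima(ts(hist_df$fishcount), order = c(1, 0, 0),
             xreg = as.matrix(hist_df[, xreg_cols]))

# Assumed future temperatures for the next three periods.
future_df <- data.frame(
  date        = max(hist_df$date) + 2 * (1:3),
  airtemp_f   = c(55, 56, 57),
  watertemp_f = c(56, 57, 58)
)
fc <- forecast(fit, xreg = as.matrix(future_df[, xreg_cols]), level = 95)

# Attach point forecasts and 95% interval bounds to the future dates.
future_df$fishcount <- as.numeric(fc$mean)
future_df$lo95 <- as.numeric(fc$lower[, 1])
future_df$hi95 <- as.numeric(fc$upper[, 1])

# Observed counts as a black line, forecast as a blue line with a ribbon.
p <- ggplot(hist_df, aes(date, fishcount)) +
  geom_line() +
  geom_ribbon(data = future_df, aes(ymin = lo95, ymax = hi95), alpha = 0.3) +
  geom_line(data = future_df, colour = "blue") +
  labs(x = "Date", y = "Fish count")
p
```

    With the real data, each district would likely need its own model or its own facet, since the two districts in the question cover different dates.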
    

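    【Comments】:

    • @Robin -- thanks for the suggestion. I am going through your code now. To understand it better: is this forecasting water temperature and air temperature? What I want to forecast is fish count over time. The date variable should be on the x-axis and fishcount on the y-axis.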