【Question title】: How to create a matrix from a data.frame where the time variable is split into year/month in R? [closed]
【Posted】: 2021-01-29 10:09:56
【Question description】:

I have the following data:

structure(list(monthtrunc = c("2011-01-01 00:00:00", "2011-02-01 00:00:00", 
"2011-03-01 00:00:00", "2011-04-01 00:00:00", "2011-05-01 00:00:00", 
"2011-06-01 00:00:00", "2011-07-01 00:00:00", "2011-08-01 00:00:00", 
"2011-09-01 00:00:00", "2011-10-01 00:00:00", "2011-11-01 00:00:00", 
"2011-12-01 00:00:00", "2012-01-01 00:00:00", "2012-02-01 00:00:00", 
"2012-03-01 00:00:00", "2012-04-01 00:00:00", "2012-05-01 00:00:00", 
"2012-06-01 00:00:00", "2012-07-01 00:00:00", "2012-08-01 00:00:00", 
"2012-09-01 00:00:00", "2012-10-01 00:00:00", "2012-11-01 00:00:00", 
"2012-12-01 00:00:00", "2013-01-01 00:00:00", "2013-02-01 00:00:00", 
"2013-03-01 00:00:00", "2013-04-01 00:00:00", "2013-05-01 00:00:00", 
"2013-06-01 00:00:00", "2013-07-01 00:00:00", "2013-08-01 00:00:00", 
"2013-09-01 00:00:00", "2013-10-01 00:00:00", "2013-11-01 00:00:00", 
"2013-12-01 00:00:00", "2014-01-01 00:00:00", "2014-02-01 00:00:00", 
"2014-03-01 00:00:00", "2014-04-01 00:00:00", "2014-05-01 00:00:00", 
"2014-06-01 00:00:00", "2014-07-01 00:00:00", "2014-08-01 00:00:00", 
"2014-09-01 00:00:00", "2014-10-01 00:00:00", "2014-11-01 00:00:00", 
"2014-12-01 00:00:00", "2015-01-01 00:00:00", "2015-02-01 00:00:00", 
"2015-03-01 00:00:00", "2015-04-01 00:00:00", "2015-05-01 00:00:00", 
"2015-06-01 00:00:00", "2015-07-01 00:00:00", "2015-08-01 00:00:00", 
"2015-09-01 00:00:00", "2015-10-01 00:00:00", "2015-11-01 00:00:00", 
"2015-12-01 00:00:00", "2016-01-01 00:00:00", "2016-02-01 00:00:00", 
"2016-03-01 00:00:00", "2016-04-01 00:00:00", "2016-05-01 00:00:00", 
"2016-06-01 00:00:00", "2016-07-01 00:00:00", "2016-08-01 00:00:00", 
"2016-09-01 00:00:00", "2016-10-01 00:00:00", "2016-11-01 00:00:00", 
"2016-12-01 00:00:00", "2017-01-01 00:00:00", "2017-02-01 00:00:00", 
"2017-03-01 00:00:00", "2017-04-01 00:00:00", "2017-05-01 00:00:00", 
"2017-06-01 00:00:00", "2017-07-01 00:00:00", "2017-08-01 00:00:00", 
"2017-09-01 00:00:00", "2017-10-01 00:00:00", "2017-11-01 00:00:00", 
"2017-12-01 00:00:00", "2018-01-01 00:00:00", "2018-02-01 00:00:00", 
"2018-03-01 00:00:00", "2018-04-01 00:00:00", "2018-05-01 00:00:00", 
"2018-06-01 00:00:00", "2018-07-01 00:00:00", "2018-08-01 00:00:00", 
"2018-09-01 00:00:00", "2018-10-01 00:00:00", "2018-11-01 00:00:00", 
"2018-12-01 00:00:00", "2019-01-01 00:00:00", "2019-02-01 00:00:00", 
"2019-03-01 00:00:00", "2019-04-01 00:00:00", "2019-05-01 00:00:00", 
"2019-06-01 00:00:00", "2019-07-01 00:00:00", "2019-08-01 00:00:00", 
"2019-09-01 00:00:00", "2019-10-01 00:00:00", "2019-11-01 00:00:00", 
"2019-12-01 00:00:00", "2020-01-01 00:00:00", "2020-02-01 00:00:00", 
"2020-03-01 00:00:00", "2020-04-01 00:00:00", "2020-05-01 00:00:00", 
"2020-06-01 00:00:00", "2020-07-01 00:00:00", "2020-08-01 00:00:00", 
"2020-09-01 00:00:00", "2020-10-01 00:00:00", "2020-11-01 00:00:00", 
"2020-12-01 00:00:00", "2021-01-01 00:00:00"), pricevel = c(0.97404039323986, 
1.59072883788402, 1.27480749627224, 1.81090268262643, 1.08446437755464, 
0.972108246283993, 1.49791700178465, 1.37510660100886, 2.04150133131786, 
1.42884858882619, 1.19802471954172, 1.14329588578247, 1.42742056250377, 
2.2275041319492, 1.53007075524382, 1.23329026496101, 1.57506464615333, 
1.32751803011859, 1.7848957852864, 1.9637793672329, 1.74357511065471, 
1.55939337072177, 1.67010998791817, 1.55129791834254, 1.30384467900289, 
0.832447180752655, 1.13135024580089, 1.3765404096192, 1.19674689679198, 
1.98490299304297, 1.3434307071074, 1.4514810468706, 1.77871435204876, 
1.97963621576311, 1.87404747674559, 1.49749444828933, 2.01124526136312, 
1.82079049940119, 1.84927857760505, 1.92380117921576, 1.96061069983973, 
1.88130563099087, 1.94738530876845, 1.45278140059751, 1.72778164035316, 
2.21831418309796, 2.37833850516538, 2.26110412263556, 1.3667357004183, 
2.16716492707592, 1.9964658667007, 1.90550429105743, 2.52401661346915, 
4.65229321968885, 4.06309750288639, 3.52501180210932, 3.36073193455, 
3.27687520648688, 2.71981238272502, 3.13234208680292, 3.43216816895142, 
2.51011328691671, 2.43371721452435, 1.97282761223673, 2.03611652461561, 
1.98055096785881, 1.25302398447412, 1.39595140584167, 1.92349634511272, 
1.3898703102964, 1.25360052095124, 1.38729469889915, 1.10785848021687, 
1.44164297488934, 1.12537028492826, 1.3019182239242, 1.45406310480236, 
2.18468971262534, 1.49264380934283, 1.95556923187788, 1.53946965298746, 
1.89836287979905, 1.49722376226602, 1.3316288963285, 1.86684289490321, 
1.94546131494554, 1.34082856346, 1.06542821694662, 3.26472001074854, 
1.24904641000808, 0.666437798683986, 0.808438390674756, 1.6081014240707, 
1.36685105055708, 1.2901325053293, 0.99353374892888, 1.12990336780221, 
0.940318105238634, 1.0941265837922, 1.44237646495435, 3.09351182655182, 
4.37448213503437, 1.68973645485736, 1.57185142258305, 2.16241007171827, 
1.74207778169647, 1.66474213081745, 1.58488926144574, 2.61599263116864, 
4.82905420739183, 5.03432249952418, 4.17095067204738, 4.74128391277353, 
4.89784902848934, 4.73860640822821, 2.45502699423883, 3.85568379725977, 
3.82281698762841, 5.47542632894932, 4.13157986707201, 1.91563131285955
)), row.names = c(NA, -121L), class = c("tbl_df", "tbl", "data.frame"
))

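How can I create a matrix in which the rows are year(monthtrunc), the columns are month(monthtrunc), and the values are the same as in this data frame? The matrix should therefore be 11x12. Alternatively, suppose I have hourly data. Is it possible to build the same matrix with aggregated values (mean, sum, etc.)?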

【Comments】:

    Tags: r datetime


    【Solution 1】:

    Extract the year and month from monthtrunc, then reshape the data to wide format.

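    library(tidyverse)
    library(lubridate)  # ymd_hms(), year(), month(); not attached by older tidyverse releases
    
    # df is the data frame from the question
    df %>%
      # parse the timestamp strings, then pull out year and month
      mutate(monthtrunc = ymd_hms(monthtrunc), 
             year = year(monthtrunc), 
             month = month(monthtrunc)) %>%
      select(-monthtrunc) %>%
      # one row per year, one column per month
      pivot_wider(names_from = month, values_from = pricevel) %>%
      # move year out of the data and into the row names
      column_to_rownames('year')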
    
    #         1     2    3    4    5     6     7     8    9   10   11    12
    #2011 0.974 1.591 1.27 1.81 1.08 0.972 1.498 1.375 2.04 1.43 1.20 1.143
    #2012 1.427 2.228 1.53 1.23 1.58 1.328 1.785 1.964 1.74 1.56 1.67 1.551
    #2013 1.304 0.832 1.13 1.38 1.20 1.985 1.343 1.451 1.78 1.98 1.87 1.497
    #2014 2.011 1.821 1.85 1.92 1.96 1.881 1.947 1.453 1.73 2.22 2.38 2.261
    #2015 1.367 2.167 2.00 1.91 2.52 4.652 4.063 3.525 3.36 3.28 2.72 3.132
    #2016 3.432 2.510 2.43 1.97 2.04 1.981 1.253 1.396 1.92 1.39 1.25 1.387
    #2017 1.108 1.442 1.13 1.30 1.45 2.185 1.493 1.956 1.54 1.90 1.50 1.332
    #2018 1.867 1.945 1.34 1.07 3.26 1.249 0.666 0.808 1.61 1.37 1.29 0.994
    #2019 1.130 0.940 1.09 1.44 3.09 4.374 1.690 1.572 2.16 1.74 1.66 1.585
    #2020 2.616 4.829 5.03 4.17 4.74 4.898 4.739 2.455 3.86 3.82 5.48 4.132
    #2021 1.916    NA   NA   NA   NA    NA    NA    NA   NA   NA   NA    NA
    

    If you have hourly data, you can aggregate the values by passing values_fn = sum or values_fn = mean to pivot_wider.
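As a rough sketch of the hourly case, the same pipeline can be run with values_fn = mean so that every year/month cell holds the average of its hourly readings. The `hourly` data frame, its `ts` column, and the random prices are made up for illustration; only the pivot_wider call mirrors the answer.

```r
library(dplyr)
library(tidyr)
library(lubridate)
library(tibble)

# Hypothetical hourly data: two full years of synthetic prices
set.seed(1)
stamps <- seq(ymd_hms("2019-01-01 00:00:00"),
              ymd_hms("2020-12-31 23:00:00"),
              by = "hour")
hourly <- tibble(ts = stamps,
                 pricevel = runif(length(stamps), 0.5, 5))

wide <- hourly %>%
  mutate(year = year(ts), month = month(ts)) %>%
  select(-ts) %>%
  # many rows now share a year/month pair; values_fn collapses them
  pivot_wider(names_from = month, values_from = pricevel,
              values_fn = mean) %>%
  column_to_rownames("year")

dim(wide)  # 2 rows (years) by 12 columns (months)
```

Swapping mean for sum, median, or another summary function changes only the values_fn argument.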

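    【Comments】:

    • This works, but not quite in the right way. The year column should serve as the row index, but here it is a separate column.
    • @ErkoTru You can easily set that column as the row names with the column_to_rownames function. See my updated answer. If you want to convert the result to a matrix, you can also append %>% as.matrix to the pipeline above.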
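To illustrate the last comment, here is a minimal sketch of the full pipeline ending in as.matrix(). The `toy` data frame is a hypothetical two-year stand-in for the question's data, with the values 1 to 24 chosen so that each cell is easy to check by eye.

```r
library(dplyr)
library(tidyr)
library(lubridate)
library(tibble)

# Hypothetical stand-in for the question's data: 24 monthly values
toy <- tibble(
  monthtrunc = format(seq(as.Date("2011-01-01"), by = "month", length.out = 24),
                      "%Y-%m-%d 00:00:00"),
  pricevel = 1:24
)

m <- toy %>%
  mutate(monthtrunc = ymd_hms(monthtrunc),
         year = year(monthtrunc),
         month = month(monthtrunc)) %>%
  select(-monthtrunc) %>%
  pivot_wider(names_from = month, values_from = pricevel) %>%
  column_to_rownames("year") %>%
  as.matrix()   # all columns are numeric, so the result is a numeric matrix

is.matrix(m)    # TRUE
m["2012", "3"]  # value for March 2012
```

Because the years are row names rather than a column, they do not end up among the matrix values, and cells can be indexed by year and month as character strings.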