【Question title】: R: Average values of POSIXct rows by hour [duplicate]
【Posted】: 2023-01-13 01:37:44
【Question】:

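I have a data frame "df1" with a POSIXct column "Date" and other data columns whose values are associated with the date and time in "Date". The data looks like this: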

Date Value1 Value2
2022-03-14 13:00:00 1 3
2022-03-14 13:10:00 2 4
2022-03-14 13:20:00 3 5
2022-03-14 13:30:00 4 6
2022-03-14 13:40:00 5 7
2022-03-14 13:50:00 6 8
2022-03-14 14:00:00 10 40
2022-03-14 14:10:00 20 50
2022-03-14 14:20:00 30 60
2022-03-14 14:30:00 40 70
2022-03-14 14:40:00 50 80
2022-03-14 14:50:00 60 90

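I want to average the values in "Value1" and "Value2" over all rows that fall within each hour of each day, and create a new data frame "df2" in which "Date" is the start of each hour and "Value1" and "Value2" are the hourly averages. The resulting df2 would look like: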

Date Value1 Value2
2022-03-14 13:00:00 3.5 5.5
2022-03-14 14:00:00 35 65

【Question comments】:

    Tags: r mean posixct


    【Solution 1】:

    If the dates are stored as character strings, you can use substr to overwrite the minutes and then group by the result:

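    library(dplyr)
    # Overwrite the minutes (characters 15-16 of "YYYY-MM-DD HH:MM:SS") with "00"; seconds are assumed to be "00" already
    substr(df$Date, 15, 16) <- "00"
    df %>% group_by(Date) %>% summarise(Value1 = mean(Value1), Value2 = mean(Value2))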
    

    Output:

    # A tibble: 2 × 3
     Date                Value1 Value2
     <chr>                <dbl>  <dbl>
    1 2022-03-14 13:00:00    3.5    5.5
    2 2022-03-14 14:00:00   35     65 
    

    Data:

    df <- data.frame(
      Date = c("2022-03-14 13:00:00",
               "2022-03-14 13:10:00","2022-03-14 13:20:00","2022-03-14 13:30:00",
               "2022-03-14 13:40:00","2022-03-14 13:50:00","2022-03-14 14:00:00",
               "2022-03-14 14:10:00","2022-03-14 14:20:00","2022-03-14 14:30:00",
               "2022-03-14 14:40:00","2022-03-14 14:50:00"),
      Value1 = c(1L, 2L, 3L, 4L, 5L, 6L, 10L, 20L, 30L, 40L, 50L, 60L),
      Value2 = c(3L, 4L, 5L, 6L, 7L, 8L, 40L, 50L, 60L, 70L, 80L, 90L)
    )
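
    The question describes Date as POSIXct rather than character, in which case substr does not apply directly. A minimal base R sketch for that case, assuming an hourly key is built with format() and converted back afterwards. The names df1 and df2 follow the question; the UTC time zone and the hour_key variable are assumptions made only so the example is reproducible:

```r
# Sample data from the question, with Date stored as POSIXct (UTC is an assumption).
df1 <- data.frame(
  Date = as.POSIXct("2022-03-14 13:00:00", tz = "UTC") +
    seq(0, by = 600, length.out = 12),  # 12 rows, 10 minutes apart
  Value1 = c(1, 2, 3, 4, 5, 6, 10, 20, 30, 40, 50, 60),
  Value2 = c(3, 4, 5, 6, 7, 8, 40, 50, 60, 70, 80, 90)
)

# Build an hourly key: keep date and hour, force minutes and seconds to zero.
hour_key <- format(df1$Date, "%Y-%m-%d %H:00:00", tz = "UTC")

# Average both value columns within each hourly key.
df2 <- aggregate(df1[c("Value1", "Value2")], by = list(Date = hour_key), FUN = mean)

# Convert the key back to POSIXct so Date keeps its original class.
df2$Date <- as.POSIXct(df2$Date, tz = "UTC")
df2
```

    Going through a string key mirrors the substr idea above while also zeroing the seconds, so it does not rely on the seconds already being "00".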
    

    【Comments】:

      【Solution 2】:

      You can use floor_date() from lubridate to round date-time objects down to the nearest hour boundary.

      library(dplyr)
      library(lubridate)
      
      df %>%
        group_by(Date = floor_date(Date, "hour")) %>%
        summarise(across(contains("Value"), mean))
      
      # # A tibble: 2 × 3
      #   Date                Value1 Value2
      #   <dttm>               <dbl>  <dbl>
      # 1 2022-03-14 13:00:00    3.5    5.5
      # 2 2022-03-14 14:00:00   35     65
      

      Data:
      df <- read.csv(text = "Date, Value1, Value2
      2022-03-14 13:00:00, 1, 3
      2022-03-14 13:10:00, 2, 4
      2022-03-14 13:20:00, 3, 5
      2022-03-14 13:30:00, 4, 6
      2022-03-14 13:40:00, 5, 7
      2022-03-14 13:50:00, 6, 8
      2022-03-14 14:00:00, 10, 40
      2022-03-14 14:10:00, 20, 50
      2022-03-14 14:20:00, 30, 60
      2022-03-14 14:30:00, 40, 70
      2022-03-14 14:40:00, 50, 80
      2022-03-14 14:50:00, 60, 90", colClasses = c(Date = "POSIXct"))
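
      If lubridate is not available, the same rounding can probably be done in base R. A minimal sketch, assuming trunc() with units = "hours" is an acceptable stand-in for floor_date(Date, "hour"). The Hour helper column and the UTC time zone are assumptions introduced only for this illustration:

```r
# Same sample values as above, built directly as POSIXct (UTC is an assumption).
df <- data.frame(
  Date = as.POSIXct("2022-03-14 13:00:00", tz = "UTC") +
    seq(0, by = 600, length.out = 12),  # 12 rows, 10 minutes apart
  Value1 = c(1, 2, 3, 4, 5, 6, 10, 20, 30, 40, 50, 60),
  Value2 = c(3, 4, 5, 6, 7, 8, 40, 50, 60, 70, 80, 90)
)

# trunc() on a POSIXct returns a POSIXlt, so convert back before grouping.
df$Hour <- as.POSIXct(trunc(df$Date, units = "hours"))

# Formula interface: average both value columns within each truncated hour.
df2 <- aggregate(cbind(Value1, Value2) ~ Hour, data = df, FUN = mean)

# Rename the grouping column to match the output requested in the question.
names(df2)[1] <- "Date"
df2
```

      Unlike the string-based approach in Solution 1, this keeps Date as a date-time throughout, so time zone handling stays with R's date-time classes.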
      

      【Comments】:
