【Question title】: Summarise daily values to a weekly mean for each id
【Posted】: 2018-10-17 14:04:28
【Question】:

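I have a data frame in which each id has values over a different run of consecutive dates. I want to add a column that holds the weekly mean of the daily values.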

df
id   date      value
 1   2018-1-12 3
 1   2018-1-13 4
 1   2018-1-14 5
 1   2018-1-15 5
 1   2018-1-16 3
 1   2018-1-17 5
 1   2018-1-18 5
 1   2018-1-19 5
 2   2017-1-14 8
 .
 .
 .
 12  2016-12-10 7

This is what I want my df to look like:

df
id   date      value  mean_week
 1   2018-1-12 3      mean(7 consecutive days starting 2018-1-12 and id=1)
 1   2018-1-13 4      mean(7 consecutive days starting 2018-1-12 and id=1)
 1   2018-1-14 5      mean(7 consecutive days starting 2018-1-12 and id=1)
 1   2018-1-15 5      mean(7 consecutive days starting 2018-1-12 and id=1)
 1   2018-1-16 3      mean(7 consecutive days starting 2018-1-12 and id=1)
 1   2018-1-17 5      mean(7 consecutive days starting 2018-1-12 and id=1)
 1   2018-1-18 5      mean(7 consecutive days starting 2018-1-12 and id=1)
 1   2018-1-19 5      NA(since there is no consecutive seven days)
 2   2017-1-14 5      mean(7 consecutive days starting 2017-1-14 and id=2)
 .
 .
 .
 12  2016-12-10 7    NA(since there is no consecutive seven days)

I have looked for a simple way to do this, but so far I have only managed it with a loop.

【Comments】:

    Tags: r dplyr plyr xts lubridate


    【Solution 1】:

    Something like this, although I don't understand the week-start condition:

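    library(tidyverse)
    library(lubridate)

    # Small sample of the question's data
    df <- read.table(text = "id   date      value
                     1   2018-1-12 3
                     1   2018-1-13 4
                     1   2018-1-14 5
                     1   2018-1-16 3
                     1   2018-1-17 5", header = TRUE)

    # Group by ISO week (Monday start) and id, then attach the group mean to every row
    df %>%
      mutate(week = isoweek(date)) %>%
      group_by(week, id) %>%
      mutate(mean_week = mean(value, na.rm = TRUE))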
    # A tibble: 5 x 5
    # Groups:   week, id [2]
         id date      value  week mean_week
      <int> <fct>     <int> <dbl>     <dbl>
    1     1 2018-1-12     3    2.        4.
    2     1 2018-1-13     4    2.        4.
    3     1 2018-1-14     5    2.        4.
    4     1 2018-1-16     3    3.        4.
    5     1 2018-1-17     5    3.        4.
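
The week-start condition described in the question (7-day blocks counted from each id's first date, with NA when a block does not contain all seven days) is not what `isoweek()` computes. A minimal sketch of that condition, assuming `date` is converted to `Date` and using only the rows visible in the question (so id 2 has a single row here and gets NA), might look like this. The column name `block` and the object `result` are illustrative, not from the original post.

```r
library(dplyr)

# Sample rows copied from the question; id 2 is cut down to its first row
df <- data.frame(
  id    = c(rep(1, 8), 2),
  date  = as.Date(c("2018-01-12", "2018-01-13", "2018-01-14", "2018-01-15",
                    "2018-01-16", "2018-01-17", "2018-01-18", "2018-01-19",
                    "2017-01-14")),
  value = c(3, 4, 5, 5, 3, 5, 5, 5, 8)
)

result <- df %>%
  group_by(id) %>%
  # Block 0 covers the first 7 days from the id's earliest date, block 1 the next 7, ...
  mutate(block = as.integer(date - min(date)) %/% 7) %>%
  group_by(id, block) %>%
  # Keep the mean only when the block holds 7 distinct dates; otherwise NA
  mutate(mean_week = if (n_distinct(date) == 7) mean(value) else NA_real_) %>%
  ungroup()

result
```

Because each block spans exactly seven calendar days, seven distinct dates inside a block means the days are consecutive, so no separate gap check is needed.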
    

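    【Comments】:

    • Thanks for the reply, @jyjek... I have edited my question to make it easier to understand what I need.
    【Solution 2】:

    Summarise your data by week, but use mutate() so that every row gets the summarised value.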

    library(dplyr)
    library(lubridate)

    # Example data: 100 consecutive days starting 2018-01-02, with random values
    df <- data.frame(date = as.Date("2018-01-01") + 1:100,
                     value = sample(1:10, size = 100, replace = TRUE))

    # week() numbers 7-day periods counted from January 1
    df %>%
      mutate(week = week(date)) %>%
      group_by(week) %>%
      mutate(summary = paste(round(mean(value), 1), "(", n(), " consecutive days starting ", min(date), ")"))
    

    This gives:

    date value  week                                           summary
    <date> <int> <dbl>                                             <chr>
    1  2018-01-02     3     1 4.7  ( 6  consecutive days starting  2018-01-02 )
    2  2018-01-03     6     1 4.7  ( 6  consecutive days starting  2018-01-02 )
    3  2018-01-04     1     1 4.7  ( 6  consecutive days starting  2018-01-02 )
    4  2018-01-05     1     1 4.7  ( 6  consecutive days starting  2018-01-02 )
    5  2018-01-06    10     1 4.7  ( 6  consecutive days starting  2018-01-02 )
    6  2018-01-07     7     1 4.7  ( 6  consecutive days starting  2018-01-02 )
    7  2018-01-08     2     2   4  ( 7  consecutive days starting  2018-01-08 )
    8  2018-01-09     2     2   4  ( 7  consecutive days starting  2018-01-08 )
    9  2018-01-10     5     2   4  ( 7  consecutive days starting  2018-01-08 )
    10 2018-01-11     7     2   4  ( 7  consecutive days starting  2018-01-08 )
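
Note that `week()` used here and `isoweek()` used in Solution 1 define weeks differently, so the two answers can group the same date into different weeks. In 2018 they happen to line up because January 1 falls on a Monday. A small sketch of the difference, using an arbitrarily chosen date that is not from the original post:

```r
library(lubridate)

d <- as.Date("2017-01-01")  # a Sunday, chosen only for illustration

# week(): 7-day periods counted from January 1, regardless of weekday
week(d)     # 1

# isoweek(): ISO 8601 week, Monday start; this Sunday closes the last ISO week of 2016
isoweek(d)  # 52
```

Neither function starts counting at an id's first observation, which is the grouping the question asks for.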
    
