【Question title】:Count observations at any given date
【Posted】:2019-08-28 15:36:02
【Question description】:

I am working with a dataset similar to the following:

df <- data.frame(type = c("A", "A", "A", "A", "A", "B", "B", "B", "C", "D", "D", "D"), 
                 start_date = as.Date(c("2010-02-01", "2011-03-15", "2011-09-15", "2015-01-01", "2015-05-15", "2009-01-01", "2015-07-14", "2016-06-30", "2012-01-15", "2010-04-05", "2010-08-01", "2012-04-01"), format = "%Y-%m-%d"), 
                 end_date = as.Date(c("2010-12-31", "2011-07-31", "2014-04-04", "2015-02-15", "2016-12-15", "2013-02-16", "2015-12-31", "2016-12-31", "2015-09-17", "2010-04-10", "2010-09-30", "2013-12-31"), format = "%Y-%m-%d"))

I would like to count the number of observations at any given date.

Expected output

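Essentially, I would like my result to show that there is only one type until 2010-02-01, then two until 2010-04-05, then three until 2010-04-10, and so on. That is, one column with the date (one row per day) and one column with the count of type: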

date count_of_type
2009-01-01 1
2009-01-02 1
2009-01-03 1
...
2010-01-31 1
2010-02-01 2
2010-02-02 2
...
2010-04-04 2
2010-04-05 3
2010-04-06 3
2010-04-07 3
2010-04-08 3
2010-04-09 3
2010-04-10 2
2010-04-11 2
...

I thought this would be easy to do, but I can't figure it out... any ideas?

Cheers,

【Comments】:

  • Can you show the expected output? Do you need df %>% mutate(n = as.integer(difftime(end_date, start_date, unit = 'day')))
  • Maybe you need something like a Gantt diagram? Also, maybe this Q&A can help: stackoverflow.com/q/3550341/4137985
  • I just added the expected output

Tags: r dplyr


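【Solution 1】:

One option is to use transmute to build, for each row, the daily sequence (seq) from 'start_date' to 'end_date', unnest it to one row per day, and then count the rows per date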

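library(tidyverse)
df %>% 
   # one daily sequence per row, from start_date through end_date
   transmute(date = map2(start_date, end_date, seq, by = '1 day')) %>% 
   # expand the list column to one row per day (naming the column avoids the tidyr >= 1.0 warning)
   unnest(date) %>% 
   # number of rows (intervals) covering each date
   count(date)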
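
For anyone who wants to check the logic without loading the tidyverse, the same idea can be sketched in base R. This is only an illustration, not part of the original answer: it uses three of the question's twelve rows, and the names days, all_days and counts are made up for the example. Note that seq includes both endpoints, so a row is still counted on its end_date. The expected output in the question shows the count dropping on 2010-04-10 itself, which would correspond to treating end_date as exclusive.

```r
# Three rows taken from the question's data (assumption: a subset is enough to illustrate)
df <- data.frame(
  type = c("B", "A", "D"),
  start_date = as.Date(c("2009-01-01", "2010-02-01", "2010-04-05")),
  end_date = as.Date(c("2013-02-16", "2010-12-31", "2010-04-10"))
)

# One vector of daily dates per row; both start_date and end_date are included
days <- Map(seq, df$start_date, df$end_date, by = "1 day")

# Stack all the daily dates into a single Date vector
all_days <- do.call(c, days)

# Tabulate how many rows (intervals) cover each date
counts <- as.data.frame(table(date = all_days),
                        responseName = "count_of_type",
                        stringsAsFactors = FALSE)
counts$date <- as.Date(counts$date)

# Inspect the days around the short "D" interval
print(counts[counts$date >= as.Date("2010-04-04") &
             counts$date <= as.Date("2010-04-11"), ])
```

Dates that are covered by no row do not appear in the result at all (rather than appearing with a count of 0), which is also true of the count(date) approach above.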

【Discussion】:

  • Thanks! That's it. It looks simple, but it had me sweating :)