【Question title】: How do I count how many kids are present in different time intervals?
【Posted】: 2020-10-23 19:14:34
【Question description】:

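I have a dataset with information about when kindergarten children are dropped off and picked up. I want to count how many children are present in each 30-minute interval, for example how many are present between 07:30 and 07:59, or between 16:00 and 16:29.

My dataset contains 18 children; a small part of it is shown here. The problem is that every entry in the kommet_antal and afhentet_antal lists ends up as 18, the total.

At the bottom I have written some code that works, but it really isn't pretty!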

 lst_kommet
 [1] "08:09:00" "09:00:00" "07:37:00" "07:51:00"
 lst_afhentet
 [1] "15:38:00" "15:19:00" "15:56:00" "14:24:00"

W1M_antal <- for(i in 1:nrow(df_W1M)){      #W1m is the dataframe with week1 and mondays selected
  lst_kommet <- df_W1M$Kommet
  lst_afhentet <- df_W1M$Afhentet
  
  kommet_antal <- vector("list", 21)      #21 timeintervals, open from 6.30 to 17.00
  afhentet_antal <- vector("list", 21)
  
                            
  tid<- as.ITime("07:00")                  #initial start time, T1
  n <- 0
  for(k in 1:length(lst_kommet)){         #runs over when children is delivered
    if(k < tid){                          #If the time delivered is before time (tid), then count it
      n <- n + 1  
    } else n <- n
    for(t in 1:length(kommet_antal)){     #want to save number of kids delivered in the different intervals
      kommet_antal[t] <- n
    }
    
    tid = tid + as.ITime("00:30")     #add 30 min to the time so we have next time interval
  }
  
  tid <- as.ITime("07:00")            #Do the same for pick up
  m <- 0
  for(a in 1:length(lst_afhentet)){
    if(a < tid){
      m <- m + 1
    } else m <- m
    for(l in 1:length(afhentet_antal)){   #Save number of kids in intervals until they are picked up
      afhentet_antal[l] <- m
    }
    tid <- tid + as.ITime("00:30")
  }
}
tid
total_antal <- vector("list", 21)
total_antal <- as.numeric(kommet_antal) - as.numeric(afhentet_antal) 
total_antal



The code below works and gives me the correct numbers, but with a full year of data at 5 days per week it is going to take a long time to count the number of kids present this way.



T1 <- count(subset(Mandag, Kommet < "07:00")) - count(subset(Mandag, Afhentet <"07:00"))
T2 <- count(subset(Mandag, Kommet < "07:30")) - count(subset(Mandag, Afhentet <"07:30"))
T3 <- count(subset(Mandag, Kommet < "08:00")) - count(subset(Mandag, Afhentet <"08:00"))
T4 <- count(subset(Mandag, Kommet < "08:30")) - count(subset(Mandag, Afhentet <"08:30"))
T5 <- count(subset(Mandag, Kommet < "09:00")) - count(subset(Mandag, Afhentet <"09:00"))
T6 <- count(subset(Mandag, Kommet < "09:30")) - count(subset(Mandag, Afhentet <"09:30"))
T7 <- count(subset(Mandag, Kommet < "10:00")) - count(subset(Mandag, Afhentet <"10:00"))
T8 <- count(subset(Mandag, Kommet < "10:30")) - count(subset(Mandag, Afhentet <"10:30"))
T9 <- count(subset(Mandag, Kommet < "11:00")) - count(subset(Mandag, Afhentet <"11:00"))
T10 <- count(subset(Mandag, Kommet < "11:30")) - count(subset(Mandag, Afhentet <"11:30"))
T11 <- count(subset(Mandag, Kommet < "12:00")) - count(subset(Mandag, Afhentet <"12:00"))
T12 <- count(subset(Mandag, Kommet < "12:30")) - count(subset(Mandag, Afhentet <"12:30"))
T13 <- count(subset(Mandag, Kommet < "13:00")) - count(subset(Mandag, Afhentet <"13:00"))
T14 <- count(subset(Mandag, Kommet < "13:30")) - count(subset(Mandag, Afhentet <"13:30"))
T15 <- count(subset(Mandag, Kommet < "14:00")) - count(subset(Mandag, Afhentet <"14:00"))
T16 <- count(subset(Mandag, Kommet < "14:30")) - count(subset(Mandag, Afhentet <"14:30"))
T17 <- count(subset(Mandag, Kommet < "15:00")) - count(subset(Mandag, Afhentet <"15:00"))
T18 <- count(subset(Mandag, Kommet < "15:30")) - count(subset(Mandag, Afhentet <"15:30"))
T19 <- count(subset(Mandag, Kommet < "16:00")) - count(subset(Mandag, Afhentet <"16:00"))
T20 <- count(subset(Mandag, Kommet < "16:30")) - count(subset(Mandag, Afhentet <"16:30"))
T21 <- count(subset(Mandag, Kommet < "17:00")) - count(subset(Mandag, Afhentet <"17:00"))

# Build the output in a data frame

W <- c(rep("Week1", 22*5), rep("Week2", 22*5), rep("Week3", 22*5), rep("Week4", 22*5))
D <- c(rep("Monday", 22*4), rep("Tuesday", 22*4), rep("Wednesday", 22*4), rep("Thursday", 22*4),rep("Friday", 22*4))
Time <- c(rep(1:22, 20))
Value1 <- c(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20, T21, rep(0,419))
Value <- do.call(rbind, Value1)

Output <- data.frame(W, D, Time, Value)
View(Output)
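
The 21 repeated T1 to T21 lines can be collapsed into a single vectorised call. The following is a minimal sketch, assuming Kommet and Afhentet are zero-padded "HH:MM:SS" character columns, so that string comparison follows clock order. The `Mandag` data frame here is a stand-in built from the four sample values shown earlier, not the real data, and `cutoffs` and `present` are names introduced for illustration.

```r
# Stand-in data: the four sample rows from the question (assumption: character columns)
Mandag <- data.frame(
  Kommet   = c("08:09:00", "09:00:00", "07:37:00", "07:51:00"),
  Afhentet = c("15:38:00", "15:19:00", "15:56:00", "14:24:00"),
  stringsAsFactors = FALSE
)

# 21 cutoffs from 07:00 to 17:00 in 30-minute steps, formatted as "HH:MM:SS"
cutoffs <- format(
  seq(as.POSIXct("2020-01-01 07:00:00", tz = "UTC"),
      as.POSIXct("2020-01-01 17:00:00", tz = "UTC"),
      by = "30 min"),
  "%H:%M:%S"
)

# Children present at each cutoff:
# those who arrived before it minus those already picked up before it
present <- sapply(cutoffs, function(t) {
  sum(Mandag$Kommet < t) - sum(Mandag$Afhentet < t)
})
present
```

The result is a named vector with one count per cutoff, which can be bound to week and day labels without padding by hand.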

【Question discussion】:

    Tags: r for-loop data-manipulation


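    【Solution 1】:

    The lubridate package has some useful functions for this kind of problem.

    You did not provide a reproducible example, so my help has to be general; you can apply these principles to your own situation.

    To set the context: dates and times are complicated, both in programming in general and in R. You are checking whether a time falls between two endpoints, the drop-off and the pick-up. The %within% operator handles this kind of check. An example is time_check %within% dropoff_pickup_intervals. This returns TRUE when the time being checked (say 07:30) lies between a student's drop-off and pick-up times.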
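
As a small illustration of how %within% behaves, here is a sketch with one hypothetical child. The times are made up for the example, and `present` and `checks` are illustrative names.

```r
library(lubridate)

# Hypothetical child: dropped off at 07:10, picked up at 15:40 (made-up values)
present <- interval(ymd_hm("2020-01-01 07:10"), ymd_hm("2020-01-01 15:40"))

# Three times to test against that interval
checks <- ymd_hm(c("2020-01-01 07:00", "2020-01-01 07:30", "2020-01-01 16:00"))

# One logical value per check time: FALSE before drop-off,
# TRUE while the child is there, FALSE after pick-up
res <- checks %within% present
res
```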

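    But first we need the dates and intervals in the right format. Here is some example code (note that most of it only builds the example data; the actual check happens further down):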

    library(lubridate)
    library(dplyr)  # provides %>% and filter() used below
    
    # Make example data
    # ----
    # set seed for repeatable results for random processes
    set.seed(20201023)
    
    # set a time for start and ending for dropoffs and pickups using "ymd_hm" lubridate function
    daystart <- ymd_hm("2020-01-01 07:00")
    dayend <- ymd_hm("2020-01-01 16:00")
    
    # Create a sequence of dates 
    time_set <- seq(daystart, dayend, by="min")
    
    # data frame of drop-off and pickup times sampled from time_set
    kinder <- data.frame(student_id=1:10,
                         dropoff=sample(time_set, 10),
                         pickup=sample(time_set, 10))
    
    # remove dates that don't make sense
    kinder <- kinder %>% filter(pickup > dropoff)
    # --- Example data complete
    
    # create time intervals for student arrival and leave times
    dropoff_pickup <- interval(kinder$dropoff, kinder$pickup)
    
    # Create sequence to check times every 30 minutes
    time_checks <- seq(daystart, dayend, by="30 min")
    
    # for every student check whether present at time checks
    # (index the interval vector so each element stays an Interval object)
    student_present <- sapply(seq_along(dropoff_pickup),
                              function(i) time_checks %within% dropoff_pickup[i])
    
    # (Bonus: Make into a nice looking data frame)
    df1 <- as.data.frame(t(student_present))
    names(df1) <- format(time_checks, "%H:%M")
    df1 <- cbind(student_id = kinder$student_id, df1)
    df1
    

    More information about the %within% function in lubridate

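    【Discussion】:

      【Solution 2】:

      Thanks. I can run your example and it works, and I think I can then run a count function to get the actual number of children present in the different time intervals.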
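
The counting step mentioned here could look like the sketch below. It uses three hypothetical children with made-up times, and counts who is present at each 30-minute check time (an instant, not the whole half hour). The names `stay`, `present` and `n_present` are introduced for illustration.

```r
library(lubridate)

# Made-up drop-off and pick-up times for three hypothetical children
dropoff <- ymd_hm(c("2020-01-01 07:10", "2020-01-01 07:40", "2020-01-01 08:20"))
pickup  <- ymd_hm(c("2020-01-01 15:40", "2020-01-01 12:00", "2020-01-01 16:00"))
stay <- interval(dropoff, pickup)

# Check times every 30 minutes from 07:00 to 09:00
time_checks <- seq(ymd_hm("2020-01-01 07:00"), ymd_hm("2020-01-01 09:00"),
                   by = "30 min")

# Logical matrix: one row per check time, one column per child
present <- sapply(seq_along(stay), function(i) time_checks %within% stay[i])

# Summing each row gives the number of children present at that check time
n_present <- rowSums(present)
names(n_present) <- format(time_checks, "%H:%M")
n_present
```

With these values the counts rise from 0 at 07:00 to 3 at 08:30 as the children arrive.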

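      But I can't run it on my data. I tried to adapt it, but I run into problems when I use the interval function, even though my variables are of character type. The documentation says that should work, if I understand it correctly.

      【Discussion】: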
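
One possible cause, stated as an assumption because the error message is not shown, is that time-only strings such as "08:09:00" carry no date, so they cannot be read as full date-times. A minimal sketch of a workaround is to paste an arbitrary date in front of each time and parse the result with ymd_hms() before building the intervals. The variable names and the date are illustrative.

```r
library(lubridate)

# Character times in the format shown in the question
kommet   <- c("08:09:00", "09:00:00", "07:37:00", "07:51:00")
afhentet <- c("15:38:00", "15:19:00", "15:56:00", "14:24:00")

# Assumption: all rows belong to the same day; the date itself is arbitrary
day <- "2020-01-01"
start <- ymd_hms(paste(day, kommet))
end   <- ymd_hms(paste(day, afhentet))

# One interval per child, from drop-off to pick-up
stay <- interval(start, end)

# Number of children present at 08:00 and at 15:30 on that day
n_0800 <- sum(ymd_hm("2020-01-01 08:00") %within% stay)
n_1530 <- sum(ymd_hm("2020-01-01 15:30") %within% stay)
c(n_0800, n_1530)
```

For data covering many days, the real date column would take the place of the fixed `day` value.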
