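【Posted】:2021-03-10 17:13:06
【Question】:
I need to assign different values to independent observations based on their membership in a group defined by a set of variables and on a 5-minute time interval. Here is an example of my data frame: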
Fecha <- c(rep("22-05-2019", 9), rep("23-05-2019", 10), rep("24-05-2019", 3))
Noche <- c(rep(1,9), rep(2,13))
Parcela <- c(rep("CH1", 9), rep("CC1", 13))
Camara <- c(rep(2, 18), rep(3, 4))
Tratamiento <- c(rep("CHUNCHO", 9), rep("CONCON", 13))
Hora <- c("20:07", "20:10", "20:15", "23:19", "23:20", "23:22", "23:25", "23:43", "23:44", "22:10", "22:12", "22:13", "22:18", "22:39", "23:12", "23:29", "23:33", "23:43", "23:59", "0:21", "0:22", "1:20")
Especie <- c(rep("OL", 3), rep("AX", 4), rep("RR", 2), rep("AX", 5), rep("RR", 8))
datos <- data.frame(Fecha, Noche, Parcela, Camara, Tratamiento, Hora, Especie)
datos
# Fecha Noche Parcela Camara Tratamiento Hora Especie
1 22-05-2019 1 CH1 2 CHUNCHO 20:07 OL
2 22-05-2019 1 CH1 2 CHUNCHO 20:10 OL
3 22-05-2019 1 CH1 2 CHUNCHO 20:15 OL
4 22-05-2019 1 CH1 2 CHUNCHO 23:19 AX
5 22-05-2019 1 CH1 2 CHUNCHO 23:20 AX
6 22-05-2019 1 CH1 2 CHUNCHO 23:22 AX
7 22-05-2019 1 CH1 2 CHUNCHO 23:25 AX
8 22-05-2019 1 CH1 2 CHUNCHO 23:43 RR
9 22-05-2019 1 CH1 2 CHUNCHO 23:44 RR
10 23-05-2019 2 CC1 2 CONCON 22:10 AX
11 23-05-2019 2 CC1 2 CONCON 22:12 AX
12 23-05-2019 2 CC1 2 CONCON 22:13 AX
13 23-05-2019 2 CC1 2 CONCON 22:18 AX
14 23-05-2019 2 CC1 2 CONCON 22:39 AX
15 23-05-2019 2 CC1 2 CONCON 23:12 RR
16 23-05-2019 2 CC1 2 CONCON 23:29 RR
17 23-05-2019 2 CC1 2 CONCON 23:33 RR
18 23-05-2019 2 CC1 2 CONCON 23:43 RR
19 23-05-2019 2 CC1 3 CONCON 23:59 RR
20 24-05-2019 2 CC1 3 CONCON 0:21 RR
21 24-05-2019 2 CC1 3 CONCON 0:22 RR
22 24-05-2019 2 CC1 3 CONCON 1:20 RR
The events would be assigned like this:
# Fecha Noche Parcela Camara Tratamiento Hora Especie Group Event
1 22-05-2019 1 CH1 2 CHUNCHO 20:07 OL AA 1
2 22-05-2019 1 CH1 2 CHUNCHO 20:10 OL AA 1
3 22-05-2019 1 CH1 2 CHUNCHO 20:15 OL AA 2
4 22-05-2019 1 CH1 2 CHUNCHO 23:19 AX AB 3
5 22-05-2019 1 CH1 2 CHUNCHO 23:20 AX AB 3
6 22-05-2019 1 CH1 2 CHUNCHO 23:22 AX AB 3
7 22-05-2019 1 CH1 2 CHUNCHO 23:25 AX AB 4
8 22-05-2019 1 CH1 2 CHUNCHO 23:43 RR AC 5
9 22-05-2019 1 CH1 2 CHUNCHO 23:44 RR AC 5
10 23-05-2019 2 CC1 2 CONCON 22:10 AX AD 6
11 23-05-2019 2 CC1 2 CONCON 22:12 AX AD 6
12 23-05-2019 2 CC1 2 CONCON 22:13 AX AD 6
13 23-05-2019 2 CC1 2 CONCON 22:18 AX AD 7
14 23-05-2019 2 CC1 2 CONCON 22:39 AX AD 8
15 23-05-2019 2 CC1 2 CONCON 23:12 RR AE 9
16 23-05-2019 2 CC1 2 CONCON 23:29 RR AE 10
17 23-05-2019 2 CC1 2 CONCON 23:33 RR AE 10
18 23-05-2019 2 CC1 2 CONCON 23:43 RR AE 11
19 23-05-2019 2 CC1 3 CONCON 23:59 RR AF 12
20 24-05-2019 2 CC1 3 CONCON 0:21 RR AF 13
21 24-05-2019 2 CC1 3 CONCON 0:22 RR AF 13
22 24-05-2019 2 CC1 3 CONCON 1:20 RR AF 14
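"Event" would be a new variable whose value or label (it can be a number, letter, symbol, etc.) differs between groups (defined by Noche, Parcela, Camara, Tratamiento and Especie) and also within a group whenever the observations are more than 5 minutes apart. The start of an interval is set at an earlier observation, and the 5-minute difference is measured from that start for all subsequent observations, not between consecutive observations. The "Group" column is not required; I added it only to make the groups clear. It would be useful if the solution only gave distinct events within each group.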
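The rule described above can be sketched for a single group in base R. This is only an illustration under stated assumptions: `assign_events` is a hypothetical helper (not part of any package), times are given as minutes since midnight and already sorted, and an observation exactly 5 minutes after the interval start is assumed to stay in the same event, since the question only says "more than 5 minutes".

```r
# Hypothetical helper: label sorted observation times with an event index.
# Each event is anchored at its first observation; a new event opens when an
# observation is more than `window` minutes after the current anchor.
assign_events <- function(minutes, window = 5) {
  events <- integer(length(minutes))
  if (length(minutes) == 0) return(events)
  start <- minutes[1]   # the first observation opens the first interval
  id <- 1L
  for (i in seq_along(minutes)) {
    if (minutes[i] - start > window) {
      id <- id + 1L       # too far from the anchor: open a new event
      start <- minutes[i] # and re-anchor the interval here
    }
    events[i] <- id
  }
  events
}

# Group AE from the example: 23:12, 23:29, 23:33, 23:43
ae <- c(23 * 60 + 12, 23 * 60 + 29, 23 * 60 + 33, 23 * 60 + 43)
assign_events(ae)
```

For group AE this yields 1, 2, 2, 3, which corresponds to events 9, 10, 10, 11 in the expected table once the numbering is made global.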
The solution provided by Ronak Shah comes very close:
library(dplyr)
library(lubridate)  # provides dmy_hm()
datos %>%
tidyr::unite(datetime, Fecha, Hora, sep = ' ') %>%
mutate(datetime = dmy_hm(datetime)) %>%
group_by(Parcela, Camara, Tratamiento, Especie) %>%
mutate(grp = cut(datetime, breaks = '5 mins')) %>%
group_by(grp, .add = TRUE) %>%
mutate(Event = cur_group_id())
but some errors remain. In the example, rows 16 and 17 should belong to the same event, yet this method puts them in separate events.
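A likely reason for the split is that `cut()` with `breaks = '5 mins'` builds fixed 5-minute bins counted from the group's first timestamp, so the bins do not restart at each new observation. For the group starting at 23:12, a bin edge falls at 23:32, which separates 23:29 from 23:33. The sketch below shows this and then tries an alternative in which each interval is anchored at the first observation of its event. `event_in_group`, `resultado`, `local_event` and `key` are hypothetical names introduced here, and "more than 5 minutes from the anchor" is the assumed cutoff.

```r
library(dplyr)
library(lubridate)

# Rebuild the example data from the question
datos <- data.frame(
  Fecha = c(rep("22-05-2019", 9), rep("23-05-2019", 10), rep("24-05-2019", 3)),
  Noche = c(rep(1, 9), rep(2, 13)),
  Parcela = c(rep("CH1", 9), rep("CC1", 13)),
  Camara = c(rep(2, 18), rep(3, 4)),
  Tratamiento = c(rep("CHUNCHO", 9), rep("CONCON", 13)),
  Hora = c("20:07", "20:10", "20:15", "23:19", "23:20", "23:22", "23:25",
           "23:43", "23:44", "22:10", "22:12", "22:13", "22:18", "22:39",
           "23:12", "23:29", "23:33", "23:43", "23:59", "0:21", "0:22", "1:20"),
  Especie = c(rep("OL", 3), rep("AX", 4), rep("RR", 2), rep("AX", 5), rep("RR", 8))
)

# Fixed bins: 23:29 and 23:33 land on opposite sides of the 23:32 bin edge
ae <- dmy_hm(paste("23-05-2019", c("23:12", "23:29", "23:33", "23:43")))
cut(ae, breaks = "5 mins")

# Hypothetical helper: event index within one group, each interval anchored
# at the first observation of its event. Results keep the input row order.
event_in_group <- function(datetime, window_mins = 5) {
  ord <- order(datetime)
  out <- integer(length(datetime))
  start <- datetime[ord[1]]
  id <- 1L
  for (i in ord) {
    elapsed <- as.numeric(difftime(datetime[i], start, units = "mins"))
    if (elapsed > window_mins) {
      id <- id + 1L          # open a new event
      start <- datetime[i]   # and re-anchor the interval
    }
    out[i] <- id
  }
  out
}

resultado <- datos %>%
  mutate(datetime = dmy_hm(paste(Fecha, Hora))) %>%
  group_by(Noche, Parcela, Camara, Tratamiento, Especie) %>%
  mutate(local_event = event_in_group(datetime)) %>%
  ungroup() %>%
  mutate(
    key = paste(Noche, Parcela, Camara, Tratamiento, Especie, local_event),
    Event = match(key, unique(key))  # number events in order of first appearance
  ) %>%
  select(-key, -local_event)

resultado$Event
```

On the example data, rows 16 and 17 receive the same Event value with this approach. Numbering events by first appearance instead of `cur_group_id()` is a design choice that keeps the labels in row order; any unique label per group and interval would serve equally well.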
【Comments】:
Tags: r events variable-assignment intervals