【Posted】: 2020-07-04 17:36:32
【Question】:
I am trying to figure out how to convert BORIS output into a state-sequence format that I can analyse with TraMineR.
The BORIS output is basically a table like this:
File Time Behavior Status
1 K8121319_feed3_01 0.000 Approach START
2 K8121319_feed3_01 393.225 Approach STOP
3 K8121319_feed3_01 393.226 Out-of-Frame START
4 K8121319_feed3_01 426.003 Out-of-Frame STOP
5 K8121319_feed3_01 442.006 Approach START
6 K8121319_feed3_01 465.755 Approach STOP
7 K8121319_feed3_01 465.756 Avoid START
8 K8121319_feed3_01 513.255 Avoid STOP
9 K8121319_feed3_01 513.256 Explore START
10 K8121319_feed3_01 746.577 Explore STOP
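It seems this can be converted to the SPELL sequence format with dplyr, but I don't know how. Has anyone used these two programs together?
The SPELL format looks like this: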
File Behavior Start Stop
1 K8121319_feed3_01 Approach 0.000 393.225
2 K8121319_feed3_01 OOF 393.226 426.003
3 K8121319_feed3_01 Approach 442.006 465.755
4 K8121319_feed3_01 Avoid 465.756 513.255
5 K8121319_feed3_01 Explore 513.256 746.577
I have been trying to do this with tidyr::spread.
Edit: here is the result of dput(data1[1:20,])
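A minimal sketch of the reshaping step, written in base R rather than with dplyr or tidyr so that it has no package dependencies. The helper name `boris_to_spell` and the small input data frame are hypothetical; the sketch assumes that, within each file, events strictly alternate START then STOP, which is true for the rows shown above but should be checked on the full data.

```r
# Hypothetical input mirroring the first BORIS rows shown in the question.
boris <- data.frame(
  File = "K8121319_feed3_01",
  Time = c(0, 393.225, 393.226, 426.003, 442.006, 465.755),
  Behavior = c("Approach", "Approach", "Out-of-Frame", "Out-of-Frame",
               "Approach", "Approach"),
  Status = c("START", "STOP", "START", "STOP", "START", "STOP"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: pair every START row with the STOP row that follows it.
boris_to_spell <- function(events) {
  # Order by file and time so each START is directly followed by its STOP.
  events <- events[order(events$File, events$Time), ]
  starts <- events[events$Status == "START", ]
  stops  <- events[events$Status == "STOP", ]
  # Assumption: events strictly alternate START/STOP within each file.
  stopifnot(
    nrow(starts) == nrow(stops),
    all(starts$File == stops$File),
    all(starts$Behavior == stops$Behavior)
  )
  data.frame(
    File = starts$File,
    Behavior = starts$Behavior,
    Start = starts$Time,
    Stop = stops$Time,
    stringsAsFactors = FALSE
  )
}

spell <- boris_to_spell(boris)
print(spell)
```

The `stopifnot()` check fails early if a START has no matching STOP, which is preferable to silently pairing the wrong rows.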
structure(list(File = c("K8121319_feed3_01", "K8121319_feed3_01",
"K8121319_feed3_01", "K8121319_feed3_01", "K8121319_feed3_01",
"K8121319_feed3_01", "K8121319_feed3_01", "K8121319_feed3_01",
"K8121319_feed3_01", "K8121319_feed3_01", "K8121319_feed3_02",
"K8121319_feed3_02", "K8121319_feed3_02", "K8121319_feed3_02",
"K8121319_feed3_02", "K8121319_feed3_02", "K8121319_feed3_02",
"K8121319_feed3_02", "K8121319_feed3_02", "K8121319_feed3_02"
), Time = c(0, 393.225, 393.226, 426.003, 442.006, 465.755, 465.756,
513.255, 513.256, 746.577, 0, 29.85, 29.851, 66.6, 66.601, 292.646,
292.647, 362.208, 362.209, 442.456), Behavior = c("Approach",
"Approach", "Out-of-Frame", "Out-of-Frame", "Approach", "Approach",
"Avoid", "Avoid", "Explore", "Explore", "Approach", "Approach",
"Avoid", "Avoid", "Approach", "Approach", "Avoid", "Avoid", "Approach",
"Approach"), Status = c("START", "STOP", "START", "STOP", "START",
"STOP", "START", "STOP", "START", "STOP", "START", "STOP", "START",
"STOP", "START", "STOP", "START", "STOP", "START", "STOP")), row.names = c(NA,
20L), class = "data.frame")
Edit: dput of a part of the df that contains repeated states
dput(data1[360:370,])
structure(list(File = c("K8121819_feed3_13", "K8121819_feed3_13",
"K8121819_feed3_13", "K8121819_feed3_13", "K8121819_feed3_13",
"K8121819_feed3_14", "K8121819_feed3_14", "K8121819_feed3_14",
"K8121819_feed3_14", "K8121819_feed3_14", "K8121819_feed3_14"
), Time = c(700.311, 700.312, 720.311, 742.851, 754.339, 0, 32.124,
32.125, 47.14, 47.141, 84.671), Behavior = c("Approach", "Avoid",
"Avoid", "Avoid", "Avoid", "Avoid", "Avoid", "Explore", "Explore",
"Approach", "Approach"), Status = c("STOP", "START", "STOP",
"START", "STOP", "START", "STOP", "START", "STOP", "START", "STOP"
)), row.names = 360:370, class = "data.frame")
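When the same behavior occurs more than once in a file (for example the two Avoid bouts in K8121819_feed3_13 above), File and Behavior alone no longer identify a bout, so a wide reshape needs an extra key. The sketch below, in base R, numbers the k-th START and k-th STOP of each behavior within each file and joins on that number. The column name `spell_id` is hypothetical, and the input is rows 361 to 370 of the data above; row 360 is left out because its matching START lies outside the excerpt.

```r
# Rows 361-370 of the excerpt above (row 360 omitted: its START is not shown).
events <- data.frame(
  File = c(rep("K8121819_feed3_13", 4), rep("K8121819_feed3_14", 6)),
  Time = c(700.312, 720.311, 742.851, 754.339,
           0, 32.124, 32.125, 47.14, 47.141, 84.671),
  Behavior = c("Avoid", "Avoid", "Avoid", "Avoid",
               "Avoid", "Avoid", "Explore", "Explore", "Approach", "Approach"),
  Status = c("START", "STOP", "START", "STOP",
             "START", "STOP", "START", "STOP", "START", "STOP"),
  stringsAsFactors = FALSE
)

# Assumption: rows are already in time order within each file.
# Number the k-th START and the k-th STOP of each behavior within each file.
events$spell_id <- ave(
  seq_len(nrow(events)),
  events$File, events$Behavior, events$Status,
  FUN = seq_along
)

key <- c("File", "Behavior", "spell_id")
starts <- events[events$Status == "START", c(key, "Time")]
stops  <- events[events$Status == "STOP",  c(key, "Time")]
names(starts)[4] <- "Start"
names(stops)[4]  <- "Stop"

# Join each START to the STOP with the same file, behavior and bout number.
spell <- merge(starts, stops, by = key)
spell <- spell[order(spell$File, spell$Start),
               c("File", "Behavior", "Start", "Stop")]
rownames(spell) <- NULL
print(spell)
```

Because `merge()` keeps only matched keys by default, a START without a STOP (or the reverse) is dropped rather than reported, so comparing `nrow(starts)` with `nrow(spell)` is a reasonable sanity check.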
【Comments】:
-
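From the user manual, it looks like this package is meant for analysing categorical values over discrete time sequences. Could you clarify how you want to convert continuous-scale time data into discrete data? Otherwise, I'm afraid this question is off-topic for Stack Overflow because it is not focused enough to be considered a specific programming question. It may be on-topic for Cross Validated, though.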
-
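TraMineR is for sequence data. If you look at the formats in Table 4.2 of the user manual, it can be used with a continuous time scale. See the SPELL format, for example. I will also add this to CV (Cross Validated). Thanks.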
-
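Yes, the dplyr version is 1.0.0. Here are all the errors and warnings: Error: Problem with `mutate()` input `START`. x non-numeric argument to mathematical function i Input `START` is `1L + as.integer(floor(START))`. Run `rlang::last_error()` to see where the error occurred. In addition: Warning message: Values are not uniquely identified; output will contain list-cols. * Use `values_fn = list` to suppress this warning. * Use `values_fn = length` to identify where the duplicates arise * Use `values_fn = {summary_fun}` to summarise duplicates
-
Yes. Thanks.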
-
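I confirm that TraMineR works with categorical sequences on a discrete time scale. In Table 4.2 of the user manual, time is a discrete value. The same holds for the SPELL format, where from and to must be discrete values.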
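One possible way to obtain discrete from and to values is to map the continuous times onto whole-second positions, in the spirit of the `1L + as.integer(floor(START))` expression quoted in the error message above. The sketch below is only one convention, not something prescribed by TraMineR: position t is assumed to cover the interval from t - 1 to t seconds, and a spell is assumed to end at the last full second it reaches, so that adjacent spells do not share a position. The column names `begin` and `end` are hypothetical.

```r
# Hypothetical SPELL table with continuous times in seconds.
spell <- data.frame(
  File = "K8121319_feed3_01",
  Behavior = c("Approach", "Out-of-Frame", "Approach"),
  Start = c(0, 393.226, 442.006),
  Stop = c(393.225, 426.003, 465.755),
  stringsAsFactors = FALSE
)

# Assumption: one position per second, with position t covering [t - 1, t).
spell$begin <- 1L + as.integer(floor(spell$Start))

# Assumption: a spell ends at the last full second it reaches, but never
# before its own first position (so very short spells keep one position).
spell$end <- pmax(spell$begin, as.integer(floor(spell$Stop)))

print(spell[, c("File", "Behavior", "begin", "end")])
```

With this convention the first spell occupies positions 1 to 393 and the second starts at 394, so the two do not overlap. Spells shorter than one second can still collide with a neighbour, and a coarser or finer time unit changes the result, so the choice of unit should be made deliberately before passing the table to TraMineR.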