[Posted]: 2018-11-06 14:05:58
[Question]:
I have this data frame:
df <- structure(list(Name = c("Sub1", "Sub1", "Sub1", "Sub1", "Sub1", "Sub1", "Sub1", "Sub1", "Sub1", "Sub1", "Sub1", "Sub1", "Sub1", "Sub1", "Sub1", "Sub1",
"Sub1", "Sub1", "Sub1", "Sub1", "Sub1", "Sub1", "Sub1", "Sub1", "Sub1", "Sub1", "Sub1", "Sub1", "Sub1", "Sub1", "Sub1", "Sub1",
"Sub1", "Sub1", "Sub1", "Sub1", "Sub1", "Sub1", "Sub1", "Sub2", "Sub2", "Sub2", "Sub2", "Sub2", "Sub2", "Sub2", "Sub2", "Sub2",
"Sub2", "Sub2", "Sub2", "Sub2", "Sub2", "Sub2", "Sub2", "Sub2", "Sub2", "Sub2", "Sub2", "Sub2", "Sub2", "Sub2", "Sub2", "Sub2",
"Sub2", "Sub2", "Sub2", "Sub2", "Sub2", "Sub2", "Sub2", "Sub2", "Sub2", "Sub2", "Sub2", "Sub2", "Sub2", "Sub2", "Sub2"),
StimulusName = c("Stim1", "Stim1", "Stim1", "Stim1", "Stim1", "Stim1", "Stim1", "Stim1", "Stim1", "Stim1", "Stim1", "Stim1",
"Stim1", "Stim1", "Stim1", "Stim1", "Stim1", "Stim1", "Stim1", "Stim2", "Stim2", "Stim2", "Stim2", "Stim2",
"Stim2", "Stim2", "Stim2", "Stim2", "Stim2", "Stim2", "Stim2", "Stim2", "Stim2", "Stim2", "Stim2", "Stim2",
"Stim2", "Stim2", "Stim2", "Stim1", "Stim1", "Stim1", "Stim1", "Stim1", "Stim1", "Stim1", "Stim1", "Stim1",
"Stim1", "Stim1", "Stim1", "Stim1", "Stim1", "Stim1", "Stim1", "Stim1", "Stim1", "Stim1", "Stim1", "Stim2",
"Stim2", "Stim2", "Stim2", "Stim2", "Stim2", "Stim2", "Stim2", "Stim2", "Stim2", "Stim2", "Stim2", "Stim2",
"Stim2", "Stim2", "Stim2", "Stim2", "Stim2", "Stim2", "Stim2"),
Fixation = c(NA, NA, 1L, 1L, NA, NA, 2L, 2L, 3L, 3L, NA, NA, NA, NA, NA, 4L, 4L, 5L, 5L, NA, NA, NA, NA, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L,
2L, NA, NA, NA, 3L, 3L, 3L, NA, NA, NA, NA, NA, NA, 1L, 1L, 1L, 1L, 2L, 2L, NA, NA, 3L, 3L, 3L, 4L, 4L, 4L, NA, NA, 1L, 1L, NA,
NA, 2L, 2L, 3L, 3L, NA, NA, NA, NA, NA, 4L, 4L, 5L, 5L, NA)),
row.names = c(NA, -79L), class = c("tbl_df", "tbl", "data.frame"))
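It has 3 columns: Name, StimulusName and Fixation.
I want to return the row numbers of the first occurrence of each unique value in the Fixation column, grouped by Name and StimulusName.
Here is what I have tried so far (based on a partial solution found elsewhere):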
# function to return the positions of first occurrences
Unique_Indices <- function(Values){
  unik <- !duplicated(Values) ## TRUE at the first occurrence of each value
  return(seq_along(Values)[unik]) ## positions within Values
}
But when I use it in a dplyr chain, it does not return the original row numbers; instead the row count restarts within each group:
library(dplyr)  # needed for %>%, group_by() and summarise()
library(tidyr)
# This doesn't work
Unique_Index <- df %>%
  group_by(Name, StimulusName) %>%
  summarise(Indices = list(Unique_Indices(Fixation))) %>%
  unnest(Indices)
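The incorrect output looks like this:
You can see that once Indices moves on to the next StimulusName, it no longer contains the original row numbers, because of the group_by call. Is there any way to group_by while keeping the original row numbers of df?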
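To illustrate why the positions restart, here is a minimal base R sketch on made-up data (the `toy` data frame and its `grp`/`val` columns are hypothetical, not the df above). It assumes the helper is applied to each group's values in isolation, which is how a grouped summary calls it, so `seq_along()` only counts within the group.

```r
# Toy data (hypothetical, not the df from the question)
toy <- data.frame(
  grp = c("a", "a", "a", "b", "b", "b"),
  val = c(1, 1, 2, 1, 2, 2)
)

# Same helper as in the question
Unique_Indices <- function(Values) {
  seq_along(Values)[!duplicated(Values)]
}

# Each call only sees one group's values, so the positions
# it returns are relative to that group
per_group <- lapply(split(toy$val, toy$grp), Unique_Indices)

# Splitting the original row numbers the same way lets us
# map the group-relative positions back to absolute rows
rows <- split(seq_len(nrow(toy)), toy$grp)
absolute <- Map(function(r, i) r[i], rows, per_group)
```

Here `per_group$b` is relative to group "b", while `absolute$b` refers to rows of `toy`.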
[Comments]:
-
What is the correct expected result?
-
I'm not sure, but does
df %>% rownames_to_column() %>% group_by(Name, StimulusName) %>% filter(!duplicated(Fixation))
give your expected output?
-
Your data has no unique values
-
Hi @kath, that seems to have worked, yes. If you pop the solution into an answer, I'll accept it.
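
For readers following the thread, the idea in the comment above can be sketched roughly as follows. This is only an illustration on made-up data: the `toy` data frame, the `Row` column name and the use of `row_number()` in place of `rownames_to_column()` are assumptions, not something from the original posts. The key point is that the row number is recorded before grouping, so it survives the `group_by()`.

```r
library(dplyr)

# Toy data (hypothetical stand-in for df)
toy <- data.frame(
  Name         = c("Sub1", "Sub1", "Sub1", "Sub1", "Sub2", "Sub2"),
  StimulusName = c("Stim1", "Stim1", "Stim2", "Stim2", "Stim1", "Stim1"),
  Fixation     = c(NA, 1L, 1L, 1L, 2L, 2L)
)

first_rows <- toy %>%
  mutate(Row = row_number()) %>%      # record row numbers before grouping
  group_by(Name, StimulusName) %>%
  filter(!duplicated(Fixation)) %>%   # keep the first row of each Fixation value per group
  ungroup()
```

Note that `duplicated()` treats NA as a value like any other, so the first NA in each group is kept as well; adding `filter(!is.na(Fixation))` would drop those rows if they are not wanted.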