[Posted]: 2016-07-17 08:03:26
[Question]:
I have the following dataset:
ID <- c(1,1,2,2,2,2,3,3,3,3,3,4,4,4)
Eval <- c("A","A","B","B","A","A","A","A","B","B","A","A","A","B")
med <- c("c","d","k","k","h","h","c","d","h","h","h","c","h","k")
df <- data.frame(ID,Eval,med)
> df
ID Eval med
1 1 A c
2 1 A d
3 2 B k
4 2 B k
5 2 A h
6 2 A h
7 3 A c
8 3 A d
9 3 B h
10 3 B h
11 3 A h
12 4 A c
13 4 A h
14 4 B k
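I am trying to create variables x and y, grouped by ID and Eval. For each ID, if Eval = A and med is "h" or "k", I set x = 1, otherwise x = 0; if Eval = B and med is "h" or "k", I set y = 1, otherwise y = 0. I got an answer with the approach below, but I don't like it and it doesn't seem very good: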
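library(data.table)
setDT(df)  # convert df to a data.table by reference
# number of distinct med values within each ID/Eval group
df[, count := uniqueN(med), by = .(ID, Eval)]
# x is filled only on Eval == "A" rows and y only on Eval == "B" rows; all other rows stay NA
df[Eval == "A", x := ifelse(count == 1 & med %in% c("k","h"), 1, 0), by = ID]
df[Eval == "B", y := ifelse(count == 1 & med %in% c("k","h"), 1, 0), by = ID]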
ID Eval med count x y
1: 1 A c 2 0 NA
2: 1 A d 2 0 NA
3: 2 B k 1 NA 1
4: 2 B k 1 NA 1
5: 2 A h 1 1 NA
6: 2 A h 1 1 NA
7: 3 A c 3 0 NA
8: 3 A d 3 0 NA
9: 3 B h 1 NA 1
10: 3 B h 1 NA 1
11: 3 A h 3 0 NA
12: 4 A c 2 0 NA
13: 4 A h 2 0 NA
14: 4 B k 1 NA 1
Next I need to collapse the rows so that there is one row per ID, but I don't know how to do that. Any ideas?
Desired output:
ID x y
1 0 0
2 1 1
3 0 1
4 0 1
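One possible way to go from the raw data straight to one row per ID is sketched below. It is only a sketch under an assumption taken from the code above: a flag is 1 when the ID has rows for that Eval value, those rows share a single distinct med value, and that value is "h" or "k"; in every other case, including an ID with no rows for that Eval value, the flag is 0. The names `target`, `flag` and `res` are made up for this illustration.

```r
library(data.table)

ID   <- c(1,1,2,2,2,2,3,3,3,3,3,4,4,4)
Eval <- c("A","A","B","B","A","A","A","A","B","B","A","A","A","B")
med  <- c("c","d","k","k","h","h","c","d","h","h","h","c","h","k")
dt   <- data.table(ID, Eval, med)

target <- c("h", "k")  # med values that count as a "hit"

# Hypothetical helper: 1 if the vector is non-empty, holds exactly one
# distinct value, and that value is in `target`; otherwise 0.
flag <- function(m) {
  as.integer(length(m) > 0 && uniqueN(m) == 1 && m[1] %in% target)
}

# One row per ID: x summarises the Eval == "A" rows, y the Eval == "B" rows.
res <- dt[, .(x = flag(med[Eval == "A"]),
              y = flag(med[Eval == "B"])),
          by = ID]
print(res)
```

On the sample data this should reproduce the desired table above. Because the summary is computed once per ID, the intermediate count, x and y columns with their NA values are not needed.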
[Comments]:
-
Shouldn't "y" for ID 1 be 0, since it contains only NAs?
-
Yes, you are right.
-
For row 11 of the data, shouldn't x be 1, since Eval is A and med is h?
-
@Maiasaura, for row 11: ID 3 with Eval A has the med values "c", "d" and "h", so the group does not fall entirely within "h" and "k". That is why x should be 0.
Tags: r if-statement duplicates data.table