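【Posted】: 2016-07-06 23:37:08
【Question】:
I asked a question here, Finding the index based on two data frames of strings, and got a perfect answer. Now I have run into another problem that I cannot solve. When my second data set has more than one column, I can handle it with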
setDT(strs)[, c('colids1','colids2') := lapply(.SD, function(x) toString(which(colSums(lut == x, na.rm=TRUE) > 0))), by = 1:nrow(strs)][]
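This works as long as all columns of my second data set (strs) have the same length. If the lengths differ, it does not work and gives me an error.
Suppose my first data set is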
lut <- structure(list(V1 = c("O75663", "O95400", "O95433", NA, NA),
V2 = c("O95456", "O95670", NA, NA, NA), V3 = c("O75663",
"O95400", "O95433", "O95456", "O95670"), V4 = c("O95456",
"O95670", "O95801", "P00352", NA), V1 = c("O75663", "O95400",
"O95433", NA, NA), V2 = c("O95456", "O95670", NA, NA, NA),
V3 = c("O75663", "O95400", "O95433", "O95456", "O95670"),
V4 = c("O95456", "O95670", "O95801", "P00352", NA)), .Names = c("V1",
"V2", "V3", "V4", "V1", "V2", "V3", "V4"), row.names = c(NA,
-5L), class = "data.frame")
My second data set is
strs <- structure(list(strings = structure(c(2L, 3L, 4L, 5L, 6L, 7L,
1L, 1L), .Label = c("", "O75663", "O95400", "O95433", "O95456",
"O95670", "O95801"), class = "factor"), strings2 = structure(c(4L,
2L, 6L, 5L, 3L, 1L, 1L, 1L), .Label = c("", "O75663", "O95433",
"O95456", "P00352", "P00492"), class = "factor"), strings3 = structure(c(4L,
6L, 7L, 8L, 2L, 3L, 5L, 1L), .Label = c("", "O75663", "O95400",
"O95456", "O95670", "O95801", "P00352", "P00492"), class = "factor"),
strings4 = structure(c(2L, 5L, 3L, 4L, 1L, 1L, 1L, 1L), .Label = c("",
"O95400", "O95456", "O95801", "P00492"), class = "factor"),
strings5 = structure(c(8L, 2L, 7L, 1L, 3L, 6L, 5L, 4L), .Label = c("O75663",
"O95400", "O95433", "O95456", "O95670", "O95801", "P00352",
"P00492"), class = "factor")), .Names = c("strings", "strings2",
"strings3", "strings4", "strings5"), class = "data.frame", row.names = c(NA,
-8L))
This is what I tried
df<- setDT(strs)[, paste0('colids_',seq_along(strs)) := lapply(.SD, function(x) toString(which(colSums(lut == x, na.rm=TRUE) > 0))), by = 1:nrow(strs)][]
It works when the columns of strs have the same length, but not when the lengths differ, as in the example given here.
【Comments】:
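For readers unfamiliar with the expression inside the `lapply` call, here is a minimal base R sketch of what it computes for a single string: the positions of the lookup-table columns that contain that string. The names `lut_toy` and `find_cols` are hypothetical and used only for illustration; the data is a reduced stand-in, not the `lut` from the question.

```r
# Hypothetical toy lookup table, smaller than the `lut` in the question
lut_toy <- data.frame(
  A = c("O75663", "O95400", NA),
  B = c("O95456", NA, NA),
  C = c("O75663", "O95456", "O95670"),
  stringsAsFactors = FALSE
)

# Return the positions of the columns of `tbl` that contain the string `x`,
# collapsed into one comma-separated string (empty when nothing matches)
find_cols <- function(x, tbl) {
  hits <- colSums(tbl == x, na.rm = TRUE) > 0  # TRUE for columns containing x
  toString(which(hits))
}

find_cols("O75663", lut_toy)  # "1, 3"
find_cols("O95456", lut_toy)  # "2, 3"
find_cols("", lut_toy)        # "" (the padding value matches no column)
```

Here `x` is passed as a plain character string, so the sketch says nothing about how the comparison behaves when `x` is a factor, as the columns of `strs` are in the question.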
-
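The error is pretty clear. Try this:
strs[c(1:3,5)] <- lapply(strs[c(1:3,5)], as.character) and then run your data.table statement. Is the resulting df what you expect? -
@Sumedh Thanks for your message, but it did not solve the problem. I did what you said, then I ran df 0))), by = 1:nrow(strs)][] and got the same error.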
-
@Sumedh I have been trying every comment on the web, but I don't know why it doesn't work!!!
-
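Sorry, I must have done something to the
strs data frame the first time I used it. Try strs[,c(1:5)] <- lapply(strs[,c(1:5)], as.character), and then run your code. In short, convert all variables in the strs data set from factor to character class -
@nik You have to do it like this: strs[]
Tags: r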
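The comments above suggest converting the factor columns of `strs` to character before running the lookup. As a minimal sketch, one common base R idiom for converting every column is shown below on a hypothetical miniature data frame (`strs_toy` is an assumed name, not data from the question). The thread does not confirm whether this alone resolves the reported error.

```r
# Hypothetical miniature of `strs`: factor columns padded with "" to equal length
strs_toy <- data.frame(
  strings  = factor(c("O75663", "O95400", "")),
  strings2 = factor(c("O95456", "", ""))
)

# Convert every column to character; assigning into `strs_toy[]` keeps the
# data.frame class, column names, and row count instead of returning a list
strs_toy[] <- lapply(strs_toy, as.character)

# Check the result: each column should now report class "character"
sapply(strs_toy, class)
```

If `strs` has already been turned into a data.table with `setDT`, it may be simpler to rebuild it from the original `structure(...)` definition before converting, so that columns added by earlier `:=` calls are not carried along.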