[Posted]: 2016-08-22 01:06:52
[Question]:
I have two datasets representing old addresses and current addresses.
> main
idspace id x y move
198 1238 33 4 stay
641 1236 36 12 move
1515 1237 30 28 move
> move
idspace id x y move
4 1236 4 1 move
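I need to merge the new data (move) into the old data (main), so that main is updated after the merge.
Is it possible to do this in a single operation?
The update is keyed on id, which is the personal identifier.
idspace, x, and y are location identifiers.
So, the output I need is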
> main
idspace id x y move
198 1238 33 4 stay
4 1236 4 1 move # this one is updated
1515 1237 30 28 move
I don't know how to do this.
I tried something like
merge(main, move, by = c('id'), all = T, suffixes = c('old', 'new'))
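However, this is not what I want, because I would still have to reconcile many columns by hand afterwards.
Is there a better solution?
Data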
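To make the "manual work" concrete, here is a minimal sketch of what the merge() attempt leaves to be done. It assumes the data are rebuilt as character columns rather than the factors in the dput output; the names `cols`, `merged`, and `result` are illustrative only.

```r
# Example data rebuilt as character columns (an assumption; the original uses factors).
main <- data.frame(
  idspace = c("198", "641", "1515"),
  id      = c("1238", "1236", "1237"),
  x       = c("33", "36", "30"),
  y       = c("4", "12", "28"),
  move    = c("stay", "move", "move"),
  stringsAsFactors = FALSE
)
move <- data.frame(
  idspace = "4", id = "1236", x = "4", y = "1", move = "move",
  stringsAsFactors = FALSE
)

# The full outer join produces paired columns such as xold / xnew.
merged <- merge(main, move, by = "id", all = TRUE, suffixes = c("old", "new"))

# Manual step: for every non-key column, prefer the new value when present.
cols <- setdiff(names(main), "id")
result <- data.frame(id = merged$id, stringsAsFactors = FALSE)
for (col in cols) {
  new_val <- merged[[paste0(col, "new")]]
  old_val <- merged[[paste0(col, "old")]]
  result[[col]] <- ifelse(is.na(new_val), old_val, new_val)
}
result <- result[, names(main)]  # restore the original column order
```

Note that merge() sorts by the key by default, so the row order of main is not kept.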
> dput(main)
structure(list(idspace = structure(c(2L, 3L, 1L), .Label = c("1515",
"198", "641"), class = "factor"), id = structure(c(3L, 1L, 2L
), .Label = c("1236", "1237", "1238"), class = "factor"), x = structure(c(2L,
3L, 1L), .Label = c("30", "33", "36"), class = "factor"), y = structure(c(3L,
1L, 2L), .Label = c("12", "28", "4"), class = "factor"), move = structure(c(2L,
1L, 1L), .Label = c("move", "stay"), class = "factor")), .Names = c("idspace",
"id", "x", "y", "move"), row.names = c(NA, -3L), class = "data.frame")
> dput(move)
structure(list(idspace = structure(1L, .Label = "4", class = "factor"),
id = structure(1L, .Label = "1236", class = "factor"), x = structure(1L, .Label = "4", class = "factor"),
y = structure(1L, .Label = "1", class = "factor"), move = structure(1L, .Label = "move", class = "factor")), .Names = c("idspace",
"id", "x", "y", "move"), row.names = c(NA, -1L), class = "data.frame")
[Comments]:
-
I think this is a duplicate, since the
tmp <- rbind(move,main); tmp[!duplicated(tmp$id),] logic works just fine, assuming there are no other requirements here. -
@thelatemail I was thinking of using
sqldf, but I don't know that API well enough to answer. -
@TimBiegeleisen - maybe
sqldf(" select coalesce(b.idspace,a.idspace) as idspace, coalesce(b.id,a.id) as id, coalesce(b.x,a.x) as x, coalesce(b.y,a.y) as y, coalesce(b.move,a.move) as move from main a left join move b on a.id = b.id ") - ugly, but it does work. -
There should be an elegant way to do it in
sql, of course. I need to dig into it, but thanks @TimBiegeleisen
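The rbind/duplicated idea from the first comment can be sketched as follows, together with an order-preserving variant based on match(). This is a sketch under assumptions, not a definitive answer: the data are rebuilt as character columns (the factor columns in the dput output would first need something like `main[] <- lapply(main, as.character)`, and likewise for move, before values can be overwritten safely), and the names `updated_any_order`, `idx`, `hit`, and `updated` are illustrative.

```r
# Example data rebuilt as character columns (an assumption; the original uses factors).
main <- data.frame(
  idspace = c("198", "641", "1515"),
  id      = c("1238", "1236", "1237"),
  x       = c("33", "36", "30"),
  y       = c("4", "12", "28"),
  move    = c("stay", "move", "move"),
  stringsAsFactors = FALSE
)
move <- data.frame(
  idspace = "4", id = "1236", x = "4", y = "1", move = "move",
  stringsAsFactors = FALSE
)

# Approach from the comment: stack the new rows first, then keep only the
# first occurrence of each id, so the new row wins over the old one.
tmp <- rbind(move, main)
updated_any_order <- tmp[!duplicated(tmp$id), ]

# Order-preserving variant: overwrite the matching rows of main in place.
idx <- match(move$id, main$id)  # row in main for each row of move, NA if absent
hit <- !is.na(idx)
updated <- main
updated[idx[hit], ] <- move[hit, names(main)]
# ids that exist only in move are appended at the end.
updated <- rbind(updated, move[!hit, names(main)])
```

The first approach is shorter but puts the updated rows at the top; the second keeps the row order shown in the desired output.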
Tags: r merge sql-update