【Question title】: How to integrate a set of vectors in multiple data.frames into one without duplication?
【Posted】: 2016-08-12 10:44:59
【Question description】:

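I have vectors of position indices stored in data.frame objects, but the order of those vectors (the columns) differs from one data.frame to the next. I want to integrate/merge these data.frame objects into one common data.frame in a very specific order, with no duplicate rows allowed. Does anyone know a trick that makes this easier? Can anyone suggest a possible approach for this task?

Data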

v1 <- data.frame(
  foo=c(1,2,3),
  bar=c(1,2,2),
  bleh=c(1,3,0))

v2 <-  data.frame(
  bar=c(1,2,3),
  foo=c(1,2,0),
  bleh=c(3,3,4))

v3 <-  data.frame(
  bleh=c(1,2,3,4),
  foo=c(1,1,2,0),
  bar=c(0,1,2,3))

Initial output after integration:

initial_output <- data.frame(
  foo=c(1,2,3,1,2,0,1,1,2,0),
  bar=c(1,2,2,1,2,3,0,1,2,3),
  bleh=c(1,3,0,3,3,4,1,2,3,4)
)

After removing duplicates:

rmDuplicate_output <- data.frame(
  foo=c(1,2,3,1,0,1,1),
  bar=c(1,2,2,1,3,0,1),
  bleh=c(1,3,0,3,4,1,2)
)

Final desired output:

final_output <- data.frame(
  foo=c(1,1,1,1,2,3,0),
  bar=c(0,1,1,1,2,2,3),
  bleh=c(1,1,2,3,3,0,4)
)

How can I easily get the final desired output? Is there an efficient way to do this kind of operation on data.frame objects? Thanks

【Question discussion】:

Alternatively, library(data.table) ; unique(rbindlist(mget(ls()), use.names = TRUE))
@DavidArenburg: Could you elaborate on your answer?
OK, I've added an answer with some explanation.

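Tags: r vector dataframe merge

【Solution 1】:

You can also use the mget/ls combination to fetch your data frames programmatically (without typing the individual names), and then use data.table's rbindlist and unique functions/methods for efficiency gains (see here and here)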

library(data.table)
unique(rbindlist(mget(ls(pattern = "v\\d+")), use.names = TRUE))
#    foo bar bleh
# 1:   1   1    1
# 2:   2   2    3
# 3:   3   2    0
# 4:   1   1    3
# 5:   0   3    4
# 6:   1   0    1
# 7:   1   1    2

As a side note, it is usually better to keep multiple data.frames in a single list, which gives you better control over them.
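The side note above can be sketched in base R as follows. This is only an illustration under assumptions: the list name `frames` and the result names `combined` and `deduped` are hypothetical and not part of the original answer, and it uses `do.call(rbind, ...)` plus `unique()` rather than data.table.

```r
# Hypothetical list holding the three data frames from the question
frames <- list(
  v1 = data.frame(foo = c(1, 2, 3), bar = c(1, 2, 2), bleh = c(1, 3, 0)),
  v2 = data.frame(bar = c(1, 2, 3), foo = c(1, 2, 0), bleh = c(3, 3, 4)),
  v3 = data.frame(bleh = c(1, 2, 3, 4), foo = c(1, 1, 2, 0), bar = c(0, 1, 2, 3))
)

# rbind on data frames matches columns by name, so the differing
# column order in v1, v2 and v3 does not matter
combined <- do.call(rbind, frames)

# unique() on a data frame drops fully duplicated rows,
# keeping the first occurrence of each
deduped <- unique(combined)
rownames(deduped) <- NULL
```

With the data frames in one list, adding a fourth frame only means adding one more list element; the combining code stays the same.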

【Discussion】:

【Solution 2】:

We can use bind_rows from dplyr, remove the duplicates with distinct, and then sort by 'bar' with arrange

library(dplyr)
bind_rows(v1, v2, v3) %>%
  distinct() %>%
  arrange(bar)
#    foo bar bleh
#1   1   0    1
#2   1   1    1
#3   1   1    3
#4   1   1    2
#5   2   2    3
#6   3   2    0
#7   0   3    4

【Discussion】:

【Solution 3】:

Here is one solution:

# combine dataframes
df = rbind(v1, v2, v3)

# remove duplicated
df = df[! duplicated(df),]

# sort by 'bar' column
df[order(df$bar),]
    foo bar bleh
7   1   0    1
1   1   1    1
4   1   1    3
8   1   1    2
2   2   2    3
3   3   2    0
6   0   3    4
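Sorting by 'bar' alone leaves the tied rows in their original order, so the rows with bar == 1 come out as bleh = 1, 3, 2 rather than the 1, 2, 3 shown in final_output. The question does not state its sort keys; the sketch below assumes, judging only from final_output, that ties are broken by 'foo' and then 'bleh'. The name `sorted` is hypothetical.

```r
v1 <- data.frame(foo = c(1, 2, 3), bar = c(1, 2, 2), bleh = c(1, 3, 0))
v2 <- data.frame(bar = c(1, 2, 3), foo = c(1, 2, 0), bleh = c(3, 3, 4))
v3 <- data.frame(bleh = c(1, 2, 3, 4), foo = c(1, 1, 2, 0), bar = c(0, 1, 2, 3))

# combine (columns are matched by name) and drop duplicated rows
df <- unique(rbind(v1, v2, v3))

# assumed sort keys: 'bar' first, then 'foo', then 'bleh' to break ties
sorted <- df[order(df$bar, df$foo, df$bleh), ]
rownames(sorted) <- NULL

# desired result as given in the question
final_output <- data.frame(
  foo  = c(1, 1, 1, 1, 2, 3, 0),
  bar  = c(0, 1, 1, 1, 2, 2, 3),
  bleh = c(1, 1, 2, 3, 3, 0, 4)
)

# compare the sorted result with the desired output
isTRUE(all.equal(sorted, final_output))
```

If a different tie-breaking rule is intended, only the arguments to order() need to change.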

【Discussion】: