Fastest way to check if an element is in both vectors: answers

【Question title】: Fastest way to check if element is in both vectors
【Posted】: 2015-01-23 17:43:05
【Question】:

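So, suppose we have two vectors, vec1 and vec2. What is the fastest way to perform some operation only on the elements that are in both vectors? This is what I have so far. Put simply, how can this be done faster, if there is a way at all: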

vector<Test*> vec1;
vector<Test*> vec2;

//Fill both of the vectors, with vec1 containing all existing 
//objects of Test, and vec2 containing some of them.


for (Test* test : vec1){

    //Check if test is in vec2
    if (std::find(vec2.begin(), vec2.end(), test) != vec2.end()){

        //Do some stuff

    }

}

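【Comments】:

Are the vectors sorted? Can a different data structure be used?
@Borgleader Even if they aren't, you can stable-sort them in O(n log n + m log m) time, which will definitely beat O(n*m).
If these are sorted, std::upper_bound would be helpful. If not, there are many ways to do this; std::unordered_set<Test*> is one worth considering.
@WhozCraig If they are sorted, you can use std::set_union
@IdeaHat Yes, the set operations in <algorithm> continue to elude me, since I rarely use them. A good point. By the way, did you mean std::set_intersection?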

Tags: c++ algorithm vector

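【Solution 1】:

Your approach is O(M*N), because for each element of vec1 it calls std::find, which is linear in the number of elements of vec2. You can improve on it in several ways:

Sorting vec2 lets you reduce the time to O((N+M)*Log M), i.e. you can use binary search on the range vec2.begin(), vec2.end()
Sorting both vectors lets you search in O(N Log N + M Log M): you can use an algorithm similar to merging sorted ranges to find the matching pairs in linear time
Using a hash set for the elements of vec2 lets you reduce the time to O(N+M) on average: both building the set and searching in it are now linear.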
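
The first two options above could be sketched roughly as follows. This is only an illustration under some assumptions: Test is a hypothetical stand-in for the question's type, the function names common_by_binary_search and common_by_merge are made up here, and both functions sort copies so the caller's vectors keep their order. The pointers are compared with std::less<Test*>, which is guaranteed to give a total order.

```cpp
#include <algorithm>
#include <functional>
#include <iterator>
#include <vector>

struct Test { int id; };  // hypothetical stand-in for the question's Test type

// Option 1: sort a copy of vec2 once, then binary-search it for each element of vec1.
// Returns the elements of vec1 that also occur in vec2, in vec1's order.
std::vector<Test*> common_by_binary_search(const std::vector<Test*>& vec1,
                                           std::vector<Test*> vec2) {  // by value: sorted locally
    std::sort(vec2.begin(), vec2.end(), std::less<Test*>());
    std::vector<Test*> common;
    for (Test* t : vec1) {
        // O(log M) lookup in the sorted copy
        if (std::binary_search(vec2.begin(), vec2.end(), t, std::less<Test*>())) {
            common.push_back(t);
        }
    }
    return common;
}

// Option 2: sort copies of both vectors, then let std::set_intersection
// do the merge-like linear walk over the two sorted ranges.
// The result is ordered by pointer value, not by the original order.
std::vector<Test*> common_by_merge(std::vector<Test*> vec1, std::vector<Test*> vec2) {
    std::sort(vec1.begin(), vec1.end(), std::less<Test*>());
    std::sort(vec2.begin(), vec2.end(), std::less<Test*>());
    std::vector<Test*> common;
    std::set_intersection(vec1.begin(), vec1.end(), vec2.begin(), vec2.end(),
                          std::back_inserter(common), std::less<Test*>());
    return common;
}
```

Either function returns the shared pointers, so the "do some stuff" step becomes a plain loop over the result. If the original order of vec1 matters, the binary-search variant is probably the more convenient of the two.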

【Discussion】:

Silly thought... if N
@IdeaHat In general, it's best to check the relative sizes of the two vectors and sort/hash/etc. the smaller of the two.
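
One way to read that last suggestion, as a minimal sketch: build the hash set from whichever vector is smaller and scan the larger one. The helper name for_each_common and the Test stand-in are hypothetical and not part of the original thread.

```cpp
#include <unordered_set>
#include <vector>

struct Test { int id; };  // hypothetical stand-in for the question's Test type

// Calls fn(t) for every element of the larger vector that also occurs in the
// smaller one. Only the smaller vector is hashed, which keeps the set small.
// Note: an element repeated in the larger vector triggers fn once per repeat.
template <typename Fn>
void for_each_common(const std::vector<Test*>& a, const std::vector<Test*>& b, Fn fn) {
    const bool a_is_smaller = a.size() <= b.size();
    const std::vector<Test*>& smaller = a_is_smaller ? a : b;
    const std::vector<Test*>& larger = a_is_smaller ? b : a;

    std::unordered_set<Test*> lookup(smaller.begin(), smaller.end());
    for (Test* t : larger) {
        if (lookup.count(t) != 0) {  // average O(1) membership test
            fn(t);
        }
    }
}
```

A call might look like for_each_common(vec1, vec2, [](Test* t) { /* do some stuff */ });. The total work stays linear on average either way; choosing the smaller side mainly reduces the memory used by the set.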

【Solution 2】:

A simple way is std::unordered_set

vector<Test*> vec1;
vector<Test*> vec2;

//Fill both of the vectors, with vec1 containing all existing 
//objects of Test, and vec2 containing some of them.
std::unordered_set<Test*> set2(vec2.begin(),vec2.end());

for (Test* t : vec1) {
    //O(1) average-case lookup in hash set
    if (set2.find(t) != set2.end()) {
        //stuff
    }
}

O(n+m) on average, where n is the number of elements in vec1 and m is the number of elements in vec2.

【Discussion】: