A vector as a patchwork of two other vectors: answers

【Question title】: A vector as a patchwork of two other vectors
【Posted】: 2017-03-28 21:34:37
【Question】:

Subsetting a vector

Here is a benchmark of two different solutions for taking a subset of a vector:

#include <vector>
#include <iostream>
#include <iomanip>
#include <sys/time.h>

using namespace std;

int main()
{
    struct timeval timeStart, timeEnd;

    // Build the vector 'whole' to subset
    vector<int> whole;
    for (int i = 0; i < 10000000; i++)
    {
        whole.push_back(i);
    }

    // Solution 1 - Use a for loop
    gettimeofday(&timeStart, NULL);
    vector<int> subset1;
    subset1.reserve(9123000 - 1200);
    for (int i = 1200; i < 9123000; i++)
    {
        subset1.push_back(i); // same value as whole[i], since whole[i] == i
    }
    gettimeofday(&timeEnd, NULL);

    cout << "Solution 1 took " << ((timeEnd.tv_sec - timeStart.tv_sec) * 1000000 + timeEnd.tv_usec - timeStart.tv_usec) << " us" << endl;

    // Solution 2 - Use iterators and constructor
    gettimeofday(&timeStart, NULL);
    vector<int>::iterator first = whole.begin() + 1200;
    vector<int>::iterator last = whole.begin() + 9123000;
    vector<int> subset2(first, last);
    gettimeofday(&timeEnd, NULL);

    cout << "Solution 2 took " << ((timeEnd.tv_sec - timeStart.tv_sec) * 1000000 + timeEnd.tv_usec - timeStart.tv_usec) << " us" << endl;
}

On my old laptop, it outputs

Solution 1 took 243564 us
Solution 2 took 164220 us

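Solution 2 is clearly faster.

Patchwork of two vectors

I would like to create a vector as a patchwork of two different vectors of the same size. The output starts with the values of one vector, then switches back and forth between the two at given breakpoints. I guess I don't fully understand how to copy values into a vector using iterators that point to elements of another vector. The only implementation I can think of uses something similar to solution 1 above. Something like...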

#include <vector>
#include <iostream>
#include <cmath>
#include <iomanip>
#include <sys/time.h>
#include <limits.h>

using namespace std;

int main()
{

  // input 
  vector<int> breakpoints = {2, 5, 7, INT_MAX};
  vector<int> v1 = { 1,  2,  3,  4,  5,  6,  7,  8,  9 };
  vector<int> v2 = { 10, 20, 30, 40, 50, 60, 70, 80, 90 };

  // Create output
  vector<int> ExpectedOutput;
  ExpectedOutput.reserve(v1.size());
  int origin = 0;
  int breakpoints_index = 0;
  for (int i = 0 ; i < v1.size() ; i++)
  {
     if (origin)
     {
        ExpectedOutput.push_back(v1[i]);
     } else
     {
        ExpectedOutput.push_back(v2[i]);
     }
     if (breakpoints[breakpoints_index] == i)
     {
        origin = !origin;
        breakpoints_index++;
     }
  }


  // print output
  cout << "output: ";
  for (int i = 0 ; i < ExpectedOutput.size() ; i++)
  {
     cout << ExpectedOutput[i] << " ";
  }
  cout << endl;

  return 0;
}

which outputs

output: 10 20 30 4 5 6 70 80 9

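It feels like there must be a better solution, for example something similar to solution 2 above. Is there a faster solution?

【Comments】:

I suspect solution 2 is so much faster because the compiler manages to optimise it into a single memcpy call (or the equivalent in machine code). No amount of cleverness can do that for the task of interleaving the contents of two vectors.
Oh... so I guess that once the "chunks" taken from the two source vectors become big enough, it becomes beneficial to make one memcpy call per chunk (I didn't know about that function before reading your comment) rather than copying each element (each int in the example) separately, right?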
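
To make the per-chunk idea in the comment above concrete, here is a minimal sketch. The helper name `copy_chunk_memcpy` is hypothetical and not from the thread. It assumes the destination is already sized to hold the chunk, and it relies on `int` being trivially copyable; `std::memcpy` must not be used this way for element types with non-trivial copy semantics.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: copies src[from, to) into dst at the same positions.
// Assumes from <= to <= src.size() and dst.size() >= to.
// Only valid because int is trivially copyable.
void copy_chunk_memcpy(const std::vector<int>& src, std::vector<int>& dst,
                       std::size_t from, std::size_t to)
{
    if (to > from) {
        std::memcpy(dst.data() + from, src.data() + from,
                    (to - from) * sizeof(int));
    }
}
```

In everyday code, `std::copy` over the same range is usually the safer choice, since it works for any element type; standard library implementations commonly turn it into a block copy for trivially copyable types, though that is an implementation detail rather than a guarantee.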

Tags: c++ performance loops vector iterator

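【Answer 1】:

Calling push_back() repeatedly means that a check is performed on every iteration of the loop to make sure capacity() is large enough (if it is not, more space has to be reserved). When you copy a whole range, the capacity() check only needs to happen once.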
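
As a minimal sketch of the two approaches side by side (the function names are hypothetical, and both assume `first <= last <= whole.size()`): the first appends element by element, the second hands the whole range to the vector constructor so the size is known up front.

```cpp
#include <cstddef>
#include <vector>

// Element by element: every push_back checks the capacity before appending.
std::vector<int> subset_by_push_back(const std::vector<int>& whole,
                                     std::size_t first, std::size_t last)
{
    std::vector<int> out;
    out.reserve(last - first);
    for (std::size_t i = first; i < last; ++i) {
        out.push_back(whole[i]);
    }
    return out;
}

// Range constructor: the element count is known from the iterators,
// so the storage is set up once and the elements are copied in bulk.
std::vector<int> subset_by_range(const std::vector<int>& whole,
                                 std::size_t first, std::size_t last)
{
    return std::vector<int>(whole.begin() + static_cast<std::ptrdiff_t>(first),
                            whole.begin() + static_cast<std::ptrdiff_t>(last));
}
```

Both functions return the same elements; any speed difference depends on the compiler, the optimisation level, and the element type, so it is worth measuring on your own setup.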

You can still be a bit smarter about the interleaving by copying chunks. Here is the very basic idea:

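// Needs <algorithm> (std::min) and <utility> (std::swap).
int from = 0;
for( int b : breakpoints )
{
    std::swap( v1, v2 );  // v1 now holds the vector to copy from
    // One past the breakpoint, clamped to the vector length
    int to = 1 + std::min( b, static_cast<int>( v1.size() ) - 1 );
    ExpectedOutput.insert( ExpectedOutput.end(), v1.begin() + from, v1.begin() + to );
    from = to;
}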

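For brevity, this code actually swaps v1 and v2, so it always operates on v1. I do the swap before the insert to mimic the logic in your code (which acts on v2 first). You can do this in a non-modifying way instead if you want.

Of course, there is a bit more going on in this code. It only makes sense when there are considerably fewer breakpoints than values. Note that it also assumes v1 and v2 are the same length.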
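
For the non-modifying variant mentioned above, one possible sketch swaps two pointers instead of the vectors themselves. The function name `patchwork` is hypothetical. It assumes that v1 and v2 have the same length, that the breakpoints are non-negative and sorted in increasing order, that each breakpoint is the last index of a chunk, and that the first chunk comes from v2, as in the question's code. Copying any leftover tail after the last breakpoint is an added assumption, not something the thread specifies.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: builds the patchwork without modifying v1 or v2.
std::vector<int> patchwork(const std::vector<int>& v1,
                           const std::vector<int>& v2,
                           const std::vector<int>& breakpoints)
{
    std::vector<int> out;
    out.reserve(v1.size());

    const std::vector<int>* source = &v2;  // vector currently being copied from
    const std::vector<int>* other = &v1;   // vector to switch to next
    std::size_t from = 0;

    for (int b : breakpoints) {
        if (from >= v1.size()) {
            break;  // everything has been copied already
        }
        // One past the breakpoint, clamped to the vector length.
        const std::size_t to =
            std::min(static_cast<std::size_t>(b) + 1, v1.size());
        if (to > from) {
            out.insert(out.end(),
                       source->begin() + static_cast<std::ptrdiff_t>(from),
                       source->begin() + static_cast<std::ptrdiff_t>(to));
            from = to;
        }
        std::swap(source, other);  // switch source for the next chunk
    }

    // Assumption: if the breakpoints stop early, the tail comes from
    // whichever vector is current at that point.
    if (from < v1.size()) {
        out.insert(out.end(),
                   source->begin() + static_cast<std::ptrdiff_t>(from),
                   source->end());
    }
    return out;
}
```

With the vectors and breakpoints from the question, this gives 10 20 30 4 5 6 70 80 9, the same output as the original loop. As with the swapping version, it only pays off when the chunks are long compared with the number of breakpoints.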
