Why is an array so much faster than a vector? Answers

【Question title】: Why is an array so much faster than a vector?
【Posted】: 2012-12-03 09:48:42
【Question description】:

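Is this a fair test for comparing a vector with an array? The difference in speed seems too large. My test suggests the array is 10 to 100 times faster!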

#include "stdafx.h"
#include <iostream>
#include <vector>
#include <windows.h>
#include <stdint.h>

using namespace std;

double PCFreq = 0.0;
__int64 CounterStart = 0;

void StartCounter()
{
    LARGE_INTEGER li;
    if(!QueryPerformanceFrequency(&li))
        std::cout << "QueryPerformanceFrequency failed!\n";

    PCFreq = double(li.QuadPart)/1000000000;

    QueryPerformanceCounter(&li);
    CounterStart = li.QuadPart;
}
double GetCounter()
{
    LARGE_INTEGER li;
    QueryPerformanceCounter(&li);
    return double(li.QuadPart-CounterStart)/PCFreq;
}

int _tmain(int argc, _TCHAR* argv[])
{
    //Can do 100,000 but not 1,000,000
    const int vectorsize = 100000;
    cout.precision(10);

    StartCounter();
    vector<int> test1(vectorsize);
    for(int i=0; i<vectorsize; i++){
        test1[i] = 5;
    }
    cout << GetCounter() << endl << endl;


    StartCounter();
    int test2[vectorsize];
    for(int i=0; i<vectorsize; i++){
        test2[i] = 5;
    }
    cout << GetCounter() << endl << endl;

    cout << test2[0];

    int t = 0;
    cin >> t;
    return 0;
}

【Comments】:

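Look at the assembly code. Most likely, almost everything gets optimized away in a release build.
Indeed, your second loop has most likely been removed entirely by the compiler.
@user997112 Correct. It's called Dead Code Elimination. Here's another example: stackoverflow.com/questions/8841865/…
@user997112: The fact that you see such a difference probably means you are running some kind of debug build, unoptimized (or even deliberately de-optimized) and heavily loaded with assertions (iterator checks and the like). Running any comparison in a debug build makes absolutely no sense.
One difference is that the vector's elements are first initialized to zero in the constructor and then assigned 5. Try vector<int> test1(vectorsize, 5); instead.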
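
The suggestion in the last comment can be sketched as follows. This is a minimal illustration, not code from the thread: the function names make_zeroed_then_assigned and make_filled are hypothetical, and the only library feature relied on is the std::vector(count, value) constructor.

```cpp
#include <cstddef>
#include <vector>

// Two-step version, as in the question: the constructor zeroes every
// element, then the loop overwrites each one.
std::vector<int> make_zeroed_then_assigned(std::size_t n, int value) {
    std::vector<int> v(n);
    for (std::size_t i = 0; i < n; ++i) {
        v[i] = value;
    }
    return v;
}

// One-step version, as in the comment: the constructor writes the
// requested value directly, so each element is written only once.
std::vector<int> make_filled(std::size_t n, int value) {
    return std::vector<int>(n, value);
}
```

Both functions produce the same contents; the second simply avoids the extra zeroing pass over the buffer.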

Tags: c++ arrays performance vector

【Solution 1】:

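It depends on what you are comparing.

Your benchmark measures setup time and access time together. There is no doubt that std::vector's setup is more expensive: it has to allocate memory and then (as the standard requires) value-initialize every element. For POD types, that means zeroing.

So if you are trying to measure access time, your benchmark is not accurate.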
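
To make the setup/access split concrete, here is a minimal sketch that times the two phases separately. It is an illustration under stated assumptions rather than the code used for the numbers in this answer: it swaps in the portable std::chrono::steady_clock for QueryPerformanceCounter, and SplitTiming and time_vector_split are hypothetical names.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical result type: both timings are in nanoseconds.
struct SplitTiming {
    long long setup_ns;   // construction: allocation plus zeroing
    long long access_ns;  // one pass that writes every element
    int last_value;       // returned so the stores have an observable result
};

// Times vector construction and the write loop as two separate intervals.
SplitTiming time_vector_split(std::size_t n) {
    assert(n > 0);
    using clock = std::chrono::steady_clock;
    using std::chrono::duration_cast;
    using std::chrono::nanoseconds;

    const auto t0 = clock::now();
    std::vector<int> v(n);  // allocates and value-initializes n ints
    const auto t1 = clock::now();
    for (std::size_t i = 0; i < n; ++i) {
        v[i] = 5;
    }
    const auto t2 = clock::now();

    SplitTiming result;
    result.setup_ns = duration_cast<nanoseconds>(t1 - t0).count();
    result.access_ns = duration_cast<nanoseconds>(t2 - t1).count();
    result.last_value = v[n - 1];
    return result;
}
```

Calling time_vector_split(100000) mirrors the element count in the question. The absolute values depend on the machine and build settings, so only the relative size of the two fields is meaningful.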

Here are some numbers to digest:

Original code:

StartCounter();
vector<int> test1(vectorsize);

for(int i=0; i<vectorsize; i++){
    test1[i] = 5;
}
cout << GetCounter() << endl << endl;

Time: 444353.5206

Starting the timer after declaring and initializing the vector:

vector<int> test1(vectorsize);

StartCounter();
for(int i=0; i<vectorsize; i++){
    test1[i] = 5;
}
cout << GetCounter() << endl << endl;

Time: 15031.76101

For the array:

StartCounter();
int test2[vectorsize];
for(int i=0; i<vectorsize; i++){
    test2[i] = 5;
}
cout << GetCounter() << endl << endl;

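Time: 38129.345

The time is roughly the same whether or not the declaration is timed. That is probably because the stack allocation is done all at once on function entry.

Basically, the vector's memory allocation and initialization take a disproportionate amount of time. But the actual loop is fast.

I'll also point out that your current benchmark framework is still clearly flawed. You make only one pass over each array, so cache effects and lazy allocation will be significant.

The reason the array is now slower is probably lazy allocation. The array is allocated but not yet committed. With lazy allocation, the memory is committed on first access, which involves a page fault and a context switch to the kernel.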
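
One way to observe the first-touch cost described above is to time two consecutive passes over the same freshly allocated buffer. The sketch below is a hedged illustration, not the answer's measurement: it assumes a heap buffer from new int[n] instead of the stack array, uses std::chrono::steady_clock, and the names PassTiming, timed_fill and time_first_and_second_pass are hypothetical. Whether the first pass is actually slower depends on the OS, the allocator and the optimizer, which may fold the stores, so check the generated assembly before trusting the numbers.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <memory>

// Hypothetical result type: both timings are in nanoseconds.
struct PassTiming {
    long long first_pass_ns;   // may include page faults on first touch
    long long second_pass_ns;  // pages are already committed by now
    int last_value;            // observable result of the second pass
};

// Writes `value` to every element and returns the elapsed nanoseconds.
static long long timed_fill(int* data, std::size_t n, int value) {
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (std::size_t i = 0; i < n; ++i) {
        data[i] = value;
    }
    const auto stop = clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}

PassTiming time_first_and_second_pass(std::size_t n) {
    assert(n > 0);
    // Assumption: an uninitialized heap buffer stands in for the stack array.
    std::unique_ptr<int[]> buffer(new int[n]);

    PassTiming result;
    result.first_pass_ns = timed_fill(buffer.get(), n, 5);
    result.second_pass_ns = timed_fill(buffer.get(), n, 6);
    result.last_value = buffer[n - 1];
    return result;
}
```

If lazy commitment is in play, first_pass_ns tends to exceed second_pass_ns for large n; a single run is noisy, so repeat the measurement before drawing conclusions.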

Here is a fairer test, with an outer loop to lengthen the benchmark time:

vector<int> test1(vectorsize);

StartCounter();
for (int c = 0; c < 10000; c++){
    for(int i=0; i<vectorsize; i++){
        test1[i] = 5;
    }
}
cout << GetCounter() << endl << endl;

Time: 227330454.6

int test2[vectorsize];
memset(test2,0,sizeof(test2));

StartCounter();
for (int c = 0; c < 10000; c++){
    for(int i=0; i<vectorsize; i++){
        test2[i] = 5;
    }
}
cout << GetCounter() << endl << endl;
cout << test2[0];

Time: 212286228.2

So no, an array is NOT faster than a vector for steady-state access. It is just tricky to benchmark properly.
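
For readers without windows.h, the steady-state comparison can be sketched portably. This is a sketch under assumptions, not the code that produced the timings above: it uses std::chrono::steady_clock, the names time_passes, SteadyState and compare_steady_state are hypothetical, and it writes through a volatile pointer so the compiler cannot remove the stores as dead code. The volatile access also blocks vectorization, equally for both containers, so the absolute numbers are not comparable with the ones in this answer.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstring>
#include <vector>

// Runs `passes` full write passes over data[0..n) and returns total nanoseconds.
// The volatile qualifier keeps every store in the generated code.
long long time_passes(volatile int* data, std::size_t n, int passes) {
    assert(data != nullptr && passes > 0);
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (int c = 0; c < passes; ++c) {
        for (std::size_t i = 0; i < n; ++i) {
            data[i] = 5;
        }
    }
    const auto stop = clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}

// Hypothetical result type for the two measurements.
struct SteadyState {
    long long vector_ns;
    long long array_ns;
    int vector_last;
    int array_last;
};

SteadyState compare_steady_state(int passes) {
    constexpr std::size_t kSize = 100000;  // same element count as the question

    std::vector<int> vec(kSize);           // zeroed by the constructor
    int arr[kSize];                        // about 400 KB of stack
    std::memset(arr, 0, sizeof(arr));      // touch every page before timing

    SteadyState result;
    result.vector_ns = time_passes(vec.data(), kSize, passes);
    result.array_ns = time_passes(arr, kSize, passes);
    result.vector_last = vec[kSize - 1];
    result.array_last = arr[kSize - 1];
    return result;
}
```

Both buffers are fully written before the timer starts, so neither measurement includes allocation, zeroing or first-touch page faults; compare_steady_state(10000) matches the outer loop count used above.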

【Discussion】:

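What about starting the array's counter after the declaration too? That way both loops start the counter at the same point.
@Need4Sleep I did it both ways. There was no performance difference. That's because the array is on the stack and is allocated on function entry.
With both initializations removed from the timing, why would the vector loop be faster than the static array loop? I'm looking at the release assembly, and the static array loop has only 2x MOV, 1x LEA and 1 rep stos instruction?
@user997112 That's probably because, due to lazy allocation, the array may not have been paged in yet. So the OS needs to zero it when committing it, for security reasons.
@Mystical If you over-allocate, you may run the kernel's zero pool empty, at which point you may find your thread descheduled until the pool is replenished. A better test would be to time a one-page allocation and a one-page free, then sleep the thread between test instances to give the zero pool time to replenish. This behavior has existed since at least Windows 2000 and is documented in the Windows Internals book series.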