【Posted】: 2018-04-21 21:43:04
【Question】:
I have been using std::vector and am wondering whether I should switch to std::map for key lookups to improve performance.
Here is my complete test code.
#include <algorithm> // std::find_if
#include <iostream>
#include <string>
#include <map>
#include <vector>
#include <ctime>
#include <chrono>
using namespace std;

vector<string> myStrings = {"aaa", "bbb", "ccc", "ddd", "eee", "fff", "ggg", "hhh", "iii", "jjj", "kkk", "lll", "mmm", "nnn", "ooo", "ppp", "qqq", "rrr", "sss", "ttt", "uuu", "vvv", "www", "xxx", "yyy", "zzz"};

struct MyData {
    string key;
    int value;
};

// Linear search: returns the index of the element whose key equals str, or -1.
int findStringPosFromVec(const vector<MyData> &myVec, const string &str) {
    auto it = std::find_if(begin(myVec), end(myVec),
                           [&str](const MyData& data){ return data.key == str; });
    if (it == end(myVec))
        return -1;
    return static_cast<int>(it - begin(myVec));
}

int main(int argc, const char * argv[]) {
    const int testInstance = 10000; //HOW MANY TIMES TO PERFORM THE TEST
    const int keyCount = static_cast<int>(myStrings.size()); // avoids signed/unsigned comparisons
    //----------------------------std::map-------------------------------
    clock_t map_cputime = std::clock(); //START MEASURING THE CPU TIME
    for (int i = 0; i < testInstance; ++i) {
        map<string, int> myMap;
        //insert unique keys
        for (int j = 0; j < keyCount; ++j) {
            myMap[myStrings[j]] = j;
        }
        //iterate again, if key exists, replace value;
        for (int j = 0; j < keyCount; ++j) {
            if (myMap.find(myStrings[j]) != myMap.end())
                myMap[myStrings[j]] = j * 100;
        }
    }
    //FINISH MEASURING THE CPU TIME
    double map_cpu = (std::clock() - map_cputime) / (double)CLOCKS_PER_SEC;
    cout << "Map Finished in " << map_cpu << " seconds [CPU Clock] " << endl;
    //----------------------------std::vector-------------------------------
    clock_t vec_cputime = std::clock(); //START MEASURING THE CPU TIME
    for (int i = 0; i < testInstance; ++i) {
        vector<MyData> myVec;
        //insert unique keys
        for (int j = 0; j < keyCount; ++j) {
            const int pos = findStringPosFromVec(myVec, myStrings[j]);
            if (pos == -1)
                myVec.push_back({myStrings[j], j});
        }
        //iterate again, if key exists, replace value;
        for (int j = 0; j < keyCount; ++j) {
            const int pos = findStringPosFromVec(myVec, myStrings[j]);
            if (pos != -1)
                myVec[pos].value = j * 100;
        }
    }
    //FINISH MEASURING THE CPU TIME
    double vec_cpu = (std::clock() - vec_cputime) / (double)CLOCKS_PER_SEC;
    cout << "Vector Finished in " << vec_cpu << " seconds [CPU Clock] " << endl;
    return 0;
}
This is the result I get.
Map Finished in 0.38121 seconds [CPU Clock]
Vector Finished in 0.346863 seconds [CPU Clock]
Program ended with exit code: 0
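I usually store fewer than 30 elements in a container.
Does this mean that std::vector is a better choice than std::map in my case?
EDIT: When I move map<string, int> myMap; to before the loop, std::map is faster than std::vector.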
Map Finished in 0.278136 seconds [CPU Clock]
Vector Finished in 0.328548 seconds [CPU Clock]
Program ended with exit code: 0
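So if this is the correct test, I guess std::map is faster.
However, if I reduce the number of elements to 10, std::vector is faster, so I guess it really depends on the number of elements.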
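A possible explanation for the change seen in the edit, offered as a sketch rather than a measured conclusion: once the map is declared outside the loop it is never cleared, so only the first pass inserts new nodes and every later pass just overwrites existing values. The helper below (a hypothetical name, not part of the original benchmark) counts how many nodes are actually created.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: counts how many new nodes the map creates across
// `rounds` repetitions of the insert loop when the map is declared once,
// outside the loop, and never cleared.
std::size_t insertionsWithHoistedMap(const std::vector<std::string>& keys, int rounds) {
    std::map<std::string, int> hoisted;  // declared once, as in the edit
    std::size_t newNodes = 0;
    for (int r = 0; r < rounds; ++r) {
        for (std::size_t i = 0; i < keys.size(); ++i) {
            const std::size_t before = hoisted.size();
            hoisted[keys[i]] = static_cast<int>(i);  // inserts only if the key is absent
            newNodes += hoisted.size() - before;
        }
    }
    return newNodes;
}
```

If the count equals the number of distinct keys regardless of `rounds`, the hoisted version is mostly timing lookups and assignments, while the vector version still rebuilds its container on every pass.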
【Comments】:
-
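Your map timing includes populating the map many times. The vector timing does not. To get a fair comparison, populate the map outside the timed code.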
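A minimal sketch of what this comment suggests: build the container once, outside the timed region, and time only the lookups. The names `LookupTiming` and `timeMapLookups` are hypothetical, and `std::chrono::steady_clock` is used here in place of the question's `std::clock`.

```cpp
#include <chrono>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical result type for this sketch.
struct LookupTiming {
    std::size_t hits;  // number of successful lookups
    double seconds;    // wall-clock time spent in the lookup loop only
};

LookupTiming timeMapLookups(const std::vector<std::string>& keys, int rounds) {
    // Setup: populate the map once. This part is deliberately not timed.
    std::map<std::string, int> m;
    for (std::size_t i = 0; i < keys.size(); ++i)
        m[keys[i]] = static_cast<int>(i);

    std::size_t hits = 0;
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < rounds; ++r) {
        for (const std::string& key : keys) {
            if (m.find(key) != m.end())
                ++hits;  // using the result keeps the loop from being optimized away
        }
    }
    const auto stop = std::chrono::steady_clock::now();
    return {hits, std::chrono::duration<double>(stop - start).count()};
}
```

The same shape would apply to the vector: fill it first, then time a loop that only calls findStringPosFromVec.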
-
Use a benchmarking library so that you don't have to think about things like cache warm-up yourself.
-
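You are comparing a linear-time lookup with a logarithmic-time lookup. Come on, man, use the right data structure; you don't compare apples to peanuts.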
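For reference, a vector can also offer logarithmic lookups if it is kept sorted by key and searched with `std::lower_bound`. This is a sketch of that idea, not something the commenter proposed; `findStringPosSorted` is a hypothetical name, and the function assumes the caller keeps the vector sorted in ascending key order.

```cpp
#include <algorithm>
#include <string>
#include <vector>

struct MyData {
    std::string key;
    int value;
};

// Binary search over a vector that is ASSUMED to be sorted by key (ascending).
// Returns the index of the matching element, or -1 if the key is absent.
int findStringPosSorted(const std::vector<MyData>& sortedVec, const std::string& str) {
    auto it = std::lower_bound(
        sortedVec.begin(), sortedVec.end(), str,
        [](const MyData& data, const std::string& key) { return data.key < key; });
    if (it == sortedVec.end() || it->key != str)
        return -1;
    return static_cast<int>(it - sortedVec.begin());
}
```

Whether this beats a plain linear scan for fewer than 30 short strings is something only a measurement can settle; inserting into a sorted vector also costs a linear-time shift.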
-
You probably already know this, but unordered_map and unordered_set have much faster lookups than either of them (for large data sets).
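A minimal sketch of the question's two loops written against `std::unordered_map`, whose lookups take constant time on average. The function names `buildIndex` and `replaceExisting` are hypothetical. Note that the comment's claim is about large data sets; for fewer than 30 elements the hashing cost may outweigh the benefit, so this would need measuring.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Builds a hash map from each key to its position in `keys`.
std::unordered_map<std::string, int> buildIndex(const std::vector<std::string>& keys) {
    std::unordered_map<std::string, int> index;
    index.reserve(keys.size());  // avoid rehashing while inserting
    for (std::size_t i = 0; i < keys.size(); ++i)
        index[keys[i]] = static_cast<int>(i);
    return index;
}

// For every key that already exists, replaces its value with position * 100.
// Returns the number of entries updated. A single find() is used so that an
// existing key is hashed once instead of twice (find followed by operator[]).
std::size_t replaceExisting(std::unordered_map<std::string, int>& index,
                            const std::vector<std::string>& keys) {
    std::size_t updated = 0;
    for (std::size_t i = 0; i < keys.size(); ++i) {
        auto it = index.find(keys[i]);
        if (it != index.end()) {
            it->second = static_cast<int>(i) * 100;
            ++updated;
        }
    }
    return updated;
}
```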
Tags: c++ dictionary vector