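I've recently been adding statistics to my server framework. One item counts how many objects have been created and how many are still alive, so the object name is the natural key. Partway through, I started wondering whether std::string or const char * makes the more efficient key. All of the framework's business logic lives in Lua scripts, and only a handful of objects need counting on the C++ side, so in practice the choice hardly matters. What bugs me is that I've long known the two perform about the same, yet I've never understood why, so I wrote some code to test it.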
The V1 code is as follows:
#ifndef __MAP_H__
#define __MAP_H__

#include <cstddef> // size_t
#include <cstring> // strlen, strcmp

//--------------------------------------------------------------------------
// MurmurHash2, by Austin Appleby

// Note - This code makes a few assumptions about how your machine behaves -

// 1. We can read a 4-byte value from any address without crashing
// 2. sizeof(int) == 4

// And it has a few limitations -

// 1. It will not work incrementally.
// 2. It will not produce the same results on little-endian and big-endian
//    machines.

static inline
unsigned int MurmurHash2 ( const void * key, int len, unsigned int seed )
{
    // 'm' and 'r' are mixing constants generated offline.
    // They're not really 'magic', they just happen to work well.

    const unsigned int m = 0x5bd1e995;
    const int r = 24;

    // Initialize the hash to a 'random' value

    unsigned int h = seed ^ len;

    // Mix 4 bytes at a time into the hash

    const unsigned char * data = (const unsigned char *)key;

    while(len >= 4)
    {
        unsigned int k = *(unsigned int *)data;

        k *= m;
        k ^= k >> r;
        k *= m;

        h *= m;
        h ^= k;

        data += 4;
        len -= 4;
    }

    // Handle the last few bytes of the input array
    // (each case intentionally falls through to the next)

    switch(len)
    {
    case 3: h ^= data[2] << 16;
    case 2: h ^= data[1] << 8;
    case 1: h ^= data[0];
            h *= m;
    };

    // Do a few final mixes of the hash to ensure the last few
    // bytes are well-incorporated.

    h ^= h >> 13;
    h *= m;
    h ^= h >> 15;

    return h;
}

/* A hash for a custom type could also be provided as a std::hash
 * specialization; not doing that for now
 * https://en.cppreference.com/w/cpp/utility/hash
 */

/* the default hash function in libstdc++:MurmurHashUnaligned2
 * https://sites.google.com/site/murmurhash/ by Austin Appleby
 * other hash function(djb2,sdbm) http://www.cse.yorku.ca/~oz/hash.html
 */

/* hash function for const char*: hashes the characters, not the pointer */
struct hash_c_string
{
    size_t operator()(const char *ctx) const
    {
        // MurmurHash2 takes an int length and an unsigned int seed
        return MurmurHash2(
            ctx, static_cast<int>(strlen(ctx)), 0xc70f6907U);
    }
};

/* ordering function for const char* (used by std::map) */
struct cmp_const_char
{
    bool operator()(const char *a, const char *b) const
    {
        return std::strcmp(a, b) < 0;
    }
};

/* equality function for const char* (used by std::unordered_map) */
struct equal_c_string
{
    bool operator()(const char *a, const char *b) const
    {
        return 0 == std::strcmp(a, b);
    }
};

/* Use map_t when a hash map is wanted but older compilers must still be
 * supported
 */
#if __cplusplus < 201103L /* pre-C++11, e.g. -std=c++98 or -std=gnu++03 */
    #include <map>
    #define map_t    std::map
    #define const_char_map_t(T) std::map<const char *,T,cmp_const_char>
#else                     /* C++11 or later */
    #include <unordered_map>
    #define map_t    std::unordered_map
    // TODO: template<class T> using const_char_map_t = ...,
    // but C++03 does not support alias templates
    #define const_char_map_t(T)    \
        std::unordered_map<const char *,T,hash_c_string,equal_c_string>
#endif

#endif /* __MAP_H__ */
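To see why the header supplies its own hash and equality functors for const char * keys, here is a minimal sketch, assuming a C++11 compiler. By default, std::unordered_map<const char *, T> hashes and compares the pointer value itself, so the same text stored in two different buffers ends up as two separate keys. The names fnv1a_c_string, equal_c_string_demo, distinct_keys and run_demo are hypothetical and exist only for this illustration. To keep it short, the hash is FNV-1a rather than the MurmurHash2 used above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <unordered_map>

// Hypothetical stand-in for hash_c_string: 32-bit FNV-1a over the characters
// (swapped in for MurmurHash2 purely to keep this sketch self-contained).
struct fnv1a_c_string
{
    std::size_t operator()(const char *s) const
    {
        unsigned int h = 2166136261u;          // FNV offset basis
        for (; *s; ++s)
        {
            h ^= static_cast<unsigned char>(*s);
            h *= 16777619u;                    // FNV prime
        }
        return h;
    }
};

// Hypothetical stand-in for equal_c_string: compares contents, not addresses.
struct equal_c_string_demo
{
    bool operator()(const char *a, const char *b) const
    {
        return 0 == std::strcmp(a, b);
    }
};

// Keys compared by content (what const_char_map_t aims for).
typedef std::unordered_map<const char *, int,
                           fnv1a_c_string, equal_c_string_demo> content_map_t;
// Keys compared by pointer value (the default behaviour).
typedef std::unordered_map<const char *, int> pointer_map_t;

// Inserts the same text from two distinct buffers and reports how many
// distinct keys the map ends up with.
template <class Map>
std::size_t distinct_keys()
{
    char a[] = "Socket";
    char b[] = "Socket";   // same text, different address

    Map m;
    ++m[a];
    ++m[b];
    return m.size();
}

inline void run_demo()
{
    assert(distinct_keys<content_map_t>() == 1); // one counter per name
    assert(distinct_keys<pointer_map_t>() == 2); // one counter per buffer
}
```

One caveat with const char * keys: the map stores only the pointer, so the string it points to must outlive the map entry. String literals and other static names satisfy this, whereas a std::string key owns its own copy of the text.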