Dictionary vs List for looking up an item: answers

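【Question title】: Dictionary vs List for looking up an item
【Posted】: 2013-08-15 08:08:40
【Question】:

I was under the impression that looking up an item in a dictionary is faster than looking it up in a list, but the following code seems to suggest otherwise:

Dictionary: 66 ticks

List: 32 ticks

I assume I've messed up somewhere?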

static void Main(string[] args)
    {
        // Speed test.
        Dictionary<string, int> d = new Dictionary<string, int>()
        {
            {"P1I1-1MS    P2I1-1MS    3I-1MS    4I-1MS", 2},
            {"P1I2-1MS    P2I1-1MS    3I-1MS    4I-1MS", 1},
            {"P1I3-1MS    P2I1-1MS    3I-1MS    4I-1MS", 0},
            {"P1I4-1MS    P2I1-1MS    3I-1MS    4I-1MS", -1},
            {"P1I5-1MS    P2I1-1MS    3I-1MS    4I-1MS", 0},
            {"P1I1-1MS    P2I2-1MS    3I-1MS    4I-1MS", 0},
            {"P1I2-1MS    P2I2-1MS    3I-1MS    4I-1MS", 0},
            {"P1I3-1MS    P2I2-1MS    3I-1MS    4I-1MS", 0},
            {"P1I4-1MS    P2I2-1MS    3I-1MS    4I-1MS", 0},
            {"P1I5-1MS    P2I2-1MS    3I-1MS    4I-1MS", 0},
            {"P1I1-1MS    P2I3-1MS    3I-1MS    4I-1MS", 0},
            {"P1I2-1MS    P2I3-1MS    3I-1MS    4I-1MS", 0},
            {"P1I3-1MS    P2I3-1MS    3I-1MS    4I-1MS", 0},
            {"P1I4-1MS    P2I3-1MS    3I-1MS    4I-1MS", 0},
            {"P1I5-1MS    P2I3-1MS    3I-1MS    4I-1MS", 0},
            {"P1I1-1MS    P2I4-1MS    3I-1MS    4I-1MS", 2} 
        };

        List<string> l = new List<string>();
            l.Add("P1I1-1MS    P2I1-1MS    3I-1MS    4I-1MS");
            l.Add("P1I2-1MS    P2I1-1MS    3I-1MS    4I-1MS");
            l.Add("P1I3-1MS    P2I1-1MS    3I-1MS    4I-1MS");
            l.Add("P1I4-1MS    P2I1-1MS    3I-1MS    4I-1MS");
            l.Add("P1I5-1MS    P2I1-1MS    3I-1MS    4I-1MS");
            l.Add("P1I1-1MS    P2I2-1MS    3I-1MS    4I-1MS");
            l.Add("P1I2-1MS    P2I2-1MS    3I-1MS    4I-1MS");
            l.Add("P1I3-1MS    P2I2-1MS    3I-1MS    4I-1MS");
            l.Add("P1I4-1MS    P2I2-1MS    3I-1MS    4I-1MS");
            l.Add("P1I5-1MS    P2I2-1MS    3I-1MS    4I-1MS");
            l.Add("P1I1-1MS    P2I3-1MS    3I-1MS    4I-1MS");
            l.Add("P1I2-1MS    P2I3-1MS    3I-1MS    4I-1MS");
            l.Add("P1I3-1MS    P2I3-1MS    3I-1MS    4I-1MS");
            l.Add("P1I4-1MS    P2I3-1MS    3I-1MS    4I-1MS");
            l.Add("P1I5-1MS    P2I3-1MS    3I-1MS    4I-1MS");
            l.Add("P1I1-1MS    P2I4-1MS    3I-1MS    4I-1MS");


        Stopwatch sw = new Stopwatch();

        string temp = "P1I1-1MS    P2I4-1MS    3I-1MS    4I-1MS";

        bool inDictionary = false;

        sw.Start();
        if (d.ContainsKey(temp))
        {
            sw.Stop();
            inDictionary = true;
        }
        else sw.Reset();

        Console.WriteLine(sw.ElapsedTicks.ToString());
        Console.WriteLine(inDictionary.ToString());


        bool inList = false;

        sw.Reset();
        sw.Start();
        if (l.Contains(temp))
        {
            sw.Stop();
            inList = true;
        }
        else sw.Reset();

        Console.WriteLine(sw.ElapsedTicks.ToString());
        Console.WriteLine(inList.ToString());

        Console.ReadLine();
    }

Edit

Amended version of Matthew Watson's code.

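【Comments】:

Are you running this test in the debugger or as a standalone executable? And in which mode, Debug or Release?
Yes, it's running in the debugger (Debug). .NET 4 Client Profile.
Bad idea; you should run it in Release mode as well.
A single execution and a single case rarely give valid data. You should run the search 10k times and have it search for different matches. Your keys are all the same length and share much of the same data, which may reduce some of the dictionary's benefit.
@HansRudel I just ran the test in a Release build, outside the debugger, over 1000000 iterations, and the dictionary is definitely faster: Dict: 97628.10ns vs List: 10396.84ns. Not much of a difference, though. I suspect this is due to the relatively small size of each collection.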

Tags: c# performance list c#-4.0 dictionary

【Answer 1】:

Here is a proper test:

Dictionary<string, int> d = new Dictionary<string, int>()
{
    {"P1I1-1MS    P2I1-1MS    3I-1MS    4I-1MS", 2},
    {"P1I2-1MS    P2I1-1MS    3I-1MS    4I-1MS", 1},
    {"P1I3-1MS    P2I1-1MS    3I-1MS    4I-1MS", 0},
    {"P1I4-1MS    P2I1-1MS    3I-1MS    4I-1MS", -1},
    {"P1I5-1MS    P2I1-1MS    3I-1MS    4I-1MS", 0},
    {"P1I1-1MS    P2I2-1MS    3I-1MS    4I-1MS", 0},
    {"P1I2-1MS    P2I2-1MS    3I-1MS    4I-1MS", 0},
    {"P1I3-1MS    P2I2-1MS    3I-1MS    4I-1MS", 0},
    {"P1I4-1MS    P2I2-1MS    3I-1MS    4I-1MS", 0},
    {"P1I5-1MS    P2I2-1MS    3I-1MS    4I-1MS", 0},
    {"P1I1-1MS    P2I3-1MS    3I-1MS    4I-1MS", 0},
    {"P1I2-1MS    P2I3-1MS    3I-1MS    4I-1MS", 0},
    {"P1I3-1MS    P2I3-1MS    3I-1MS    4I-1MS", 0},
    {"P1I4-1MS    P2I3-1MS    3I-1MS    4I-1MS", 0},
    {"P1I5-1MS    P2I3-1MS    3I-1MS    4I-1MS", 0},
    {"P1I1-1MS    P2I4-1MS    3I-1MS    4I-1MS", 2} 
};

List<string> l = new List<string>
{
    "P1I1-1MS    P2I1-1MS    3I-1MS    4I-1MS", 
    "P1I2-1MS    P2I1-1MS    3I-1MS    4I-1MS", 
    "P1I3-1MS    P2I1-1MS    3I-1MS    4I-1MS", 
    "P1I4-1MS    P2I1-1MS    3I-1MS    4I-1MS", 
    "P1I5-1MS    P2I1-1MS    3I-1MS    4I-1MS",
    "P1I1-1MS    P2I2-1MS    3I-1MS    4I-1MS", 
    "P1I2-1MS    P2I2-1MS    3I-1MS    4I-1MS",
    "P1I3-1MS    P2I2-1MS    3I-1MS    4I-1MS",
    "P1I4-1MS    P2I2-1MS    3I-1MS    4I-1MS",
    "P1I5-1MS    P2I2-1MS    3I-1MS    4I-1MS", 
    "P1I1-1MS    P2I3-1MS    3I-1MS    4I-1MS", 
    "P1I2-1MS    P2I3-1MS    3I-1MS    4I-1MS",
    "P1I3-1MS    P2I3-1MS    3I-1MS    4I-1MS", 
    "P1I4-1MS    P2I3-1MS    3I-1MS    4I-1MS",
    "P1I5-1MS    P2I3-1MS    3I-1MS    4I-1MS", 
    "P1I1-1MS    P2I4-1MS    3I-1MS    4I-1MS"
};

int trials = 4;
int iters  = 1000000;

Stopwatch sw = new Stopwatch();

string target = "P1I1-1MS    P2I4-1MS    3I-1MS    4I-1MS";

for (int trial = 0; trial < trials; ++trial)
{
    sw.Restart();

    for (int i = 0; i < iters; ++i)
        d.ContainsKey(target);

    sw.Stop();
    Console.WriteLine("Dictionary took " + sw.Elapsed);
    sw.Restart();

    for (int i = 0; i < iters; ++i)
        l.Contains(target);

    sw.Stop();
    Console.WriteLine("List took " + sw.Elapsed);
}

Run a RELEASE build of this, outside any debugger.

My results:

Dictionary took 00:00:00.0587588
List took 00:00:00.2018361
Dictionary took 00:00:00.0578586
List took 00:00:00.2003057
Dictionary took 00:00:00.0611053
List took 00:00:00.2033325
Dictionary took 00:00:00.0583175
List took 00:00:00.2056591

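The dictionary is clearly faster, even with this few entries.

The more entries there are, the larger the Dictionary's advantage over the List becomes.

A Dictionary lookup takes O(1) time on average, whereas a List lookup takes O(N). This of course makes a huge difference for larger values of N.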
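The O(1) versus O(N) claim can be illustrated without a stopwatch by counting key comparisons. The following is a minimal sketch, not part of the original answer: `CountingComparer` and `LookupCost` are hypothetical names introduced here, and the keys `"key0"`, `"key1"`, ... are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not from the original post): an ordinal string
// comparer that counts how many times Equals is called.
sealed class CountingComparer : IEqualityComparer<string>
{
    public int EqualsCalls;

    public bool Equals(string x, string y)
    {
        EqualsCalls++;
        return string.Equals(x, y, StringComparison.Ordinal);
    }

    public int GetHashCode(string s) => s.GetHashCode();
}

static class LookupCost
{
    // Returns the number of key comparisons needed to find the last of n keys
    // in a list (linear scan) and in a dictionary (hash lookup).
    public static (int List, int Dictionary) Measure(int n)
    {
        List<string> keys = Enumerable.Range(0, n).Select(i => "key" + i).ToList();
        string target = keys[n - 1];                 // worst case for a linear scan

        var listCounter = new CountingComparer();
        // LINQ's Contains overload accepts a comparer; List<T>.Contains does not.
        bool inList = keys.Contains(target, listCounter);

        var dictCounter = new CountingComparer();
        var dict = new Dictionary<string, int>(dictCounter);
        foreach (string k in keys) dict[k] = 0;
        dictCounter.EqualsCalls = 0;                 // ignore comparisons made while filling
        bool inDict = dict.ContainsKey(target);

        if (!inList || !inDict) throw new InvalidOperationException("target should be present");
        return (listCounter.EqualsCalls, dictCounter.EqualsCalls);
    }
}
```

For `LookupCost.Measure(1000)` the list count equals the number of items scanned (1000 here), while the dictionary count stays at one or a handful regardless of `n`, depending only on hash collisions.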

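【Comments】:

Hi, thanks for your code. I've run it and get similar timings to yours. I've just modified it so that, in each for loop, it picks a new target via target = l[random.Next(l.Count)]; but now the times are all the same. Any idea why? See the picture in the edit to the original question.
@HansRudel Well, firstly, you neglected to add braces {}, so both loops are only accessing the list (and doing nothing with the dictionary inside the loop). Secondly, if you do add a random lookup inside the loop, you also add the time needed to index the List<> to the loop, which will skew the results. Also, once you add a random component to the lookups, you destroy any consistency: the targets randomly chosen for one loop might be "worse" than those randomly chosen for the other.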
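One way to address the points in this comment is to pick the random targets before the timer starts and reuse the same targets for both collections. This is only a sketch under those assumptions; `RandomTargetBenchmark`, `PickTargets`, and `Time` are hypothetical names, not code from the thread.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

static class RandomTargetBenchmark
{
    // Chooses `count` random targets up front, so that Random.Next and the
    // list indexing are not part of the timed region.
    public static string[] PickTargets(IReadOnlyList<string> items, int count, int seed)
    {
        var random = new Random(seed);
        var targets = new string[count];
        for (int i = 0; i < count; ++i)
        {
            targets[i] = items[random.Next(items.Count)];
        }
        return targets;
    }

    // Times `lookup` over the pre-generated targets.
    public static TimeSpan Time(string[] targets, Func<string, bool> lookup)
    {
        int hits = 0;
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < targets.Length; ++i)
        {
            // The braces keep the whole body inside the loop; without them
            // only the first statement after `for` would be repeated.
            if (lookup(targets[i]))
            {
                ++hits;
            }
        }
        sw.Stop();
        if (hits != targets.Length) throw new InvalidOperationException("every target should be found");
        return sw.Elapsed;
    }
}
```

With the `d` and `l` collections from the answer, one would call `PickTargets(l, 1000000, 42)` once and then pass the same array to `Time(targets, d.ContainsKey)` and `Time(targets, l.Contains)`. Both measurements then see identical targets; the delegate call adds the same small overhead to each.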
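"you neglected to add braces {}"... D'oh.
Updated the timings. Regarding your second point, I'd think that with enough iterations this becomes negligible. Anyway, thanks for your help, I appreciate it.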

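【Answer 2】:

Dictionary is faster than List in the asymptotic sense of the word: O(1) < O(N), so for a large enough number of items Dictionary will be faster. With very few items (say, 5), List may be faster. Dictionary has one more catch: it relies on the GetHashCode() method, and if GetHashCode() is badly implemented, the Dictionary can be very slow.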
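To make the GetHashCode() caveat concrete, here is a small sketch with a deliberately bad hash. `BadKey` and `BadHashDemo` are hypothetical types invented for this illustration; they count equality checks rather than measure time.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key type: Equals is correct, but every instance returns the
// same hash code, so all keys collide.
sealed class BadKey : IEquatable<BadKey>
{
    public static int EqualsCalls;

    public string Value { get; }

    public BadKey(string value) { Value = value; }

    public bool Equals(BadKey other)
    {
        EqualsCalls++;
        return other != null && Value == other.Value;
    }

    public override bool Equals(object obj) => Equals(obj as BadKey);

    public override int GetHashCode() => 1;   // legal, but defeats hashing
}

static class BadHashDemo
{
    // Returns how many equality checks one failed lookup needs
    // in a dictionary holding n colliding keys.
    public static int ComparisonsForLookup(int n)
    {
        var dict = new Dictionary<BadKey, int>();
        for (int i = 0; i < n; ++i)
        {
            dict[new BadKey("key" + i)] = i;
        }

        BadKey.EqualsCalls = 0;                    // ignore the cost of filling
        dict.ContainsKey(new BadKey("missing"));   // walks the whole collision chain
        return BadKey.EqualsCalls;
    }
}
```

Because every stored key lands in the same chain, `ComparisonsForLookup(n)` grows linearly with `n`, which is the List-like behaviour the answer warns about.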

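【Comments】:

Am I right to assume that "P1I4-1MS P2I2-1MS 3I-1MS 4I-1MS" is a bad key? Each object I deal with contains x inputs, and the list of those inputs = 1 permutation => is unique. So I thought this might be a good way to store the objects; I have read about hash collisions.
No, the default GetHashCode() implementation (in your case, String.GetHashCode()) is quite good; it seems you are testing the dictionary/list with too few items. Try adding, say, 100000 different keys/items to the list and the dictionary...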
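The suggestion to try a much larger collection could be sketched as follows. The key format `"item-0"`, `"item-1"`, ... and the `LargeCollectionBenchmark` name are assumptions made for this example; the timings it produces depend on the machine and should only be compared after a Release build run outside the debugger.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

static class LargeCollectionBenchmark
{
    // Fills a list and a dictionary with itemCount distinct keys, then times
    // `lookups` searches for the last key in each collection.
    public static (TimeSpan Dictionary, TimeSpan List) Run(int itemCount, int lookups)
    {
        var list = new List<string>(itemCount);
        var dict = new Dictionary<string, int>(itemCount);
        for (int i = 0; i < itemCount; ++i)
        {
            string key = "item-" + i;
            list.Add(key);
            dict.Add(key, i);
        }

        string target = "item-" + (itemCount - 1);   // last item: worst case for the list
        int found = 0;

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < lookups; ++i)
        {
            if (dict.ContainsKey(target)) ++found;
        }
        TimeSpan dictTime = sw.Elapsed;

        sw.Restart();
        for (int i = 0; i < lookups; ++i)
        {
            if (list.Contains(target)) ++found;
        }
        TimeSpan listTime = sw.Elapsed;

        if (found != 2 * lookups) throw new InvalidOperationException("target should always be found");
        return (dictTime, listTime);
    }
}
```

A call such as `LargeCollectionBenchmark.Run(100000, 1000)` keeps the list side to about 10^8 string comparisons, which is enough to show the gap without a long wait.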

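【Answer 3】:

This is due to the law of large numbers. In short:

According to the law, the average of the results obtained from a large number of trials should be close to the expected value, and will tend to become closer as more trials are performed.

Another constraint is Big-O notation, which is practically useless at small scales. For example, for a given n below some small number, you can say O(1) ~ O(N) ~ O(n!).

Running a good experiment requires meeting some fairly strict conditions, for example:

Make sure the algorithms are comparable
Use a large number of iterations of the algorithm
Run on the same hardware
Run in Release mode with maximum optimization
Do not attach any debugger or profiling tool
Run the experiment several times and compute the mean + standard deviation
...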
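The last item in the list above, repeating the experiment and reporting mean and standard deviation, might look like this. It is a minimal sketch: `TrialStats`, `Measure`, and `Summarize` are hypothetical names, and the untimed warm-up call is an added assumption to keep JIT compilation out of the samples.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

static class TrialStats
{
    // Runs `action` once untimed as a warm-up, then `trials` timed runs,
    // and returns the per-trial durations in milliseconds.
    public static double[] Measure(Action action, int trials)
    {
        action();   // warm-up so JIT compilation is not counted
        var samples = new double[trials];
        for (int t = 0; t < trials; ++t)
        {
            var sw = Stopwatch.StartNew();
            action();
            sw.Stop();
            samples[t] = sw.Elapsed.TotalMilliseconds;
        }
        return samples;
    }

    // Sample mean and sample standard deviation (n - 1 in the denominator).
    public static (double Mean, double StdDev) Summarize(double[] samples)
    {
        double mean = samples.Average();
        if (samples.Length < 2) return (mean, 0.0);
        double sumSq = samples.Sum(x => (x - mean) * (x - mean));
        return (mean, Math.Sqrt(sumSq / (samples.Length - 1)));
    }
}
```

Wrapping each lookup loop in an `Action` and passing it to `Measure` yields one sample per trial; a standard deviation that is large relative to the mean suggests the measurement is too noisy to draw conclusions from.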

【Comments】:

All very valid points. I'll keep a checklist of them next time. Thanks