When is a select faster than a for loop?

【Question title】: When is a select faster than a for loop?
【Posted】: 2014-10-13 18:54:21
【Question description】:

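Imagine this situation: I have the Id of an entity, and I have a linked list containing some entities of that same class. To find the entity, when is it faster to run a for loop over the linked list, and when is it faster to run a select against the database?

For example, with about 5 entities in the linked list I would expect the for loop to be faster, and with millions of entities I would expect the select to be faster.

But at what point does the select start to be faster than the for loop?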

【Question discussion】:

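The fastest option seems to be the underlying database's non-standard recursive select mechanism (for example, common table expressions in SQL Server). I'm not sure whether Hibernate can be made to take advantage of it.
There are in-memory data structures that are more efficient than scanning a list. In fact, your database library ultimately uses some algorithm that you could use directly in your own code.
Why would you have millions of entities in memory, in a linked list? I can't imagine a good reason to end up in that situation. In general, running a few tests against objects already loaded in memory is much faster than executing an SQL query, which has to go through the database, over the network, or at least across processes. That may not hold for millions of entities (because the database can use an index that finds rows faster than iteration does), but that situation shouldn't arise anyway.
A linked list is one of the slowest collections to search; I was just considering the worst case. The real situation is a Faces component that holds a list of entities. In my converter, I can either find the entity in this list or search for it in the database.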

Tags: java database performance hibernate jpa

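【Solution 1】:

An in-memory data structure is always faster than opening a socket, running a query, and sending back the response.

A good query might run in 10-100 milliseconds, while a Java operation might take 100 nanoseconds.

Using a LinkedList probably won't give the best performance. Much like using a database index, you can use a HashMap instead and map the entities by their id: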

Map<Long, Entity> idEntityMap = new HashMap<>();
idEntityMap.put(entity.getId(), entity);

So when you search for an entity, you just run:

Entity entity = idEntityMap.get(entityId);

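The call first identifies the Map bucket the entity belongs to, and then does object comparisons only against the entries stored in that bucket.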
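To make the two steps concrete, here is a toy bucketed table. It is only a teaching sketch under my own assumptions: `ToyIdTable`, `Entry`, and the fixed bucket count are made up for illustration, and this is not how `java.util.HashMap` is actually implemented (the real class also resizes itself and handles duplicate keys).

```java
import java.util.ArrayList;
import java.util.List;

/** Toy fixed-size hash table keyed by a long id; a teaching sketch, not java.util.HashMap. */
final class ToyIdTable {
    /** Hypothetical entry: an id plus a payload. */
    static final class Entry {
        final long id;
        final String name;

        Entry(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    private final List<List<Entry>> buckets = new ArrayList<>();
    int comparisons; // number of id comparisons made by the most recent get()

    ToyIdTable(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    /** Step 1: map the key's hash code to a bucket index. */
    private int bucketIndex(long id) {
        return Math.floorMod(Long.hashCode(id), buckets.size());
    }

    /** Appends to the bucket; unlike a real map, it does not replace an existing id. */
    void put(long id, String name) {
        buckets.get(bucketIndex(id)).add(new Entry(id, name));
    }

    /** Step 2: compare only against the entries stored in that one bucket. */
    String get(long id) {
        comparisons = 0;
        for (Entry e : buckets.get(bucketIndex(id))) {
            comparisons++;
            if (e.id == id) {
                return e.name;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ToyIdTable table = new ToyIdTable(1024);
        for (long id = 0; id < 100_000; id++) {
            table.put(id, "entity-" + id);
        }
        System.out.println(table.get(99_999L));
        System.out.println("comparisons: " + table.comparisons);
    }
}
```

With 100,000 entries spread over 1,024 buckets, a lookup compares against roughly a hundred entries instead of scanning all 100,000. A real HashMap keeps buckets much smaller by growing the table as entries are added.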

In summary, in-memory operations are very fast, but you need to use a data structure that suits your use case.
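As a rough illustration of that choice, the sketch below puts the linear scan from the question next to an id index built once from the same list. The `Entity` class here is a hypothetical stand-in with only an id, and the helper names are mine, not from any library.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class EntityLookupSketch {
    /** Hypothetical stand-in for the entity class in the question. */
    static final class Entity {
        private final Long id;

        Entity(Long id) {
            this.id = id;
        }

        Long getId() {
            return id;
        }
    }

    /** Linear scan: O(n) per lookup, the "for" approach from the question. */
    static Entity findByScan(List<Entity> entities, Long id) {
        for (Entity e : entities) {
            if (e.getId().equals(id)) {
                return e;
            }
        }
        return null;
    }

    /** Builds the index once in O(n); each later get() is O(1) on average. */
    static Map<Long, Entity> indexById(List<Entity> entities) {
        Map<Long, Entity> byId = new HashMap<>();
        for (Entity e : entities) {
            byId.put(e.getId(), e);
        }
        return byId;
    }

    public static void main(String[] args) {
        List<Entity> entities = new ArrayList<>();
        for (long id = 1; id <= 5; id++) {
            entities.add(new Entity(id));
        }
        Map<Long, Entity> byId = indexById(entities);

        // Both approaches return the same object; only the cost per lookup differs.
        System.out.println(findByScan(entities, 3L) == byId.get(3L));
    }
}
```

Building the index only pays off if you look entities up more than once. For a single lookup in a short list, the plain scan is already cheap.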

【Discussion】:

【Solution 2】:

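On my machine*, the select becomes faster at around 1e5 elements. The baseline for selecting from a table with a single record through JPA is 1 ms.

Confidence interval (99.9%): [0.747, 1.087] ms

Selecting from a table of 1e5 records takes about the same time (1 ms), with the lower bound rising by 0.1 ms.

A sequential search through a list of 1e5 elements, with the required element placed at the end, takes 2 ms.

Confidence interval (99.9%): [2073669.347, 2514442.020] ns

Divide this by two to amortize for a random position of the required element.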
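For anyone who wants to redo the arithmetic, here is a back-of-the-envelope sketch. It rests on assumptions of mine that the measurements do not establish: scan time grows linearly with list size, the select cost stays constant, and the midpoints of the two reported intervals are representative. The class and method names are made up for illustration.

```java
final class BreakEvenSketch {
    /** Per-element scan cost in ns, assuming scan time grows linearly with list size. */
    static double nanosPerElement(double scanNanos, int elementsScanned) {
        return scanNanos / elementsScanned;
    }

    /**
     * List size at which a scan that visits the given fraction of the list
     * costs as much as one select (1.0 = target at the end, 0.5 = random position).
     */
    static double breakEvenSize(double selectNanos, double nanosPerElement, double fractionScanned) {
        return selectNanos / (nanosPerElement * fractionScanned);
    }

    public static void main(String[] args) {
        // Midpoint of the reported list-scan interval, in ns.
        double scanNanos = (2_073_669.347 + 2_514_442.020) / 2;
        // Midpoint of the reported JPA interval, converted from ms to ns.
        double selectNanos = (0.747 + 1.087) / 2 * 1_000_000;

        double perElement = nanosPerElement(scanNanos, 100_000);
        System.out.printf("per element: %.1f ns%n", perElement);
        System.out.printf("break-even, target at end: %.0f elements%n",
                breakEvenSize(selectNanos, perElement, 1.0));
        System.out.printf("break-even, random position: %.0f elements%n",
                breakEvenSize(selectNanos, perElement, 0.5));
    }
}
```

Under those assumptions the crossover lands at roughly 40,000 elements when the target is last and roughly 80,000 for a random position, which is the same order of magnitude as the 1e5 figure. Treat it as an estimate for this one machine, not a general threshold.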

FYI, getting an entity from a map of 1e5 elements takes less than 30 ns.

Here is my benchmark, for peer review:

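import java.util.*;
import java.util.concurrent.TimeUnit;
import javax.persistence.*;
import org.openjdk.jmh.annotations.*;

// The class name is a placeholder. Individual is the JPA entity being looked up
// (definition not shown); it has an int id field and an Individual(int) constructor.
@State(Scope.Benchmark)
public class LookupBenchmark {

    EntityManagerFactory emf;
    List<Individual> list;
    Map<Integer, Individual> map;

    @Setup
    public void setUp() {
        emf = Persistence.createEntityManagerFactory("postgres");
        list = new ArrayList<Individual>();
        map = new HashMap<Integer, Individual>();

        // Count down so that the element with id 1 ends up last in the list (worst case for the scan).
        for (int i = (int) 1e5; i > 0; i--) {
            Individual individual = new Individual(i);
            list.add(individual);
            map.put(i, individual);
        }
    }

    @TearDown
    public void tearDown() {
        emf.close();
    }

    // Select by id through JPA; a new EntityManager and transaction are created on every call.
    @Benchmark
    @Fork(1)
    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.MILLISECONDS)
    public Individual measureJpa() {
        EntityManager em = emf.createEntityManager();

        Query query = em.createQuery("select i from com.company.Individual i where i.id = 1");

        em.getTransaction().begin();

        Individual individual = (Individual) query.getSingleResult();

        em.getTransaction().commit();

        em.close();

        return individual;
    }

    // Sequential scan of the list; the match is the last element.
    @Benchmark
    @Fork(1)
    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    public Individual measureList() {
        for (Individual i : list) {
            if (i.id == 1) {
                return i;
            }
        }
        throw new RuntimeException();
    }

    // Hash lookup by id.
    @Benchmark
    @Fork(1)
    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    public Individual measureMap() {
        return map.get(1);
    }
}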

* Dual-core 1.5 GHz CPU, 7200 rpm HDD, 600 MHz RAM, Windows 7 x64, JDK 1.8.0_05, EclipseLink 2.5.1, PostgreSQL 9.3

【Discussion】: