【Question Title】: Postgres Explain - how to optimize
【Posted】: 2021-11-21 12:53:51
【Question】:

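Performance keeps getting worse. Using EXPLAIN, I see a sequential scan inside a nested loop, which is probably the performance problem. What I don't know is: how do I improve it?

Here is a link to the query and the EXPLAIN output: https://explain.depesz.com/s/zmzp. I have also included them here:

Query: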

```sql
SELECT
    "assets".*
FROM
    "assets"
    INNER JOIN "devices" ON "devices"."asset_id" = "assets"."id"
WHERE
    "assets"."archived_at" IS NULL
    AND "assets"."archive_number" IS NULL
    AND "assets"."assettype_id" = 3
    AND ((assets.lastseendate >= NOW() - INTERVAL '30 days')
        AND ((devices.stop_time IS NULL)
            OR (devices.stop_time >= NOW() - INTERVAL '30 days')
            OR (devices.launch_time IS NOT NULL
                AND devices.launch_time > devices.stop_time)))
```

Here is the EXPLAIN output:

Nested Loop  (cost=0.43..255815.01 rows=11889 width=218) (actual time=0.049..2187.719 rows=359445 loops=1)
  Buffers: shared hit=1499737 read=75
  I/O Timings: read=5.382
  ->  Seq Scan on assets  (cost=0.00..117666.24 rows=27484 width=218) (actual time=0.035..770.720 rows=359543 loops=1)
        Filter: ((archived_at IS NULL) AND (archive_number IS NULL) AND (assettype_id = 3) AND (lastseendate >= (now() - 'P30D'::interval)))
        Rows Removed by Filter: 2539219
        Buffers: shared hit=59691
  ->  Index Scan using devices_asset_id_ix on devices  (cost=0.43..5.02 rows=1 width=4) (actual time=0.003..0.003 rows=1 loops=359543)
        Index Cond: (asset_id = assets.id)
        Filter: ((stop_time IS NULL) OR (stop_time >= (now() - 'P30D'::interval)) OR ((launch_time IS NOT NULL) AND (launch_time > stop_time)))
        Rows Removed by Filter: 0
        Buffers: shared hit=1440046 read=75
        I/O Timings: read=5.382
Planning Time: 1.055 ms
Execution Time: 2264.396 ms
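
A plan in this form, with actual timings and buffer counts, comes from running the query under `EXPLAIN (ANALYZE, BUFFERS)`. The following is a minimal sketch of how that might be done for the query in this question. The `I/O Timings` lines only appear when `track_io_timing` is enabled, and changing that setting requires sufficient privileges, so treat the first statement as optional.

```sql
-- Optional: enables the "I/O Timings" lines in the plan.
-- Changing this setting needs superuser or an equivalent SET privilege.
SET track_io_timing = on;

-- ANALYZE actually executes the query, so the timings are real measurements.
EXPLAIN (ANALYZE, BUFFERS)
SELECT
    "assets".*
FROM
    "assets"
    INNER JOIN "devices" ON "devices"."asset_id" = "assets"."id"
WHERE
    "assets"."archived_at" IS NULL
    AND "assets"."archive_number" IS NULL
    AND "assets"."assettype_id" = 3
    AND assets.lastseendate >= NOW() - INTERVAL '30 days'
    AND (devices.stop_time IS NULL
         OR devices.stop_time >= NOW() - INTERVAL '30 days'
         OR (devices.launch_time IS NOT NULL
             AND devices.launch_time > devices.stop_time));
```

Running it two or three times in a row helps separate cold-cache from warm-cache timings.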

The only relevant index is this one:

devices_asset_id_ix

Update: I added the indexes listed here:

add_index :devices, [:asset_id, :stop_time, :launch_time], name: "device_online_idx"
add_index :devices, [:asset_id, :stop_time]
add_index :devices, [:asset_id, :launch_time]
add_index :devices, :stop_time
add_index :devices, :launch_time

add_index :assets, [:assettype_id, :archived_at, :archive_number, :lastseendate], name: "asset_unexpired_idx"
add_index :assets, :assettype_id
add_index :assets, :archived_at
add_index :assets, :archive_number
add_index :assets, :lastseendate
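
The `add_index` calls above are Rails migration helpers. As a rough sketch, assuming the default b-tree index type and no extra options, the two explicitly named composite indexes correspond to SQL along these lines:

```sql
-- Approximate SQL equivalent of the "device_online_idx" migration line.
CREATE INDEX device_online_idx
    ON devices (asset_id, stop_time, launch_time);

-- Approximate SQL equivalent of the "asset_unexpired_idx" migration line.
CREATE INDEX asset_unexpired_idx
    ON assets (assettype_id, archived_at, archive_number, lastseendate);
```

The remaining `add_index` lines have no `name:` option, so Rails generates their index names automatically.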

This changed the EXPLAIN output to the following:

Nested Loop  (cost=0.99..179162.78 rows=11872 width=218) (actual time=0.050..1680.166 rows=359011 loops=1)
  Buffers: shared hit=1726893 read=33
  I/O Timings: read=0.226
  ->  Index Scan using asset_unexpired_idx on assets  (cost=0.56..41125.44 rows=27451 width=218) (actual time=0.037..315.869 rows=359110 loops=1)
        Index Cond: ((assettype_id = 3) AND (archived_at IS NULL) AND (archive_number IS NULL) AND (lastseendate >= (now() - 'P30D'::interval)))
        Buffers: shared hit=288537
  ->  Index Scan using devices_asset_id_ix on devices  (cost=0.43..5.02 rows=1 width=4) (actual time=0.002..0.003 rows=1 loops=359110)
        Index Cond: (asset_id = assets.id)
        Filter: ((stop_time IS NULL) OR (stop_time >= (now() - 'P30D'::interval)) OR ((launch_time IS NOT NULL) AND (launch_time > stop_time)))
        Rows Removed by Filter: 0
        Buffers: shared hit=1438356 read=33
        I/O Timings: read=0.226
Planning Time: 1.322 ms
Execution Time: 1757.047 ms
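
In this plan the planner estimated 27451 rows from `assets` but the scan returned 359110, roughly 13 times more. The original post does not try this, but one thing worth checking is whether the planner statistics for the filtered columns are current. A minimal sketch, assuming the tables are in a schema on the current search path:

```sql
-- Refresh planner statistics for both tables in the join.
ANALYZE assets;
ANALYZE devices;

-- Inspect what the planner believes about the filtered columns.
-- null_frac is especially relevant for the two IS NULL predicates.
SELECT attname, null_frac, n_distinct
FROM pg_stats
WHERE tablename = 'assets'
  AND attname IN ('assettype_id', 'archived_at',
                  'archive_number', 'lastseendate');
```

If the estimate is still far off afterwards, the cause may be correlation between the columns rather than stale statistics.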

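This gave an improvement of roughly 22% (execution time fell from 2264 ms to 1757 ms). Is there any way to improve this substantially further?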

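【Comments】:

  • Edit your question to show the SQL query itself clearly. Also include all the index definitions.
  • We need the EXPLAIN (ANALYZE, BUFFERS) output.
  • I added the one relevant index definition and reran EXPLAIN with the ANALYZE and BUFFERS options. Note: this uses development (test) data, not actual production data.
  • Your test database should be the same size as the production database. Otherwise tuning is pointless.
  • It looks like I would benefit from adding indexes on assets.assettype_id, assets.lastseendate, assets.archived_at and assets.archive_number, as well as devices.stop_time and devices.launch_time.

Tags: postgresql query-optimization explain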


【Solution 1】:

Try a compound b-tree index like this:

CREATE INDEX assets_type_archive_date
    ON assets
       (assettype_id, archived_at, archive_number, lastseendate)

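It should help you filter the assets table efficiently. The server can random-access the index to find the first eligible row, then scan the index sequentially over the range of lastseendate values.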
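
As a variation that is not part of the original answer: three of the four predicates on `assets` are fixed constants in this query, so a partial index keyed only on `lastseendate` is another option that could be tried. The index name below is hypothetical, and whether it beats the compound index depends on the data.

```sql
-- Partial index: only rows matching the constant predicates are indexed,
-- leaving lastseendate as the single range-scanned key column.
-- The name assets_active_type3_lastseen is made up for this sketch.
CREATE INDEX assets_active_type3_lastseen
    ON assets (lastseendate)
    WHERE archived_at IS NULL
      AND archive_number IS NULL
      AND assettype_id = 3;
```

The planner only considers a partial index when the query's WHERE clause implies the index predicate, so this sketch is specific to `assettype_id = 3`.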

For a similar reason, try this index on devices:

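-- The index needs a name different from the table: in PostgreSQL,
-- indexes and tables in the same schema share one namespace.
CREATE INDEX devices_asset_id_stop_time
    ON devices
       (asset_id, stop_time);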

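【Comments】:

  • So would a compound index be more effective than 4 separate indexes, one on each column? Would it be better to do both?
  • Try it. Without your data it's hard to know. But a compound index like this is usually the fix for slowdowns caused by full table scans (seq scans). A large number of single-column indexes usually doesn't help much.
  • Updated the EXPLAIN output to use production data.
  • I added the indexes you recommended, and that did give a small improvement. However, the query is still very slow.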
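
One way to follow up on the point about single-column indexes is to check which indexes the server actually uses. The sketch below assumes the tables are visible to the current user and that the usage counters have not been reset recently. Indexes whose `idx_scan` stays at zero under a realistic workload are candidates for removal.

```sql
-- List every index definition on the two tables.
SELECT tablename, indexname, indexdef
FROM pg_indexes
WHERE tablename IN ('assets', 'devices')
ORDER BY tablename, indexname;

-- Show how often each index has been scanned since the counters were last reset.
SELECT relname, indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE relname IN ('assets', 'devices')
ORDER BY relname, idx_scan;
```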