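【Question title】: How can I improve the performance of a recursive CTE in PostgreSQL?
【Posted】: 2018-03-23 22:25:29
【Question】:

I built a tree structure in a single table using id and parent_id. To query it I use the recursive CTE that PostgreSQL provides, but joining against the recursive result takes a very long time. For example, with about 100 rows in the sadt_lot table, this query takes 8 seconds to return. Does anyone have a better idea?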

    WITH RECURSIVE downlots as (
        SELECT s1.sadt_lot_id, 0 AS level, s1.sadt_lot_id as root_id
        FROM sadt_lot s1
        WHERE s1.parent_lot_id IS NULL
    UNION
        SELECT s2.sadt_lot_id, d.level + 1, d.sadt_lot_id as root_id
        FROM sadt_lot s2
        INNER JOIN downlots d ON d.sadt_lot_id = s2.parent_lot_id
)

SELECT 
    "s"."sadt_lot_id", 
    "s"."name", concat(lpad(s.sadt_lot_id::TEXT, 3, '0'), '-', to_char(to_timestamp(s.created_at), 'DDMMYY')) sadt_lot_code, 
    "s"."created_at" AS "created_at", 
    "s"."version" AS "version", "s"."sadt_lot_status_id",
    SUM(procedure_performed.amount_requested) procedures_total, 
    SUM(procedure_performed.total_value) procedures_total_value
FROM "sadt_lot" "s" 

LEFT JOIN "sadt" ON sadt.sadt_lot_id = any(SELECT sadt_lot_id FROM downlots WHERE root_id = s.sadt_lot_id)
LEFT JOIN "procedure_auth" ON sadt.procedure_auth_id = procedure_auth.procedure_auth_id
LEFT JOIN "procedure_performed" ON procedure_auth.procedure_auth_id = procedure_performed.procedure_auth_id
WHERE "s"."parent_lot_id" IS NULL 
GROUP BY "s"."sadt_lot_id"
ORDER BY "created_at" DESC 

Another example, listing all sadt rows grouped by their root sadt_lot:

EXPLAIN ANALYZE WITH RECURSIVE downlots as (
        SELECT sl1.sadt_lot_id, 0 AS level, sl1.sadt_lot_id as root_id
        FROM sadt_lot sl1
        WHERE sl1.parent_lot_id IS NULL
    UNION
        SELECT sl2.sadt_lot_id, d.level + 1, d.sadt_lot_id as root_id
        FROM sadt_lot sl2
        INNER JOIN downlots d ON d.sadt_lot_id = sl2.parent_lot_id
)
SELECT sl.sadt_lot_id, array_agg(s.sadt_id)
FROM sadt_lot sl
LEFT JOIN sadt s ON s.sadt_lot_id = any(SELECT sadt_lot_id FROM downlots WHERE root_id = sl.sadt_lot_id)
WHERE sl.parent_lot_id IS NULL 
GROUP BY sl.sadt_lot_id
ORDER BY sl.sadt_lot_id

Query plan:

GroupAggregate  (cost=42.53..15077.74 rows=1 width=36) (actual time=104.090..8436.505 rows=90 loops=1)
  Group Key: sl.sadt_lot_id
  CTE downlots
    ->  Recursive Union  (cost=0.00..42.39 rows=101 width=12) (actual time=0.006..0.104 rows=95 loops=1)
          ->  Seq Scan on sadt_lot sl1  (cost=0.00..2.94 rows=1 width=12) (actual time=0.005..0.019 rows=90 loops=1)
                Filter: (parent_lot_id IS NULL)
                Rows Removed by Filter: 5
          ->  Hash Join  (cost=0.33..3.74 rows=10 width=12) (actual time=0.027..0.028 rows=2 loops=2)
                Hash Cond: (sl2.parent_lot_id = d.sadt_lot_id)
                ->  Seq Scan on sadt_lot sl2  (cost=0.00..2.94 rows=94 width=8) (actual time=0.002..0.008 rows=95 loops=2)
                ->  Hash  (cost=0.20..0.20 rows=10 width=8) (actual time=0.010..0.010 rows=48 loops=2)
                      Buckets: 1024  Batches: 1  Memory Usage: 9kB
                      ->  WorkTable Scan on downlots d  (cost=0.00..0.20 rows=10 width=8) (actual time=0.001..0.004 rows=48 loops=2)
  ->  Nested Loop Left Join  (cost=0.14..15004.14 rows=6242 width=8) (actual time=8.234..8434.229 rows=11345 loops=1)
        Join Filter: (SubPlan 2)
        Rows Removed by Join Filter: 1112125
        ->  Index Only Scan using sadt_lot_sadt_lot_id_parent_lot_id_idx on sadt_lot sl  (cost=0.14..12.86 rows=1 width=4) (actual time=0.011..0.252 rows=90 loops=1)
              Index Cond: (parent_lot_id IS NULL)
              Heap Fetches: 90
        ->  Seq Scan on sadt s  (cost=0.00..635.83 rows=12483 width=8) (actual time=0.002..1.785 rows=12483 loops=90)
        SubPlan 2
          ->  CTE Scan on downlots  (cost=0.00..2.27 rows=1 width=4) (actual time=0.003..0.007 rows=1 loops=1123470)
                Filter: (root_id = sl.sadt_lot_id)
                Rows Removed by Filter: 94
Planning time: 0.203 ms
Execution time: 8436.598 ms

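【Comments】:

  • Where is the FROM for "s" in the SQL?
  • The question had a mistake; I have edited it.
  • WITH RECURSIVE ... FROM sadt_lot s1 - why is there no WHERE parent_lot_id IS NULL?
  • It gives the same result; I have that condition in the WHERE of the main query.

Tags: postgresql tree common-table-expression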


【Solution 1】:

Try running EXPLAIN ANALYZE on this:

WITH RECURSIVE downlots as (
        SELECT sl1.sadt_lot_id, 0 AS level, sl1.sadt_lot_id as root_id
        FROM sadt_lot sl1
        WHERE sl1.parent_lot_id IS NULL
    UNION
        SELECT sl2.sadt_lot_id, d.level + 1, d.sadt_lot_id as root_id
        FROM sadt_lot sl2
        INNER JOIN downlots d ON d.sadt_lot_id = sl2.parent_lot_id
)
SELECT downlots.root_id AS sadt_lot_id, array_agg(s.sadt_id)
FROM downlots 
LEFT JOIN sadt s ON s.sadt_lot_id = downlots.sadt_lot_id
GROUP BY downlots.root_id
ORDER BY downlots.root_id
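
One detail that may matter if the tree is deeper than two levels: the recursive branch above sets root_id to d.sadt_lot_id, which is the immediate parent rather than the top-level lot. The sketch below is a minimal variant that carries d.root_id forward instead. The two temp tables and their four rows each are hypothetical stand-ins for the question's tables, reduced to the columns the query touches. It also uses UNION ALL, on the assumption that the data is a real tree with no cycles; because of the level column, plain UNION would not stop a cycle either.

```sql
-- Hypothetical miniature of the question's tables, for illustration only.
CREATE TEMP TABLE sadt_lot (
    sadt_lot_id   int PRIMARY KEY,
    parent_lot_id int              -- NULL marks a root lot
);
CREATE TEMP TABLE sadt (
    sadt_id     int PRIMARY KEY,
    sadt_lot_id int NOT NULL
);

-- Lot 1 is a root, 2 is its child, 3 is its grandchild; lot 4 is a second root.
INSERT INTO sadt_lot VALUES (1, NULL), (2, 1), (3, 2), (4, NULL);
INSERT INTO sadt     VALUES (10, 1), (11, 2), (12, 3), (13, 4);

WITH RECURSIVE downlots AS (
        SELECT sl1.sadt_lot_id, 0 AS level, sl1.sadt_lot_id AS root_id
        FROM sadt_lot sl1
        WHERE sl1.parent_lot_id IS NULL
    UNION ALL
        -- carry the root forward instead of the immediate parent
        SELECT sl2.sadt_lot_id, d.level + 1, d.root_id
        FROM sadt_lot sl2
        INNER JOIN downlots d ON d.sadt_lot_id = sl2.parent_lot_id
)
SELECT d.root_id AS sadt_lot_id,
       array_agg(s.sadt_id ORDER BY s.sadt_id) AS sadt_ids
FROM downlots d
LEFT JOIN sadt s ON s.sadt_lot_id = d.sadt_lot_id
GROUP BY d.root_id
ORDER BY d.root_id;
```

With this sample data, sadt 12 (which sits in grandchild lot 3) is grouped under root 1. With d.sadt_lot_id as root_id it would be grouped under lot 2 instead.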

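【Comments】:

  • But I need to get all sadt rows of the tree, together with all children. How can I do that?
  • Show a code example of getting all sadt rows of the tree.
  • See the second example in the question.
  • Note loops=1123470 in CTE Scan on downlots (cost=0.00..2.27 rows=1 width=4) (actual time=0.003..0.007 rows=1 loops=1123470). Try the new SQL in the answer.
  • Yes, I saw it, thanks.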
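
The loops=1123470 figure quoted in the comment above can be reproduced from the plan in the question: the nested loop pairs each of the 90 root lots with each of the 12483 sadt rows, and the subplan over downlots runs once per pair. A quick check, using only numbers copied from that plan:

```sql
SELECT 90 * 12483      AS subplan_executions,      -- root lots x sadt rows
       11345 + 1112125 AS rows_kept_plus_removed;  -- join output + "Rows Removed by Join Filter"
-- both columns evaluate to 1123470, the loops count on the CTE Scan
```
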
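【Solution 2】:

I found the solution. I was using the recursive expression as the join condition, which made the planner loop over the joined table many times. A better approach is to join sadt_lot with the recursive expression (the downlots "table") first, before touching sadt, and then join that result with sadt. The query went from 8 seconds to 8 milliseconds. The solution follows: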

EXPLAIN ANALYZE SELECT sl.sadt_lot_id, array_agg(s.sadt_id)
FROM sadt_lot sl
LEFT JOIN (WITH RECURSIVE downlots as (
        SELECT sl1.sadt_lot_id, 0 AS level, sl1.sadt_lot_id as root_id
        FROM sadt_lot sl1
        WHERE sl1.parent_lot_id IS NULL
    UNION
        SELECT sl2.sadt_lot_id, d.level + 1, d.sadt_lot_id as root_id
        FROM sadt_lot sl2
        INNER JOIN downlots d ON d.sadt_lot_id = sl2.parent_lot_id
)SELECT * FROM downlots) d ON d.sadt_lot_id = sl.sadt_lot_id
LEFT JOIN sadt s ON s.sadt_lot_id = d.root_id
WHERE sl.parent_lot_id IS NULL 
GROUP BY sl.sadt_lot_id
ORDER BY sl.sadt_lot_id

Query plan:

Sort  (cost=1935.35..1935.56 rows=82 width=36) (actual time=8.230..8.234 rows=82 loops=1)
Sort Key: sl.sadt_lot_id
Sort Method: quicksort  Memory: 75kB
->  HashAggregate  (cost=1931.72..1932.74 rows=82 width=36) (actual time=8.085..8.197 rows=82 loops=1)
    Group Key: sl.sadt_lot_id
    ->  Hash Right Join  (cost=469.73..1839.25 rows=18493 width=8) (actual time=0.328..6.273 rows=10742 loops=1)
          Hash Cond: (s.sadt_lot_id = downlots.root_id)
          ->  Seq Scan on sadt s  (cost=0.00..645.78 rows=12678 width=8) (actual time=0.007..1.406 rows=12493 loops=1)
          ->  Hash  (cost=465.72..465.72 rows=321 width=8) (actual time=0.242..0.242 rows=82 loops=1)
                Buckets: 1024  Batches: 1  Memory Usage: 12kB
                ->  Hash Right Join  (cost=432.42..465.72 rows=321 width=8) (actual time=0.049..0.232 rows=82 loops=1)
                      Hash Cond: (downlots.sadt_lot_id = sl.sadt_lot_id)
                      ->  CTE Scan on downlots  (cost=428.41..444.05 rows=782 width=12) (actual time=0.007..0.167 rows=96 loops=1)
                            CTE downlots
                              ->  Recursive Union  (cost=0.00..428.41 rows=782 width=12) (actual time=0.006..0.143 rows=96 loops=1)
                                    ->  Seq Scan on sadt_lot sl1  (cost=0.00..2.99 rows=82 width=12) (actual time=0.004..0.018 rows=82 loops=1)
                                          Filter: (parent_lot_id IS NULL)
                                          Rows Removed by Filter: 14
                                    ->  Hash Join  (cost=4.23..40.98 rows=70 width=12) (actual time=0.030..0.031 rows=5 loops=3)
                                          Hash Cond: (d.sadt_lot_id = sl2.parent_lot_id)
                                          ->  WorkTable Scan on downlots d  (cost=0.00..16.40 rows=820 width=8) (actual time=0.000..0.002 rows=32 loops=3)
                                          ->  Hash  (cost=2.99..2.99 rows=99 width=8) (actual time=0.069..0.069 rows=14 loops=1)
                                                Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                                ->  Seq Scan on sadt_lot sl2  (cost=0.00..2.99 rows=99 width=8) (actual time=0.004..0.061 rows=96 loops=1)
                      ->  Hash  (cost=2.99..2.99 rows=82 width=4) (actual time=0.039..0.039 rows=82 loops=1)
                            Buckets: 1024  Batches: 1  Memory Usage: 11kB
                            ->  Seq Scan on sadt_lot sl  (cost=0.00..2.99 rows=82 width=4) (actual time=0.014..0.028 rows=82 loops=1)
                                  Filter: (parent_lot_id IS NULL)
                                  Rows Removed by Filter: 14
Planning time: 0.225 ms
Execution time: 8.300 ms
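
The same idea can probably be applied to the first query in the question (the one with the SUM columns). The following is only a sketch and has not been run against the real data. It assumes the tables and columns exactly as they appear in the question, that sadt_lot_id is the primary key of sadt_lot (as the original GROUP BY already requires), and that the tree has no cycles. The aliases pa and pp are introduced here for brevity. Unlike the query above, it matches downlots to each root on root_id and joins sadt on the descendant's sadt_lot_id, so sadt rows in child lots are intended to be counted too.

```sql
WITH RECURSIVE downlots AS (
        SELECT s1.sadt_lot_id, s1.sadt_lot_id AS root_id
        FROM sadt_lot s1
        WHERE s1.parent_lot_id IS NULL
    UNION ALL
        -- every descendant keeps the id of its top-level lot
        SELECT s2.sadt_lot_id, d.root_id
        FROM sadt_lot s2
        INNER JOIN downlots d ON d.sadt_lot_id = s2.parent_lot_id
)
SELECT
    s.sadt_lot_id,
    s.name,
    concat(lpad(s.sadt_lot_id::text, 3, '0'), '-',
           to_char(to_timestamp(s.created_at), 'DDMMYY')) AS sadt_lot_code,
    s.created_at,
    s.version,
    s.sadt_lot_status_id,
    SUM(pp.amount_requested) AS procedures_total,
    SUM(pp.total_value)      AS procedures_total_value
FROM sadt_lot s
-- each root appears in downlots as its own level-0 row, so this join drops no roots
INNER JOIN downlots d ON d.root_id = s.sadt_lot_id
LEFT JOIN sadt ON sadt.sadt_lot_id = d.sadt_lot_id
LEFT JOIN procedure_auth pa ON pa.procedure_auth_id = sadt.procedure_auth_id
LEFT JOIN procedure_performed pp ON pp.procedure_auth_id = pa.procedure_auth_id
WHERE s.parent_lot_id IS NULL
GROUP BY s.sadt_lot_id
ORDER BY s.created_at DESC;
```

The design choice is the same as in the answer: the recursive result is joined as an ordinary relation with equality conditions, so the planner can hash it once instead of re-scanning it inside a join filter for every row pair.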
