【Question title】: High CPU utilization when using a WITH RECURSIVE query in Postgres 11
【Posted】: 2019-09-10 13:59:56
【Question】:

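We are building a hierarchical query using a WITH RECURSIVE CTE. The query response time is acceptable, but CPU usage spikes to 100%.

The query already includes every possible filter condition, and the filter columns are indexed.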

explain analyze WITH RECURSIVE childfolders AS (
SELECT f.id,  f.parent_folder_id 
FROM  f,  doc WHERE f.id = doc.id AND f.project_id = 'projxxx'
AND f.parent_folder_id IN ('docf141440')  
UNION ALL 
SELECT f.id,   f.parent_folder_id 
FROM folder f,  doc, childfolders 
WHERE f.parent_folder_id = childfolders.id AND f.is_deleted = FALSE AND f.id = doc.id    ) 
SELECT id, parent_folder_id FROM childfolders ORDER BY id;
                                                                               QUERY PLAN                                                                                
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=51140.27..51157.70 rows=6975 width=164) (actual time=10520.316..10547.033 rows=377436 loops=1)
   Sort Key: childfolders.id
   Sort Method: quicksort  Memory: 41776kB
   CTE childfolders
     ->  Recursive Union  (cost=0.85..50555.48 rows=6975 width=20) (actual time=0.040..7553.675 rows=377436 loops=1)
           ->  Nested Loop  (cost=0.85..121.45 rows=5 width=20) (actual time=0.038..0.039 rows=1 loops=1)
                 ->  Index Scan using folder_parent_folder_id on  f  (cost=0.42..81.49 rows=9 width=20) (actual time=0.021..0.022 rows=1 loops=1)
                       Index Cond: ((parent_folder_id)::text = 'docf141440'::text)
                       Filter: ((project_id)::text = 'proj2877'::text)
                 ->  Index Only Scan using document_folder_pk on  doc  (cost=0.42..4.44 rows=1 width=10) (actual time=0.016..0.016 rows=1 loops=1)
                       Index Cond: (id = (f.id)::text)
                       Heap Fetches: 0
           ->  Nested Loop  (cost=0.85..5029.45 rows=697 width=20) (actual time=326.462..830.750 rows=41937 loops=9)
                 ->  Nested Loop  (cost=0.42..4499.12 rows=1161 width=20) (actual time=326.452..445.050 rows=41937 loops=9)
                       ->  WorkTable Scan on childfolders childfolders_1  (cost=0.00..1.00 rows=50 width=82) (actual time=0.000..3.962 rows=41937 loops=9)
                       ->  Index Scan using folder_parent_folder_id on  f_1  (cost=0.42..89.73 rows=23 width=20) (actual time=0.009..0.010 rows=1 loops=377436)
                             Index Cond: ((parent_folder_id)::text = (childfolders_1.id)::text)
                             Filter: (NOT is_deleted)
                 ->  Index Only Scan using document_folder_pk on  doc_1  (cost=0.42..0.46 rows=1 width=10) (actual time=0.009..0.009 rows=1 loops=377435)
                       Index Cond: (id = (f_1.id)::text)
                       Heap Fetches: 0
   ->  CTE Scan on childfolders  (cost=0.00..139.50 rows=6975 width=164) (actual time=0.041..7683.095 rows=377436 loops=1)
 Planning Time: 0.520 ms
 Execution Time: 10577.198 ms

Is there a way to reduce the CPU spike without hurting the query response time?

【Discussion】:

    Tags: postgresql common-table-expression cpu-usage


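    【Solution 1】:

    The row estimates are wrong (the plan expects 6975 rows from the CTE but gets 377436), so PostgreSQL chooses nested loop joins where a different strategy might be better.

    You could try gathering fresh statistics; that may help: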
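
    To make the size of the misestimate concrete, the figures can be read straight from the plan in the question. This is only arithmetic on the posted numbers (the column aliases are made up for illustration), not a diagnostic query against the real tables:

    ```sql
    -- Estimated vs. actual row counts, copied from the EXPLAIN ANALYZE output.
    -- Recursive Union: rows=6975 estimated, rows=377436 actual.
    -- WorkTable Scan:  rows=50 estimated, rows=41937 actual (average over 9 loops).
    SELECT round(377436 / 6975.0, 1) AS cte_actual_over_estimated,
           round(41937 / 50.0, 1)    AS worktable_actual_over_estimated;
    -- yields 54.1 and 838.7
    ```

    An underestimate of this size makes the per-row index lookups of a nested loop look far cheaper than they turn out to be: the inner index scans in the plan run about 377,000 times.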

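    -- f_1 and doc_1 are the aliases shown in the plan;
    -- substitute the names of the underlying tables.
    ANALYZE f_1;
    ANALYZE doc_1;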
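
    As a hedged sketch of how one might check whether the statistics are current: the table name `folder` is taken from the recursive term of the query, while `document_folder` is only a guess based on the index name `document_folder_pk` and should be replaced with the real table name.

    ```sql
    -- Assumption: the underlying tables are called folder and document_folder.
    ANALYZE folder;
    ANALYZE document_folder;

    -- When were the tables last analyzed, manually or by autovacuum?
    SELECT relname, last_analyze, last_autoanalyze
    FROM pg_stat_user_tables
    WHERE relname IN ('folder', 'document_folder');

    -- What the planner currently believes about the join column.
    SELECT attname, n_distinct, null_frac
    FROM pg_stats
    WHERE tablename = 'folder'
      AND attname = 'parent_folder_id';
    ```

    Fresh statistics may not fix the plan on their own, because the planner cannot know in advance how many rows each recursion step will produce; the WorkTable Scan estimate of 50 rows in the plan does not come from table statistics.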
    

    Other than that, the only remedy I can offer is to disable nested loop joins for this one query:

    BEGIN;
    SET LOCAL enable_nestloop = off;
    SELECT /* your query */;
    COMMIT;
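
    Applied to the query from the question, this could look roughly as follows. It is a sketch under assumptions: `document_folder` is a placeholder for the table behind the `doc` alias (its name is not given in the question), and the comma joins are rewritten as explicit `JOIN`s, which should be equivalent here.

    ```sql
    BEGIN;
    -- SET LOCAL only lasts until the end of this transaction,
    -- so other queries in the session keep the default planner settings.
    SET LOCAL enable_nestloop = off;

    EXPLAIN (ANALYZE, BUFFERS)
    WITH RECURSIVE childfolders AS (
        -- Non-recursive term: direct children of the starting folder.
        SELECT f.id, f.parent_folder_id
        FROM folder f
        JOIN document_folder doc ON doc.id = f.id   -- placeholder table name
        WHERE f.project_id = 'projxxx'
          AND f.parent_folder_id IN ('docf141440')
      UNION ALL
        -- Recursive term: children of the folders found in the previous step.
        SELECT f.id, f.parent_folder_id
        FROM folder f
        JOIN document_folder doc ON doc.id = f.id   -- placeholder table name
        JOIN childfolders ON f.parent_folder_id = childfolders.id
        WHERE f.is_deleted = FALSE
    )
    SELECT id, parent_folder_id FROM childfolders ORDER BY id;
    COMMIT;
    ```

    Note that `enable_nestloop = off` does not forbid nested loops outright; it only makes the planner treat them as very expensive, so they can still appear when no other join method is available. Comparing the new plan and its timing with the original shows whether a hash or merge join actually helps in this case.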
    

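    【Comments】:

    • Thanks for the reply. We tried ANALYZE, but it did not help as expected. If I drop the index on the f.parent_folder_id column, CPU utilization improves, but the query response time suffers.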