【Question title】: Why does filtering on the same field cause different execution times? (different index usage)
【Posted】: 2021-05-11 18:45:12
【Question description】:

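When I run the query and filter by agreement_id, it is slow,
but when I filter by the alias id, it is fast. (See the end of the query.)

Why does filtering on the same field result in different execution times?

Links to the EXPLAIN ANALYZE output: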

slow1, slow2
fast1, fast2

The plans start to differ at node #20, where different indexes are used:
Index Cond: (o.sys_period @> sys_time()) VS Index Cond: (o.agreement_id = 38)

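P.S. It would be great if I could reach the developers of this feature (I have one more similar question).

UPD: I did some experiments. When I remove the window functions from the query, it runs fast either way. So why do window functions prevent the index from being used in some cases, and how can I avoid or work around this?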
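
A minimal sketch of the behaviour described in the UPD, using a hypothetical table `demo_orders` that is not part of the original schema. As far as I understand the planner, a condition from the outer query can be pushed below a window function only when every subquery output column it references appears in the PARTITION BY of all window functions in that subquery; filtering earlier on any other column would change the window results. The plans you get depend on table size and statistics, so treat this as something to compare with EXPLAIN rather than a guaranteed outcome.

```sql
-- Hypothetical demo table (assumption, not from the original schema).
CREATE TABLE demo_orders (agreement_id int, order_id int, cost numeric(10,2));
CREATE INDEX demo_orders_agreement_idx ON demo_orders (agreement_id);

-- 100000 rows, 100 rows per agreement_id value (0..999).
INSERT INTO demo_orders
SELECT g % 1000, g, (g % 50)::numeric(10,2)
FROM generate_series(1, 100000) AS g;
ANALYZE demo_orders;

-- Case 1: the filter column is the PARTITION BY column.
-- Dropping other partitions early cannot change "total" for agreement 38,
-- so the condition is eligible to be pushed into the subquery.
EXPLAIN
SELECT *
FROM (
  SELECT *, sum(cost) OVER (PARTITION BY agreement_id) AS total
  FROM demo_orders
) s
WHERE s.agreement_id = 38;

-- Case 2: the filter column is NOT in PARTITION BY.
-- Filtering before the window function would shrink each partition and
-- change "total", so the condition has to be applied after the WindowAgg.
EXPLAIN
SELECT *
FROM (
  SELECT *, sum(cost) OVER (PARTITION BY agreement_id) AS total
  FROM demo_orders
) s
WHERE s.order_id = 38;
```

In the first plan I would expect the condition to show up beneath the WindowAgg node (as an Index Cond or scan filter), and in the second as a Filter on the Subquery Scan above it. In the original query, `(o).agreement_id` in the outer WHERE references the whole composite column `o`, which is not itself a PARTITION BY column, so it would fall under case 2.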

dbfiddle with a minimal test case
Server version is v13.1.
Full query:

WITH gconf AS
  -- https://www.postgresql.org/docs/current/queries-with.html#QUERIES-WITH-SELECT
  NOT MATERIALIZED -- force it to be merged into the parent query
  -- it gives a net savings because each usage of the WITH query needs only a small part of the WITH query's full output.
( SELECT
  ocd.*, 
  tstzrange( '2021-05-01', '2021-05-01', '[]') AS acc_period,

  (o).agreement_id      AS id,       -- Required to passthrough WINDOW FUNCTION
  (o).id                AS order_id,
  (ic).consumed_period  AS consumed_period,
  dense_rank()  OVER ( PARTITION BY (o).agreement_id, (o).id ORDER BY (ic).consumed_period )  AS nconf,
  row_number()  OVER ( wconf ORDER BY (c).sort_order NULLS LAST                            )  AS nitem,

  (sum( ocd.item_cost )  OVER wconf)::numeric( 10, 2) AS conf_cost,
  max((ocd.ic).consumed) OVER wconf                   AS consumed,
  CASE WHEN true
    THEN (sum( ocd.item_suma )  OVER wconf)::numeric( 10, 2 )
    ELSE (sum( ocd.item_cost )  OVER wconf)::numeric( 10, 2 )
  END AS conf_suma
FROM order_cost_details( tstzrange( '2021-05-01', '2021-05-01', '[]') ) ocd
WHERE true  OR  (ocd.ic).consumed_period @> lower( tstzrange( '2021-05-01', '2021-05-01', '[]') )

WINDOW wconf AS ( PARTITION BY (o).agreement_id, (o).id, (ic).consumed_period )
),
gorder AS (
SELECT *,
  (conf_suma/6)::numeric( 10, 2 ) as conf_nds,
  sum( conf_suma ) FILTER (WHERE nitem = 1) OVER worder AS order_suma
FROM gconf
WINDOW worder AS ( PARTITION BY gconf.id, (o).id )
-- TODO: Ask PG developers: Why changing to (o).agreement_id slows down query?
-- WINDOW worder AS ( PARTITION BY (o).agreement_id, (o).id )
)
SELECT
  u.id, consumed_period, nconf, nitem,
  (c).id                                                as item_id,
  COALESCE( (c).sort_order,   pd.sort_order          )  as item_order,
  COALESCE( st.display, st.name, rt.display, rt.name )  as item_name,
  COALESCE( item_qty,         (c).amount/rt.unit     )  as item_qty,
  COALESCE( (p).label,        rt.label               )  as measure,
  item_price, item_cost, item_suma,
  conf_cost, consumed, conf_suma, conf_nds, order_suma,
  (order_suma/6)::numeric( 10, 2 )                      as order_nds,
  sum( conf_suma                        ) FILTER (WHERE nitem = 1                ) OVER wagreement  AS total_suma,
  sum( (order_suma/6)::numeric( 10, 2 ) ) FILTER (WHERE nitem = 1  AND  nconf = 1) OVER wagreement  AS total_nds,

  pkg.id                            as package_id,
  pkg.link_1c_id                    as package_1c_id,
  COALESCE( pkg.display, pkg.name ) as package,

  acc_period

FROM gorder u
LEFT JOIN resource_type rt ON rt.id  = (c).resource_type_id
LEFT JOIN service_type  st ON st.id  = (c).service_type_id
LEFT JOIN package      pkg ON pkg.id = (o).package_id
LEFT JOIN package_detail     pd  ON pd.package_id = (o).package_id
  AND pd.resource_type_id IS NOT DISTINCT FROM (c).resource_type_id
  AND pd.service_type_id  IS NOT DISTINCT FROM (c).service_type_id

-- WHERE (o).agreement_id = 38   --  slow
WHERE u.id = 38                  --  fast
WINDOW wagreement AS ( PARTITION BY (o).agreement_id )

【Discussion】:

    Tags: postgresql query-optimization postgresql-performance query-planner


    【Solution 1】:

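    As a workaround, we can add an alias in the SELECT list for the column used in PARTITION BY and partition by that alias. PG then applies the optimization and uses the index.

    The answer to the question is probably this: PG does not apply the optimization when a field of a composite type is referenced directly. Note how it behaves (ALIAS = the aliased output column, ORIG = the original composite-field expression):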

    PARTITION BY | FILTER | INDEX USED?
    -----------------------------------
    ALIAS        | ORIG   | NO
    ALIAS        | ALIAS  | YES
    ORIG         | ALIAS  | NO
    ORIG         | ORIG   | NO
    

    See this dbfiddle:

    create table agreement ( ag_id int, name text, cost numeric(10,2) );
    create index ag_idx on agreement (ag_id);
    insert into agreement (ag_id, name, cost) values ( 1, '333', 22 ),
    (1,'333', 33), (1, '333', 7), (2, '555', 18 ), (2, '555', 2), (3, '777', 4);
    select * from agreement;
    
    create function initial () 
    returns table( agreement_id int, ag agreement ) language sql stable AS $$
    select ag_id, t from agreement t;
    $$;
    select * from initial() t;
    
    explain( analyze, costs, buffers, verbose ) with totals_by_ag as (
      select 
        *,
        sum( (t.ag).cost ) over ( partition by agreement_id ) as total
      from initial() t
    )
    select * from totals_by_ag t
    where (t.ag).ag_id = 1; -- index is NOT USED
    
    explain( analyze, costs, buffers, verbose ) with totals_by_ag as (
      select 
        *,
        sum( (t.ag).cost ) over ( partition by agreement_id ) as total
      from initial() t
    )
    select * from totals_by_ag t
    where agreement_id = 1; -- index IS USED when the column alias is used in both places
    
    explain( analyze, costs, buffers, verbose ) with totals_by_ag as (
      select 
        *,
        sum( (t.ag).cost ) over ( partition by (t.ag).ag_id ) as total --renamed
      from initial() t
    )
    select * from totals_by_ag t
    where agreement_id = 1; -- index is NOT USED because the partitioning is by the original column
    
    explain( analyze, costs, buffers, verbose ) with totals_by_ag as (
      select 
        *,
        sum( (t.ag).cost ) over ( partition by (t.ag).ag_id ) as total --renamed
      from initial() t
    )
    select * from totals_by_ag t
    where (t.ag).ag_id = 1; -- index is NOT USED even though both places use the original column
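
The same alias pattern can be sketched without a function, on a table that stores the composite value directly. All names here (`demo_order`, `demo_rows`, `demo_rows_agreement_idx`) are hypothetical and not part of the original schema, and I have not verified the resulting plans on v13.1, so compare the EXPLAIN output on your own server. The idea is that the output column `id` is exactly the PARTITION BY expression, so a filter on `id` only references a partitioning column, while a filter on `(s.o).agreement_id` references the whole composite column `o`, which is not a partitioning column.

```sql
-- Hypothetical composite type and table (assumptions for illustration).
CREATE TYPE demo_order AS (agreement_id int, order_id int, cost numeric(10,2));
CREATE TABLE demo_rows (o demo_order);

-- Expression index on the composite field.
CREATE INDEX demo_rows_agreement_idx ON demo_rows (((o).agreement_id));

-- 100000 rows, 100 rows per agreement_id value (0..999).
INSERT INTO demo_rows
SELECT ROW(g % 1000, g, (g % 50)::numeric(10,2))::demo_order
FROM generate_series(1, 100000) AS g;
ANALYZE demo_rows;

-- Filter on the alias: "id" is the same expression as the PARTITION BY key,
-- so the condition is a candidate for pushdown below the window function.
EXPLAIN
SELECT *
FROM (
  SELECT o,
         (o).agreement_id AS id,
         sum((o).cost) OVER (PARTITION BY (o).agreement_id) AS total
  FROM demo_rows
) s
WHERE s.id = 38;

-- Filter on the original composite field: the condition references the
-- output column "o" as a whole, which is not a PARTITION BY column.
EXPLAIN
SELECT *
FROM (
  SELECT o,
         (o).agreement_id AS id,
         sum((o).cost) OVER (PARTITION BY (o).agreement_id) AS total
  FROM demo_rows
) s
WHERE (s.o).agreement_id = 38;
```

Both queries return the same rows; only the position of the filter in the plan, and therefore whether the expression index can be considered, should differ.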
    
    

    【Discussion】:
