【问题标题】:Window functions SORT costly, can we overcome it?窗口函数 SORT 代价高昂,我们能克服它吗?
【发布时间】:2014-08-28 12:21:13
【问题描述】:

我的要求:找出DEPT_NUM的前10个帐户,按帐号升序排列。

查询:

SELECT * FROM
(
  select acctnum,dept_num,row_number() OVER (PARTITION BY DEPT_NUM ORDER BY ACCTNUM) as row_identifier
   FROM MYTABLE
)
WHERE row_identifier between 1 and 10;

追踪:

    7532 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1480074522

--------------------------------------------------------------------------------------------
| Id  | Operation                | Name    | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |         |   577K|    15M|       |  3855   (1)| 00:00:47 |
|*  1 |  VIEW                    |         |   577K|    15M|       |  3855   (1)| 00:00:47 |
|*  2 |   WINDOW SORT PUSHED RANK|         |   577K|  7890K|    13M|  3855   (1)| 00:00:47 |
|   3 |    INDEX FAST FULL SCAN  | IMTAB05 |   577K|  7890K|       |   987   (1)| 00:00:12 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("ROW_IDENTIFIER">=1 AND "ROW_IDENTIFIER"<=5)
   2 - filter(ROW_NUMBER() OVER ( PARTITION BY "DEPT_NUM" ORDER BY "ACCTNUM")<=5)


Statistics
----------------------------------------------------------
          0  recursive calls
          2  db block gets
       4298  consistent gets
          0  physical reads
          0  redo size
     144367  bytes sent via SQL*Net to client
        486  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
       7532  rows processed

索引:

index scanDEPT_NUM 列上表示INDEX STORAGE

强制 Full Table 扫描使成本从 3855 变为 11092

表格总行数为632667;


以上都是测试区域结果。产量实际上是两倍。

我的数据库是 Exadata,Quarter RAC。运行 Oracle 11g R2。数据库足够强大,可以立即执行,但是 DBA 不愿意使用 13M 的 tempSpc。企业报告此报告的频率为每小时 4 次。主要的是,这张表有很多实时插入/更新

我们可以即兴创作这个过程吗
1) 增加会话的 PGA?(不确定,是否真的可能?)
2) 额外的索引会有帮助吗?

只是想让一些不同的人来看待这个问题,因为我们小组完全只关注 DBA 参数。

感谢您的任何建议!

【问题讨论】:

  • 我不确定但是 row_identifier >= 10;在这里工作得更好,因为它不必评估两个不同的东西。由于 between 语句本质上是在执行 =10。
  • 什么是可接受的 tempSpc 值?
  • 您可以考虑在dba.stackexchange.com 上询问这个问题,并寻求 DBA 建议。
  • @ConradFrix prod 环境指出用法为24M。我们还没有确定我们的阈值数字。将尽快更新相同的!我们的电子商务应用使用相同的数据库,并针对OLTP进行了优化
  • 我认为您的 dba 过于保守。也就是说,如果这真的是一个问题,我认为您唯一能做的就是重组数据,对每个组的前 n 个使用旧式光标解决方案或使用资源管理器来限制资源。

标签: sql performance oracle oracle11g exadata


【解决方案1】:

分析函数的性能可能取决于索引列的顺序。将索引从 (ACCTNUM,DEPT_NUM) 更改为 (DEPT_NUM,ACCTNUM) 可能会降低成本并消除对临时表空间的需求。

partition by COL_2 order by COL_1 => INDEX FAST FULL SCAN|WINDOW SORT PUSHED RANK
partition by COL_1 order by COL_2 => INDEX FULL SCAN|WINDOW NOSORT

INDEX FAST FULL SCAN 使用更快的多块 IO,但它还需要对数据进行排序,并且可能需要为排序区域提供临时表空间。

INDEX FULL SCAN 使用较慢的单块 IO,但它按顺序返回数据并避免排序。

示例架构和数据

--drop table mytable;
create table mytable(dept_num number not null, acctnum number not null
    ,a number, b number, c number, d number, e number);
insert into mytable
select 1 dept_num, 1 acctnum, 0,0,0,0,0 from dual union all
select 1 dept_num, 2 acctnum, 0,0,0,0,0 from dual union all
select 1 dept_num, 3 acctnum, 0,0,0,0,0 from dual union all
select 2 dept_num, 1 acctnum, 0,0,0,0,0 from dual union all
select 2 dept_num, 2 acctnum, 0,0,0,0,0 from dual union all
select 3 dept_num, 1 acctnum, 0,0,0,0,0 from dual;
--Create 600K similar rows.
insert into mytable
  select dept_num + rownumber*3, acctnum, a,b,c,d,e
  from mytable
  cross join (select level rownumber from dual connect by level <= 100000);
begin
    dbms_stats.gather_table_stats(user, 'mytable');
end;
/

(ACCTNUM,DEPT_NUM) = 窗口排序推送排名

create index mytable_idx on mytable(acctnum, dept_num);

explain plan for
select dept_num, acctnum from
(
    select dept_num, acctnum
        ,row_number() over (partition by dept_num order by acctnum) as row_identifier
    from mytable
)
where row_identifier between 1 and 10;

select * from table(dbms_xplan.display);

Plan hash value: 952182109

------------------------------------------------------------------------------------------------
| Id  | Operation                | Name        | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |             |   600K|    22M|       |  1625   (3)| 00:00:23 |
|*  1 |  VIEW                    |             |   600K|    22M|       |  1625   (3)| 00:00:23 |
|*  2 |   WINDOW SORT PUSHED RANK|             |   600K|  4687K|  9424K|  1625   (3)| 00:00:23 |
|   3 |    INDEX FAST FULL SCAN  | MYTABLE_IDX |   600K|  4687K|       |   239   (3)| 00:00:04 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("ROW_IDENTIFIER">=1 AND "ROW_IDENTIFIER"<=10)
   2 - filter(ROW_NUMBER() OVER ( PARTITION BY "DEPT_NUM" ORDER BY "ACCTNUM")<=10)

(DEPT_NUM,ACCTNUM) = WINDOW NOSORT

drop index mytable_idx;
create index mytable_idx on mytable(dept_num, acctnum);

explain plan for
select dept_num, acctnum from
(
    select dept_num, acctnum
        ,row_number() over (partition by dept_num order by acctnum) as row_identifier
    from mytable
)
where row_identifier between 1 and 10;

select * from table(dbms_xplan.display);

Plan hash value: 1773829932

---------------------------------------------------------------------------------
| Id  | Operation         | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |             |   600K|    22M|   792   (2)| 00:00:12 |
|*  1 |  VIEW             |             |   600K|    22M|   792   (2)| 00:00:12 |
|*  2 |   WINDOW NOSORT   |             |   600K|  4687K|   792   (2)| 00:00:12 |
|   3 |    INDEX FULL SCAN| MYTABLE_IDX |   600K|  4687K|   792   (2)| 00:00:12 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("ROW_IDENTIFIER">=1 AND "ROW_IDENTIFIER"<=10)
   2 - filter(ROW_NUMBER() OVER ( PARTITION BY "DEPT_NUM" ORDER BY 
              "ACCTNUM")<=10)

【讨论】:

  • 感谢您过来帮忙。而且您的解决方案很有意义,我将在我的工作场所创建索引并让您知道结果!
猜你喜欢
  • 1970-01-01
  • 1970-01-01
  • 1970-01-01
  • 2020-05-21
  • 2012-03-21
  • 1970-01-01
  • 1970-01-01
  • 2013-12-24
  • 2019-09-21
相关资源
最近更新 更多