【Question title】: Oracle query slow when joining, fast when split into two queries
【Posted】: 2014-05-03 00:05:21
【Question】:

I have a SQL query that is very slow when it joins two tables, but fast when I first query one table and then use its output to query the other.

Environment / prerequisites

Oracle version:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production

Objects involved:
cfc_materialized (materialized view, 21.5 million rows)
contact (table, 12.6 million rows)

Indexes involved:
CREATE INDEX CONTACT_CLIENT ON CONTACT(CLIENT);
CREATE INDEX CFC_MATERIALIZED_A_S_T ON CFC_MATERIALIZED(ASSOCIATION_TYPE, SOURCEID, TARGETID);

I have already regathered the statistics for both objects with cascade => true:

BEGIN
  SYS.DBMS_STATS.GATHER_TABLE_STATS (
     OwnName        => '...'
    ,TabName        => '...'
    ,Estimate_Percent  => 0
    ,Degree            => 4
    ,Cascade           => TRUE
    ,No_Invalidate     => FALSE);
END;
/
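Whether the gathering above took effect can be checked in the data dictionary. This is a minimal sketch, assuming it is run as the schema owner (hence the USER_ views rather than the elided owner name) and that the object names are stored in upper case:

```sql
-- Table-level statistics: row counts and the time they were last gathered.
SELECT table_name, num_rows, last_analyzed
FROM   user_tab_statistics
WHERE  table_name IN ('CONTACT', 'CFC_MATERIALIZED');

-- Column-level statistics for the filter and join columns:
-- number of distinct values and whether a histogram exists.
SELECT table_name, column_name, num_distinct, histogram
FROM   user_tab_col_statistics
WHERE  (table_name = 'CFC_MATERIALIZED'
        AND column_name IN ('ASSOCIATION_TYPE', 'SOURCEID', 'TARGETID'))
   OR  (table_name = 'CONTACT'
        AND column_name IN ('CLIENT', 'DELETED'));
```

The num_distinct and histogram values are what the optimizer uses to estimate how many rows each predicate returns.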

Problem

I have the following query:

SELECT c.*
FROM contact c
WHERE c.client in (
    SELECT cfc.targetid
    FROM cfc_materialized cfc
    WHERE cfc.sourceid = 'e95027f0-a83e-11e3-a0ae-005056aebabc'
      AND cfc.association_type = 'ContactDataSharing'
  )
  AND c.deleted = 0;

Explain plan:

--------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name             | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                  |    11M|  2214M| 38976   (1)| 00:07:48 |        |      |            |
|   1 |  PX COORDINATOR                |                  |       |       |            |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)          | :TQ10001         |       |       |            |          |  Q1,01 | P->S | QC (RAND)  |
|   3 |    NESTED LOOPS                |                  |       |       |            |          |  Q1,01 | PCWP |            |
|   4 |     NESTED LOOPS               |                  |    11M|  2214M| 38976   (1)| 00:07:48 |  Q1,01 | PCWP |            |
|   5 |      SORT UNIQUE               |                  |  2164 |   196K| 10672   (1)| 00:02:09 |  Q1,01 | PCWP |            |
|   6 |       PX RECEIVE               |                  |  2164 |   196K| 10672   (1)| 00:02:09 |  Q1,01 | PCWP |            |
|   7 |        PX SEND HASH            | :TQ10000         |  2164 |   196K| 10672   (1)| 00:02:09 |  Q1,00 | P->P | HASH       |
|   8 |         PX BLOCK ITERATOR      |                  |  2164 |   196K| 10672   (1)| 00:02:09 |  Q1,00 | PCWC |            |
|*  9 |          MAT_VIEW ACCESS FULL  | CFC_MATERIALIZED |  2164 |   196K| 10672   (1)| 00:02:09 |  Q1,00 | PCWP |            |
|* 10 |      INDEX RANGE SCAN          | CONTACT_CLIENT   |  5500 |       |    37   (0)| 00:00:01 |  Q1,01 | PCWP |            |
|* 11 |     TABLE ACCESS BY INDEX ROWID| CONTACT          |  5474 |   550K|   973   (0)| 00:00:12 |  Q1,01 | PCWP |            |
--------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   9 - filter("CFC"."SOURCEID"='e95027f0-a83e-11e3-a0ae-005056aebabc' AND "CFC"."ASSOCIATION_TYPE"='ContactDataSharing')
  10 - access("C"."CLIENT"="CFC"."TARGETID")
  11 - filter("C"."DELETED"=0)

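This takes a very long time, and I wonder why there is a full scan of cfc_materialized (MAT_VIEW ACCESS FULL). When I split the query into two queries, it is much faster. First query: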

SELECT cfc.targetid
FROM cfc_materialized cfc
WHERE cfc.sourceid = 'e95027f0-a83e-11e3-a0ae-005056aebabc'
  AND cfc.association_type = 'ContactDataSharing';

This query returns 2 target IDs. Explain plan:

-------------------------------------------------------------------------------------------
| Id  | Operation        | Name                   | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                        |  2164 |   196K|    36   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| CFC_MATERIALIZED_A_S_T |  2164 |   196K|    36   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("CFC"."ASSOCIATION_TYPE"='ContactDataSharing' AND 
              "CFC"."SOURCEID"='e95027f0-a83e-11e3-a0ae-005056aebabc')
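The plan estimates 2164 rows for this predicate, while the query actually returns 2. One way to see estimated and actual row counts side by side is the gather_plan_statistics hint combined with DBMS_XPLAN.DISPLAY_CURSOR. This is a sketch, assuming both statements run in the same session, that SERVEROUTPUT is off in SQL*Plus, and that the user may read the V$ views that DISPLAY_CURSOR relies on:

```sql
-- Run the statement once while collecting row-source statistics.
SELECT /*+ gather_plan_statistics */ cfc.targetid
FROM cfc_materialized cfc
WHERE cfc.sourceid = 'e95027f0-a83e-11e3-a0ae-005056aebabc'
  AND cfc.association_type = 'ContactDataSharing';

-- Show the plan of the last statement executed in this session,
-- including the E-Rows (estimated) and A-Rows (actual) columns.
SELECT *
FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
```

A large gap between E-Rows and A-Rows on the index range scan would suggest that the cardinality estimate, rather than the choice of index, is what drives the plan of the combined query.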

Here is the second query. I use the output of the first query as the input for the IN list:

SELECT *
FROM contact c
WHERE c.client in (
    'e95027f0-a83e-11e3-a0ae-005056aebabc',
    'eb37d3b0-a83e-11e3-a0ae-005056aebabc'
  )
  AND c.deleted = 0;

Explain plan:

-----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                |  2814 |   283K|   505   (0)| 00:00:07 |
|   1 |  INLIST ITERATOR             |                |       |       |            |          |
|*  2 |   TABLE ACCESS BY INDEX ROWID| CONTACT        |  2814 |   283K|   505   (0)| 00:00:07 |
|*  3 |    INDEX RANGE SCAN          | CONTACT_CLIENT |  2827 |       |    23   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("C"."DELETED"=0)
   3 - access("C"."CLIENT"='e95027f0-a83e-11e3-a0ae-005056aebabc' OR 
              "C"."CLIENT"='eb37d3b0-a83e-11e3-a0ae-005056aebabc')

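Question

So my question is: why does the CBO not execute the query in a way similar to running the two queries manually? I also tried the query with hints, to use the same indexes as in the two separate queries: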

SELECT /*+ index(c CONTACT_CLIENT) */ c.*
FROM contact c
WHERE c.client in (
    SELECT /*+ index(cfc CFC_MATERIALIZED_A_S_T) */ cfc.targetid
    FROM cfc_materialized cfc
    WHERE cfc.sourceid = 'e95027f0-a83e-11e3-a0ae-005056aebabc'
      AND cfc.association_type = 'ContactDataSharing'
  )
  AND c.deleted = 0;

But then I get an even worse explain plan:

-------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                   | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                        |    11M|  2214M|   305K  (1)| 01:01:09 |
|   1 |  NESTED LOOPS                |                        |       |       |            |          |
|   2 |   NESTED LOOPS               |                        |    11M|  2214M|   305K  (1)| 01:01:09 |
|   3 |    SORT UNIQUE               |                        |  2164 |   196K|    36   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN         | CFC_MATERIALIZED_A_S_T |  2164 |   196K|    36   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN          | CONTACT_CLIENT         |  5500 |       |    37   (0)| 00:00:01 |
|*  6 |   TABLE ACCESS BY INDEX ROWID| CONTACT                |  5474 |   550K|   973   (0)| 00:00:12 |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("CFC"."ASSOCIATION_TYPE"='ContactDataSharing' AND 
              "CFC"."SOURCEID"='e95027f0-a83e-11e3-a0ae-005056aebabc')
   5 - access("C"."CLIENT"="CFC"."TARGETID")
   6 - filter("C"."DELETED"=0)

I also tried a join instead of the IN clause, as Alexander and StanislavL suggested:

SELECT c.*
FROM contact c
JOIN cfc_materialized cfc ON c.client = cfc.targetid
WHERE cfc.sourceid = 'e95027f0-a83e-11e3-a0ae-005056aebabc'
  AND cfc.association_type = 'ContactDataSharing'
  AND c.deleted = 0;

and got the following explain plan, which is again quite slow (slower than the two separate queries):

---------------------------------------------------------------------------------------------
| Id  | Operation          | Name                   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                        |    11M|  2214M| 51745   (1)| 00:10:21 |
|*  1 |  HASH JOIN         |                        |    11M|  2214M| 51745   (1)| 00:10:21 |
|*  2 |   INDEX RANGE SCAN | CFC_MATERIALIZED_A_S_T |  2164 |   196K|    36   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| CONTACT                |    12M|  1237M| 51649   (1)| 00:10:20 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("C"."CLIENT"="CFC"."TARGETID")
   2 - access("CFC"."ASSOCIATION_TYPE"='ContactDataSharing' AND 
              "CFC"."SOURCEID"='e95027f0-a83e-11e3-a0ae-005056aebabc')
   3 - filter("C"."DELETED"=0)

【Comments】:

  • Also, I believe an index on cfc_materialized.targetid is needed here.

Tags: sql performance oracle query-optimization sql-execution-plan


【Answer 1】:
SELECT c.*
FROM contact c
JOIN (
    SELECT cfc.targetid AS client
    FROM cfc_materialized cfc
    WHERE cfc.sourceid = 'e95027f0-a83e-11e3-a0ae-005056aebabc'
      AND cfc.association_type = 'ContactDataSharing'
  ) sub ON c.client = sub.client
  AND c.deleted = 0;

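Use a join instead of IN to avoid evaluating the same subquery for every row in the WHERE clause.

【Comments】:

  • Why would the query be evaluated multiple times?
【Answer 2】: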

The main thing is to avoid a full table scan of the Contact table. Avoid c.* and specify an explicit column list. Rebuild the index:

CREATE INDEX CFC_MATERIALIZED_A_S_T ON CFC_MATERIALIZED(TARGETID, ASSOCIATION_TYPE, SOURCEID);

Control the join order.
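One possible reading of this advice uses the documented LEADING, USE_NL and INDEX hints, so that the small cfc_materialized result drives index lookups on contact. This is a sketch only: it has not been verified to be faster on this data, the hints do not correct the optimizer's row estimate, and unlike the IN version the join can return duplicate contact rows if targetid is not unique for the given source.

```sql
-- Start from cfc, then probe contact through CONTACT_CLIENT with nested loops.
SELECT /*+ LEADING(cfc c) USE_NL(c) INDEX(c CONTACT_CLIENT) */ c.*
FROM contact c
JOIN cfc_materialized cfc ON c.client = cfc.targetid
WHERE cfc.sourceid = 'e95027f0-a83e-11e3-a0ae-005056aebabc'
  AND cfc.association_type = 'ContactDataSharing'
  AND c.deleted = 0;
```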

Let me know how it goes.

【Comments】:
