【问题标题】:Why SQL query is running very slow in Oracle Database?Why SQL query is running very slow in Oracle Database?
【发布时间】:2022-12-31 16:09:12
【问题描述】:

I have a table with about 18 million records. I have to read the whole table in one query because all of the records were created on one date. So, firstly I defined an index on four columns that I have a query on them using the below command:

CREATE INDEX test_etl_index ON test_table(t_date,c_num,obc,bu);

After creating indexes, I must use this query for every columns of c_num,obc,bu. The query is below:

    SELECT t_date,
       c_num,
       pd,
       pds,
       uc,
       obc,
       t_id,
       da,
       ca,
       db,
       t_time,
       ibc,
       lc,
       lt,
       sts,
       wd,
       bu
FROM   test_table
WHERE  t_date = '20170628'
       AND c_num IN (SELECT KEY
                     FROM   c_g
                     WHERE  g_id = 1);

But, for each column, the query takes about 8 minutes which is very very slow!

Would you please guide me on how I can change the query to get better performance? Any help would be appreciated.

  • What if you change that in (select ...) to an INNER JOIN?
  • The same impression for me, maybe Hash Join will help here.
  • Please don't store dates as strings. Makes me cringe.
  • Before you start creating indexes check (and post) your execution plan. Most probably some index is your problem.
  • Index will probably not be useful in this query because of skip scans (you filter not by leading column, so it will require to read all the data in the index). Maybe separate indexes for each column may help here, if you will use a single query with filter for each column combined via OR

标签: sql oracle plsql oracle19c


【解决方案1】:

Close to no advice can be done based on the information you provided.

Except - as commended - to fix the data type for the DATE column, as the storage of dates as strings can realy confuse the optimizer.

The expected setup dependes on your data, here are some hints.

The DATE colums is Selective

In case your predicate date='20170628' (or better col_date = date'2017-06-28) returns only very few records you'll profite from an index on this column.

create index test_table_idx on test_table(col_date);

You can expect an execution plan as follows

-------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                |     4 |   472 |     5   (0)| 00:00:01 |
|*  1 |  HASH JOIN SEMI                      |                |     4 |   472 |     5   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| TEST_TABLE     |    10 |  1120 |     2   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                  | TEST_TABLE_IDX |    10 |       |     1   (0)| 00:00:01 |
|*  4 |   TABLE ACCESS FULL                  | C_G            |     3 |    18 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - access("C_NUM"="KEY")
   3 - access("COL_DATE"=TO_DATE(' 2002-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
   4 - filter("G_ID"=1)

Note that Oracle rewrite your in (subquery) in a hash join semi, so no manual query rewrite is required.

C_NUM is Selective

If on the contrary the predicate c_num in (... returns very few records, define an index on c_num column.

create index test_table_idx2 on test_table(c_num);

You can expect an execution plan as follows

------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                 |     4 |   480 |    28   (4)| 00:00:01 |
|   1 |  NESTED LOOPS                |                 |     4 |   480 |    28   (4)| 00:00:01 |
|   2 |   NESTED LOOPS               |                 |    20 |   480 |    28   (4)| 00:00:01 |
|   3 |    SORT UNIQUE               |                 |     3 |    18 |     3   (0)| 00:00:01 |
|*  4 |     TABLE ACCESS FULL        | C_G             |     3 |    18 |     3   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN          | TEST_TABLE_IDX2 |    10 |       |     2   (0)| 00:00:01 |
|*  6 |   TABLE ACCESS BY INDEX ROWID| TEST_TABLE      |     1 |   114 |    12   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
 
   4 - filter("G_ID"=1)
   5 - access("C_NUM"="KEY")
   6 - filter("COL_DATE"=TO_DATE(' 2000-01-02 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

No Selectivity

If neither of the above hold, forgot indexes and you should see an HASH JOIN SEMI that should not take as much time on a 18M table

-----------------------------------------------------------------------------------
| Id  | Operation            | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |            | 27273 |  3142K|  4516   (1)| 00:00:01 |
|*  1 |  HASH JOIN RIGHT SEMI|            | 27273 |  3142K|  4516   (1)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL  | C_G        |     3 |    18 |     3   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL  | TEST_TABLE | 90909 |  9943K|  4512   (1)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - access("C_NUM"="KEY")
   2 - filter("G_ID"=1)
   3 - filter("COL_DATE"=TO_DATE(' 2000-01-02 00:00:00', 'syyyy-mm-dd 
              hh24:mi:ss')) 

The important point is learn how to get the execution plan of the query, how to read it and how to understand where is the bottelneck.

  • Dear @Marmite Bomber, thank you for your answer. I defined Index just on (t_date and c_num), but the query run as slow as before.
【解决方案2】:

create index on date and c_num columns only rather than (date,c_num,obc,bu) Or create another index on date and c_num and call this idx2

  • Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center.
【解决方案3】:

Problem solved.

I create INDEX on columns separately, like this:

 CREATE INDEX t_date_inx ON test_table(t_date);
 CREATE INDEX c_num_inx ON test_table(c_num);

Then run the query. It run faster.

 SELECT t_date,
   c_num,
   pd,
   pds,
   uc,
   obc,
   t_id,
   da,
   ca,
   db,
   t_time,
   ibc,
   lc,
   lt,
   sts,
   wd,
   bu
   FROM   test_table
   WHERE  t_date = '20170628'
       AND c_num IN (SELECT KEY
                 FROM   c_g
                 WHERE  g_id = 1);

Hope it was useful for others.

  • This is diffierent to your original query.
  • Thank you for your feedback. I corrected it.
【解决方案4】:

use exists instead of in()

WHERE t_date = '20170628' AND exists(select 'x' FROM c_g WHERE g_id = 1 and KEY = test_table.c_num )

    猜你喜欢
    • 2022-12-19
    • 2022-12-02
    • 1970-01-01
    • 2022-12-19
    • 2022-12-02
    • 2020-02-11
    • 2022-11-20
    • 2022-12-02
    • 2022-08-05
    相关资源
    最近更新 更多