Testing direct path read

Reposted from http://blog.itpub.net/22034023/viewspace-688817

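In the Top 5 wait events of the CRMG database, one wait keeps showing up that almost never made the top list in 10g and earlier: direct path read.

In 10g and earlier this wait was associated with parallel query.

It was caused by the direct path I/O that slave sessions perform while running a parallel query. If the wait was severe, the fix was to improve the performance of the parallel query itself.

What puzzled me is that we run no parallel queries on crmg.

Did something change in 11g?

The answer follows.

First, read the passage below, excerpted from: http://oraclue.com/2009/07/17/direct-path-reads-and-serial-table-scans-in-11g/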

There have been changes in 11g in the heuristics to choose between direct path reads or reads through buffer cache for serial table scans.

In 10g, serial table scans for "large" tables used to go through cache (by default) which is not the case anymore.

In 11g, this decision to read via direct path or through cache is based on the size of the table, buffer cache size and various other stats.

Direct path reads are faster than scattered reads and have less impact on other processes because they avoid latches.

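From 11g on, Oracle decides, based on table size among other factors, whether a full table scan reads through the buffer cache or uses direct path read. Direct path read needs fewer latches.

Let's test it:

1) Prepare a large table, comfortably bigger than the value of _small_table_threshold.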

SQL> set linesize 120
SQL> col name for a30
SQL> col value for a20
SQL> col describ for a60
SQL> SELECT x.ksppinm NAME, y.ksppstvl VALUE, x.ksppdesc describ
  2  FROM SYS.x$ksppi x, SYS.x$ksppcv y
  3  WHERE x.inst_id = USERENV ('Instance')
  4  AND y.inst_id = USERENV ('Instance')
  5  AND x.indx = y.indx
  6  AND x.ksppinm LIKE '%&par%'
  7  /
Enter value for par: small_table
old   6: AND x.ksppinm LIKE '%&par%'
new   6: AND x.ksppinm LIKE '%small_table%'

NAME                           VALUE                DESCRIB
------------------------------ -------------------- ------------------------------------------------------------
_small_table_threshold         1300                 threshold level of table size for direct reads

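Here I picked ord_order_item_detail, a very large table on crmg.

Then open two sessions and trace each of them with event 10046:

Before starting the trace, one event needs introducing: event 10949, which disables direct path read.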

SESSION 1, direct path read:

alter system flush buffer_cache;
alter session set events '10046 trace name context forever, level 8';
select /*+ full(a) */ count(*) from ord_order_item_detail a;
alter session set events '10046 trace name context off';

SESSION 2, direct path read disabled, falling back to scattered reads:

alter system flush buffer_cache;
alter session set events '10046 trace name context forever, level 8';
alter session set events '10949 trace name context forever, level 1';
select /*+ full(a) */ count(*) from ord_order_item_detail a;
alter session set events '10046 trace name context off';

Let's look at the trace files of the two sessions:

Trace file of SESSION 1:

call     count      cpu    elapsed       disk      query    current       rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1     0.00       0.00          0          0          0          0
Execute      1     0.00       0.00          0          0          0          0
Fetch        2     3.00      53.37      97515      97639          0          1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total        4     3.00      53.37      97515      97639          0          1

Elapsed times include waiting on following events:
Event waited on                             Times   Max. Wait  Total Waited
----------------------------------------   Waited  ----------  ------------
SQL*Net message to client                       3        0.00          0.00
SQL*Net message from client                     3        0.00          0.00
db file sequential read                        13        0.01          0.07
direct path read                            12250        1.39         49.74

********************************************************************************

Trace file of SESSION 2:

call     count      cpu    elapsed       disk      query    current       rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1     0.00       0.00          0          0          0          0
Execute      1     0.00       0.00          0          0          0          0
Fetch        2     2.97       3.42      97515      98129          0          1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total        4     2.97       3.42      97515      98129          0          1

Elapsed times include waiting on following events:
Event waited on                             Times   Max. Wait  Total Waited
----------------------------------------   Waited  ----------  ------------
SQL*Net message to client                       7        0.00          0.00
SQL*Net message from client                     7        0.00          0.00
db file sequential read                        13        0.00          0.00
db file scattered read                       6125        0.00          0.55
latch free                                      3        0.00          0.01
latch: cache buffers chains                     1        0.00          0.00

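Unfortunately, judging from this example alone, direct path read performed far worse than db file scattered read: 53.37 s elapsed against 3.42 s.

On latch contention, however, the direct path read run recorded no latch waits, while the db file scattered read run did (latch free and latch: cache buffers chains).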
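To compare the two runs per I/O call rather than in total, the small calculation below divides "Total Waited" by "Times Waited" for the main read event of each session. The figures are copied from the two tkprof summaries above; the helper name avg_wait_ms is purely illustrative.

```python
# Mean wait per I/O call, derived from the tkprof wait summaries.
# Input figures come from the two trace files shown above.

def avg_wait_ms(total_waited_s: float, times_waited: int) -> float:
    """Return the mean wait per call in milliseconds."""
    return total_waited_s / times_waited * 1000.0

direct = avg_wait_ms(49.74, 12250)    # SESSION 1: direct path read
scattered = avg_wait_ms(0.55, 6125)   # SESSION 2: db file scattered read

print(f"direct path read:       {direct:.2f} ms per wait")
print(f"db file scattered read: {scattered:.2f} ms per wait")
```

Both sessions report the same 97515 disk blocks, so the gap lies in how long each call waited, not in how much was read. Scattered reads averaging under a tenth of a millisecond are unusually fast for physical disk I/O; a cache below the database layer may have been serving them, though the trace alone cannot confirm that.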

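About the parameter _serial_direct_read: the documentation says that setting it to true enables direct path read. In my tests, however, whatever the value of this parameter, once the table reaches a certain size it is scanned with direct path read.

One more point: direct path read triggers a checkpoint on the table, i.e. the table's dirty blocks are written back to disk.

Maybe something is wrong with my database. Direct path read times vary enormously, while scattered read times are very flat.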

WAIT #2: nam='db file scattered read' ela= 94 file#=53 block#=272 blocks=16 obj#=12153 tim=1299644760722303
WAIT #2: nam='db file scattered read' ela= 87 file#=54 block#=273 blocks=15 obj#=12153 tim=1299644760725285
WAIT #2: nam='db file scattered read' ela= 92 file#=55 block#=272 blocks=16 obj#=12153 tim=1299644760728224
WAIT #2: nam='db file scattered read' ela= 90 file#=56 block#=272 blocks=16 obj#=12153 tim=1299644760731193

WAIT #2: nam='direct path read' ela= 7997 file number=11 first dba=26186 block cnt=15 obj#=12153 tim=1299644699977054
WAIT #2: nam='direct path read' ela= 4 file number=10 first dba=26474 block cnt=15 obj#=12153 tim=1299644699977115
WAIT #2: nam='direct path read' ela= 8752 file number=12 first dba=26058 block cnt=15 obj#=12153 tim=1299644699988878
WAIT #2: nam='direct path read' ela= 11 file number=11 first dba=26186 block cnt=15 obj#=12153 tim=1299644699988963
WAIT #2: nam='direct path read' ela= 13831 file number=29 first dba=124378 block cnt=15 obj#=12153 tim=1299644700005480

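I repeated the test on the Qingdao test database, and there the direct path read scan took far less time than db file scattered read.

Table size: 22 GB

direct path read scan: Elapsed: 00:03:20.44

db file scattered read scan: Elapsed: 00:05:20.44

Exactly two minutes apart.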
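The two-minute gap can be checked with a few lines of arithmetic. The sketch below parses the two SQL*Plus "Elapsed" strings quoted above; parse_elapsed is a hypothetical helper written for this note, not part of any Oracle tool.

```python
from datetime import timedelta

def parse_elapsed(text: str) -> timedelta:
    """Parse a SQL*Plus 'HH:MM:SS.ss' elapsed string into a timedelta."""
    hours, minutes, seconds = text.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes), seconds=float(seconds))

direct = parse_elapsed("00:03:20.44")      # direct path read scan
scattered = parse_elapsed("00:05:20.44")   # db file scattered read scan

saved = scattered - direct
print(f"difference: {saved.total_seconds():.2f} s")
print(f"direct path read scan was {saved / scattered:.1%} shorter")
```

On this 22 GB table the direct path read scan finished in a bit under two thirds of the scattered read time.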

The number of I/O calls issued by the two methods is close:

oracle11@qdstandby2:/opt/oracle/admin/diag/rdbms/crmgsb/crmg/trace> grep 'direct path read' crmg_ora_455120.trc | wc -l
17479
oracle11@qdstandby2:/opt/oracle/admin/diag/rdbms/crmgsb/crmg/trace> grep 'db file scattered read' crmg_ora_455120.trc | wc -l
18068

However, the wide swings in I/O response time are still there with direct path read:

WAIT #2: nam='direct path read' ela= 21 file number=16 first dba=214529 block cnt=127 obj#=23236 tim=26576037710961
WAIT #2: nam='direct path read' ela= 30413 file number=17 first dba=214400 block cnt=128 obj#=23236 tim=26576037744576
WAIT #2: nam='direct path read' ela= 16 file number=18 first dba=214401 block cnt=127 obj#=23236 tim=26576037747520
WAIT #2: nam='direct path read' ela= 13121 file number=16 first dba=214656 block cnt=128 obj#=23236 tim=26576037763559
WAIT #2: nam='direct path read' ela= 15 file number=17 first dba=214530 block cnt=126 obj#=23236 tim=26576037766438
WAIT #2: nam='direct path read' ela= 29255 file number=18 first dba=214528 block cnt=128 obj#=23236 tim=26576037798557
WAIT #2: nam='direct path read' ela= 15 file number=16 first dba=214785 block cnt=127 obj#=23236 tim=26576037801488
WAIT #2: nam='direct path read' ela= 17960 file number=17 first dba=214656 block cnt=128 obj#=23236 tim=26576037822200
WAIT #2: nam='direct path read' ela= 1 file number=18 first dba=214657 block cnt=127 obj#=23236 tim=26576037825246
WAIT #2: nam='direct path read' ela= 22402 file number=16 first dba=214912 block cnt=128 obj#=23236 tim=26576037850470
WAIT #2: nam='direct path read' ela= 15 file number=17 first dba=214785 block cnt=127 obj#=23236 tim=26576037853535
WAIT #2: nam='direct path read' ela= 31519 file number=18 first dba=214784 block cnt=128 obj#=23236 tim=26576037892702
WAIT #2: nam='direct path read' ela= 16 file number=16 first dba=215041 block cnt=127 obj#=23236 tim=26576037895685
WAIT #2: nam='direct path read' ela= 24008 file number=17 first dba=214912 block cnt=128 obj#=23236 tim=26576037922490
WAIT #2: nam='direct path read' ela= 16 file number=18 first dba=214913 block cnt=127 obj#=23236 tim=26576037925382

whereas the I/O response times of db file scattered read are fairly steady:

WAIT #1: nam='db file scattered read' ela= 31296 file#=16 block#=202242 blocks=126 obj#=23236 tim=26576310523644
WAIT #1: nam='db file scattered read' ela= 25830 file#=17 block#=202114 blocks=126 obj#=23236 tim=26576310554176
WAIT #1: nam='db file scattered read' ela= 30326 file#=18 block#=202114 blocks=126 obj#=23236 tim=26576310588374
WAIT #1: nam='db file scattered read' ela= 19134 file#=16 block#=202370 blocks=126 obj#=23236 tim=26576310611310
WAIT #1: nam='db file scattered read' ela= 20454 file#=17 block#=202242 blocks=126 obj#=23236 tim=26576310635544
WAIT #1: nam='db file scattered read' ela= 22665 file#=18 block#=202242 blocks=126 obj#=23236 tim=26576310661976
WAIT #1: nam='db file scattered read' ela= 20304 file#=16 block#=202498 blocks=126 obj#=23236 tim=26576310687267
WAIT #1: nam='db file scattered read' ela= 7812 file#=17 block#=202370 blocks=126 obj#=23236 tim=26576310698848
WAIT #1: nam='db file scattered read' ela= 11577 file#=18 block#=202370 blocks=126 obj#=23236 tim=26576310715361
WAIT #1: nam='db file scattered read' ela= 10604 file#=16 block#=202626 blocks=126 obj#=23236 tim=26576310730434
WAIT #1: nam='db file scattered read' ela= 14844 file#=17 block#=202498 blocks=126 obj#=23236 tim=26576310749031
WAIT #1: nam='db file scattered read' ela= 14715 file#=18 block#=202498 blocks=126 obj#=23236 tim=26576310767578
WAIT #1: nam='db file scattered read' ela= 20183 file#=16 block#=202754 blocks=126 obj#=23236 tim=26576310792739
WAIT #1: nam='db file scattered read' ela= 14721 file#=17 block#=202626 blocks=126 obj#=23236 tim=26576310811619
WAIT #1: nam='db file scattered read' ela= 29155 file#=18 block#=202626 blocks=126 obj#=23236 tim=26576310845827
WAIT #1: nam='db file scattered read' ela= 22199 file#=16 block#=202882 blocks=126 obj#=23236 tim=26576310880130
WAIT #1: nam='db file scattered read' ela= 16471 file#=17 block#=202754 blocks=126 obj#=23236 tim=26576310901276
WAIT #1: nam='db file scattered read' ela= 22103 file#=18 block#=202754 blocks=126 obj#=23236 tim=26576310927107
WAIT #1: nam='db file scattered read' ela= 26420 file#=16 block#=203010 blocks=126 obj#=23236 tim=26576310958408
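One way to quantify the jitter, rather than eyeballing it, is to pull the ela= values out of the WAIT lines and summarize them per event. The sketch below does that for the first four lines of each listing above. The regular expression and the summarize helper are my own illustration, and the script assumes ela= is reported in microseconds, as it normally is in trace files of this version.

```python
import re
from statistics import mean

# Matches the event name and the ela= value of a 10046 WAIT line.
WAIT_RE = re.compile(r"WAIT #\d+: nam='([^']+)' ela= *(\d+)")

def summarize(lines):
    """Group ela values (assumed microseconds) by wait event name."""
    by_event = {}
    for line in lines:
        match = WAIT_RE.search(line)
        if match:
            by_event.setdefault(match.group(1), []).append(int(match.group(2)))
    return {
        event: {"n": len(ela), "min": min(ela), "max": max(ela), "mean": mean(ela)}
        for event, ela in by_event.items()
    }

# First four WAIT lines of each listing, copied from the trace output.
sample = [
    "WAIT #2: nam='direct path read' ela= 21 file number=16 first dba=214529 block cnt=127 obj#=23236 tim=26576037710961",
    "WAIT #2: nam='direct path read' ela= 30413 file number=17 first dba=214400 block cnt=128 obj#=23236 tim=26576037744576",
    "WAIT #2: nam='direct path read' ela= 16 file number=18 first dba=214401 block cnt=127 obj#=23236 tim=26576037747520",
    "WAIT #2: nam='direct path read' ela= 13121 file number=16 first dba=214656 block cnt=128 obj#=23236 tim=26576037763559",
    "WAIT #1: nam='db file scattered read' ela= 31296 file#=16 block#=202242 blocks=126 obj#=23236 tim=26576310523644",
    "WAIT #1: nam='db file scattered read' ela= 25830 file#=17 block#=202114 blocks=126 obj#=23236 tim=26576310554176",
    "WAIT #1: nam='db file scattered read' ela= 30326 file#=18 block#=202114 blocks=126 obj#=23236 tim=26576310588374",
    "WAIT #1: nam='db file scattered read' ela= 19134 file#=16 block#=202370 blocks=126 obj#=23236 tim=26576310611310",
]

stats = summarize(sample)
for event, s in stats.items():
    print(f"{event}: n={s['n']} min={s['min']} max={s['max']} mean={s['mean']:.0f}")
```

To run it against a whole trace file, pass an open file handle to summarize instead of the sample list. Even on this small sample the direct path read waits span from tens of microseconds to tens of milliseconds, while the scattered read waits all stay within the same order of magnitude.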

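I also tested how large a table has to be before direct path read kicks in.

The conclusion: the table must reach at least _small_table_threshold*5, i.e. five times _small_table_threshold.

At 1 to 4 times the threshold, the session's wait events are all of this kind: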

EVENT                          TOTAL_WAITS TIME_WAITED
------------------------------ ----------- -----------
SQL*Net message from client             29        9752
db file scattered read                 504          15
events in waitclass Other                2          11
db file sequential read                 23           3
SQL*Net message to client               30           0

As soon as the table reaches five times the threshold, direct path read appears:

SQL*Net message from client             37       24145
events in waitclass Other                3          20
direct path read                       448          13
db file sequential read                 33           3
SQL*Net message to client               38           0
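The observed rule can be written down as a tiny predicate. This is only a sketch of what the test above showed, not Oracle's real heuristic, which also weighs buffer cache size and other statistics. The function name expect_direct_path_read is made up for illustration, the default of 1300 is the _small_table_threshold value queried earlier, and whether the boundary is "at" or "just above" five times was not pinned down by the test.

```python
def expect_direct_path_read(table_blocks: int,
                            small_table_threshold: int = 1300,
                            multiplier: int = 5) -> bool:
    """Rule of thumb from the test: a serial full scan switches to
    direct path read once the table reaches multiplier * threshold blocks."""
    return table_blocks >= multiplier * small_table_threshold

cutoff_blocks = 5 * 1300
print(f"cutoff: {cutoff_blocks} blocks")

# Tables at 1x..6x the threshold, mirroring the experiment above.
for factor in range(1, 7):
    blocks = factor * 1300
    print(factor, blocks, expect_direct_path_read(blocks))
```

The large table scanned earlier read about 97515 blocks from disk, far above this cutoff, which matches its direct path reads in SESSION 1.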

http://www.eygle.com/archives/2012/05/oracle_11g_direct_path_read.html

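Oracle Database 11g introduces a new behavior: a full table scan can be executed with direct path read (Direct Path Read, DPR). This is a sensible change: if the bulk reads of a full table scan are only occasional, direct path read keeps that data from flooding the Buffer Cache.

For small tables, of course, Oracle still lets a full table scan (FTS) go through the Buffer Cache, since that may be faster and has little impact on performance.
What counts as a small table is governed by the hidden parameter _small_table_threshold. If a table is larger than 5 times the small-table limit, the FTS automatically uses DPR instead of buffered reads.
The initialization parameter _serial_direct_read can be set to disable serial direct path reads.

Oracle uses an internal limit to decide the threshold at which DPR is used.
Setting event 10949 switches this feature off and returns to the behavior before Oracle 11g: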

alter session set events '10949 trace name context forever, level 1';

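Another parameter, _very_large_object_threshold, sets the upper limit (in MB) for using DPR; it only takes effect together with event 10949.
Setting event 10949 at any level disables DPR for FTS, but only for tables smaller than 5 times the Buffer Cache. In addition, if a table is larger than 0.8 times the _very_large_object_threshold setting, DPR is still used.

The aim of these limits is this:
full table scans of large tables must go through Direct Path Read, to reduce the impact on the Buffer Cache and on performance.
We can, however, adjust the upper and lower bounds for DPR through parameters.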

Before 11gR2, the full table scan access path read all the blocks of a table (or index fast full scan) under the high water mark into the buffer cache, unless either the "_serial_direct_read" hidden parameter was set to true or the table/index had default parallelism set.

In 11gR2, Oracle will automatically decide whether to use direct path reads bypassing buffer cache for serial full table scans.

For using direct path reads, Oracle first has to write any dirty blocks of the table to disk by issuing object level checkpoint and then read all the blocks from disk into the PGA of server process. If it has to undo any uncommitted transactions to provide read consistency or to do delayed block clean out, Oracle will use server process PGA to construct read consistent block buffers.

If block clean out has to be performed by the server process using direct path reads for full table scans, it won't write those modified blocks back to disk, so every time you perform full table scan using direct path reads it'd have to keep doing the same work of cleaning out the blocks for every execution. For this reason, it is recommended to manually clean out those blocks by performing full table scan without using direct path reads.

Following behavior was observed with my testing on 11.2.0.1:

Hidden parameter "_small_table_threshold" defines the number of blocks to consider a table as small. Any table having more blocks (about 5 times the value of "_small_table_threshold" if you leave it at default value) will automatically use direct path reads for serial full table scans (FTS).

Hidden parameter "_very_large_object_threshold" defines the upper limit of object size in MB for using direct path reads. There is no effect on FTS behavior just by setting this parameter alone.

Event 10949 set to any level will disable direct path reads for serial scans only if the size of an object is less than 5 times the size of buffer cache.

Combination of event 10949 and "_very_large_object_threshold" parameter will disable direct path reads for serial scans if the size of an object is less than 5 times the size of buffer cache or the value of "_very_large_object_threshold" is less than about 0.8 times the size of an object.

So, if you want to disable direct path reads for serial scans for any object, then set event 10949 at any level and set "_very_large_object_threshold" to greater than the size of largest object in MB.

Hidden parameter "_serial_direct_read" (or event 10355 set at any level) set to TRUE will enable direct path reads for all serial scans, unless the table is considered a small table and its caching attribute is set (by issuing alter table xxxx cache). Remember that any SQL statement already parsed and not using direct path reads will continue to do so unless a hard parse is forced after setting these parameters. For this reason, it is better not to set these parameters.

It is not recommended to set any of the above mentioned hidden parameters if you want direct path reads to be used for serial scans, let Oracle decide dynamically based on the size of an object.