凌晨收到同事电话,反馈应用程序访问Oracle数据库时报错,当时现场现象确认:

 

1. 应用程序访问不了数据库,使用SQL Developer测试发现访问不了数据库。报ORA-12570 TNS:packet reader failure

 

2. 使用lsnrctl status检查监听,一直没有响应,这个是极少见的情况。

 

3. 检查数据库状态为OPEN,使用nmon检查系统资源。如下一张截图所示,CPU利用率不高,但是CPU Wait%非常高。这意味着I/O不正常。可能出现了IO等待和争用(IO waits and contention)

    CPU Wait%:显示采集间隔内所有CPU处于空闲且等待I/O完成的时间比例,Wait%是CPU空闲状态的一种,当CPU处于空闲状态而又有进程处于D状态(不可中断睡眠)时,系统会统计这时的时间,并计算到Wait%里,Wait%不是一个时间值,而是时间的比例,因此在同样I/O Wait时间下,服务器CPU越多,Wait%越低,它体现了I/O操作与计算操作之间的比例。对I/O密集型的应用来说一般Wait%较高.)

ORA-00494: enqueue [CF] held for too long (more than 900 seconds) by 'inst 1, osid 5166'

 

4.打开邮件发现收到大量的监控告警日志作业发出的邮件,检查告警日志,发现里面有大量ORA错误信息,部分内容如下:

3 | | ORA-00494: enqueue [CF] held for too long (more than 900 seconds) by 'inst 1, osid 5166' 
 
10 | | ORA-00494: enqueue [CF] held for too long (more than 900 seconds) by 'inst 1, osid 5166' 
 
17 | | ORA-00494: enqueue [CF] held for too long (more than 900 seconds) by 'inst 1, osid 5166' 
 
24 | | ORA-00494: enqueue [CF] held for too long (more than 900 seconds) by 'inst 1, osid 5166' 
 
31 | | ORA-00494: enqueue [CF] held for too long (more than 900 seconds) by 'inst 1, osid 5166' 
 
38 | | ORA-00239: timeout waiting for control file enqueue: held by 'inst 1, osid 5166' for more than 900 seconds 
 
41 | | ORA-00494: enqueue [CF] held for too long (more than 900 seconds) by 'inst 1, osid 5166' 
 
48 | | ORA-00239: timeout waiting for control file enqueue: held by 'inst 1, osid 5166' for more than 900 seconds 

相关文章: