Symptom: the server rebooted unexpectedly.

Suspected causes: 1. Hardware: overheating or a memory fault. 2. A crontab scheduled job. 3. A manual reboot by someone. 4. A planned power cut.

All of these were eventually ruled out, one by one.

An improper ulimit setting can cause the operating system to reboot.

OS log: /var/log/messages showed nothing abnormal.

A hardware check in the machine room also came back normal, with no alarms.

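After talking with the team, I learned that the database was running a large volume of business jobs before every reboot. That made Oracle the prime suspect for the system reboots.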

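Alert log check: there were no warnings at all before the reboot, but after the instance started the log contained the following warnings about the memlock limit and page sizes (highlighted in red in the original post):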

2020-04-29T21:05:12.706017+08:00
Thread 1 advanced to log sequence 1906 (LGWR switch)
  Current log# 1 seq# 1906 mem# 0: /data1/oracle/oradata/orclts/redo01.log
2020-04-29T22:13:52.470497+08:00
Starting ORACLE instance (normal) (OS id: 11755)
2020-04-29T22:13:52.542739+08:00
CLI notifier numLatches:7 maxDescs:554
2020-04-29T22:13:52.630245+08:00
**********************************************************************
2020-04-29T22:13:52.630300+08:00
Dump of system resources acquired for SHARED GLOBAL AREA (SGA) 

2020-04-29T22:13:52.630391+08:00
 Per process system memlock (soft) limit = 64K
2020-04-29T22:13:52.630447+08:00
 Expected per process system memlock (soft) limit to lock
 SHARED GLOBAL AREA (SGA) into memory: 24G

2020-04-29T22:13:52.630557+08:00
 Available system pagesizes:
  4K, 2048K 
2020-04-29T22:13:52.630652+08:00
 Supported system pagesize(s):
2020-04-29T22:13:52.630701+08:00
  PAGESIZE  AVAILABLE_PAGES  EXPECTED_PAGES  ALLOCATED_PAGES  ERROR(s)
2020-04-29T22:13:52.630751+08:00
        4K       Configured               8         6291464        NONE
2020-04-29T22:13:52.630852+08:00
     2048K                0           12289               0        NONE
2020-04-29T22:13:52.630901+08:00
RECOMMENDATION:
2020-04-29T22:13:52.630950+08:00
 1. For optimal performance, configure system with expected number 
 of pages for every supported system pagesize prior to the next 
 instance restart operation.
2020-04-29T22:13:52.680828+08:00
 2. Increase per process memlock (soft) limit to at least 24GB
 to lock 100% of SHARED GLOBAL AREA (SGA) pages into physical memory
2020-04-29T22:13:52.680931+08:00

**********************************************************************
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Initial number of CPU is 4
Number of processor cores in the system is 4
Number of processor sockets in the system is 1
Using LOG_ARCHIVE_DEST_1 parameter default value as /data1/oracle/product/12.2.0/db_1/dbs/arch
Autotune of undo retention is turned on. 
IMODE=BR
ILAT =55
LICENSE_MAX_USERS = 0
SYS auditing is enabled
NOTE: remote asm mode is local (mode 0x1; from cluster type)
NOTE: Using default ASM root directory ASM
NOTE: Cluster configuration type = NONE [2]
Oracle Database 12c Standard Edition Release 12.2.0.1.0 - 64bit Production.
ORACLE_HOME:    /data1/oracle/product/12.2.0/db_1
System name:    Linux
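
The key numbers in the excerpt above are the current memlock soft limit (64K) and the amount Oracle expects to be able to lock (24G). As a minimal sketch, assuming the alert log keeps this exact wording, the two values can be pulled out and compared like this. The names `ALERT_EXCERPT`, `to_bytes`, and `memlock_gap` are illustrative and not part of any Oracle tool.

```python
import re

# Sample lines copied from the alert log excerpt above.
ALERT_EXCERPT = """\
 Per process system memlock (soft) limit = 64K
 Expected per process system memlock (soft) limit to lock
 SHARED GLOBAL AREA (SGA) into memory: 24G
"""

# Binary multipliers for the size suffixes used in the log.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def to_bytes(size):
    """Convert a size such as '64K' or '24G' to bytes."""
    return int(size[:-1]) * UNITS[size[-1]]


def memlock_gap(text):
    """Return (current_soft_limit, expected_limit) in bytes from log text."""
    current = re.search(r"memlock \(soft\) limit = (\d+[KMG])", text)
    expected = re.search(r"into memory: (\d+[KMG])", text)
    if not current or not expected:
        raise ValueError("memlock lines not found in log text")
    return to_bytes(current.group(1)), to_bytes(expected.group(1))


current, expected = memlock_gap(ALERT_EXCERPT)
print(f"soft limit: {current} bytes, expected: {expected} bytes")
print("limit too small" if current < expected else "limit is sufficient")
```

For this excerpt the soft limit is far below the expected value, which matches recommendation 2 in the log.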

 

Attempted fix:

1. Edit /etc/security/limits.conf. memlock was not configured there before.

It should be adjusted as follows:
oracle soft nproc 112047
oracle hard nproc 116384
oracle soft nofile 14096
oracle hard nofile 165536
oracle soft stack 110240
oracle hard stack 132768
oracle hard memlock unlimited
oracle soft memlock unlimited
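
Settings in limits.conf only apply to new login sessions, so it is worth confirming what a fresh session of the oracle user actually gets. A minimal sketch using Python's standard `resource` module is shown below. It assumes it is run on Linux as the oracle user after logging in again, and the helper names `format_limit` and `memlock_limits` are illustrative. Note that limits.conf expresses memlock in KB, while `getrlimit` reports bytes.

```python
import resource


def format_limit(value):
    """Render an rlimit value in bytes as KB, e.g. 65536 -> '64K'."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value // 1024}K"


def memlock_limits():
    """Return the formatted (soft, hard) RLIMIT_MEMLOCK of this process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    return format_limit(soft), format_limit(hard)


if __name__ == "__main__":
    soft, hard = memlock_limits()
    print(f"memlock soft={soft} hard={hard}")
```

If the new settings are in effect, both values should read `unlimited`. A result of `64K` means the session still has the old limit reported in the alert log.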

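2. Enable Linux HugePages. None were configured before. As a rule, the total size of the HugePages pool should be larger than the SGA and smaller than physical memory.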
# cat /proc/meminfo |grep Hug
AnonHugePages:    524288 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
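
In the output above, `HugePages_Total` is 0, so no HugePages pool is reserved. `AnonHugePages` counts transparent huge pages, which are separate from that pool. As a small sketch, the same check can be scripted by parsing the text. `MEMINFO_SAMPLE` is the output shown above, and the function names are illustrative.

```python
# Output of `cat /proc/meminfo | grep Hug` as shown above.
MEMINFO_SAMPLE = """\
AnonHugePages:    524288 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
"""


def parse_hugepage_fields(text):
    """Map each Huge* field name to its integer value (kB or page count)."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        if "Huge" in name and rest.strip():
            fields[name.strip()] = int(rest.split()[0])
    return fields


def hugepages_configured(fields):
    """True when the kernel has a non-empty HugePages pool reserved."""
    return fields.get("HugePages_Total", 0) > 0


fields = parse_hugepage_fields(MEMINFO_SAMPLE)
print(fields)
print("HugePages configured:", hugepages_configured(fields))
```

On a live system the same function can be fed the contents of /proc/meminfo instead of the sample string.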

vi /etc/sysctl.conf

Add the following:

vm.nr_hugepages = 26000
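
To sanity-check a value for `vm.nr_hugepages`, the arithmetic can be sketched as follows, assuming the 24G SGA and the 2048 kB huge page size reported above. The function names are illustrative. The alert log asks for 12289 pages, one more than the exact quotient for 24 GiB, presumably because the real SGA is slightly larger than 24G.

```python
def hugepages_needed(sga_bytes, hugepage_kb=2048):
    """Smallest number of huge pages that covers an SGA of sga_bytes."""
    page_bytes = hugepage_kb * 1024
    return -(-sga_bytes // page_bytes)  # ceiling division


def pool_size_gib(nr_hugepages, hugepage_kb=2048):
    """Memory reserved by nr_hugepages huge pages, in GiB."""
    return nr_hugepages * hugepage_kb / (1024 ** 2)


sga_bytes = 24 * 1024**3  # 24G, as reported in the alert log
print("pages for a 24 GiB SGA:", hugepages_needed(sga_bytes))
print("GiB reserved by 26000 pages:", round(pool_size_gib(26000), 1))
```

With 2048 kB pages, 26000 pages reserve roughly 50.8 GiB, so this value only satisfies the rule above if the host has more physical memory than that, which the post does not state. Running `sysctl -p` reloads /etc/sysctl.conf, and the instance must be restarted before the SGA can use the new pool.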
