【Question title】: Nagios | Automatically fix a service if found in bad state
【Posted】: 2019-08-22 06:31:47
【Question】:

Is it possible to configure a command or script in Nagios that runs when a service is found to be in a bad state?

【Comments】:

    Tags: nagios


    【Answer 1】:

    Yes, this can be done with event handlers. Here is an example service definition:

    define service {
        host_name               somehost
        service_description     HTTP
        max_check_attempts      4
        event_handler           restart-httpd
        ...
    }
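
Event handlers only run when they are enabled both globally and for the service. Both switches are usually on in a stock setup, so the fragment below is a sketch of making that explicit, not a required step; check the values in your own configuration. The `...` stands for the remaining directives of the service, as in the definition above.

```cfg
# nagios.cfg (main configuration file): global switch
enable_event_handlers=1

# Object configuration: per-service switch
define service {
    host_name               somehost
    service_description     HTTP
    event_handler           restart-httpd
    event_handler_enabled   1
    ...
}
```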
    

    The command definition:

    define command {
        command_name    restart-httpd
        command_line    /usr/local/nagios/libexec/eventhandlers/restart-httpd  $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$
    }
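
Nagios expands the three macros on the command line before it runs the handler, so the script receives them as the positional parameters `$1`, `$2` and `$3`. A minimal sketch for checking what a handler would see; `log_event` is a hypothetical helper (not part of Nagios) and the sample values are made up:

```shell
#!/bin/sh
# Hypothetical debug helper: prints the three arguments a handler receives.
#   $1 = $SERVICESTATE$      (OK, WARNING, UNKNOWN or CRITICAL)
#   $2 = $SERVICESTATETYPE$  (SOFT or HARD)
#   $3 = $SERVICEATTEMPT$    (number of the current check attempt)
log_event() {
    printf 'state=%s type=%s attempt=%s\n' "$1" "$2" "$3"
}

# Simulate the call Nagios would make on the 3rd failed check.
log_event CRITICAL SOFT 3
```

Redirecting this output to a log file of your choice is one way to confirm that the handler is being invoked at all.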
    

    Here is the restart-httpd script:

    #!/bin/sh
    # What state is the HTTP service in?
    case "$1" in
    OK)
        # The service just came back up, so don't do anything...
        ;;
    WARNING)
        # We don't really care about warning states, since the service is probably still running...
        ;;
    UNKNOWN)
        # We don't know what might be causing an unknown error, so don't do anything...
        ;;
    CRITICAL)
        # Aha!  The HTTP service appears to have a problem - perhaps we should restart the server...
        # Is this a "soft" or a "hard" state?
        case "$2" in
    
        # We're in a "soft" state, meaning that Nagios is in the middle of retrying the
        # check before it turns into a "hard" state and contacts get notified...
        SOFT)
    
            # What check attempt are we on?  We don't want to restart the web server on the first
            # check, because it may just be a fluke!
            case "$3" in
    
            # Wait until the check has been tried 3 times before restarting the web server.
            # If the check fails on the 4th time (after we restart the web server), the state
            # type will turn to "hard" and contacts will be notified of the problem.
            # Hopefully this will restart the web server successfully, so the 4th check will
            # result in a "soft" recovery.  If that happens no one gets notified because we
            # fixed the problem!
            3)
                # printf instead of "echo -n", which is not portable under /bin/sh
                printf '%s' "Restarting HTTP service (3rd soft critical state)..."
                # Call the init script to restart the HTTPD server
                /etc/rc.d/init.d/httpd restart
                ;;
            esac
            ;;
    
        # The HTTP service somehow managed to turn into a hard error without getting fixed.
        # It should have been restarted by the code above, but for some reason that didn't work.
        # Let's give it one last try, shall we?  
        # Note: Contacts have already been notified of a problem with the service at this
        # point (unless you disabled notifications for this service)
        HARD)
            printf '%s' "Restarting HTTP service..."
            # Call the init script to restart the HTTPD server
            /etc/rc.d/init.d/httpd restart
            ;;
        esac
        ;;
    esac
    exit 0
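
The decision logic of the script can be tried out without touching the web server. The following is a sketch under that assumption: `should_restart` and `check` are hypothetical helpers that mirror the `case` structure above and only report what the handler would do.

```shell
#!/bin/sh
# should_restart STATE STATETYPE ATTEMPT
# Returns 0 when the handler above would restart httpd, 1 otherwise.
should_restart() {
    [ "$1" = "CRITICAL" ] || return 1
    case "$2" in
    SOFT) [ "$3" = "3" ] ;;   # only on the 3rd soft failure
    HARD) return 0 ;;         # last attempt once the state is hard
    *)    return 1 ;;
    esac
}

# Print the decision for one combination of arguments.
check() {
    if should_restart "$@"; then
        echo "$* -> restart"
    else
        echo "$* -> no action"
    fi
}

check OK HARD 1
check CRITICAL SOFT 1
check CRITICAL SOFT 3
check CRITICAL HARD 4
```

Only the last two calls report a restart. Note that Nagios runs event handlers as the user the Nagios daemon runs under, so on a real system that user must be allowed to run the restart command.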
    
