【Question title】: How to configure a CloudWatch alarm to evaluate once every X minutes
【Posted】: 2021-11-21 20:32:05
【Question】:

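I want to configure a CloudWatch alarm to:

  • Sum the last 30 minutes of the ApplicationRequestsTotal metric once every 30 minutes
  • Alarm when the sum equals 0

I have configured the custom CloudWatch ApplicationRequestsTotal metric to be emitted once every 60 seconds for my service.

I have configured the alarm as: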

{
    "MetricAlarms": [
        {
            "AlarmName": "radio-silence-alarm",
            "AlarmDescription": "Alarm if 0 or less requests are received for 1 consecutive period(s) of 30 minutes.",
            "ActionsEnabled": true,
            "OKActions": [],
            "InsufficientDataActions": [],
            "MetricName": "ApplicationRequestsTotal",
            "Namespace": "AWS/ElasticBeanstalk",
            "Statistic": "Sum",
            "Dimensions": [
                {
                    "Name": "EnvironmentName",
                    "Value": "service-environment"
                }
            ],
            "Period": 1800,
            "EvaluationPeriods": 1,
            "Threshold": 0.0,
            "ComparisonOperator": "LessThanOrEqualToThreshold",
            "TreatMissingData": "missing"
        }
    ],
    "CompositeAlarms": []
}

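I have set up many alarms like this one, and each of them appears to:

  • Sum the last 30 minutes of the ApplicationRequestsTotal metric once every minute

For example, this service started reporting 0 for ApplicationRequestsTotal at 8:36 AM, and CloudWatch triggered the alarm at 9:06 AM.

Output of aws cloudwatch describe-alarm-history for the period above: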

{
    "AlarmName": "radio-silence-alarm",
    "AlarmType": "MetricAlarm",
    "Timestamp": "2021-09-29T09:06:37.929000+00:00",
    "HistoryItemType": "StateUpdate",
    "HistorySummary": "Alarm updated from OK to ALARM",
    "HistoryData": "{
       "version":"1.0",
       "oldState":{
          "stateValue":"OK",
          "stateReason":"Threshold Crossed: 1 datapoint [42.0 (22/09/21 08:17:00)] was not less than or equal to the threshold (0.0).",
          "stateReasonData":{
             "version":"1.0",
             "queryDate":"2021-09-22T08:47:37.930+0000",
             "startDate":"2021-09-22T08:17:00.000+0000",
             "statistic":"Sum",
             "period":1800,
             "recentDatapoints":[
                42.0
             ],
             "threshold":0.0,
             "evaluatedDatapoints":[
                {
                   "timestamp":"2021-09-22T08:17:00.000+0000",
                   "sampleCount":30.0,
                   "value":42.0
                }
             ]
          }
       },
       "newState":{
          "stateValue":"ALARM",
          "stateReason":"Threshold Crossed: 1 datapoint [0.0 (29/09/21 08:36:00)] was less than or equal to the threshold (0.0).",
          "stateReasonData":{
             "version":"1.0",
             "queryDate":"2021-09-29T09:06:37.926+0000",
             "startDate":"2021-09-29T08:36:00.000+0000",
             "statistic":"Sum",
             "period":1800,
             "recentDatapoints":[
                0.0
             ],
             "threshold":0.0,
             "evaluatedDatapoints":[
                {
                   "timestamp":"2021-09-29T08:36:00.000+0000",
                   "sampleCount":30.0,
                   "value":0.0
                }
             ]
          }
       }
    }"
}

Which part of my configuration is incorrect?

【Question comments】:

    Tags: amazon-web-services amazon-cloudwatch cloudwatch-alarms


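    【Solution 1】:

    That is not how Amazon CloudWatch works.

    When you create an alarm in CloudWatch, you specify:

    • A metric (e.g. CPU utilization, or perhaps a custom metric sent to CloudWatch)
    • A time period (e.g. the previous 30 minutes)
    • An aggregation method (e.g. Average, Sum, Count)

    For example, CloudWatch can trigger an alarm if the average of the metric exceeds a threshold over the previous 30 minutes. This is evaluated continually as a sliding window. It does not look at the metric in distinct 30-minute blocks.

    Using your example, the alarm fires whenever the sum of the metric over the previous 30 minutes is zero.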
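
    The difference between a sliding window and distinct 30-minute blocks can be sketched with a small simulation. This is a simplified model, not CloudWatch's actual evaluation logic: it assumes one datapoint per minute, an evaluation once per minute, and a steady 1 request per minute until traffic stops at 08:36. The function names (make_datapoints, window_sum, first_alarm) are hypothetical helpers written for this illustration.

```python
from datetime import datetime, timedelta

PERIOD = timedelta(minutes=30)  # mirrors the alarm's "Period": 1800 seconds
MINUTE = timedelta(minutes=1)   # the metric is emitted every 60 seconds


def make_datapoints(start, end, silence_from):
    """Per-minute request counts: 1 per minute until silence_from, then 0 (assumed traffic)."""
    points = {}
    t = start
    while t < end:
        points[t] = 0 if t >= silence_from else 1
        t += MINUTE
    return points


def window_sum(points, window_start):
    """Sum of the datapoints stamped in [window_start, window_start + PERIOD)."""
    window_end = window_start + PERIOD
    return sum(v for ts, v in points.items() if window_start <= ts < window_end)


def first_alarm(points, start, end, step, threshold=0):
    """Return the first evaluation time whose trailing 30-minute sum is <= threshold.

    step = 1 minute models a sliding window; step = 30 minutes models the
    distinct blocks the question asked for.
    """
    t = start + PERIOD
    while t <= end:
        if window_sum(points, t - PERIOD) <= threshold:
            return t
        t += step
    return None


start = datetime(2021, 9, 29, 8, 0)
end = datetime(2021, 9, 29, 10, 0)
silence_from = datetime(2021, 9, 29, 8, 36)
points = make_datapoints(start, end, silence_from)

sliding = first_alarm(points, start, end, step=MINUTE)
blocks = first_alarm(points, start, end, step=PERIOD)

print("sliding window fires at", sliding.strftime("%H:%M"))  # 09:06
print("distinct blocks fire at", blocks.strftime("%H:%M"))   # 09:30
```

    In this model the sliding window fires at 09:06, which is in line with the alarm history in the question (a window starting at 08:36:00, queried at 09:06:37). Blocks aligned to the half hour would not fire until 09:30.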

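    【Comments】:

    • Thanks John, good to know that the metric is evaluated continually. As far as you know, is there no way to change that?
    • There is no way to change this behaviour.
    • Thanks for confirming, John. Much appreciated.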