【Question title】: How to get a Route53 health check working with a CloudWatch alarm in Terraform?
【Posted】: 2019-03-04 10:56:18
【Question】:

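I'm trying to get a CloudWatch alarm working with a Route53 health check.

I can set this up manually in the AWS GUI.

When I try it with Terraform, the health check shows "No alarm configured".

I have seen suggestions to use a health check against an HTTP (or other) port, but my service is internal and so is not open to HTTP/TCP port checks, which is why I'm looking at a CloudWatch StatusCheckFailed alarm.

Whatever I do, my health checks seem to end up with "No alarm configured" (the screenshot showed two manually created health checks with working alarms and two created through Terraform showing "No alarm configured").

Has anyone managed to get this working?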

-=-=-=-=-

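In the AWS console GUI, I manually added an alarm to one of the "No alarm configured" health checks above, and it appeared and updated its status.

While doing so, I noticed that the health check description is the name of the CloudWatch alarm, so it looks as though Terraform has processed at least some of the alarm information.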

-=-=-=-=-=-=-=-=-

Here is the Terraform code for one of the Route53 health checks and its CloudWatch alarm.

CW alarm:

# This is a dummy alarm, for testing.
# CloudWatch alarm for use with Route 53 DNS health Check; this does not have an action.
resource "aws_cloudwatch_metric_alarm" "dummy_alarm" {
  provider                  = "aws.use1"
  alarm_name                = "smb-nfs-server-dummy-alarm"
  alarm_description         = "Check the SMB-NFS server is alarm"
  comparison_operator       = "GreaterThanOrEqualToThreshold"
  metric_name               = "StatusCheckFailed"
  namespace                 = "AWS/EC2"
  period                    = "60"
  evaluation_periods        = "2"
  statistic                 = "Maximum"
  threshold                 = "1"
  treat_missing_data        = "breaching"
  #insufficient_data_actions = []
  #alarm_actions             = []

  dimensions {
    InstanceId              = "${var.server_01_id}"
    #HealthCheckId           = "${var.dns_hc_01_id}"
  }
}

Route53 HC:

resource "aws_route53_health_check" "server_01_health" {
  provider               = "aws.use1"
  child_health_threshold = "0"
  #child_healthchecks.#   = "0"
  #cloudwatch_alarm_name  = "awsec2-i-03dc5080f7bd3037d-paul-smb-gw-02-a-High-Status-Check-Failed-Any-"
  #cloudwatch_alarm_region = "eu-west-1"
  enable_sni             = "false"
  failure_threshold      = "0"
  fqdn                   = ""
  #id                     = "6eb384bc-2129-47ff-9a7a-90adb9f9351f"
  #insufficient_data_health_status = "LastKnownStatus"
  invert_healthcheck     = "false"
  #ip_address             = ""
  measure_latency        = "false"
  port                   = "0"
  #regions.#              = "0"
  request_interval       = "0"
  resource_path          = ""
  search_string          = ""
  #tags.%                 = "1"
  #tags.Name              = "smb-nfs-gw-02-a-OK"
  #type                   = "CLOUDWATCH_METRIC"

  #----------------------
  cloudwatch_alarm_name           = "${aws_cloudwatch_metric_alarm.dummy_alarm.alarm_name}"
  #cloudwatch_alarm_name           = "${aws_cloudwatch_metric_alarm.smb_nfs_server_01_alarm.alarm_name}"
  cloudwatch_alarm_region         = "us-east-1"
  #cloudwatch_alarm_region         = "${var.aws_region}"
  insufficient_data_health_status = "LastKnownStatus"
  tags                            = "${merge(var.tags, map("Name", "${var.tags["Name"]}_server_01_health"))}"
  type                            = "CLOUDWATCH_METRIC"
}

(As you can see, I have been trying various options, including the region.)

-=-=-=-=-=-=-=-=-

【Discussion】:

    Tags: amazon-web-services terraform amazon-route53


    【Solution 1】:

    I got it working by putting the HealthCheckId into the alarm's dimensions, rather than using the cloudwatch_alarm_name attribute in aws_route53_health_check.

    resource "aws_cloudwatch_metric_alarm" "dummy_alarm" {
      ...
      dimensions {
        ...
        HealthCheckId           = "${aws_route53_health_check.server_01_health.id}"
      }
    }
    
    resource "aws_route53_health_check" "server_01_health" {
        ...
        #cloudwatch_alarm_name   =
        #cloudwatch_alarm_region =
    }
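
    A fuller sketch of this approach in Terraform 0.12+ syntax, where dimensions is a map argument assigned with "=". The answer elides the other alarm arguments, so the ones below are assumptions: the alarm is taken to watch Route 53's own HealthCheckStatus metric in the AWS/Route53 namespace (published in us-east-1, where 1 means healthy), and the HTTPS check against example.com plus every "example" resource name is a hypothetical placeholder.

    ```hcl
    # Route 53 health check metrics are published in us-east-1,
    # so the alarm is created through an aliased provider for that region.
    provider "aws" {
      alias  = "use1"
      region = "us-east-1"
    }

    # Hypothetical endpoint check, present only so the alarm has an ID to reference.
    resource "aws_route53_health_check" "example" {
      type              = "HTTPS"
      fqdn              = "example.com"
      port              = 443
      resource_path     = "/"
      failure_threshold = 3
      request_interval  = 30
    }

    # Assumed alarm settings: go into ALARM when the health check reports unhealthy (0).
    resource "aws_cloudwatch_metric_alarm" "example" {
      provider            = aws.use1
      alarm_name          = "example-health-check-alarm"
      namespace           = "AWS/Route53"
      metric_name         = "HealthCheckStatus"
      comparison_operator = "LessThanThreshold"
      statistic           = "Minimum"
      threshold           = 1
      period              = 60
      evaluation_periods  = 2

      # Map argument (note the "="), keyed by the health check's ID.
      dimensions = {
        HealthCheckId = aws_route53_health_check.example.id
      }
    }
    ```

    In this arrangement the alarm reads the health check's status, so nothing on the health check has to reference the alarm.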
    

    【Discussion】:

      【Solution 2】:

      @Doug is correct. Just to emphasise the correction further, the solution requires you to change the dimensions map to:

      dimensions = {
        HealthCheckId = "${aws_route53_health_check.server_01_health.id}"
      }
      

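      The cloudwatch_alarm_name and cloudwatch_alarm_region arguments on the health check resource are used when you create a health check that monitors the state of a CloudWatch alarm.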
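
      For comparison, a minimal sketch of that other direction, where the health check follows the state of an existing alarm. It assumes Terraform 0.12+ syntax; alarm_region and the resource names are hypothetical, while server_01_id is the variable from the question. It also assumes the alarm is created in the region where the instance's metrics are published, and that cloudwatch_alarm_region names that same region.

      ```hcl
      # Hypothetical variable: the region that holds both the instance and its alarm.
      variable "alarm_region" {
        type    = string
        default = "eu-west-1"
      }

      # Instance ID, as in the question.
      variable "server_01_id" {
        type = string
      }

      provider "aws" {
        region = var.alarm_region
      }

      # Alarm on the EC2 status check, with no actions attached.
      resource "aws_cloudwatch_metric_alarm" "status_check" {
        alarm_name          = "server-01-status-check-failed"
        namespace           = "AWS/EC2"
        metric_name         = "StatusCheckFailed"
        comparison_operator = "GreaterThanOrEqualToThreshold"
        statistic           = "Maximum"
        threshold           = 1
        period              = 60
        evaluation_periods  = 2
        treat_missing_data  = "breaching"

        dimensions = {
          InstanceId = var.server_01_id
        }
      }

      # Health check whose status is derived from the alarm's state.
      resource "aws_route53_health_check" "from_alarm" {
        type                            = "CLOUDWATCH_METRIC"
        cloudwatch_alarm_name           = aws_cloudwatch_metric_alarm.status_check.alarm_name
        cloudwatch_alarm_region         = var.alarm_region
        insufficient_data_health_status = "LastKnownStatus"
      }
      ```

      Only one direction is wired up here: the health check references the alarm by name, and the alarm's dimensions do not mention the health check.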

      【Discussion】:
