【Posted】: 2021-10-22 01:42:48
【Question】:
I have set up a CloudWatch metric filter to monitor a log group:
resource "aws_cloudwatch_log_metric_filter" "log_errors" {
  name           = "${local.fullname}-log-errors"
  log_group_name = "/aws/lambda/${local.fullname}"
  pattern        = "{ $._logLevel = \"error\" }"
  metric_transformation {
    name      = "${local.fullname}-error-count"
    namespace = "MyApp"
    value     = "1"
  }
}
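I can see that the metric is working - note the data point at 13:15 below (I manually created a log entry to test it):
An alarm is supposed to fire if the metric reports 1 or more events within one minute: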
resource "aws_cloudwatch_metric_alarm" "log_errors_alarm" {
  alarm_name          = "${local.fullname}-log-errors"
  alarm_description   = "log.error() count for MyApp lambda ${local.fullname}"
  metric_name         = "${local.fullname}-error-count"
  threshold           = "0"
  statistic           = "Sum"
  unit                = "Count"
  comparison_operator = "GreaterThanThreshold"
  datapoints_to_alarm = "1"
  evaluation_periods  = "1"
  period              = "60"
  namespace           = "MyApp"
  treat_missing_data  = "notBreaching"
  alarm_actions       = [data.aws_ssm_parameter.sns_topic_arn.value]
  ok_actions          = [data.aws_ssm_parameter.sns_topic_arn.value]
}
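But even though the metric recorded an event (as shown above), the alarm never fires:
I'm not sure how to debug this: all AWS resources were created successfully, the error I created manually reaches the metric, and I use a very similar alarm configuration successfully in other Lambdas, where it does raise alarms.
Why does my metric work while my alarm never goes into the alarm state?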
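One thing that may be worth checking, offered as a guess rather than a confirmed diagnosis: the metric filter above sets no unit in metric_transformation, so its data points are presumably published with the unit None, while the alarm asks for unit = "Count". A CloudWatch alarm that specifies a unit only evaluates data points carrying that unit, so a mismatch could leave it with nothing to evaluate. A minimal sketch, assuming that mismatch is the cause, is the same alarm with the unit line dropped; every other value is copied from the configuration above:

```hcl
# Sketch only: identical to the alarm above except that `unit` is omitted,
# on the ASSUMPTION that the metric filter publishes data points without a unit.
resource "aws_cloudwatch_metric_alarm" "log_errors_alarm" {
  alarm_name          = "${local.fullname}-log-errors"
  alarm_description   = "log.error() count for MyApp lambda ${local.fullname}"
  metric_name         = "${local.fullname}-error-count"
  namespace           = "MyApp"
  statistic           = "Sum"
  comparison_operator = "GreaterThanThreshold"
  threshold           = "0"
  datapoints_to_alarm = "1"
  evaluation_periods  = "1"
  period              = "60"
  treat_missing_data  = "notBreaching"
  alarm_actions       = [data.aws_ssm_parameter.sns_topic_arn.value]
  ok_actions          = [data.aws_ssm_parameter.sns_topic_arn.value]
}
```

If the unit should stay on the alarm, the alternative would be to make the metric filter publish the same unit, which recent versions of the AWS provider appear to allow through a unit argument inside metric_transformation. Comparing the unit shown for the metric in the CloudWatch console against the alarm's unit would confirm or rule out this theory.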
【Discussion】:
Tags: amazon-web-services terraform amazon-cloudwatch cloudwatch-alarms