【Posted】: 2021-11-19 23:51:51
【Question】:
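Looking through the Datadog AWS integration documentation, I found that AWS alarms can be streamed into Datadog. The Alarm collection section of that documentation states that there are two different methods for sending AWS CloudWatch alarms to the Datadog event stream. However, there is no further explanation of how to do this or what needs to be set up. Searching Google for something like "Datadog aws alarm polling" only returns vague descriptions of other features, and nothing about AWS CloudWatch alarms.
My question: is this possible at all?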
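What I have tried so far is setting up the upgraded Datadog Lambda forwarder, which sends CloudWatch logs (and, I assume, metrics and alarms as well?) to DD. I granted that Lambda the permissions it needs. I created some AWS metric filters and AWS alarms that fire when a specific event happens. I then ran some Lambda code that raises an exception and causes the CloudWatch alarm to change its state.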
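For illustration, a test Lambda for that last step can be as small as the sketch below. It is not the actual code: the handler name `handler`, the `pending_messages` event field, and the exception class are assumptions made for this example. The only real requirement is that the text `QueueHasMessagesException` ends up in the function's log group, so that the metric filter's `FilterPattern` matches it.

```python
class QueueHasMessagesException(Exception):
    """Hypothetical exception; the metric filter matches on this name in the logs."""


def handler(event, context):
    # Hypothetical input field: messages still waiting in the incoming queue.
    pending = int(event.get("pending_messages", 1))
    if pending > 0:
        # The Lambda Python runtime writes unhandled exceptions, including the
        # exception class name, to the function's CloudWatch log group.
        raise QueueHasMessagesException(f"{pending} unprocessed message(s)")
    return {"status": "ok"}
```

Invoking it with an empty test event raises the exception; one matching log event sets the `QueueHasMessagesException` metric to 1 for that period, which is above the alarm threshold of 0.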
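I can clearly see the Lambda logs in DD, but I cannot find anything related to my alarm in DD Events. I also do not think the DD-AWS integration itself is the problem, because we use it across a large organisation and it was configured long before I joined the company. What am I doing wrong?
The CloudFormation script is below (I removed some parts, so it will not work as-is):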
Resources:
  DatadogForwarderLambda:
    Type: AWS::Lambda::Function
    Properties:
      Description: Pushes logs, metrics and traces from AWS to Datadog.
      Role: !GetAtt "DatadogForwarderLambdaRole.Arn"
      Handler: lambda_function.lambda_handler
      Code:
        S3Bucket: config-sandbox
        S3Key: 'aws-dd-forwarder-3.38.0.zip'
      MemorySize: 1024
      Runtime: python3.7
      Timeout: 120
      Tags:
        - Key: "dd_forwarder_version"
          Value: 3.38.0
      Environment:
        Variables:
          DD_ENHANCED_METRICS: "false"
          DD_API_KEY_SECRET_ARN:
            Ref: DdApiKeySecret
          DD_S3_BUCKET_NAME: config-sandbox
          DD_SITE: datadoghq.com
          DD_: datadoghq.com
          DD_TAGS_CACHE_TTL_SECONDS: 300
          DD_FETCH_LAMBDA_TAGS: true
          DD_USE_TCP: false
          DD_NO_SSL: false
          REDACT_IP: false
          REDACT_EMAIL: false
          DD_USE_PRIVATE_LINK: false
          DD_USE_VPC: false
      ReservedConcurrentExecutions: 100
  DatadogReadonlyPolicy:
    Type: 'AWS::IAM::Policy'
    Properties:
      PolicyName: !Sub "DatadogReadonlyPolicy"
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Action:
              - 'cloudwatch:Get*'
              - 'cloudwatch:List*'
              - 'cloudwatch:DescribeAlarmHistory'
              - 'cloudtrail:LookupEvents'
              - 'ec2:Describe*'
              - 's3:GetObject'
              - 's3:PutObject'
              - 's3:DeleteObject'
              - 's3:ListBucket'
              - 'lambda:List*'
              - 'tag:GetResources'
              - 'tag:GetTagKeys'
              - 'tag:GetTagValues'
              - 'support:*'
            Resource: !GetAtt DatadogForwarderLambda.Arn
          - Effect: Allow
            Action:
              - secretsmanager:GetSecretValue
            Resource:
              - Ref: DdApiKeySecret
      Roles:
        - !Ref DatadogForwarderLambdaRole
  DatadogForwarderLambdaRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - lambda.amazonaws.com
              AWS:
                - Fn::Sub:
                    - "arn:aws:iam::${AccountId}:role/human-role/some-role-name"
                    - { AccountId: !Ref 'AWS::AccountId' }
            Action:
              - sts:AssumeRole
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
        - arn:aws:iam::aws:policy/service-role/AWSLambdaVPCAccessExecutionRole
      Path: /
      PermissionsBoundary:
        Fn::Join:
          - ''
          - - 'arn:aws:iam::'
            - Ref: AWS::AccountId
            - ':policy/some-organisation-permission-boundary'
      RoleName:
        Fn::Sub:
          - 'a${AIID}-dd-forwarder-lambda-${StackID}'
          - { StackID: !Select [4, !Split ["-", !Ref 'AWS::StackId']],
              AIID: !Ref AIID }
  IncomingQueueHasMessagesExceptionAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmDescription: Incoming queue has unprocessed messages, new processing round can't be started
      AlarmName: !Sub "IncomingQueueHasMessagesExceptionAlarm"
      ComparisonOperator: GreaterThanThreshold
      Threshold: 0 # no messages are allowed in queue if new round started
      EvaluationPeriods: 1
      Period: 10
      Namespace: dev-logs
      MetricName: QueueHasMessagesException
      Statistic: Sum
      TreatMissingData: missing
  IncomingQueueHasMessagesExceptionMetricFilter:
    Type: AWS::Logs::MetricFilter
    Properties:
      LogGroupName:
        !Sub '/aws/lambda/${SomeLambdaName}'
      FilterPattern: "QueueHasMessagesException"
      MetricTransformations:
        - MetricNamespace: dev-logs
          MetricName: QueueHasMessagesException
          MetricValue: 1
【Comments】:
Tags: amazon-web-services aws-lambda amazon-cloudformation amazon-cloudwatch datadog