【发布时间】:2022-07-07 17:56:31
【问题描述】:
我正在尝试通过 CloudFormation 为自托管 Prometheus 设置 EFS 挂载。下面是我设置的 CloudFormation:
Resources:
ServiceSecurityGroup:
Type: AWS::EC2::SecurityGroup
Properties:
GroupDescription: 'Prometheus SG'
VpcId:
Fn::ImportValue: !Sub '${NetworkStackName}-VPCID'
SecurityGroupIngress:
# Allow access from the Load Balancer only
- SourceSecurityGroupId:
Fn::ImportValue: !Sub '${LBStackName}-SG-LB'
IpProtocol: tcp
FromPort: 9090
ToPort: 9090
Tags:
- Key: Name
Value: !Sub 'SG-Prometheus-LB-${Stage}'
EFSSecurityGroup:
Type: AWS::EC2::SecurityGroup
Properties:
GroupDescription: 'Prometheus EFS SG'
VpcId:
Fn::ImportValue: !Sub '${NetworkStackName}-VPCID'
SecurityGroupIngress:
# Allow access from the ECS Service only
- SourceSecurityGroupId: !Ref ServiceSecurityGroup
IpProtocol: tcp
FromPort: 2049
ToPort: 2049
Tags:
- Key: Name
Value: !Sub 'SG-Prometheus-EFS-${Stage}'
MountTarget1:
Type: AWS::EFS::MountTarget
Properties:
FileSystemId:
Fn::ImportValue: !Sub '${DataStackName}-EFSID'
SecurityGroups:
- !Ref EFSSecurityGroup
SubnetId:
Fn::ImportValue: !Sub '${NetworkStackName}-PRIVATE-SUBNET-A1'
MountTarget2:
Type: AWS::EFS::MountTarget
Properties:
FileSystemId:
Fn::ImportValue: !Sub '${DataStackName}-EFSID'
SecurityGroups:
- !Ref EFSSecurityGroup
SubnetId:
Fn::ImportValue: !Sub '${NetworkStackName}-PRIVATE-SUBNET-B1'
Prometheus:
Type: AWS::ECS::Service
DependsOn: ListenerRule
Properties:
Cluster: !Ref Cluster
LaunchType: FARGATE
DesiredCount: !FindInMap [ ECSTaskDefinition, !Ref Stage, DesiredTaskCount ]
TaskDefinition: !Ref TaskDefinition
HealthCheckGracePeriodSeconds: 300
DeploymentConfiguration:
MaximumPercent: 200
MinimumHealthyPercent: 100
LoadBalancers:
- ContainerName: 'Prometheus-Container'
ContainerPort: 9090
TargetGroupArn: !Ref TargetGroup
NetworkConfiguration:
AwsvpcConfiguration:
SecurityGroups:
- !Ref ServiceSecurityGroup
Subnets:
- Fn::ImportValue: !Sub '${NetworkStackName}-PRIVATE-SUBNET-A1'
- Fn::ImportValue: !Sub '${NetworkStackName}-PRIVATE-SUBNET-B1'
EnableECSManagedTags: true
TaskDefinition:
Type: AWS::ECS::TaskDefinition
Properties:
Family: Prometheus
RequiresCompatibilities:
- FARGATE
Cpu: !FindInMap [ ECSTaskDefinition, !Ref Stage, CPU ]
Memory: !FindInMap [ ECSTaskDefinition, !Ref Stage, Memory ]
ContainerDefinitions:
- Name: 'Prometheus-Container'
Essential: true
Image: !Sub '${AWS::AccountId}.dkr.ecr.${AWS::Region}.amazonaws.com/${RepositoryName}:${ImageTag}'
Cpu: !FindInMap [ ECSTaskDefinition, !Ref Stage, CPU ]
Memory: !FindInMap [ ECSTaskDefinition, !Ref Stage, Memory ]
PortMappings:
- ContainerPort: 9090
MountPoints:
- SourceVolume: 'Prometheus-Volume'
ContainerPath: '/prometheus'
ReadOnly: false
LogConfiguration:
LogDriver: awslogs
Options:
awslogs-group:
Fn::ImportValue: !Sub '${DataStackName}-CW-LogsGroup'
awslogs-region: !Ref AWS::Region
awslogs-stream-prefix: 'PrometheusApp'
Volumes:
- Name: 'Prometheus-Volume'
EFSVolumeConfiguration:
FilesystemId:
Fn::ImportValue: !Sub '${DataStackName}-EFSID'
RootDirectory: "/"
TransitEncryption: ENABLED
NetworkMode: awsvpc
TaskRoleArn:
Fn::ImportValue: !Sub '${LBStackName}-TaskRoleArn'
ExecutionRoleArn: !GetAtt ExecutionRole.Arn
TargetGroup:
Type: AWS::ElasticLoadBalancingV2::TargetGroup
Properties:
VpcId:
Fn::ImportValue: !Sub '${NetworkStackName}-VPCID'
TargetType: ip
Port: 9090
Protocol: HTTP
Matcher:
HttpCode: '200'
TargetGroupAttributes:
- Key: 'deregistration_delay.timeout_seconds'
Value: '60'
HealthCheckIntervalSeconds: 10
HealthCheckPath: /status
HealthCheckProtocol: HTTP
HealthCheckTimeoutSeconds: 2
HealthyThresholdCount: 2
ListenerRule:
Type: AWS::ElasticLoadBalancingV2::ListenerRule
Properties:
ListenerArn:
Fn::ImportValue: !Sub '${LBStackName}-LB-LISTENER'
Priority: 2
Conditions:
- Field: path-pattern
Values:
- /*
Actions:
- TargetGroupArn: !Ref TargetGroup
Type: forward
TaskPolicy:
Type: AWS::IAM::Policy
Properties:
PolicyName: 'Prometheus-TaskPolicy'
Roles:
- Fn::ImportValue: !Sub '${LBStackName}-TaskRole'
PolicyDocument:
Version: '2012-10-17'
Statement:
- Sid: EnablePutMetricData
Effect: Allow
Resource: '*'
Action:
- cloudwatch:PutMetricData
ExecutionRole:
Type: AWS::IAM::Role
Properties:
AssumeRolePolicyDocument:
Version: '2012-10-17'
Statement:
- Effect: Allow
Principal:
Service: ["ecs-tasks.amazonaws.com"]
Action:
- sts:AssumeRole
ManagedPolicyArns:
- 'arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy'
Cluster:
Type: AWS::ECS::Cluster
Properties:
ClusterName: !Sub "Prometheus-${Stage}"
一切都已成功部署,但容器因错误消息而死亡:ResourceInitializationError: failed to invoke EFS utils commands to set up EFS volumes: stderr: Failed to resolve "fs-xxxxxxxxxxx.efs.us-east-1.amazonaws.com。我已经检查过https://aws.amazon.com/premiumsupport/knowledge-center/fargate-unable-to-mount-efs/ 并且认为我们对区域没有问题。
欢迎任何想法。
【问题讨论】:
-
您的 VPC 是否启用了 DNS?如何在您的问题中包含您的 ECS 任务定义,以便我们了解您如何配置 EFS 挂载。您使用的是 EFS 接入点吗?您部署 ECS 任务的 VPC 子网是公有还是私有?如果是公共的,您是否在 ECS 任务上启用公共 IP?如果是私有的,它们是否有到 NAT 网关的路由?
-
- VPC 没有禁用 DNS - 已包含任务定义 - 不确定接入点 - 私有子网 - 它们是否应该有到 NAT 网关的路由?我原以为与 AWS 服务进行通信不需要这样做?
-
是的,私有子网需要路由到 NAT 网关或 VPC 终端节点,除非您已经在子网中启用了 EFS 访问点。您认为不需要与 AWS 服务进行通信的假设是不正确的。
标签: amazon-web-services yaml amazon-cloudformation amazon-ecs amazon-efs