【Question title】: Kubernetes Autoscaler on AWS not working
【Posted】: 2018-09-21 03:15:51
【Question】:

I am trying to set up the Kubernetes cluster autoscaler on Amazon AWS as described in the docs, but I get this error in my cluster-autoscaler pod logs:

E0411 09:23:25.529212   1 static_autoscaler.go:118] Failed to update node registry: RequestError: send request failed caused by: Post https://autoscaling.us-west-2a.amazonaws.com/: dial tcp: lookup autoscaling.us-west-2a.amazonaws.com on 10.96.0.10:53: no such host

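Context:

I created an AWS Auto Scaling group named KubeAutoscale from a Launch Configuration that uses my custom AMI, which has Ubuntu Server 16.04 LTS (HVM) with Docker and Kubernetes installed (just a plain install).

In the Auto Scaling group I set a minimum of 2 and a maximum of 5 instances (they are in the us-west-2a region). I logged in to one of them and set up the Kubernetes cluster, logged in to the other instance and joined it to that cluster, then logged back in to the master (first) instance and applied the autoscaler configuration: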

---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
  name: cluster-autoscaler
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
- apiGroups: [""]
  resources: ["events","endpoints"]
  verbs: ["create", "patch"]
- apiGroups: [""]
  resources: ["pods/eviction"]
  verbs: ["create"]
- apiGroups: [""]
  resources: ["pods/status"]
  verbs: ["update"]
- apiGroups: [""]
  resources: ["endpoints"]
  resourceNames: ["cluster-autoscaler"]
  verbs: ["get","update"]
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["watch","list","get","update"]
- apiGroups: [""]
  resources: ["pods","services","replicationcontrollers","persistentvolumeclaims","persistentvolumes"]
  verbs: ["watch","list","get"]
- apiGroups: ["extensions"]
  resources: ["replicasets","daemonsets"]
  verbs: ["watch","list","get"]
- apiGroups: ["policy"]
  resources: ["poddisruptionbudgets"]
  verbs: ["watch","list"]
- apiGroups: ["apps"]
  resources: ["statefulsets"]
  verbs: ["watch","list","get"]
- apiGroups: ["storage.k8s.io"]
  resources: ["storageclasses"]
  verbs: ["watch","list","get"]

---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: Role
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["create"]
- apiGroups: [""]
  resources: ["configmaps"]
  resourceNames: ["cluster-autoscaler-status"]
  verbs: ["delete","get","update"]

---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: kube-system

---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: kube-system

---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    app: cluster-autoscaler
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
    spec:
      serviceAccountName: cluster-autoscaler
      containers:
        - image: k8s.gcr.io/cluster-autoscaler:v0.6.0
          name: cluster-autoscaler
          resources:
            limits:
              cpu: 100m
              memory: 300Mi
            requests:
              cpu: 100m
              memory: 300Mi
          command:
            - ./cluster-autoscaler
            - --v=4
            - --stderrthreshold=info
            - --cloud-provider=aws
            - --skip-nodes-with-local-storage=false
            - --nodes=2:5:KubeAutoscale
          env:
            - name: AWS_REGION
              value: us-west-2a
          volumeMounts:
            - name: ssl-certs
              mountPath: /etc/ssl/certs/ca-certificates.crt
              readOnly: true
          imagePullPolicy: "Always"
      volumes:
        - name: ssl-certs
          hostPath:
            path: "/etc/ssl/certs/ca-certificates.crt"

【Comments】:

    Tags: amazon-web-services kubernetes


    【Answer 1】:

    You have a configuration problem:

     env:
      - name: AWS_REGION
        value: us-west-2a
    

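    us-west-2 is your AWS region, while us-west-2a is an availability zone (AZ) inside that region. The autoscaler builds the Auto Scaling endpoint URL from the AWS_REGION value, so it ends up with https://autoscaling.us-west-2a.amazonaws.com/ instead of the correct https://autoscaling.us-west-2.amazonaws.com/. That host does not exist, which is why the DNS lookup fails with "no such host".

    To fix it, set AWS_REGION to us-west-2 instead of us-west-2a.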
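The region/AZ mix-up above can be illustrated with a small Python sketch. It is not how cluster-autoscaler itself is implemented; `region_from_zone` and `autoscaling_endpoint` are hypothetical helper names, and the sketch assumes the common AZ naming scheme of a region name followed by a single letter (Local Zone style names are returned unchanged). The URL pattern is the one quoted in the answer.

```python
import re

# Assumption: a standard AZ name is "<region><letter>", e.g. "us-west-2a".
# Names that do not fit this pattern are returned unchanged.
_AZ_PATTERN = re.compile(r"^([a-z]{2}(?:-[a-z]+)+-\d+)([a-z])$")


def region_from_zone(value: str) -> str:
    """Strip the trailing AZ letter, if present, to get the region name."""
    match = _AZ_PATTERN.match(value)
    return match.group(1) if match else value


def autoscaling_endpoint(region: str) -> str:
    """Build the endpoint URL using the pattern shown in the answer."""
    return f"https://autoscaling.{region}.amazonaws.com/"


if __name__ == "__main__":
    configured = "us-west-2a"  # the value from the question's manifest
    print(autoscaling_endpoint(configured))                    # host does not exist
    print(autoscaling_endpoint(region_from_zone(configured)))  # regional endpoint
```

A check like this could run before applying the manifest to catch an AZ name passed where a region is expected.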

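    【Comments】:

    • Hi Anton. Thanks for the quick reply. You are right, that was a problem. It is fixed, but now I get another error: Error while checking node group for : Wrong id: expected format aws:////, got
    • Check your kubelet daemon options; the --cloud-provider=aws option should be present. If it is not, add it, and that should resolve the issue. Also, don't forget to check the nodes' IAM role. It should at least allow access to tags.
    • Anton, thanks for your help. I added that flag to the kubelet, but it failed, saying that AWS could not find the cluster ID. I then added the --cloud-config flag with a config file containing the KubernetesClusterTag, KubernetesClusterID and Zone properties, and ran: kubelet --cloud-provider=aws --cloud-config=/etc/kubernetes/cloud-config.conf. It ended with the line: listen tcp 0.0.0.0:10255: bind: address already in use ... and the error I mentioned earlier keeps repeating (I guess because the kubelet was not updated, given that last line about the address being in use). The documentation on this topic is really poor :/
    • Your answer saved me a lot of time.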
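The "bind: address already in use" line in the comment above suggests that something was already listening on port 10255 when the kubelet was started by hand. One possibility is a kubelet already running as a service, though the thread does not confirm this. A minimal Python sketch for checking whether a port can still be bound; `port_is_free` is a hypothetical helper, not part of any Kubernetes tooling:

```python
import socket


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can be bound to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically EADDRINUSE: another process already holds the port.
            return False
        return True


if __name__ == "__main__":
    # 10255 is the port from the error message in the comment.
    print("port 10255 free:", port_is_free(10255))
```

If the port is taken, changing the flags of the process that already holds it and restarting that process is usually more useful than launching a second copy.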