【Question title】: Pods not replaced if MaxUnavailable set to 0 in Kubernetes
【Posted】: 2019-05-29 17:47:01
【Question】:

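I want to roll out a deployment for my pods. I update the pods from a CI environment using set image. When I set maxUnavailable to 1 in the Deployment/web file, I get downtime. But when I set maxUnavailable to 0, the pods are not replaced and the container/app is not restarted.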

I have a single node in the Kubernetes cluster. Here is its info:

    Allocated resources:
      (Total limits may be over 100 percent, i.e., overcommitted.)
      CPU Requests  CPU Limits  Memory Requests  Memory Limits
      ------------  ----------  ---------------  -------------
      881m (93%)    396m (42%)  909712Ki (33%)   1524112Ki (56%)
    Events:         <none>

Here is the full YAML file. I do have a readiness probe set.

            apiVersion: extensions/v1beta1
            kind: Deployment
            metadata:
              annotations:
                deployment.kubernetes.io/revision: "10"
                kompose.cmd: C:\ProgramData\chocolatey\lib\kubernetes-kompose\tools\kompose.exe
                  convert
                kompose.version: 1.14.0 (fa706f2)
                kubectl.kubernetes.io/last-applied-configuration: |
                  {"apiVersion":"extensions/v1beta1","kind":"Deployment","metadata":{"annotations":{"kompose.cmd":"C:\\ProgramData\\chocolatey\\lib\\kubernetes-kompose\\tools\\kompose.exe convert","kompose.version":"1.14.0 (fa706f2)"},"creationTimestamp":null,"labels":{"io.kompose.service":"dev-web"},"name":"dev-web","namespace":"default"},"spec":{"replicas":1,"strategy":{},"template":{"metadata":{"labels":{"io.kompose.service":"dev-web"}},"spec":{"containers":[{"env":[{"name":"JWT_KEY","value":"ABCD"},{"name":"PORT","value":"2000"},{"name":"GOOGLE_APPLICATION_CREDENTIALS","value":"serviceaccount/quick-pay.json"},{"name":"mongoCon","value":"mongodb://quickpayadmin:quickpay1234@ds121343.mlab.com:21343/quick-pay-db"},{"name":"PGHost","value":"173.255.206.177"},{"name":"PGUser","value":"postgres"},{"name":"PGDatabase","value":"quickpay"},{"name":"PGPassword","value":"z33shan"},{"name":"PGPort","value":"5432"}],"image":"gcr.io/quick-pay-208307/quickpay-dev-node:latest","imagePullPolicy":"Always","name":"dev-web-container","ports":[{"containerPort":2000}],"readinessProbe":{"failureThreshold":3,"httpGet":{"path":"/","port":2000,"scheme":"HTTP"},"initialDelaySeconds":5,"periodSeconds":5,"successThreshold":1,"timeoutSeconds":1},"resources":{"requests":{"cpu":"20m"}}}]}}}}
              creationTimestamp: 2018-12-24T12:13:48Z
              generation: 12
              labels:
                io.kompose.service: dev-web
              name: dev-web
              namespace: default
              resourceVersion: "9631122"
              selfLink: /apis/extensions/v1beta1/namespaces/default/deployments/web
              uid: 5e66f7b3-0775-11e9-9653-42010a80019d
            spec:
              progressDeadlineSeconds: 600
              replicas: 2
              revisionHistoryLimit: 10
              selector:
                matchLabels:
                  io.kompose.service: web
              strategy:
                rollingUpdate:
                  maxSurge: 1
                  maxUnavailable: 0
                type: RollingUpdate
              template:
                metadata:
                  creationTimestamp: null
                  labels:
                    io.kompose.service: web
                spec:
                  containers:
                  - env:
                    - name: PORT
                      value: "2000"

                    image: gcr.io/myimagepath/web-node
                    imagePullPolicy: Always
                    name: web-container
                    ports:
                    - containerPort: 2000
                      protocol: TCP
                    readinessProbe:
                      failureThreshold: 10
                      httpGet:
                        path: /
                        port: 2000
                        scheme: HTTP
                      initialDelaySeconds: 10
                      periodSeconds: 10
                      successThreshold: 1
                      timeoutSeconds: 10
                    resources:
                      requests:
                        cpu: 10m
                    terminationMessagePath: /dev/termination-log
                    terminationMessagePolicy: File
                  dnsPolicy: ClusterFirst
                  restartPolicy: Always
                  schedulerName: default-scheduler
                  securityContext: {}
                  terminationGracePeriodSeconds: 30
            status:
              availableReplicas: 2
              conditions:
              - lastTransitionTime: 2019-01-03T05:49:46Z
                lastUpdateTime: 2019-01-03T05:49:46Z
                message: Deployment has minimum availability.
                reason: MinimumReplicasAvailable
                status: "True"
                type: Available
              - lastTransitionTime: 2018-12-24T12:13:48Z
                lastUpdateTime: 2019-01-03T06:04:24Z
                message: ReplicaSet "dev-web-7bd498fc74" has successfully progressed.
                reason: NewReplicaSetAvailable
                status: "True"
                type: Progressing
              observedGeneration: 12
              readyReplicas: 2
              replicas: 2
              updatedReplicas: 2

I have also tried with 1 replica, and it still does not work.

【Discussion】:

Tags: deployment kubernetes


【Solution 1】:

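In the first scenario, Kubernetes deletes one pod (maxUnavailable: 1), starts a pod with the new image, and waits about 110 seconds (based on your readiness probe) to check whether the new pod can serve requests. The new pod cannot serve requests yet, but it is in the Running state, so Kubernetes deletes the second old pod and starts it with the new image; that second pod then also waits for its readiness probe to complete. This leaves a window in which neither container is ready to serve requests, hence the downtime.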
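The figure of about 110 seconds appears to come from the readiness probe settings in the question. A rough sketch of that arithmetic, assuming the estimate is initialDelaySeconds plus failureThreshold times periodSeconds (the function name is hypothetical, and probe timeouts and scheduling jitter are ignored):

```python
# Readiness probe settings copied from the Deployment in the question.
initial_delay_seconds = 10
period_seconds = 10
failure_threshold = 10


def readiness_window(initial_delay, period, failures):
    """Rough number of seconds before a container that never becomes ready
    has used up all of its allowed probe failures.

    Assumption: simple estimate of initial delay + period * failures.
    """
    return initial_delay + period * failures


window = readiness_window(initial_delay_seconds, period_seconds, failure_threshold)
print(f"approximate readiness window: {window}s")
```

With the values from the question this gives 110, which matches the estimate used in this answer.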

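In the second scenario, with maxUnavailable: 0, Kubernetes first brings up a pod with the new image. That pod cannot serve requests within about 110 seconds (based on your readiness probe), so it times out and Kubernetes deletes the new pod. The same happens with the second pod. As a result, neither of your pods gets updated.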
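To see why the two settings behave so differently, it can help to work out the pod counts a rolling update is allowed to pass through. The sketch below is only an illustration: the function name is hypothetical, and it handles absolute values of maxSurge and maxUnavailable only, not percentages.

```python
def rollout_bounds(replicas, max_surge, max_unavailable):
    """Return the pod-count limits a rolling update must stay within.

    max_total_pods: most pods (old + new) that may exist at once.
    min_available_pods: fewest pods that must stay available.
    Assumption: max_surge and max_unavailable are absolute integers.
    """
    return {
        "max_total_pods": replicas + max_surge,
        "min_available_pods": replicas - max_unavailable,
    }


# Settings from the question: replicas 2, maxSurge 1, maxUnavailable 0.
strict = rollout_bounds(2, 1, 0)
# Same Deployment with maxUnavailable 1.
relaxed = rollout_bounds(2, 1, 1)

print("maxUnavailable 0:", strict)
print("maxUnavailable 1:", relaxed)
```

With maxUnavailable: 0 both existing pods must stay available, so an old pod can only be removed once a new pod reports Ready. With maxUnavailable: 1 the count may drop to one available pod, which is where the downtime in the first scenario can come from.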

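So the cause is that you are not giving your application enough time to come up and start serving requests. Increase failureThreshold in your readiness probe, keep maxUnavailable: 0, and it should work.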

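【Comments】:

• Is it possible that my readinessProbe is not working? How can I test that? It is a Node app and it starts quickly. I have also increased failureThreshold to 10.
• While the rolling update is running, you can run kubectl rollout status deployment.v1.apps/nginx-deployment in another window (substituting your own Deployment name); it shows whether the update is progressing or failing.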