Kubernetes pod on Google Container Engine continually restarts, is never ready

【Question title】: Kubernetes pod on Google Container Engine continually restarts, is never ready
【Posted】: 2017-11-12 23:12:42
【Question】:

I'm trying to deploy a Ghost blog on GKE, working from the "persistent disks with WordPress" tutorial. I have a working container that runs fine when I start it manually on a GKE node:

docker run -d --name my-ghost-blog -p 2368:2368 us.gcr.io/my_project_id/my-ghost-blog

I can also create a pod correctly using the following method from another tutorial:

kubectl run ghost --image=us.gcr.io/my_project_id/my-ghost-blog --port=2368

When I do that, I can curl the blog on its internal IP from inside the cluster, and I get the following output from kubectl get pod:

Name:       ghosty-nqgt0
Namespace:      default
Image(s):     us.gcr.io/my_project_id/my-ghost-blog
Node:       very-long-node-name/10.240.51.18
Labels:       run=ghost
Status:       Running
Reason:
Message:
IP:       10.216.0.9
Replication Controllers:  ghost (1/1 replicas created)
Containers:
  ghosty:
    Image:  us.gcr.io/my_project_id/my-ghost-blog
    Limits:
      cpu:    100m
    State:    Running
      Started:    Fri, 04 Sep 2015 12:18:44 -0400
    Ready:    True
    Restart Count:  0
Conditions:
  Type    Status
  Ready   True
Events:
  ...

The problem arises when I try to create the pod from a yaml file, as in the WordPress tutorial. Here is the yaml:

metadata:
  name: ghost
  labels:
    name: ghost
spec:
  containers:
    - image: us.gcr.io/my_project_id/my-ghost-blog
      name: ghost
      env:
        - name: NODE_ENV
          value: production
        - name: VIRTUAL_HOST
          value: myghostblog.com
      ports:
        - containerPort: 2368

When I run kubectl create -f ghost.yaml, the pod is created but never becomes ready:

> kubectl get pod ghost
NAME      READY     STATUS    RESTARTS   AGE
ghost     0/1       Running   11         3m

The pod restarts continually, as the output of kubectl describe pod ghost confirms:

Name:       ghost
Namespace:      default
Image(s):     us.gcr.io/my_project_id/my-ghost-blog
Node:       very-long-node-name/10.240.51.18
Labels:       name=ghost
Status:       Running
Reason:
Message:
IP:       10.216.0.12
Replication Controllers:  <none>
Containers:
  ghost:
    Image:  us.gcr.io/my_project_id/my-ghost-blog
    Limits:
      cpu:    100m
    State:    Running
      Started:    Fri, 04 Sep 2015 14:08:20 -0400
    Ready:    False
    Restart Count:  10
Conditions:
  Type    Status
  Ready   False
Events:
  FirstSeen       LastSeen      Count From              SubobjectPath       Reason    Message
  Fri, 04 Sep 2015 14:03:20 -0400 Fri, 04 Sep 2015 14:03:20 -0400 1 {scheduler }                      scheduled Successfully assigned ghost to very-long-node-name
  Fri, 04 Sep 2015 14:03:27 -0400 Fri, 04 Sep 2015 14:03:27 -0400 1 {kubelet very-long-node-name} implicitly required container POD created   Created with docker id dbbc27b4d280
  Fri, 04 Sep 2015 14:03:27 -0400 Fri, 04 Sep 2015 14:03:27 -0400 1 {kubelet very-long-node-name} implicitly required container POD started   Started with docker id dbbc27b4d280
  Fri, 04 Sep 2015 14:03:27 -0400 Fri, 04 Sep 2015 14:03:27 -0400 1 {kubelet very-long-node-name} spec.containers{ghost}      created   Created with docker id ceb14ba72929
  Fri, 04 Sep 2015 14:03:27 -0400 Fri, 04 Sep 2015 14:03:27 -0400 1 {kubelet very-long-node-name} spec.containers{ghost}      started   Started with docker id ceb14ba72929
  Fri, 04 Sep 2015 14:03:27 -0400 Fri, 04 Sep 2015 14:03:27 -0400 1 {kubelet very-long-node-name} implicitly required container POD pulled    Pod container image "gcr.io/google_containers/pause:0.8.0" already present on machine
  Fri, 04 Sep 2015 14:03:30 -0400 Fri, 04 Sep 2015 14:03:30 -0400 1 {kubelet very-long-node-name} spec.containers{ghost}      started   Started with docker id 0b8957fe9b61
  Fri, 04 Sep 2015 14:03:30 -0400 Fri, 04 Sep 2015 14:03:30 -0400 1 {kubelet very-long-node-name} spec.containers{ghost}      created   Created with docker id 0b8957fe9b61
  Fri, 04 Sep 2015 14:03:40 -0400 Fri, 04 Sep 2015 14:03:40 -0400 1 {kubelet very-long-node-name} spec.containers{ghost}      created   Created with docker id edaf0df38c01
  Fri, 04 Sep 2015 14:03:40 -0400 Fri, 04 Sep 2015 14:03:40 -0400 1 {kubelet very-long-node-name} spec.containers{ghost}      started   Started with docker id edaf0df38c01
  Fri, 04 Sep 2015 14:03:50 -0400 Fri, 04 Sep 2015 14:03:50 -0400 1 {kubelet very-long-node-name} spec.containers{ghost}      started   Started with docker id d33f5e5a9637
...

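If I don't kill the pod, this create/start loop goes on forever. The only difference from the successful pod is the absence of a replication controller. I don't think that is the problem, because the tutorial doesn't mention anything about an rc.

Why is this happening? How can I create a working pod from a config file? And where can I find more detailed logs about what is going on?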

Tags: kubernetes google-kubernetes-engine

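【Solution 1】:

If the same Docker image works via kubectl run but not in the pod, then something is wrong with the pod spec. Compare the full output of the pod created from the spec with that of the pod created by the rc by running kubectl get pods <name> -o yaml on each, and see what differs between the two. A shot in the dark: could the environment variables specified in the pod spec be causing it to crash on startup?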
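A minimal sketch of that comparison, assuming the pod names from the question (ghost for the pod created from the yaml file, ghosty-nqgt0 for the one created by kubectl run). The helper names compare_pods and run_with_pod_env are made up for illustration and are not part of kubectl or docker:

```shell
# Hypothetical helper: dump both pods as YAML and diff the results.
# Usage: compare_pods <pod-from-spec> <pod-from-rc>
compare_pods() {
  spec_dump="$(mktemp)"
  rc_dump="$(mktemp)"
  kubectl get pods "$1" -o yaml > "$spec_dump"
  kubectl get pods "$2" -o yaml > "$rc_dump"
  # diff exits non-zero when the files differ, which is the expected case here.
  diff "$spec_dump" "$rc_dump"
  diff_status=$?
  rm -f "$spec_dump" "$rc_dump"
  return "$diff_status"
}

# Hypothetical helper: run the image by hand with the same environment
# variables as the pod spec, to see whether they make the container exit.
run_with_pod_env() {
  docker run --rm -p 2368:2368 \
    -e NODE_ENV=production \
    -e VIRTUAL_HOST=myghostblog.com \
    us.gcr.io/my_project_id/my-ghost-blog
}
```

Calling compare_pods ghost ghosty-nqgt0 will always show differences in names, IPs, and timestamps; the lines worth reading are the ones under spec, such as env. If run_with_pod_env exits right away while the plain docker run from the question stays up, the environment variables are a likely suspect.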

【Solution 2】:

Maybe you could use a different restart policy in the yaml file?

What you have is, I believe, equivalent to

restartPolicy: Never

with no replication controller. You could try adding this line to the yaml and setting it to Always (which will give you an RC) or OnFailure.

https://github.com/kubernetes/kubernetes/blob/master/docs/user-guide/pod-states.md#restartpolicy
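For reference, here is a sketch of where such a line would sit in the manifest. The apiVersion and kind lines are assumptions, since the yaml in the question does not show them, and write_ghost_manifest is a made-up helper name. Note that the Kubernetes API reference gives Always as the default when restartPolicy is omitted, so treat this as an experiment rather than a confirmed fix:

```shell
# Hypothetical helper: write a complete pod manifest with an explicit restartPolicy.
# Usage: write_ghost_manifest <file> <Always|OnFailure|Never>
write_ghost_manifest() {
  cat > "$1" <<EOF
apiVersion: v1        # assumption: not shown in the question's yaml
kind: Pod             # assumption: not shown in the question's yaml
metadata:
  name: ghost
  labels:
    name: ghost
spec:
  restartPolicy: $2   # a field of spec, sibling of containers, not a list item
  containers:
    - image: us.gcr.io/my_project_id/my-ghost-blog
      name: ghost
      env:
        - name: NODE_ENV
          value: production
        - name: VIRTUAL_HOST
          value: myghostblog.com
      ports:
        - containerPort: 2368
EOF
}
```

For example, write_ghost_manifest ghost.yaml OnFailure followed by kubectl create -f ghost.yaml would create the pod with that policy set explicitly.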

【Solution 3】:

The container logs may be useful; you can get them with kubectl logs.

Usage:

kubectl logs [-p] POD [-c CONTAINER]

http://kubernetes.io/v1.0/docs/user-guide/kubectl/kubectl_logs.html
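A small sketch of how that could be applied to the pod in the question. The -p flag in the usage line asks for the logs of the previous container instance, which is usually the interesting one when a container keeps restarting. The helper name crash_logs is made up for illustration:

```shell
# Hypothetical helper: print logs from the current and the previous container instance.
# Usage: crash_logs <pod> [container]
crash_logs() {
  pod="$1"
  container="${2:-}"
  if [ -n "$container" ]; then
    kubectl logs "$pod" -c "$container"
    kubectl logs -p "$pod" -c "$container"
  else
    kubectl logs "$pod"
    kubectl logs -p "$pod"
  fi
}
```

For the pod in the question this would be crash_logs ghost, or crash_logs ghost ghost to name the container explicitly.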
