【Posted】: 2019-07-08 20:06:02
【Question】:
I am working through the coarse parallel processing Kubernetes tutorial at https://kubernetes.io/docs/tasks/job/coarse-parallel-processing-work-queue/#before-you-begin. I set up my cluster with Rancher on AWS, using EC2 instances. When I run
kubectl apply -f ./job.yaml
kubectl describe jobs/job-wq-1
I get the following output:
Name:           job-wq-1
Namespace:      default
Selector:       controller-uid=5f9e1780-a1b9-11e9-a6b7-026525d9a49a
Labels:         controller-uid=5f9e1780-a1b9-11e9-a6b7-026525d9a49a
                job-name=job-wq-1
Annotations:    kubectl.kubernetes.io/last-applied-configuration:
                  {"apiVersion":"batch/v1","kind":"Job","metadata":{"annotations":{},"name":"job-wq-1","namespace":"default"},"spec":{"completions":8,"paral...
Parallelism:    2
Completions:    8
Start Time:     Mon, 08 Jul 2019 15:48:35 -0400
Pods Statuses:  0 Running / 0 Succeeded / 2 Failed
Pod Template:
  Labels:  controller-uid=5f9e1780-a1b9-11e9-a6b7-026525d9a49a
           job-name=job-wq-1
  Containers:
   c:
    Image:      mgladden/job-wq-1
    Port:       <none>
    Host Port:  <none>
    Environment:
      BROKER_URL:  amqp://guest:guest@rabbitmq-service:5672
      QUEUE:       job1
    Mounts:        <none>
  Volumes:         <none>
Events:
  Type    Reason            Age    From            Message
  ----    ------            ----   ----            -------
  Normal  SuccessfulCreate  10m    job-controller  Created pod: job-wq-1-z8kn6
  Normal  SuccessfulCreate  10m    job-controller  Created pod: job-wq-1-lqcfs
  Normal  SuccessfulDelete  9m35s  job-controller  Deleted pod: job-wq-1-z8kn6
  Normal  SuccessfulDelete  9m35s  job-controller  Deleted pod: job-wq-1-lqcfs
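At this point I am not sure how to troubleshoot. None of the pods appear to succeed: the job reports 0 succeeded and 2 failed. Could this be caused by my Rancher setup? I did notice that the annotations are blank in the tutorial's output, whereas the output from my job has one.
【Comments】: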
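- Hello, check the pod's logs with kubectl logs -f $pod, and check the status section of the pod with kubectl get pod $POD_NAME -o yaml
- Have you verified that the rabbitmq-service DNS service name can be resolved with nslookup inside the Rancher cluster?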
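
The two checks suggested in the comments can be sketched as a small Python script. This is only an illustration under stated assumptions, not part of the tutorial: the file name check_job.py and the helper names summarize_pods, broker_endpoint and can_resolve are hypothetical. The label job-name=job-wq-1 and the BROKER_URL value are taken from the describe output above. The "pods" mode assumes kubectl is on the PATH of your workstation. The "dns" mode is only meaningful when run inside a pod in the cluster, because rabbitmq-service is a cluster-internal name. Since the events show the job controller deleting both pods, the pod listing may well come back empty.

```python
import json
import socket
import subprocess
import sys
from typing import List, Optional, Tuple
from urllib.parse import urlsplit

# Values copied from the `kubectl describe` output in the question.
JOB_SELECTOR = "job-name=job-wq-1"
BROKER_URL = "amqp://guest:guest@rabbitmq-service:5672"


def summarize_pods(pod_list: dict) -> List[Tuple[str, str, str]]:
    """Turn `kubectl get pods -o json` output into (name, phase, detail) rows."""
    rows = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        details = []
        for cs in status.get("containerStatuses", []):
            state = cs.get("state", {})
            if "terminated" in state:
                t = state["terminated"]
                details.append(
                    f"{cs['name']}: terminated, exit code {t.get('exitCode')}, "
                    f"reason {t.get('reason')}"
                )
            elif "waiting" in state:
                details.append(
                    f"{cs['name']}: waiting, reason {state['waiting'].get('reason')}"
                )
        rows.append(
            (pod["metadata"]["name"], status.get("phase", "Unknown"), "; ".join(details))
        )
    return rows


def broker_endpoint(broker_url: str) -> Tuple[Optional[str], int]:
    """Extract host and port from an AMQP URL (5672 is the default AMQP port)."""
    parts = urlsplit(broker_url)
    return parts.hostname, parts.port or 5672


def can_resolve(host: str, port: int) -> bool:
    """Return True if the name resolves from wherever this script is running."""
    try:
        socket.getaddrinfo(host, port)
        return True
    except socket.gaierror:
        return False


if __name__ == "__main__":
    mode = sys.argv[1] if len(sys.argv) > 1 else ""
    if mode == "pods":
        # Run on the workstation: list the pods that belong to the job.
        raw = subprocess.run(
            ["kubectl", "get", "pods", f"--selector={JOB_SELECTOR}", "-o", "json"],
            capture_output=True, text=True, check=True,
        ).stdout
        for row in summarize_pods(json.loads(raw)):
            print(*row, sep="\t")
    elif mode == "dns":
        # Run inside a pod in the cluster: the service name is cluster-internal.
        host, port = broker_endpoint(BROKER_URL)
        verdict = "resolves" if can_resolve(host, port) else "does NOT resolve"
        print(f"{host}:{port} {verdict}")
    else:
        print("usage: check_job.py pods|dns")
```

A terminated container with a non-zero exit code points at the worker itself, and kubectl logs on that pod would be the next step. A name that does not resolve points at the RabbitMQ service or at cluster DNS instead.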
Tags: kubernetes rancher