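A Job should be used for working through a work queue. When using a work queue you should not set .spec.completions (or set it to null). In that case Pods will keep getting created until one of the Pods exits successfully. It is a little awkward to exit from the (main) container with a failure state on purpose, but this is the specification. You may set .spec.parallelism to your liking irrespective of this setting; I've set it to 1 as it appears you do not want any parallelism.
In your question you did not specify what you want to do if the work queue becomes empty, so I will give two solutions: one if you want to wait for new items (infinite), and one if you want to end the Job when the work queue becomes empty (finite, but with an indefinite number of items).
Both examples use redis, but you can apply this pattern to your favorite queue. Note that the part that pops an item from the queue is not safe; if your Pod dies for some reason after having popped an item, that item will remain unprocessed or only partially processed. See the reliable-queue pattern for a proper solution.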
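For orientation, here is a minimal sketch of the reliable-queue idea with redis-cli. It is not what the manifests in this answer do. It assumes Redis 6.2 or later (for LMOVE) and the redis service host used in this answer; the second list name job:processing and the function names claim_item and ack_item are hypothetical.

```shell
# Sketch only: the list "job:processing" and both function names are
# hypothetical; LMOVE requires Redis 6.2 or later.
REDIS_HOST="${REDIS_HOST:-redis}"

# Atomically move the oldest item from "job" to "job:processing" and print it.
# If the worker dies after this point, the item is still stored in Redis.
claim_item() {
  ${REDIS_CLI:-redis-cli} -h "$REDIS_HOST" LMOVE job job:processing LEFT RIGHT
}

# Remove one copy of the item from "job:processing" after all steps succeeded.
ack_item() {
  ${REDIS_CLI:-redis-cli} -h "$REDIS_HOST" LREM job:processing 1 "$1"
}

# Usage (needs a reachable redis):
#   item=$(claim_item)
#   ... run the steps on "$item" ...
#   ack_item "$item"
```

Something would still have to move stale entries from job:processing back to job after a crash; that part is left out here.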
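To implement the sequential steps on each work item I've used init containers. Note that this really is a primitive solution, but your options are limited if you don't want to use a framework to implement a proper pipeline.
There is an asciinema if anyone would like to see this at work without deploying redis, etc.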
Redis
To test this you'll need to create, at a minimum, a redis Pod and a Service. I am using the example from fine parallel processing work queue. You can deploy it with:
kubectl apply -f https://rawgit.com/kubernetes/website/master/docs/tasks/job/fine-parallel-processing-work-queue/redis-pod.yaml
kubectl apply -f https://rawgit.com/kubernetes/website/master/docs/tasks/job/fine-parallel-processing-work-queue/redis-service.yaml
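The rest of this solution expects a Service named redis in the same namespace as your Job that does not require authentication, and a Pod named redis-master.
Inserting items
To insert some items into the work queue use this command (you will need bash for this to work):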
echo -ne "rpush job "{1..10}"\n" | kubectl exec -it redis-master -- redis-cli
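If bash brace expansion is not available, a plain sh loop can generate the same RPUSH commands. This is only a sketch: gen_rpush is a hypothetical helper name, and the usage lines assume the redis-master Pod described above.

```shell
# Print one "rpush job N" command per line for N in [first, last].
# "gen_rpush" is a hypothetical helper, not part of redis or kubectl.
gen_rpush() {
  first=$1
  last=$2
  i=$first
  while [ "$i" -le "$last" ]; do
    printf 'rpush job %s\n' "$i"
    i=$((i + 1))
  done
}

# Usage (needs a running cluster):
#   gen_rpush 1 10 | kubectl exec -i redis-master -- redis-cli
#   kubectl exec redis-master -- redis-cli llen job
```

Running gen_rpush on its own first is a cheap way to inspect what would be sent to redis-cli.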
Infinite version
This version waits if the queue is empty, so it will never complete.
apiVersion: batch/v1
kind: Job
metadata:
  name: primitive-pipeline-infinite
spec:
  parallelism: 1
  completions: null
  template:
    metadata:
      name: primitive-pipeline-infinite
    spec:
      volumes: [{name: shared, emptyDir: {}}]
      initContainers:
      - name: pop-from-queue-unsafe
        image: redis
        command: ["sh","-c","redis-cli -h redis blpop job 0 >/shared/item.txt"]
        volumeMounts: [{name: shared, mountPath: /shared}]
      - name: step-1
        image: busybox
        command: ["sh","-c","echo step-1 working on `cat /shared/item.txt` ...; sleep 5"]
        volumeMounts: [{name: shared, mountPath: /shared}]
      - name: step-2
        image: busybox
        command: ["sh","-c","echo step-2 working on `cat /shared/item.txt` ...; sleep 5"]
        volumeMounts: [{name: shared, mountPath: /shared}]
      - name: step-3
        image: busybox
        command: ["sh","-c","echo step-3 working on `cat /shared/item.txt` ...; sleep 5"]
        volumeMounts: [{name: shared, mountPath: /shared}]
      containers:
      - name: done
        image: busybox
        command: ["sh","-c","echo all done with `cat /shared/item.txt`; sleep 1; exit 1"]
        volumeMounts: [{name: shared, mountPath: /shared}]
      restartPolicy: Never
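To watch the steps run in order, the logs of each container of a pipeline Pod can be printed one after another. A rough sketch: pipeline_logs is a hypothetical helper, the manifest file name in the usage lines is an assumption, and the container names are taken from the manifest above.

```shell
# Print the logs of every container of one pipeline Pod, in execution order.
# "pipeline_logs" is a hypothetical helper; it only wraps "kubectl logs".
pipeline_logs() {
  pod=$1
  for c in pop-from-queue-unsafe step-1 step-2 step-3 done; do
    echo "== $c"
    ${KUBECTL:-kubectl} logs "$pod" -c "$c"
  done
}

# Usage (needs a running cluster; the file name is an assumption):
#   kubectl apply -f primitive-pipeline-infinite.yaml
#   kubectl get pods -l job-name=primitive-pipeline-infinite
#   pipeline_logs <pod-name>
```

Because every processed item ends in a deliberately failed Pod, it may be worth checking how the Job's .spec.backoffLimit (default 6) interacts with this on your cluster.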
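Finite version
This version stops the Job if the queue is empty. Note the trick: the pop init container checks whether the queue is empty, and if it is, all subsequent init containers and the main container exit immediately with success. This is the mechanism that signals to Kubernetes that the Job is complete and that no new Pods need to be created for it.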
apiVersion: batch/v1
kind: Job
metadata:
  name: primitive-pipeline-finite
spec:
  parallelism: 1
  completions: null
  template:
    metadata:
      name: primitive-pipeline-finite
    spec:
      volumes: [{name: shared, emptyDir: {}}]
      initContainers:
      - name: pop-from-queue-unsafe
        image: redis
        command: ["sh","-c","redis-cli -h redis lpop job >/shared/item.txt; grep -q . /shared/item.txt || :>/shared/done.txt"]
        volumeMounts: [{name: shared, mountPath: /shared}]
      - name: step-1
        image: busybox
        command: ["sh","-c","[ -f /shared/done.txt ] && exit 0; echo step-1 working on `cat /shared/item.txt` ...; sleep 5"]
        volumeMounts: [{name: shared, mountPath: /shared}]
      - name: step-2
        image: busybox
        command: ["sh","-c","[ -f /shared/done.txt ] && exit 0; echo step-2 working on `cat /shared/item.txt` ...; sleep 5"]
        volumeMounts: [{name: shared, mountPath: /shared}]
      - name: step-3
        image: busybox
        command: ["sh","-c","[ -f /shared/done.txt ] && exit 0; echo step-3 working on `cat /shared/item.txt` ...; sleep 5"]
        volumeMounts: [{name: shared, mountPath: /shared}]
      containers:
      - name: done
        image: busybox
        command: ["sh","-c","[ -f /shared/done.txt ] && exit 0; echo all done with `cat /shared/item.txt`; sleep 1; exit 1"]
        volumeMounts: [{name: shared, mountPath: /shared}]
      restartPolicy: Never
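The done.txt marker logic can be tried locally without a cluster. The sketch below mirrors the shell snippets from the finite manifest, with a plain directory standing in for the /shared volume. The function names mark_if_empty and main_container are hypothetical, and it assumes that lpop on an empty list leaves only a blank line in item.txt, which is what the grep check relies on.

```shell
# Same check as the pop container: create done.txt when item.txt holds no
# non-empty line. "mark_if_empty" is a hypothetical name; $1 replaces /shared.
mark_if_empty() {
  dir=$1
  grep -q . "$dir/item.txt" || : >"$dir/done.txt"
}

# Same guard as the main container, returning instead of exiting.
main_container() {
  dir=$1
  # Queue was empty: succeed, so the Job is considered complete.
  [ -f "$dir/done.txt" ] && return 0
  echo "all done with $(cat "$dir/item.txt")"
  # Item processed: fail on purpose, so the Job creates another Pod.
  return 1
}

# Usage:
#   d=$(mktemp -d); echo 5 >"$d/item.txt"
#   mark_if_empty "$d"; main_container "$d"; echo "exit code: $?"
```

The two return codes are the whole signalling protocol: 1 asks the Job for another Pod, 0 ends it.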