minikube cluster pods are unhealthy and restarting because of connection refused error - Answers

【Question title】: minikube cluster pods are unhealthy and restarting because of connection refused error
【Posted】: 2021-09-08 04:35:07
【Question description】:

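I installed a minikube cluster on a CentOS 7 virtual machine. After installation, a few pods keep restarting because they are reported as unhealthy. All the failing pods show similar reasons: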

Errors from the pods (from the describe command):

Readiness probe failed: Get "http://172.17.0.4:6789/readyz": dial tcp 172.17.0.4:6789: connect: connection refused

Liveness probe failed: Get "http://172.17.0.4:6789/healthz": dial tcp 172.17.0.4:6789: connect: connection refused

Liveness probe failed: HTTP probe failed with statuscode: 500
Readiness probe failed: HTTP probe failed with statuscode: 500

Liveness probe failed: HTTP probe failed with statuscode: 503

Because of these errors, the applications in the pods are not working properly. I am new to Kubernetes and do not know how to debug this error.

UPDATE: please see the error log of the pod (che-operator) below:

time="2021-09-09T15:37:13Z" level=info msg="Deployment plugin-registry is in the rolling update state."
I0909 15:37:15.651964 1 request.go:655] Throttling request took 1.04725976s, request: GET:https://10.96.0.1:443/apis/extensions/v1beta1?timeout=32s
time="2021-09-09T15:37:16Z" level=info msg="Deployment plugin-registry is in the rolling update state."
I0909 15:37:37.710602 1 request.go:655] Throttling request took 1.046990691s, request: GET:https://10.96.0.1:443/apis/apiextensions.k8s.io/v1?timeout=32s
W0909 15:43:23.172829 1 warnings.go:70] extensions/v1beta1 Ingress is deprecated in v1.14+, unavailable in v1.22+; use networking.k8s.io/v1 Ingress
E0909 15:47:05.403189 1 leaderelection.go:325] error retrieving resource lock eclipse-che/e79b08a4.org.eclipse.che: Get "https://10.96.0.1:443/api/v1/namespaces/eclipse-che/configmaps/e79b08a4.org.eclipse.che": context deadline exceeded
I0909 15:47:05.403334 1 leaderelection.go:278] failed to renew lease eclipse-che/e79b08a4.org.eclipse.che: timed out waiting for the condition
{"level":"info","ts":1631202425.4036877,"logger":"controller","msg":"Stopping workers","reconcilerGroup":"org.eclipse.che","reconcilerKind":"CheCluster","controller":"checluster"}
{"level":"error","ts":1631202425.4034257,"logger":"setup","msg":"problem running manager","error":"leader election lost","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/che-operator/vendor/github.com/go-logr/zapr/zapr.go:132\nmain.main\n\t/che-operator/main.go:254\nruntime.main\n\t/usr/lib/golang/src/runtime/proc.go:204"}

【Comments】:

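Can you provide a way to reproduce this error? Ideally, show the exact steps you have taken so far. How exactly did you install Minikube? Before trying to install eclipse-che, did you try running anything else on it, such as a simple nginx pod?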
mkdir kube
cd kube
curl -LO "dl.k8s.io/release/$(curl -L -s dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
curl -LO "dl.k8s.io/$(curl -L -s dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl.sha256"
curl -LO storage.googleapis.com/minikube/releases/latest/…
sudo install minikube-linux-amd64 /usr/local/bin/minikube
chmod u+x kubectl
minikube start --addons=ingress --vm=true --memory=8192 --driver=none
minikube kubectl -- get po -A
chectl server:deploy --platform minikube
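Do you know how to disable leader election on the operator pod?
Unfortunately I can't help with that. Is it related to the main problem? Or have you perhaps managed to solve it already?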

Tags: kubernetes minikube kubernetes-pod

【Solution 1】:

Your readiness and liveness probes are failing with errors such as:

Readiness probe failed: Get "http://172.17.0.4:6789/readyz"

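Readiness and liveness probes are used to check the state of the application inside the pod. Kubernetes keeps checking the application on the configured endpoint. If the application stops responding to the liveness probe, Kubernetes automatically restarts the container; while the readiness probe is failing, the pod is marked as not ready and receives no traffic from Services.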
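To make the probe messages in the question easier to read, here is a minimal Python sketch of what a single HTTP probe check roughly does. It is an illustration only, not the kubelet's actual implementation; the function name `http_probe` and the message strings are assumptions chosen to resemble the output above. The one rule taken from the Kubernetes documentation is that a status code of at least 200 and below 400 counts as success.

```python
import urllib.error
import urllib.request
from typing import Tuple


def http_probe(url: str, timeout: float = 1.0) -> Tuple[bool, str]:
    """Run one HTTP GET check and return (healthy, detail).

    Hypothetical helper for illustration; it only mimics the success
    rule of an httpGet probe (200 <= status < 400).
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
    except urllib.error.HTTPError as err:
        # The application answered, but with a 4xx/5xx status code.
        status = err.code
    except (urllib.error.URLError, OSError) as err:
        # No HTTP answer at all, e.g. nothing is listening on the port
        # ("connection refused") or the request timed out.
        reason = getattr(err, "reason", err)
        return False, f"connection failed: {reason}"

    if 200 <= status < 400:
        return True, f"statuscode: {status}"
    return False, f"HTTP probe failed with statuscode: {status}"
```

Read this way, the two kinds of message in the question point in different directions. "connection refused" means no process accepted the TCP connection on 172.17.0.4:6789, so the application has probably not started yet, has crashed, or is listening on a different port. "statuscode: 500" or "503" means the application is running and answering, but reporting itself as unhealthy.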

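In this case, I would suggest checking the state of the application running inside the pod. Your pods are failing because the application is not answering successfully on the /readyz and /healthz endpoints.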

Read more about readiness and liveness probes at https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/

You can also check the application logs using:

kubectl logs <POD name>
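Since the pods in the question keep restarting, the logs of the previous container instance and the pod events are often more telling than the current logs. The sketch below collects a few standard kubectl commands for that in a small Python helper. The function names `debug_commands` and `run_all` are hypothetical, the pod name is a placeholder you have to fill in, and the `eclipse-che` namespace is taken from the operator log in the question.

```python
import shlex
import shutil
import subprocess
from typing import List


def debug_commands(pod: str, namespace: str) -> List[List[str]]:
    """Return kubectl invocations that are useful for a restarting pod."""
    return [
        # Probe configuration, restart count and recent events of the pod.
        ["kubectl", "describe", "pod", pod, "-n", namespace],
        # Logs of the container that is running right now.
        ["kubectl", "logs", pod, "-n", namespace],
        # Logs of the previous container instance, i.e. the one that was restarted.
        ["kubectl", "logs", pod, "-n", namespace, "--previous"],
        # All events in the namespace, oldest first.
        ["kubectl", "get", "events", "-n", namespace,
         "--sort-by=.metadata.creationTimestamp"],
    ]


def run_all(pod: str, namespace: str) -> None:
    """Print each command and run it if kubectl is installed."""
    have_kubectl = shutil.which("kubectl") is not None
    for argv in debug_commands(pod, namespace):
        print("$", shlex.join(argv))
        if have_kubectl:
            # check=False: keep going even if one command fails.
            subprocess.run(argv, check=False)
```

For example, `run_all("<POD name>", "eclipse-che")` prints and runs the four commands for one pod. On a minikube setup like the one in the question, where kubectl is invoked as `minikube kubectl --`, the same arguments can be passed after the `--`.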

【Comments】:

Do you know how to disable leader election on the operator pod?