【Posted】: 2019-08-01 21:46:57
【Question】:
I have a managed Kubernetes setup in the Open Telekom Cloud, called Cloud Container Engine (CCE). Their documentation is available online.
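My CCE has one master node and three nodes, running k8s version 1.9.2 (more details below). I can access the CCE via kubectl and deploy new pods on it.
The CCE comes with a pre-installed deployment of heapster. However, checking the resource usage of nodes fails, and I observe the same failure for pod usage: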
$ kubectl top pods
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get services http:heapster:)
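I have tried every debugging step I could think of (see below), but I am still lost on how to fix this. Any suggestions?
The deployment, pod, and service objects for heapster all exist (output filtered to show only heapster):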
$ kubectl get po -n kube-system
NAME READY STATUS RESTARTS AGE
heapster-apiserver-84b844ffcf-lzh4b 1/1 Running 0 47m
$ kubectl get svc -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
heapster ClusterIP 10.247.150.244 <none> 80/TCP 19d
$ kubectl get deploy -n kube-system
NAME READY UP-TO-DATE AVAILABLE AGE
heapster-apiserver 1/1 1 1 19d
To check whether heapster is actually collecting metrics correctly, I ssh'd into one of the nodes and ran:
$ curl -k http://10.247.150.244:80/api/v1/model/metrics/
[
"cpu/usage_rate",
"memory/usage",
"cpu/request",
"cpu/limit",
"memory/request",
"memory/limit"
]
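The curl above talks to the service's ClusterIP directly, whereas the error message names the resource `http:heapster:`, which suggests that kubectl reaches heapster through the API server's service proxy instead. The following is a minimal sketch for reproducing that route by hand, assuming the standard Kubernetes service-proxy URL layout (`/api/v1/namespaces/<ns>/services/<scheme>:<name>:<port>/proxy/<path>`). The helper name `heapster_proxy_path` is hypothetical and only builds the path; the scheme `http`, service name `heapster`, and empty port are taken from the error message.

```shell
#!/bin/sh
# Build the API server service-proxy path for the heapster service.
# $1: namespace of the service (kube-system in this cluster)
# $2: path on the heapster side, starting with a slash
# Assumption: the API server uses the standard service-proxy URL layout.
heapster_proxy_path() {
  ns="$1"
  path="$2"
  printf '/api/v1/namespaces/%s/services/http:heapster:/proxy%s\n' "$ns" "$path"
}

# Usage (needs cluster access, so it is left commented out here):
#   kubectl get --raw "$(heapster_proxy_path kube-system /api/v1/model/metrics/)"
```

If this request fails with the same ServiceUnavailable error while the direct curl succeeds, the problem would lie between the API server and the service rather than in metric collection itself.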
Pod log output
Finally, I checked the log output of the heapster-apiserver-84b844ffcf-lzh4b pod:
$ kubectl logs -n kube-system heapster-apiserver-84b844ffcf-lzh4b
I0311 13:38:18.334525 1 heapster.go:78] /heapster --source=kubernetes.summary_api:''?kubeletHttps=true&inClusterConfig=false&insecure=true&auth=/srv/config --api-server --secure-port=6443
I0311 13:38:18.334718 1 heapster.go:79] Heapster version v1.5.3
I0311 13:38:18.340912 1 configs.go:61] Using Kubernetes client with master "https://192.168.1.228:5443" and version <nil>
I0311 13:38:18.340996 1 configs.go:62] Using kubelet port 10255
I0311 13:38:18.358918 1 heapster.go:202] Starting with Metric Sink
I0311 13:38:18.510751 1 serving.go:327] Generated self-signed cert (/var/run/kubernetes/apiserver.crt, /var/run/kubernetes/apiserver.key)
E0311 13:38:18.540860 1 heapster.go:128] Could not create the API server: missing clientCA file
I0311 13:38:18.558944 1 heapster.go:112] Starting heapster on port 8082
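Heapster logs in the glog format, where the first character of each line is the severity (I, W, E, or F) followed by the month and day. In longer logs, a small filter can pull out only the error and fatal lines; this is a sketch, and the function name `glog_errors` is hypothetical.

```shell
#!/bin/sh
# Keep only glog lines with severity E (error) or F (fatal).
# Reads log lines on stdin and writes matching lines to stdout.
glog_errors() {
  grep -E '^[EF][0-9]{4} '
}

# Usage (needs cluster access, so it is left commented out here):
#   kubectl logs -n kube-system heapster-apiserver-84b844ffcf-lzh4b | glog_errors
```

Applied to the log above, this would leave only the `missing clientCA file` line.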
Cluster information
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.3", GitCommit:"721bfa751924da8d1680787490c54b9179b1fed0", GitTreeState:"clean", BuildDate:"2019-02-01T20:08:12Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"9+", GitVersion:"v1.9.2-CCE2.0.7-B003", GitCommit:"302f471a1e2caa114c9bb708c077fbb363aa2f13", GitTreeState:"clean", BuildDate:"2018-06-20T03:27:16Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}
$ kubectl get nodes
192.168.1.163 Ready worker 19d v1.9.2-CCE2.0.7-B003
192.168.1.211 Ready nfs-server 19d v1.9.2-CCE2.0.7-B003
192.168.1.227 Ready worker 19d v1.9.2-CCE2.0.7-B003
All nodes run EulerOS_2.0_SP2 with kernel version 3.10.0-327.59.59.46.h38.x86_64.
【Discussion】:
Tags: kubernetes