【Question title】: Kubernetes Metrics Server not able to collect metrics
【Posted】: 2021-04-01 21:41:38
【Question】:

I have a test-environment cluster with one master node and two worker nodes. All the basic pods are up and running.

root@master:~/pre-release# kubectl get pods -n kube-system
NAME                              READY   STATUS    RESTARTS   AGE
coredns-74ff55c5b-jn4pl           1/1     Running   0          23h
coredns-74ff55c5b-lz5pq           1/1     Running   0          23h
etcd-master                       1/1     Running   0          23h
kube-apiserver-master             1/1     Running   0          23h
kube-controller-manager-master    1/1     Running   0          23h
kube-flannel-ds-c7czv             1/1     Running   0          150m
kube-flannel-ds-kz74g             1/1     Running   0          150m
kube-flannel-ds-pb4f2             1/1     Running   0          150m
kube-proxy-dbmjn                  1/1     Running   0          23h
kube-proxy-kfrdd                  1/1     Running   0          23h
kube-proxy-wj4rk                  1/1     Running   0          23h
kube-scheduler-master             1/1     Running   0          23h
metrics-server-67fb68f54c-4hnt7   1/1     Running   0          9m

Next, I checked the metrics-server pod logs and did not see any error messages there either:

root@master:~/pre-release# kubectl -n kube-system logs -f metrics-server-67fb68f54c-4hnt7
I0330 09:53:15.286101       1 serving.go:325] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
I0330 09:53:15.767767       1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I0330 09:53:15.767790       1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I0330 09:53:15.767815       1 secure_serving.go:197] Serving securely on [::]:4443
I0330 09:53:15.767823       1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0330 09:53:15.767835       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0330 09:53:15.767857       1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0330 09:53:15.767865       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0330 09:53:15.767878       1 dynamic_serving_content.go:130] Starting serving-cert::/tmp/apiserver.crt::/tmp/apiserver.key
I0330 09:53:15.767897       1 tlsconfig.go:240] Starting DynamicServingCertificateController
I0330 09:53:15.867954       1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController
I0330 09:53:15.868014       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0330 09:53:15.868088       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file

Then I checked the metrics APIService:

root@master:~/pre-release# kubectl describe apiservice v1beta1.metrics.k8s.io
Name:         v1beta1.metrics.k8s.io
Namespace:
Labels:       k8s-app=metrics-server
Annotations:  <none>
API Version:  apiregistration.k8s.io/v1
Kind:         APIService
Metadata:
  Creation Timestamp:  2021-03-30T09:53:13Z
  Resource Version:    126838
  UID:                 6da11b3f-87d5-4de4-92a0-463219b23301
Spec:
  Group:                     metrics.k8s.io
  Group Priority Minimum:    100
  Insecure Skip TLS Verify:  true
  Service:
    Name:            metrics-server
    Namespace:       kube-system
    Port:            443
  Version:           v1beta1
  Version Priority:  100
Status:
  Conditions:
    Last Transition Time:  2021-03-30T09:53:13Z
    Message:               failing or missing response from https://10.108.112.196:443/apis/metrics.k8s.io/v1beta1: Get "https://10.108.112.196:443/apis/metrics.k8s.io/v1beta1": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
    Reason:                FailedDiscoveryCheck
    Status:                False
    Type:                  Available
Events:                    <none>

The Available condition ends up as False, with the error shown above.
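
The same check can be scripted instead of reading the full describe output. The following is a minimal shell sketch, not part of the original post, that reads only the Available condition of the APIService through kubectl's JSONPath output. The function names apiservice_available and metrics_api_ready are hypothetical helpers introduced here for illustration.

```shell
# Print the status ("True" or "False") of the Available condition
# for an APIService. Defaults to the metrics APIService from the question.
apiservice_available() {
  kubectl get apiservice "${1:-v1beta1.metrics.k8s.io}" \
    -o jsonpath='{.status.conditions[?(@.type=="Available")].status}'
}

# Return success only when the APIService reports Available=True.
# (Hypothetical helper; not something kubectl or metrics-server provides.)
metrics_api_ready() {
  [ "$(apiservice_available "$@")" = "True" ]
}
```

For instance, running metrics_api_ready && kubectl top nodes would query node metrics only once the condition reports True. In the state shown in the question, the check would fail because the status is False.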

Here is the deployment spec:

 spec:
      containers:
      - args:
        - /metrics-server
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP
        - --kubelet-use-node-status-port
        - --kubelet-insecure-tls

Output of kubectl top nodes:

root@master:~# kubectl top nodes
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)

I have not been able to find a solution since yesterday. Could you please help me with this?

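【Comments】:

  • Which version of Kubernetes are you running? Is it on bare metal or a managed Kubernetes service? How did you install metrics-server? Can you run the kubectl top nodes command?
  • The server version is v1.20.5, and the client version is the same.
  • Thanks for your reply. How did you install metrics-server?
  • I tried to reproduce your issue, but it works as expected for me. Could you add hostNetwork: true under spec.template.spec in the metrics-server deployment?
  • Great, sir, it is working now.

Tags: kubernetes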


【Solution 1】:

In this case, adding hostNetwork: true under spec.template.spec in the metrics-server deployment may help.

...
 spec:
      hostNetwork: true
      containers:
      - args:
        - /metrics-server
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP
        - --kubelet-use-node-status-port
        - --kubelet-insecure-tls
...
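
One way to apply this change without editing the manifest by hand is a JSON merge patch. The sketch below is not from the original answer. It assumes the deployment is named metrics-server in the kube-system namespace, which is inferred from the pod name in the question rather than confirmed by it. The function names are hypothetical.

```shell
# Add hostNetwork: true to the pod template with a JSON merge patch.
# The deployment name defaults to "metrics-server" (an assumption).
enable_host_network() {
  kubectl -n kube-system patch deployment "${1:-metrics-server}" --type merge \
    -p '{"spec":{"template":{"spec":{"hostNetwork":true}}}}'
}

# Wait until the patched deployment has finished rolling out.
wait_for_rollout() {
  kubectl -n kube-system rollout status "deployment/${1:-metrics-server}" --timeout=120s
}

# Print the current hostNetwork value from the pod template
# (prints "true" once the patch has been applied).
host_network_setting() {
  kubectl -n kube-system get deployment "${1:-metrics-server}" \
    -o jsonpath='{.spec.template.spec.hostNetwork}'
}
```

A possible sequence is enable_host_network, then wait_for_rollout, then kubectl top nodes to see whether metrics are now served.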

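The Kubernetes Host namespaces documentation says:

HostNetwork - Controls whether the pod may use the node network namespace. Doing so gives the pod access to the loopback device, services listening on localhost, and could be used to snoop on network activity of other pods on the same node.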

【Comments】:

  • Thank you very much for your support. You solved a week-long problem today.