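[Posted]: 2020-09-10 00:44:54
[Question]:
I know there are existing questions about this, and they usually point to https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#i-have-a-couple-of-nodes-with-low-utilization-but-they-are-not-scaled-down-why
but I still can't debug my case. Only 1 pod is running on my cluster, so I don't understand why it does not scale down to 1 node. How can I debug this further?
Here is some information: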
kubectl get nodes
NAME STATUS ROLES AGE VERSION
gke-qua-gke-foobar1234-default-pool-6302174e-4k84 Ready <none> 4h14m v1.14.10-gke.27
gke-qua-gke-foobar1234-default-pool-6302174e-6wfs Ready <none> 16d v1.14.10-gke.27
gke-qua-gke-foobar1234-default-pool-6302174e-74lm Ready <none> 4h13m v1.14.10-gke.27
gke-qua-gke-foobar1234-default-pool-6302174e-m223 Ready <none> 4h13m v1.14.10-gke.27
gke-qua-gke-foobar1234-default-pool-6302174e-srlg Ready <none> 66d v1.14.10-gke.27
kubectl get pods
NAME READY STATUS RESTARTS AGE
qua-gke-foobar1234-5959446675-njzh4 1/1 Running 0 14m
nodePools:
- autoscaling:
    enabled: true
    maxNodeCount: 10
    minNodeCount: 1
  config:
    diskSizeGb: 100
    diskType: pd-standard
    imageType: COS
    machineType: n1-highcpu-32
    metadata:
      disable-legacy-endpoints: 'true'
    oauthScopes:
    - https://www.googleapis.com/auth/datastore
    - https://www.googleapis.com/auth/devstorage.full_control
    - https://www.googleapis.com/auth/pubsub
    - https://www.googleapis.com/auth/logging.write
    - https://www.googleapis.com/auth/monitoring
    serviceAccount: default
    shieldedInstanceConfig:
      enableIntegrityMonitoring: true
  initialNodeCount: 1
  instanceGroupUrls:
  - https://www.googleapis.com/compute/v1/projects/fooooobbbarrr-dev/zones/us-central1-a/instanceGroupManagers/gke-qua-gke-foobar1234-default-pool-6302174e-grp
  locations:
  - us-central1-a
  management:
    autoRepair: true
    autoUpgrade: true
  name: default-pool
  podIpv4CidrSize: 24
  selfLink: https://container.googleapis.com/v1/projects/ffoooobarrrr-dev/locations/us-central1/clusters/qua-gke-foobar1234/nodePools/default-pool
  status: RUNNING
  version: 1.14.10-gke.27
kubectl describe horizontalpodautoscaler
Name: qua-gke-foobar1234
Namespace: default
Labels: <none>
Annotations: autoscaling.alpha.kubernetes.io/conditions:
[{"type":"AbleToScale","status":"True","lastTransitionTime":"2020-03-17T19:59:19Z","reason":"ReadyForNewScale","message":"recommended size...
autoscaling.alpha.kubernetes.io/current-metrics:
[{"type":"External","external":{"metricName":"pubsub.googleapis.com|subscription|num_undelivered_messages","metricSelector":{"matchLabels"...
autoscaling.alpha.kubernetes.io/metrics:
[{"type":"External","external":{"metricName":"pubsub.googleapis.com|subscription|num_undelivered_messages","metricSelector":{"matchLabels"...
kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"autoscaling/v2beta1","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"qua-gke-foobar1234","namespace":...
CreationTimestamp: Tue, 17 Mar 2020 12:59:03 -0700
Reference: Deployment/qua-gke-foobar1234
Min replicas: 1
Max replicas: 10
Deployment pods: 1 current / 1 desired
Events: <none>
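[Discussion]:
-
Check kubectl get pods --all-namespaces
-
HPA is for pod autoscaling, not node autoscaling. Have you enabled the node autoscaler? What minimum node count is set for scale-down?
-
You should check the logs to see what decisions the autoscaler is making: cloud.google.com/kubernetes-engine/docs/how-to/…
-
Check the pods in all namespaces and add more details to your question.
-
@coderanger Ah, I see: gist.github.com/danielyaa5/0779e29ca72869e7b290ae33c6817157, so some of these may be preventing the nodes from shutting down.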
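The check suggested in the comments can be roughly automated. The following is only a sketch, not the autoscaler's actual logic: it takes the parsed output of kubectl get pods --all-namespaces -o json and lists, per node, the pods that match some of the blocking conditions described in the cluster-autoscaler FAQ linked in the question (kube-system pods not run by a DaemonSet, pods without a controller, pods with local storage, pods annotated safe-to-evict=false). The function name scale_down_blockers is hypothetical, treating both emptyDir and hostPath volumes as local storage is an assumption, and PodDisruptionBudgets are not inspected at all.

```python
from collections import defaultdict


def scale_down_blockers(pod_list):
    """Return {node name: [(namespace, pod name, [reasons])]} for pods that
    might keep the cluster autoscaler from removing a node.

    pod_list is the dict parsed from `kubectl get pods --all-namespaces -o json`.
    This is an approximation of the rules in the cluster-autoscaler FAQ.
    """
    blockers = defaultdict(list)
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        spec = pod.get("spec", {})
        annotations = meta.get("annotations") or {}

        node = spec.get("nodeName")
        if not node:
            continue  # pod is not scheduled on any node yet

        owner_kinds = [o.get("kind") for o in meta.get("ownerReferences") or []]
        if "DaemonSet" in owner_kinds:
            continue  # DaemonSet pods run on every node and do not block removal
        if "kubernetes.io/config.mirror" in annotations:
            continue  # mirror (static) pods are ignored as well

        reasons = []
        if not owner_kinds:
            reasons.append("not backed by a controller")
        if meta.get("namespace") == "kube-system":
            # Blocks only if no PodDisruptionBudget allows eviction;
            # PDBs are not checked in this sketch.
            reasons.append("kube-system pod")
        # Assumption: emptyDir and hostPath both count as local storage.
        if any("emptyDir" in v or "hostPath" in v for v in spec.get("volumes") or []):
            reasons.append("local storage")
        if annotations.get("cluster-autoscaler.kubernetes.io/safe-to-evict") == "false":
            reasons.append("safe-to-evict=false")

        if reasons:
            blockers[node].append((meta.get("namespace"), meta.get("name"), reasons))
    return dict(blockers)
```

To try it, save the pod list with kubectl get pods --all-namespaces -o json > pods.json, load the file with json.load, and pass the result to the function. Nodes that appear in the returned dict are the ones worth comparing against the autoscaler's own logs.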
Tags: kubernetes google-cloud-platform google-kubernetes-engine autoscaling