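This is the query we use to map container memory metrics:
sum by (container, pod, namespace, node, job)(container_memory_rss{container != "POD", image != "", container != ""})
To answer your specific question about why the value is higher: it is because it includes the memory of the node itself.
The kubelet (cAdvisor) reports memory metrics for multiple cgroups. For example, id="/" is the metric for the root cgroup, which covers the entire node.
For example, in my setup the following series is the node memory: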
{endpoint="https-metrics", id="/", instance="10.0.84.2:10250", job="kubelet", metrics_path="/metrics/cadvisor", node="ip-10-xx-x-x.us-west-2.compute.internal", service="kube-prometheus-stack-kubelet"}
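To make the effect concrete, here is a minimal Python sketch that mimics the label filter used in the query. The series and byte values are hypothetical, invented purely for illustration. It relies on the fact that the root cgroup series carries no `container` or `image` label, and that PromQL treats a missing label like an empty string, so the filter drops it.

```python
# Hypothetical samples as (labels, bytes); all values are invented for illustration.
samples = [
    # Root cgroup: the whole node, with no container/image/pod labels.
    ({"id": "/"}, 8_000_000_000),
    ({"id": "/kubepods/pod-a/c1", "container": "app", "image": "app:1", "pod": "pod-a"}, 300_000_000),
    ({"id": "/kubepods/pod-a/c2", "container": "sidecar", "image": "proxy:1", "pod": "pod-a"}, 100_000_000),
    # Pause container, labelled "POD" in this hypothetical setup.
    ({"id": "/kubepods/pod-a/pause", "container": "POD", "image": "pause:3", "pod": "pod-a"}, 1_000_000),
]


def is_real_container(labels):
    """Mimic {container != "POD", image != "", container != ""}.

    A missing label is treated as an empty string, as PromQL does.
    """
    container = labels.get("container", "")
    image = labels.get("image", "")
    return container not in ("", "POD") and image != ""


# Summing everything counts the node-level series on top of the containers.
unfiltered_total = sum(value for _, value in samples)

# Applying the filter keeps only the two application containers.
filtered_total = sum(value for labels, value in samples if is_real_container(labels))

print("unfiltered:", unfiltered_total)
print("filtered:  ", filtered_total)
```

In this toy data the unfiltered sum is dominated by the root cgroup series, which is the same inflation described above.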
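Also, at www.asserts.ai we take the max of the rss, working set, and usage metrics to arrive at the actual memory used by a container.
See our recording rules below for reference: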
#
- record: asserts:container_memory
  expr: sum by (container, pod, namespace, node, job, asserts_env, asserts_site)(container_memory_rss{container != "POD", image != "", container != ""})
  labels:
    source: rss
- record: asserts:container_memory
  expr: sum by (container, pod, namespace, node, job, asserts_env, asserts_site)(container_memory_working_set_bytes{container != "POD", image != "", container != ""})
  labels:
    source: working
- record: asserts:container_memory
  # Why sum? Multiple copies of the same container may be running in the same pod.
  expr: sum by (container, pod, namespace, node, job, asserts_env, asserts_site)
    (
      container_memory_usage_bytes{container != "POD", image != "", container != ""} -
      container_memory_cache{container != "POD", image != "", container != ""} -
      container_memory_swap{container != "POD", image != "", container != ""}
    )
  labels:
    source: usage
# For KPI rollup purposes
- record: asserts:resource:usage
  expr: |-
    max without (source) (asserts:container_memory)
      * on (namespace, pod, asserts_env, asserts_site) group_left(workload) asserts:mixin_pod_workload
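As a rough illustration of what `max without (source)` does in the last rule, the following Python sketch groups series by every label except `source` and keeps the largest value in each group. The sample values and the `max_without` helper are hypothetical, and the `group_left(workload)` join is not modelled here.

```python
from collections import defaultdict

# Hypothetical output of the three asserts:container_memory rules for one container.
series = [
    ({"container": "app", "pod": "pod-a", "namespace": "default", "source": "rss"}, 250_000_000),
    ({"container": "app", "pod": "pod-a", "namespace": "default", "source": "working"}, 310_000_000),
    ({"container": "app", "pod": "pod-a", "namespace": "default", "source": "usage"}, 280_000_000),
]


def max_without(series, dropped_label):
    """Group series by all labels except dropped_label and keep the max per group."""
    groups = defaultdict(list)
    for labels, value in series:
        # The grouping key is the label set with the dropped label removed.
        key = tuple(sorted((k, v) for k, v in labels.items() if k != dropped_label))
        groups[key].append(value)
    return {key: max(values) for key, values in groups.items()}


result = max_without(series, "source")
for key, value in result.items():
    print(dict(key), value)
```

With these made-up numbers the three `source` variants collapse into a single series carrying the working set value, since it is the largest of the three.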