【Question title】: Configure prometheus to collect custom metrics from dockerized nodejs pod
【Posted】: 2020-06-18 14:43:02
【Question】:

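I have set up prom-client (an unofficial Prometheus client library) to collect the custom metrics I need. I deployed the Prometheus server with Helm, following the EKS setup guide. Now I am trying to edit the default ConfigMap so that it scrapes my application's metrics, but I get this error: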

parsing YAML file /etc/config/prometheus.yml: yaml: unmarshal errors:\n line 22: field cluster_ip not found in type kubernetes.plain\n line 25: cannot unmarshal !!str `default` into []string

This is the prometheus.yaml ConfigMap file I wrote by following the documentation:

apiVersion: v1
data:
  alerting_rules.yml: |
    {}
  alerts: |
    {}
  prometheus.yml: |
    global:
      evaluation_interval: 1m
      scrape_interval: 1m
      scrape_timeout: 10s
    rule_files:
    - /etc/config/recording_rules.yml
    - /etc/config/alerting_rules.yml
    - /etc/config/rules
    - /etc/config/alerts
    scrape_configs:
    ...DEFAULT CONFIGS...
    - job_name: my_metrics
      scrape_interval: 5m
      scrape_timeout: 10s
      honor_labels: true
      metrics_path: /api/metrics
      kubernetes_sd_configs:
        - role: service
          cluster_ip: 10.100.200.92
          namespaces:
            names:
              default
  recording_rules.yml: |
    {}
  rules: |
    {}
kind: ConfigMap
metadata:
  creationTimestamp: "2020-06-08T09:26:38Z"
  labels:
    app: prometheus
    chart: prometheus-11.3.0
    component: server
    heritage: Helm
    release: prometheus
  name: prometheus-server
  namespace: prometheus
  uid: 8fadb17a-f5c5-4f9d-a931-fa1f77684847

The cluster_ip here is the IP assigned to the Service that exposes my deployment.

My deployment.yaml file:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 2
  strategy:
    type: RollingUpdate
  selector:
    matchLabels:
      name: myapp
  template:
    metadata:
      labels:
        name: myapp
    spec:
      containers:
        - image: IMAGE_URL:BUILD_NUMBER
          name: myapp
          resources:
              limits:
                cpu: "1000m"
                memory: "2400Mi"
              requests:
                cpu: "500m"
                memory: "2000Mi"
          imagePullPolicy: IfNotPresent
          ports:
              - containerPort: 5000
                name: myapp

My service.yaml file, which exposes the deployment:

apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    deploy: staging
    name: myapp
  type: ClusterIP
  ports:
    - port: 80
      targetPort: 5000
      protocol: TCP

If there is a different or more effective way to collect metrics from my application, please let me know. Thanks.

【Discussion】:

    Tags: docker kubernetes prometheus kubernetes-helm


    【Solution 1】:

    This is what I use to enable Prometheus scraping inside the cluster.

    In the scrape config I have this snippet:

          - job_name: 'kubernetes-pods'
            kubernetes_sd_configs:
              - role: pod
            relabel_configs:
              - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
                action: keep
                regex: true
              - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
                action: replace
                target_label: __metrics_path__
                regex: (.+)
              - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
                action: replace
                regex: ([^:]+)(?::\d+)?;(\d+)
                replacement: $1:$2
                target_label: __address__
              - action: labelmap
                regex: __meta_kubernetes_pod_label_(.+)
              - source_labels: [__meta_kubernetes_namespace]
                action: replace
                target_label: kubernetes_namespace
              - action: labeldrop
                regex: '(kubernetes_pod|app_kubernetes_io_instance|app_kubernetes_io_name|instance)'
    
    

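    This is taken directly from the default values of the prometheus helm chart: https://github.com/helm/charts/blob/master/stable/prometheus/values.yaml#L1452

    What it does is instruct Prometheus to scrape every pod that has the annotation prometheus.io/scrape: "true" set. With these further annotations on the pod, you can configure the port and path that get scraped: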

    prometheus.io/path: "/metrics"
    prometheus.io/port: "9090"
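
    For the application in the question, the values would presumably be the ones below. This is a sketch that assumes the metrics endpoint is served on the container port from the question's deployment (5000) at the path from the question's original job (/api/metrics). Annotation values have to be strings, hence the quotes.

    ```yaml
    # Assumed values, taken from the question's deployment.yaml and scrape job
    prometheus.io/scrape: "true"
    prometheus.io/port: "5000"          # containerPort of the myapp container
    prometheus.io/path: "/api/metrics"  # metrics_path used in the original job
    ```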
    

    So you also need to modify deployment.yaml to specify these annotations on the pod template:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: myapp
    spec:
      replicas: 2
      strategy:
        type: RollingUpdate
      selector:
        matchLabels:
          name: myapp
      template:
        metadata:
          labels:
            name: myapp
          annotations:
            prometheus.io/scrape: "true"
            prometheus.io/port: "<enter port of pod to scrape>"
            prometheus.io/path: "<enter path to scrape>"
        spec:
          containers:
            - image: IMAGE_URL:BUILD_NUMBER
    ...
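
    As for the two parse errors quoted in the question, they appear to point at specific lines of the original job: cluster_ip is not a field that kubernetes_sd_configs accepts, and namespaces.names expects a YAML list rather than a bare string. A minimal sketch of that job with both problems removed might look like this. The relabel_configs entry, which keeps only the Service named myapp, is an addition that is not in the question.

    ```yaml
    - job_name: my_metrics
      scrape_interval: 5m
      scrape_timeout: 10s
      honor_labels: true
      metrics_path: /api/metrics
      kubernetes_sd_configs:
        - role: service
          # no cluster_ip field: targets are discovered through the API server
          namespaces:
            names:
              - default        # a list item, not a bare string
      relabel_configs:
        # Assumption: restrict this job to the Service called myapp
        - source_labels: [__meta_kubernetes_service_name]
          action: keep
          regex: myapp
    ```

    Two caveats apply to this variant. Scraping through a ClusterIP Service reaches only one of the two replicas on each scrape, which is one reason the pod-based job above is usually the better fit for application metrics. Also, the Service in the question selects on deploy: staging as well as name: myapp, while the pod template shown only carries name: myapp; if those manifests are complete, the Service would have no endpoints to scrape.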
    

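    【Comments】:

    • Thanks for your answer, it works well. Just one more question: do I have to write those relabel configs, or will the annotations work on their own?
    • @warl0ck I believe the relabel_configs are required for the annotations to work, but I haven't tested running without them, so don't hold me to that statement :)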