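SkyWalking is an open-source distributed tracing system that originated in China. So how do distributed tracing, monitoring systems, and logging systems differ? In essence, tracing is itself a form of monitoring, and both tracing and monitoring are ultimately built on logs.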

SkyWalking documentation (Chinese translation): https://skyapm.github.io/document-cn-translation-of-skywalking/zh/8.0.0/

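What sets tracing apart from everyday monitoring is that we can act on its results more proactively. Take Prometheus as an example: Prometheus collects the data, Grafana displays it, and alerts fire according to the rules you define. But we rarely study the Prometheus graphs on our own initiative, conclude that something is about to break, and deal with it in advance. What actually happens is that an alert fires, I look at the situation, then read the logs, and, drawing on experience, fix the problem and optimize afterwards. In routine operations this is reactive behavior, which you can think of as "mending the pen after the sheep are lost".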
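The reactive loop described above usually starts from an alerting rule. The following is a minimal sketch of a Prometheus rule file, not a rule from the author's setup: the group name, alert name, metric name `http_request_duration_seconds_bucket`, and the 0.5s threshold are all assumptions chosen for illustration.

```yaml
groups:
- name: example-latency            # hypothetical group name
  rules:
  - alert: HighRequestLatency      # hypothetical alert name
    # p95 request latency over the last 5 minutes; the metric name is an assumption
    expr: histogram_quantile(0.95, sum by (le) (rate(http_request_duration_seconds_bucket[5m]))) > 0.5
    for: 10m                       # the condition must hold for 10 minutes before the alert fires
    labels:
      severity: warning
    annotations:
      summary: "p95 request latency has been above 0.5s for 10 minutes"
```

A rule like this only tells you that latency crossed a threshold; finding out why still means going to the logs or traces, which is the point the paragraph above makes.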

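Once tracing software is enabled, you can see which call chains are used most often, which functions or methods run slowly, and which connections (to XXX, say) have high latency. You can then schedule the optimizations that pay off the most, at a point where the business has not actually broken and is perhaps only a little slow. Of course, sometimes you only start analyzing because a service is already slow in use, and that is no different from ordinary reactive monitoring. In routine operations, though, what we expect from tracing is the former: proactive behavior, which you can think of as "repairing the roof before it rains".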

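What about the logging system? A logging system collects large volumes of logs, whereas monitoring and tracing collect and aggregate only the log data they need, derive the values and targets they care about, and present them in different ways. Logs are therefore the lowest layer. When a monitoring alert fires, looking at the graph alone is not enough: I have to read the logs from that moment to find out what actually made the system or the business fluctuate. Tracing is the same: if a function runs slowly, I have to look at its logic and at what its processing went through before I can tune it.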

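Among APM tools today, SkyWalking and Pinpoint require no changes to application code at all. That suits operations staff better: Zipkin-style tools are intrusive to the code, which means someone has to take responsibility for the risk, and we would rather not lightly take the blame for problems in a running business service. For a detailed comparison, see https://www.jianshu.com/p/626cae6c0522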
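For a Java service, "no code intrusion" typically means attaching the SkyWalking agent with the JVM's `-javaagent` option and configuring it through environment variables. The fragment below is a sketch under stated assumptions: the container name, image, agent path `/skywalking/agent`, and the OAP Service name `skywalking-oap` are hypothetical, while `JAVA_TOOL_OPTIONS`, `SW_AGENT_NAME`, and `SW_AGENT_COLLECTOR_BACKEND_SERVICES` are the standard JVM and agent settings.

```yaml
# Fragment of a Deployment's pod template (spec.template.spec); not a complete manifest.
containers:
- name: demo-app                                  # hypothetical application container
  image: registry.example.com/demo-app:latest     # assumed to ship the unpacked agent under /skywalking/agent
  env:
  - name: JAVA_TOOL_OPTIONS                       # picked up by the JVM at startup, so the start command is untouched
    value: -javaagent:/skywalking/agent/skywalking-agent.jar
  - name: SW_AGENT_NAME                           # service name reported to SkyWalking
    value: demo-app
  - name: SW_AGENT_COLLECTOR_BACKEND_SERVICES     # address of the OAP backend
    value: skywalking-oap.default:11800           # hypothetical Service name; 11800 is the default OAP gRPC port
```

Nothing in the application's source or build changes here; removing the three environment variables detaches the agent again.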

We install SkyWalking by running it inside k8s. The official guide installs it with Helm; here I have already exported the YAML and adjusted it.

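elasticsearch: SkyWalking supports many storage backends (https://skyapm.github.io/document-cn-translation-of-skywalking/zh/8.0.0/setup/backend/backend-storage.html). Your Elasticsearch does not have to run in a container, so this step is optional; if it does run in a container, remember to allocate storage for it so the data persists. With only one node, the manifest below fails to come back up after a restart: the cluster cannot reach green status, so the health check (the readiness probe) never passes. For standalone testing, simply comment out the health-check section.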

apiVersion: v1
kind: Service
metadata:
  name: skywalking-elasticsearch
  namespace: default
  labels:
    app: skywalking-elasticsearch
spec:
  ports:
  - name: http
    port: 9200
    protocol: TCP
    targetPort: 9200
  - name: transport
    port: 9300
    protocol: TCP
    targetPort: 9300
  selector:
    app: skywalking-elasticsearch
---
apiVersion: v1
kind: Service
metadata:
  name: skywalking-elasticsearch-headless
  namespace: default
  labels:
    app: skywalking-elasticsearch
spec:
  clusterIP: None
  publishNotReadyAddresses: true
  ports:
  - name: http
    port: 9200
    protocol: TCP
    targetPort: 9200
  - name: transport
    port: 9300
    protocol: TCP
    targetPort: 9300
  selector:
    app: skywalking-elasticsearch
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: skywalking-elasticsearch
  namespace: default
  labels:
    app: skywalking-elasticsearch
spec:
  replicas: 1
  podManagementPolicy: Parallel
  selector:
    matchLabels:
      app: skywalking-elasticsearch
  serviceName: skywalking-elasticsearch-headless
  template:
    metadata:
      name: skywalking-elasticsearch
      labels:
        app: skywalking-elasticsearch
    spec:
#       affinity:
#         podAntiAffinity:
#           requiredDuringSchedulingIgnoredDuringExecution:
#           - labelSelector:
#               matchExpressions:
#               - key: app
#                 operator: In
#                 values:
#                 - skywalking-elasticsearch
#             topologyKey: kubernetes.io/hostname
      initContainers:
      - command:
        - sysctl
        - -w
        - vm.max_map_count=262144
        image: docker.elastic.co/elasticsearch/elasticsearch:7.5.1
        imagePullPolicy: IfNotPresent
        name: configure-sysctl
        resources: {}
        securityContext:
          privileged: true
          runAsUser: 0
      securityContext:
        fsGroup: 1000
        runAsUser: 1000
      containers:
      - env:
        - name: node.name
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: cluster.initial_master_nodes
          value: skywalking-elasticsearch-0
        - name: discovery.seed_hosts
          value: skywalking-elasticsearch-headless
        - name: cluster.name
          value: skywalking-elasticsearch
        - name: network.host
          value: 0.0.0.0
        - name: ES_JAVA_OPTS
          value: -Xmx1g -Xms1g
        - name: node.data
          value: "true"
        - name: node.ingest
          value: "true"
        - name: node.master
          value: "true"
        name: skywalking-elasticsearch
        image: docker.elastic.co/elasticsearch/elasticsearch:7.5.1
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 9200
          name: http
          protocol: TCP
        - containerPort: 9300
          name: transport
          protocol: TCP
        resources:
          limits:
            cpu: "1"
            memory: 2Gi
          requests:
            cpu: 100m
            memory: 2Gi
        readinessProbe:
          exec:
            command:
            - sh
            - -c
            - |
              #!/usr/bin/env bash -e
              # If the node is starting up wait for the cluster to be ready (request params: 'wait_for_status=green&timeout=1s' )
              # Once it has started only check that the node itself is responding
              START_FILE=/tmp/.es_start_file

              http () {
                  local path="${1}"
                  if [ -n "${ELASTIC_USERNAME}" ] && [ -n "${ELASTIC_PASSWORD}" ]; then
                    BASIC_AUTH="-u ${ELASTIC_USERNAME}:${ELASTIC_PASSWORD}"
                  else
                    BASIC_AUTH=''
                  fi
                  curl -XGET -s -k --fail ${BASIC_AUTH} http://127.0.0.1:9200${path}
              }

              if [ -f "${START_FILE}" ]; then
                  echo 'Elasticsearch is already running, lets check the node is healthy and there are master nodes available'
                  http "/_cluster/health?timeout=0s"
              else
                  echo 'Waiting for the elasticsearch cluster to be ready (request params: "wait_for_status=green&timeout=1s" )'
                  if http "/_cluster/health?wait_for_status=green&timeout=1s" ; then
                      touch ${START_FILE}
                      exit 0
                  else
                      echo 'Cluster is not yet ready (request params: "wait_for_status=green&timeout=1s" )'
                      exit 1
                  fi
              fi
          failureThreshold: 3
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 3
          timeoutSeconds: 5
        securityContext:
          capabilities:
            drop:
            - ALL
          runAsNonRoot: true
          runAsUser: 1000
        volumeMounts:
        - name: skywalking-elasticsearch
          mountPath: /usr/share/elasticsearch/data
      terminationGracePeriodSeconds: 120
  volumeClaimTemplates:
  - metadata:
      name: skywalking-elasticsearch
    spec:
      accessModes:
        - ReadWriteOnce
      storageClassName: yizhuang-nfs
      resources:
        requests:
          storage: 100Gi
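
For single-node testing, one alternative to commenting out the readiness probe in the manifest above is to wait for yellow instead of green. On a single node, indices that ask for replicas stay yellow because the replica shards have nowhere to be assigned. The fragment below is a simplified sketch, not the author's configuration: it assumes a test cluster without authentication and drops the start-file logic of the original probe.

```yaml
# Drop-in replacement for the readinessProbe of the skywalking-elasticsearch container.
# Assumption: single-node test cluster, no ELASTIC_USERNAME/ELASTIC_PASSWORD set.
readinessProbe:
  exec:
    command:
    - sh
    - -c
    - |
      # yellow = all primary shards assigned; replicas may remain unassigned on one node
      curl -s --fail "http://127.0.0.1:9200/_cluster/health?wait_for_status=yellow&timeout=1s"
  failureThreshold: 3
  initialDelaySeconds: 10
  periodSeconds: 10
  successThreshold: 1
  timeoutSeconds: 5
```

This keeps a health check in place while letting the pod become ready on one node; for a multi-node production cluster the original green check is the stricter choice.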