Integrating node-exporter into prometheus-operator

# How Prometheus Operator integrates exporters

## 1. Architecture

### 1.1 Traditional Prometheus architecture

![image.png](https://cos.easydoc.net/97954506/files/lajbawt7.png)

In the traditional Prometheus architecture, after deploying the Prometheus server and an exporter, the two are wired together by editing `prometheus.yaml`. The snippet below adds mysqld_exporter:

```
- job_name: mysqld_exporter
  static_configs:
    - targets: ['192.168.8.121:9104']
      labels:
        instance: mysql
```

The drawback of this approach is that every newly added exporter requires another edit to the configuration file, which in turn means reloading or restarting Prometheus. That clearly does not scale.

### 1.2 Prometheus Operator architecture

![image.png](https://cos.easydoc.net/97954506/files/lajbefi8.png)

In the Prometheus Operator architecture, the Prometheus server finds exporters through service discovery. It works like this: create a labeled Service for the exporter, then create a ServiceMonitor whose selector matches that Service's labels; when the Operator observes the ServiceMonitor, it dynamically adds the corresponding exporter to the Prometheus server's scrape configuration.

## 2. Best practices

### [Deploying Prometheus Operator with Helm 3](https://blog.csdn.net/weixin_47003048/article/details/111397787)

### 2.1 node-exporter

#### 2.1.1 Deploy node-exporter

The cluster's own nodes get node-exporter as part of the Prometheus Operator deployment, so that part needs no further explanation. For machines outside the Kubernetes cluster (such as the probe hosts), the only option is to run the node-exporter binary directly to collect host metrics. Deployment is simple: copy the binary to the server and run it; see [node_exporter configuration](https://www.jianshu.com/p/7bec152d1a1f).

#### 2.1.2 Create a labeled Service for node-exporter

```
kind: Service
apiVersion: v1
metadata:
  name: other            # job name
  namespace: default
  labels:
    app: other
spec:
  ports:
  - port: 9100
    name: metrics
    protocol: TCP
    targetPort: 9100
---
kind: Endpoints
apiVersion: v1
metadata:
  name: other
  namespace: default
subsets:
- addresses:
  - ip: 10.1.125.60
  - ip: 10.1.125.24
  - ip: 10.1.125.32
  ports:
  - port: 9100
    name: metrics
    protocol: TCP
---
# Create a ServiceMonitor object so Prometheus picks up the new targets.
# The ServiceMonitor is associated with the Service that fronts the
# metrics endpoint; make sure that Service can actually reach the metrics.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: easyviews-other-nodeexporter   # name of the ServiceMonitor
  namespace: default                   # namespace of the ServiceMonitor
  labels:
    app: other                         # label of the ServiceMonitor
spec:
  endpoints:
  - port: metrics    # name given to port 9100 in the "other" Service
    interval: 30s
  namespaceSelector:
    matchNames:
    - default        # namespace of the Service to match
  selector:
    matchLabels:
      app: other     # label of the Service
```

#### 2.1.3 Apply the manifest

`kubectl apply -f xxx.yaml`

#### 2.1.4 Verify

Once the objects are created, the three new probe targets appear on the Prometheus server. The ServiceMonitor scan runs at an interval, so it may take a short while before they show up.

![image.png](https://cos.easydoc.net/97954506/files/lajbm7od)

The Grafana UI shows that the three new machines have joined as well.

![image.png](https://cos.easydoc.net/97954506/files/lajbmswg)

# Monitoring Redis

## Install redis_exporter

redis_exporter can be deployed on any server, because the address of the Redis instance to monitor is passed on the command line when redis_exporter starts.

```
cd /usr/local/src
wget https://github.com/oliver006/redis_exporter/releases/download/v1.6.1/redis_exporter-v1.6.1.linux-amd64.tar.gz
tar -zxf redis_exporter-v1.6.1.linux-amd64.tar.gz
mv redis_exporter-v1.6.1.linux-amd64 /usr/local/redis_exporter
```

Run `./redis_exporter --help` to see what each flag means; the most commonly used ones are:

```
-redis.addr string           address of the Redis instance(s); separate multiple
                             nodes with commas; defaults to "redis://localhost:6379"
-redis.password string       password of the Redis instance
-web.listen-address string   address the exporter listens on, default 0.0.0.0:9121
```

Start the redis_exporter service. Create a unit file (managed by systemd):

```
cat > /usr/lib/systemd/system/redis_exporter.service <<EOF
[Unit]
Description=redis_exporter
Documentation=https://github.com/oliver006/redis_exporter
After=network.target

[Service]
Type=simple
User=root
ExecStart=/usr/local/redis_exporter/redis_exporter -redis.addr 10.1.128.85:6379 -redis.password qwer1234
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl start redis_exporter
systemctl status redis_exporter
systemctl enable redis_exporter
ss -tln | grep 9121
```

## Add the scrape targets

The redis_exporter target has to be added to the Prometheus server:

```
vim /usr/local/prometheus/prometheus.yml

- job_name: 'redis'
  scrape_interval: 10s
  static_configs:
    - targets: ['192.168.0.184:9121']
      labels:
        instance: redis-01

- job_name: 'redis-node'
  scrape_interval: 10s
  static_configs:
    - targets: ['192.168.0.184:9100']
      labels:
        instance: redis-01
```

Restart the Prometheus server:

```
systemctl restart prometheus
# or hot-reload (requires Prometheus to be started with --web.enable-lifecycle)
curl -X POST localhost:9090/-/reload
```

Restarting the service every time a target is added is clearly unreasonable, so we switch to the Operator approach.

## Integrating into prometheus-operator

```
cat > redis-exporter.yaml <<EOF
kind: Service
apiVersion: v1
metadata:
  name: redis-exporter   # job name
  namespace: default
  labels:
    app: redis-exporter
spec:
  ports:
  - port: 19121
    name: metrics
    protocol: TCP
    targetPort: 9121
---
kind: Endpoints
apiVersion: v1
metadata:
  name: redis-exporter
  namespace: default
subsets:
- addresses:
  - ip: 10.1.128.85
  ports:
  - port: 9121
    name: metrics
    protocol: TCP
---
# Create a ServiceMonitor object so Prometheus picks up the new target.
# The ServiceMonitor is associated with the Service that fronts the
# metrics endpoint; make sure that Service can actually reach the metrics.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: easyviews-redis-exporter   # name of the ServiceMonitor
  namespace: default               # namespace of the ServiceMonitor
  labels:
    app: redis-exporter            # label of the ServiceMonitor
spec:
  endpoints:
  - port: metrics    # name given to port 9121 in the redis-exporter Service
    interval: 30s
  namespaceSelector:
    matchNames:
    - default        # namespace of the Service to match
  selector:
    matchLabels:
      app: redis-exporter   # label of the Service
EOF

kubectl apply -f redis-exporter.yaml
```

Once the objects are created, the new redis-exporter target appears on the Prometheus server. The ServiceMonitor scan runs at an interval, so it may take a short while before it shows up.

![image.png](https://cos.easydoc.net/97954506/files/lajde3h4.png)

Import Grafana dashboard `763`.

![image.png](https://cos.easydoc.net/97954506/files/lajdx12d.png)

# Monitoring Kafka

`wget https://github.com/danielqsj/kafka_exporter/releases/download/v1.4.2/kafka_exporter-1.4.2.linux-amd64.tar.gz`

```
tar -xf kafka_exporter-1.4.2.linux-amd64.tar.gz
mkdir -p /opt/kafka_exporter/
cp ./kafka_exporter-1.4.2.linux-amd64/kafka_exporter /opt/kafka_exporter/

# create the start script
cat > /opt/start_kafka_exporter.sh <<EOF
#!/bin/bash
ps -ef|grep kafka_exporter |grep -v grep
if [[ \$? == 0 ]];then
kill -9 \`ps -ef|grep kafka_exporter |egrep -v "grep|.sh"|awk '{print \$2}'\`
else
echo "kafka_exporter is closed!"
fi
nohup /opt/kafka_exporter/kafka_exporter --kafka.server=localhost:19092 &
EOF

# create the stop script
cat > /opt/stop_kafka_exporter.sh << EOF
#!/bin/bash
ps -ef|grep kafka_exporter |grep -v grep
if [[ \$? == 0 ]];then
kill -9 \`ps -ef|grep kafka_exporter |egrep -v "grep|.sh"|awk '{print \$2}'\`
else
echo "kafka_exporter is closed!"
fi
EOF

# make the scripts executable
chmod +x /opt/s*_kafka_exporter.sh

cat > /usr/lib/systemd/system/kafka-export.service <<EOF
[Unit]
Description=kafka_exporter stats exporter for Prometheus
Documentation=https://github.com/danielqsj/kafka_exporter/

[Service]
Type=forking
ExecStart=/opt/start_kafka_exporter.sh
ExecStop=/opt/stop_kafka_exporter.sh
Group=root
User=root

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl start kafka-export.service
systemctl enable kafka-export.service
systemctl status kafka-export.service
```

Check http://10.1.128.87:9308/metrics

![image.png](https://cos.easydoc.net/97954506/files/laclb8ww.png)

As before, create a Service (with manually maintained Endpoints) for the exporter and a ServiceMonitor that selects it, so Prometheus picks up the new targets:

```
kind: Service
apiVersion: v1
metadata:
  name: kafka-exporter   # job name
  namespace: default
  labels:
    app: kafka-exporter
spec:
  ports:
  - port: 9308
    name: metrics
    protocol: TCP
    targetPort: 9308
---
kind: Endpoints
apiVersion: v1
metadata:
  name: kafka-exporter
  namespace: default
subsets:
- addresses:
  - ip: 10.1.128.87
  - ip: 10.1.128.86
  - ip: 10.1.128.89
  ports:
  - port: 9308
    name: metrics
    protocol: TCP
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: easyviews-kafka-exporter   # name of the ServiceMonitor
  namespace: default               # namespace of the ServiceMonitor
  labels:
    app: kafka-exporter            # label of the ServiceMonitor
spec:
  endpoints:
  - port: metrics    # name given to port 9308 in the kafka-exporter Service
    interval: 30s
  namespaceSelector:
    matchNames:
    - default        # namespace of the Service to match
  selector:
    matchLabels:
      app: kafka-exporter   # label of the Service
```

```
[root@master01 export]# kubectl apply -f kafkaexp.yaml
service/kafka-exporter created
endpoints/kafka-exporter created
servicemonitor.monitoring.coreos.com/easyviews-kafka-exporter created
```

After a short wait the new targets show up.

![image.png](https://cos.easydoc.net/97954506/files/lacld0rk.png)

Add the data source in Grafana and import dashboard `7589`.

![image.png](https://cos.easydoc.net/97954506/files/laclfkoe.png)
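A side note on the start/stop scripts above: they are generated with unquoted heredocs (`<<EOF`), so `$` and backticks are escaped (`\$?`, `` \`...\` ``) to stop the shell from expanding them while the file is being written; they must expand later, when the generated script runs. A minimal sketch of the same technique (the file name `/tmp/demo.sh` is just an illustration):

```
# Unquoted heredoc: without the backslash, $? would be expanded by the
# shell writing the file, not by the script when it later runs.
cat > /tmp/demo.sh <<EOF
#!/bin/bash
echo "last exit status: \$?"
EOF

# The literal $? survived into the generated script:
grep 'status' /tmp/demo.sh
# → echo "last exit status: $?"
```

Quoting the delimiter (`<<'EOF'`) would have the same effect without any escaping, at the cost of disabling all expansion inside the heredoc.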
![image.png](https://cos.easydoc.net/97954506/files/lad9z4rk.png)
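To recap, every out-of-cluster exporter in this walkthrough is wired up with the same three objects: a selector-less Service, a manually maintained Endpoints with the same name, and a ServiceMonitor whose selector matches the Service labels. A generic sketch of the pattern (the name `my-exporter`, port `9999`, and IP `10.0.0.1` are placeholders, not values from this deployment):

```
kind: Service
apiVersion: v1
metadata:
  name: my-exporter      # placeholder job name
  namespace: default
  labels:
    app: my-exporter
spec:
  ports:
  - port: 9999           # placeholder exporter port
    name: metrics
    protocol: TCP
    targetPort: 9999
---
kind: Endpoints
apiVersion: v1
metadata:
  name: my-exporter      # must match the Service name
  namespace: default
subsets:
- addresses:
  - ip: 10.0.0.1         # placeholder host running the exporter
  ports:
  - port: 9999
    name: metrics
    protocol: TCP
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-exporter
  namespace: default
  labels:
    app: my-exporter
spec:
  endpoints:
  - port: metrics        # the port *name* from the Service above
    interval: 30s
  namespaceSelector:
    matchNames:
    - default
  selector:
    matchLabels:
      app: my-exporter   # must match the Service labels
```

The invariants to keep straight are: the Endpoints name equals the Service name, `spec.endpoints[].port` in the ServiceMonitor refers to the Service port by *name*, and the ServiceMonitor's `selector.matchLabels` matches the Service's labels.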