Prometheus Service Discovery

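Defining scrape targets statically with static_configs is inconvenient: every time a target is added or removed, the configuration file has to be edited and Prometheus reloaded.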
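
For comparison, a static definition of the node targets might look roughly like the sketch below. The job name and addresses are borrowed from the file-based examples in this post; this only illustrates the approach being replaced and is not part of the final configuration.

```yaml
# Sketch only: the static alternative to file-based discovery.
# Job name and addresses are taken from the examples in this post.
scrape_configs:
  - job_name: 'nodes'
    static_configs:
      - targets:
          - 10.0.10.192:9100
          - 10.0.10.141:9100
        labels:
          env: other
```

With this layout, the target list lives inside prometheus.yml itself, so changing it means touching the main configuration.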

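Service discovery introduces an intermediary (a service registry) that holds the access information for all current scrape targets. Prometheus only needs to ask this intermediary which targets exist, and it picks up changes without a restart. This pattern is called service discovery.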

File-Based Service Discovery

/etc/prometheus/prometheus.yml

Configure each job to re-read its target files every two minutes:

# my global config
global:
  scrape_timeout: 10s
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.

  - job_name: 'prometheus'
    file_sd_configs:
      - files:
          - targets/prometheus-*.yaml
        refresh_interval: 2m

  - job_name: 'nodes'
    file_sd_configs:
      - files:
          - targets/nodes-*.yaml
        refresh_interval: 2m

Create the target files below. Prometheus discovers new and changed files automatically.

/etc/prometheus/targets/nodes-linux.yaml

- targets:
    - 10.0.10.192:9100
    - 10.0.10.141:9100
  labels:
    app: node-exporter
    job: node
    env: other

- targets:
    - localhost:9100
  labels:
    app: node-exporter
    job: node
    env: main

/etc/prometheus/targets/prometheus-*.yaml (any file name that matches the pattern used by the prometheus job)

- targets:
    - localhost:9090
  labels:
    app: prometheus
    job: prometheus
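
The automatic pickup can be tried by dropping one more file into the targets directory. The sketch below assumes a hypothetical file /etc/prometheus/targets/nodes-extra.yaml and a hypothetical host 10.0.10.200; neither is part of the setup described here. Because the file name matches targets/nodes-*.yaml, the nodes job should load it without a restart, at the latest on the next two-minute refresh.

```yaml
# Hypothetical file: /etc/prometheus/targets/nodes-extra.yaml
# The host below is a placeholder; replace it with a real node_exporter address.
- targets:
    - 10.0.10.200:9100
  labels:
    app: node-exporter
    job: node
    env: other
```

Deleting the file again should remove the target in the same way, which is the main practical difference from editing prometheus.yml by hand.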
