Prometheus Monitoring in Practice

Power your metrics and alerting with a leading open-source monitoring solution.

Sample Prometheus server configuration

# Global settings
global:
  scrape_interval: 60s     # Scrape targets every 60 seconds (equal to the 1-minute default).
  evaluation_interval: 60s # Evaluate rules every 60 seconds (equal to the 1-minute default).
  # scrape_timeout is left at the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
  # - compute_metrics.rules

# Scrape configurations, one job per kind of target.
# ================================================================
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  # Despite its name, this job scrapes the targets listed in node.d/*.yml.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    # static_configs:

    file_sd_configs:
      - files: ['/opt/prometheus/conf/node.d/*.yml']

  - job_name: 'metricslogfile'
    # Overrides the default metrics_path of '/metrics'.
    metrics_path: '/metrics/logfile'
    # scheme defaults to 'http'.

    # static_configs:

    file_sd_configs:
      - files: ['/opt/prometheus/conf/node.d/*.yml']

  # ================================================================
  - job_name: 'blackbox'
    metrics_path: /probe
    params:
      module: [http_2xx] # Expect an HTTP 2xx response.
    # static_configs:
    #   - targets:
    #       - https://123.xxxx.com # Target to probe with http.
    #       - http://map.xxxx.com
    file_sd_configs:
      - files: ['/opt/prometheus/conf/blackbox.d/*.yml']
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: blackbox.server.local:19115 # The blackbox exporter's real hostname:port.
  # ================================================================
  - job_name: 'mobile.xxxx.com'
    metrics_path: /probe
    params:
      module: [http_2xx_mobile_xxxx_com] # Custom module; must be defined in the blackbox exporter's config.
    file_sd_configs:
      - files: ['/opt/prometheus/conf/importantServices.d/xxxx.yml']
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: blackbox.server.local:19115 # The blackbox exporter's real hostname:port.
  # ================================================================
  - job_name: 'nginxstatus'
    file_sd_configs:
      - files: ['/opt/prometheus/conf/nginx.d/*.yml']
  # ================================================================
  - job_name: 'jenkins'
    metrics_path: /prometheus
    static_configs:
      - targets:
          - '10.142.112.144:8080'
        labels:
          cluster: '业务' # "business"
          group: '产品'   # "product"
  # ================================================================
  # Prometheus scraping its own metrics endpoint.
  - job_name: 'prometheustatus'
    metrics_path: /metrics
    static_configs:
      - targets:
          - '127.0.0.1:19090'
        labels:
          cluster: 'Prometheus'
          group: '生产环境' # "production environment"
  # ================================================================
  - job_name: 'etcd'
    metrics_path: /metrics
    file_sd_configs:
      - files: ['/opt/prometheus/conf/etcd.d/*.yml']
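
The two probe jobs above rely on relabeling: each address read from the target file is copied into the `target` URL parameter and into the `instance` label, and `__address__` is then pointed at the blackbox exporter. Prometheus therefore ends up requesting `http://blackbox.server.local:19115/probe?module=http_2xx&target=<address>`. The module named in `params` has to exist in the blackbox exporter's own configuration file, which the original setup does not show. The following is only a minimal sketch of what that file could look like; in particular, the contents of the custom `http_2xx_mobile_xxxx_com` module (its timeout and the `User-Agent` header) are illustrative assumptions.

```yaml
# blackbox.yml on the blackbox exporter host.
# Hypothetical contents, not taken from the original setup.
modules:
  http_2xx:
    prober: http
    timeout: 5s
    http:
      method: GET
      # An empty list means "accept any 2xx status", which is the exporter's default.
      valid_status_codes: []

  # Custom module referenced by the 'mobile.xxxx.com' job.
  # The settings below are assumed for illustration only.
  http_2xx_mobile_xxxx_com:
    prober: http
    timeout: 10s
    http:
      method: GET
      headers:
        User-Agent: "Mozilla/5.0 (iPhone; CPU iPhone OS 16_0 like Mac OS X)"
```

Each probe reports its outcome in the `probe_success` metric (1 on success, 0 on failure), labeled with the `instance` value set by the relabeling rules.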

Sample sub-configurations (target files for file_sd_configs)

  • node

    - targets: [
        '10.0.0.1:9100',
        '10.0.0.2:9100',
        '10.0.0.3:9100',
      ]
      labels:
        cluster: '业务' # "business"
        group: '产品'   # "product"
  • blackbox

    - targets: [
        'http://www.xxxx.com/',
      ]
      labels:
        cluster: '业务' # "business"
        group: '产品'   # "product"
  • etcd monitoring

    - targets: [
        'etcd01.localhost:2379',
        'etcd02.localhost:2379',
        'etcd03.localhost:2379',
      ]
      labels:
        cluster: 'etcd.业务' # "etcd.business"
        group: '产品'        # "product"
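
The `cluster` and `group` labels set in these target files are attached to every series scraped from the listed targets, so they are available in rule expressions and alert annotations. As a sketch of what one of the rule files commented out under `rule_files` might contain (the file path, alert names, durations, and severity value are assumptions, not taken from the original setup):

```yaml
# /opt/prometheus/conf/rules.d/availability.yml (hypothetical path)
groups:
  - name: availability
    rules:
      # Fires when a scrape target has been unreachable for 5 minutes.
      - alert: InstanceDown
        expr: up == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "{{ $labels.instance }} ({{ $labels.cluster }}/{{ $labels.group }}) is down"

      # Fires when a blackbox probe has been failing for 5 minutes.
      - alert: ProbeFailed
        expr: probe_success == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Probe of {{ $labels.instance }} is failing"
```

A file like this only takes effect once it is listed under `rule_files` in the server configuration; `promtool check rules <file>` can validate it beforehand.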

Related resources

prometheus

blackbox

telegraf + influxdb