Commit graph

119 commits

Author SHA1 Message Date
Selçuk Arıbalı
c98a04784e
Fix KubernetesPodNotHealthy alert
kube-state-metrics sets the series for a pod's current phase to 1, so the KubernetesPodNotHealthy alert was fixed accordingly.
2020-04-02 21:01:04 +03:00
Samuel Berthe
c20227b458
oops: adding one-to-one vector matching to mysql subqueries 2020-03-31 16:02:28 +02:00
Matthias Crauwels
79b5ad3b5d
removed avg grouping where possible 2020-03-31 11:42:05 +02:00
Matthias Crauwels
4860250360
added some extra MySQL checks 2020-03-30 11:24:58 +02:00
Samuel Berthe
d9286f6c39
doc: add instructions to rules yaml file 2020-03-28 15:12:21 +01:00
Samuel Berthe
2cda73aa3a
fix(kubernetes): min_over_time takes a time range as parameter 2020-03-26 16:19:26 +01:00
Samuel Berthe
329583ac36
Fix typo and make pg and mysql similar 2020-03-25 16:44:49 +01:00
luhellma
5559e0140b fix: double usage in query and alert configuration 2020-03-25 16:34:04 +01:00
luhellma
5d8f911d97 feat: Add new rules for MySQLd_exporter from prometheus 2020-03-25 11:57:29 +01:00
luhellma
a4fc086b9a fix wrong number of equal sign in query 2020-03-20 15:22:20 +01:00
luhellma
3d41e2b3ca Add rules for apache 2020-03-20 15:08:13 +01:00
Alexander Knipping
caaea2eeb7 Fix typo in DeadManSwitch alert
Rename it from snitch into switch.
2020-03-18 15:21:38 +01:00
Samuel Berthe
34e62cb327
nginx: adding latency metric 2020-03-17 22:26:46 +01:00
Samuel Berthe
07dde61116
elasticsearch: adding disk watermark alerts 2020-03-17 21:19:58 +01:00
Samuel Berthe
2ecdb636b2
oops 2020-03-17 21:08:09 +01:00
Samuel Berthe
c653b37e15
adding rules to prometheus self monitoring 2020-03-17 20:56:49 +01:00
Samuel Berthe
fc3e72041c
Merge branch 'master' of github.com:samber/awesome-prometheus-alerts 2020-03-17 19:05:57 +01:00
Samuel Berthe
5125c683c5
adding alerts for Ceph 2020-03-17 18:50:08 +01:00
Alexander Knipping
c82df5d005 Fix PrometheusRuleEvaluationSlow
Fixes the rule PrometheusRuleEvaluationSlow as it should fire if
prometheus_rule_group_last_duration_seconds takes longer than
prometheus_rule_group_interval_seconds.

prometheus_rule_group_last_duration_seconds: The duration of the last rule group evaluation.
prometheus_rule_group_interval_seconds: The interval of a rule group.
2020-03-17 15:14:40 +01:00
Samuel Berthe
5b457b0e52
adding github buttons to layout 2020-03-09 23:31:27 +01:00
Samuel Berthe
f554b72671
Add alert for kubernetes api latency 2020-03-09 21:55:17 +01:00
Samuel Berthe
0b89a764ee
Adding exporters: sidekiq, pgbouncer and thanos.
Adding rules to: prometheus, kubernetes, redis, docker and postgresql.
Arranging exporters into categories.
Showing number of rules.
Thanks to GitLab for open-sourcing alerting rules!
2020-03-09 21:18:56 +01:00
Samuel Berthe
affacde49b
adding prometheus internal alerts 2020-03-09 00:16:17 +01:00
Samuel Berthe
99e3e64252
Insert Commit Message Here 2020-03-08 22:21:30 +01:00
Samuel Berthe
77eccab0e9
some random changes on rules 2020-03-08 20:30:22 +01:00
Samuel Berthe
542adc3ca7
Adding minio rules 2020-03-08 18:55:53 +01:00
Samuel Berthe
b5469f2a59
Doc: organizing sections 2020-03-08 17:39:49 +01:00
Samuel Berthe
5bace11107
data: ensure alert name prefix 2020-03-08 17:24:39 +01:00
Samuel Berthe
953878df03
HAProxy 1.*: adding rules 2020-03-08 17:17:06 +01:00
Samuel Berthe
7dbbbb0e09
Doc: organizing lb and reverse proxy 2020-03-08 16:10:33 +01:00
Samuel Berthe
718a039313
Adding an alert for prometheus internals: rule evaluation slowing down 2020-03-08 15:08:11 +01:00
Samuel Berthe
072a435f32
Fixing @jpds queries ;) 🚀 2020-03-08 14:41:36 +01:00
Samuel Berthe
f620fe31ee
Merge pull request #36 from jpds/prom-errors
_data/rules.yml: Added Prometheus error alerts.
2020-03-08 14:29:18 +01:00
Samuel Berthe
6ba051d747
doc: adding a comment to PostgresqlReplicationLag alert 2020-03-07 19:30:58 +01:00
Samuel Berthe
05a2c9604b
Renaming some alert categories 2020-03-07 19:06:54 +01:00
Samuel Berthe
6edcdc75af
my brain is out for vacation, please forgive me 2020-03-07 18:57:09 +01:00
Samuel Berthe
b97ece8c69
Adding alerts for criteo/cassandra_exporter 2020-03-07 18:51:34 +01:00
Samuel Berthe
cde4e243ae
no quotes no cry 2020-03-07 17:59:42 +01:00
Samuel Berthe
0add8466c6
Merge pull request #82 from samber/feat-nodeexporter-raid
Added RAID alerts (node-exporter)
2020-03-07 17:51:39 +01:00
Samuel Berthe
ab477bb21e
Added RAID alerts 2020-03-07 17:50:41 +01:00
Danilo Magalhães
5bd2e03c51
Update rules.yml
Group by instance and name instead of only instance.  
Change from container_spec_memory_limit_bytes to correct max memory metric container_spec_memory_limit_bytes.
2020-02-27 11:08:09 +00:00
Samuel Berthe
a9c9629cb5 oops 2020-01-25 00:16:49 +01:00
Samuel Berthe
134264026a
Does not alert on tmpfs volume filling-up. Closing #77 2020-01-25 00:13:01 +01:00
iamdenchik
29b66f9b3e fix check free disk space 2020-01-15 12:40:19 +05:00
Mateusz Legięcki
a72feb4ff6
Fix Etcd rule: Insufficient Members 2020-01-03 12:58:25 +01:00
Mahesh Paolini-Subramanya
88b55f1dee Replace 'ip' by 'instance' in some rules
The metrics return 'instance', not 'ip'
This PR fixes the rules to use 'instance'
2019-12-27 09:18:16 -05:00
Rob Brown
ce51db2a6f Added Prometheus Not connected to alertmanager alert 2019-12-18 15:38:23 +00:00
Rob Brown
97ecdab26c Added "Disk will fill in 4 hours" alert 2019-12-18 15:32:52 +00:00
Rob Brown
58f843dbc6 Added hardware temperature alerts 2019-12-12 17:29:23 +00:00
Josef Kříž
d10e30aed0
Fixed rabbitmq cluster down rule 2019-12-02 13:12:02 +01:00