Rob Brown
4b22c078ea
Align EDAC errors with comments
2020-05-04 18:47:20 +01:00
Rob Brown
981e82d649
Add HostEDACUncorrectableErrorsdetected and HostEDACCorrectableErrorsdetected rules
2020-04-30 13:27:30 +01:00
Samuel Berthe
951d80121f
Merge branch 'master' of github.com:samber/awesome-prometheus-alerts
2020-04-06 09:13:29 +02:00
Samuel Berthe
e97023d2a4
linkerd2: adding first rule
2020-04-06 09:01:51 +02:00
Selçuk Arıbalı
c98a04784e
FIX KubernetesPodnothealthy Alert
kube-state-metrics sets the metric for a pod's current phase to 1, so the Kubernetes Pod not healthy alert was fixed accordingly.
2020-04-02 21:01:04 +03:00
Samuel Berthe
c20227b458
oops: adding one-to-one vector matching to mysql subqueries
2020-03-31 16:02:28 +02:00
Matthias Crauwels
79b5ad3b5d
removed avg grouping where possible
2020-03-31 11:42:05 +02:00
Matthias Crauwels
4860250360
added some extra MySQL checks
2020-03-30 11:24:58 +02:00
Samuel Berthe
d9286f6c39
doc: add instructions to rules yaml file
2020-03-28 15:12:21 +01:00
Samuel Berthe
2cda73aa3a
fix(kubernetes): min_over_time takes a time range as parameter
2020-03-26 16:19:26 +01:00
Samuel Berthe
329583ac36
Fix typo and make pg and mysql similar
2020-03-25 16:44:49 +01:00
luhellma
5559e0140b
fix: double usage in query and alert configuration
2020-03-25 16:34:04 +01:00
luhellma
5d8f911d97
feat: Add new rules for MySQLd_exporter from prometheus
2020-03-25 11:57:29 +01:00
luhellma
a4fc086b9a
fix wrong number of equal signs in query
2020-03-20 15:22:20 +01:00
luhellma
3d41e2b3ca
Add rules for apache
2020-03-20 15:08:13 +01:00
Alexander Knipping
caaea2eeb7
Fix typo in DeadManSwitch alert
Rename it from snitch into switch.
2020-03-18 15:21:38 +01:00
Samuel Berthe
34e62cb327
nginx: adding latency metric
2020-03-17 22:26:46 +01:00
Samuel Berthe
07dde61116
elasticsearch: adding disk watermark alerts
2020-03-17 21:19:58 +01:00
Samuel Berthe
2ecdb636b2
oops
2020-03-17 21:08:09 +01:00
Samuel Berthe
c653b37e15
adding rules to prometheus self monitoring
2020-03-17 20:56:49 +01:00
Samuel Berthe
fc3e72041c
Merge branch 'master' of github.com:samber/awesome-prometheus-alerts
2020-03-17 19:05:57 +01:00
Samuel Berthe
5125c683c5
adding alerts for Ceph
2020-03-17 18:50:08 +01:00
Alexander Knipping
c82df5d005
Fix PrometheusRuleEvaluationSlow
Fixes the rule PrometheusRuleEvaluationSlow as it should fire if
prometheus_rule_group_last_duration_seconds takes longer than
prometheus_rule_group_interval_seconds.
prometheus_rule_group_last_duration_seconds: The duration of the last rule group evaluation.
prometheus_rule_group_interval_seconds: The interval of a rule group.
2020-03-17 15:14:40 +01:00
Samuel Berthe
5b457b0e52
adding github buttons to layout
2020-03-09 23:31:27 +01:00
Samuel Berthe
f554b72671
Add alert for kubernetes api latency
2020-03-09 21:55:17 +01:00
Samuel Berthe
0b89a764ee
Adding exporters: sidekiq, pgbouncer and thanos.
Adding rules to: prometheus, kubernetes, redis, docker and postgresql.
Arranging exporters into categories.
Showing number of rules.
Thanks to Gitlab for opensourcing alerting rules!
2020-03-09 21:18:56 +01:00
Samuel Berthe
affacde49b
adding prometheus internal alerts
2020-03-09 00:16:17 +01:00
Samuel Berthe
99e3e64252
Insert Commit Message Here
2020-03-08 22:21:30 +01:00
Samuel Berthe
77eccab0e9
some random changes on rules
2020-03-08 20:30:22 +01:00
Samuel Berthe
542adc3ca7
Adding minio rules
2020-03-08 18:55:53 +01:00
Samuel Berthe
b5469f2a59
Doc: organizing sections
2020-03-08 17:39:49 +01:00
Samuel Berthe
5bace11107
data: ensure alert name prefix
2020-03-08 17:24:39 +01:00
Samuel Berthe
953878df03
HAProxy 1.*: adding rules
2020-03-08 17:17:06 +01:00
Samuel Berthe
7dbbbb0e09
Doc: organizing lb and reverse proxy
2020-03-08 16:10:33 +01:00
Samuel Berthe
718a039313
Adding an alert for prometheus internals: rule evaluation slowing down
2020-03-08 15:08:11 +01:00
Samuel Berthe
072a435f32
Fixing @jpds queries ;) 🚀
2020-03-08 14:41:36 +01:00
Samuel Berthe
f620fe31ee
Merge pull request #36 from jpds/prom-errors
_data/rules.yml: Added Prometheus error alerts.
2020-03-08 14:29:18 +01:00
Samuel Berthe
6ba051d747
doc: adding a comment to PostgresqlReplicationLag alert
2020-03-07 19:30:58 +01:00
Samuel Berthe
05a2c9604b
Renaming some alert categories
2020-03-07 19:06:54 +01:00
Samuel Berthe
6edcdc75af
my brain is out for vacation, please forgive me
2020-03-07 18:57:09 +01:00
Samuel Berthe
b97ece8c69
Adding alerts for criteo/cassandra_exporter
2020-03-07 18:51:34 +01:00
Samuel Berthe
cde4e243ae
no quotes no cry
2020-03-07 17:59:42 +01:00
Samuel Berthe
0add8466c6
Merge pull request #82 from samber/feat-nodeexporter-raid
Added RAID alerts (node-exporter)
2020-03-07 17:51:39 +01:00
Samuel Berthe
ab477bb21e
Added RAID alerts
2020-03-07 17:50:41 +01:00
Danilo Magalhães
5bd2e03c51
Update rules.yml
Group by instance and name instead of only instance.
Change from container_spec_memory_limit_bytes to correct max memory metric container_spec_memory_limit_bytes.
2020-02-27 11:08:09 +00:00
Samuel Berthe
a9c9629cb5
oops
2020-01-25 00:16:49 +01:00
Samuel Berthe
134264026a
Does not alert on tmpfs volume filling-up. Closing #77
2020-01-25 00:13:01 +01:00
iamdenchik
29b66f9b3e
fix check free disk space
2020-01-15 12:40:19 +05:00
Mateusz Legięcki
a72feb4ff6
Fix Etcd rule: Insufficient Members
2020-01-03 12:58:25 +01:00
Mahesh Paolini-Subramanya
88b55f1dee
Replace 'ip' by 'instance' in some rules
The metrics return 'instance', not 'ip'.
This PR fixes the rules to use 'instance'.
2019-12-27 09:18:16 -05:00