Samuel Berthe
45103f0a0d
Merge branch 'master' into master
2020-10-11 17:10:20 +02:00
Samuel Berthe
7a609adf18
adding comment to container OOM killer warning
2020-10-11 16:11:44 +02:00
Samuel Berthe
cf70272309
fix(container memory limit): filter by containers having max memory setting
2020-10-11 16:08:54 +02:00
Samuel Berthe
4128004475
Merge pull request #119 from fernandocarletti/patch-1
fix: container ContainerMemoryUsage alert
2020-10-11 16:06:33 +02:00
Samuel Berthe
f67162bf57
Merge pull request #148 from fsschmitt/fix/disk-latency-unit
Fix time unit on disk read/write latency rule
2020-10-11 15:49:15 +02:00
fsschmitt
4266b4d326
Fix time unit on disk read/write latency rule
2020-10-06 14:36:22 +01:00
fsschmitt
5288c9a2f5
Fix node_md_disks state from fail to failed
2020-10-06 13:33:50 +01:00
Daniel Andrzejewski
fc4797db9e
small fix
2020-09-17 15:19:14 +02:00
Daniel Andrzejewski
6c5f708179
node_disk_write_time_seconds_total is in seconds, not in milliseconds. node_disk_write_time_seconds_total should be greater than 0, otherwise you get a +Inf result.
2020-09-17 15:13:42 +02:00
Fernando Carletti
e6de413146
fix: container ContainerMemoryUsage alert
2020-05-18 17:38:05 -05:00
Samuel Berthe
da1e4f6301
💄 replacing "error" severity by "critical", repo wide
2020-05-14 17:20:19 +02:00
Samuel Berthe
7293bca720
Merge pull request #107 from robert-will-brown/NetworkTransmitErrors
2020-05-09 21:32:40 +02:00
Samuel Berthe
b081f28f5d
Merge pull request #112 from robert-will-brown/SpeedTestExporter
2020-05-09 21:31:33 +02:00
Samuel Berthe
660312d0ea
fix OOM killer threshold
2020-05-09 21:25:13 +02:00
Samuel Berthe
6d6b41e241
Merge pull request #108 from robert-will-brown/EdacMemoryErrors
2020-05-09 21:23:01 +02:00
Rob Brown
8faa295745
Add SpeedTest stanza
2020-05-09 10:20:55 +01:00
Rob Brown
ee4e046c66
Add "> 0" at the end of NetworkTransmitErrors queries
2020-05-09 10:18:21 +01:00
Samuel Berthe
d5f6388899
renaming some mysql alerts
2020-05-09 02:11:18 +02:00
Rob Brown
5d83e393cc
Add initial Speedtest Exporter rules
2020-05-08 15:25:54 +01:00
Rob Brown
8912db93bc
Fix "greater than" value
2020-05-04 19:04:52 +01:00
Rob Brown
4b22c078ea
Align EDAC errors with comments
2020-05-04 18:47:20 +01:00
Samuel Berthe
718cd2188c
shame on me
2020-05-04 00:10:43 +02:00
Samuel Berthe
eb8dc736a3
improve accuracy for context switching query
2020-05-04 00:05:33 +02:00
Samuel Berthe
790139211e
fix typo: postgresql replication lag
2020-05-03 23:23:21 +02:00
Samuel Berthe
648b83250a
improve accuracy of "Kubernetes Pod not healthy" query
2020-05-03 18:01:25 +02:00
Ondrej Zalesky
d3d13946e6
fix "Kubernetes Pod not healthy" query
2020-04-30 22:53:25 +02:00
Rob Brown
981e82d649
Add HostEDACUncorrectableErrorsdetected and HostEDACCorrectableErrorsdetected rules
2020-04-30 13:27:30 +01:00
Rob Brown
f87e6d300d
Added spacing as per standard
2020-04-30 12:39:12 +01:00
Rob Brown
c57a5e6e36
Add HostNetworkReceiveErrors and HostNetworkTransmitErrors rules
2020-04-30 12:38:23 +01:00
Samuel Berthe
951d80121f
Merge branch 'master' of github.com:samber/awesome-prometheus-alerts
2020-04-06 09:13:29 +02:00
Samuel Berthe
e97023d2a4
linkerd2: adding first rule
2020-04-06 09:01:51 +02:00
Selçuk Arıbalı
c98a04784e
FIX KubernetesPodnothealthy Alert
kube-state-metrics assigns the value 1 to a pod's current phase, so the "Kubernetes Pod not healthy" alert is fixed accordingly.
2020-04-02 21:01:04 +03:00
Samuel Berthe
c20227b458
oops: adding one-to-one vector matching to mysql subqueries
2020-03-31 16:02:28 +02:00
Matthias Crauwels
79b5ad3b5d
removed avg grouping where possible
2020-03-31 11:42:05 +02:00
Matthias Crauwels
4860250360
added some extra MySQL checks
2020-03-30 11:24:58 +02:00
Samuel Berthe
d9286f6c39
doc: add instructions to rules yaml file
2020-03-28 15:12:21 +01:00
Samuel Berthe
2cda73aa3a
fix(kubernetes): min_over_time takes a time range as parameter
2020-03-26 16:19:26 +01:00
Samuel Berthe
329583ac36
Fix typo and make pg and mysql similar
2020-03-25 16:44:49 +01:00
luhellma
5559e0140b
fix: double usage in query and alert configuration
2020-03-25 16:34:04 +01:00
luhellma
5d8f911d97
feat: Add new rules for MySQLd_exporter from prometheus
2020-03-25 11:57:29 +01:00
luhellma
a4fc086b9a
fix wrong number of equal sign in query
2020-03-20 15:22:20 +01:00
luhellma
3d41e2b3ca
Add rules for apache
2020-03-20 15:08:13 +01:00
Alexander Knipping
caaea2eeb7
Fix typo in DeadManSwitch alert
Rename it from snitch to switch.
2020-03-18 15:21:38 +01:00
Samuel Berthe
34e62cb327
nginx: adding latency metric
2020-03-17 22:26:46 +01:00
Samuel Berthe
07dde61116
elasticsearch: adding disk watermark alerts
2020-03-17 21:19:58 +01:00
Samuel Berthe
2ecdb636b2
oops
2020-03-17 21:08:09 +01:00
Samuel Berthe
c653b37e15
adding rules to prometheus self monitoring
2020-03-17 20:56:49 +01:00
Samuel Berthe
fc3e72041c
Merge branch 'master' of github.com:samber/awesome-prometheus-alerts
2020-03-17 19:05:57 +01:00
Samuel Berthe
5125c683c5
adding alerts for Ceph
2020-03-17 18:50:08 +01:00
Alexander Knipping
c82df5d005
Fix PrometheusRuleEvaluationSlow
Fixes the rule PrometheusRuleEvaluationSlow so that it fires when
prometheus_rule_group_last_duration_seconds exceeds
prometheus_rule_group_interval_seconds.
prometheus_rule_group_last_duration_seconds: The duration of the last rule group evaluation.
prometheus_rule_group_interval_seconds: The interval of a rule group.
2020-03-17 15:14:40 +01:00