Commit graph

261 commits

Author SHA1 Message Date
Samuel Berthe
2e6e46da45
Merge branch 'master' into master 2020-10-11 17:42:51 +02:00
Samuel Berthe
c469d26c4d
Merge pull request #137 from Ozarklake/sql_server_rules 2020-10-11 17:37:40 +02:00
Samuel Berthe
bafcd1e922
Update rules.yml 2020-10-11 17:35:46 +02:00
Samuel Berthe
e60fc805f6
Merge pull request #138 from nirav-chotai/nchotai/fix-hpa-alerts
[PLEASE_MERGE] Fix HPA alerts
2020-10-11 17:24:13 +02:00
Samuel Berthe
6defbb18ca
Merge pull request #146 from sys-ops/master 2020-10-11 17:12:22 +02:00
Samuel Berthe
45103f0a0d
Merge branch 'master' into master 2020-10-11 17:10:20 +02:00
Samuel Berthe
7a609adf18
adding comment to container OOM killer warning 2020-10-11 16:11:44 +02:00
Samuel Berthe
7cf383234a
Merge pull request #149 from samber/fix-container-memory-usage-filter-limited
Fix container memory limit: filter containers having memory limit
2020-10-11 16:10:08 +02:00
Samuel Berthe
cf70272309
fix(container memory limit): filter by containers having max memory setting 2020-10-11 16:08:54 +02:00
Samuel Berthe
4128004475
Merge pull request #119 from fernandocarletti/patch-1
fix: container ContainerMemoryUsage alert
2020-10-11 16:06:33 +02:00
Samuel Berthe
f67162bf57
Merge pull request #148 from fsschmitt/fix/disk-latency-unit
Fix time unit on disk read/write latency rule
2020-10-11 15:49:15 +02:00
Samuel Berthe
ccc485f86d
Merge pull request #147 from fsschmitt/master
Fix node_md_disks state from fail to failed
2020-10-11 15:47:19 +02:00
fsschmitt
4266b4d326 Fix time unit on disk read/write latency rule 2020-10-06 14:36:22 +01:00
fsschmitt
5288c9a2f5 Fix node_md_disks state from fail to failed 2020-10-06 13:33:50 +01:00
Daniel Andrzejewski
fc4797db9e small fix 2020-09-17 15:19:14 +02:00
Daniel Andrzejewski
6c5f708179 node_disk_write_time_seconds_total is in seconds, not in milliseconds. node_disk_write_time_seconds_total should be greater than 0, otherwise you get +Inf result. 2020-09-17 15:13:42 +02:00
Nirav Chotai
8fb5da83de
Fix HPA alerts
- Fixing KubernetesHpaMetricAvailability
- Fixing KubernetesHpaScalingAbility
2020-07-24 13:32:44 +08:00
Ozarklake
88e812c78e add sql server rules 2020-07-17 15:02:41 +08:00
Ozarklake
4e66d17d01 add sql server rules 2020-07-17 14:58:26 +08:00
Ozarklake
e009c5d8b5
Optimizing mysql slow query alert rules 2020-07-14 12:55:17 +08:00
Fernando Carletti
e6de413146
fix: container ContainerMemoryUsage alert 2020-05-18 17:38:05 -05:00
Samuel Berthe
4cd3ff1d4a
Merge pull request #117 from samber/replace-severity-error-critical 2020-05-14 17:22:37 +02:00
Samuel Berthe
da1e4f6301
💄 replacing "error" severity by "critical", repo wide 2020-05-14 17:20:19 +02:00
Samuel Berthe
7293bca720
Merge pull request #107 from robert-will-brown/NetworkTransmitErrors 2020-05-09 21:32:40 +02:00
Samuel Berthe
b081f28f5d
Merge pull request #112 from robert-will-brown/SpeedTestExporter 2020-05-09 21:31:33 +02:00
Samuel Berthe
660312d0ea
fix OOM killer threshold 2020-05-09 21:25:13 +02:00
Samuel Berthe
6d6b41e241
Merge pull request #108 from robert-will-brown/EdacMemoryErrors 2020-05-09 21:23:01 +02:00
Rob Brown
8faa295745 Add SpeedTest stanza 2020-05-09 10:20:55 +01:00
Rob Brown
ee4e046c66 Add "> 0" at the end of NetworkTransmitErrors queries 2020-05-09 10:18:21 +01:00
Samuel Berthe
d5f6388899
renaming some mysql alerts 2020-05-09 02:11:18 +02:00
Rob Brown
5d83e393cc Add initial Speedtest Exporter rules 2020-05-08 15:25:54 +01:00
Rob Brown
8912db93bc Fix "greater than" value 2020-05-04 19:04:52 +01:00
Rob Brown
4b22c078ea Align EDAC errors with comments 2020-05-04 18:47:20 +01:00
Samuel Berthe
718cd2188c
shame on me 2020-05-04 00:10:43 +02:00
Samuel Berthe
eb8dc736a3
improve accuracy for context switching query 2020-05-04 00:05:33 +02:00
Samuel Berthe
790139211e
fix typo: postgresql replication lag 2020-05-03 23:23:21 +02:00
Samuel Berthe
773b3456d2
renaming sms to pager 2020-05-03 21:40:45 +02:00
Samuel Berthe
648b83250a
improve accuracy "Kubernetes Pod not healthy" query 2020-05-03 18:01:25 +02:00
Samuel Berthe
81349d939f
Merge pull request #111 from zales/master 2020-05-03 18:00:41 +02:00
Ondrej Zalesky
d3d13946e6 fix "Kubernetes Pod not healthy" query 2020-04-30 22:53:25 +02:00
Rob Brown
981e82d649 Add HostEDACUncorrectableErrorsdetected and HostEDACCorrectableErrorsdetected rules 2020-04-30 13:27:30 +01:00
Rob Brown
f87e6d300d Added spacing as per standard 2020-04-30 12:39:12 +01:00
Rob Brown
c57a5e6e36 Add HostNetworkReceiveErrors and HostNetworkTransmitErrors rules 2020-04-30 12:38:23 +01:00
Samuel Berthe
951d80121f
Merge branch 'master' of github.com:samber/awesome-prometheus-alerts 2020-04-06 09:13:29 +02:00
Samuel Berthe
e97023d2a4
linkerd2: adding first rule 2020-04-06 09:01:51 +02:00
Samuel Berthe
a8a8950c01
Merge pull request #101 from Seljuke/master
FIX KubernetesPodnothealthy Alert
2020-04-02 21:22:20 +02:00
Selçuk Arıbalı
c98a04784e
FIX KubernetesPodnothealthy Alert
kube-state-metrics assigns the value 1 to a pod's current phase, so the "Kubernetes Pod not healthy" query was fixed accordingly.
2020-04-02 21:01:04 +03:00
Samuel Berthe
c20227b458
oops: adding one-to-one vector matching to mysql subqueries 2020-03-31 16:02:28 +02:00
Samuel Berthe
7f05b0cbc4
Merge pull request #98 from mcrauwel/extra-mysql-checks
added some extra MySQL checks
2020-03-31 16:00:13 +02:00
Matthias Crauwels
79b5ad3b5d
removed avg grouping where possible 2020-03-31 11:42:05 +02:00