Rob Brown
58f843dbc6
Added hardware temperature alerts
2019-12-12 17:29:23 +00:00
Josef Kříž
d10e30aed0
Fixed rabbitmq cluster down rule
2019-12-02 13:12:02 +01:00
Maxime Brunet
1e2a35e058
elasticsearch: Alert for no new docs on data nodes only
We can have nodes that are not masters but do not hold any data, for example the client/coordinating nodes set up by the `stable/elasticsearch` helm chart:
https://github.com/helm/charts/tree/master/stable/elasticsearch#client-and-coordinating-nodes
And we can also have nodes being data and master nodes simultaneously.
So I think this alert has to look for `es_data_node="true"` to be correct.
2019-11-06 15:23:26 -05:00
Samuel Berthe
9306d8947f
PG: Alert in case of high rollback ratio (#64)
PG: Alert in case of high rollback ratio
2019-10-31 12:02:03 +01:00
Samuel Berthe
0c9a24a4e7
feat(pg): alert in case of high rollback ratio
2019-10-31 12:00:53 +01:00
Samuel Berthe
cca2872ade
typo
2019-10-31 11:47:57 +01:00
Samuel Berthe
768fac56ae
Merge pull request #62 from jdorel/patch-1
SllCertificateExpired syntax
2019-10-29 12:15:15 +01:00
Samuel Berthe
20744c3d3d
Update rules.yml
2019-10-29 12:12:43 +01:00
Jonas DOREL
80aebe84e9
Add Kubernetes alerts from kube-state-metric exporter
2019-10-29 11:59:14 +01:00
Jonas DOREL
267a064d26
SllCertificateExpired syntax
Match other alert names, without the `has` part.
2019-10-29 11:39:01 +01:00
Samuel Berthe
82cf3ac1ef
adding cassandra
2019-10-26 17:48:22 +02:00
Samuel Berthe
4f9e88bad4
improving blackbox alerts
2019-10-26 17:43:18 +02:00
Samuel Berthe
dfa5446cd5
adding comments in data structure
2019-10-26 17:25:35 +02:00
Samuel Berthe
8f6c85774a
Clean data file
2019-09-25 16:36:10 +02:00
olivier beyler
e3628c5ba8
Add OpenEBS and Minio alert
Signed-off-by: olivier beyler <olivier.beyler@orange.com>
2019-09-25 16:13:44 +02:00
Samuel Berthe
1f4a1f8052
Updating Traefik -> Traefik v1.*
2019-09-25 14:23:16 +02:00
Andrey Dudin
6d9866cefb
Fix typo in query of PG DeadLocks
2019-09-25 02:42:44 +03:00
Samuel Berthe
f7f94ed81e
Fixed time interval (10min->10m)
2019-09-13 18:08:04 +02:00
timfeirg
37ef9a6f5c
free memory should include node_memory_Slab_bytes
2019-09-03 15:47:17 +08:00
Samuel Berthe
51e7231b3d
fix(blackbox exporter): alert when http >= 400 instead of 300
2019-08-29 19:03:54 +02:00
Jonas Kongslund
9bd8b3698f
Add CollectorError alert for WMI exporter
2019-08-22 13:52:15 +04:00
louis
e9f247783b
add alerts for traefik
2019-08-08 14:32:47 +02:00
Jonas Kongslund
d789cc314c
Add ProbeFailed alert for the Blackbox exporter
2019-07-25 13:01:47 +04:00
Dam Viet
e2c731229b
fix rule Container Volume usage
2019-07-17 16:59:56 +07:00
Dam Viet
6d6d6ac6a7
update
2019-07-15 15:13:23 +07:00
Dam Viet
db26f248f8
fix rule Container Volume usage
2019-07-15 14:56:52 +07:00
Dam Viet
4b7ecc82e2
suggest fix Container Memory usage
2019-07-15 14:54:13 +07:00
Samuel Berthe
a9019cb063
🤘 🎸
2019-07-14 20:00:55 +02:00
Samuel Berthe
3cdc7d625a
_data/rules.yml: Added CoreDNS panic alert. (#35)
_data/rules.yml: Added CoreDNS panic alert.
2019-07-14 18:06:21 +02:00
Samuel Berthe
089ab714c0
Update rules.yml
2019-07-14 18:06:08 +02:00
Samuel Berthe
e189294c94
_data/rules.yml: Added Kubernetes volume alert rule. (#32)
_data/rules.yml: Added Kubernetes volume alert rule.
2019-07-14 17:59:49 +02:00
Samuel Berthe
78dc1ba144
Update rules.yml
2019-07-14 17:59:39 +02:00
Samuel Berthe
3d6e520ac1
fix(node-exporter): better cpu load query
2019-07-14 17:51:21 +02:00
Samuel Berthe
ca22d8d3d9
Fixed windows disk usage computation
2019-07-14 17:31:52 +02:00
anon
70211339af
more alerts and removed IIS Process from wmi_service_status
2019-07-14 08:46:00 +02:00
anon
f033e06045
Name feedback from samber
2019-07-12 08:57:10 +02:00
anon
3b6235ccb3
add wmi_exporter example
2019-07-09 11:56:41 +02:00
Jonathan Davies
ddc19224be
_data/rules.yml: Added AlertManager config reload rule.
2019-06-25 16:06:55 +01:00
Jonathan Davies
2574946609
_data/rules.yml: Use humanize instead of % printf.
2019-06-25 15:54:47 +01:00
Jonathan Davies
c7ca57f57f
_data/rules.yml: Added volume full in four days alert rule.
2019-06-25 14:45:17 +01:00
Jonathan Davies
37109f8ccd
_data/rules.yml: Added CoreDNS panic alert.
2019-06-24 22:25:40 +01:00
Jonathan Davies
3ccf6ae3d0
_data/rules.yml: Added Kubernetes volume alert rule.
2019-06-24 16:09:02 +01:00
Jonathan Davies
49d93c6f4f
_data/rules.yml: Added Prometheus configuration reload alert rule.
2019-06-24 14:31:09 +01:00
anon
bb5dba262f
correct wrong AND to OR
2019-06-17 14:25:43 +02:00
Jonas DOREL
e685a7ddef
Add systemd failed services alerts
2019-06-06 15:44:56 +02:00
Samuel Berthe
ab6612b94f
Improves Juniper rules
2019-05-21 11:59:08 +02:00
Samuel Berthe
e17edc9e99
Merge branch 'master' of github.com:samber/awesome-prometheus-alerts
2019-05-21 11:52:40 +02:00
AngelFreak
51d0357e15
Changed from 09 to 10 for 10GBit, and fix severity duplicate
2019-05-20 09:37:01 +02:00
AngelFreak
0a2a4e2aaf
Remove redundant example, and changed notation for easier reading
2019-05-16 10:32:18 +02:00
AngelFreak
5e40343cbc
Add Juniper rules
2019-05-15 15:20:48 +02:00