Commit graph

343 commits

Author SHA1 Message Date
Samuel Berthe
2323541f2d
data: adding mgob query 2022-06-09 00:23:17 +02:00
Samuel Berthe
08d482f314
doc: add postgrseql bloat 2022-06-07 02:32:09 +02:00
Samuel Berthe
4662cd2812
doc: improve pulsar doc 2022-06-07 01:29:31 +02:00
Marcel Körtgen
074e3e6d04
Add pulsar rules (#286)
* Add pulsar rules

* Add webrick, cf.:
- https://github.com/github/pages-gem/issues/752

* Update gems (minitest / ruby 3 issue)

* Add repo info (workaround), cf.
- https://github.com/jekyll/jekyll/issues/4705
2022-06-07 01:21:10 +02:00
Samuel Berthe
4d26719d41
removed some rules 2022-04-19 00:07:31 +02:00
Samuel Berthe
97810b6537
change severity of PostgresqlConfigurationChanged to info 2022-04-18 23:37:17 +02:00
Samuel Berthe
8941f71c6c
chore(ci): adding test with promtool (#281) 2022-04-18 23:30:32 +02:00
Samuel Berthe
4d161ee0a5
feat(jenkins): add "jenkins outdated plugin" rule 2022-04-18 20:29:36 +02:00
Samuel
718b002826
fix / increases requires interval (#279) 2022-04-18 20:17:33 +02:00
Koen Dierckx
21ddd2f752
Added Alert manager job alert (#272)
Co-authored-by: DIERCKXK <koen.dierckx@vito.be>
2022-01-23 19:36:36 +01:00
armondressler
038e46743d
fixed erroneous usage of rate() function on gauges (#270)
Co-authored-by: Dressler Armon, B2B-PAP-HLT-DO-ENG <armon.dressler@swisscom.com>
2022-01-16 03:24:36 +01:00
MikeN. Paxos
78a7e61050
added jenkins alert rules for jenkins metrics plugin (#268)
* added jenkins alert rules

* Update rules.yml

Co-authored-by: Samuel Berthe <dev@samuel-berthe.fr>
2021-12-27 12:48:07 +01:00
Samuel Berthe
fd0f2805c0
Renaming kube_hpa_* to kube_horizontalpodautoscaler_*
Fixes #266
2021-12-07 23:16:40 +01:00
Samuel Berthe
f3ef333a3e
doc: remove comment 2021-12-07 23:14:23 +01:00
Damon Vincent
a12f5263c2
Filter parent groups from Docker container alerts (#267) 2021-12-07 23:05:27 +01:00
Samuel Berthe
2ca7f5bebe
doc: more explicit description for HostClock* rules (#265) 2021-12-02 20:54:23 +01:00
Lauri Võsandi
2be7e9684c
Add HostNetworkBondDegraded (#260) 2021-12-02 20:48:11 +01:00
John Losito
1a7690a1a3
Add rule for reboot-required (#262) 2021-12-02 20:45:33 +01:00
leemos
ee3c878b06
apiserver_request_count has been turned off (#264) 2021-12-02 20:23:56 +01:00
Torsten Bøgh Köster
4e1a26cab3
Add Solr rules (#258) 2021-11-21 18:53:32 +01:00
chaoxiaodi
7a40d7f423
Update rules.yml (#252) 2021-10-27 14:00:35 +02:00
Samuel Berthe
7857afab6e
fix(rule): fixing KubernetesOutOfCapacity (#227) 2021-10-17 17:14:44 +02:00
Samuel Berthe
a978cfb5a1
doc: more explicit "ContainerAbsent" and "ContainerKilled" rules (#247) 2021-10-10 20:13:30 +02:00
Samuel Berthe
4e0d99dd09
fix(mongodb): fix query for MongodbReplicationHeadroom rule (#250) 2021-10-10 20:12:06 +02:00
kayge
2d9e4ae431
Cleaning up typos in rules.yml (#248) 2021-10-09 01:05:15 +02:00
Andre Martins
36ca52e598
adding alerts to promtail and loki (#241)
Co-authored-by: apmbktf <andre.pasqualinoto-martins@itau-unibanco.com.br>
Co-authored-by: Samuel Berthe <dev@samuel-berthe.fr>
2021-10-03 22:12:59 +02:00
Christian Zenker
7c67f02ee6
The metric is called 'thanos_compact_halted' (#243)
According to https://github.com/thanos-io/thanos/blob/main/examples/alerts/alerts.md
2021-09-21 15:48:27 +02:00
Ondřej Nový
abfae043bb
Fix typo in description (#242) 2021-09-19 23:37:51 +02:00
Samuel Berthe
a225087b06
prevent +inf max value 2021-08-19 23:45:58 +02:00
gökhan
b9222993ac
istio pilot duplicate cluster (#220) 2021-08-19 21:23:27 +02:00
Guillaume
6fcdcff5e3
Fix bad syntax for Haproxy rules (#232)
Aggregations require parentheses around expressions
2021-08-19 21:22:39 +02:00
flf2ko
a02a7e6eab
Fix "percentil" typo in Etcd rules (#234) 2021-08-19 21:21:16 +02:00
Krasimir Nedelchev
3d69117f33
Add missing parenthesis to rule (#237) 2021-08-19 21:20:11 +02:00
Igor Churmeev
3612c9cc3e
Add alerts for Hashicorp Vault (#238)
Co-authored-by: Samuel Berthe <dev@samuel-berthe.fr>
2021-08-19 21:19:43 +02:00
Andre Martins
b47359c2fd
added alerts to cortex (#240)
* added alerts to cortex

* Update rules.yml

Co-authored-by: apmbktf <andre.pasqualinoto-martins@itau-unibanco.com.br>
Co-authored-by: Samuel Berthe <dev@samuel-berthe.fr>
2021-08-19 20:31:46 +02:00
Benjamin Dos Santos
7304d40539
fix(HostNetworkInterfaceSaturated): display network interface name in description (#239)
`$labels.interface` doesn't exist, use `$labels.device` instead
2021-08-16 16:29:12 +02:00
Gjed
c2b8178304
Loki alerts (#218)
Co-authored-by: Samuel Berthe <dev@samuel-berthe.fr>
2021-07-04 23:59:46 +02:00
asteny
243c0280cf
Haproxy 2 embedded exporter fixes (#229) 2021-07-04 23:28:58 +02:00
Alexandros Orfanos
6a6f89bad5
Add php-fpm max-children alert (#224) 2021-06-29 12:37:54 +02:00
Alberto del Barrio
0ba7c2a47e
fix typo (#228) 2021-06-27 14:16:42 +02:00
Samuel Berthe
092d0f8bda
fix(haproxy): some query were using wrong metrics name 2021-05-01 22:48:54 +02:00
Samuel Berthe
e044fddd11
feat(data): reverse traefik exporters order 2021-05-01 22:12:12 +02:00
Samuel Berthe
af30d0f06c
fix(node_exporter): better alert description for EDAC + network errors (#204) 2021-05-01 22:01:10 +02:00
Samuel Berthe
135d4b7c1a
fix(data): for KubernetesPodNotHealthy, insert a step of subquery execution time 2021-05-01 20:30:35 +02:00
Samuel Berthe
54b1e674b2
fix(data): fix pg replicatino lag query 2021-05-01 19:58:42 +02:00
Moritz
335ba16032
Fix upper/lowercase of systemd (#207)
The're quite clear on how they want it to be written:

https://unix.stackexchange.com/review/suggested-edits/372414
2021-05-01 19:44:06 +02:00
Samuel Berthe
1c44cd7818
feat(data): adding k8s rule - detect container killed by oomkiller 2021-05-01 19:33:03 +02:00
Gustavo Kazuo Motizuki
18672ff0f9
Improve KubernetesOutOfCapacity alert (#211) 2021-05-01 19:27:46 +02:00
Samuel
97c48862d7
fix(haproxy) (#213) 2021-05-01 18:58:46 +02:00
Samuel Berthe
b9f09e7f93
fix(freeswitch): move to the networking section 2021-05-01 18:53:04 +02:00
Samuel
823b8edd7e
feat(freeswitch) (#214) 2021-05-01 18:45:36 +02:00
Samuel Berthe
c3ba0cf199
data: rename coredns metric 2021-03-28 00:34:56 +01:00
Samuel Berthe
b9db2c0c68
data: fix some elasticsearch rules 2021-02-26 11:31:06 +01:00
Samuel Berthe
1d0fd50033
fix(data): quickfix on cassandra, because i merged a little bit to fast pr-196 2021-02-22 14:44:45 +01:00
ko-christ
24ae7de2f5
Fill in PrometheusRules for instaclustr/cassandra-exporter (#196) 2021-02-22 14:38:40 +01:00
Samuel Berthe
19f9316868
Merge pull request #197 from yasharne/new_minio 2021-02-22 14:09:38 +01:00
Yashar Nesabian
f166c909f1
removed old minio rules 2021-02-22 11:35:49 +03:30
Samuel Berthe
ca31cc8a71
fix(data): fix node exporter temperature alarm 2021-02-21 19:05:10 +01:00
Yashar Nesabian
def11767bf
added minio disk space usage missed condition 2021-02-16 21:33:33 +03:30
Yashar Nesabian
4c5ff1fc68
Added new minio alert rules 2021-02-16 21:06:14 +03:30
Samuel Berthe
6d7ef1cdbb
Merge branch 'master' of github.com:samber/awesome-prometheus-alerts 2021-02-07 20:47:59 +01:00
Samuel Berthe
4138f78ea2
feat(ui): adding navbar 2021-02-07 20:46:45 +01:00
Samuel Berthe
417fb2e691
Merge pull request #189 from strangeman/zookeeper-alerts 2021-02-02 14:16:30 +01:00
Anton Markelov
040cbe1ace
add suggested changes 2021-02-02 14:19:32 +02:00
Samuel Berthe
b1e2e02db9
💄 2021-02-01 15:46:53 +01:00
Anton Markelov
b619efac76
deal with proposed changes 2021-02-01 13:15:51 +02:00
Anton Markelov
647508e520
add alerts for kafka burrow exporter 2021-02-01 11:01:36 +02:00
Anton Markelov
1f7712b332
add alerts for dabealu/zookeeper-exporter 2021-02-01 10:48:43 +02:00
Bertrand Mailhe
cbc281cea7
fix rule -Container Volume usage-
Signed-off-by: Bertrand Mailhe <bmailhe@leadformance.com>
2021-01-25 17:50:22 +01:00
Samuel Berthe
0ee7f1266f
minor improvements for ssl exporter 2021-01-20 18:09:36 +01:00
Samuel Berthe
8d0826020b
Merge pull request #184 from yasharne/master
added ssl/tls exporter alert rules
2021-01-20 18:02:00 +01:00
Yashar Nesabian
916ac1af8f
added ssl/tls exporter alert rules 2021-01-20 14:51:23 +03:30
Samuel Berthe
6f76f46eff
redis: adding comment for exporter flag 2021-01-19 17:11:40 +01:00
Per Lundberg
b3674d96c5
Redis: add alternative maxmemory alert 2021-01-19 15:40:53 +02:00
Samuel Berthe
93b2f1390a
fixing #181: k8s request latency tracking 2021-01-13 11:22:21 +01:00
Samuel Berthe
d3f514b7e4
data: fixing etcd query 2021-01-11 11:01:14 +01:00
Samuel Berthe
f5e05c55d0
data: some netdata disk alerts 2021-01-08 23:48:04 +01:00
Samuel Berthe
3b9cd87f3d
data: adding nomad 2021-01-08 23:40:54 +01:00
Samuel Berthe
f7c25e648c
data: adding netdata 2021-01-08 23:26:57 +01:00
Heckel, Robert J
ce12720abc
updating per samber's comment. 2021-01-08 12:13:56 -06:00
Heckel, Robert J
b033ca9e8d
Adding some basic rules snagged from the defaults. 2021-01-07 10:40:49 -06:00
Benjamin Dos Santos
0f24d8cc9e
refactor: improve some haproxy v2 rules 2021-01-06 21:09:21 +01:00
Samuel Berthe
df602d6e47
typo haproxy error status 2021-01-06 15:38:07 +01:00
Samuel Berthe
fe00569998
Merge pull request #172 from bdossantos/chore/haproxy2
chore: add Prometheus alerts for HAProxy v2
2021-01-06 15:37:19 +01:00
Gert Vilain
de8e2f6cd9
Remove duplicate kubernetes job failed 2021-01-05 20:49:25 +01:00
Benjamin Dos Santos
1b7c36666c
chore: add Prometheus alerts for HAProxy v2
ref #87
2021-01-05 16:45:52 +01:00
Samuel Berthe
209fdf86e8
reduce p99 quantile aggregation duration 2021-01-05 12:30:32 +01:00
Samuel Berthe
5d7d99a658
Merge pull request #171 from tosin-ogunrinde/master 2021-01-03 21:45:45 +01:00
Tosin Ogunrinde
21817c3551
Improve JVM "JVM memory filling up" alert by summing up all the heap areas which include a separate entry for the Eden Space, Survivor Space and Tenured Gen. 2020-12-31 09:16:09 +00:00
Tosin Ogunrinde
ebf402aa7d
Improve JVM "JVM memory filling up" alert by summing up all the heap areas which include a separate entry for the Eden Space, Survivor Space and Tenured Gen. 2020-12-31 09:06:36 +00:00
Samuel Berthe
97345d3b6f
mysql restart alert: severity=info 2020-12-31 00:47:14 +01:00
Samuel Berthe
971bbe03ec
Add FOR clause to alerting rules (when necessary) 2020-12-31 00:27:12 +01:00
Samuel Berthe
3a352d08dc
fix k8s rule: longer alert check time 2020-12-30 19:13:02 +01:00
Samuel Berthe
d3ecfaaad3
Merge pull request #139 from xkfen/istio 2020-12-30 18:47:28 +01:00
Samuel Berthe
2f6d4921c6
fix initial istio alerts 2020-12-30 18:46:50 +01:00
Samuel Berthe
fa4325218f
Merge branch 'master' of github.com:samber/awesome-prometheus-alerts 2020-12-30 17:46:58 +01:00
Samuel Berthe
ed62bdc567
alerts node_exporter: improve network and disk rules 2020-12-30 17:45:30 +01:00
Tosin Ogunrinde
0add93363f
Fix JVM "JVM memory filling up" alert 2020-12-30 00:30:08 +00:00
Samuel Berthe
f686698f68
Merge pull request #166 from cityofships/fix_es
Fix Elasticsearch "No new documents" alert
2020-12-28 16:50:47 +01:00
Samuel Berthe
965fefab89
fix alert description 2020-12-28 16:40:11 +01:00