Commit graph

442 commits

Author SHA1 Message Date
Samuel Berthe
dc98d72206
🇺🇦 2022-02-25 13:52:19 +01:00
Samuel Berthe
6bfcdcca16
🇺🇦 2022-02-25 13:48:26 +01:00
Koen Dierckx
21ddd2f752
Added Alert manager job alert (#272)
Co-authored-by: DIERCKXK <koen.dierckx@vito.be>
2022-01-23 19:36:36 +01:00
armondressler
038e46743d
fixed erroneous usage of rate() function on gauges (#270)
Co-authored-by: Dressler Armon, B2B-PAP-HLT-DO-ENG <armon.dressler@swisscom.com>
2022-01-16 03:24:36 +01:00
Samuel Berthe
37722256d5
Adding jenkins 2021-12-27 12:49:32 +01:00
MikeN. Paxos
78a7e61050
added jenkins alert rules for jenkins metrics plugin (#268)
* added jenkins alert rules

* Update rules.yml

Co-authored-by: Samuel Berthe <dev@samuel-berthe.fr>
2021-12-27 12:48:07 +01:00
Samuel Berthe
fd0f2805c0
Renaming kube_hpa_* to kube_horizontalpodautoscaler_*
Fixes #266
2021-12-07 23:16:40 +01:00
Samuel Berthe
f3ef333a3e
doc: remove comment 2021-12-07 23:14:23 +01:00
Damon Vincent
a12f5263c2
Filter parent groups from Docker container alerts (#267) 2021-12-07 23:05:27 +01:00
Samuel Berthe
2ca7f5bebe
doc: more explicit description for HostClock* rules (#265) 2021-12-02 20:54:23 +01:00
Lauri Võsandi
2be7e9684c
Add HostNetworkBondDegraded (#260) 2021-12-02 20:48:11 +01:00
John Losito
1a7690a1a3
Add rule for reboot-required (#262) 2021-12-02 20:45:33 +01:00
leemos
ee3c878b06
apiserver_request_count has been turned off (#264) 2021-12-02 20:23:56 +01:00
Samuel Berthe
3ff969670d
Update README.md 2021-11-21 18:54:56 +01:00
Torsten Bøgh Köster
4e1a26cab3
Add Solr rules (#258) 2021-11-21 18:53:32 +01:00
chaoxiaodi
7a40d7f423
Update rules.yml (#252) 2021-10-27 14:00:35 +02:00
Samuel Berthe
7857afab6e
fix(rule): fixing KubernetesOutOfCapacity (#227) 2021-10-17 17:14:44 +02:00
Samuel Berthe
a978cfb5a1
doc: more explicit "ContainerAbsent" and "ContainerKilled" rules (#247) 2021-10-10 20:13:30 +02:00
Samuel Berthe
4e0d99dd09
fix(mongodb): fix query for MongodbReplicationHeadroom rule (#250) 2021-10-10 20:12:06 +02:00
kayge
2d9e4ae431
Cleaning up typos in rules.yml (#248) 2021-10-09 01:05:15 +02:00
Samuel Berthe
251a929db0
Adding "sleep peacefully" doc section (#246) 2021-10-03 23:58:16 +02:00
Andre Martins
36ca52e598
adding alerts to promtail and loki (#241)
Co-authored-by: apmbktf <andre.pasqualinoto-martins@itau-unibanco.com.br>
Co-authored-by: Samuel Berthe <dev@samuel-berthe.fr>
2021-10-03 22:12:59 +02:00
dependabot[bot]
dc85963ae5
build(deps): bump nokogiri from 1.11.4 to 1.12.5 (#245)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-10-03 19:09:42 +02:00
Christian Zenker
7c67f02ee6
The metric is called 'thanos_compact_halted' (#243)
According to https://github.com/thanos-io/thanos/blob/main/examples/alerts/alerts.md
2021-09-21 15:48:27 +02:00
Ondřej Nový
abfae043bb
Fix typo in description (#242) 2021-09-19 23:37:51 +02:00
Samuel Berthe
a225087b06
prevent +inf max value 2021-08-19 23:45:58 +02:00
gökhan
b9222993ac
istio pilot duplicate cluster (#220) 2021-08-19 21:23:27 +02:00
Guillaume
6fcdcff5e3
Fix bad syntax for Haproxy rules (#232)
Aggregations require parentheses around expressions
2021-08-19 21:22:39 +02:00
dependabot[bot]
27b17962ea
build(deps): bump addressable from 2.7.0 to 2.8.0 (#233)
Bumps [addressable](https://github.com/sporkmonger/addressable) from 2.7.0 to 2.8.0.
- [Release notes](https://github.com/sporkmonger/addressable/releases)
- [Changelog](https://github.com/sporkmonger/addressable/blob/main/CHANGELOG.md)
- [Commits](https://github.com/sporkmonger/addressable/compare/addressable-2.7.0...addressable-2.8.0)

---
updated-dependencies:
- dependency-name: addressable
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-08-19 21:21:30 +02:00
flf2ko
a02a7e6eab
Fix "percentil" typo in Etcd rules (#234) 2021-08-19 21:21:16 +02:00
Krasimir Nedelchev
3d69117f33
Add missing parenthesis to rule (#237) 2021-08-19 21:20:11 +02:00
Igor Churmeev
3612c9cc3e
Add alerts for Hashicorp Vault (#238)
Co-authored-by: Samuel Berthe <dev@samuel-berthe.fr>
2021-08-19 21:19:43 +02:00
Andre Martins
b47359c2fd
added alerts to cortex (#240)
* added alerts to cortex

* Update rules.yml

Co-authored-by: apmbktf <andre.pasqualinoto-martins@itau-unibanco.com.br>
Co-authored-by: Samuel Berthe <dev@samuel-berthe.fr>
2021-08-19 20:31:46 +02:00
Benjamin Dos Santos
7304d40539
fix(HostNetworkInterfaceSaturated): display network interface name in description (#239)
`$labels.interface` doesn't exist, use `$labels.device` instead
2021-08-16 16:29:12 +02:00
Gjed
c2b8178304
Loki alerts (#218)
Co-authored-by: Samuel Berthe <dev@samuel-berthe.fr>
2021-07-04 23:59:46 +02:00
asteny
243c0280cf
Haproxy 2 embedded exporter fixes (#229) 2021-07-04 23:28:58 +02:00
dependabot[bot]
650914d4ad
build(deps): bump nokogiri from 1.11.1 to 1.11.4 (#221)
Bumps [nokogiri](https://github.com/sparklemotion/nokogiri) from 1.11.1 to 1.11.4.
- [Release notes](https://github.com/sparklemotion/nokogiri/releases)
- [Changelog](https://github.com/sparklemotion/nokogiri/blob/main/CHANGELOG.md)
- [Commits](https://github.com/sparklemotion/nokogiri/compare/v1.11.1...v1.11.4)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-06-29 12:38:05 +02:00
Alexandros Orfanos
6a6f89bad5
Add php-fpm max-children alert (#224) 2021-06-29 12:37:54 +02:00
piano
d123523164
Update alertmanager.md (#231)
webhook_config should be webhook_configs
2021-06-29 12:21:35 +02:00
Alberto del Barrio
0ba7c2a47e
fix typo (#228) 2021-06-27 14:16:42 +02:00
Samuel Berthe
092d0f8bda
fix(haproxy): some queries were using wrong metric names 2021-05-01 22:48:54 +02:00
Samuel Berthe
2c62d2cd6e
feat(template): better addressing of section (adding support for exporter-level linking and clipboarding) 2021-05-01 22:30:07 +02:00
Samuel Berthe
e044fddd11
feat(data): reverse traefik exporters order 2021-05-01 22:12:12 +02:00
Samuel Berthe
f2f012a2fb
fix(template): replace ":" by "=" into rule template 2021-05-01 22:05:12 +02:00
Paul Haerle
e090fd1569
fix "copy" button by quoting description fields... (#182)
...in yaml output and escape quotes inside.

Without this change, the YAML outputted isn't valid due to ":"
characters in the description which end up throwing errors like

/etc/prometheus/rules/prometheus.rules: yaml: line 88: mapping values
are not allowed in this context.
2021-05-01 22:04:09 +02:00
Samuel Berthe
af30d0f06c
fix(node_exporter): better alert description for EDAC + network errors (#204) 2021-05-01 22:01:10 +02:00
Samuel Berthe
bbba46c41b
fix(clipboard): copy of multiple rules was broken 2021-05-01 21:20:41 +02:00
Samuel Berthe
135d4b7c1a
fix(data): for KubernetesPodNotHealthy, insert a step of subquery execution time 2021-05-01 20:30:35 +02:00
Samuel Berthe
54b1e674b2
fix(data): fix pg replication lag query 2021-05-01 19:58:42 +02:00
Moritz
335ba16032
Fix upper/lowercase of systemd (#207)
They're quite clear on how they want it to be written:

https://unix.stackexchange.com/review/suggested-edits/372414
2021-05-01 19:44:06 +02:00