Samuel Berthe
1eb5c5834f
Update rules.yml
2024-03-11 23:28:06 +01:00
Samuel Berthe
90706282ad
Update rules.yml
2024-03-11 22:55:05 +01:00
Samuel Berthe
05c4716c2b
Fix KubernetesAPIserverlatency
2024-02-12 09:41:03 +01:00
Samuel Berthe
f5f6b338a3
fix: high/low cpu alert
2024-02-10 23:24:10 +01:00
Samuel Berthe
937cd35df7
💄
2024-02-10 20:04:17 +01:00
Samuel Berthe
5f57f09db0
fix(HostOutOfInodes): exclude msdosfs FS
See #398
2024-02-10 20:01:19 +01:00
Marek Červenka
4eb0e910e7
SMART monitoring (#402)
* SMART monitoring
* query regex fix
---------
Co-authored-by: Marek Cervenka <cervenka@ipex.cz>
2024-02-09 20:23:30 +01:00
Samuel Berthe
0727f2ef2e
Update rules.yml
2024-01-26 04:10:22 +01:00
josedev-union
c6ff5a59dc
feat: Add rules for Graph Node (#387)
Co-authored-by: josedev-union <josedev-union@users.noreply.github.com>
2024-01-20 20:33:26 +01:00
michaelact
7fa11bf6cc
Add simple and meaningful kube-state-metrics alert summary (#394)
* feat: add 'summary' to be overridden from rules.yml
* chore: add simple and meaningful summary for kubernetes alerts
2023-12-01 18:25:11 +01:00
Samuel Berthe
a4de5323ad
Update rules.yml
2023-11-26 02:18:16 +01:00
Samuel Berthe
76de11d71b
Update rules.yml
2023-10-24 15:03:51 +02:00
Pierre Riteau
cbf7046afa
Fix capitalisation of RabbitMQ (#392)
2023-10-13 17:09:10 +02:00
Vicky Wilson Jacob
7a8f883df6
feat: adding hadoop jmx exporter (#391)
* adding hadoop exporter
* added hadoop rules with jmx exporter
* added hadoop rules with jmx exporter
* Update rules.yml
---------
Co-authored-by: Samuel Berthe <dev@samuel-berthe.fr>
2023-10-06 18:48:54 +02:00
Samuel Berthe
bacb433089
Update rules.yml
2023-09-18 20:14:57 +02:00
Samuel Berthe
053cde27e4
Update rules.yml
2023-08-22 15:51:53 +02:00
Pavel Timofeev
6b1685261d
Rework kube-state-metrics alerts (#381)
* Rework kube-state-metrics alerts:
- provide meaningful labels in summary, as the 'instance' label hardly makes sense in most of them
- rename some alerts to state more accurately what the problem is
- adjust descriptions to follow the message schema found in other alerts
* move changes to _data/rules.yml
* Update rules.yml
---------
Co-authored-by: Samuel Berthe <dev@samuel-berthe.fr>
2023-08-20 00:39:22 +02:00
Samuel Berthe
c3d78786e8
fix ci
2023-08-15 20:27:13 +02:00
Roman Pertl
ecd92399d5
feat: adding patroni alert rules (#369)
2023-08-15 19:54:15 +02:00
fzyzcjy
13e90b3aea
Update rules.yml (#371)
2023-08-15 19:42:46 +02:00
Ted Hahn
94b9f3cfbb
Fix for Postgres max connections. Postgres does not limit connections per database, but in total across the server. Additionally, alert labels didn't match across the pair. Using a `min by` on the right side deals with the possibility that additional labels are present on your exporter. (#376)
2023-08-15 19:39:41 +02:00
Samuel Berthe
15e3131547
Update rules.yml
2023-08-15 19:36:22 +02:00
Samuel Berthe
eb3220c8d7
Update rules.yml
2023-08-15 19:34:14 +02:00
Ivan Dudin
86e3e38a99
fix typo (#377)
2023-08-07 19:43:10 +02:00
Samuel Berthe
ff76ceccde
Update rules.yml
2023-07-30 22:24:31 +02:00
Moritz
fe5f78171a
update rules.yml (#374)
2023-07-30 22:21:20 +02:00
Samuel Berthe
8c811045e5
Update rules.yml
2023-07-29 18:20:58 +02:00
Samuel Berthe
32cf16a53d
Update rules.yml
2023-07-12 14:32:43 +02:00
Samuel Berthe
1bb6c602f7
Update rules.yml
2023-07-06 13:54:31 +02:00
Samuel Berthe
5d254811b4
Update rules.yml
2023-06-27 00:28:31 +02:00
Samuel Berthe
47b7748618
Update rules.yml
2023-06-22 18:40:33 +02:00
Samuel Berthe
3d0c5fcafd
Update rules.yml
2023-06-22 18:29:21 +02:00
Samuel Berthe
600a759344
Update rules.yml
2023-06-22 15:01:06 +02:00
Samuel Berthe
ee86c2d233
Update rules.yml
2023-06-22 15:00:40 +02:00
michaelact
7e8bc1a215
Add under-utilized container alerts (#322)
* chore: add container under-utilized alerts
* chore: resolve duplicated query and description
2023-05-21 22:58:04 +02:00
Paul-Élie Testud
c36014f03e
fix(nginx): fix nginx query for histogram_percentile (#351)
2023-04-28 16:06:12 +02:00
deimosOmegaChan
b98b2a2777
fix node-exporter nodename regex expression (#349)
nodename should not depend on the prefix "hostname"
2023-04-25 10:58:52 +02:00
Samuel Berthe
9efec14d26
chore: move from "https://awesome-prometheus-alerts.grep.to" to "https://samber.github.io/awesome-prometheus-alerts/"
2023-04-23 23:32:26 +02:00
Madhu Sudhan
8b9fc8864f
refactor: node-exporter queries to include hostname as label which will be helpful for alerting (#348)
2023-04-23 22:16:08 +02:00
Mikael Lindström
8357165cfb
Update MongoDB replication lag alert to use seconds (#344)
The mongodb_rs_members_optimeDate metric is in milliseconds, the
replication lag query has been updated to reflect this.
2023-04-07 01:42:25 +02:00
Mikael Lindström
2617aa5dab
Fix MongoDB replication headroom query (#342)
The query was changed to use `mongodb_oplog_stats_start` and
`mongodb_oplog_stats_end` in #291 but these metrics do not represent
the start and end of the oplog. The original head and tail metrics are
calculated from the oplog and are consistent with the output of
`db.getReplicationInfo()`.
2023-04-03 10:01:25 +02:00
Samuel Berthe
f9b43cf3bf
Update rules.yml
2023-03-24 14:36:52 +01:00
Kratik Jain
aa2988693b
Adding more rules for Thanos Monitoring (#340)
* Adding more rules for Thanos Components Monitoring
* lint
* lint
* lint
---------
Co-authored-by: Samuel Berthe <dev@samuel-berthe.fr>
2023-03-15 18:26:24 +01:00
Samuel Berthe
59891728e4
Solves #336
2023-02-26 02:33:50 +01:00
Samuel Berthe
60cb26681f
Update rules.yml
2023-02-23 15:19:36 +01:00
Samuel Berthe
bde83bc9ee
Update rules.yml
2023-02-17 01:14:19 +01:00
alexandrumarian-portal
1e44e348ee
Hashicorp Vault cluster health (#338)
* Hashicorp Vault cluster health
2023-02-17 01:13:41 +01:00
Samuel Berthe
65a0f969be
Update rules.yml
2023-02-14 14:02:35 +01:00
Yannick Markus
7aeccf2874
Add APC UPS & ZFS exporter (#331)
* add apcupsd_exporter rules
* add zfs_exporter rules
2023-02-12 20:01:26 +01:00
Jan Gosmann
df6d71bad5
Make ElasticsearchNoNewDocuments alert more robust (#334)
Use `elasticsearch_indices_indexing_index_total` instead of
`elasticsearch_indices_docs` because `elasticsearch_indices_docs` might
not update without an index refresh [1]. Refreshes happen every second
by default, *but* only if there have been search requests within the
last 30 seconds [2]. If there are no search requests for a sufficiently
long duration, the alert based on `elasticsearch_indices_docs` will fire
mistakenly.
Apart from that, `elasticsearch_indices_docs` has the gauge metric type
(while `elasticsearch_indices_indexing_index_total` is of the counter
type) and the `increase` function is not intended to be used with
gauges. Drops in the document count would be treated as a reset to 0,
thus showing an increase by all remaining documents.
[1]: https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-stats.html#index-stats-api-path-params
[2]: https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-refresh.html
2023-01-30 17:06:40 +01:00
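The gauge-versus-counter point in the ElasticsearchNoNewDocuments commit above can be illustrated with a small simulation. This is only a sketch: `naive_increase` is a hypothetical helper that mimics the counter-reset handling the commit describes (a drop is treated as a reset to 0), and it deliberately leaves out the range-boundary extrapolation that the real `increase` function also performs. The sample values are invented for illustration.

```python
def naive_increase(samples):
    """Sum sample-to-sample deltas, treating any drop as a counter reset to 0.

    Hypothetical simplification of counter-style increase handling;
    no extrapolation to the range boundaries is attempted.
    """
    total = 0.0
    for prev, cur in zip(samples, samples[1:]):
        if cur < prev:
            # A drop is read as "reset to 0, then climbed to cur",
            # so the whole current value counts as an increase.
            total += cur
        else:
            total += cur - prev
    return total


# Gauge-like document count: 100 documents deleted, nothing indexed.
docs_gauge = [1000, 1000, 900, 900]

# Counter-like indexing total over the same period: no new documents.
index_total = [5000, 5000, 5000, 5000]

print(naive_increase(docs_gauge))   # large spurious "increase"
print(naive_increase(index_total))  # no increase
```

With these made-up samples, the gauge series reports an increase equal to all remaining documents even though nothing was indexed, while the counter series correctly reports none.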