Commit graph

38 commits

Author SHA1 Message Date
samber
ed1515015a Publish 2026-04-06 18:38:45 +00:00
samber
af2f277830 Publish 2026-03-18 20:41:01 +00:00
samber
c0e1f7a5f5 Publish 2026-03-18 17:06:34 +00:00
samber
1e4e3d17bc Publish 2026-03-15 17:08:32 +00:00
samber
80400e9a56 Publish 2026-03-01 19:15:42 +00:00
samber
dd10c7ef05 Publish 2026-01-30 11:15:52 +00:00
Simon Matic Langford
f810ff531d
Node exporter rules to preserve instance labels (#488)
* Jenkins node offline for clause (#2)

* Convert cpu alert expressions to without() rather than on()

* Remove on() expression from network throughput alerts as labels fully match

---------

Co-authored-by: Simon Matic Langford <simon@longshotsystems.co.uk>
2026-01-06 16:24:18 +01:00
samber
4acbddb21a Publish 2025-11-05 16:04:56 +00:00
samber
b04b11ce1d Publish 2025-06-25 11:32:39 +00:00
samber
ea63d8001a Publish 2025-06-17 17:16:15 +00:00
samber
6ebe6d8a8e Publish 2025-06-17 15:07:35 +00:00
samber
a75d5124c5 Publish 2025-04-17 15:26:25 +00:00
samber
9963b750ac Publish 2025-02-20 14:06:17 +00:00
samber
7889a9a29b Publish 2025-02-16 22:37:09 +00:00
samber
20f9a36615 Publish 2025-02-16 22:17:02 +00:00
guruevi
70ac7d9cae
Various updates and quality of life changes (#405)
* smartctl_exporter publishes both drive_trip and current drive temperatures. Since most of the alerts are going to be permanent, it does not make sense to wait for the alert to be on for a certain time. Temperature sensors likewise vary; using the last sample is not sufficient to alert on potential issues.

* Add an option to run GitHub Action manually

* Add an option to force running the action for testing purposes

* Set variables correctly

* Set variables correctly

* Publish

* Clean up some more metrics

* Publish

* Minor bug fixes

* Publish

* Removed queries that throw errors when systems are upgraded. Also fixed and simplified a few Postgres queries.

* Publish

* Refined some more queries

* Publish

* PostgreSQL now has optimized autovacuum behavior

* Publish

* PostgreSQL now has optimized autovacuum behavior

* Publish

* Publish

* Query fails if instance names are not unique across jobs. This fixes it.

* Publish

* Ruby is out of date

---------

Co-authored-by: samber <samber@users.noreply.github.com>
2025-01-28 06:06:47 +01:00
sunlei
cbb2337438
fix: formatting errors (#448)
* fix: formatting errors

* Update query format in rules.yml

---------

Co-authored-by: Samuel Berthe <dev@samuel-berthe.fr>
2025-01-12 22:01:21 +01:00
samber
4aa45dee05 Publish 2024-08-28 06:49:52 +00:00
samber
60c235975c Publish 2024-06-14 18:16:53 +00:00
samber
b77cb3467c Publish 2024-04-29 20:36:49 +00:00
samber
284db65e46 Publish 2024-02-10 19:02:28 +00:00
Brett Beutell
56a7e0d03a
Update rule for host memory underutilization to use avg_over_time instead of rate, since node_memory_MemAvailable_bytes is a gauge (#400) 2024-01-26 04:09:35 +01:00
samber
7d05d142d5 Publish 2023-11-26 01:19:24 +00:00
samber
afddf710ab Publish 2023-08-15 18:28:36 +00:00
samber
f72620203f Publish 2023-07-30 20:22:47 +00:00
samber
7a05f925b4 Publish 2023-06-22 16:42:13 +00:00
samber
a4dbefd853 Publish 2023-06-22 16:30:42 +00:00
samber
7a874b7205 Publish 2023-04-25 08:59:28 +00:00
samber
9d3d52bbfa Publish 2023-04-23 20:16:41 +00:00
Julien Lecomte
baa4f223cd
Ignore temperature from tctl sensors (#341) 2023-03-24 14:36:24 +01:00
samber
293aba1437 Publish 2023-02-26 01:34:30 +00:00
samber
fa56b637a1 Publish 2023-02-14 13:03:11 +00:00
samber
5de0ee850b Publish 2023-02-14 13:01:25 +00:00
alexandrumarian-portal
18da40f8b4
disk io ops alarm (#337)
* disk io ops alarm

* disk io ops alarm
2023-02-14 14:00:43 +01:00
samber
32a0ce2c0b Publish 2022-12-06 09:38:04 +00:00
samber
4f908b36fb Publish 2022-12-06 09:27:25 +00:00
samber
6c9c521150 Publish 2022-10-24 11:47:50 +00:00
Samuel Berthe
ccf24bcf03 Publish 2022-06-15 00:08:51 +00:00