awesome-prometheus-alerts

mirror of https://github.com/samber/awesome-prometheus-alerts.git synced 2026-06-21 00:47:18 +08:00

Author	SHA1	Message	Date
samber	a75d5124c5	Publish	2025-04-17 15:26:25 +00:00
Samuel Berthe	3b440fec7b	Remove buggy HostRequiresReboot rule Closing #459	2025-04-17 17:26:00 +02:00
samber	32a4bfb19b	Publish	2025-03-27 16:23:49 +00:00
Samuel Berthe	8b730ef059	Update rules.yml	2025-03-27 17:23:19 +01:00
samber	93f9daecee	Publish	2025-03-27 13:42:51 +00:00
Motte	69c8208e3c	Added PostgresqlReplicationLagHigh rule (#456) * Added PostgresqlReplicationLagHigh rule * Update PostgreSQL replication lag alert settings --------- Co-authored-by: Samuel Berthe <dev@samuel-berthe.fr>	2025-03-27 14:42:22 +01:00
Pigueiras	97a31f34e5	Fix queries in elasticsearch latency alerts (#455) The `elasticsearch_indices_search_fetch_total`, `elasticsearch_indices_search_fetch_time_seconds`, `elasticsearch_indices_indexing_index_time_seconds_total` and `elasticsearch_indices_indexing_index_total` metrics are counters. Dividing these metrics doesn't make sense because a spike in numerator would cause the alert to persist, even if subsequent fetch/index operations are normal. Adding `increase` changes the query to check if operations took, on average, more than X over a 1-minute interval, which was likely the original intent of this alert.	2025-03-26 22:15:24 +01:00
dependabot[bot]	242054f7dc	build(deps-dev): bump uri from 0.13.1 to 0.13.2 (#454) Bumps [uri](https://github.com/ruby/uri) from 0.13.1 to 0.13.2. - [Release notes](https://github.com/ruby/uri/releases) - [Commits](https://github.com/ruby/uri/compare/v0.13.1...v0.13.2) --- updated-dependencies: - dependency-name: uri dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-03-23 16:30:56 +01:00
dependabot[bot]	4335f85830	build(deps-dev): bump nokogiri from 1.18.3 to 1.18.4 (#453) Bumps [nokogiri](https://github.com/sparklemotion/nokogiri) from 1.18.3 to 1.18.4. - [Release notes](https://github.com/sparklemotion/nokogiri/releases) - [Changelog](https://github.com/sparklemotion/nokogiri/blob/main/CHANGELOG.md) - [Commits](https://github.com/sparklemotion/nokogiri/compare/v1.18.3...v1.18.4) --- updated-dependencies: - dependency-name: nokogiri dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-03-23 16:26:08 +01:00
samber	7bcae33011	Publish	2025-02-20 15:18:08 +00:00
Samuel Berthe	2127c4ce90	Update rules.yml	2025-02-20 16:17:39 +01:00
samber	9963b750ac	Publish	2025-02-20 14:06:17 +00:00
Roman	c189984d0f	fix node-exporter.yaml missing parentheses (#452)	2025-02-20 15:05:48 +01:00
samber	807db03d0d	Publish	2025-02-19 14:25:58 +00:00
Samuel Berthe	6838196343	fix: remove duplicated rule	2025-02-19 15:25:29 +01:00
dependabot[bot]	0f4b45d127	build(deps-dev): bump nokogiri from 1.16.7 to 1.18.3 (#451) Bumps [nokogiri](https://github.com/sparklemotion/nokogiri) from 1.16.7 to 1.18.3. - [Release notes](https://github.com/sparklemotion/nokogiri/releases) - [Changelog](https://github.com/sparklemotion/nokogiri/blob/v1.18.3/CHANGELOG.md) - [Commits](https://github.com/sparklemotion/nokogiri/compare/v1.16.7...v1.18.3) --- updated-dependencies: - dependency-name: nokogiri dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-02-19 14:33:37 +01:00
samber	4e49e77d29	Publish	2025-02-16 22:47:17 +00:00
dzaczek	11a78f0f06	Update google-cadvisor.yml (#382) * Update google-cadvisor.yml Expression Explanation: The expression calculates the absolute change in CPU usage for containers by comparing the current rate of CPU usage (within the last 1 minute) with the rate of CPU usage from the previous minute. If this change exceeds 25%, the alert is triggered. Additionally, it compares the current rate of CPU usage with the rate from the previous 5 minutes to capture larger trends. If any of these conditions are met, the alert fires. Alert Details: - Alert Name: ContainerHighLowChangeCpuUsage - Trigger Condition: Absolute change in CPU usage exceeding 25% - Alert Severity: Informational (info) * Add alert rule for high CPU usage change * Change alert severity from warning to info --------- Co-authored-by: Samuel Berthe <dev@samuel-berthe.fr>	2025-02-16 23:46:53 +01:00
samber	7889a9a29b	Publish	2025-02-16 22:37:09 +00:00
Samuel Berthe	add097c489	data: revert `5f57f09` (see #398)	2025-02-16 23:36:44 +01:00
samber	12b8acb1b8	Publish	2025-02-16 22:29:24 +00:00
asdf1234	4a7b9b5c72	Update mysqld-exporter.yml (#442) * Update mysqld-exporter.yml add some rules * Add new MySQL monitoring rules --------- Co-authored-by: Samuel Berthe <dev@samuel-berthe.fr>	2025-02-16 23:29:00 +01:00
samber	20f9a36615	Publish	2025-02-16 22:17:02 +00:00
Samuel Berthe	fb857e8b39	data: fix rules	2025-02-16 23:16:36 +01:00
Samuel Berthe	2f9c0c0483	upgrade ruby version	2025-02-16 23:15:43 +01:00
Samuel Berthe	eb92a79898	upgrade github action ruby version	2025-02-04 16:44:40 +01:00
Samuel Berthe	ae12871fa9	Update rules.yml	2025-02-04 16:40:21 +01:00
Felix Bühler	10d00c66da	Add caddy.yml (#450)	2025-02-04 14:23:14 +01:00
guruevi	70ac7d9cae	Various updates and quality of life changes (#405) * smartctl_exporter publishes both drive_trip and current drive temperatures. Since most of the alerts are going to be permanent, it does not make sense to wait for the alert to be on for a certain time. Temperature sensors likewise vary, using the last sample is not sufficient to alert on potential issues. * Add an option to run GitHub Action manually * Add an option to force running the action for testing purposes * Set variables correctly * Set variables correctly * Publish * Clean up some more metrics * Publish * Minor bug fixes * Publish * Removed queries that throw errors when systems are upgraded. Also fixed and simplified a few Postgres queries. * Publish * Refined some more queries * Publish * PostgreSQL now has optimized autovacuum behavior * Publish * PostgreSQL now has optimized autovacuum behavior * Publish * Publish * Query fails if instance names are not unique across jobs. This fixes it. * Publish * Ruby is out of date --------- Co-authored-by: samber <samber@users.noreply.github.com>	2025-01-28 06:06:47 +01:00
Samuel Berthe	fc6b3faadc	Fix from #405	2025-01-28 06:04:10 +01:00
Samuel Berthe	d916b7c6ab	Fix from #405	2025-01-28 05:58:49 +01:00
sunlei	cbb2337438	fix: formatting errors (#448) * fix: formatting errors * Update query format in rules.yml --------- Co-authored-by: Samuel Berthe <dev@samuel-berthe.fr>	2025-01-12 22:01:21 +01:00
samber	53a369769d	Publish	2024-12-16 11:19:08 +00:00
Samuel Berthe	bdcc67c04e	Update rules.yml	2024-12-16 12:17:59 +01:00
Samuel Berthe	84a3b517a8	Update rules.yml	2024-12-16 12:17:26 +01:00
samber	4533f23b79	Publish	2024-12-16 11:17:17 +00:00
dxrayz	52d4a8c744	Update postgres-exporter.yml (#444) Modify PostgresqlConfigurationChanged to prevent the error "many-to-many matching not allowed: matching labels must be unique on one side" when there are multiple instances of postgres	2024-12-16 12:16:05 +01:00
samber	c5203e94d0	Publish	2024-12-08 20:29:15 +00:00
Samuel Berthe	a8d7c43b30	Update rules.yml	2024-12-08 21:28:07 +01:00
Samuel Berthe	fff8a80ae5	Update README.md	2024-12-08 21:24:45 +01:00
samber	4e38ae2087	Publish	2024-12-05 22:38:38 +00:00
Samuel Berthe	8c3d06502f	Update rules.yml	2024-12-05 23:37:28 +01:00
samber	8a220b1b8a	Publish	2024-11-30 09:31:05 +00:00
Martin Anderson	353ef1ed95	RabbitMQ: add too many ready messages alert (#441) * RabbitMQ: add too many ready messages alert * Add RabbitMQ ready messages alert rule --------- Co-authored-by: Samuel Berthe <dev@samuel-berthe.fr>	2024-11-30 10:29:57 +01:00
samber	14949721ba	Publish	2024-10-28 21:25:18 +00:00
sipr-invivo	bb75cb2c68	feat: Add rule to Kubernetes Job not starting (#436)	2024-10-28 22:24:10 +01:00
dependabot[bot]	f9e683896f	build(deps-dev): bump rexml from 3.3.7 to 3.3.9 (#438) Bumps [rexml](https://github.com/ruby/rexml) from 3.3.7 to 3.3.9. - [Release notes](https://github.com/ruby/rexml/releases) - [Changelog](https://github.com/ruby/rexml/blob/master/NEWS.md) - [Commits](https://github.com/ruby/rexml/compare/v3.3.7...v3.3.9) --- updated-dependencies: - dependency-name: rexml dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-10-28 20:17:58 +01:00
Samuel Berthe	c41fda1d92	Update alertmanager.md	2024-10-06 17:31:23 +02:00
Samuel Berthe	7313acce36	Create FUNDING.json	2024-10-05 18:57:43 +02:00
Samuel Berthe	640f06588d	Delete FUNDING.json	2024-10-05 18:21:35 +02:00
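
Note on commit 97a31f34e5 (elasticsearch latency alerts): the reasoning in that message can be illustrated with a small sketch. The sample values, the 1-second threshold, and the function names below are hypothetical and are not taken from the repository. The plain subtraction only stands in for the idea behind `increase`; the real PromQL function also handles counter resets and extrapolation, which this sketch ignores.

```python
# Hypothetical cumulative counter samples, one per minute (not real Elasticsearch data).
# Minute 2 contains a latency spike: 10 fetches took 60 seconds in total.
fetch_time_seconds = [0.0, 1.0, 61.0, 62.0, 63.0]  # cumulative seconds spent fetching
fetch_total = [0, 10, 20, 30, 40]                  # cumulative number of fetches

THRESHOLD_SECONDS = 1.0  # assumed alert threshold, for illustration only


def lifetime_average(time_counter, ops_counter, i):
    """Ratio of the raw counters: average latency since the counters started."""
    return time_counter[i] / ops_counter[i] if ops_counter[i] else 0.0


def windowed_average(time_counter, ops_counter, i):
    """Ratio of per-interval increases: average latency over the last interval only."""
    d_time = time_counter[i] - time_counter[i - 1]
    d_ops = ops_counter[i] - ops_counter[i - 1]
    return d_time / d_ops if d_ops else 0.0


if __name__ == "__main__":
    for i in range(1, len(fetch_total)):
        raw = lifetime_average(fetch_time_seconds, fetch_total, i)
        win = windowed_average(fetch_time_seconds, fetch_total, i)
        print(
            f"minute {i}: raw ratio fires={raw > THRESHOLD_SECONDS}, "
            f"windowed ratio fires={win > THRESHOLD_SECONDS}"
        )
```

With these made-up samples, the raw ratio stays above the threshold for the two minutes after the spike even though those fetches are fast again, while the windowed ratio exceeds it only during the spike minute.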

782 commits