awesome-prometheus-alerts

mirror of https://github.com/samber/awesome-prometheus-alerts.git synced 2026-06-21 00:47:18 +08:00

Author	SHA1	Message	Date
Roman	c189984d0f	fix node-exporter.yaml missing parentheses (#452)	2025-02-20 15:05:48 +01:00
samber	807db03d0d	Publish	2025-02-19 14:25:58 +00:00
Samuel Berthe	6838196343	fix: remove duplicated rule	2025-02-19 15:25:29 +01:00
dependabot[bot]	0f4b45d127	build(deps-dev): bump nokogiri from 1.16.7 to 1.18.3 (#451) Bumps [nokogiri](https://github.com/sparklemotion/nokogiri) from 1.16.7 to 1.18.3. - [Release notes](https://github.com/sparklemotion/nokogiri/releases) - [Changelog](https://github.com/sparklemotion/nokogiri/blob/v1.18.3/CHANGELOG.md) - [Commits](https://github.com/sparklemotion/nokogiri/compare/v1.16.7...v1.18.3) --- updated-dependencies: - dependency-name: nokogiri dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-02-19 14:33:37 +01:00
samber	4e49e77d29	Publish	2025-02-16 22:47:17 +00:00
dzaczek	11a78f0f06	Update google-cadvisor.yml (#382) * Update google-cadvisor.yml Expression Explanation: The expression calculates the absolute change in CPU usage for containers by comparing the current rate of CPU usage (within the last 1 minute) with the rate of CPU usage from the previous minute. If this change exceeds 25%, the alert is triggered. Additionally, it compares the current rate of CPU usage with the rate from the previous 5 minutes to capture larger trends. If any of these conditions are met, the alert fires. Alert Details: - Alert Name: ContainerHighLowChangeCpuUsage - Trigger Condition: Absolute change in CPU usage exceeding 25% - Alert Severity: Informational (info) * Add alert rule for high CPU usage change * Change alert severity from warning to info --------- Co-authored-by: Samuel Berthe <dev@samuel-berthe.fr>	2025-02-16 23:46:53 +01:00
samber	7889a9a29b	Publish	2025-02-16 22:37:09 +00:00
Samuel Berthe	add097c489	data: revert `5f57f09` (see #398)	2025-02-16 23:36:44 +01:00
samber	12b8acb1b8	Publish	2025-02-16 22:29:24 +00:00
asdf1234	4a7b9b5c72	Update mysqld-exporter.yml (#442) * Update mysqld-exporter.yml add some rules * Add new MySQL monitoring rules --------- Co-authored-by: Samuel Berthe <dev@samuel-berthe.fr>	2025-02-16 23:29:00 +01:00
samber	20f9a36615	Publish	2025-02-16 22:17:02 +00:00
Samuel Berthe	fb857e8b39	data: fix rules	2025-02-16 23:16:36 +01:00
Samuel Berthe	2f9c0c0483	upgrade ruby version	2025-02-16 23:15:43 +01:00
Samuel Berthe	eb92a79898	upgrade github action ruby version	2025-02-04 16:44:40 +01:00
Samuel Berthe	ae12871fa9	Update rules.yml	2025-02-04 16:40:21 +01:00
Felix Bühler	10d00c66da	Add caddy.yml (#450)	2025-02-04 14:23:14 +01:00
guruevi	70ac7d9cae	Various updates and quality of life changes (#405) * smartctl_exporter publishes both drive_trip and current drive temperatures. Since most of the alerts are going to be permanent, it does not make sense to wait for the alert to be on for a certain time. Temperature sensors likewise vary; using the last sample is not sufficient to alert on potential issues. * Add an option to run GitHub Action manually * Add an option to force running the action for testing purposes * Set variables correctly * Set variables correctly * Publish * Clean up some more metrics * Publish * Minor bug fixes * Publish * Removed queries that throw errors when systems are upgraded. Also fixed and simplified a few Postgres queries. * Publish * Refined some more queries * Publish * PostgreSQL now has optimized autovacuum behavior * Publish * PostgreSQL now has optimized autovacuum behavior * Publish * Publish * Query fails if instance names are not unique across jobs. This fixes it. * Publish * Ruby is out of date --------- Co-authored-by: samber <samber@users.noreply.github.com>	2025-01-28 06:06:47 +01:00
Samuel Berthe	fc6b3faadc	Fix from #405	2025-01-28 06:04:10 +01:00
Samuel Berthe	d916b7c6ab	Fix from #405	2025-01-28 05:58:49 +01:00
sunlei	cbb2337438	fix: formatting errors (#448) * fix: formatting errors * Update query format in rules.yml --------- Co-authored-by: Samuel Berthe <dev@samuel-berthe.fr>	2025-01-12 22:01:21 +01:00
samber	53a369769d	Publish	2024-12-16 11:19:08 +00:00
Samuel Berthe	bdcc67c04e	Update rules.yml	2024-12-16 12:17:59 +01:00
Samuel Berthe	84a3b517a8	Update rules.yml	2024-12-16 12:17:26 +01:00
samber	4533f23b79	Publish	2024-12-16 11:17:17 +00:00
dxrayz	52d4a8c744	Update postgres-exporter.yml (#444) Modify PostgresqlConfigurationChanged to prevent the error "many-to-many matching not allowed: matching labels must be unique on one side" when there are multiple instances of Postgres	2024-12-16 12:16:05 +01:00
samber	c5203e94d0	Publish	2024-12-08 20:29:15 +00:00
Samuel Berthe	a8d7c43b30	Update rules.yml	2024-12-08 21:28:07 +01:00
Samuel Berthe	fff8a80ae5	Update README.md	2024-12-08 21:24:45 +01:00
samber	4e38ae2087	Publish	2024-12-05 22:38:38 +00:00
Samuel Berthe	8c3d06502f	Update rules.yml	2024-12-05 23:37:28 +01:00
samber	8a220b1b8a	Publish	2024-11-30 09:31:05 +00:00
Martin Anderson	353ef1ed95	RabbitMQ: add too many ready messages alert (#441) * RabbitMQ: add too many ready messages alert * Add RabbitMQ ready messages alert rule --------- Co-authored-by: Samuel Berthe <dev@samuel-berthe.fr>	2024-11-30 10:29:57 +01:00
samber	14949721ba	Publish	2024-10-28 21:25:18 +00:00
sipr-invivo	bb75cb2c68	feat: Add rule for Kubernetes Job not starting (#436)	2024-10-28 22:24:10 +01:00
dependabot[bot]	f9e683896f	build(deps-dev): bump rexml from 3.3.7 to 3.3.9 (#438) Bumps [rexml](https://github.com/ruby/rexml) from 3.3.7 to 3.3.9. - [Release notes](https://github.com/ruby/rexml/releases) - [Changelog](https://github.com/ruby/rexml/blob/master/NEWS.md) - [Commits](https://github.com/ruby/rexml/compare/v3.3.7...v3.3.9) --- updated-dependencies: - dependency-name: rexml dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-10-28 20:17:58 +01:00
Samuel Berthe	c41fda1d92	Update alertmanager.md	2024-10-06 17:31:23 +02:00
Samuel Berthe	7313acce36	Create FUNDING.json	2024-10-05 18:57:43 +02:00
Samuel Berthe	640f06588d	Delete FUNDING.json	2024-10-05 18:21:35 +02:00
Samuel Berthe	cd5b39a1f0	Create FUNDING.json	2024-10-05 18:06:22 +02:00
dependabot[bot]	35596c866f	build(deps): bump webrick from 1.7.0 to 1.8.2 (#435) Bumps [webrick](https://github.com/ruby/webrick) from 1.7.0 to 1.8.2. - [Release notes](https://github.com/ruby/webrick/releases) - [Commits](https://github.com/ruby/webrick/compare/v1.7.0...v1.8.2) --- updated-dependencies: - dependency-name: webrick dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-09-27 22:24:21 +02:00
Samuel Berthe	d6d6ae4ef8	fix: Gemfile to reduce vulnerabilities (#434) The following vulnerabilities are fixed with an upgrade: - https://snyk.io/vuln/SNYK-RUBY-WEBRICK-8068535 Co-authored-by: snyk-bot <snyk-bot@snyk.io>	2024-09-26 11:31:21 +02:00
dependabot[bot]	65a5f586cb	build(deps-dev): bump rexml from 3.3.3 to 3.3.6 (#431) Bumps [rexml](https://github.com/ruby/rexml) from 3.3.3 to 3.3.6. - [Release notes](https://github.com/ruby/rexml/releases) - [Changelog](https://github.com/ruby/rexml/blob/master/NEWS.md) - [Commits](https://github.com/ruby/rexml/compare/v3.3.3...v3.3.6) --- updated-dependencies: - dependency-name: rexml dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-09-09 20:09:20 +02:00
samber	4aa45dee05	Publish	2024-08-28 06:49:52 +00:00
Samuel Berthe	f08e8df514	oops	2024-08-28 08:48:42 +02:00
Samuel Berthe	995ab4d27a	Update rules.yml	2024-08-28 08:46:41 +02:00
Samuel Berthe	3bf8d6d824	fix: Gemfile to reduce vulnerabilities (#432) The following vulnerabilities are fixed with an upgrade: - https://snyk.io/vuln/SNYK-RUBY-REXML-7814166 Co-authored-by: snyk-bot <snyk-bot@snyk.io>	2024-08-24 10:42:21 +02:00
Somrat Dutta	8c0bdc2b24	feat: Add NATS and JetStream Prometheus alert rules (#430) * feat: Add comprehensive NATS and JetStream Prometheus alert rules - Added multiple Prometheus alert rules for monitoring NATS server and JetStream metrics. - Included alerts for: - High connection count - High pending bytes - High subscriptions count - High routes count - High memory usage - Slow consumers - NATS server downtime - High CPU usage - High number of active connections - High JetStream store and memory usage - Subscription limits exceeded - High pending messages - Authentication timeouts - Errors in NATS (JetStream API errors) - JetStream consumers limit exceeded - Exceeding max payload size - Leaf node connection issues - Ping operations limit exceeded - Write deadline exceeded - Ensured consistency between `exporter.yml` and `rules.yml` files. - Improved overall NATS and JetStream monitoring to prevent performance degradation and ensure system reliability. This commit enhances the visibility of NATS and JetStream operations by providing key metrics to alert on potential issues and optimize system performance. * Update rules.yml * - minor changes, rollback rules.yml - address comment changes - revert to old rules.yml as they are generated * - minor changes, rollback rules.yml - address comment changes - revert to old rules.yml as they are generated * fix indentation --------- Co-authored-by: somratdutta <duttasomratand.com> Co-authored-by: Samuel Berthe <dev@samuel-berthe.fr> Co-authored-by: somrat.dutta <somrat.dutta@nutanix.com>	2024-08-20 20:37:03 +02:00
samber	02687db33d	Publish	2024-08-20 16:32:36 +00:00
Samuel Berthe	d1715de751	fix PostgresqlInvalidIndex rule	2024-08-20 18:31:18 +02:00
dependabot[bot]	61da73d517	build(deps-dev): bump rexml from 3.3.2 to 3.3.3 (#428) Bumps [rexml](https://github.com/ruby/rexml) from 3.3.2 to 3.3.3. - [Release notes](https://github.com/ruby/rexml/releases) - [Changelog](https://github.com/ruby/rexml/blob/master/NEWS.md) - [Commits](https://github.com/ruby/rexml/compare/v3.3.2...v3.3.3) --- updated-dependencies: - dependency-name: rexml dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-08-02 14:14:26 +02:00

770 commits