The `elasticsearch_indices_search_fetch_total`,
`elasticsearch_indices_search_fetch_time_seconds`,
`elasticsearch_indices_indexing_index_time_seconds_total`
and `elasticsearch_indices_indexing_index_total` metrics
are counters.
Dividing one of these counters by another doesn't make sense: the result
is an average over the counters' whole lifetime, so a spike in the
numerator would cause the alert to persist even if subsequent
fetch/index operations are normal. Adding `increase` changes the query
to check if operations took, on average, more than X over
a 1-minute interval, which was likely the original intent of
this alert.
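
To illustrate the difference, here is a minimal Python sketch, not the actual alert expression. The sample values, the 0.1 s threshold and the helper names `cumulative_ratio` and `windowed_ratio` are all made up for illustration. The windowed version uses plain differences between consecutive samples, which only approximates what `increase` does (no extrapolation, no counter-reset handling).

```python
def cumulative_ratio(time_total, ops_total):
    """Average time per operation since the counters started."""
    return [t / n for t, n in zip(time_total, ops_total)]


def windowed_ratio(time_total, ops_total):
    """Average time per operation within each interval between samples."""
    ratios = []
    for i in range(1, len(ops_total)):
        d_time = time_total[i] - time_total[i - 1]
        d_ops = ops_total[i] - ops_total[i - 1]
        # Avoid dividing by zero when no operations happened in the interval.
        ratios.append(d_time / d_ops if d_ops else 0.0)
    return ratios


# Hypothetical samples, one per minute: 100 operations per minute,
# 1 s spent per minute, except for a 60 s spike in the third sample.
ops_total = [100, 200, 300, 400, 500]
time_total = [1.0, 2.0, 62.0, 63.0, 64.0]

THRESHOLD = 0.1  # hypothetical "X", in seconds per operation

# Stays above the threshold long after the spike is over.
print([r > THRESHOLD for r in cumulative_ratio(time_total, ops_total)])
# Is above the threshold only for the interval containing the spike.
print([r > THRESHOLD for r in windowed_ratio(time_total, ops_total)])
```

With these made-up numbers the cumulative ratio is still above the threshold two samples after the spike, while the windowed ratio drops back immediately.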
* Update google-cadvisor.yml
Expression Explanation:
The expression calculates the absolute change in CPU usage for containers by comparing the current rate of CPU usage (within the last 1 minute) with the rate of CPU usage from the previous minute. If this change exceeds 25%, the alert is triggered. Additionally, it compares the current rate of CPU usage with the rate from the previous 5 minutes to capture larger trends. If any of these conditions are met, the alert fires.
Alert Details:
- Alert Name: ContainerHighLowChangeCpuUsage
- Trigger Condition: Absolute change in CPU usage exceeding 25%
- Alert Severity: Informational (info)
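
The trigger condition described above can be sketched in Python as follows. This is an illustration of the logic only, not the PromQL expression from google-cadvisor.yml. The function name `cpu_change_alert` is hypothetical, and the sketch assumes the rates are expressed in percent and that "25%" means a difference of 25 percentage points.

```python
def cpu_change_alert(rate_now, rate_1m_ago, rate_5m_ago, threshold=25.0):
    """Return True if CPU usage changed by more than `threshold`.

    All rates are assumed to be CPU usage in percent. The alert fires if
    the absolute change against either the previous minute or five
    minutes ago exceeds the threshold.
    """
    changed_vs_1m = abs(rate_now - rate_1m_ago) > threshold
    changed_vs_5m = abs(rate_now - rate_5m_ago) > threshold
    return changed_vs_1m or changed_vs_5m


# Hypothetical readings (percent CPU):
print(cpu_change_alert(80.0, 50.0, 75.0))  # sudden rise versus 1 minute ago
print(cpu_change_alert(50.0, 45.0, 40.0))  # small drift, no alert
print(cpu_change_alert(10.0, 12.0, 40.0))  # drop versus 5 minutes ago
```

Because the absolute difference is used, both sharp rises and sharp drops trigger the alert, which matches the "HighLow" in the alert name.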
* Add alert rule for high CPU usage change
* Change alert severity from warning to info
---------
Co-authored-by: Samuel Berthe <dev@samuel-berthe.fr>
* smartctl_exporter publishes both drive_trip and current drive temperatures. Since most of these alerts are going to be permanent, it does not make sense to wait for the condition to hold for a certain time before alerting. Temperature sensor readings also vary, so using only the last sample is not sufficient to alert on potential issues.
* Add an option to run GitHub Action manually
* Add an option to force running the action for testing purposes
* Set variables correctly
* Set variables correctly
* Publish
* Clean up some more metrics
* Publish
* Minor bug fixes
* Publish
* Removed queries that throw errors when systems are upgraded. Also fixed and simplified a few Postgres queries.
* Publish
* Refined some more queries
* Publish
* PostgreSQL now has optimized autovacuum behavior
* Publish
* PostgreSQL now has optimized autovacuum behavior
* Publish
* Publish
* Query fails if instance names are not unique across jobs. This fixes it.
* Publish
* Ruby is out of date
---------
Co-authored-by: samber <samber@users.noreply.github.com>
Modify PostgresqlConfigurationChanged to prevent the error "many-to-many matching not allowed: matching labels must be unique on one side" when there are multiple Postgres instances
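
As a rough illustration of why non-unique labels cause this kind of error, here is a deliberately simplified Python model. It is not Prometheus's matching code and not the query from this change. The label sets and the helper names `match_keys` and `has_duplicate_keys` are hypothetical. The idea is that when series are matched on a set of labels, two series on the same side that share the same values for those labels cannot be told apart.

```python
from collections import Counter


def match_keys(series, on):
    """Project each series' labels onto the labels used for matching."""
    return [tuple(labels.get(name, "") for name in on) for labels in series]


def has_duplicate_keys(series, on):
    """True if two series share the same values for the matching labels."""
    counts = Counter(match_keys(series, on))
    return any(n > 1 for n in counts.values())


# Hypothetical label sets: two jobs that both expose an instance "db-1".
series = [
    {"job": "postgres-a", "instance": "db-1"},
    {"job": "postgres-b", "instance": "db-1"},
]

# Matching on the instance name alone is ambiguous across jobs.
print(has_duplicate_keys(series, ["instance"]))
# Including the job label makes every matching key unique again.
print(has_duplicate_keys(series, ["job", "instance"]))
```

In this model, widening the set of matching labels until each key is unique on at least one side removes the ambiguity.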