* feat: add OpenStack alerting rules (openstack-exporter)
Add 20 alerting rules for openstack-exporter/openstack-exporter covering
Nova, Neutron, Cinder, Octavia, and Placement services.
* docs: add OpenStack to README services list
* fix: align OpenStack load balancer alert name with operating_status semantics
The operating_status label uses ONLINE/OFFLINE/DEGRADED/ERROR values,
not ACTIVE. Rename alert to "not online" and use the label in the
description for clarity.
* feat: add process-exporter alerting rules (ncabatoff/process-exporter)
* docs: add Process to README services list
* fix: address PR review feedback for process-exporter rules
- Rename service from "Process" to "Process Exporter" for clarity
- Fix grammar: "file descriptors usage" → "file descriptor usage"
- Clarify CPU alert description as core-equivalent percentage
- Rename "high disk IO" to "high disk write IO" for accuracy
* feat: add IPMI exporter alerting rules
Add 17 alerting rules for prometheus-community/ipmi_exporter covering
temperature, fan, voltage, current, power sensors, chassis status,
and system event log monitoring.
* docs: add IPMI to README service list
* Apply suggestions from code review
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Add 7 alerting rules for prometheus/snmp_exporter covering device
availability, interface status, error rates, bandwidth utilization,
and device restarts. Rules use standard IF-MIB and SNMPv2-MIB metrics.
* data: adding python/ruby/golang
* fix: address review feedback on runtime alerts
- JVM non-heap: guard against unbounded metaspace (max_bytes = -1)
- JVM old gen GC: note regex only matches CMS/G1/Parallel collectors
- JVM/Python file descriptors: note process_* metrics are generic
- Go memory usage: fix description (sys_bytes is runtime memory, not host)
- Go goroutine spike: use deriv() instead of rate() on gauge
- Go GC CPU fraction: note deprecation since Go 1.20
- Go GC duration: clarify quantile="1" is max, not p99
- Python uncollectable: use increase() on counter instead of raw threshold
- Add threshold comments for workload-dependent defaults
Adding rules to: prometheus, kubernetes, redis, docker and postgresql.
Arranging exporters into categories.
Showing number of rules.
Thanks to Gitlab for opensourcing alerting rules!