fix: add division-by-zero guards and improve quoting in memcached rules (#512)

- Add `and memcached_max_connections > 0` to connection limit queries
- Add `and memcached_limit_bytes > 0` to memory usage query
- Switch hit-rate query to single quotes for cleaner PromQL readability
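
Why the guards matter, as a rough sketch: PromQL does float arithmetic, so dividing by zero does not raise an error. A non-zero numerator over a zero limit gives `+Inf`, which passes `> 80`, while `0 / 0` gives `NaN`, which fails every comparison. The Python below only models that behaviour for a single sample. `prom_div` and `usage_alert` are hypothetical helper names for illustration, not part of Prometheus or of this repository.

```python
import math


def prom_div(a: float, b: float) -> float:
    """Divide like IEEE 754 floats do (no exception on a zero divisor).

    Simplified: assumes the divisor is never negative zero.
    """
    if b == 0:
        if a == 0 or math.isnan(a):
            return math.nan
        return math.inf if a > 0 else -math.inf
    return a / b


def usage_alert(current: float, limit: float, threshold: float, guarded: bool) -> bool:
    """Model `(current / limit * 100) > threshold`, optionally `and limit > 0`."""
    fires = prom_div(current, limit) * 100 > threshold
    if guarded:
        # Mirrors the added `and <limit metric> > 0` clause.
        fires = fires and limit > 0
    return fires
```

With a limit of 0 and 5 current connections, the unguarded expression fires and the guarded one does not. For any positive limit both variants behave the same.
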
Samuel Berthe 2026-03-16 03:28:06 +01:00
parent 39d1342e98
commit c94fa0d230

@@ -1068,12 +1068,12 @@ groups:
1m delay allows a restart without triggering an alert.
- name: Memcached connection limit approaching (> 80%)
description: "Memcached connection usage is above 80% on {{ $labels.instance }} (current value: {{ $value }}%)"
query: "(memcached_current_connections / memcached_max_connections * 100) > 80"
query: "(memcached_current_connections / memcached_max_connections * 100) > 80 and memcached_max_connections > 0"
severity: warning
for: 2m
- name: Memcached connection limit approaching (> 95%)
description: "Memcached connection usage is above 95% on {{ $labels.instance }} (current value: {{ $value }}%)"
query: "(memcached_current_connections / memcached_max_connections * 100) > 95"
query: "(memcached_current_connections / memcached_max_connections * 100) > 95 and memcached_max_connections > 0"
severity: critical
for: 2m
- name: Memcached out of memory errors
@@ -1083,7 +1083,7 @@ groups:
for: 5m
- name: Memcached memory usage high (> 90%)
description: "Memcached memory usage is above 90% on {{ $labels.instance }} (current value: {{ $value }}%)"
query: "(memcached_current_bytes / memcached_limit_bytes * 100) > 90"
query: "(memcached_current_bytes / memcached_limit_bytes * 100) > 90 and memcached_limit_bytes > 0"
severity: warning
for: 5m
comments: |
@@ -1097,7 +1097,7 @@ groups:
A sustained eviction rate indicates memory pressure. Consider increasing memcached memory limit or reducing cache usage. Threshold of 10 evictions/s is a rough default — adjust based on your workload.
- name: Memcached low cache hit rate (< 80%)
description: "Memcached cache hit rate is below 80% on {{ $labels.instance }} (current value: {{ $value }}%)"
query: "(rate(memcached_commands_total{command=\"get\", status=\"hit\"}[5m]) / (rate(memcached_commands_total{command=\"get\", status=\"hit\"}[5m]) + rate(memcached_commands_total{command=\"get\", status=\"miss\"}[5m])) * 100) < 80 and (rate(memcached_commands_total{command=\"get\", status=\"hit\"}[5m]) + rate(memcached_commands_total{command=\"get\", status=\"miss\"}[5m])) > 0"
query: '(rate(memcached_commands_total{command="get", status="hit"}[5m]) / (rate(memcached_commands_total{command="get", status="hit"}[5m]) + rate(memcached_commands_total{command="get", status="miss"}[5m])) * 100) < 80 and (rate(memcached_commands_total{command="get", status="hit"}[5m]) + rate(memcached_commands_total{command="get", status="miss"}[5m])) > 0'
severity: warning
for: 10m
comments: |