Add under-utilized HPA alert

This alert fires when an HPA has spent more than half of the time at its minReplicas, which indicates possible cost savings.
It also assumes that a minimum number of replicas should keep running for redundancy, so it only fires for HPAs whose minReplicas is above 3.
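
The condition the alert expresses can be sketched in Python as a rough mirror of the PromQL query in this commit. This is an illustration under assumptions, not part of the commit: the function name `hpa_underutilized` and the parameter `redundancy_floor` are hypothetical, and the list of samples stands in for one day of `kube_horizontalpodautoscaler_status_desired_replicas` values.

```python
from statistics import median


def hpa_underutilized(desired_samples, min_replicas, redundancy_floor=3):
    """Return True if the HPA looks under-utilized (hypothetical helper).

    desired_samples: desired replica counts sampled over the window (e.g. 1 day).
    min_replicas: the HPA's spec.minReplicas.
    redundancy_floor: HPAs whose minReplicas is at or below this are never flagged.
    """
    if not desired_samples:
        return False  # no data, nothing to report
    # The 0.5 quantile (median) equals minReplicas only when the HPA sat at
    # its minimum for more than half of the samples.
    mostly_at_min = median(desired_samples) == min_replicas
    # Mirrors the trailing "> 3" in the query: small minimums are accepted
    # as the price of redundancy.
    return mostly_at_min and min_replicas > redundancy_floor


if __name__ == "__main__":
    print(hpa_underutilized([4, 4, 4, 8, 10], min_replicas=4))
    print(hpa_underutilized([2, 2, 2, 5], min_replicas=2))
```

The first call describes an HPA that sits at a minimum of 4 most of the day and would be flagged; the second sits at its minimum just as often but is skipped because its minimum of 2 is within the redundancy allowance.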
skoenig 2023-01-06 17:14:47 +01:00
parent ae1d84c788
commit 0b213a9c5d

@@ -1708,6 +1708,11 @@ groups:
query: 'kube_horizontalpodautoscaler_status_desired_replicas >= kube_horizontalpodautoscaler_spec_max_replicas'
severity: info
for: 2m
- name: Kubernetes HPA underutilized
description: HPA is constantly at minimum replicas for 50% of the time
query: 'max(quantile_over_time(0.5, kube_horizontalpodautoscaler_status_desired_replicas[1d]) == kube_horizontalpodautoscaler_spec_min_replicas) by (horizontalpodautoscaler) > 3' # allow minimum 3 replicas running
severity: info
for: 5m
- name: Kubernetes Pod not healthy
description: Pod has been in a non-ready state for longer than 15 minutes.
query: 'sum by (namespace, pod) (kube_pod_status_phase{phase=~"Pending|Unknown|Failed"}) > 0'