Mirror of https://github.com/samber/awesome-prometheus-alerts.git, synced 2026-06-20 16:46:37 +08:00
LiteLLM (https://github.com/BerriAI/litellm) is a popular LLM gateway/proxy that exposes Prometheus metrics via its built-in callback. There were no existing alerting rules for LiteLLM in this repo, despite its growing adoption as an OpenAI/Anthropic-compatible proxy.

Added 3 alerts covering the most common operational concerns:

1. **LiteLLM provider spend over budget**: soft warning on cumulative 24h spend per model-name regex. Useful when LiteLLM's native `provider_budget_config` hard cap is unavailable, disabled, or buggy (e.g. BerriAI/litellm#26701).
2. **LiteLLM proxy failed requests rate high**: error-rate ratio alert for downstream LLM provider availability/auth issues.
3. **LiteLLM request latency p95 high**: histogram-quantile alert for downstream provider response-time degradation.

All 3 rules tested via `promtool check rules` (SUCCESS) and validated on a real LiteLLM v1.83.7 production deployment.

Reference: https://docs.litellm.ai/docs/proxy/prometheus

| File |
|---|
| rules.yml |
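For orientation, the three alerts described in the commit message might be expressed roughly as follows. This is a minimal sketch, not the contents of the actual `rules.yml`: the metric names (`litellm_spend_metric_total`, `litellm_proxy_failed_requests_metric_total`, `litellm_proxy_total_requests_metric_total`, `litellm_request_total_latency_metric_bucket`) and the `model` label are assumptions based on the LiteLLM Prometheus documentation and should be checked against the proxy's own `/metrics` output. The alert names, the `gpt-4.*` regex, and every threshold and `for` duration are hypothetical placeholders.

```yaml
groups:
  - name: litellm
    rules:
      # 1. Soft warning on cumulative 24h spend for models matching a regex.
      #    Assumed: spend counter name and `model` label; the 50 (USD) limit
      #    and the regex are placeholders.
      - alert: LitellmProviderSpendOverBudget
        expr: sum(increase(litellm_spend_metric_total{model=~"gpt-4.*"}[24h])) > 50
        for: 0m
        labels:
          severity: warning
        annotations:
          summary: LiteLLM provider spend over budget
          description: "Spend over the last 24h is {{ $value }} for the matched models."

      # 2. Ratio of failed to total proxy requests.
      #    Assumed: both counter names; the 5% threshold is a placeholder.
      - alert: LitellmProxyFailedRequestsRateHigh
        expr: >-
          sum(rate(litellm_proxy_failed_requests_metric_total[5m]))
          / sum(rate(litellm_proxy_total_requests_metric_total[5m])) > 0.05
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: LiteLLM proxy failed requests rate high
          description: "Failed request ratio is {{ $value }} over the last 5 minutes."

      # 3. p95 of total request latency, computed from histogram buckets.
      #    Assumed: histogram name; the 30 s threshold is a placeholder.
      - alert: LitellmRequestLatencyP95High
        expr: >-
          histogram_quantile(0.95,
          sum by (le) (rate(litellm_request_total_latency_metric_bucket[5m]))) > 30
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: LiteLLM request latency p95 high
          description: "p95 request latency is {{ $value }}s over the last 5 minutes."
```

A file of this shape can be validated the same way the commit message describes, with `promtool check rules rules.yml`. Aggregating with `sum(...)` before comparing keeps one alert per rule; grouping by `model` instead (for example `sum by (model) (...)`) would fire separately per model, assuming that label is present.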