🚨 Collection of Prometheus alerting rules
nucocloud 4c9da9ed24
Add LiteLLM section to Other group with 3 alerting rules (#553)
LiteLLM (https://github.com/BerriAI/litellm) is a popular LLM-gateway/proxy
that exposes Prometheus metrics via its built-in callback. There were no
existing alerting rules for LiteLLM in this repo, despite its growing
adoption as an OpenAI/Anthropic-compatible proxy.

Added 3 alerts covering the most common operational concerns:

1. **LiteLLM provider spend over budget** — soft-warning on cumulative
   24h spend per model-name regex. Useful when LiteLLM's native
   `provider_budget_config` hard-cap is unavailable, disabled, or
   buggy (e.g. BerriAI/litellm#26701).

2. **LiteLLM proxy failed requests rate high** — error-rate ratio
   alert for downstream LLM provider availability/auth issues.

3. **LiteLLM request latency p95 high** — histogram-quantile alert
   for downstream provider response-time degradation.

All 3 rules tested via `promtool check rules` (SUCCESS) and validated
on a real LiteLLM v1.83.7 production deployment.

Reference: https://docs.litellm.ai/docs/proxy/prometheus
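The three alerts described above might look roughly like the rule group below. This is a minimal sketch, not the rules added in this commit: the metric names (`litellm_spend_metric_total`, `litellm_proxy_failed_requests_metric_total`, `litellm_proxy_total_requests_metric_total`, `litellm_request_total_latency_metric_bucket`) and the `model` label are assumptions based on the LiteLLM Prometheus docs referenced above, and every threshold, window, and regex is a placeholder to tune per deployment.

```yaml
groups:
  - name: litellm-sketch
    rules:
      # Soft warning on cumulative 24h spend for models matching a regex.
      # "claude.*" and the 50 (USD) budget are placeholder values.
      - alert: LitellmProviderSpendOverBudget
        expr: sum(increase(litellm_spend_metric_total{model=~"claude.*"}[24h])) > 50
        for: 0m
        labels:
          severity: warning
        annotations:
          summary: LiteLLM provider spend over budget
          description: "24h spend for the matched models is {{ $value }} USD"

      # Ratio of failed to total proxy requests over 5 minutes.
      # The 5% threshold is a placeholder.
      - alert: LitellmProxyFailedRequestsRateHigh
        expr: >
          sum(rate(litellm_proxy_failed_requests_metric_total[5m]))
          / sum(rate(litellm_proxy_total_requests_metric_total[5m])) > 0.05
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: LiteLLM proxy failed requests rate high
          description: "Failed request ratio is {{ $value }}"

      # p95 of total request latency, from histogram buckets.
      # The 30 second threshold is a placeholder.
      - alert: LitellmRequestLatencyP95High
        expr: >
          histogram_quantile(0.95,
            sum by (le) (rate(litellm_request_total_latency_metric_bucket[5m]))) > 30
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: LiteLLM request latency p95 high
          description: "p95 request latency is {{ $value }}s"
```

A file like this can be checked with `promtool check rules <file>` before loading it. When there is no traffic, the error-ratio expression evaluates to NaN, so the alert stays silent rather than firing.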
2026-04-29 15:03:07 +02:00
.github ci: pin Node.js to 24 for Astro 6 compatibility 2026-04-22 01:49:13 +02:00
_data Add LiteLLM section to Other group with 3 alerting rules (#553) 2026-04-29 15:03:07 +02:00
dist Publish 2026-04-21 22:56:04 +00:00
site chore: improve seo 2026-04-26 16:52:07 +02:00
.gitignore chore: generate pagefind index at build time, not committed to git 2026-04-14 20:33:29 +02:00
.travis.yml 💄 awesome-lint 2019-02-11 22:09:50 +01:00
CLAUDE.md feat/astro migration (#538) 2026-04-10 21:08:06 +02:00
CONTRIBUTING.md feat/astro migration (#538) 2026-04-10 21:08:06 +02:00
LICENSE fix: use https in CC BY URL and trigger site build on _data changes 2026-04-14 16:27:01 +02:00
package.json 💄 awesome-lint 2019-02-11 22:09:50 +01:00
README.md docs: update tagline and clean up README 2026-04-10 21:45:27 +02:00

👋 Awesome Prometheus Alerts

940+ production-ready Prometheus alerting rules for 90+ services — copy-paste YAML for Kubernetes, MySQL, Redis, Kafka, and more.

Collection available here: https://samber.github.io/awesome-prometheus-alerts

Contents

🚨 Rules

Basic resource monitoring

Databases

Message brokers

Proxies, load balancers and service meshes

Runtimes

Data engineering

Orchestrators

CI/CD

Network and security

Storage

Cloud providers

Observability

Other

🤝 Contributing

Contributions from the community (you!) are most welcome!

There are many ways to contribute: writing code, alerting rules, documentation, reporting issues, discussing better error tracking...

Instructions here

💫 Show your support

Give a ⭐️ if this project helped you!

support us

📝 License

See LICENSE for details.