refactor: remove previous website

Samuel Berthe 2026-04-07 17:14:38 +02:00
parent cc6835cdf0
commit 9b995315d5
No known key found for this signature in database
GPG key ID: 64863511FFBD0E3C
14 changed files with 29 additions and 1082 deletions

6
.gitignore vendored
@@ -1,9 +1,3 @@
# Jekyll (legacy)
_site/
.sass-cache/
.jekyll-cache/
.jekyll-metadata
# Generated data
_data/rules.json
test/rules/

@@ -6,17 +6,21 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
A curated collection of ~940 Prometheus alerting rules covering 90+ services across 100+ exporters, organized in categories: basic resource monitoring (Prometheus, host/hardware, SMART, Docker, Blackbox, Windows, VMware, Netdata), databases (MySQL, PostgreSQL, Redis, MongoDB, Elasticsearch, Cassandra, Clickhouse, CouchDB, etc.), message brokers (RabbitMQ, Kafka, Pulsar, Nats, Zookeeper), proxies/load balancers/service meshes (Nginx, Apache, HaProxy, Traefik, Caddy, Linkerd, Istio), runtimes (PHP-FPM, JVM, Sidekiq), data engineering (Apache Flink, Apache Spark, Hadoop), orchestrators (Kubernetes, Nomad, Consul, Etcd, OpenStack), CI/CD (Jenkins, ArgoCD, FluxCD, GitLab CI, Spinnaker), network and security (SSL/TLS, CoreDNS, Vault, Cloudflare, Cilium, eBPF), storage (Ceph, ZFS, OpenEBS, Minio), cloud providers (AWS, Azure, DigitalOcean), observability (Thanos, Loki, Cortex, OpenTelemetry Collector, Grafana Tempo/Mimir/Alloy, Jaeger), and other (APC UPS, Graph Node).
All rules are stored in a single YAML data file (`_data/rules.yml`) and rendered as a Jekyll-based GitHub Pages site at https://samber.github.io/awesome-prometheus-alerts. The site provides copy-pasteable Prometheus alert snippets and downloadable rule files per exporter.
All rules are stored in a single YAML data file (`_data/rules.yml`) and rendered as a static site built with Astro + TypeScript (located in `site/`). The site provides copy-pasteable Prometheus alert snippets and downloadable rule files per exporter.
The project is community-driven. Most contributions are PRs adding or updating rules in `_data/rules.yml`. Files in `dist/rules/` are auto-generated on merge — never edit them manually.
## Architecture
- **`_data/rules.yml`** — The single source of truth for all alerting rules. This is the main file contributors edit. It is NOT a valid Prometheus config; the site renders each rule into copy-pasteable Prometheus alert format.
- **`rules.md`** — Jekyll template that iterates over `_data/rules.yml` and renders the rules page with copy buttons and formatted YAML blocks.
- **`alertmanager.md`** — Static page with Prometheus/AlertManager configuration examples.
- **`_layouts/default.html`** — Site layout (Jekyll theme: cayman).
- **`_config.yml`** — Jekyll configuration.
- **`site/`** — Astro + TypeScript static site. Run `npm run dev` inside this directory to develop locally.
- **`site/src/data/rules.ts`** — Typed wrappers and helper functions over `_data/rules.yml`.
- **`site/src/data/site.ts`** — Shared site metadata constants (URLs, author, schema objects).
- **`site/src/pages/`** — Astro page routes: `index.astro` (homepage), `rules/[group]/[service].astro` (per-service rule pages), `alertmanager.astro`, `blackbox-exporter.astro`, `sleep-peacefully.astro` (guides).
- **`site/src/layouts/BaseLayout.astro`** — Root HTML layout (SEO, GA, dark mode).
- **`site/src/layouts/GuideLayout.astro`** — Layout for guide pages (TOC, hero, related guides).
- **`site/src/components/`** — Shared Astro components (Header, Footer, Sidebar, RuleCard, ExporterSection, etc.).
- **`site/astro.config.mjs`** — Astro configuration (sitemap, Vite YAML plugin, base URL).
- **`dist/rules/`** — Pre-built downloadable rule files organized by service/exporter (referenced in the site for `wget` commands).
## Rules YAML Structure
@@ -50,19 +54,20 @@ Services are grouped in category. If you are not sure about the classification,
## Running Locally
```bash
# With Ruby/Bundler
gem install bundler
bundle install
jekyll serve
# With Docker Compose
docker compose up -d
# With Docker directly
docker run --rm -it -p 4000:4000 -v $(pwd):/srv/jekyll jekyll/jekyll jekyll serve
cd site
npm install
npm run dev
```
Site serves at http://localhost:4000/awesome-prometheus-alerts.
Site serves at http://localhost:4321/awesome-prometheus-alerts.
To build for production:
```bash
cd site
npm run build
npm run preview
```
## Contributing Rules

@@ -16,24 +16,16 @@ Please ensure your pull request adheres to the following guidelines:
- Description must be factual (the "what?") and should provide root cause suggestions (the "why?"), for faster resolution.
- Queries must be tested on latest exporter version.
## Improving Github page
## Improving the website
The site is built with Astro + TypeScript, located in `site/`.
### Run locally
```
gem install bundler
bundle install
jekyll serve
cd site
npm install
npm run dev
```
Or with Docker:
```
docker run --rm -it -p 4000:4000 -v $(pwd):/srv/jekyll jekyll/jekyll jekyll serve
```
Or with Docker Compose:
```
docker compose up -d
```
Site serves at http://localhost:4321/awesome-prometheus-alerts.

@@ -1,3 +0,0 @@
source 'https://rubygems.org'
gem 'github-pages', '>= 232', group: :jekyll_plugins
gem 'webrick', '~> 1.8'

@@ -1,293 +0,0 @@
GEM
remote: https://rubygems.org/
specs:
activesupport (7.2.3.1)
base64
benchmark (>= 0.3)
bigdecimal
concurrent-ruby (~> 1.0, >= 1.3.1)
connection_pool (>= 2.2.5)
drb
i18n (>= 1.6, < 2)
logger (>= 1.4.2)
minitest (>= 5.1, < 6)
securerandom (>= 0.3)
tzinfo (~> 2.0, >= 2.0.5)
addressable (2.8.9)
public_suffix (>= 2.0.2, < 8.0)
base64 (0.3.0)
benchmark (0.5.0)
bigdecimal (4.0.1)
coffee-script (2.4.1)
coffee-script-source
execjs
coffee-script-source (1.12.2)
colorator (1.1.0)
commonmarker (0.23.12)
concurrent-ruby (1.3.6)
connection_pool (3.0.2)
csv (3.3.5)
dnsruby (1.73.1)
base64 (>= 0.2)
logger (~> 1.6)
simpleidn (~> 0.2.1)
drb (2.2.3)
em-websocket (0.5.3)
eventmachine (>= 0.12.9)
http_parser.rb (~> 0)
ethon (0.18.0)
ffi (>= 1.15.0)
logger
eventmachine (1.2.7)
execjs (2.10.0)
faraday (2.14.1)
faraday-net_http (>= 2.0, < 3.5)
json
logger
faraday-net_http (3.4.2)
net-http (~> 0.5)
ffi (1.17.3)
ffi (1.17.3-x86_64-linux-gnu)
ffi (1.17.3-x86_64-linux-musl)
forwardable-extended (2.6.0)
gemoji (4.1.0)
github-pages (232)
github-pages-health-check (= 1.18.2)
jekyll (= 3.10.0)
jekyll-avatar (= 0.8.0)
jekyll-coffeescript (= 1.2.2)
jekyll-commonmark-ghpages (= 0.5.1)
jekyll-default-layout (= 0.1.5)
jekyll-feed (= 0.17.0)
jekyll-gist (= 1.5.0)
jekyll-github-metadata (= 2.16.1)
jekyll-include-cache (= 0.2.1)
jekyll-mentions (= 1.6.0)
jekyll-optional-front-matter (= 0.3.2)
jekyll-paginate (= 1.1.0)
jekyll-readme-index (= 0.3.0)
jekyll-redirect-from (= 0.16.0)
jekyll-relative-links (= 0.6.1)
jekyll-remote-theme (= 0.4.3)
jekyll-sass-converter (= 1.5.2)
jekyll-seo-tag (= 2.8.0)
jekyll-sitemap (= 1.4.0)
jekyll-swiss (= 1.0.0)
jekyll-theme-architect (= 0.2.0)
jekyll-theme-cayman (= 0.2.0)
jekyll-theme-dinky (= 0.2.0)
jekyll-theme-hacker (= 0.2.0)
jekyll-theme-leap-day (= 0.2.0)
jekyll-theme-merlot (= 0.2.0)
jekyll-theme-midnight (= 0.2.0)
jekyll-theme-minimal (= 0.2.0)
jekyll-theme-modernist (= 0.2.0)
jekyll-theme-primer (= 0.6.0)
jekyll-theme-slate (= 0.2.0)
jekyll-theme-tactile (= 0.2.0)
jekyll-theme-time-machine (= 0.2.0)
jekyll-titles-from-headings (= 0.5.3)
jemoji (= 0.13.0)
kramdown (= 2.4.0)
kramdown-parser-gfm (= 1.1.0)
liquid (= 4.0.4)
mercenary (~> 0.3)
minima (= 2.5.1)
nokogiri (>= 1.16.2, < 2.0)
rouge (= 3.30.0)
terminal-table (~> 1.4)
webrick (~> 1.8)
github-pages-health-check (1.18.2)
addressable (~> 2.3)
dnsruby (~> 1.60)
octokit (>= 4, < 8)
public_suffix (>= 3.0, < 6.0)
typhoeus (~> 1.3)
html-pipeline (2.14.3)
activesupport (>= 2)
nokogiri (>= 1.4)
http_parser.rb (0.8.1)
i18n (1.14.8)
concurrent-ruby (~> 1.0)
jekyll (3.10.0)
addressable (~> 2.4)
colorator (~> 1.0)
csv (~> 3.0)
em-websocket (~> 0.5)
i18n (>= 0.7, < 2)
jekyll-sass-converter (~> 1.0)
jekyll-watch (~> 2.0)
kramdown (>= 1.17, < 3)
liquid (~> 4.0)
mercenary (~> 0.3.3)
pathutil (~> 0.9)
rouge (>= 1.7, < 4)
safe_yaml (~> 1.0)
webrick (>= 1.0)
jekyll-avatar (0.8.0)
jekyll (>= 3.0, < 5.0)
jekyll-coffeescript (1.2.2)
coffee-script (~> 2.2)
coffee-script-source (~> 1.12)
jekyll-commonmark (1.4.0)
commonmarker (~> 0.22)
jekyll-commonmark-ghpages (0.5.1)
commonmarker (>= 0.23.7, < 1.1.0)
jekyll (>= 3.9, < 4.0)
jekyll-commonmark (~> 1.4.0)
rouge (>= 2.0, < 5.0)
jekyll-default-layout (0.1.5)
jekyll (>= 3.0, < 5.0)
jekyll-feed (0.17.0)
jekyll (>= 3.7, < 5.0)
jekyll-gist (1.5.0)
octokit (~> 4.2)
jekyll-github-metadata (2.16.1)
jekyll (>= 3.4, < 5.0)
octokit (>= 4, < 7, != 4.4.0)
jekyll-include-cache (0.2.1)
jekyll (>= 3.7, < 5.0)
jekyll-mentions (1.6.0)
html-pipeline (~> 2.3)
jekyll (>= 3.7, < 5.0)
jekyll-optional-front-matter (0.3.2)
jekyll (>= 3.0, < 5.0)
jekyll-paginate (1.1.0)
jekyll-readme-index (0.3.0)
jekyll (>= 3.0, < 5.0)
jekyll-redirect-from (0.16.0)
jekyll (>= 3.3, < 5.0)
jekyll-relative-links (0.6.1)
jekyll (>= 3.3, < 5.0)
jekyll-remote-theme (0.4.3)
addressable (~> 2.0)
jekyll (>= 3.5, < 5.0)
jekyll-sass-converter (>= 1.0, <= 3.0.0, != 2.0.0)
rubyzip (>= 1.3.0, < 3.0)
jekyll-sass-converter (1.5.2)
sass (~> 3.4)
jekyll-seo-tag (2.8.0)
jekyll (>= 3.8, < 5.0)
jekyll-sitemap (1.4.0)
jekyll (>= 3.7, < 5.0)
jekyll-swiss (1.0.0)
jekyll-theme-architect (0.2.0)
jekyll (> 3.5, < 5.0)
jekyll-seo-tag (~> 2.0)
jekyll-theme-cayman (0.2.0)
jekyll (> 3.5, < 5.0)
jekyll-seo-tag (~> 2.0)
jekyll-theme-dinky (0.2.0)
jekyll (> 3.5, < 5.0)
jekyll-seo-tag (~> 2.0)
jekyll-theme-hacker (0.2.0)
jekyll (> 3.5, < 5.0)
jekyll-seo-tag (~> 2.0)
jekyll-theme-leap-day (0.2.0)
jekyll (> 3.5, < 5.0)
jekyll-seo-tag (~> 2.0)
jekyll-theme-merlot (0.2.0)
jekyll (> 3.5, < 5.0)
jekyll-seo-tag (~> 2.0)
jekyll-theme-midnight (0.2.0)
jekyll (> 3.5, < 5.0)
jekyll-seo-tag (~> 2.0)
jekyll-theme-minimal (0.2.0)
jekyll (> 3.5, < 5.0)
jekyll-seo-tag (~> 2.0)
jekyll-theme-modernist (0.2.0)
jekyll (> 3.5, < 5.0)
jekyll-seo-tag (~> 2.0)
jekyll-theme-primer (0.6.0)
jekyll (> 3.5, < 5.0)
jekyll-github-metadata (~> 2.9)
jekyll-seo-tag (~> 2.0)
jekyll-theme-slate (0.2.0)
jekyll (> 3.5, < 5.0)
jekyll-seo-tag (~> 2.0)
jekyll-theme-tactile (0.2.0)
jekyll (> 3.5, < 5.0)
jekyll-seo-tag (~> 2.0)
jekyll-theme-time-machine (0.2.0)
jekyll (> 3.5, < 5.0)
jekyll-seo-tag (~> 2.0)
jekyll-titles-from-headings (0.5.3)
jekyll (>= 3.3, < 5.0)
jekyll-watch (2.2.1)
listen (~> 3.0)
jemoji (0.13.0)
gemoji (>= 3, < 5)
html-pipeline (~> 2.2)
jekyll (>= 3.0, < 5.0)
json (2.19.2)
kramdown (2.4.0)
rexml
kramdown-parser-gfm (1.1.0)
kramdown (~> 2.0)
liquid (4.0.4)
listen (3.10.0)
logger
rb-fsevent (~> 0.10, >= 0.10.3)
rb-inotify (~> 0.9, >= 0.9.10)
logger (1.7.0)
mercenary (0.3.6)
mini_portile2 (2.8.9)
minima (2.5.1)
jekyll (>= 3.5, < 5.0)
jekyll-feed (~> 0.9)
jekyll-seo-tag (~> 2.1)
minitest (5.27.0)
net-http (0.9.1)
uri (>= 0.11.1)
nokogiri (1.19.1)
mini_portile2 (~> 2.8.2)
racc (~> 1.4)
nokogiri (1.19.1-x86_64-linux-gnu)
racc (~> 1.4)
nokogiri (1.19.1-x86_64-linux-musl)
racc (~> 1.4)
octokit (4.25.1)
faraday (>= 1, < 3)
sawyer (~> 0.9)
pathutil (0.16.2)
forwardable-extended (~> 2.6)
public_suffix (5.1.1)
racc (1.8.1)
rb-fsevent (0.11.2)
rb-inotify (0.11.1)
ffi (~> 1.0)
rexml (3.4.4)
rouge (3.30.0)
rubyzip (2.4.1)
safe_yaml (1.0.5)
sass (3.7.4)
sass-listen (~> 4.0.0)
sass-listen (4.0.0)
rb-fsevent (~> 0.9, >= 0.9.4)
rb-inotify (~> 0.9, >= 0.9.7)
sawyer (0.9.3)
addressable (>= 2.3.5)
faraday (>= 0.17.3, < 3)
securerandom (0.4.1)
simpleidn (0.2.3)
terminal-table (1.8.0)
unicode-display_width (~> 1.1, >= 1.1.1)
typhoeus (1.4.1)
ethon (>= 0.9.0)
tzinfo (2.0.6)
concurrent-ruby (~> 1.0)
unicode-display_width (1.8.0)
uri (1.1.1)
webrick (1.9.2)
PLATFORMS
ruby
x86_64-linux
x86_64-linux-musl
DEPENDENCIES
github-pages (>= 232)
webrick (~> 1.8)
BUNDLED WITH
2.3.25

@@ -179,7 +179,7 @@ There are many ways to contribute: writing code, alerting rules, documentation,
## 🏋️ Improvements
- Create an alert rule builder in Jekyll for custom alerts (severity, thresholds, instances...)
- Create an alert rule builder for custom alerts (severity, thresholds, instances...)
- Add resolution suggestions to rule descriptions, for faster incident resolution ([#85](https://github.com/samber/awesome-prometheus-alerts/issues/85)).
## 💫 Show your support

@@ -1,8 +0,0 @@
theme: jekyll-theme-cayman
title: Awesome Prometheus alerts
description: Collection of alerting rules
repository: samber/awesome-prometheus-alerts
baseurl: /awesome-prometheus-alerts

@@ -1,162 +0,0 @@
<!DOCTYPE html>
<html lang="{{ site.lang | default: "en-US" }}">
<head>
<meta charset="UTF-8">
{% seo %}
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="theme-color" content="#157878">
<meta name="apple-mobile-web-app-status-bar-style" content="black-translucent">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0/css/font-awesome.min.css">
<link rel="stylesheet" href="{{ '/assets/css/style.css?v=' | append: site.github.build_revision | relative_url }}">
<link rel="stylesheet" href="{{ '/assets/css/app.css?v=' | append: site.github.build_revision | relative_url }}">
<link rel="icon" type="image/x-icon" href="{{ '/assets/favicon.ico' | relative_url }}">
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.4.1/jquery.min.js"></script>
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.4.1/js/bootstrap.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/clipboard.js/2.0.4/clipboard.min.js"></script>
<script src="{{ '/assets/js/app.js?v=' | append: site.github.build_revision | relative_url }}"></script>
<!-- Global site tag (gtag.js) - Google Analytics -->
<script async src="https://www.googletagmanager.com/gtag/js?id=UA-118604063-2"></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag() {
dataLayer.push(arguments);
}
gtag('js', new Date());
gtag('config', 'UA-118604063-2');
</script>
</head>
<body>
<style>
#skip-to-content {
height: 1px;
width: 1px;
position: absolute;
overflow: hidden;
top: -10px;
&:focus {
position: fixed;
top: 10px;
left: 10px;
height: auto;
width: auto;
background: invert($body-link-color);
outline: thick solid invert($body-link-color);
}
}
ul.github-buttons-cta li {
display: inline-block;
height: 20px;
padding: 0px 15px;
}
ul.github-buttons-cta li a {
/* width: 100px; */
text-decoration: none;
}
.fa {
/* padding: 14px;
width: 50px;
height: 50px; */
font-size: 25px;
text-align: center;
text-decoration: none;
border-radius: 50%;
}
.fa:hover {
opacity: 0.8;
}
.fa-twitter,
.fa-linkedin {
/* background: #55ACEE; */
color: white;
}
</style>
<a id="skip-to-content" href="#content">Skip to the content.</a>
<header class="page-header" role="banner">
<h1 class="project-name">
<a href="{{ '/' | relative_url }}" style="color: white">
{{ site.title | default: site.github.repository_name }}
</a>
</h1>
<h2 class="project-tagline">{{ site.description | default: site.github.project_tagline }}</h2>
<a href="{{ '/alertmanager' | relative_url }}" class="btn">Global configuration</a>
<a href="{{ '/rules' | relative_url }}" class="btn">Rules</a>
<a href="{{ '/sleep-peacefully' | relative_url }}" class="btn">Sleep peacefully</a>
<a href="{{ '/blackbox-exporter' | relative_url }}" class="btn">Blackbox</a>
<a href="https://github.com/samber/awesome-prometheus-alerts/blob/master/CONTRIBUTING.md" class="btn">
Contribute on GitHub
</a>
<ul class="github-buttons-cta">
<li>
<a href="https://github.com/samber/awesome-prometheus-alerts">
<img alt="GitHub Repo Watchers" src="https://img.shields.io/github/watchers/samber/awesome-prometheus-alerts?style=social">
</a>
</li>
<li>
<a href="https://github.com/samber/awesome-prometheus-alerts">
<img alt="GitHub Repo stars" src="https://img.shields.io/github/stars/samber/awesome-prometheus-alerts?style=social">
</a>
</li>
<li>
<a href="https://github.com/samber/awesome-prometheus-alerts">
<img alt="GitHub Repo forks" src="https://img.shields.io/github/forks/samber/awesome-prometheus-alerts?style=social">
</a>
</li>
<li>
<a href="https://twitter.com/share?via=samuelberthe&related=samuelberthe&text=🚨 📊 Here is a collection of Awesome Prometheus Alerts&url=https://samber.github.io/awesome-prometheus-alerts"
class="fa fa-twitter" target="_blank"></a>
</li>
<li>
<a href="http://www.linkedin.com/shareArticle?mini=true&url=https://samber.github.io/awesome-prometheus-alerts/"
class="fa fa-linkedin" target="_blank"></a>
</li>
</ul>
<ul id="sponsoring">
<li>
Kindly supported by&nbsp; 👉
</li>
<li>
<a href="https://cast.ai/samuel">
<img width="" src="assets/sponsor-cast-ai.png" />
</a>
</li>
<li>
<a href="https://betterstack.com/">
<img width="" src="assets/sponsor-betterstack.png" />
</a>
</li>
</ul>
</header>
<main id="content" class="main-content" role="main">
{{ content }}
<footer class="site-footer">
{% if site.github.is_project_page %}
<span class="site-footer-owner">
<a href="{{ site.github.repository_url }}">{{ site.title }}</a> is maintained by
<a href="{{ site.github.owner_url }}">{{ site.github.owner_name }}</a>.
</span>
{% endif %}
</footer>
</main>
</body>
</html>

@@ -1,141 +0,0 @@
<h1 style="text-align: center;">
Global configuration
</h1>
If you notice a delay between an event and the first notification, read the following blog post => [https://pracucci.com/prometheus-understanding-the-delays-on-alerting.html](https://pracucci.com/prometheus-understanding-the-delays-on-alerting.html).
## Prometheus configuration
{% highlight yaml %}
# prometheus.yml
global:
scrape_interval: 20s
# A short evaluation_interval will check alerting rules very often.
# It can be costly if you run Prometheus with 100+ alerts.
evaluation_interval: 20s
...
rule_files:
- 'alerts/*.yml'
scrape_configs:
...
{% endhighlight %}
{% highlight yaml %}
# alerts/example-redis.yml
groups:
- name: ExampleRedisGroup
rules:
- alert: ExampleRedisDown
expr: redis_up{} == 0
for: 2m
labels:
severity: critical
annotations:
summary: "Redis instance down"
description: "Whatever"
{% endhighlight %}
## AlertManager configuration
{% highlight yaml %}
{% raw %}
# alertmanager.yml
route:
# When a new group of alerts is created by an incoming alert, wait at
# least 'group_wait' to send the initial notification.
# This way ensures that you get multiple alerts for the same group that start
# firing shortly after another are batched together on the first
# notification.
group_wait: 10s
# When the first notification was sent, wait 'group_interval' to send a batch
# of new alerts that started firing for that group.
group_interval: 30s
# If an alert has successfully been sent, wait 'repeat_interval' to
# resend them.
repeat_interval: 30m
# A default receiver
receiver: "slack"
# All the above attributes are inherited by all child routes and can
# be overwritten on each.
routes:
- receiver: "slack"
group_wait: 10s
match_re:
severity: critical|warning
continue: true
- receiver: "pager"
group_wait: 10s
match_re:
severity: critical
continue: true
receivers:
- name: "slack"
slack_configs:
- api_url: 'https://hooks.slack.com/services/XXXXXXXXX/XXXXXXXXX/xxxxxxxxxxxxxxxxxxxxxxxxxxx'
send_resolved: true
channel: 'monitoring'
text: "{{ range .Alerts }}<!channel> {{ .Annotations.summary }}\n{{ .Annotations.description }}\n{{ end }}"
- name: "pager"
webhook_configs:
- url: http://a.b.c.d:8080/send/sms
send_resolved: true
{% endraw %}
{% endhighlight %}
## Reduce Prometheus server load
For expensive or frequent PromQL queries, Prometheus can precompute results with recording rules.
{% highlight yaml %}
{% raw %}
groups:
# first define the recorded rule
- name: ExampleRecordedGroup
rules:
- record: job:rabbitmq_queue_messages_delivered_total:rate:5m
expr: rate(rabbitmq_queue_messages_delivered_total[5m])
# then use it in alerts
- name: ExampleAlertingGroup
rules:
- alert: ExampleRabbitmqLowMessageDelivery
expr: sum(job:rabbitmq_queue_messages_delivered_total:rate:5m) < 10
for: 2m
labels:
severity: critical
annotations:
summary: "Low delivery rate in Rabbitmq queues"
{% endraw %}
{% endhighlight %}
## Troubleshooting
If the notification takes too much time to be triggered, check the following delays:
- `scrape_interval = 20s` (prometheus.yml)
- `evaluation_interval = 20s` (prometheus.yml)
- `increase(mysql_global_status_slow_queries[1m]) > 0` (alerts/example-mysql.yml)
- `for: 5m` (alerts/example-mysql.yml)
- `group_wait = 10s` (alertmanager.yml)
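The delays listed above can be added up to get a rough upper bound on the time between an event and its first notification. The snippet below is only a sketch: it assumes the stages simply add up, and the function name `worst_case_delay_seconds` is made up for illustration. It ignores network latency and scrape timeouts, and the `[1m]` range is not counted because it is the window the expression looks at rather than a waiting period.

```python
# Rough upper bound on the delay before the first notification.
# Assumption: each stage waits for its full interval and the stages add up.


def worst_case_delay_seconds(scrape_interval, evaluation_interval, for_duration, group_wait):
    """Sum the waiting periods of each stage, all expressed in seconds."""
    return scrape_interval + evaluation_interval + for_duration + group_wait


# Values taken from the list above.
delay = worst_case_delay_seconds(
    scrape_interval=20,       # prometheus.yml
    evaluation_interval=20,   # prometheus.yml
    for_duration=5 * 60,      # for: 5m in alerts/example-mysql.yml
    group_wait=10,            # alertmanager.yml
)
print(f"worst-case delay: {delay}s")
```

With these values the `for: 5m` clause dominates, so shortening it is usually the first thing to look at.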
Also read:
- [https://pracucci.com/prometheus-understanding-the-delays-on-alerting.html](https://pracucci.com/prometheus-understanding-the-delays-on-alerting.html).
- [https://hodovi.cc/blog/creating-awesome-alertmanager-templates-for-slack/](https://hodovi.cc/blog/creating-awesome-alertmanager-templates-for-slack/)
- [https://grafana.com/blog/2024/10/03/how-to-use-prometheus-to-efficiently-detect-anomalies-at-scale/](https://grafana.com/blog/2024/10/03/how-to-use-prometheus-to-efficiently-detect-anomalies-at-scale/)

@@ -1,125 +0,0 @@
<h1 style="text-align: center;">
Blackbox exporter
</h1>
## Worldwide probes
<a href="https://github.com/prometheus/blackbox_exporter" target="_blank">Blackbox Exporter</a> gives you the ability to probe endpoints over HTTP, HTTPS, DNS, TCP and ICMP.
You should deploy blackbox exporters in multiple Points of Presence around the globe to monitor latency. Feel free to use the following endpoints for your own projects:
- https://probe-<b>montreal</b>.cleverapps.io
- https://probe-<b>paris</b>.cleverapps.io
- https://probe-<b>jeddah</b>.cleverapps.io
- https://probe-<b>singapore</b>.cleverapps.io
- https://probe-<b>sydney</b>.cleverapps.io
- https://probe-<b>warsaw</b>.cleverapps.io
☝️ Logs have been disabled. More probes from the community would be appreciated, please contribute <a href="https://github.com/samber/awesome-prometheus-alerts/" target="_blank">here</a>! These blackbox exporters use the following <a href="https://github.com/samber/blackbox_exporter/blob/master/samber.yml" target="_blank">configuration</a>.
## Prometheus Configuration
Blackbox exporters and endpoints must be declared in Prometheus. Here is a simple configuration, inspired by [Hayk Davtyan medium post](https://medium.com/geekculture/single-prometheus-job-for-dozens-of-blackbox-exporters-2a7ba492d6c8):
```yml
# sd/blackbox.yml
- targets:
#
# Montreal
#
# http
- probe-montreal.cleverapps.io:_:http_2xx:_:Montreal:_:f229cy:_:https://api.screeb.app
- probe-montreal.cleverapps.io:_:http_2xx:_:Montreal:_:f229cy:_:https://t.screeb.app/tag.js
# icmp
- probe-montreal.cleverapps.io:_:icmp_ipv4:_:Montreal:_:f229cy:_:api.screeb.app
- probe-montreal.cleverapps.io:_:icmp_ipv4:_:Montreal:_:f229cy:_:t.screeb.app
#
# Paris
#
# http
- probe-paris.cleverapps.io:_:http_2xx:_:Paris:_:u09tgy:_:https://api.screeb.app
- probe-paris.cleverapps.io:_:http_2xx:_:Paris:_:u09tgy:_:https://t.screeb.app/tag.js
# icmp
- probe-paris.cleverapps.io:_:icmp_ipv4:_:Paris:_:u09tgy:_:api.screeb.app
- probe-paris.cleverapps.io:_:icmp_ipv4:_:Paris:_:u09tgy:_:t.screeb.app
#
# Sydney
#
# http
- probe-sydney.cleverapps.io:_:http_2xx:_:Sydney:_:r3gpkn:_:https://api.screeb.app
- probe-sydney.cleverapps.io:_:http_2xx:_:Sydney:_:r3gpkn:_:https://t.screeb.app/tag.js
# icmp
- probe-sydney.cleverapps.io:_:icmp_ipv4:_:Sydney:_:r3gpkn:_:api.screeb.app
- probe-sydney.cleverapps.io:_:icmp_ipv4:_:Sydney:_:r3gpkn:_:t.screeb.app
# ...
```
```yml
# prometheus.yml
global:
# ...
scrape_configs:
- job_name: 'blackbox'
metrics_path: /probe
scrape_interval: 30s
scheme: https
file_sd_configs:
- files:
- /etc/prometheus/sd/blackbox.yml
relabel_configs:
# adds "module" label in the final labelset
- source_labels: [__address__]
regex: '.*:_:(.*):_:.*:_:.*:_:.*'
target_label: module
# adds "geohash" label in the final labelset
- source_labels: [__address__]
regex: '.*:_:.*:_:.*:_:(.*):_:.*'
target_label: geohash
# rewrites "instance" label with corresponding URL
- source_labels: [__address__]
regex: '.*:_:.*:_:.*:_:.*:_:(.*)'
target_label: instance
# rewrites "pop" label with corresponding location name
- source_labels: [__address__]
regex: '.*:_:.*:_:(.*):_:.*:_:.*'
target_label: pop
# passes "module" parameter to Blackbox exporter
- source_labels: [module]
target_label: __param_module
# passes "target" parameter to Blackbox exporter
- source_labels: [instance]
target_label: __param_target
# the Blackbox exporter's real hostname:port
- source_labels: [__address__]
regex: '(.*):_:.*:_:.*:_:.*:_:.*'
target_label: __address__
# ...
```
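To see what the relabeling rules extract from one target line, here is a small Python sketch that applies the same regular expressions outside Prometheus. It is an illustration only: the `relabel` helper and the `RELABEL_RULES` table are hypothetical names, and `re.fullmatch` stands in for the way Prometheus anchors relabel regexes at both ends.

```python
import re

# Same patterns as in relabel_configs, keyed by target label.
RELABEL_RULES = {
    "module": r".*:_:(.*):_:.*:_:.*:_:.*",
    "pop": r".*:_:.*:_:(.*):_:.*:_:.*",
    "geohash": r".*:_:.*:_:.*:_:(.*):_:.*",
    "instance": r".*:_:.*:_:.*:_:.*:_:(.*)",
    "__address__": r"(.*):_:.*:_:.*:_:.*:_:.*",
}


def relabel(address):
    """Return the labels derived from one file_sd target string."""
    labels = {}
    for label, pattern in RELABEL_RULES.items():
        match = re.fullmatch(pattern, address)
        if match:
            labels[label] = match.group(1)
    # The two parameters passed to the exporter are copies of other labels.
    if "module" in labels:
        labels["__param_module"] = labels["module"]
    if "instance" in labels:
        labels["__param_target"] = labels["instance"]
    return labels


target = "probe-paris.cleverapps.io:_:http_2xx:_:Paris:_:u09tgy:_:https://api.screeb.app"
for name, value in relabel(target).items():
    print(f"{name} = {value}")
```

Each target string carries five fields separated by `:_:`: exporter host, module, location name, geohash and probed URL. A target that does not contain exactly four separators matches none of the patterns and gets no extra labels in this sketch.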
## Geohash
![](assets/grafana-map-panel.png)
To display nice maps in Grafana, you need to instruct blackbox exporters about the location. Grafana map panel speaks the "geohash" format:
- Go to Google Maps
- Extract the latitude/longitude from the URL
- Convert the latitude/longitude to a geohash here: http://geohash.co
## Grafana
Some great dashboards have been created by the community: https://grafana.com/grafana/dashboards/?search=blackbox
Since Grafana v5.0.0, a map panel is available: https://grafana.com/docs/grafana/latest/panels-visualizations/visualizations/geomap/

@@ -1,11 +0,0 @@
version: '3'
services:
jekyll:
image: jekyll/jekyll:latest
command: jekyll serve
volumes:
- ./:/srv/jekyll
ports:
- 4000:4000

@@ -1,54 +0,0 @@
<style>
.center-image
{
margin: 0 auto;
display: block;
}
</style>
![Prometheus logo](/assets/prometheus-logo.png){: .center-image }
<h2>
Hello world
</h2>
<a href="/awesome-prometheus-alerts/alertmanager">
AlertManager configuration
</a>
<a href="/awesome-prometheus-alerts/sleep-peacefully">
Alerting time window
</a>
<h2>
Out of the box prometheus alerting rules
</h2>
<ul>
{% for group in site.data.rules.groups %}
<li style="margin-top: 30px;">
{% assign nbrRules = 0 %}
{% for service in group.services %}
{% for exporter in service.exporters %}
{% for rule in exporter.rules %}
{% assign nbrRules = nbrRules | plus: 1 %}
{% endfor %}
{% endfor %}
{% endfor %}
<h3>{{ group.name }} <small style="margin-left: 20px;">({{ nbrRules }} rules)</small></h3>
<ul>
{% for service in group.services %}
<li>
<a href="/awesome-prometheus-alerts/rules#{{ service.name | replace: " ", "-" | downcase }}">
{{ service.name }}
</a>
</li>
{% endfor %}
</ul>
</li>
{% endfor %}
</ul>

141
rules.md
@@ -1,141 +0,0 @@
<style>
ul {
list-style: none;
}
</style>
<!-- CAUTIONS -->
<div style="padding: 20px 20px 10px 20px; border: solid grey 1px; border-radius: 10px;">
<h2 style="text-align:center;">⚠️ Caution ⚠️</h2>
<p style="text-align:center;">
Alert thresholds depend on the nature of your applications.
<br>
Some queries on this page may use arbitrary tolerance thresholds.
<br><br>
Building an efficient and battle-tested monitoring platform takes time. 😉
</p>
</div>
<br>
<br>
<h1></h1>
<!-- RULES -->
<ul>
{% for group in site.data.rules.groups %}
{% assign groupIndex = forloop.index %}
{% for service in group.services %}
{% assign serviceIndex = forloop.index %}
{% assign nbrExporters = service.exporters | size %}
{% for exporter in service.exporters %}
{% assign exporterIndex = forloop.index %}
{% assign nbrRules = exporter.rules | size %}
<li>
{% assign serviceId = service.name | replace: " ", "-" | downcase %}
<h2 id="{{ serviceId }}">
<span id="{{ serviceId }}-{{ exporterIndex }}"></span>
<a class="anchor" href="#{{ serviceId }}-{{ exporterIndex }}">#</a>
{{ groupIndex }}.{{ serviceIndex }}.{% if nbrExporters > 1 %}{{ exporterIndex }}.{% endif %}
{{ service.name }}
{% if exporter.name %}:
{% if exporter.doc_url %}
<a href="{{ exporter.doc_url }}">
{{ exporter.name }}
</a>
{% else %}
{{ exporter.name }}
{% endif %}
{% endif %}
{% if nbrRules > 0 %}
<small style="font-size: 60%; vertical-align: middle; margin-left: 10px;">
({{ nbrRules }} rules)
</small>
<span class="clipboard-multiple" data-clipboard-target-id="group-{{ groupIndex }}-service-{{ serviceIndex }}-exporter-{{ exporterIndex }}">[copy section]</span>
{% endif %}
</h2>
{% if nbrRules == 0 %}
{% highlight javascript %}
// @TODO: Please contribute => https://github.com/samber/awesome-prometheus-alerts 👋
{% endhighlight %}
{% else %}
{{ exporter.comments | strip | newline_to_br }}
{% highlight bash %}
$ wget https://raw.githubusercontent.com/samber/awesome-prometheus-alerts/refs/heads/master/dist/rules/{{ service.name | replace: " ", "-" | downcase }}/{{ exporter.slug }}.yml
{% endhighlight %}
{% endif %}
<ul>
{% for rule in exporter.rules %}
{% assign ruleIndex = forloop.index %}
{% assign comments = rule.comments | strip | newline_to_br | split: '<br />' %}
<li>
<h4 id="rule-{{ serviceId }}-{{ exporterIndex }}-{{ ruleIndex }}">
<span id="rule-{{ serviceId }}-{{ ruleIndex }}"></span><!-- @deprecated -->
<a class="anchor" href="#rule-{{ serviceId }}-{{ exporterIndex }}-{{ ruleIndex }}">#</a>
{{ groupIndex}}.{{ serviceIndex }}.{% if nbrExporters > 1 %}{{ exporterIndex }}.{% endif %}{{ ruleIndex }}.
{{ rule.name }}
</h4>
<summary>
{{ rule.description }}
<span class="clipboard-single" data-clipboard-target-id="group-{{ groupIndex }}-service-{{ serviceIndex }}-exporter-{{ exporterIndex }}-rule-{{ ruleIndex }}" onclick="event.preventDefault();">[copy]</span>
</summary>
<div id="group-{{ groupIndex }}-service-{{ serviceIndex }}-exporter-{{ exporterIndex }}-rule-{{ ruleIndex }}">
{% assign ruleName = rule.name | split: ' ' %}
{% capture ruleNameCamelcase %}{% for word in ruleName %}{{ word | capitalize }} {% endfor %}{% endcapture %}
{% highlight yaml %}
{% for comment in comments %}# {{ comment | strip }}
{% endfor %}- alert: {{ ruleNameCamelcase | remove: ' ' }}
expr: {{ rule.query }}
for: {% if rule.for %}{{ rule.for }}{% else %}0m{% endif %}
labels:
severity: {{ rule.severity }}
annotations:
summary: {{ rule.name }} (instance {% raw %}{{ $labels.instance }}{% endraw %})
description: "{{ rule.description | replace: '"', '\"' }}\n VALUE = {% raw %}{{ $value }}{% endraw %}\n LABELS = {% raw %}{{ $labels }}{% endraw %}"
{% endhighlight %}
</div>
<br/>
</li>
{% endfor %}
</ul>
<hr/>
</li>
{% endfor %}
{% endfor %}
{% endfor %}
</ul>
<!-- NAVBAR -->
<div id="rules-navbar" class="affix">
<h3>Menu</h3>
<ul>
{% for group in site.data.rules.groups %}
<li>
<h4>{{ group.name }}</h4>
<ul>
{% for service in group.services %}
<li>
<a href="#{{ service.name | replace: " ", "-" | downcase }}">
👉 {{ service.name }}
</a>
</li>
{% endfor %}
</ul>
</li>
{% endfor %}
</ul>
<script>
$('#rules-navbar').affix({offset: {top: 750} }).css('display', 'block');
</script>
</div>
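The template above derives each alert name from `rule.name` by splitting on spaces, capitalizing every word and removing the spaces. A minimal Python sketch of that transformation follows. It assumes Liquid's `capitalize` filter behaves like Python's `str.capitalize` (first character upper-cased, the rest lower-cased); `alert_name` is a hypothetical helper and not part of the repository.

```python
def alert_name(rule_name):
    """Turn a human-readable rule name into a CamelCase alert name."""
    # Split on single spaces, as the template does with split: ' '.
    words = rule_name.split(" ")
    # Capitalize each word, then join without separators.
    return "".join(word.capitalize() for word in words)


print(alert_name("Host out of memory"))
```

Under that assumption, acronyms lose their casing: `MySQL down` would become `MysqlDown`.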

@@ -1,106 +0,0 @@
<h1 style="text-align: center;">
Sleep Peacefully
</h1>
## Alerting time window
In some applications, load and activity can vary over the day/week/year.
To prevent alarm fatigue and a busy pager, alerts can be disabled during certain periods (such as nights or weekends).
Example:
- Weekday: `node_load5 > 10 and ON() (0 < day_of_week() < 6)`
- Day time: `node_load5 > 10 and ON() (8 < hour() < 18)`
- Exclude December: `node_load5 > 10 and ON() (month() != 12)`
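To check which timestamps these conditions let through, the same comparisons can be reproduced in Python. This is a sketch under two assumptions: PromQL's `day_of_week()` counts from 0 (Sunday) to 6 (Saturday), and all time functions work in UTC. The helper names are made up for illustration and expect timezone-aware datetimes.

```python
from datetime import datetime, timezone


def promql_day_of_week(ts):
    """Day of week in UTC, numbered as in PromQL: 0 = Sunday ... 6 = Saturday."""
    return ts.astimezone(timezone.utc).isoweekday() % 7


def is_weekday(ts):
    # Mirrors: 0 < day_of_week() < 6
    return 0 < promql_day_of_week(ts) < 6


def is_day_time(ts):
    # Mirrors: 8 < hour() < 18, so 09:00 to 17:59 UTC
    return 8 < ts.astimezone(timezone.utc).hour < 18


def is_not_december(ts):
    # Mirrors: month() != 12
    return ts.astimezone(timezone.utc).month != 12


wednesday_noon = datetime(2024, 1, 3, 12, 0, tzinfo=timezone.utc)
saturday_noon = datetime(2024, 1, 6, 12, 0, tzinfo=timezone.utc)
print(is_weekday(wednesday_noon), is_weekday(saturday_noon))
```

Because everything is evaluated in UTC, the "day time" window shifts for teams in other timezones.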
## Advanced time windows and timezones
```yml
# rules.yml
groups:
- name: timezones
rules:
- record: european_summer_time_offset
expr: |
(vector(1) and (month() > 3 and month() < 10))
or
(vector(1) and (month() == 3 and (day_of_month() - day_of_week()) >= 25) and absent((day_of_month() >= 25) and (day_of_week() == 0)))
or
(vector(1) and (month() == 10 and (day_of_month() - day_of_week()) < 25) and absent((day_of_month() >= 25) and (day_of_week() == 0)))
or
(vector(1) and ((month() == 10 and hour() < 1) or (month() == 3 and hour() > 0)) and ((day_of_month() >= 25) and (day_of_week() == 0)))
or
vector(0)
- record: europe_london_time
expr: time() + 3600 * european_summer_time_offset
- record: europe_paris_time
expr: time() + 3600 * (1 + european_summer_time_offset)
- record: europe_london_hour
expr: hour(europe_london_time)
- record: europe_paris_hour
expr: hour(europe_paris_time)
- record: europe_london_weekday
expr: 0 < day_of_week(europe_london_time) < 6
- record: europe_paris_weekday
expr: 0 < day_of_week(europe_paris_time) < 6
# opposite
- record: not_europe_london_weekday
expr: absent(europe_london_weekday)
- record: not_europe_paris_weekday
expr: absent(europe_paris_weekday)
- record: europe_london_business_hours
expr: 9 <= europe_london_hour < 18
- record: europe_paris_business_hours
expr: 9 <= europe_paris_hour < 18
# opposite
- record: not_europe_london_business_hours
expr: absent(europe_london_business_hours)
- record: not_europe_paris_business_hours
expr: absent(europe_paris_business_hours)
# new year's day / xmas / labor day / all saints' day / ...
- record: europe_french_public_holidays
expr: |
(vector(1) and month(europe_paris_time) == 1 and day_of_month(europe_paris_time) == 1)
or
(vector(1) and month(europe_paris_time) == 12 and day_of_month(europe_paris_time) == 25)
or
(vector(1) and month(europe_paris_time) == 5 and day_of_month(europe_paris_time) == 1)
or
(vector(1) and month(europe_paris_time) == 11 and day_of_month(europe_paris_time) == 1)
or
vector(0)
# opposite
- record: not_europe_french_public_holidays
expr: absent(europe_french_public_holidays)
```
```yml
# alerts.yml
groups:
- name: CPU Load
rules:
- alert: HighLoadQuietDuringWeekendAndNight
expr: node_load5 > 10 and ON() (europe_london_weekday and europe_paris_weekday)
- alert: HighLoadQuietDuringBackup
expr: node_load5 > 10 and ON() absent(hour() == 2)
- alert: HighLoad
expr: |
node_load5 > 20 and ON() (europe_london_weekday and europe_paris_weekday)
or
node_load5 > 10
```
## Sources
- [https://medium.com/@tom.fawcett/time-of-day-based-notifications-with-prometheus-and-alertmanager-1bf7a23b7695](https://medium.com/@tom.fawcett/time-of-day-based-notifications-with-prometheus-and-alertmanager-1bf7a23b7695)
- [https://promcon.io/2019-munich/slides/improved-alerting-with-prometheus-and-alertmanager.pdf](https://promcon.io/2019-munich/slides/improved-alerting-with-prometheus-and-alertmanager.pdf)