mirror of
https://github.com/samber/awesome-prometheus-alerts.git
synced 2026-06-25 02:46:59 +08:00
refactor: remove previous website
parent cc6835cdf0
commit 9b995315d5
14 changed files with 29 additions and 1082 deletions
6
.gitignore
vendored
|
|
@ -1,9 +1,3 @@
|
||||||
# Jekyll (legacy)
|
|
||||||
_site/
|
|
||||||
.sass-cache/
|
|
||||||
.jekyll-cache/
|
|
||||||
.jekyll-metadata
|
|
||||||
|
|
||||||
# Generated data
|
||||||
_data/rules.json
|
||||||
test/rules/
|
||||||
|
|
|
||||||
37
CLAUDE.md
|
|
@ -6,17 +6,21 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
|
||||||
|
|
||||||
A curated collection of ~940 Prometheus alerting rules covering 90+ services across 100+ exporters, organized in categories: basic resource monitoring (Prometheus, host/hardware, SMART, Docker, Blackbox, Windows, VMware, Netdata), databases (MySQL, PostgreSQL, Redis, MongoDB, Elasticsearch, Cassandra, Clickhouse, CouchDB, etc.), message brokers (RabbitMQ, Kafka, Pulsar, Nats, Zookeeper), proxies/load balancers/service meshes (Nginx, Apache, HaProxy, Traefik, Caddy, Linkerd, Istio), runtimes (PHP-FPM, JVM, Sidekiq), data engineering (Apache Flink, Apache Spark, Hadoop), orchestrators (Kubernetes, Nomad, Consul, Etcd, OpenStack), CI/CD (Jenkins, ArgoCD, FluxCD, GitLab CI, Spinnaker), network and security (SSL/TLS, CoreDNS, Vault, Cloudflare, Cilium, eBPF), storage (Ceph, ZFS, OpenEBS, Minio), cloud providers (AWS, Azure, DigitalOcean), observability (Thanos, Loki, Cortex, OpenTelemetry Collector, Grafana Tempo/Mimir/Alloy, Jaeger), and other (APC UPS, Graph Node).
|
||||||
|
|
||||||
All rules are stored in a single YAML data file (`_data/rules.yml`) and rendered as a Jekyll-based GitHub Pages site at https://samber.github.io/awesome-prometheus-alerts. The site provides copy-pasteable Prometheus alert snippets and downloadable rule files per exporter.
|
All rules are stored in a single YAML data file (`_data/rules.yml`) and rendered as a static site built with Astro + TypeScript (located in `site/`). The site provides copy-pasteable Prometheus alert snippets and downloadable rule files per exporter.
|
||||||
|
|
||||||
The project is community-driven. Most contributions are PRs adding or updating rules in `_data/rules.yml`. Files in `dist/rules/` are auto-generated on merge — never edit them manually.
|
||||||
|
|
||||||
## Architecture
|
||||||
|
|
||||||
- **`_data/rules.yml`** — The single source of truth for all alerting rules. This is the main file contributors edit. It is NOT a valid Prometheus config; the site renders each rule into copy-pasteable Prometheus alert format.
|
||||||
- **`rules.md`** — Jekyll template that iterates over `_data/rules.yml` and renders the rules page with copy buttons and formatted YAML blocks.
|
- **`site/`** — Astro + TypeScript static site. Run `npm run dev` inside this directory to develop locally.
|
||||||
- **`alertmanager.md`** — Static page with Prometheus/AlertManager configuration examples.
|
- **`site/src/data/rules.ts`** — Typed wrappers and helper functions over `_data/rules.yml`.
|
||||||
- **`_layouts/default.html`** — Site layout (Jekyll theme: cayman).
|
- **`site/src/data/site.ts`** — Shared site metadata constants (URLs, author, schema objects).
|
||||||
- **`_config.yml`** — Jekyll configuration.
|
- **`site/src/pages/`** — Astro page routes: `index.astro` (homepage), `rules/[group]/[service].astro` (per-service rule pages), `alertmanager.astro`, `blackbox-exporter.astro`, `sleep-peacefully.astro` (guides).
|
||||||
|
- **`site/src/layouts/BaseLayout.astro`** — Root HTML layout (SEO, GA, dark mode).
|
||||||
|
- **`site/src/layouts/GuideLayout.astro`** — Layout for guide pages (TOC, hero, related guides).
|
||||||
|
- **`site/src/components/`** — Shared Astro components (Header, Footer, Sidebar, RuleCard, ExporterSection, etc.).
|
||||||
|
- **`site/astro.config.mjs`** — Astro configuration (sitemap, Vite YAML plugin, base URL).
|
||||||
- **`dist/rules/`** — Pre-built downloadable rule files organized by service/exporter (referenced in the site for `wget` commands).
|
||||||
|
|
||||||
## Rules YAML Structure
|
||||||
|
|
@ -50,19 +54,20 @@ Services are grouped in category. If you are not sure about the classification,
|
||||||
## Running Locally
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# With Ruby/Bundler
|
cd site
|
||||||
gem install bundler
|
npm install
|
||||||
bundle install
|
npm run dev
|
||||||
jekyll serve
|
|
||||||
|
|
||||||
# With Docker Compose
|
|
||||||
docker compose up -d
|
|
||||||
|
|
||||||
# With Docker directly
|
|
||||||
docker run --rm -it -p 4000:4000 -v $(pwd):/srv/jekyll jekyll/jekyll jekyll serve
|
|
||||||
```
|
```
|
||||||
|
|
||||||
Site serves at http://localhost:4000/awesome-prometheus-alerts.
|
Site serves at http://localhost:4321/awesome-prometheus-alerts.
|
||||||
|
|
||||||
|
To build for production:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
cd site
|
||||||
|
npm run build
|
||||||
|
npm run preview
|
||||||
|
```
|
||||||
|
|
||||||
## Contributing Rules
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -16,24 +16,16 @@ Please ensure your pull request adheres to the following guidelines:
|
||||||
- Description must be factual (the "what?") and should provide root cause suggestions (the "why?"), for faster resolution.
|
||||||
- Queries must be tested on latest exporter version.
|
||||||
|
|
||||||
## Improving Github page
|
## Improving the website
|
||||||
|
|
||||||
|
The site is built with Astro + TypeScript, located in `site/`.
|
||||||
|
|
||||||
### Run locally
|
||||||
|
|
||||||
```
|
```
|
||||||
gem install bundler
|
cd site
|
||||||
bundle install
|
npm install
|
||||||
jekyll serve
|
npm run dev
|
||||||
```
|
```
|
||||||
|
|
||||||
Or with Docker:
|
Site serves at http://localhost:4321/awesome-prometheus-alerts.
|
||||||
|
|
||||||
```
|
|
||||||
docker run --rm -it -p 4000:4000 -v $(pwd):/srv/jekyll jekyll/jekyll jekyll serve
|
|
||||||
```
|
|
||||||
|
|
||||||
Or with Docker Compose:
|
|
||||||
|
|
||||||
```
|
|
||||||
docker compose up -d
|
|
||||||
```
|
|
||||||
|
|
|
||||||
3
Gemfile
|
|
@ -1,3 +0,0 @@
|
||||||
source 'https://rubygems.org'
|
|
||||||
gem 'github-pages', '>= 232', group: :jekyll_plugins
|
|
||||||
gem 'webrick', '~> 1.8'
|
|
||||||
293
Gemfile.lock
|
|
@ -1,293 +0,0 @@
|
||||||
GEM
|
|
||||||
remote: https://rubygems.org/
|
|
||||||
specs:
|
|
||||||
activesupport (7.2.3.1)
|
|
||||||
base64
|
|
||||||
benchmark (>= 0.3)
|
|
||||||
bigdecimal
|
|
||||||
concurrent-ruby (~> 1.0, >= 1.3.1)
|
|
||||||
connection_pool (>= 2.2.5)
|
|
||||||
drb
|
|
||||||
i18n (>= 1.6, < 2)
|
|
||||||
logger (>= 1.4.2)
|
|
||||||
minitest (>= 5.1, < 6)
|
|
||||||
securerandom (>= 0.3)
|
|
||||||
tzinfo (~> 2.0, >= 2.0.5)
|
|
||||||
addressable (2.8.9)
|
|
||||||
public_suffix (>= 2.0.2, < 8.0)
|
|
||||||
base64 (0.3.0)
|
|
||||||
benchmark (0.5.0)
|
|
||||||
bigdecimal (4.0.1)
|
|
||||||
coffee-script (2.4.1)
|
|
||||||
coffee-script-source
|
|
||||||
execjs
|
|
||||||
coffee-script-source (1.12.2)
|
|
||||||
colorator (1.1.0)
|
|
||||||
commonmarker (0.23.12)
|
|
||||||
concurrent-ruby (1.3.6)
|
|
||||||
connection_pool (3.0.2)
|
|
||||||
csv (3.3.5)
|
|
||||||
dnsruby (1.73.1)
|
|
||||||
base64 (>= 0.2)
|
|
||||||
logger (~> 1.6)
|
|
||||||
simpleidn (~> 0.2.1)
|
|
||||||
drb (2.2.3)
|
|
||||||
em-websocket (0.5.3)
|
|
||||||
eventmachine (>= 0.12.9)
|
|
||||||
http_parser.rb (~> 0)
|
|
||||||
ethon (0.18.0)
|
|
||||||
ffi (>= 1.15.0)
|
|
||||||
logger
|
|
||||||
eventmachine (1.2.7)
|
|
||||||
execjs (2.10.0)
|
|
||||||
faraday (2.14.1)
|
|
||||||
faraday-net_http (>= 2.0, < 3.5)
|
|
||||||
json
|
|
||||||
logger
|
|
||||||
faraday-net_http (3.4.2)
|
|
||||||
net-http (~> 0.5)
|
|
||||||
ffi (1.17.3)
|
|
||||||
ffi (1.17.3-x86_64-linux-gnu)
|
|
||||||
ffi (1.17.3-x86_64-linux-musl)
|
|
||||||
forwardable-extended (2.6.0)
|
|
||||||
gemoji (4.1.0)
|
|
||||||
github-pages (232)
|
|
||||||
github-pages-health-check (= 1.18.2)
|
|
||||||
jekyll (= 3.10.0)
|
|
||||||
jekyll-avatar (= 0.8.0)
|
|
||||||
jekyll-coffeescript (= 1.2.2)
|
|
||||||
jekyll-commonmark-ghpages (= 0.5.1)
|
|
||||||
jekyll-default-layout (= 0.1.5)
|
|
||||||
jekyll-feed (= 0.17.0)
|
|
||||||
jekyll-gist (= 1.5.0)
|
|
||||||
jekyll-github-metadata (= 2.16.1)
|
|
||||||
jekyll-include-cache (= 0.2.1)
|
|
||||||
jekyll-mentions (= 1.6.0)
|
|
||||||
jekyll-optional-front-matter (= 0.3.2)
|
|
||||||
jekyll-paginate (= 1.1.0)
|
|
||||||
jekyll-readme-index (= 0.3.0)
|
|
||||||
jekyll-redirect-from (= 0.16.0)
|
|
||||||
jekyll-relative-links (= 0.6.1)
|
|
||||||
jekyll-remote-theme (= 0.4.3)
|
|
||||||
jekyll-sass-converter (= 1.5.2)
|
|
||||||
jekyll-seo-tag (= 2.8.0)
|
|
||||||
jekyll-sitemap (= 1.4.0)
|
|
||||||
jekyll-swiss (= 1.0.0)
|
|
||||||
jekyll-theme-architect (= 0.2.0)
|
|
||||||
jekyll-theme-cayman (= 0.2.0)
|
|
||||||
jekyll-theme-dinky (= 0.2.0)
|
|
||||||
jekyll-theme-hacker (= 0.2.0)
|
|
||||||
jekyll-theme-leap-day (= 0.2.0)
|
|
||||||
jekyll-theme-merlot (= 0.2.0)
|
|
||||||
jekyll-theme-midnight (= 0.2.0)
|
|
||||||
jekyll-theme-minimal (= 0.2.0)
|
|
||||||
jekyll-theme-modernist (= 0.2.0)
|
|
||||||
jekyll-theme-primer (= 0.6.0)
|
|
||||||
jekyll-theme-slate (= 0.2.0)
|
|
||||||
jekyll-theme-tactile (= 0.2.0)
|
|
||||||
jekyll-theme-time-machine (= 0.2.0)
|
|
||||||
jekyll-titles-from-headings (= 0.5.3)
|
|
||||||
jemoji (= 0.13.0)
|
|
||||||
kramdown (= 2.4.0)
|
|
||||||
kramdown-parser-gfm (= 1.1.0)
|
|
||||||
liquid (= 4.0.4)
|
|
||||||
mercenary (~> 0.3)
|
|
||||||
minima (= 2.5.1)
|
|
||||||
nokogiri (>= 1.16.2, < 2.0)
|
|
||||||
rouge (= 3.30.0)
|
|
||||||
terminal-table (~> 1.4)
|
|
||||||
webrick (~> 1.8)
|
|
||||||
github-pages-health-check (1.18.2)
|
|
||||||
addressable (~> 2.3)
|
|
||||||
dnsruby (~> 1.60)
|
|
||||||
octokit (>= 4, < 8)
|
|
||||||
public_suffix (>= 3.0, < 6.0)
|
|
||||||
typhoeus (~> 1.3)
|
|
||||||
html-pipeline (2.14.3)
|
|
||||||
activesupport (>= 2)
|
|
||||||
nokogiri (>= 1.4)
|
|
||||||
http_parser.rb (0.8.1)
|
|
||||||
i18n (1.14.8)
|
|
||||||
concurrent-ruby (~> 1.0)
|
|
||||||
jekyll (3.10.0)
|
|
||||||
addressable (~> 2.4)
|
|
||||||
colorator (~> 1.0)
|
|
||||||
csv (~> 3.0)
|
|
||||||
em-websocket (~> 0.5)
|
|
||||||
i18n (>= 0.7, < 2)
|
|
||||||
jekyll-sass-converter (~> 1.0)
|
|
||||||
jekyll-watch (~> 2.0)
|
|
||||||
kramdown (>= 1.17, < 3)
|
|
||||||
liquid (~> 4.0)
|
|
||||||
mercenary (~> 0.3.3)
|
|
||||||
pathutil (~> 0.9)
|
|
||||||
rouge (>= 1.7, < 4)
|
|
||||||
safe_yaml (~> 1.0)
|
|
||||||
webrick (>= 1.0)
|
|
||||||
jekyll-avatar (0.8.0)
|
|
||||||
jekyll (>= 3.0, < 5.0)
|
|
||||||
jekyll-coffeescript (1.2.2)
|
|
||||||
coffee-script (~> 2.2)
|
|
||||||
coffee-script-source (~> 1.12)
|
|
||||||
jekyll-commonmark (1.4.0)
|
|
||||||
commonmarker (~> 0.22)
|
|
||||||
jekyll-commonmark-ghpages (0.5.1)
|
|
||||||
commonmarker (>= 0.23.7, < 1.1.0)
|
|
||||||
jekyll (>= 3.9, < 4.0)
|
|
||||||
jekyll-commonmark (~> 1.4.0)
|
|
||||||
rouge (>= 2.0, < 5.0)
|
|
||||||
jekyll-default-layout (0.1.5)
|
|
||||||
jekyll (>= 3.0, < 5.0)
|
|
||||||
jekyll-feed (0.17.0)
|
|
||||||
jekyll (>= 3.7, < 5.0)
|
|
||||||
jekyll-gist (1.5.0)
|
|
||||||
octokit (~> 4.2)
|
|
||||||
jekyll-github-metadata (2.16.1)
|
|
||||||
jekyll (>= 3.4, < 5.0)
|
|
||||||
octokit (>= 4, < 7, != 4.4.0)
|
|
||||||
jekyll-include-cache (0.2.1)
|
|
||||||
jekyll (>= 3.7, < 5.0)
|
|
||||||
jekyll-mentions (1.6.0)
|
|
||||||
html-pipeline (~> 2.3)
|
|
||||||
jekyll (>= 3.7, < 5.0)
|
|
||||||
jekyll-optional-front-matter (0.3.2)
|
|
||||||
jekyll (>= 3.0, < 5.0)
|
|
||||||
jekyll-paginate (1.1.0)
|
|
||||||
jekyll-readme-index (0.3.0)
|
|
||||||
jekyll (>= 3.0, < 5.0)
|
|
||||||
jekyll-redirect-from (0.16.0)
|
|
||||||
jekyll (>= 3.3, < 5.0)
|
|
||||||
jekyll-relative-links (0.6.1)
|
|
||||||
jekyll (>= 3.3, < 5.0)
|
|
||||||
jekyll-remote-theme (0.4.3)
|
|
||||||
addressable (~> 2.0)
|
|
||||||
jekyll (>= 3.5, < 5.0)
|
|
||||||
jekyll-sass-converter (>= 1.0, <= 3.0.0, != 2.0.0)
|
|
||||||
rubyzip (>= 1.3.0, < 3.0)
|
|
||||||
jekyll-sass-converter (1.5.2)
|
|
||||||
sass (~> 3.4)
|
|
||||||
jekyll-seo-tag (2.8.0)
|
|
||||||
jekyll (>= 3.8, < 5.0)
|
|
||||||
jekyll-sitemap (1.4.0)
|
|
||||||
jekyll (>= 3.7, < 5.0)
|
|
||||||
jekyll-swiss (1.0.0)
|
|
||||||
jekyll-theme-architect (0.2.0)
|
|
||||||
jekyll (> 3.5, < 5.0)
|
|
||||||
jekyll-seo-tag (~> 2.0)
|
|
||||||
jekyll-theme-cayman (0.2.0)
|
|
||||||
jekyll (> 3.5, < 5.0)
|
|
||||||
jekyll-seo-tag (~> 2.0)
|
|
||||||
jekyll-theme-dinky (0.2.0)
|
|
||||||
jekyll (> 3.5, < 5.0)
|
|
||||||
jekyll-seo-tag (~> 2.0)
|
|
||||||
jekyll-theme-hacker (0.2.0)
|
|
||||||
jekyll (> 3.5, < 5.0)
|
|
||||||
jekyll-seo-tag (~> 2.0)
|
|
||||||
jekyll-theme-leap-day (0.2.0)
|
|
||||||
jekyll (> 3.5, < 5.0)
|
|
||||||
jekyll-seo-tag (~> 2.0)
|
|
||||||
jekyll-theme-merlot (0.2.0)
|
|
||||||
jekyll (> 3.5, < 5.0)
|
|
||||||
jekyll-seo-tag (~> 2.0)
|
|
||||||
jekyll-theme-midnight (0.2.0)
|
|
||||||
jekyll (> 3.5, < 5.0)
|
|
||||||
jekyll-seo-tag (~> 2.0)
|
|
||||||
jekyll-theme-minimal (0.2.0)
|
|
||||||
jekyll (> 3.5, < 5.0)
|
|
||||||
jekyll-seo-tag (~> 2.0)
|
|
||||||
jekyll-theme-modernist (0.2.0)
|
|
||||||
jekyll (> 3.5, < 5.0)
|
|
||||||
jekyll-seo-tag (~> 2.0)
|
|
||||||
jekyll-theme-primer (0.6.0)
|
|
||||||
jekyll (> 3.5, < 5.0)
|
|
||||||
jekyll-github-metadata (~> 2.9)
|
|
||||||
jekyll-seo-tag (~> 2.0)
|
|
||||||
jekyll-theme-slate (0.2.0)
|
|
||||||
jekyll (> 3.5, < 5.0)
|
|
||||||
jekyll-seo-tag (~> 2.0)
|
|
||||||
jekyll-theme-tactile (0.2.0)
|
|
||||||
jekyll (> 3.5, < 5.0)
|
|
||||||
jekyll-seo-tag (~> 2.0)
|
|
||||||
jekyll-theme-time-machine (0.2.0)
|
|
||||||
jekyll (> 3.5, < 5.0)
|
|
||||||
jekyll-seo-tag (~> 2.0)
|
|
||||||
jekyll-titles-from-headings (0.5.3)
|
|
||||||
jekyll (>= 3.3, < 5.0)
|
|
||||||
jekyll-watch (2.2.1)
|
|
||||||
listen (~> 3.0)
|
|
||||||
jemoji (0.13.0)
|
|
||||||
gemoji (>= 3, < 5)
|
|
||||||
html-pipeline (~> 2.2)
|
|
||||||
jekyll (>= 3.0, < 5.0)
|
|
||||||
json (2.19.2)
|
|
||||||
kramdown (2.4.0)
|
|
||||||
rexml
|
|
||||||
kramdown-parser-gfm (1.1.0)
|
|
||||||
kramdown (~> 2.0)
|
|
||||||
liquid (4.0.4)
|
|
||||||
listen (3.10.0)
|
|
||||||
logger
|
|
||||||
rb-fsevent (~> 0.10, >= 0.10.3)
|
|
||||||
rb-inotify (~> 0.9, >= 0.9.10)
|
|
||||||
logger (1.7.0)
|
|
||||||
mercenary (0.3.6)
|
|
||||||
mini_portile2 (2.8.9)
|
|
||||||
minima (2.5.1)
|
|
||||||
jekyll (>= 3.5, < 5.0)
|
|
||||||
jekyll-feed (~> 0.9)
|
|
||||||
jekyll-seo-tag (~> 2.1)
|
|
||||||
minitest (5.27.0)
|
|
||||||
net-http (0.9.1)
|
|
||||||
uri (>= 0.11.1)
|
|
||||||
nokogiri (1.19.1)
|
|
||||||
mini_portile2 (~> 2.8.2)
|
|
||||||
racc (~> 1.4)
|
|
||||||
nokogiri (1.19.1-x86_64-linux-gnu)
|
|
||||||
racc (~> 1.4)
|
|
||||||
nokogiri (1.19.1-x86_64-linux-musl)
|
|
||||||
racc (~> 1.4)
|
|
||||||
octokit (4.25.1)
|
|
||||||
faraday (>= 1, < 3)
|
|
||||||
sawyer (~> 0.9)
|
|
||||||
pathutil (0.16.2)
|
|
||||||
forwardable-extended (~> 2.6)
|
|
||||||
public_suffix (5.1.1)
|
|
||||||
racc (1.8.1)
|
|
||||||
rb-fsevent (0.11.2)
|
|
||||||
rb-inotify (0.11.1)
|
|
||||||
ffi (~> 1.0)
|
|
||||||
rexml (3.4.4)
|
|
||||||
rouge (3.30.0)
|
|
||||||
rubyzip (2.4.1)
|
|
||||||
safe_yaml (1.0.5)
|
|
||||||
sass (3.7.4)
|
|
||||||
sass-listen (~> 4.0.0)
|
|
||||||
sass-listen (4.0.0)
|
|
||||||
rb-fsevent (~> 0.9, >= 0.9.4)
|
|
||||||
rb-inotify (~> 0.9, >= 0.9.7)
|
|
||||||
sawyer (0.9.3)
|
|
||||||
addressable (>= 2.3.5)
|
|
||||||
faraday (>= 0.17.3, < 3)
|
|
||||||
securerandom (0.4.1)
|
|
||||||
simpleidn (0.2.3)
|
|
||||||
terminal-table (1.8.0)
|
|
||||||
unicode-display_width (~> 1.1, >= 1.1.1)
|
|
||||||
typhoeus (1.4.1)
|
|
||||||
ethon (>= 0.9.0)
|
|
||||||
tzinfo (2.0.6)
|
|
||||||
concurrent-ruby (~> 1.0)
|
|
||||||
unicode-display_width (1.8.0)
|
|
||||||
uri (1.1.1)
|
|
||||||
webrick (1.9.2)
|
|
||||||
|
|
||||||
PLATFORMS
|
|
||||||
ruby
|
|
||||||
x86_64-linux
|
|
||||||
x86_64-linux-musl
|
|
||||||
|
|
||||||
DEPENDENCIES
|
|
||||||
github-pages (>= 232)
|
|
||||||
webrick (~> 1.8)
|
|
||||||
|
|
||||||
BUNDLED WITH
|
|
||||||
2.3.25
|
|
||||||
|
|
@ -179,7 +179,7 @@ There are many ways to contribute: writing code, alerting rules, documentation,
|
||||||
|
|
||||||
## 🏋️ Improvements
|
||||||
|
|
||||||
- Create an alert rule builder in Jekyll for custom alerts (severity, thresholds, instances...)
|
- Create an alert rule builder for custom alerts (severity, thresholds, instances...)
|
||||||
- Add resolution suggestions to rule descriptions, for faster incident resolution ([#85](https://github.com/samber/awesome-prometheus-alerts/issues/85)).
|
||||||
|
|
||||||
## 💫 Show your support
|
||||||
|
|
|
||||||
|
|
@ -1,8 +0,0 @@
|
||||||
theme: jekyll-theme-cayman
|
|
||||||
|
|
||||||
title: Awesome Prometheus alerts
|
|
||||||
description: Collection of alerting rules
|
|
||||||
|
|
||||||
repository: samber/awesome-prometheus-alerts
|
|
||||||
|
|
||||||
baseurl: /awesome-prometheus-alerts
|
|
||||||
|
|
@ -1,162 +0,0 @@
|
||||||
<!DOCTYPE html>
|
|
||||||
<html lang="{{ site.lang | default: "en-US" }}">
|
|
||||||
|
|
||||||
<head>
|
|
||||||
<meta charset="UTF-8">
|
|
||||||
{% seo %}
|
|
||||||
<meta name="viewport" content="width=device-width, initial-scale=1">
|
|
||||||
<meta name="theme-color" content="#157878">
|
|
||||||
<meta name="apple-mobile-web-app-status-bar-style" content="black-translucent">
|
|
||||||
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0/css/font-awesome.min.css">
|
|
||||||
<link rel="stylesheet" href="{{ '/assets/css/style.css?v=' | append: site.github.build_revision | relative_url }}">
|
|
||||||
<link rel="stylesheet" href="{{ '/assets/css/app.css?v=' | append: site.github.build_revision | relative_url }}">
|
|
||||||
<link rel="icon" type="image/x-icon" href="{{ '/assets/favicon.ico' | relative_url }}">
|
|
||||||
|
|
||||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.4.1/jquery.min.js"></script>
|
|
||||||
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.4.1/js/bootstrap.min.js"></script>
|
|
||||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/clipboard.js/2.0.4/clipboard.min.js"></script>
|
|
||||||
<script src="{{ '/assets/js/app.js?v=' | append: site.github.build_revision | relative_url }}"></script>
|
|
||||||
|
|
||||||
<!-- Global site tag (gtag.js) - Google Analytics -->
|
|
||||||
<script async src="https://www.googletagmanager.com/gtag/js?id=UA-118604063-2"></script>
|
|
||||||
<script>
|
|
||||||
window.dataLayer = window.dataLayer || [];
|
|
||||||
|
|
||||||
function gtag() {
|
|
||||||
dataLayer.push(arguments);
|
|
||||||
}
|
|
||||||
gtag('js', new Date());
|
|
||||||
|
|
||||||
gtag('config', 'UA-118604063-2');
|
|
||||||
</script>
|
|
||||||
|
|
||||||
</head>
|
|
||||||
|
|
||||||
<body>
|
|
||||||
<style>
|
|
||||||
#skip-to-content {
|
|
||||||
height: 1px;
|
|
||||||
width: 1px;
|
|
||||||
position: absolute;
|
|
||||||
overflow: hidden;
|
|
||||||
top: -10px;
|
|
||||||
|
|
||||||
&:focus {
|
|
||||||
position: fixed;
|
|
||||||
top: 10px;
|
|
||||||
left: 10px;
|
|
||||||
height: auto;
|
|
||||||
width: auto;
|
|
||||||
background: invert($body-link-color);
|
|
||||||
outline: thick solid invert($body-link-color);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
ul.github-buttons-cta li {
|
|
||||||
display: inline-block;
|
|
||||||
height: 20px;
|
|
||||||
padding: 0px 15px;
|
|
||||||
}
|
|
||||||
|
|
||||||
ul.github-buttons-cta li a {
|
|
||||||
/* width: 100px; */
|
|
||||||
text-decoration: none;
|
|
||||||
}
|
|
||||||
|
|
||||||
.fa {
|
|
||||||
/* padding: 14px;
|
|
||||||
width: 50px;
|
|
||||||
height: 50px; */
|
|
||||||
font-size: 25px;
|
|
||||||
text-align: center;
|
|
||||||
text-decoration: none;
|
|
||||||
border-radius: 50%;
|
|
||||||
}
|
|
||||||
|
|
||||||
.fa:hover {
|
|
||||||
opacity: 0.8;
|
|
||||||
}
|
|
||||||
|
|
||||||
.fa-twitter,
|
|
||||||
.fa-linkedin {
|
|
||||||
/* background: #55ACEE; */
|
|
||||||
color: white;
|
|
||||||
}
|
|
||||||
</style>
|
|
||||||
<a id="skip-to-content" href="#content">Skip to the content.</a>
|
|
||||||
|
|
||||||
<header class="page-header" role="banner">
|
|
||||||
<h1 class="project-name">
|
|
||||||
<a href="{{ '/' | relative_url }}" style="color: white">
|
|
||||||
{{ site.title | default: site.github.repository_name }}
|
|
||||||
</a>
|
|
||||||
</h1>
|
|
||||||
<h2 class="project-tagline">{{ site.description | default: site.github.project_tagline }}</h2>
|
|
||||||
<a href="{{ '/alertmanager' | relative_url }}" class="btn">Global configuration</a>
|
|
||||||
<a href="{{ '/rules' | relative_url }}" class="btn">Rules</a>
|
|
||||||
<a href="{{ '/sleep-peacefully' | relative_url }}" class="btn">Sleep peacefully</a>
|
|
||||||
<a href="{{ '/blackbox-exporter' | relative_url }}" class="btn">Blackbox</a>
|
|
||||||
<a href="https://github.com/samber/awesome-prometheus-alerts/blob/master/CONTRIBUTING.md" class="btn">
|
|
||||||
Contribute on GitHub
|
|
||||||
</a>
|
|
||||||
|
|
||||||
<ul class="github-buttons-cta">
|
|
||||||
<li>
|
|
||||||
<a href="https://github.com/samber/awesome-prometheus-alerts">
|
|
||||||
<img alt="GitHub Repo Watchers" src="https://img.shields.io/github/watchers/samber/awesome-prometheus-alerts?style=social">
|
|
||||||
</a>
|
|
||||||
</li>
|
|
||||||
<li>
|
|
||||||
<a href="https://github.com/samber/awesome-prometheus-alerts">
|
|
||||||
<img alt="GitHub Repo stars" src="https://img.shields.io/github/stars/samber/awesome-prometheus-alerts?style=social">
|
|
||||||
</a>
|
|
||||||
</li>
|
|
||||||
<li>
|
|
||||||
<a href="https://github.com/samber/awesome-prometheus-alerts">
|
|
||||||
<img alt="GitHub Repo forks" src="https://img.shields.io/github/forks/samber/awesome-prometheus-alerts?style=social">
|
|
||||||
</a>
|
|
||||||
</li>
|
|
||||||
<li>
|
|
||||||
<a href="https://twitter.com/share?via=samuelberthe&related=samuelberthe&text=🚨 📊 Here is a collection of Awesome Prometheus Alerts&url=https://samber.github.io/awesome-prometheus-alerts"
|
|
||||||
class="fa fa-twitter" target="_blank"></a>
|
|
||||||
</li>
|
|
||||||
<li>
|
|
||||||
<a href="http://www.linkedin.com/shareArticle?mini=true&url=https://samber.github.io/awesome-prometheus-alerts/"
|
|
||||||
class="fa fa-linkedin" target="_blank"></a>
|
|
||||||
</li>
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
|
|
||||||
<ul id="sponsoring">
|
|
||||||
<li>
|
|
||||||
Kindly supported by 👉
|
|
||||||
</li>
|
|
||||||
<li>
|
|
||||||
<a href="https://cast.ai/samuel">
|
|
||||||
<img width="" src="assets/sponsor-cast-ai.png" />
|
|
||||||
</a>
|
|
||||||
</li>
|
|
||||||
<li>
|
|
||||||
<a href="https://betterstack.com/">
|
|
||||||
<img width="" src="assets/sponsor-betterstack.png" />
|
|
||||||
</a>
|
|
||||||
</li>
|
|
||||||
</ul>
|
|
||||||
</header>
|
|
||||||
|
|
||||||
<main id="content" class="main-content" role="main">
|
|
||||||
{{ content }}
|
|
||||||
|
|
||||||
<footer class="site-footer">
|
|
||||||
{% if site.github.is_project_page %}
|
|
||||||
<span class="site-footer-owner">
|
|
||||||
<a href="{{ site.github.repository_url }}">{{ site.title }}</a> is maintained by
|
|
||||||
<a href="{{ site.github.owner_url }}">{{ site.github.owner_name }}</a>.
|
|
||||||
</span>
|
|
||||||
{% endif %}
|
|
||||||
</footer>
|
|
||||||
</main>
|
|
||||||
|
|
||||||
</body>
|
|
||||||
|
|
||||||
</html>
|
|
||||||
141
alertmanager.md
|
|
@ -1,141 +0,0 @@
|
||||||
<h1 style="text-align: center;">
|
|
||||||
Global configuration
|
|
||||||
</h1>
|
|
||||||
|
|
||||||
If you notice a delay between an event and the first notification, read the following blog post => [https://pracucci.com/prometheus-understanding-the-delays-on-alerting.html](https://pracucci.com/prometheus-understanding-the-delays-on-alerting.html).
|
|
||||||
|
|
||||||
## Prometheus configuration
|
|
||||||
|
|
||||||
{% highlight yaml %}
|
|
||||||
# prometheus.yml
|
|
||||||
|
|
||||||
global:
|
|
||||||
scrape_interval: 20s
|
|
||||||
|
|
||||||
# A short evaluation_interval will check alerting rules very often.
|
|
||||||
# It can be costly if you run Prometheus with 100+ alerts.
|
|
||||||
evaluation_interval: 20s
|
|
||||||
...
|
|
||||||
|
|
||||||
rule_files:
|
|
||||||
- 'alerts/*.yml'
|
|
||||||
|
|
||||||
scrape_configs:
|
|
||||||
...
|
|
||||||
|
|
||||||
{% endhighlight %}
|
|
||||||
|
|
||||||
{% highlight yaml %}
|
|
||||||
# alerts/example-redis.yml
|
|
||||||
|
|
||||||
groups:
|
|
||||||
|
|
||||||
- name: ExampleRedisGroup
|
|
||||||
rules:
|
|
||||||
- alert: ExampleRedisDown
|
|
||||||
expr: redis_up{} == 0
|
|
||||||
for: 2m
|
|
||||||
labels:
|
|
||||||
severity: critical
|
|
||||||
annotations:
|
|
||||||
summary: "Redis instance down"
|
|
||||||
description: "Whatever"
|
|
||||||
|
|
||||||
{% endhighlight %}
|
|
||||||
|
|
||||||
## AlertManager configuration
|
|
||||||
|
|
||||||
{% highlight yaml %}
|
|
||||||
{% raw %}
|
|
||||||
# alertmanager.yml
|
|
||||||
|
|
||||||
route:
|
|
||||||
# When a new group of alerts is created by an incoming alert, wait at
|
|
||||||
# least 'group_wait' to send the initial notification.
|
|
||||||
# This ensures that multiple alerts for the same group that start
|
|
||||||
# firing shortly after another are batched together on the first
|
|
||||||
# notification.
|
|
||||||
group_wait: 10s
|
|
||||||
|
|
||||||
# When the first notification was sent, wait 'group_interval' to send a batch
|
|
||||||
# of new alerts that started firing for that group.
|
|
||||||
group_interval: 30s
|
|
||||||
|
|
||||||
# If an alert has successfully been sent, wait 'repeat_interval' to
|
|
||||||
# resend them.
|
|
||||||
repeat_interval: 30m
|
|
||||||
|
|
||||||
# A default receiver
|
|
||||||
receiver: "slack"
|
|
||||||
|
|
||||||
# All the above attributes are inherited by all child routes and can be
|
|
||||||
# overwritten on each.
|
|
||||||
routes:
|
|
||||||
- receiver: "slack"
|
|
||||||
group_wait: 10s
|
|
||||||
match_re:
|
|
||||||
severity: critical|warning
|
|
||||||
continue: true
|
|
||||||
|
|
||||||
- receiver: "pager"
|
|
||||||
group_wait: 10s
|
|
||||||
match_re:
|
|
||||||
severity: critical
|
|
||||||
continue: true
|
|
||||||
|
|
||||||
receivers:
|
|
||||||
- name: "slack"
|
|
||||||
slack_configs:
|
|
||||||
- api_url: 'https://hooks.slack.com/services/XXXXXXXXX/XXXXXXXXX/xxxxxxxxxxxxxxxxxxxxxxxxxxx'
|
|
||||||
send_resolved: true
|
|
||||||
channel: 'monitoring'
|
|
||||||
text: "{{ range .Alerts }}<!channel> {{ .Annotations.summary }}\n{{ .Annotations.description }}\n{{ end }}"
|
|
||||||
|
|
||||||
- name: "pager"
|
|
||||||
webhook_configs:
|
|
||||||
- url: http://a.b.c.d:8080/send/sms
|
|
||||||
send_resolved: true
|
|
||||||
|
|
||||||
{% endraw %}
|
|
||||||
{% endhighlight %}
|
|
||||||
|
|
||||||
## Reduce Prometheus server load
|
|
||||||
|
|
||||||
For expensive or frequent PromQL queries, Prometheus can precompute results with recording rules.
|
|
||||||
|
|
||||||
{% highlight yaml %}
|
|
||||||
{% raw %}
|
|
||||||
groups:
|
|
||||||
|
|
||||||
# first define the recorded rule
|
|
||||||
- name: ExampleRecordedGroup
|
|
||||||
rules:
|
|
||||||
- record: job:rabbitmq_queue_messages_delivered_total:rate:5m
|
|
||||||
expr: rate(rabbitmq_queue_messages_delivered_total[5m])
|
|
||||||
|
|
||||||
# then use it in alerts
|
|
||||||
- name: ExampleAlertingGroup
|
|
||||||
rules:
|
|
||||||
- alert: ExampleRabbitmqLowMessageDelivery
|
|
||||||
expr: sum(job:rabbitmq_queue_messages_delivered_total:rate:5m) < 10
|
|
||||||
for: 2m
|
|
||||||
labels:
|
|
||||||
severity: critical
|
|
||||||
annotations:
|
|
||||||
summary: "Low delivery rate in Rabbitmq queues"
|
|
||||||
{% endraw %}
|
|
||||||
{% endhighlight %}
|
|
||||||
|
|
||||||
## Troubleshooting
|
|
||||||
|
|
||||||
If a notification takes too long to fire, check the following delays:
|
|
||||||
- `scrape_interval = 20s` (prometheus.yml)
|
|
||||||
- `evaluation_interval = 20s` (prometheus.yml)
|
|
||||||
- `increase(mysql_global_status_slow_queries[1m]) > 0` (alerts/example-mysql.yml)
|
|
||||||
- `for: 5m` (alerts/example-mysql.yml)
|
|
||||||
- `group_wait = 10s` (alertmanager.yml)
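
As a rough illustration of how these settings combine, the sketch below adds up the four timers from the list to get an approximate upper bound on the time between an event and its first notification. The interface and function names are hypothetical, and the `[1m]` range in the expression is left out on the assumption that it is a lookback window rather than an additional wait. Real delays also depend on scrape timing and notification delivery, so treat the result as an estimate only.

```typescript
// Hypothetical helper: names are illustrative, not part of Prometheus or AlertManager.
interface AlertTimings {
  scrapeIntervalSec: number;     // scrape_interval (prometheus.yml)
  evaluationIntervalSec: number; // evaluation_interval (prometheus.yml)
  forSec: number;                // "for" duration of the alerting rule
  groupWaitSec: number;          // group_wait (alertmanager.yml)
}

// Sum of the timers: each one can add up to its full duration in the worst case.
function worstCaseDelaySec(t: AlertTimings): number {
  return t.scrapeIntervalSec + t.evaluationIntervalSec + t.forSec + t.groupWaitSec;
}

// Values taken from the list above: 20s + 20s + 5m + 10s.
const example: AlertTimings = {
  scrapeIntervalSec: 20,
  evaluationIntervalSec: 20,
  forSec: 5 * 60,
  groupWaitSec: 10,
};

console.log(worstCaseDelaySec(example)); // 350
```

With the values shown, the first notification may arrive close to six minutes after the event, and most of that comes from the `for: 5m` clause.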
|
|
||||||
|
|
||||||
Also read:
|
|
||||||
- [https://pracucci.com/prometheus-understanding-the-delays-on-alerting.html](https://pracucci.com/prometheus-understanding-the-delays-on-alerting.html).
|
|
||||||
- [https://hodovi.cc/blog/creating-awesome-alertmanager-templates-for-slack/](https://hodovi.cc/blog/creating-awesome-alertmanager-templates-for-slack/)
|
|
||||||
- [https://grafana.com/blog/2024/10/03/how-to-use-prometheus-to-efficiently-detect-anomalies-at-scale/](https://grafana.com/blog/2024/10/03/how-to-use-prometheus-to-efficiently-detect-anomalies-at-scale/)
|
|
||||||
|
|
@ -1,125 +0,0 @@
|
||||||
|
|
||||||
<h1 style="text-align: center;">
|
|
||||||
Blackbox exporter
|
|
||||||
</h1>
|
|
||||||
|
|
||||||
## Worldwide probes
|
|
||||||
|
|
||||||
<a href="https://github.com/prometheus/blackbox_exporter" target="_blank">Blackbox Exporter</a> gives you the ability to probe endpoints over HTTP, HTTPS, DNS, TCP and ICMP.
|
|
||||||
|
|
||||||
You should deploy blackbox exporters in multiple points of presence around the globe to monitor latency. Feel free to use the following endpoints for your own projects:
|
|
||||||
|
|
||||||
- https://probe-<b>montreal</b>.cleverapps.io
|
|
||||||
- https://probe-<b>paris</b>.cleverapps.io
|
|
||||||
- https://probe-<b>jeddah</b>.cleverapps.io
|
|
||||||
- https://probe-<b>singapore</b>.cleverapps.io
|
|
||||||
- https://probe-<b>sydney</b>.cleverapps.io
|
|
||||||
- https://probe-<b>warsaw</b>.cleverapps.io
|
|
||||||
|
|
||||||
☝️ Logs have been disabled. More probes from the community would be appreciated, please contribute <a href="https://github.com/samber/awesome-prometheus-alerts/" target="_blank">here</a>! These blackbox exporters use the following <a href="https://github.com/samber/blackbox_exporter/blob/master/samber.yml" target="_blank">configuration</a>.
|
|
||||||
|
|
||||||
## Prometheus Configuration
|
|
||||||
|
|
||||||
Blackbox exporters and endpoints must be declared in Prometheus. Here is a simple configuration, inspired by [Hayk Davtyan medium post](https://medium.com/geekculture/single-prometheus-job-for-dozens-of-blackbox-exporters-2a7ba492d6c8):
|
|
||||||
|
|
||||||
```yml
# sd/blackbox.yml

- targets:
  #
  # Montreal
  #
  # http
  - probe-montreal.cleverapps.io:_:http_2xx:_:Montreal:_:f229cy:_:https://api.screeb.app
  - probe-montreal.cleverapps.io:_:http_2xx:_:Montreal:_:f229cy:_:https://t.screeb.app/tag.js
  # icmp
  - probe-montreal.cleverapps.io:_:icmp_ipv4:_:Montreal:_:f229cy:_:api.screeb.app
  - probe-montreal.cleverapps.io:_:icmp_ipv4:_:Montreal:_:f229cy:_:t.screeb.app

  #
  # Paris
  #
  # http
  - probe-paris.cleverapps.io:_:http_2xx:_:Paris:_:u09tgy:_:https://api.screeb.app
  - probe-paris.cleverapps.io:_:http_2xx:_:Paris:_:u09tgy:_:https://t.screeb.app/tag.js
  # icmp
  - probe-paris.cleverapps.io:_:icmp_ipv4:_:Paris:_:u09tgy:_:api.screeb.app
  - probe-paris.cleverapps.io:_:icmp_ipv4:_:Paris:_:u09tgy:_:t.screeb.app

  #
  # Sydney
  #
  # http
  - probe-sydney.cleverapps.io:_:http_2xx:_:Sydney:_:r3gpkn:_:https://api.screeb.app
  - probe-sydney.cleverapps.io:_:http_2xx:_:Sydney:_:r3gpkn:_:https://t.screeb.app/tag.js
  # icmp
  - probe-sydney.cleverapps.io:_:icmp_ipv4:_:Sydney:_:r3gpkn:_:api.screeb.app
  - probe-sydney.cleverapps.io:_:icmp_ipv4:_:Sydney:_:r3gpkn:_:t.screeb.app

  # ...
```
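
Writing these composite targets by hand gets repetitive as probes and endpoints multiply. The following is a minimal Python sketch that generates them, assuming the `address:_:module:_:pop:_:geohash:_:target` layout used in `sd/blackbox.yml`. The `build_targets` helper and the `PROBES` table are hypothetical names introduced for illustration; they are not part of this repository.

```python
# Hypothetical helper: generates file_sd target strings in the
# address:_:module:_:pop:_:geohash:_:target layout used in sd/blackbox.yml.
SEP = ":_:"

# Probe hostname -> (point-of-presence name, geohash), copied from the listing.
PROBES = {
    "probe-montreal.cleverapps.io": ("Montreal", "f229cy"),
    "probe-paris.cleverapps.io": ("Paris", "u09tgy"),
}


def build_targets(probes, checks):
    """Return one composite target per (probe, check) pair.

    `checks` is a list of (blackbox_module, probed_target) tuples.
    """
    targets = []
    for address, (pop, geohash) in probes.items():
        for module, target in checks:
            targets.append(SEP.join([address, module, pop, geohash, target]))
    return targets


if __name__ == "__main__":
    checks = [
        ("http_2xx", "https://api.screeb.app"),
        ("icmp_ipv4", "api.screeb.app"),
    ]
    # Print the entries as YAML list items, ready to paste under "- targets:".
    for line in build_targets(PROBES, checks):
        print(f"  - {line}")
```

Keeping the separator in one constant makes it harder for the generated file and the relabeling regexes to drift apart.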

```yml
# prometheus.yml

global:
  # ...

scrape_configs:

  - job_name: 'blackbox'
    metrics_path: /probe
    scrape_interval: 30s
    scheme: https
    file_sd_configs:
      - files:
          - /etc/prometheus/sd/blackbox.yml
    relabel_configs:
      # adds "module" label in the final labelset
      - source_labels: [__address__]
        regex: '.*:_:(.*):_:.*:_:.*:_:.*'
        target_label: module
      # adds "geohash" label in the final labelset
      - source_labels: [__address__]
        regex: '.*:_:.*:_:.*:_:(.*):_:.*'
        target_label: geohash
      # rewrites "instance" label with corresponding URL
      - source_labels: [__address__]
        regex: '.*:_:.*:_:.*:_:.*:_:(.*)'
        target_label: instance
      # rewrites "pop" label with corresponding location name
      - source_labels: [__address__]
        regex: '.*:_:.*:_:(.*):_:.*:_:.*'
        target_label: pop
      # passes "module" parameter to Blackbox exporter
      - source_labels: [module]
        target_label: __param_module
      # passes "target" parameter to Blackbox exporter
      - source_labels: [instance]
        target_label: __param_target
      # the Blackbox exporter's real hostname:port
      - source_labels: [__address__]
        regex: '(.*):_:.*:_:.*:_:.*:_:.*'
        target_label: __address__

  # ...

```
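
To check what the relabeling rules extract from a target, the same split can be reproduced outside Prometheus. This is a rough Python sketch, not Prometheus code: it relies only on the fact that Prometheus anchors relabel regexes at both ends (hence `fullmatch`), and `split_target` is a hypothetical name.

```python
import re

# Field order inside each composite target, as used by the relabel_configs.
FIELDS = ("address", "module", "pop", "geohash", "instance")

# One capture group per field; the relabel rules each capture a single group
# from the same overall pattern.
TARGET_RE = re.compile(r"(.*):_:(.*):_:(.*):_:(.*):_:(.*)")


def split_target(target: str) -> dict:
    """Split a composite file_sd target into its five fields."""
    # fullmatch mimics Prometheus, which anchors relabel regexes on both ends.
    match = TARGET_RE.fullmatch(target)
    if match is None:
        raise ValueError(f"not a composite target: {target!r}")
    return dict(zip(FIELDS, match.groups()))


if __name__ == "__main__":
    fields = split_target(
        "probe-montreal.cleverapps.io:_:http_2xx:_:Montreal:_:f229cy:_:https://api.screeb.app"
    )
    for name, value in fields.items():
        print(f"{name:9} {value}")
```

A target with a missing field fails loudly here, whereas in Prometheus a non-matching regex simply leaves the label unset, so a check like this can catch typos before a reload.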

## Geohash

![geohash grafana map](/awesome-prometheus-alerts/assets/geohash-grafana-map.png)

To display nice maps in Grafana, each Blackbox exporter must be labeled with its location. The Grafana map panel reads the "geohash" format:

- go to Google Maps
- extract the lat/long from the URL
- convert lat/long to geohash here: http://geohash.co
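
The conversion can also be done offline. Below is a minimal Python sketch of the standard geohash encoding (alternating longitude/latitude bisection, five bits per base32 character). The `geohash_encode` name and the default precision of 6, chosen to match the six-character hashes in the configuration, are assumptions made for this example.

```python
# Geohash base32 alphabet (digits plus letters, without a, i, l, o).
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat: float, lon: float, precision: int = 6) -> str:
    """Encode a latitude/longitude pair as a geohash string."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0        # bits accumulated for the current character
    bit_count = 0   # number of bits accumulated so far (0..5)
    use_lon = True  # bits alternate, starting with longitude

    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            # Every 5 bits map to one base32 character.
            chars.append(_BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


if __name__ == "__main__":
    # Approximate coordinates of central Paris.
    print(geohash_encode(48.8566, 2.3522))
```

Six characters identify a cell of roughly a kilometre, which is more than enough to place a probe on a world map.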

## Grafana

Some great dashboards have been created by the community: https://grafana.com/grafana/dashboards/?search=blackbox

Since Grafana v5.0.0, a map panel is available: https://grafana.com/docs/grafana/latest/panels-visualizations/visualizations/geomap/
@ -1,11 +0,0 @@
version: '3'

services:

  jekyll:
    image: jekyll/jekyll:latest
    command: jekyll serve
    volumes:
      - ./:/srv/jekyll
    ports:
      - 4000:4000
54
index.md
@ -1,54 +0,0 @@
<style>
  .center-image
  {
    margin: 0 auto;
    display: block;
  }
</style>

![Prometheus logo](/awesome-prometheus-alerts/assets/prometheus-logo.png){: .center-image }

<h2>
  Hello world
</h2>

<a href="/awesome-prometheus-alerts/alertmanager">
  AlertManager configuration
</a>

<a href="/awesome-prometheus-alerts/sleep-peacefully">
  Alerting time window
</a>

<h2>
  Out of the box prometheus alerting rules
</h2>

<ul>
  {% for group in site.data.rules.groups %}
  <li style="margin-top: 30px;">
    {% assign nbrRules = 0 %}
    {% for service in group.services %}
      {% for exporter in service.exporters %}
        {% for rule in exporter.rules %}
          {% assign nbrRules = nbrRules | plus: 1 %}
        {% endfor %}
      {% endfor %}
    {% endfor %}

    <h3>{{ group.name }} <small style="margin-left: 20px;">({{ nbrRules }} rules)</small></h3>
    <ul>
      {% for service in group.services %}
      <li>
        <a href="/awesome-prometheus-alerts/rules#{{ service.name | replace: " ", "-" | downcase }}">
          {{ service.name }}
        </a>
      </li>
      {% endfor %}
    </ul>
  </li>
  {% endfor %}
</ul>
141
rules.md
@ -1,141 +0,0 @@
<style>
  ul {
    list-style: none;
  }
</style>

<!-- CAUTIONS -->
<div style="padding: 20px 20px 10px 20px; border: solid grey 1px; border-radius: 10px;">
  <h2 style="text-align:center;">⚠️ Caution ⚠️</h2>

  <p style="text-align:center;">
    Alert thresholds depend on the nature of applications.
    <br>
    Some queries on this page may have arbitrary tolerance thresholds.
    <br><br>
    Building an efficient and battle-tested monitoring platform takes time. 😉
  </p>
</div>

<br>
<br>

<h1></h1>

<!-- RULES -->
<ul>
  {% for group in site.data.rules.groups %}
    {% assign groupIndex = forloop.index %}
    {% for service in group.services %}
      {% assign serviceIndex = forloop.index %}
      {% assign nbrExporters = service.exporters | size %}
      {% for exporter in service.exporters %}
        {% assign exporterIndex = forloop.index %}
        {% assign nbrRules = exporter.rules | size %}
        <li>
          {% assign serviceId = service.name | replace: " ", "-" | downcase %}
          <h2 id="{{ serviceId }}">
            <span id="{{ serviceId }}-{{ exporterIndex }}"></span>
            <a class="anchor" href="#{{ serviceId }}-{{ exporterIndex }}">#</a>
            {{ groupIndex }}.{{ serviceIndex }}.{% if nbrExporters > 1 %}{{ exporterIndex }}.{% endif %}
            {{ service.name }}
            {% if exporter.name %}:
              {% if exporter.doc_url %}
                <a href="{{ exporter.doc_url }}">
                  {{ exporter.name }}
                </a>
              {% else %}
                {{ exporter.name }}
              {% endif %}
            {% endif %}

            {% if nbrRules > 0 %}
              <small style="font-size: 60%; vertical-align: middle; margin-left: 10px;">
                ({{ nbrRules }} rules)
              </small>
              <span class="clipboard-multiple" data-clipboard-target-id="group-{{ groupIndex }}-service-{{ serviceIndex }}-exporter-{{ exporterIndex }}">[copy section]</span>
            {% endif %}
          </h2>

          {% if nbrRules == 0 %}
{% highlight javascript %}
// @TODO: Please contribute => https://github.com/samber/awesome-prometheus-alerts 👋
{% endhighlight %}
          {% else %}
            {{ exporter.comments | strip | newline_to_br }}
{% highlight bash %}
$ wget https://raw.githubusercontent.com/samber/awesome-prometheus-alerts/refs/heads/master/dist/rules/{{ service.name | replace: " ", "-" | downcase }}/{{ exporter.slug }}.yml
{% endhighlight %}
          {% endif %}

          <ul>
            {% for rule in exporter.rules %}
              {% assign ruleIndex = forloop.index %}
              {% assign comments = rule.comments | strip | newline_to_br | split: '<br />' %}
              <li>
                <h4 id="rule-{{ serviceId }}-{{ exporterIndex }}-{{ ruleIndex }}">
                  <span id="rule-{{ serviceId }}-{{ ruleIndex }}"></span><!-- @deprecated -->
                  <a class="anchor" href="#rule-{{ serviceId }}-{{ exporterIndex }}-{{ ruleIndex }}">#</a>
                  {{ groupIndex}}.{{ serviceIndex }}.{% if nbrExporters > 1 %}{{ exporterIndex }}.{% endif %}{{ ruleIndex }}.
                  {{ rule.name }}
                </h4>
                <summary>
                  {{ rule.description }}
                  <span class="clipboard-single" data-clipboard-target-id="group-{{ groupIndex }}-service-{{ serviceIndex }}-exporter-{{ exporterIndex }}-rule-{{ ruleIndex }}" onclick="event.preventDefault();">[copy]</span>
                </summary>
                <div id="group-{{ groupIndex }}-service-{{ serviceIndex }}-exporter-{{ exporterIndex }}-rule-{{ ruleIndex }}">
                  {% assign ruleName = rule.name | split: ' ' %}
                  {% capture ruleNameCamelcase %}{% for word in ruleName %}{{ word | capitalize }} {% endfor %}{% endcapture %}

{% highlight yaml %}
{% for comment in comments %}# {{ comment | strip }}
{% endfor %}- alert: {{ ruleNameCamelcase | remove: ' ' }}
  expr: {{ rule.query }}
  for: {% if rule.for %}{{ rule.for }}{% else %}0m{% endif %}
  labels:
    severity: {{ rule.severity }}
  annotations:
    summary: {{ rule.name }} (instance {% raw %}{{ $labels.instance }}{% endraw %})
    description: "{{ rule.description | replace: '"', '\"' }}\n VALUE = {% raw %}{{ $value }}{% endraw %}\n LABELS = {% raw %}{{ $labels }}{% endraw %}"

{% endhighlight %}

                </div>
                <br/>
              </li>
            {% endfor %}
          </ul>

          <hr/>
        </li>
      {% endfor %}
    {% endfor %}
  {% endfor %}
</ul>

<!-- NAVBAR -->
<div id="rules-navbar" class="affix">
  <h3>Menu</h3>
  <ul>
    {% for group in site.data.rules.groups %}
    <li>
      <h4>{{ group.name }}</h4>
      <ul>
        {% for service in group.services %}
        <li>
          <a href="#{{ service.name | replace: " ", "-" | downcase }}">
            👉 {{ service.name }}
          </a>
        </li>
        {% endfor %}
      </ul>
    </li>
    {% endfor %}
  </ul>

  <script>
    $('#rules-navbar').affix({offset: {top: 750} }).css('display', 'block');
  </script>
</div>
@ -1,106 +0,0 @@
<h1 style="text-align: center;">
  Sleep Peacefully
</h1>

## Alerting time window

In some applications, load and activity can vary over the day/week/year.

To prevent alarm fatigue and a busy pager, alerts can be disabled during certain periods (such as nights or weekends).

Examples (PromQL time functions work in UTC):

- Weekday: `node_load5 > 10 and ON() (0 < day_of_week() < 6)`
- Day time: `node_load5 > 10 and ON() (8 < hour() < 18)`
- Exclude December: `node_load5 > 10 and ON() (month() != 12)`
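
The bounds in these expressions are exclusive, which is easy to misread. As an illustration only, here is a small Python sketch that mirrors the three filters for a given timestamp; the function names are hypothetical, and the mapping assumes PromQL's `day_of_week()` convention (0 is Sunday, 6 is Saturday) and UTC evaluation.

```python
from datetime import datetime, timezone


def prom_day_of_week(ts: datetime) -> int:
    """Mimic PromQL day_of_week(): 0 is Sunday ... 6 is Saturday, in UTC."""
    # Python's weekday() counts from Monday=0, so shift by one.
    return (ts.astimezone(timezone.utc).weekday() + 1) % 7


def is_weekday(ts: datetime) -> bool:
    """Mirror of `0 < day_of_week() < 6`: Monday to Friday."""
    return 0 < prom_day_of_week(ts) < 6


def is_day_time(ts: datetime) -> bool:
    """Mirror of `8 < hour() < 18`: from 09:00 up to 17:59 UTC."""
    return 8 < ts.astimezone(timezone.utc).hour < 18


def is_not_december(ts: datetime) -> bool:
    """Mirror of `month() != 12`."""
    return ts.astimezone(timezone.utc).month != 12


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print("weekday:", is_weekday(now))
    print("day time:", is_day_time(now))
    print("not December:", is_not_december(now))
```

Note that `8 < hour() < 18` excludes the whole 08:00 hour; use `8 <= hour() < 18` if alerts should start at 08:00 UTC.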

## Advanced time windows and timezones

```yml
# rules.yml

groups:
  - name: timezones
    rules:
      - record: european_summer_time_offset
        expr: |
          (vector(1) and (month() > 3 and month() < 10))
          or
          (vector(1) and (month() == 3 and (day_of_month() - day_of_week()) >= 25) and absent((day_of_month() >= 25) and (day_of_week() == 0)))
          or
          (vector(1) and (month() == 10 and (day_of_month() - day_of_week()) < 25) and absent((day_of_month() >= 25) and (day_of_week() == 0)))
          or
          (vector(1) and ((month() == 10 and hour() < 1) or (month() == 3 and hour() > 0)) and ((day_of_month() >= 25) and (day_of_week() == 0)))
          or
          vector(0)

      - record: europe_london_time
        expr: time() + 3600 * european_summer_time_offset
      - record: europe_paris_time
        expr: time() + 3600 * (1 + european_summer_time_offset)

      - record: europe_london_hour
        expr: hour(europe_london_time)
      - record: europe_paris_hour
        expr: hour(europe_paris_time)

      - record: europe_london_weekday
        expr: 0 < day_of_week(europe_london_time) < 6
      - record: europe_paris_weekday
        expr: 0 < day_of_week(europe_paris_time) < 6
      # opposite
      - record: not_europe_london_weekday
        expr: absent(europe_london_weekday)
      - record: not_europe_paris_weekday
        expr: absent(europe_paris_weekday)

      - record: europe_london_business_hours
        expr: 9 <= europe_london_hour < 18
      - record: europe_paris_business_hours
        expr: 9 <= europe_paris_hour < 18
      # opposite
      - record: not_europe_london_business_hours
        expr: absent(europe_london_business_hours)
      - record: not_europe_paris_business_hours
        expr: absent(europe_paris_business_hours)

      # new year's day / xmas / labor day / all saints' day / ...
      - record: europe_french_public_holidays
        expr: |
          (vector(1) and month(europe_paris_time) == 1 and day_of_month(europe_paris_time) == 1)
          or
          (vector(1) and month(europe_paris_time) == 12 and day_of_month(europe_paris_time) == 25)
          or
          (vector(1) and month(europe_paris_time) == 5 and day_of_month(europe_paris_time) == 1)
          or
          (vector(1) and month(europe_paris_time) == 11 and day_of_month(europe_paris_time) == 1)
          or
          vector(0)
      # opposite: the record above always exists (value 0 or 1) because of
      # the trailing vector(0), so absent() would never match; filter on 0 instead
      - record: not_europe_french_public_holidays
        expr: europe_french_public_holidays == 0
```
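
The `european_summer_time_offset` expression is dense, so it helps to have an independent reference to compare against. The Python sketch below does not translate the PromQL; it applies the EU rule directly (summer time runs from 01:00 UTC on the last Sunday of March to 01:00 UTC on the last Sunday of October). All function names are hypothetical and simply echo the recording rule names.

```python
from datetime import datetime, timedelta, timezone


def last_sunday(year: int, month: int) -> datetime:
    """Return midnight UTC on the last Sunday of the given month."""
    if month == 12:
        first_of_next = datetime(year + 1, 1, 1, tzinfo=timezone.utc)
    else:
        first_of_next = datetime(year, month + 1, 1, tzinfo=timezone.utc)
    last_day = first_of_next - timedelta(days=1)
    # weekday(): Monday=0 ... Sunday=6, so this is the distance back to Sunday.
    return last_day - timedelta(days=(last_day.weekday() + 1) % 7)


def european_summer_time_offset(ts: datetime) -> int:
    """Return 1 while EU summer time is in effect, else 0."""
    ts = ts.astimezone(timezone.utc)
    start = last_sunday(ts.year, 3) + timedelta(hours=1)   # 01:00 UTC
    end = last_sunday(ts.year, 10) + timedelta(hours=1)    # 01:00 UTC
    return 1 if start <= ts < end else 0


def europe_paris_hour(ts: datetime) -> int:
    """Local hour in Paris: UTC+1 in winter, UTC+2 in summer."""
    ts = ts.astimezone(timezone.utc)
    shifted = ts + timedelta(hours=1 + european_summer_time_offset(ts))
    return shifted.hour


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print("summer time offset:", european_summer_time_offset(now))
    print("Paris hour:", europe_paris_hour(now))
```

Evaluating the recording rule and this function at timestamps around both switch-over dates is a quick way to spot an off-by-one in either.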

```yml
# alerts.yml

groups:
  - name: CPU Load
    rules:
      - alert: HighLoadQuietDuringWeekendAndNight
        expr: node_load5 > 10 and ON() (europe_london_weekday and europe_paris_weekday)

      - alert: HighLoadQuietDuringBackup
        expr: node_load5 > 10 and ON() absent(hour() == 2)

      - alert: HighLoad
        expr: |
          node_load5 > 20 and ON() (europe_london_weekday and europe_paris_weekday)
          or
          node_load5 > 10
```

## Sources

- [https://medium.com/@tom.fawcett/time-of-day-based-notifications-with-prometheus-and-alertmanager-1bf7a23b7695](https://medium.com/@tom.fawcett/time-of-day-based-notifications-with-prometheus-and-alertmanager-1bf7a23b7695)
- [https://promcon.io/2019-munich/slides/improved-alerting-with-prometheus-and-alertmanager.pdf](https://promcon.io/2019-munich/slides/improved-alerting-with-prometheus-and-alertmanager.pdf)