Alert thresholds depend on nature of applications.
Some queries may have arbitrary tolerance threshold.
Building an efficient an battle-tested monitoring platform takes time. 😉
{% assign ruleName = rule.name | split: ' ' %} {% capture ruleNameCamelcase %}{% for word in ruleName %}{{ word | capitalize }} {% endfor %}{% endcapture %} {% highlight yaml %} {% for comment in comments %}# {{ comment | strip }} {% endfor %} - alert: {{ ruleNameCamelcase | remove: ' ' }} expr: {{ rule.query }} for: 5m labels: severity: {{ rule.severity }} annotations: summary: "{{ rule.name }} (instance {% raw %}{{ $labels.instance }}{% endraw %})" description: "{{ rule.description }}\n VALUE = {% raw %}{{ $value }}{% endraw %}\n LABELS: {% raw %}{{ $labels }}{% endraw %}" {% endhighlight %}