Anomaly conditions

A fixed threshold works when there's an obvious "bad number", such as CPU over 90% or disk over 95%. Many metrics aren't like that. Request rate, latency, and queue depth all have a normal that shifts with time of day and traffic, so any single threshold is wrong for part of the day: too low at peak, where it fires on ordinary load, and too high overnight, where it misses real problems.

An anomaly condition solves this by comparing the query's value against the metric's own recent behavior instead of a number you pick. The rule fires when the value is unusual for that metric — not when it crosses a line. You select an anomaly condition in place of a fixed threshold when creating a rule; everything else about the rule (the query, the for duration, the escalation chain) works the same way.

The methods

Hexcovery offers three ways to define "unusual":

  • Z-score — measures how far the current value sits from the recent mean, in units of the metric's own variability (standard deviations). A value a few deviations away from normal is flagged. Best for metrics that wobble around a stable level.
  • EMA (exponential moving average) — compares the value against an exponentially weighted moving average, which tracks a slowly drifting baseline while still reacting to recent changes. Best for metrics whose normal level trends gradually rather than staying flat.
  • Seasonal — accounts for repeating patterns (daily and weekly cycles), comparing the value against what's normal for this time of day / day of week. Best for traffic-driven metrics where 3am and 3pm are legitimately different.
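As a rough illustration of how the three methods differ, here is a minimal Python sketch. It is not Hexcovery's implementation: the function names, the default cutoffs (3 standard deviations, a smoothing factor of 0.3, a 25% tolerance), and the (weekday, hour) slot key are all assumptions chosen for the example.

```python
from statistics import mean, pstdev


def zscore_is_anomalous(history, value, threshold=3.0):
    """Flag `value` if it sits more than `threshold` standard deviations
    from the mean of `history` (a non-empty list of recent values).
    The threshold of 3.0 is an illustrative assumption."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # A perfectly flat history: any different value is unusual.
        return value != mu
    return abs(value - mu) / sigma > threshold


def ema_is_anomalous(history, value, alpha=0.3, tolerance=0.25):
    """Flag `value` if it deviates from the exponential moving average of
    `history` by more than `tolerance` (as a fraction of the average).
    `alpha` and `tolerance` are illustrative assumptions."""
    ema = history[0]
    for x in history[1:]:
        # Recent points get weight alpha; older ones decay geometrically.
        ema = alpha * x + (1 - alpha) * ema
    if ema == 0:
        return value != 0
    return abs(value - ema) / abs(ema) > tolerance


def seasonal_is_anomalous(history_by_slot, slot, value, threshold=3.0):
    """Compare `value` only against past values from the same time slot.
    `history_by_slot` maps a hypothetical (weekday, hour) key to the
    values previously seen in that slot."""
    return zscore_is_anomalous(history_by_slot[slot], value, threshold)
```

With a history of readings near 100, the Z-score check ignores 101 but flags 120. With a history climbing from 100 to 125, the EMA check accepts 130 as a continuation of the trend but flags 200. The seasonal check treats 400 requests as normal in a slot that usually sees about 400 and as anomalous in a slot that usually sees about 20.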

Choosing one

If the metric…                             Use
hovers around a steady level with noise    Z-score
drifts up or down over time                EMA
follows a daily or weekly rhythm           Seasonal

When in doubt, start with the method that matches the shape of the metric on its preview chart, then watch how the rule behaves over a few real cycles and adjust. Pair an anomaly condition with a sensible for duration so that a single odd reading doesn't page you. The for duration works the same way as it does with a fixed threshold: the condition has to hold for the whole duration before the rule fires.
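The effect of a for duration on an anomaly condition can be sketched as follows. This is only an illustration: it assumes the duration is expressed as a number of consecutive evaluations, and the function name and argument names are hypothetical rather than part of Hexcovery.

```python
def sustained(flags, for_samples):
    """Return True only if the most recent `for_samples` evaluations
    were all anomalous.

    `flags` is a list of booleans, oldest first, one per evaluation.
    Expressing the duration as a sample count is an assumption made
    for this sketch."""
    if for_samples <= 0:
        raise ValueError("for_samples must be positive")
    recent = flags[-for_samples:]
    # Not enough evaluations yet means the condition cannot be sustained.
    return len(recent) == for_samples and all(recent)
```

With a duration of three evaluations, a single anomalous reading surrounded by normal ones never fires, while three anomalous readings in a row do.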

  • Alert rules — where you attach an anomaly condition to a query.
  • Incidents — review what an anomaly rule caught.