Latency SLOs vs availability SLOs

An availability SLO measures the fraction of checks that succeeded. A latency SLO measures the fraction of checks that completed under a response-time threshold. Nines runs burn-rate detection on both, independently.

Definitions

Availability SLO
Target: ≥ X% of checks return a successful status over the rolling window. Default: 99.5% over 7 days. Failure mode: the request did not succeed.
Latency SLO
Target: ≥ X% of checks return below a threshold response time over the rolling window. Default: 95% under 500 ms over 7 days. Failure mode: the request succeeded but exceeded the threshold.

Latency SLOs apply only to http_check monitors.

Latency SLO mechanics

The latency SLO is parameterised by a threshold in milliseconds and a target percentile. The SLO holds when the fraction of checks under the threshold is at least the target percentile divided by 100:

latency_slo holds when:
  fraction_of_checks_under_threshold >= target_percent / 100

Setting target = 95 and threshold = 500 is equivalent to requiring p95 ≤ 500 ms.
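
The condition above can be sketched in a few lines of Python. This is an illustrative sketch, not Nines' implementation: the function name `latency_slo_holds`, its parameter names, and the treatment of an empty window are assumptions made for the example.

```python
from typing import Sequence


def latency_slo_holds(response_times_ms: Sequence[float],
                      threshold_ms: float = 500.0,
                      target_percent: float = 95.0) -> bool:
    """Return True when enough checks finished under the threshold."""
    if not response_times_ms:
        # Assumption: an empty window has nothing to violate.
        return True
    # Count checks that completed strictly under the threshold.
    under = sum(1 for t in response_times_ms if t < threshold_ms)
    fraction_of_checks_under_threshold = under / len(response_times_ms)
    return fraction_of_checks_under_threshold >= target_percent / 100
```

For example, 96 checks at 120 ms plus 4 checks at 900 ms give a fraction of 0.96, so the default SLO holds; with 94 fast and 6 slow checks the fraction drops to 0.94 and it does not.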

Why p95, not the average

An average response time can hide tail latency: 99 checks at 50 ms and 1 check at 5,000 ms average to ~100 ms, yet one user waited 5 seconds. p95 is the response time that 95% of checks stay at or below, so it rises when a meaningful share of requests slows down, which is closer to what real users on real connections experience.
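
The arithmetic can be checked with a short Python sketch. The `slow_tail` sample set and the nearest-rank percentile helper are assumptions added for illustration; the source does not say how Nines computes percentiles.

```python
import math
import statistics


def percentile_nearest_rank(samples_ms, percent):
    """Nearest-rank percentile: the smallest sample that has at least
    `percent`% of all samples at or below it."""
    ordered = sorted(samples_ms)
    rank = max(1, math.ceil(percent * len(ordered) / 100))
    return ordered[rank - 1]


# The example from the text: 99 fast checks and one 5-second check.
one_outlier = [50] * 99 + [5000]
# Hypothetical second sample: 10% of checks take a full second.
slow_tail = [50] * 90 + [1000] * 10

mean_one_outlier = statistics.mean(one_outlier)          # 99.5
mean_slow_tail = statistics.mean(slow_tail)              # 145
p95_slow_tail = percentile_nearest_rank(slow_tail, 95)   # 1000
```

In the hypothetical `slow_tail` sample the mean of 145 ms looks healthy, while p95 is 1,000 ms and would breach a 500 ms threshold. A single slow check in 100, as in `one_outlier`, leaves p95 at 50 ms under this definition; more than 5% of checks must be slow before p95 moves.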

Failure mode matrix

| Availability | Latency | What fires |
| --- | --- | --- |
| Fails | Holds | Region-failure (if majority down) or availability_burn. Latency stays clear. |
| Holds | Fails | latency_burn. Region-failure and availability_burn stay clear. |
| Fails | Fails | Both detectors can open incidents on the same monitor; latency_burn is not suppressed by an open region-failure incident. |
| Holds | Holds | No incident. |

Plan tier matrix

Capability | Free | Pro | Business | Founder
Latency SLO panel and per-region breakdown
latency_burn detector
latency_threshold_ms tunable per monitor
Target percentile, rolling window, excluded regions

Free and Pro monitors run the latency_burn detector against the default SLO (95% target, 7-day window). The threshold is 500 ms unless a per-monitor latency_threshold_ms override is set, in which case the override applies.

Burn-rate detection

Once the latency SLO is defined, latency_burn uses the same multi-window thresholds as availability_burn: 1h+5m at 14.4× and 6h+30m at 6×. The two SLI types surface as separate incidents and can be routed to different notification channels.
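
A minimal sketch of the multi-window check follows. It assumes the common SRE definition of burn rate (observed bad fraction divided by the error budget, where the error budget is 1 minus the target) and assumes an incident opens only when both windows of a pair reach the factor. The source does not spell out either rule, and all names in the block are hypothetical.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class WindowPair:
    long_window: str
    short_window: str
    factor: float


# Window pairs and factors as stated in the text.
WINDOW_PAIRS = (
    WindowPair("1h", "5m", 14.4),
    WindowPair("6h", "30m", 6.0),
)


def burn_rate(bad_fraction: float, target_percent: float) -> float:
    """Assumed definition: bad fraction divided by the error budget."""
    if not 0 < target_percent < 100:
        raise ValueError("target_percent must be between 0 and 100, exclusive")
    error_budget = (100 - target_percent) / 100
    return bad_fraction / error_budget


def latency_burn_fires(bad_fraction_by_window: Mapping[str, float],
                       target_percent: float = 95.0) -> bool:
    """Fire when any pair has both its long and short window at or above the factor."""
    for pair in WINDOW_PAIRS:
        long_rate = burn_rate(bad_fraction_by_window[pair.long_window], target_percent)
        short_rate = burn_rate(bad_fraction_by_window[pair.short_window], target_percent)
        if long_rate >= pair.factor and short_rate >= pair.factor:
            return True
    return False
```

Under these assumptions and the default 95% target, the error budget is 5% of checks, so a 14.4× burn corresponds to roughly 72% of checks exceeding the threshold and a 6× burn to roughly 30%. Requiring the short window as well keeps a brief spike that has already ended from opening an incident.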

An open region-failure incident suppresses availability_burn evaluation but does not suppress latency_burn evaluation. The two can have open incidents on the same monitor at the same time. See Incident detectors: region-failure and burn-rate.
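
The suppression rule can be expressed as a small sketch. The function name and the idea of returning a list of detector names are assumptions for illustration only; the source describes the behaviour, not an API.

```python
from typing import List


def burn_detectors_to_evaluate(region_failure_open: bool) -> List[str]:
    """Return which burn-rate detectors are evaluated for a monitor."""
    # latency_burn is always evaluated, even during a region-failure incident.
    detectors = ["latency_burn"]
    # availability_burn is skipped while a region-failure incident is open.
    if not region_failure_open:
        detectors.append("availability_burn")
    return detectors
```

With a region-failure incident open, only latency_burn is evaluated, which is why a latency incident can still open alongside it on the same monitor.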

See also