Error Budgets

An error budget quantifies how much unreliability your SLO allows. Nines tracks it in real time so you can see at a glance whether your reliability is on track or your budget is burning down faster than expected.

What is an SLO?

A Service Level Objective (SLO) is an internal reliability target you set for yourself. For example: "99.5% of HTTP checks over the last 7 days should succeed" or "95% of checks should respond in under 500 ms."

An SLO is different from an SLA (a contractual commitment to customers). An SLO is a signal you use to manage engineering work — when you are well within your SLO you can move fast; when you are burning your budget you slow down and focus on reliability.

What is an error budget?

The error budget is the allowance implied by your SLO target. If your availability SLO is 99.5% over 7 days, the error budget is 0.5% of the 604,800 seconds in the window: 3,024 seconds, or approximately 50 minutes of allowed downtime. Every second of actual downtime consumes that allowance.
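The arithmetic above can be reproduced in a few lines. This is a minimal sketch; the function name allowed_downtime_seconds is made up for illustration and is not part of Nines.

```python
def allowed_downtime_seconds(slo_target: float, window_days: int) -> float:
    """Return the downtime allowance implied by an availability SLO.

    slo_target is a fraction (0.995 for 99.5%); window_days is the
    length of the SLO window in days.
    """
    window_seconds = window_days * 24 * 60 * 60
    return (1.0 - slo_target) * window_seconds


# The example from the text: 99.5% over 7 days.
budget = allowed_downtime_seconds(0.995, 7)
print(f"{budget:.0f} s, about {budget / 60:.1f} min")
```

For a 99.5% target over 7 days this works out to 3,024 seconds, a little over 50 minutes.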

When the budget reaches zero, the allowance is fully spent and the SLO target for the window has been missed. When it remains comfortably above zero, you have headroom to deploy, experiment, and accept planned maintenance without breaching your target.

How to read the error-budget panel

On the monitor detail page you will see an error-budget card that shows:

  • Budget remaining (%) — what fraction of the total error budget is still intact. 100% means zero failures in the window; 0% means the SLO has been breached.
  • Used / Allowed — the raw time (or fraction of checks) consumed vs. the total allowed. Seeing "12 min used / 50 min allowed" is often more intuitive than a bare percentage.
  • Burn rate — how fast you are consuming the budget relative to the sustainable pace. A burn rate of 1.0×, if sustained, uses exactly the whole budget by the end of the window. A sustained rate above 1.0× exhausts the budget before the window ends.
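The three figures on the card can be sketched as below. This assumes the common definition of burn rate as the observed failure ratio divided by the failure ratio the SLO allows; Nines may compute it differently, and the function names and sample numbers are hypothetical.

```python
def budget_remaining_pct(used: float, allowed: float) -> float:
    """Percentage of the error budget still intact, floored at 0."""
    if allowed <= 0:
        raise ValueError("allowed budget must be positive")
    return max(0.0, 1.0 - used / allowed) * 100.0


def burn_rate(failed_checks: int, total_checks: int, slo_target: float) -> float:
    """Observed failure ratio relative to the ratio the SLO allows.

    Assumed definition: 1.0 means failures arrive exactly at the
    sustainable pace; 2.0 means twice as fast.
    """
    if total_checks == 0:
        return 0.0
    observed_error_ratio = failed_checks / total_checks
    allowed_error_ratio = 1.0 - slo_target
    return observed_error_ratio / allowed_error_ratio


# "12 min used / 50 min allowed" leaves roughly 76% of the budget.
remaining = budget_remaining_pct(used=12, allowed=50)

# 1 failure in 100 recent checks against a 99.5% target burns at about 2x.
rate = burn_rate(failed_checks=1, total_checks=100, slo_target=0.995)
```

Both functions accept either time or check counts for used and allowed, as long as the two values share a unit.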

Burn rate states

The panel uses three visual states:

  • Normal — burn rate at or below 1.0×, budget healthy.
  • Warning — burn rate above 1.0× but budget not yet exhausted. You are trending toward a breach; investigate before it worsens.
  • Exhausted — the error budget has been consumed. Your SLO target was not met for this window.
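The three states can be read as a simple decision order: check for exhaustion first, then compare the burn rate against 1.0×. The sketch below follows the descriptions above; the function name panel_state is hypothetical and the actual panel logic may differ.

```python
def panel_state(burn_rate: float, budget_remaining_pct: float) -> str:
    """Map burn rate and remaining budget to one of the three states."""
    if budget_remaining_pct <= 0:
        # Budget fully consumed: the SLO target was not met.
        return "Exhausted"
    if burn_rate > 1.0:
        # Budget left, but it is being spent faster than is sustainable.
        return "Warning"
    return "Normal"


print(panel_state(burn_rate=0.4, budget_remaining_pct=76.0))  # Normal
print(panel_state(burn_rate=2.0, budget_remaining_pct=76.0))  # Warning
print(panel_state(burn_rate=2.0, budget_remaining_pct=0.0))   # Exhausted
```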

Two types of error budget

Nines tracks two independent error budgets for eligible monitors:

  • Availability SLO — based on the ratio of successful checks to total checks. Available for all monitor types on paid plans.
  • Latency SLO — based on the fraction of HTTP checks that respond within your threshold. Available for http_check monitors on the Business plan and above.
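To make the difference between the two ratios concrete, here is a minimal sketch that computes both from the same list of check results. The CheckResult type and its fields are hypothetical, and it assumes that a check with no response counts against the latency ratio; how Nines treats failed checks in the latency SLO is not specified here.

```python
from typing import NamedTuple, Optional, Sequence


class CheckResult(NamedTuple):
    succeeded: bool
    latency_ms: Optional[float]  # None when no response was received


def availability_ratio(results: Sequence[CheckResult]) -> float:
    """Successful checks divided by total checks."""
    if not results:
        return 1.0
    return sum(1 for r in results if r.succeeded) / len(results)


def latency_ratio(results: Sequence[CheckResult], threshold_ms: float) -> float:
    """Checks that responded under the threshold divided by total checks."""
    if not results:
        return 1.0
    fast = sum(
        1 for r in results
        if r.latency_ms is not None and r.latency_ms < threshold_ms
    )
    return fast / len(results)


results = [
    CheckResult(True, 120.0),
    CheckResult(True, 640.0),   # succeeded, but slower than 500 ms
    CheckResult(False, None),   # failed with no response
    CheckResult(True, 310.0),
]
availability = availability_ratio(results)      # 3 of 4 checks succeeded
latency = latency_ratio(results, threshold_ms=500)  # 2 of 4 were fast enough
```

The slow but successful check lowers only the latency ratio, which is why the two budgets can burn at different rates.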

Burn-rate incident detection

Nines watches your burn rate continuously in the background. When it detects a multi-window burn-rate alert pattern, it automatically opens a burn-rate incident. See Burn-Rate Incidents for details.
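As a rough illustration of what a multi-window pattern usually means, the sketch below raises an alert only when both a long and a short lookback window are burning above a threshold. The function name, the window pairing, and the 14.4× threshold are assumptions borrowed from commonly cited SRE guidance, not values confirmed for Nines.

```python
def should_open_incident(
    long_window_burn: float,
    short_window_burn: float,
    threshold: float = 14.4,  # hypothetical threshold, not a Nines setting
) -> bool:
    """Return True only when both windows exceed the threshold.

    The long window shows that the burn is significant; the short
    window shows that it is still happening right now.
    """
    return long_window_burn >= threshold and short_window_burn >= threshold


# Sustained and ongoing burn: alert.
print(should_open_incident(long_window_burn=16.0, short_window_burn=20.0))  # True

# The spike has already recovered in the short window: no alert.
print(should_open_incident(long_window_burn=16.0, short_window_burn=0.5))   # False
```

Requiring both windows avoids alerting on a brief spike that has already ended, while still reacting quickly to a burn that is ongoing.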