realsla.is

Real uptime behind the SLA promises

How we calculate uptime

Every service we track publishes a public status page listing each incident - when it started, when it was resolved, and how severe it was. We read those lists and compute what percentage of time the service was actually affected. We don't run independent health checks; if a provider didn't report it, we don't count it.

The tricky part: services open a separate incident for each affected component. On a bad day, GitHub might have simultaneous incidents for Git operations, Copilot, and Webhooks all running at once. Simply adding up their durations double-counts the overlap and produces nonsense like −131% uptime.

So instead of summing incident durations, we merge any overlapping windows first, then measure the total affected time.
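The merge step can be sketched in Python as follows. This is an illustrative sketch, not the site's actual code: the function name `merged_minutes` and the representation of each incident as a `(start, end)` pair of datetimes are assumptions made for the example, and the timestamps are invented.

```python
from datetime import datetime, timedelta


def merged_minutes(incidents):
    """Total minutes covered by at least one incident.

    `incidents` is an iterable of (start, end) datetime pairs
    (an assumed shape, chosen for illustration).
    """
    total = timedelta()
    cur_start = cur_end = None
    for start, end in sorted(incidents):
        if cur_end is None or start > cur_end:
            # Gap before this incident: close the previous window.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlaps (or touches) the current window: extend it.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total.total_seconds() / 60


# Hypothetical timestamps: two incidents overlap, one stands alone.
t0 = datetime(2026, 1, 1, 9, 0)
sample = [
    (t0, t0 + timedelta(minutes=60)),
    (t0 + timedelta(minutes=30), t0 + timedelta(minutes=90)),
    (t0 + timedelta(minutes=200), t0 + timedelta(minutes=220)),
]
naive = sum((end - start).total_seconds() / 60 for start, end in sample)
merged = merged_minutes(sample)
```

With these made-up timestamps the naive sum is 140 minutes, while the merged total is 110: the 30 minutes during which the first two incidents overlap are counted once.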

Real example - GitHub, March 19, 2026

Three incidents were open at the same time that morning:

Git ops (west coast): 460 min
Copilot Agent: 49 min
Copilot sessions: 47 min

Actual affected window: 460 min

Naïve sum: 556 min
After merging: 460 min

GitHub had ~40 such overlapping incidents across March 2026. After merging, the total affected time was 3,730 minutes - which is where the 91.6% uptime figure for that month comes from.
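The step from affected minutes to a monthly percentage is simple arithmetic. The sketch below assumes the denominator is the full calendar month in minutes (31 days for March, with no daylight-saving adjustment); the function name `uptime_percent` is hypothetical.

```python
import calendar


def uptime_percent(affected_minutes, year, month):
    """Share of the calendar month not covered by any incident window."""
    days = calendar.monthrange(year, month)[1]
    total_minutes = days * 24 * 60  # assumes fixed 24-hour days
    return 100 * (1 - affected_minutes / total_minutes)


march = uptime_percent(3730, 2026, 3)
print(f"{march:.1f}%")  # prints "91.6%"
```

March 2026 has 44,640 minutes, so 3,730 affected minutes leave about 91.6% of the month unaffected. The same formula goes negative when unmerged durations add up to more than the month itself, which is how a naive sum can produce a figure below 0%.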

Exceptions - the * next to a number

Some incidents are excluded from a service's uptime calculation, and the number is marked with an asterisk (e.g. 95.27%*).

The incident was caused by an upstream cloud provider (AWS, Azure, GCP) and was outside the vendor's control - the kind of failure that is typically carved out of commercial SLA contracts as force majeure.

For example, a 34-day Snowflake incident scoped entirely to AWS Middle East (UAE) qualifies: it traced to an upstream cloud provider and touched only a single, non-mainstream region. Excluding it brings the 90-day number closer to what most users actually experienced. Incidents that affected mainstream regions - even if labelled "degraded" - are always included.

Exceptions are manually curated, not applied automatically. We err on the side of inclusion: if there is any doubt, the incident stays in the calculation.
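One way to picture a manually curated exception list is a hand-maintained set of incident IDs that is filtered out before the uptime calculation. Everything in this sketch is hypothetical: the `EXCLUDED_IDS` set, the placeholder ID, the `id` field, and both function names.

```python
# Hypothetical hand-maintained list; the ID is a placeholder.
EXCLUDED_IDS = {"example-incident-id"}


def apply_exclusions(incidents, excluded_ids=EXCLUDED_IDS):
    """Return (kept incidents, whether anything was excluded)."""
    kept = [inc for inc in incidents if inc["id"] not in excluded_ids]
    return kept, len(kept) != len(incidents)


def format_uptime(percent, has_exclusions):
    """Render e.g. 95.27 as '95.27%', adding '*' if exclusions applied."""
    return f"{percent:.2f}%" + ("*" if has_exclusions else "")


kept, starred = apply_exclusions(
    [{"id": "example-incident-id"}, {"id": "some-other-incident"}]
)
label = format_uptime(95.27, starred)
```

Because an incident is dropped only when its ID appears in the list, anything not explicitly listed stays in the calculation, which matches the err-on-inclusion rule above.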

A note on what "uptime" means here

We include all reported incidents - minor degradations to specific features as well as major outages. Providers' stated SLAs (e.g. GitHub Enterprise's 99.9%) typically cover only core services under specific conditions, not every component. Our numbers will often be lower, and that's intentional: we're showing the full picture of what the provider's own status page reported.

The status page API returns roughly the last 50 incidents, covering around 4–6 weeks of history. Months outside that window appear as grey squares with no data.
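A rough sketch of how a month could be classified as "no data": compare it with the span between the oldest incident the API returned and the present. This is an assumption about the logic, not a description of the site's code. In particular, the sketch treats a partially covered month as having data, and the name `month_has_data` and its parameters are hypothetical.

```python
from datetime import datetime, timezone


def month_has_data(year, month, oldest_seen, now):
    """True if any part of the month falls inside [oldest_seen, now]."""
    month_start = datetime(year, month, 1, tzinfo=timezone.utc)
    if month == 12:
        next_month = datetime(year + 1, 1, 1, tzinfo=timezone.utc)
    else:
        next_month = datetime(year, month + 1, 1, tzinfo=timezone.utc)
    return month_start <= now and next_month > oldest_seen


# Hypothetical window: the oldest returned incident began on 20 Feb 2026.
oldest = datetime(2026, 2, 20, tzinfo=timezone.utc)
today = datetime(2026, 3, 31, tzinfo=timezone.utc)
january = month_has_data(2026, 1, oldest, today)   # False: before the window
february = month_has_data(2026, 2, oldest, today)  # True: partly covered
```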