GitHub Enhances Service Health Communication with New Status Page Updates

From Eatin3d, the free encyclopedia of technology

Introduction

GitHub, the platform where millions of developers collaborate on critical projects, has long understood the weight of its responsibility to maintain reliable service. Earlier this year, the company acknowledged recent availability challenges and outlined ongoing investments to bolster reliability. Alongside these technical improvements, GitHub has now turned its attention to how it communicates service health—both during and after incidents. The goal is to provide more precise, transparent, and timely information about the platform's operational status.

Source: github.blog

Starting today, three key changes are being rolled out to the GitHub status page, each designed to offer deeper insight into service performance and incident impact. These updates center on a new incident severity level, per-service uptime metrics, and more granular component reporting, all inspired by the principles of accuracy, transparency, and timeliness.

Introducing a New Severity Level: Degraded Performance

Previously, GitHub classified all service disruptions as either a Partial Outage or a Major Outage. This binary system often failed to capture the true nature of an incident, especially when a service remained functional but with reduced performance. For example, if a service experienced elevated latency or intermittent errors without becoming completely unavailable, it was still labeled a Partial Outage—leading users to believe the service was down when it was still accessible.

To address this, GitHub has introduced a Degraded Performance severity level. This new state sits between a fully operational service and a Partial Outage, creating a three-tier classification system that more accurately reflects the spectrum of issues that can affect GitHub services. Here's what each level means:

  • Degraded Performance: The service is operational but impaired. Users may experience elevated latency, reduced functionality, or intermittent errors affecting a small percentage of requests. This level does not count as downtime in uptime calculations.
  • Partial Outage: A significant portion of the service is unavailable or severely impacted for a meaningful number of users. This carries a downtime weight of 30% in uptime calculations.
  • Major Outage: The service is broadly unavailable, affecting most or all users. This counts as 100% downtime for the duration.
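
The three levels above can be written down as a small lookup table. The following is a minimal sketch, not GitHub's implementation: the names `Severity`, `DOWNTIME_WEIGHT`, and `counts_as_downtime` are hypothetical, and only the level names and the 0%, 30%, and 100% weights come from the announcement.

```python
from enum import Enum


class Severity(Enum):
    """The three incident severity levels described above."""

    DEGRADED_PERFORMANCE = "Degraded Performance"
    PARTIAL_OUTAGE = "Partial Outage"
    MAJOR_OUTAGE = "Major Outage"


# Share of an incident's duration that counts as downtime (hypothetical name;
# the weights themselves are the ones stated in the list above).
DOWNTIME_WEIGHT = {
    Severity.DEGRADED_PERFORMANCE: 0.0,
    Severity.PARTIAL_OUTAGE: 0.3,
    Severity.MAJOR_OUTAGE: 1.0,
}


def counts_as_downtime(severity: Severity) -> bool:
    """Return True if any part of an incident at this level counts as downtime."""
    return DOWNTIME_WEIGHT[severity] > 0


for level in Severity:
    print(f"{level.value}: weight {DOWNTIME_WEIGHT[level]:.0%}, "
          f"counts as downtime: {counts_as_downtime(level)}")
```

Keeping the weights in a separate table rather than on the enum itself makes it easy to see at a glance that only Degraded Performance carries no downtime.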

With this change, users can now better understand the actual impact of an incident. A Degraded Performance alert means the service is still usable, just not at its peak. This reduces unnecessary alarm while still providing honest communication about issues.

Per-Service Uptime Metrics on the Status Page

Another significant update is the publication of per-service uptime percentages directly on the GitHub status page. These metrics show the reliability track record of each individual service over the last 90 days, making it easy to assess how a particular offering (such as Actions, Pages, or the API) has been performing.

The uptime percentages are calculated using a weighted downtime model based on incident severity and duration, following industry-standard practices. Each severity level carries a specific downtime weight:

  • Major Outage: 100% of the incident duration counts as downtime.
  • Partial Outage: 30% of the duration counts as downtime.
  • Degraded Performance: 0% counts as downtime (since the service remains functional).

For example, if a service experienced a single 1-hour Partial Outage during the 90-day window, that incident would contribute 18 minutes of effective downtime (60 minutes × 30%) to the uptime calculation, not the full hour. A Degraded Performance event would add zero downtime, even if it lasted for hours. This approach ties the uptime metric to actual service unavailability rather than to periods of reduced performance.
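
The arithmetic in this example can be checked with a short script. This is a sketch under stated assumptions: the announcement gives the weights and the 90-day window but not the exact formula, so the script assumes uptime is one minus weighted downtime divided by the total minutes in the window. The names `WEIGHTS`, `effective_downtime_minutes`, and `uptime_percent` are hypothetical.

```python
WINDOW_DAYS = 90
MINUTES_PER_DAY = 24 * 60

# Downtime weights per severity level, as stated in the article.
WEIGHTS = {
    "major_outage": 1.0,
    "partial_outage": 0.3,
    "degraded_performance": 0.0,
}


def effective_downtime_minutes(incidents):
    """Sum each incident's duration multiplied by its severity weight.

    `incidents` is a list of (severity, duration_in_minutes) pairs.
    """
    return sum(WEIGHTS[severity] * minutes for severity, minutes in incidents)


def uptime_percent(incidents, window_days=WINDOW_DAYS):
    """Assumed formula: 100 * (1 - weighted downtime / minutes in the window)."""
    window_minutes = window_days * MINUTES_PER_DAY
    return 100.0 * (1 - effective_downtime_minutes(incidents) / window_minutes)


# One 1-hour Partial Outage plus a 3-hour Degraded Performance event.
incidents = [("partial_outage", 60), ("degraded_performance", 180)]

print(effective_downtime_minutes(incidents))  # 18.0 minutes
print(round(uptime_percent(incidents), 3))    # 99.986
```

Under that assumed formula, the 18 weighted minutes are measured against the 129,600 minutes in a 90-day window, so the service would still show roughly 99.986% uptime.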

By publishing these numbers, GitHub empowers users to make informed decisions about their reliance on specific services and to track improvement over time.

More Granular Insights: Copilot AI Model Providers Component

GitHub is also increasing the granularity of its incident reporting. Starting with a dedicated component for Copilot AI Model Providers, the status page will now surface more specific details about the source of certain disruptions. Previously, an issue affecting the underlying AI models powering GitHub Copilot might have been reported under a broader service category, leaving users uncertain about the root cause.

With this dedicated component, GitHub can communicate more clearly when a disruption is tied to a third-party model provider versus a core Copilot service. This level of detail helps developers and organizations plan their workflows with greater confidence, knowing exactly which part of the stack is impacted.
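
To illustrate how a consumer of the status page might use this distinction, here is a minimal sketch that works on a hard-coded, made-up payload. Everything in it is an assumption except the component name "Copilot AI Model Providers": the payload shape, the `status` strings, and the function names `impacted_components` and `copilot_impact_source` are hypothetical and do not describe GitHub's actual status API.

```python
# Hypothetical sample data, not real status output.
SAMPLE_COMPONENTS = [
    {"name": "Copilot", "status": "operational"},
    {"name": "Copilot AI Model Providers", "status": "degraded_performance"},
    {"name": "Actions", "status": "operational"},
]


def impacted_components(components):
    """Return the names of all components that are not fully operational."""
    return [c["name"] for c in components if c["status"] != "operational"]


def copilot_impact_source(components):
    """Say whether a Copilot disruption points at the model providers,
    the core Copilot service, both, or neither."""
    status = {c["name"]: c["status"] for c in components}
    provider_ok = status.get("Copilot AI Model Providers", "operational") == "operational"
    core_ok = status.get("Copilot", "operational") == "operational"
    if provider_ok and core_ok:
        return "none"
    if core_ok:
        return "model provider"
    if provider_ok:
        return "core Copilot service"
    return "both"


print(impacted_components(SAMPLE_COMPONENTS))
print(copilot_impact_source(SAMPLE_COMPONENTS))
```

With the sample data above, only the model-provider component is flagged, which is the case the dedicated component is meant to make visible.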

Over time, GitHub plans to extend this granular component approach to other services, offering even more transparency into the platform's health.

Conclusion: A Commitment to Clearer Communication

These three updates—a new Degraded Performance severity level, per-service uptime metrics, and more granular component reporting—represent a significant step forward in how GitHub communicates service health. By aligning incident classification with actual user impact, publishing transparent reliability data, and offering finer-grained status details, GitHub is making it easier for developers to understand what's happening and how it affects them.

As the platform continues to invest in reliability, this improved communication ensures that the developer community stays informed with accuracy and clarity. Check the GitHub Status Page to see these changes in action.