5 Key Insights into Kubernetes Server-Side Sharded List and Watch (v1.36 Alpha)
Kubernetes clusters are scaling to tens of thousands of nodes, and with that growth comes a pressing challenge: controllers that watch high-cardinality resources like Pods can hit a performance wall. Each replica of a horizontally scaled controller receives the full event stream from the API server, wasting CPU, memory, and network bandwidth on events it doesn't need. The upcoming Kubernetes v1.36 introduces an alpha feature—server-side sharded list and watch—that promises to flip this paradigm. Here are five essential things you need to know about this feature, from the problem it solves to how you can leverage it in your own controllers.
- 1. The Scaling Challenge with High-Cardinality Resources
- 2. Client-Side Sharding: Why It Falls Short
- 3. Server-Side Sharding: How It Solves the Problem
- 4. Technical Implementation: ShardSelector and Hash Ranges
- 5. Integrating Sharded Watches into Controllers with Informers
1. The Scaling Challenge with High-Cardinality Resources
As Kubernetes clusters grow to tens of thousands of nodes, controllers that watch high-cardinality resources like Pods face a scaling wall. Every replica of a horizontally scaled controller receives the full stream of events from the API server, paying the CPU, memory, and network cost to deserialize everything, only to discard the objects it is not responsible for. Scaling out the controller does not reduce the per-replica cost; it only multiplies the total. This inefficiency becomes a bottleneck in large clusters, where the volume of events can overwhelm even robust deployments. The core issue is that the API server broadcasts all events to every interested party, regardless of whether a given party actually needs them. This design worked well for smaller clusters, but as the number of nodes and workloads grows, the cost of processing all those events becomes prohibitive. The server-side sharded list and watch feature aims to change this by filtering events at the source, the API server, so that each controller replica only sees the data it owns.
2. Client-Side Sharding: Why It Falls Short
Some controllers, such as kube-state-metrics, already support horizontal sharding: each replica is assigned a portion of the keyspace and discards objects that do not belong to it. While this works functionally, it does not reduce the volume of data flowing from the API server. With client-side sharding, N replicas each receive the full event stream: every replica deserializes and processes every event, then throws away what it does not need. Network bandwidth scales with the number of replicas, not with shard size, and the CPU spent deserializing the discarded fraction is wasted. In large clusters this means that adding replicas to handle load increases total traffic and processing overhead linearly while doing nothing to shrink the per-replica cost. Client-side sharding is a logical first step, but it is not a true scalability solution because the bottleneck remains at the API server and network level. It is like having every employee in a company read every email and then act only on the messages relevant to them; the inefficiency is obvious.
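To make the waste concrete, here is a minimal Go sketch of the client-side pattern, assuming a replica filters by hashing each object's UID and taking it modulo the shard count. The belongsToShard helper and the hashing details are illustrative assumptions, not kube-state-metrics' actual code:

package main

import (
    "fmt"
    "hash/fnv"
)

// belongsToShard reports whether this replica owns the object: hash a stable
// key (here the object UID) with 64-bit FNV-1a and take it modulo the total
// number of shards. Hypothetical helper for illustration only.
func belongsToShard(uid string, shardIndex, totalShards uint64) bool {
    h := fnv.New64a()
    h.Write([]byte(uid))
    return h.Sum64()%totalShards == shardIndex
}

func main() {
    // Every replica still receives and deserializes every object; the filter
    // only decides what gets thrown away afterwards.
    for _, uid := range []string{"uid-aaa", "uid-bbb", "uid-ccc"} {
        fmt.Println(uid, "owned by replica 0 of 2:", belongsToShard(uid, 0, 2))
    }
}

The filtering itself is cheap; the expensive part is everything that happens before it, which is exactly what server-side sharding removes.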
3. Server-Side Sharding: How It Solves the Problem
Server-side sharded list and watch removes this inefficiency by moving the filtering upstream into the API server. Each replica tells the API server which hash range it owns, and the API server sends only the matching events. The full event stream is no longer broadcast to every replica; each replica receives the filtered subset that corresponds to its assigned shard. The result is a dramatic reduction in network bandwidth, CPU usage, and memory consumption for the controller replicas, along with less redundant serialization work on the API server. The feature applies to both list responses and watch event streams, so from the initial list operation through every subsequent watch event, only the relevant data is transmitted. The approach scales with the number of shards: if you double the replicas, each replica handles half the data, rather than the total traffic doubling. It is a fundamental shift that treats the API server not as a dumb broadcaster but as an intelligent filter, making large-scale cluster management far more efficient.
4. Technical Implementation: ShardSelector and Hash Ranges
The feature adds a shardSelector field to ListOptions. Clients specify a hash range using the shardRange() function, such as shardRange(object.metadata.uid, '0x0000000000000000', '0x8000000000000000'). The API server computes a deterministic 64-bit FNV-1a hash of the specified field and returns only objects whose hash falls within the range [start, end). This applies to both list responses and watch event streams. The hash function produces the same result across all API server instances, so the feature is safe to use with multiple API server replicas. Currently supported field paths are object.metadata.uid and object.metadata.namespace. This means you can shard based on unique object IDs or namespace boundaries, giving flexibility in how work is distributed. The hash-based approach ensures even distribution of objects across shards, assuming the field has good entropy. For a two-replica setup, you would split the 64-bit space in half: one replica takes [0, 0x8000000000000000) and the other takes [0x8000000000000000, 0xFFFFFFFFFFFFFFFF). As the cluster grows, you can add more replicas and adjust the ranges accordingly.
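The matching logic can be sketched in a few lines of Go. The inShardRange helper below is hypothetical and only mirrors the semantics described above (a 64-bit FNV-1a hash of the selected field checked against a half-open range); it is not the API server's actual code:

package main

import (
    "fmt"
    "hash/fnv"
    "strconv"
)

// inShardRange computes a 64-bit FNV-1a hash of the selected field value and
// keeps the object if the hash falls in the half-open interval [start, end).
// Illustrative sketch, not apiserver code.
func inShardRange(fieldValue, startHex, endHex string) (bool, error) {
    start, err := strconv.ParseUint(startHex, 0, 64) // base 0 accepts "0x..." literals
    if err != nil {
        return false, err
    }
    end, err := strconv.ParseUint(endHex, 0, 64)
    if err != nil {
        return false, err
    }
    h := fnv.New64a()
    h.Write([]byte(fieldValue))
    sum := h.Sum64()
    return sum >= start && sum < end, nil
}

func main() {
    ok, err := inShardRange("7d2f0c9a-example-uid", "0x0000000000000000", "0x8000000000000000")
    fmt.Println("lower shard owns the object:", ok, "err:", err)
}

Half-open ranges make it easy to tile the hash space without overlap: adjacent shards share a boundary value that belongs to exactly one of them.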
5. Integrating Sharded Watches into Controllers with Informers
Controllers typically use informers to list and watch resources. To shard the workload, each replica injects the shardSelector into the ListOptions used by its informers via WithTweakListOptions. For example, a Go controller might use:
import (
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/informers"
)

// Replica 0 owns the lower half of the 64-bit hash space.
shardSelector := "shardRange(object.metadata.uid, '0x0000000000000000', '0x8000000000000000')"

// client is a kubernetes.Interface and resyncPeriod a time.Duration, created elsewhere.
factory := informers.NewSharedInformerFactoryWithOptions(client, resyncPeriod,
    informers.WithTweakListOptions(func(opts *metav1.ListOptions) {
        // Every list and watch issued by this factory's informers carries the shard selector.
        opts.ShardSelector = shardSelector
    }),
)
For a two-replica deployment, the selectors split the hash space in half: Replica 0 gets the lower half and Replica 1 gets the upper half. This integration requires minimal changes to existing controllers—just add the shard selector tweak and ensure each replica has a unique range. The feature is still in alpha in Kubernetes v1.36, so it requires the ServerSideShardedListWatch feature gate to be enabled. Early adopters can experiment with it in development clusters to validate the performance gains. As the feature matures, it will likely become a standard tool for scaling controllers in large clusters.
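Rather than hard-coding a selector per replica, a deployment could derive each replica's range from its index and the total replica count. The shardRangeFor helper below is an illustrative assumption (including how a replica learns its index, for example from a StatefulSet ordinal, and how the top of the hash space is expressed); it is not part of the feature itself:

package main

import "fmt"

// shardRangeFor splits the 64-bit hash space into equal slices and returns
// the shard selector string for the replica at the given index.
func shardRangeFor(index, total uint64) string {
    width := ^uint64(0)/total + 1 // size of each slice
    start := index * width
    end := start + width
    if index == total-1 {
        end = ^uint64(0) // the last replica takes the remainder of the space
    }
    return fmt.Sprintf(
        "shardRange(object.metadata.uid, '0x%016X', '0x%016X')",
        start, end,
    )
}

func main() {
    // For two replicas this reproduces the split described above.
    for i := uint64(0); i < 2; i++ {
        fmt.Printf("replica %d: %s\n", i, shardRangeFor(i, 2))
    }
}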
Conclusion
Server-side sharded list and watch represents a significant step forward in Kubernetes scalability, addressing a pain point that has plagued operators of large clusters for years. By shifting filtering from client to server, it reduces redundant data transfer and processing, making controller scaling both more efficient and more cost-effective. As an alpha feature in v1.36, it offers a glimpse into a future where even the largest clusters can be managed with fewer resources and less complexity. Whether you're running kube-state-metrics, custom controllers, or other watch-intensive components, this feature is worth tracking and testing. Keep an eye on the Kubernetes changelog for when it graduates to beta and stable, and start planning how your controllers can leverage sharded watches to achieve better performance at scale.