Kubernetes v1.36: A New Era for Workload-Aware Scheduling in Production Clusters

Why this matters

For businesses running complex batch or AI/ML workloads on Kubernetes, scheduling challenges go beyond placing pods one by one. These workloads often require multiple pods to start simultaneously to avoid wasted compute resources or deadlocks. The older pod-by-pod approach can lead to partial deployments, inefficient resource use, and unpredictable scheduling delays.

Kubernetes v1.36 addresses these issues with an evolved scheduling architecture that separates workload templates from runtime scheduling state. This brings a clearer model for managing groups of pods as units, enabling atomic scheduling decisions and opening the door to enhanced features like topology-aware placement and workload-aware preemption. For healthcare and professional services SMBs, where workload reliability and efficient cloud spend are non-negotiable, these improvements can translate into more stable and predictable environments.

The release also integrates workload-aware scheduling with Kubernetes’ native Job controller, which simplifies the management of tightly coupled parallel workloads. That lightens the operational burden on teams that must meet compliance and uptime mandates while controlling cloud costs.

What usually goes wrong

A common pain point in Kubernetes scheduling is handling "gang scheduling," where a set of pods must start together to do any useful work. Traditional pod scheduling evaluates each pod independently, so a group can end up only partially scheduled when resources are tight. Some pods then run while others wait indefinitely, which wastes compute and can lead to missed SLAs.

Moreover, the previous approach embedded runtime scheduling information within the workload object itself, which coupled static configuration with dynamic state. This made scaling and status tracking cumbersome, especially as workloads grew in size and complexity. Without clear separation, the scheduler had to watch and parse workload templates constantly, reducing efficiency.

Another issue arises with topology and resource locality. When the pods of one workload are placed without regard to network topology, latency and bottlenecks can degrade performance, especially for AI or batch jobs sensitive to intra-cluster communication delays. Prior Kubernetes versions offered per-pod mechanisms such as pod affinity and topology spread constraints, but had limited capabilities for applying an explicit topology constraint to a group of pods as a whole.

Preemption (the process of evicting running pods to free space for higher-priority ones) was also pod-centric. Because victims were chosen to make room for one pod at a time, preemption on behalf of a group workload often failed to free enough resources across the cluster for the whole group, limiting the scheduler's ability to guarantee gang scheduling.

Lastly, managing device resources such as GPUs at scale was inefficient, as ResourceClaims were tied to individual pods. This limited sharing and complicated scenarios where multiple pods needed coordinated access to shared devices.

A better Cloudain-style approach

Kubernetes v1.36 introduces a clean decoupling between the static Workload API and a new PodGroup API that manages runtime scheduling state. The Workload becomes a static template describing pod group composition and policies, while PodGroup instances represent the current scheduling state of those pods. This architectural shift enhances scalability by enabling per-replica sharding of status updates and streamlining the scheduler's logic to focus on runtime scheduling data.
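To make the template/state split concrete, here is a minimal Python sketch of the idea. It is not the real API schema: the class names WorkloadTemplate and PodGroupState and the fields min_count, replica, and scheduled_pods are hypothetical, chosen only to show an immutable template next to per-replica runtime state.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class WorkloadTemplate:
    """Static description of a pod group (hypothetical fields, not the real API)."""
    name: str
    min_count: int  # assumed: how many pods must be placed together


@dataclass
class PodGroupState:
    """Runtime scheduling state for one replica of a workload (hypothetical)."""
    workload: str
    replica: int
    scheduled_pods: List[str] = field(default_factory=list)

    def is_gang_satisfied(self, template: WorkloadTemplate) -> bool:
        # The template is only read here; status lives on the PodGroup side.
        return len(self.scheduled_pods) >= template.min_count


template = WorkloadTemplate(name="train", min_count=3)
groups = [PodGroupState(workload="train", replica=i) for i in range(2)]
groups[0].scheduled_pods += ["p0", "p1", "p2"]
```

Because each replica carries its own state object, a status update for one replica never touches the template or the other replicas, which is the property the paragraph above attributes to the new design.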

The kube-scheduler gains a dedicated PodGroup scheduling cycle, which treats a PodGroup as an atomic unit. When scheduling, it evaluates all pods in the group at once, taking a single cluster snapshot to avoid race conditions. This means either the entire group is scheduled together if resources allow, or none are scheduled and the group waits. Such atomic scheduling prevents partial deployments and deadlocks, improving resource efficiency.
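The all-or-nothing behaviour can be illustrated with a toy model. The sketch below is not the kube-scheduler's algorithm: it assumes a single resource (CPU), uses a simple largest-first, first-fit heuristic that can miss some feasible packings, and the function name schedule_group is made up for illustration. What it does show is the shape of the decision: work on one snapshot, and commit only if every pod fits.

```python
from typing import Dict, List, Optional


def schedule_group(free_cpu: Dict[str, int],
                   pod_requests: List[int]) -> Optional[Dict[int, str]]:
    """Place every pod in the group against one snapshot of free CPU.

    Returns a pod-index -> node-name mapping if all pods fit, else None.
    The caller's free_cpu dict is only modified when the whole group fits.
    """
    snapshot = dict(free_cpu)  # single point-in-time copy of cluster state
    placement: Dict[int, str] = {}
    # Largest pods first (first-fit-decreasing heuristic, an assumption).
    order = sorted(range(len(pod_requests)), key=lambda i: -pod_requests[i])
    for idx in order:
        need = pod_requests[idx]
        node = next((n for n, free in snapshot.items() if free >= need), None)
        if node is None:
            return None  # one pod does not fit, so nothing is committed
        snapshot[node] -= need
        placement[idx] = node
    free_cpu.update(snapshot)  # commit all placements together
    return placement


cluster = {"node-a": 4, "node-b": 2}
fits = schedule_group(cluster, [2, 2, 2])      # whole group is placed
too_big = schedule_group({"node-a": 4, "node-b": 2}, [3, 3])  # None
```

In the second call the first pod would fit on its own, but because the second does not, no capacity is consumed and the group simply waits.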

To address the challenges of network topology, this release introduces topology-aware scheduling at the PodGroup level. You can specify constraints, such as rack-level affinity, so pods are co-located to reduce communication latency. The scheduler generates candidate placements based on these constraints, checks each one for feasibility, and selects the best-scoring set of nodes. This capability is critical for distributed AI workloads where latency directly affects throughput and cost.
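The generate, filter, and select flow can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the scheduler's implementation: capacity is counted in abstract pod slots, the topology domain is a rack label, and the "tightest fit" scoring rule and the name pick_domain are choices made for this example only.

```python
from collections import defaultdict
from typing import Dict, Optional, Tuple


def pick_domain(nodes: Dict[str, Tuple[str, int]], pods: int) -> Optional[str]:
    """Choose one rack that can hold the whole group.

    nodes maps node name -> (rack label, free pod slots).
    Returns the rack name, or None if no single rack has enough capacity.
    """
    # 1. Generate candidates: one per topology domain (rack).
    capacity: Dict[str, int] = defaultdict(int)
    for rack, slots in nodes.values():
        capacity[rack] += slots
    # 2. Filter: keep racks where the entire group fits.
    feasible = {rack: cap for rack, cap in capacity.items() if cap >= pods}
    if not feasible:
        return None
    # 3. Select: least leftover capacity wins (assumed scoring rule),
    #    with the rack name as a deterministic tie-breaker.
    return min(feasible, key=lambda rack: (feasible[rack] - pods, rack))


nodes = {
    "n1": ("rack-1", 2),
    "n2": ("rack-1", 1),
    "n3": ("rack-2", 4),
    "n4": ("rack-2", 4),
}
```

With this data, a 3-pod group lands in rack-1 (an exact fit), a 5-pod group needs rack-2, and a 9-pod group has no feasible rack and stays pending.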

For resource contention scenarios, workload-aware preemption treats entire PodGroups as preemptor units. Unlike traditional pod-by-pod preemption, this allows the scheduler to evict pods across multiple nodes simultaneously, freeing enough cluster-wide resources to schedule the whole PodGroup. Additional PodGroup fields control priority and whether pods can be preempted individually or as a group, providing granular control over disruption policies.
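A toy version of group-level victim selection might look like the sketch below. It is heavily simplified and makes several assumptions: CPU is treated as one cluster-wide pool (per-node fit is ignored), victims are taken lowest priority first, and the names RunningPod and select_victims are hypothetical. The point it illustrates is that victims may come from several nodes, and that nothing is evicted unless the whole group's shortfall can be covered.

```python
from typing import List, NamedTuple, Optional


class RunningPod(NamedTuple):
    name: str
    node: str
    priority: int
    cpu: int


def select_victims(running: List[RunningPod], free_cpu: int,
                   group_cpu: int, group_priority: int
                   ) -> Optional[List[RunningPod]]:
    """Pick pods to evict so the whole group fits, or None if impossible."""
    shortfall = group_cpu - free_cpu
    if shortfall <= 0:
        return []  # the group already fits, no eviction needed
    # Only strictly lower-priority pods are eligible; evict lowest first,
    # preferring larger pods among equals to keep the victim list short.
    candidates = sorted((p for p in running if p.priority < group_priority),
                        key=lambda p: (p.priority, -p.cpu))
    victims: List[RunningPod] = []
    for pod in candidates:
        victims.append(pod)
        shortfall -= pod.cpu
        if shortfall <= 0:
            return victims
    return None  # evicting every candidate would still not be enough


running = [
    RunningPod("a", "n1", priority=10, cpu=2),
    RunningPod("b", "n2", priority=5, cpu=1),
    RunningPod("c", "n2", priority=100, cpu=4),
]
victims = select_victims(running, free_cpu=1, group_cpu=4, group_priority=50)
```

Here the group needs 3 more CPUs, so pods b and a are selected from two different nodes, while the higher-priority pod c is left alone. If the group needed 8 CPUs with nothing free, the function would return None and evict nobody.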

On the resource allocation front, v1.36 enhances Dynamic Resource Allocation (DRA) support by letting a single ResourceClaim be associated with a PodGroup for shared devices like GPUs, rather than with each pod separately. This reduces the overhead of managing hundreds of individual claims in large workloads and allows efficient device sharing within pod groups.

Finally, the Job controller now integrates with this workload-aware scheduling framework. When enabled, it automatically creates the relevant Workload and PodGroup objects for qualifying Jobs. This relieves operators from manually wiring pod groups and scheduling annotations, simplifying the rollout of gang-scheduled, tightly coupled batch jobs.

A simple next step

For SMB teams managing Kubernetes clusters, the immediate step is to experiment with the alpha features offered in v1.36 in a testing environment. Enabling the required feature gates, such as GenericWorkload, GangScheduling, TopologyAwareWorkloadScheduling, and WorkloadAwarePreemption, can help validate whether workload-aware scheduling fits the operational model.
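Feature gates are passed to Kubernetes components through the --feature-gates flag as comma-separated Name=true pairs. The small helper below builds that flag value from the gate names listed above. Treat it as a sketch: the list may not be complete for every feature, which components need the flag depends on your setup, and alpha API groups usually have to be enabled separately on the API server, so check the release documentation before relying on it. The helper name feature_gates_flag is made up for this example.

```python
from typing import Iterable

# Gate names as listed in this article; the full set may differ per feature.
WORKLOAD_GATES = [
    "GenericWorkload",
    "GangScheduling",
    "TopologyAwareWorkloadScheduling",
    "WorkloadAwarePreemption",
]


def feature_gates_flag(gates: Iterable[str], enabled: bool = True) -> str:
    """Render a --feature-gates flag in the comma-separated key=value form."""
    value = "true" if enabled else "false"
    return "--feature-gates=" + ",".join(f"{gate}={value}" for gate in gates)


flag = feature_gates_flag(WORKLOAD_GATES)
print(flag)
```

Generating the flag from one list keeps test clusters consistent when the same gates have to be set on more than one component.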

Start by identifying workloads that require gang scheduling properties, such as distributed training jobs or batch workloads with strict pod co-dependencies. Evaluate how topology-aware scheduling might improve latency-sensitive operations by specifying topology constraints.

Simultaneously, assess whether device sharing through PodGroup-aligned ResourceClaims could reduce complexity in managing GPU or NIC resources. Trying out the Job controller integration for indexed Jobs can demonstrate how workload-aware scheduling reduces manual configuration overhead.

Monitoring and observability are key. Track improvements in pod scheduling times, resource utilization, and frequency of preemptions. This data will guide decisions on scaling these features into production.
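For gang-scheduled workloads, the number that matters is how long the whole group waited, not each pod individually. The sketch below computes that from per-pod timestamps. The record layout (group name, creation time, scheduled time as ISO 8601 strings) and the function name group_latency_seconds are assumptions for illustration; in practice the timestamps would come from whatever your observability pipeline exports.

```python
from datetime import datetime
from typing import Dict, Iterable, Tuple

Record = Tuple[str, str, str]  # (group name, created, scheduled), ISO 8601


def group_latency_seconds(records: Iterable[Record]) -> Dict[str, float]:
    """Seconds from a group's first pod creation to its last pod scheduling."""
    first_created: Dict[str, datetime] = {}
    last_scheduled: Dict[str, datetime] = {}
    for group, created, scheduled in records:
        c = datetime.fromisoformat(created)
        s = datetime.fromisoformat(scheduled)
        if group not in first_created or c < first_created[group]:
            first_created[group] = c
        if group not in last_scheduled or s > last_scheduled[group]:
            last_scheduled[group] = s
    return {g: (last_scheduled[g] - first_created[g]).total_seconds()
            for g in first_created}


records = [
    ("train-a", "2026-01-01T00:00:00+00:00", "2026-01-01T00:00:05+00:00"),
    ("train-a", "2026-01-01T00:00:00+00:00", "2026-01-01T00:00:09+00:00"),
    ("train-b", "2026-01-01T00:01:00+00:00", "2026-01-01T00:01:02+00:00"),
]
latencies = group_latency_seconds(records)
```

Comparing this per-group figure before and after enabling the alpha features gives a like-for-like view of whether groups start sooner and more predictably.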

While these features are alpha and require careful testing, they represent a foundational shift in Kubernetes scheduling that aligns well with the needs of production-grade, tightly coupled workloads common in healthcare and professional services sectors.

How Cloudain can help

Cloudain can assist SMBs in evaluating and adopting Kubernetes v1.36’s workload-aware scheduling capabilities to improve resource efficiency and workload reliability. With experience in orchestrating complex AI/ML and batch workloads on AWS, Azure, and GCP, Cloudain helps design cluster architectures that leverage PodGroup scheduling cycles, topology constraints, and workload-aware preemption effectively.

For teams aiming to reduce cloud spend while maintaining compliance and performance SLAs, Cloudain’s advisory services can guide test deployments, feature gate configuration, and integration with existing CI/CD and observability pipelines. By bridging the gap between new Kubernetes features and practical operational needs, Cloudain ensures smoother migrations and more predictable workload execution at scale.
