Kubernetes Jobs are designed for batch workloads: finite tasks that run to completion and terminate. But when you’re running hundreds of batch jobs per day for ML training, data processing, or CI/CD pipelines, the native Job controller falls short. It creates Pods as soon as a Job is submitted and has no notion of queues, ordering between Jobs, or quotas beyond namespace-level ResourceQuota. You need queueing, priority management, gang scheduling (all-or-nothing scheduling for multi-pod jobs), and fair resource sharing across teams.
This guide compares three approaches to Kubernetes batch job management: native Jobs, Volcano (CNCF batch scheduling system), and Kueue (Kubernetes-native job queueing). Each addresses different aspects of the batch workload challenge.
Comparison Table
| Feature | Native K8s Jobs | Volcano | Kueue |
|---|---|---|---|
| Job Queueing | None (Pods are created immediately) | Queue CRD with weights and priorities | ClusterQueue + LocalQueue |
| Gang Scheduling | No | Yes (minAvailable) | All-or-nothing admission per Workload (PodSets), optionally with waitForPodsReady |
| Fair Sharing | No | Queue-based fair sharing | Cohort-based fair sharing |
| Preemption | Pod priority only (PriorityClass) | Full preemption support | Preemption + requeuing |
| GPU Awareness | Resource requests only | GPU-aware scheduling | Resource flavor support |
| Plugin System | None | Yes (scheduling, capacity) | Extensible via webhooks |
| GitHub Stars | (built-in) | ~4,400 (volcano-sh/volcano) | ~680 (kubernetes-sigs/kueue) |
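| Project Status | Part of Kubernetes core | CNCF Incubating | Kubernetes SIG project (kubernetes-sigs), installed separately |
| ML Workload Focus | No | Yes (TFJob, PyTorch, MPI) | Yes (integrations for Kubeflow training jobs, RayJob, JobSet) |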
| Multi-Tenancy | Namespace isolation | Queue hierarchies | ClusterQueue + cohort |
Native Kubernetes Job Controller
The built-in Job controller handles basic batch workloads with completion tracking and retry support.
Parallel Job Configuration
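A minimal sketch of a parallel Job, assuming a hypothetical name and a placeholder busybox image. The completions and parallelism values are illustrative only:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: parallel-worker            # hypothetical name
spec:
  completions: 10                  # total number of Pods that must succeed
  parallelism: 3                   # at most 3 Pods run at the same time
  backoffLimit: 4                  # failed Pod retries before the Job is marked failed
  ttlSecondsAfterFinished: 3600    # clean up the finished Job after one hour
  template:
    spec:
      restartPolicy: Never         # Jobs require Never or OnFailure
      containers:
        - name: worker
          image: busybox:1.36      # placeholder image
          command: ["sh", "-c", "echo processing one work item && sleep 5"]
          resources:
            requests:
              cpu: 500m
              memory: 256Mi
```

With these settings the controller keeps three Pods running until ten have completed successfully. Nothing here queues the Job: its Pods are created as soon as the manifest is applied.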
Indexed Completion for Parallel Processing
The Indexed completion mode assigns each Pod a unique index (0 to N-1), enabling parallel processing where each Pod handles a specific partition of the workload.
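As a rough illustration, an Indexed Job might look like the following. The Job name, image, and partition count are assumptions; the JOB_COMPLETION_INDEX environment variable is set by Kubernetes for each Pod in Indexed mode:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: indexed-partitions         # hypothetical name
spec:
  completionMode: Indexed          # each Pod gets a stable index from 0 to completions-1
  completions: 5
  parallelism: 5                   # run all five partitions at once
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: worker
          image: busybox:1.36      # placeholder image
          # The shell expands JOB_COMPLETION_INDEX, which Kubernetes injects per Pod.
          command: ["sh", "-c", "echo handling partition $JOB_COMPLETION_INDEX of 5"]
```

A real worker would use the index to pick its input shard, for example a file name or an offset range, instead of just printing it.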
Volcano: Batch Scheduling System
Volcano is a CNCF-incubating batch scheduling system designed for high-performance compute workloads, particularly ML training, scientific computing, and large-scale data processing.
Volcano Installation and Scheduler Configuration
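Volcano is typically installed from the project's Helm chart or release manifest; consult the Volcano documentation for the command that matches your version. The scheduler's behavior is then driven by a ConfigMap. The sketch below resembles the commonly shipped default configuration. The ConfigMap name and namespace are assumed to match a standard installation, and the plugin list can differ between releases:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: volcano-scheduler-configmap   # assumed name from a standard install
  namespace: volcano-system           # assumed namespace
data:
  volcano-scheduler.conf: |
    # Actions run in order on every scheduling cycle.
    actions: "enqueue, allocate, backfill"
    tiers:
    - plugins:
      - name: priority      # order jobs and tasks by priority
      - name: gang          # enforce all-or-nothing placement
      - name: conformance   # protect critical system Pods
    - plugins:
      - name: drf           # Dominant Resource Fairness between jobs
      - name: predicates    # node filtering
      - name: proportion    # share capacity between queues by weight
      - name: nodeorder     # node scoring
      - name: binpack       # pack Pods onto fewer nodes
```

Adding preempt or reclaim to the actions list is the usual way to enable preemption, at the cost of evicting running Pods.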
Gang Scheduling with Volcano
Gang scheduling ensures that all Pods in a Job are scheduled together or none at all — critical for distributed training where a partial deployment is useless.
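A minimal sketch of a gang-scheduled Volcano Job, assuming four workers, a queue named default, and a placeholder image. Setting minAvailable equal to the replica count means no worker starts unless all four can be placed:

```yaml
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: distributed-training       # hypothetical name
spec:
  schedulerName: volcano           # hand the Pods to the Volcano scheduler
  queue: default                   # assumed queue name
  minAvailable: 4                  # gang size: schedule all 4 Pods or none
  tasks:
    - name: worker
      replicas: 4
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: trainer
              image: busybox:1.36  # placeholder for a real training image
              command: ["sh", "-c", "echo training worker started && sleep 60"]
              resources:
                requests:
                  cpu: "1"
                  memory: 1Gi
```

For GPU training you would add an extended resource such as nvidia.com/gpu under resources.limits, which requires the corresponding device plugin on the nodes.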
Kueue: Kubernetes-Native Job Queueing
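Kueue is a Kubernetes SIG project (kubernetes-sigs/kueue) that provides Kubernetes-native job queueing with fair sharing and multi-tenant support. It is installed as a separate controller rather than shipped with a Kubernetes release, and it integrates with the native Job controller and the default scheduler rather than replacing them.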
Kueue Configuration
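A basic Kueue setup needs three objects: a ResourceFlavor, a ClusterQueue that holds quota, and a LocalQueue that exposes the ClusterQueue to a namespace. The sketch below assumes the v1beta1 API (check the version served by your installation); all names, the cohort, and the quota values are hypothetical:

```yaml
apiVersion: kueue.x-k8s.io/v1beta1
kind: ResourceFlavor
metadata:
  name: default-flavor             # hypothetical flavor covering all nodes
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: team-a-cq                  # hypothetical name
spec:
  namespaceSelector: {}            # accept workloads from any namespace
  cohort: shared-pool              # ClusterQueues in one cohort can lend unused quota
  resourceGroups:
    - coveredResources: ["cpu", "memory"]
      flavors:
        - name: default-flavor
          resources:
            - name: cpu
              nominalQuota: 16
            - name: memory
              nominalQuota: 64Gi
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  name: team-a-queue               # the name that Jobs reference
  namespace: team-a                # assumed to exist already
spec:
  clusterQueue: team-a-cq
```

Quota is defined on the ClusterQueue, so a team can have several LocalQueues in different namespaces drawing from the same pool.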
Using Kueue with Native Jobs
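Add the kueue.x-k8s.io/queue-name label to a Job's metadata to route it through a LocalQueue. Current Kueue releases read the queue name from this label; an annotation with the same key was used in early releases and has since been deprecated.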
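A minimal sketch of a queued Job, assuming a LocalQueue named team-a-queue exists in the team-a namespace. Current Kueue releases read the queue name from the kueue.x-k8s.io/queue-name label; the Job name and image are placeholders:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: sample-job                 # hypothetical name
  namespace: team-a
  labels:
    kueue.x-k8s.io/queue-name: team-a-queue   # LocalQueue to submit to
spec:
  suspend: true                    # created suspended; Kueue flips this on admission
  parallelism: 2
  completions: 2
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: main
          image: busybox:1.36      # placeholder image
          command: ["sh", "-c", "echo running && sleep 30"]
          resources:
            requests:              # Kueue counts these requests against quota
              cpu: "1"
              memory: 200Mi
```

Resource requests matter here: Kueue decides admission by comparing the Job's total requests with the remaining quota in the ClusterQueue.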
Kueue keeps Jobs suspended (spec.suspend: true) until their ClusterQueue has enough quota, then unsuspends them in an order determined by workload priority and the ClusterQueue's queueing strategy.
Why Self-Host Your Batch Job Management?
GPU Utilization: Batch scheduling systems like Volcano and Kueue improve GPU utilization through gang scheduling and fair sharing. Without them, a partially scheduled multi-Pod Job can hold GPUs idle while its remaining Pods wait for capacity. The size of the gain depends on the workload mix; an improvement on the order of 40% to 85%+ utilization is possible, but it should be measured rather than assumed.
Multi-Tenant Fairness: When multiple teams share a Kubernetes cluster, native Jobs follow a first-come-first-served model. Queue-based systems ensure each team gets a fair share of resources based on configured quotas, preventing one team from monopolizing the cluster.
Cost Efficiency: Managed batch services (AWS Batch, Google Cloud Batch) generally bill for the underlying compute rather than per job, but they run on capacity provisioned separately from your clusters. Running batch management on clusters you already operate lets batch workloads share that existing capacity, fill idle nodes, and use spot/preemptible instances for non-critical workloads.
ML Pipeline Integration: Modern ML training frameworks (PyTorch, TensorFlow, Horovod) assume distributed execution with all replicas available simultaneously. Gang scheduling prevents the “zombie training” problem where some workers start while others are stuck pending, wasting compute and potentially corrupting training state.
Predictable Scheduling: With queueing and priority management, critical batch jobs (nightly ETL, model retraining, report generation) are far more likely to start on time, even during periods of high cluster utilization. Preemption lets high-priority jobs reclaim resources from lower-priority ones.
For ML experiment tracking workflows, see our MLflow vs ClearML vs Aim guide. For distributed training frameworks, our Horovod vs DeepSpeed vs FSDP article covers the training layer. For cluster autoscaling that complements batch scheduling, check our Karpenter vs Cluster Autoscaler guide.
FAQ
What is gang scheduling and why does it matter for batch jobs?
Gang scheduling ensures that all Pods in a multi-Pod Job are scheduled simultaneously or none at all. This matters for distributed training and batch processing where the job requires all workers to be running — having 3 out of 4 workers active provides no value and wastes resources. Volcano provides gang scheduling through its minAvailable field.
How does Kueue differ from Volcano?
Kueue works with the native Kubernetes Job controller: it queues Jobs and unsuspends them when quota is available, but doesn’t replace the scheduler. Volcano is a complete alternative scheduler that handles gang scheduling, binpacking, and fair sharing directly. Kueue is simpler to adopt (it works with existing Jobs via a queue-name label); Volcano provides more advanced scheduling features.
Can I use Kueue and Volcano together?
Technically yes, but it’s not recommended. Both manage scheduling behavior, and combining them can create conflicts. Choose Kueue if you need simple queueing with fair sharing on top of the native scheduler. Choose Volcano if you need advanced features like gang scheduling, DRF (Dominant Resource Fairness), or custom scheduling plugins.
How do I prioritize batch jobs in a shared cluster?
With native Kubernetes, use PriorityClass to assign different priorities to Jobs. With Volcano, use the Queue system with weighted priorities and preemption. With Kueue, configure ClusterQueue with borrowing policies and preemption rules. In all cases, combine with resource quotas to prevent any single team from consuming the entire cluster.
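The three options can be sketched side by side. Every name, weight, and quota below is hypothetical, and the Kueue objects assume the v1beta1 API and an existing ResourceFlavor called default-flavor:

```yaml
# Native: a cluster-wide priority that Pod templates reference by name.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: batch-critical
value: 100000                      # higher value wins
globalDefault: false
preemptionPolicy: PreemptLowerPriority
description: "Nightly ETL and other deadline-bound batch jobs"
---
# Volcano: the queue weight sets a team's relative share under contention.
apiVersion: scheduling.volcano.sh/v1beta1
kind: Queue
metadata:
  name: team-a
spec:
  weight: 4
  reclaimable: true                # other queues may reclaim borrowed resources
  capability:                      # hard upper bound for this queue
    cpu: "64"
    memory: 256Gi
---
# Kueue: borrowing and preemption rules live on the ClusterQueue.
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: team-a-cq
spec:
  namespaceSelector: {}
  cohort: shared-pool
  preemption:
    withinClusterQueue: LowerPriority   # evict this queue's lower-priority workloads
    reclaimWithinCohort: Any            # take back quota lent to other queues
  resourceGroups:
    - coveredResources: ["cpu"]
      flavors:
        - name: default-flavor
          resources:
            - name: cpu
              nominalQuota: 16
              borrowingLimit: 8         # may borrow up to 8 extra CPUs from the cohort
```

A native Job opts into the PriorityClass by setting priorityClassName: batch-critical in its Pod template spec.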
What happens to Jobs in the queue when the cluster scales up?
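The two systems react differently. Kueue admits Jobs against the quota configured in its ClusterQueues rather than against live node capacity, so new nodes do not by themselves admit more Jobs. The usual flow runs the other way: Kueue admits a Job within quota, its Pods go pending, and Cluster Autoscaler or Karpenter adds nodes for them. If you want queued Jobs to use the extra capacity, the quota has to reflect it. Volcano schedules Pods directly, so when new nodes (for example GPU nodes) join the cluster, its next scheduling cycle considers queued Jobs against the added capacity.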
How do I monitor batch job queue depth and wait times?
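For Kueue, use the kueue_admitted_workloads_total and kueue_pending_workloads Prometheus metrics, along with the admission wait time histogram that Kueue exports. For Volcano, check the queue status via vcctl queue list and monitor the scheduler's latency metrics (for example volcano_e2e_scheduling_latency_milliseconds; metric names vary between releases, so verify them against your scheduler's /metrics endpoint). Set up Grafana dashboards to track queue depth, admission rate, and average wait time per queue.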
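As a starting point, the Kueue metrics can feed a small Prometheus rules file like the sketch below. The group name, rule names, and the backlog threshold of 20 are arbitrary assumptions, and the cluster_queue label should be verified against the metrics your Kueue version exposes:

```yaml
groups:
  - name: kueue-queue-health       # hypothetical group name
    rules:
      # Admission rate per ClusterQueue over the last five minutes.
      - record: kueue:admission_rate:5m
        expr: sum by (cluster_queue) (rate(kueue_admitted_workloads_total[5m]))
      # Fire when a queue has held a sizable backlog for 15 minutes.
      - alert: KueueQueueBacklog
        expr: sum by (cluster_queue) (kueue_pending_workloads) > 20
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "ClusterQueue {{ $labels.cluster_queue }} has a sustained backlog"
```

The recording rule gives Grafana a cheap series to plot next to queue depth, which makes it easier to tell a slow queue from a busy one.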