From Bayes to Chebyshev
Between 2025 and 2026, all users will transition from the Bayes cluster to the new Chebyshev cluster, and from SLURM to Kubernetes (K8S) as the primary platform for computational workloads.
This means moving away from SLURM-based scheduling and the Bayes hardware, and adopting a containerized, cloud-native workflow powered by K8S on Chebyshev. The migration will be carried out in phases over the next two years, overseen by a dedicated project team. The project is currently in its initial phase, with detailed planning scheduled to begin in August 2025.
Why make this change?
- Greater Scalability & Modernized Workflow: Kubernetes offers more flexibility and scalability than SLURM, enabling modern, container-based workflows and streamlined job management.
- A Strategic Computational Platform: Chebyshev is designed for next-generation workloads, including support for high-throughput computing, GPU acceleration, and cloud-native integration.
- Better Support for Researchers: Kubernetes is increasingly the standard among research institutions and industry partners, enabling better compatibility and collaboration.
- Enhanced User Experience for All: Chebyshev and K8S provide a consistent user environment across platforms and locations, improving access for both local and international researchers.
- Improved Security and Efficiency: The new platform supports advanced authentication methods and automated resource orchestration, improving job security, monitoring, and overall efficiency.
Project Outcomes
The USBC Maintenance Group anticipates clear benefits for all users of computational resources:
- Enabling greater consistency in how computational workflows are developed and deployed, through standardized, container-based environments and reproducible configurations.
- Leveraging the advanced capabilities of Kubernetes and Chebyshev to enhance resource management, scalability, and system security.
- Supporting improved collaboration with both internal and external research teams by aligning with widely adopted cloud-native infrastructure and tooling.
- Ensuring a measurable uplift in technical skills and digital fluency, especially in areas like container orchestration, job scheduling, and reproducible science.
| From | To |
|---|---|
| Bayes Cluster | Chebyshev Cluster |
| SLURM Job Scheduler | Kubernetes + Volcano Scheduler |
| Shell-based scripts | YAML-based manifests |
| Static environments | Dynamic, reproducible containers |
| Manual scaling & resource allocation | Automated orchestration & scaling |
| On-prem-only jobs | Hybrid/cloud-ready workflows |
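
To make the shift from shell-based SLURM scripts to YAML-based job definitions more concrete, the following is a minimal sketch of how SLURM-style resource requests (tasks, CPUs per task, memory, GPUs) might map onto a Volcano Job manifest. It is an illustration under stated assumptions, not the project's official tooling: the helper name `slurm_to_volcano_job`, the container image, the job name, and the `default` queue are all hypothetical, and the GPU resource name assumes the NVIDIA device plugin is installed on the cluster. The actual queues, images, and policies on Chebyshev may differ.

```python
import json


def slurm_to_volcano_job(name, image, command, ntasks=1, cpus_per_task=1,
                         mem_gb=1, gpus=0, queue="default"):
    """Build a Volcano Job manifest (as a dict) from SLURM-style requests.

    Hypothetical helper for illustration only. Rough correspondence:
      --ntasks         -> replicas / minAvailable
      --cpus-per-task  -> resources "cpu"
      --mem            -> resources "memory"
      --gres=gpu:N     -> resources "nvidia.com/gpu" (assumes NVIDIA plugin)
    """
    resources = {"cpu": str(cpus_per_task), "memory": f"{mem_gb}Gi"}
    if gpus:
        resources["nvidia.com/gpu"] = str(gpus)

    container = {
        "name": "main",
        "image": image,
        "command": list(command),
        # Requests and limits are set equal so the pod gets a fixed allocation.
        "resources": {"requests": dict(resources), "limits": dict(resources)},
    }
    return {
        "apiVersion": "batch.volcano.sh/v1alpha1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "schedulerName": "volcano",
            "queue": queue,              # assumed queue name
            "minAvailable": ntasks,      # gang scheduling: start all or none
            "tasks": [{
                "name": "worker",
                "replicas": ntasks,
                "template": {"spec": {
                    "restartPolicy": "Never",
                    "containers": [container],
                }},
            }],
        },
    }


# Example: roughly what `sbatch --ntasks=2 --cpus-per-task=4 --mem=8G
# --gres=gpu:1` might become. Image and command are placeholders.
manifest = slurm_to_volcano_job(
    name="example-job",
    image="registry.example.org/lab/analysis:1.0",
    command=["python", "run_analysis.py"],
    ntasks=2, cpus_per_task=4, mem_gb=8, gpus=1,
)
print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and submitted with `kubectl apply -f`, since Kubernetes accepts JSON as well as YAML manifests. In practice most users would write the YAML directly; the point of the sketch is only to show how familiar SLURM options correspond to fields in a declarative job definition.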
Key Dates
August 2025
Project mobilisation and governance structure to be agreed and put in place.
Autumn 2025
Multifactor authentication is introduced for all users to improve security.
Winter 2025
Some staff switch to Chebyshev as early adopters.
Spring 2026
Staff migrate to Chebyshev on a phased basis.
August 2026
All users complete the transition to Chebyshev.
