Rafay Systems and Aviz Networks Partner for GPU Cloud Orchestration

Partnership Enables Enterprises and Graphics Processing Unit (GPU) Cloud Providers to Rapidly Operationalize Multi-Tenant Artificial Intelligence (AI) Fabrics with Self-Service and Network Visibility

SAN JOSE, Calif. & SUNNYVALE, Calif. - 3 Nov, 2025 - Rafay Systems and Aviz Networks announced a strategic partnership to promote the adoption of GPU cloud infrastructure. The collaboration aims to streamline the operationalization of multi-tenant AI fabrics by providing an integrated orchestration solution. By combining Rafay's enterprise-grade Kubernetes and GPU lifecycle management capabilities with Aviz's AI-optimized fabric orchestration, the partnership offers a full-stack solution that enhances network visibility. This enables secure and efficient GPU resource consumption through self-service workflows across computing and networking layers. Key features of the collaboration include:

  • End-to-End Self-Service: On-demand access to GPU/CPU resources with tenant-aware network automation.
  • AI-Ready Fabric Orchestration: Management of Spectrum-X switches and GPU NICs to optimize performance.
  • Multi-Tenancy at Scale: Management of tenant and GPU binding for Kubernetes clusters, ensuring resource isolation.
  • Full Stack Observability (FSO): Real-time monitoring reduces troubleshooting time and enhances ROI.
  • Rapid Time-to-Market: Deployment in weeks using integrated APIs.

Haseeb Budhani, CEO of Rafay Systems, said, “Cloud providers and enterprises need a simple way to consume GPU infrastructure without reinventing orchestration stacks.” Vishal Shukla, CEO of Aviz Networks, added, “Together with Rafay, we deliver a powerful combination: Rafay's compute lifecycle automation with Aviz's fabric-level multi-tenant orchestration.” The partnership addresses challenges faced by traditional GPU cloud infrastructures, such as inefficiencies in multi-tenancy and orchestration, which can result in high costs and underutilization of resources. The integrated approach allows for a cloud-like consumption model across varied GPU environments.