Aviz ONES 4.2 details decoupled compute sharing with isolated GPU allocation
Aviz says ONES 4.2 introduces a decoupled approach to GPU infrastructure that separates compute VM sharing from GPU ownership, using fine-grained allocation and fabric-wide tenant segmentation. The change matters for enterprises running shared AI and PaaS workloads that need tighter density without weakening isolation controls.
Research Overview
The vendor describes a shift away from conventional GPU fabric setups where an entire physical server is dedicated to a single tenant, tying CPU, memory, network, and GPUs together. It argues that this coupling reduces flexibility and leaves resources underutilized as AI and platform workloads change over time.
ONES 4.2 is presented as addressing that constraint by allowing multiple tenants to share compute resources on the same server while keeping GPU access limited to the tenant assigned to each individual device.
Key Findings
The blog’s central claim is that compute and GPU resources can be handled independently, enabling multi-tenant compute VM placement without cross-tenant GPU access. The model keeps GPUs “strictly isolated per tenant” at the device level, while compute VMs from different tenants may share the same physical server.
The blog also links the design to higher workload density and lower infrastructure costs by improving utilization of CPU and memory that would otherwise be idle in single-tenant server models.
Technical Breakdown
On the compute side, the blog says ONES supports multiple tenants running compute-only VMs on the same physical server. On the GPU side, it says each GPU is allocated at the individual device level, GPU ownership is bound to a single tenant, and “cross-tenant GPU sharing is not permitted.”
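The split described above can be pictured with a small Python model. This is a minimal sketch and not the ONES data model or API: the `Server` class, its fields, and the tenant and GPU names are all hypothetical, chosen only to show per-device GPU ownership alongside shared compute.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Server:
    """Hypothetical model of one physical server (not an ONES type)."""
    name: str
    # GPU id -> owning tenant, or None while the device is unassigned.
    gpu_owner: Dict[str, Optional[str]] = field(default_factory=dict)
    # (vm name, tenant) pairs; compute-only VMs from any tenant may coexist.
    compute_vms: List[Tuple[str, str]] = field(default_factory=list)

    def assign_gpu(self, gpu_id: str, tenant: str) -> None:
        """Bind one GPU device to one tenant; refuse a second owner."""
        if gpu_id not in self.gpu_owner:
            raise KeyError(f"{self.name} has no GPU {gpu_id!r}")
        owner = self.gpu_owner[gpu_id]
        if owner is not None and owner != tenant:
            raise PermissionError(
                f"{gpu_id} is owned by {owner}; cross-tenant sharing is not permitted"
            )
        self.gpu_owner[gpu_id] = tenant

    def add_compute_vm(self, vm_name: str, tenant: str) -> None:
        """Place a compute-only VM; no tenant restriction applies here."""
        self.compute_vms.append((vm_name, tenant))


server = Server("server-1", gpu_owner={"gpu0": None, "gpu1": None})
server.assign_gpu("gpu0", "tenant-a")
server.assign_gpu("gpu1", "tenant-b")
server.add_compute_vm("vm-a1", "tenant-a")
server.add_compute_vm("vm-c1", "tenant-c")  # tenant-c owns no GPU but shares compute
```

In this sketch, a later call such as `server.assign_gpu("gpu0", "tenant-b")` raises `PermissionError`, which mirrors the one-owner-per-device rule the blog describes.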
For networking, the blog states ONES uses EVPN VXLAN and VRF to provide scalable, end-to-end tenant isolation across the fabric. It describes tenant traffic segmentation using VLAN-based tagging, consistent propagation of segmentation during provisioning, and VLAN termination at the leaf layer mapped to tenant-specific routing domains to maintain isolation across Layer 2 and Layer 3.
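The VLAN-to-routing-domain mapping at the leaf can be sketched as a lookup table with a uniqueness check. This is an illustrative model only: the `TenantSegment` fields, the one-VLAN-per-tenant simplification, and every VLAN, VRF, and VNI value below are assumptions, not configuration taken from the blog or from ONES.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class TenantSegment:
    """Hypothetical per-tenant segmentation record."""
    tenant: str
    vlan_id: int  # tag carried on the server-facing link
    vrf: str      # tenant-specific routing domain on the leaf
    l3_vni: int   # VXLAN network identifier carrying the VRF across the fabric


def build_leaf_map(segments: Iterable[TenantSegment]) -> Dict[int, TenantSegment]:
    """Return vlan_id -> segment, rejecting reuse of a VLAN, VRF, or VNI."""
    by_vlan: Dict[int, TenantSegment] = {}
    seen_vrfs, seen_vnis = set(), set()
    for seg in segments:
        if not 1 <= seg.vlan_id <= 4094:
            raise ValueError(f"VLAN {seg.vlan_id} is outside 1-4094")
        if not 1 <= seg.l3_vni < 2**24:
            raise ValueError(f"VNI {seg.l3_vni} does not fit in 24 bits")
        if seg.vlan_id in by_vlan or seg.vrf in seen_vrfs or seg.l3_vni in seen_vnis:
            raise ValueError(f"segment for {seg.tenant} overlaps another tenant")
        by_vlan[seg.vlan_id] = seg
        seen_vrfs.add(seg.vrf)
        seen_vnis.add(seg.l3_vni)
    return by_vlan


segments = [
    TenantSegment("tenant-a", 101, "vrf-tenant-a", 50101),
    TenantSegment("tenant-b", 102, "vrf-tenant-b", 50102),
]
leaf_map = build_leaf_map(segments)
# A frame tagged with VLAN 101 resolves to tenant-a's routing domain.
vrf_for_vlan_101 = leaf_map[101].vrf
```

The overlap check is the point of the sketch: if two tenants ever resolved to the same VLAN, VRF, or VNI, the Layer 2 or Layer 3 separation described above would no longer hold.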
Operational Impact
The blog describes GPU-aware orchestration as checking server eligibility based on available GPU allocation for a tenant, enforcing separation between compute-only and GPU workloads, and ensuring placement adheres to tenant boundaries. It frames the outcome as higher utilization and consistent isolation when shared compute is used.
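A placement check of this kind might look like the following sketch. It assumes a simple inventory in which each host tracks free vCPUs, per-GPU owners, and GPUs already in use; the `Host` type, the `eligible_hosts` function, and the sample values are hypothetical and do not represent the ONES scheduler.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Host:
    """Hypothetical inventory record for one server."""
    name: str
    free_vcpus: int
    gpu_owner: Dict[str, str] = field(default_factory=dict)  # GPU id -> tenant
    gpus_in_use: Set[str] = field(default_factory=set)


def eligible_hosts(hosts: List[Host], tenant: str, vcpus: int, gpus: int = 0) -> List[str]:
    """Return names of hosts that can take the workload for this tenant."""
    result = []
    for host in hosts:
        if host.free_vcpus < vcpus:
            continue  # not enough shared compute left
        if gpus > 0:
            # GPU workloads only count devices owned by the requesting tenant.
            available = [
                gpu for gpu, owner in host.gpu_owner.items()
                if owner == tenant and gpu not in host.gpus_in_use
            ]
            if len(available) < gpus:
                continue
        # Compute-only workloads (gpus == 0) never consult GPU ownership.
        result.append(host.name)
    return result


hosts = [
    Host("h1", free_vcpus=8, gpu_owner={"gpu0": "tenant-a", "gpu1": "tenant-b"}),
    Host("h2", free_vcpus=2, gpu_owner={"gpu0": "tenant-a"}),
    Host("h3", free_vcpus=16),
]
gpu_targets = eligible_hosts(hosts, "tenant-a", vcpus=4, gpus=1)
compute_targets = eligible_hosts(hosts, "tenant-c", vcpus=4)
```

With this sample inventory, the GPU request from tenant-a is limited to `h1`, while the compute-only request from tenant-c can land on `h1` or `h3` even though tenant-c owns no GPUs.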
It also states that decoupling compute and GPU assignments allows operators to change compute workload placement dynamically while GPU ownership remains fixed to the tenant. The blog ties this to handling shifting AI and PaaS demand without requiring dedicated hardware for every workload change.
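One way to picture that separation is to keep the two assignments in different tables, so that moving a compute VM never touches GPU ownership. The sketch below is an assumption-laden illustration: the table layout, the `move_compute_vm` helper, and all names are hypothetical, and the read-only view is just a convenient Python stand-in for "ownership remains fixed."

```python
from types import MappingProxyType
from typing import Dict

# (server, GPU id) -> tenant. The read-only view stands in for fixed ownership.
gpu_owner = MappingProxyType({
    ("server-1", "gpu0"): "tenant-a",
    ("server-1", "gpu1"): "tenant-b",
})

# Compute VM -> current server; this table is free to change.
vm_host: Dict[str, str] = {"vm-a1": "server-1", "vm-c1": "server-1"}


def move_compute_vm(placement: Dict[str, str], vm: str, dest: str) -> Dict[str, str]:
    """Return a new placement with one compute VM moved to another server."""
    if vm not in placement:
        raise KeyError(vm)
    updated = dict(placement)
    updated[vm] = dest
    return updated


owners_before = dict(gpu_owner)
new_vm_host = move_compute_vm(vm_host, "vm-c1", "server-2")
owners_after = dict(gpu_owner)  # identical to owners_before
```

Because `move_compute_vm` only reads and returns the placement table, the GPU ownership mapping is the same before and after the move.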
Aviz ONES 4.2 presents a decoupled compute-and-GPU resource model that supports multi-tenant compute sharing while binding each GPU device to a single tenant. Tenant isolation is implemented with EVPN VXLAN and VRF segmentation and enforced by GPU-aware orchestration. This Blog Signals brief is a fact-based summary of the vendor blog.