NVIDIA DSX Air Adds AI-NOC Workflows With Aviz ONES and Network Copilot
NVIDIA’s DSX Air is being positioned as a digital twin environment where Aviz ONES supports day-0 AI fabric bring-up and Network Copilot enables day-N AI-assisted NOC workflows, including telemetry-driven investigation and reporting.
Research Overview
The blog outlines how AI factory networking needs span front-end connectivity, back-end GPU-to-GPU communication, tenant isolation, device telemetry, and rapid scaling driven by workload placement. It argues that static lab validation or limited physical staging does not cover the full lifecycle of design, deployment, and troubleshooting.
In this model, DSX Air is described as a high-fidelity simulation environment used to demonstrate how an AI network's behavior changes across its lifecycle. The focus is on operational readiness rather than design validation alone.
Key Findings
Aviz ONES is presented as an orchestration layer for AI fabrics that turns day-0 bring-up into a repeatable workflow in DSX Air. The blog links configuration and provisioning activities to potential day-N outcomes such as congestion, packet drops, GPU underutilization, and application instability.
Network Copilot is described as an AI-assisted NOC layer that sits on top of ONES and other operational data feeds. The blog frames the main operational goal as converting large volumes of network and infrastructure data into correlated evidence, summaries, and recommendations.
Technical Breakdown
For day-0 operations in DSX Air, the blog says ONES can support topology onboarding, device discovery and inventory, blueprint-driven provisioning, switch configuration generation and deployment, and underlay setup using BGP/EVPN/VXLAN or other fabric underlay approaches. It also cites adaptive routing, QoS, and tenant-aware network segmentation.
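To make the idea of blueprint-driven configuration generation concrete, here is a minimal Python sketch. It is not the ONES API or its configuration format: the Switch class, the role names, and the generic "router bgp" text it emits are all assumptions chosen for illustration. It also peers on router IDs for brevity, whereas real underlays usually peer on point-to-point link addresses or unnumbered interfaces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Switch:
    """One device in a hypothetical fabric blueprint."""
    name: str
    role: str       # "spine" or "leaf" (illustrative roles only)
    asn: int        # BGP autonomous system number
    router_id: str  # also used as the peering address in this sketch


def render_bgp_underlay(switch, peers):
    """Render a generic, vendor-neutral BGP stanza for one switch."""
    lines = [
        f"hostname {switch.name}",
        f"router bgp {switch.asn}",
        f" router-id {switch.router_id}",
    ]
    # Sort peers so the output is deterministic and easy to diff.
    for peer in sorted(peers, key=lambda p: p.name):
        lines.append(f" neighbor {peer.router_id} remote-as {peer.asn}")
    return "\n".join(lines)


def render_fabric(blueprint):
    """Return {switch name: config text} for a two-tier leaf/spine blueprint."""
    spines = [s for s in blueprint if s.role == "spine"]
    leaves = [s for s in blueprint if s.role == "leaf"]
    configs = {}
    for spine in spines:
        configs[spine.name] = render_bgp_underlay(spine, leaves)
    for leaf in leaves:
        configs[leaf.name] = render_bgp_underlay(leaf, spines)
    return configs


if __name__ == "__main__":
    blueprint = [
        Switch("spine1", "spine", 65000, "10.0.0.1"),
        Switch("leaf1", "leaf", 65101, "10.0.1.1"),
        Switch("leaf2", "leaf", 65102, "10.0.1.2"),
    ]
    for name, text in render_fabric(blueprint).items():
        print(text, end="\n\n")
```

Because the same blueprint always yields the same text, the output can be regenerated and compared on every change, which is one way to read the "repeatable workflow" the blog describes.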
For day-N operations, the blog describes Network Copilot as enabling natural-language questions that support investigations and reporting. It lists example workflows including tenant management, fabric health queries with summarization, packet drop and congestion investigation, event correlation across switches and tenants, configuration drift detection, root-cause analysis workflows, change-impact analysis, and executive/operator-level reporting.
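Of the workflows listed, configuration drift detection is the easiest to illustrate in a few lines. The sketch below assumes drift simply means a line-level difference between an intended configuration and the running one, and uses Python's standard difflib module for the comparison. The function name detect_drift and the sample configurations are hypothetical, and the blog does not say how Network Copilot or ONES actually detect drift.

```python
import difflib


def detect_drift(intended: str, running: str) -> list[str]:
    """Return lines that differ between intended and running config.

    Lines prefixed with "- " exist only in the intended config;
    lines prefixed with "+ " exist only in the running config.
    """
    diff = difflib.ndiff(intended.splitlines(), running.splitlines())
    # ndiff also emits unchanged lines ("  ") and hint lines ("? "); drop them.
    return [line for line in diff if line.startswith(("- ", "+ "))]


if __name__ == "__main__":
    intended = "router bgp 65101\n neighbor 10.0.0.1 remote-as 65000\nmtu 9216"
    running = "router bgp 65101\n neighbor 10.0.0.1 remote-as 65000\nmtu 1500"
    for line in detect_drift(intended, running):
        print(line)
```

A plain text diff is order-sensitive and ignores configuration semantics, so a production tool would more likely compare parsed or structured configuration state.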
Operational Impact
The example lifecycle in DSX Air starts with ONES handling topology onboarding, device discovery, fabric configuration, routing bring-up, and telemetry enablement to reach a known-good operational state. The workflow then uses Network Copilot with ONES APIs to create tenants and map them to network constructs such as VRFs, VXLANs, and VLANs.
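The tenant-to-construct mapping can be pictured as a small allocation table. The following Python sketch assumes one VRF, one VLAN, and one VXLAN network identifier (VNI) per tenant, with the VNI derived from the VLAN by a fixed offset. The names TenantMapping, allocate_tenants, and VNI_BASE are hypothetical, and nothing here reflects the actual ONES API or its allocation scheme.

```python
from dataclasses import dataclass

VNI_BASE = 10000  # hypothetical offset: VNI = VNI_BASE + VLAN ID


@dataclass(frozen=True)
class TenantMapping:
    """Network constructs assigned to one tenant in this sketch."""
    tenant: str
    vrf: str
    vlan: int
    vni: int


def allocate_tenants(names, first_vlan=100):
    """Assign each tenant a VRF name, a VLAN ID, and a derived VNI."""
    mappings = []
    for offset, name in enumerate(names):
        vlan = first_vlan + offset
        # Usable 802.1Q VLAN IDs are 1 through 4094.
        if not 1 <= vlan <= 4094:
            raise ValueError(f"VLAN {vlan} is out of range for tenant {name!r}")
        mappings.append(
            TenantMapping(tenant=name, vrf=f"vrf-{name}", vlan=vlan, vni=VNI_BASE + vlan)
        )
    return mappings


if __name__ == "__main__":
    for mapping in allocate_tenants(["alpha", "beta"]):
        print(mapping)
```

Deriving the VNI from the VLAN keeps the mapping easy to audit by eye; a real deployment might instead draw each identifier from an independently managed pool.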
The blog further describes ONES collecting telemetry that includes interface utilization, packet drops and discards, queue and buffer behavior, optics telemetry such as power and temperature, tenant-level traffic visibility, and integrated GPU/NIC infrastructure telemetry where available. Network Copilot then consumes this operational context to support queries such as tenant-to-workload mapping and GPU utilization over time, along with warning/critical alert summaries and transceiver anomaly checks.
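As a rough illustration of how raw telemetry samples could be reduced to a warning/critical summary with a simple transceiver temperature check, consider the Python sketch below. The metric names, the threshold values, and the (device, metric, value) sample format are all assumptions made for this example; the blog does not specify the ONES telemetry schema or how Network Copilot scores alerts.

```python
from collections import Counter

# Hypothetical (warning, critical) thresholds per metric.
THRESHOLDS = {
    "optic_temp_c": (70.0, 80.0),    # transceiver temperature in Celsius
    "drops_per_s": (100.0, 1000.0),  # packet drops per second
}


def classify(metric, value):
    """Return "ok", "warning", or "critical" for one sample."""
    warning, critical = THRESHOLDS[metric]
    if value >= critical:
        return "critical"
    if value >= warning:
        return "warning"
    return "ok"


def summarize(samples):
    """Count samples by severity and list the ones that are not ok.

    samples is an iterable of (device, metric, value) tuples.
    """
    counts = Counter()
    flagged = []
    for device, metric, value in samples:
        severity = classify(metric, value)
        counts[severity] += 1
        if severity != "ok":
            flagged.append((device, metric, value, severity))
    return dict(counts), flagged


if __name__ == "__main__":
    samples = [
        ("leaf1", "optic_temp_c", 65.0),
        ("leaf1", "drops_per_s", 250.0),
        ("spine1", "optic_temp_c", 82.5),
    ]
    counts, flagged = summarize(samples)
    print(counts)
    for item in flagged:
        print(item)
```

Static thresholds are the simplest possible anomaly check. An assistant working over telemetry history could instead compare each sample against a per-device baseline.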
The blog frames DSX Air, Aviz ONES, and Network Copilot together as an operational model for AI networks, one that uses a digital twin to extend earlier design-validation concepts into day-0 bring-up and day-N AI-assisted NOC workflows. This Blog Signals brief is a fact-based summary of the vendor blog.