Aviz enhances ONES 2.0 for SONiC-based AI networks
Aviz has introduced enhancements to ONES 2.0 focusing on SONiC-based networks, targeting the needs of Artificial Intelligence (AI) infrastructure. These upgrades are relevant for IT leaders managing AI workloads that require robust networking capabilities.
Technology Strategy
ONES 2.0 enhances Remote Direct Memory Access (RDMA) over Converged Ethernet (RoCE) traffic monitoring by collecting important metrics such as Priority Flow Control (PFC) counters and Quality of Service (QoS) drop counters. These metrics provide real-time visibility into network performance, which helps operators tune the network for AI workloads.
The proactive congestion management features aim to detect and address potential bottlenecks, which is crucial for maintaining the efficiency of model training or inference tasks. This approach is essential when dealing with large datasets that require real-time processing.
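As a rough illustration of how PFC and QoS drop counters can signal congestion, the sketch below turns two cumulative counter samples into per-second rates and compares them with thresholds. This is not how ONES works internally; the `CounterSample` type, the field names, and the threshold values are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class CounterSample:
    """One poll of a port queue (hypothetical shape, not an ONES type)."""
    timestamp: float       # seconds since some epoch
    pfc_pause_frames: int  # cumulative PFC pause frames seen
    qos_drops: int         # cumulative packets dropped on the queue


def rates(prev: CounterSample, curr: CounterSample) -> tuple[float, float]:
    """Return (PFC pause frames per second, drops per second)."""
    dt = curr.timestamp - prev.timestamp
    if dt <= 0:
        raise ValueError("samples must be in increasing time order")
    # Clamp at zero so a counter reset does not produce a negative rate.
    pfc = max(0, curr.pfc_pause_frames - prev.pfc_pause_frames) / dt
    drops = max(0, curr.qos_drops - prev.qos_drops) / dt
    return pfc, drops


def is_congested(prev: CounterSample, curr: CounterSample,
                 pfc_threshold: float = 100.0,
                 drop_threshold: float = 0.0) -> bool:
    """Flag the interval if either rate exceeds its (assumed) threshold."""
    pfc_rate, drop_rate = rates(prev, curr)
    return pfc_rate > pfc_threshold or drop_rate > drop_threshold
```

With these assumed defaults, any drop on a lossless RoCE queue is flagged, while PFC pauses are tolerated up to a rate that would need tuning per fabric.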
Product Update
ONES 2.0 not only supports SONiC fabrics but also normalizes telemetry metrics across diverse hardware platforms. It integrates with orchestration systems and third-party APIs, which streamlines configuration and monitoring, enhancing usability in various AI environments.
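Normalizing telemetry across hardware platforms usually means mapping each platform's native counter names onto one shared schema. The sketch below shows the idea under stated assumptions: the platform labels (`vendor_a`, `vendor_b`) and all field names are hypothetical and are not taken from ONES or from any real switch.

```python
# Hypothetical per-platform mappings: native field name -> common field name.
FIELD_MAPS = {
    "vendor_a": {"pfc_rx": "pfc_pause_rx", "q_drop": "queue_drops"},
    "vendor_b": {"rxPfcFrames": "pfc_pause_rx", "dropPkts": "queue_drops"},
}


def normalize(platform: str, raw: dict) -> dict:
    """Rename known native fields to the common schema; ignore the rest."""
    mapping = FIELD_MAPS[platform]  # raises KeyError for unknown platforms
    return {
        common: raw[native]
        for native, common in mapping.items()
        if native in raw
    }
```

Once every platform reports the same field names, dashboards and alert rules can be written once instead of per device type.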
The graphical user interface provided by ONES visualizes RoCE traffic flow, helping operators configure QoS effectively and understand traffic patterns. This function is vital for troubleshooting and maintaining performance in AI fabric networks.
Customer Use Case
With its detailed visualization capabilities, ONES helps network operators identify queue behaviors and prioritization metrics. This capability helps maintain low latency during high data loads, which is crucial for the effectiveness of AI training and inference operations.
Furthermore, ONES enables seamless connectivity to third-party systems via Representational State Transfer (REST) APIs, assisting teams in correlating RoCE telemetry with overall application performance. This integration promotes comprehensive observability across the network.
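One simple way to correlate RoCE telemetry with application performance, once both series have been pulled from their respective APIs, is to align them on shared timestamps and compute a Pearson coefficient. The sketch below assumes the data is already in memory as timestamp-keyed dictionaries; it does not call any ONES endpoint, and the function names and data shapes are hypothetical.

```python
import math


def align(telemetry: dict, app: dict) -> tuple[list, list]:
    """Keep only timestamps present in both series, in time order."""
    common = sorted(telemetry.keys() & app.keys())
    return [telemetry[t] for t in common], [app[t] for t in common]


def pearson(xs: list, ys: list) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)
```

A coefficient near 1 between, say, PFC pause rate and training step time would suggest that the two rise together, though correlation alone does not show which one causes the other.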
Conclusion
The updates to ONES 2.0 aim to improve network performance in SONiC-based AI infrastructures. They matter most to decision-makers focused on optimizing AI workloads and maintaining efficient networking. This Blog Signals brief is a fact-based summary of the original blog post.