Dell'Oro Group reports on optical circuit switches in large-scale AI infrastructure
The focus in Artificial Intelligence (AI) clusters is shifting from the superiority of individual Graphics Processing Unit (GPU) models to how efficiently GPUs are interconnected and utilized. This makes network performance critical for the enterprise and technical leaders who manage large-scale AI deployments.
Market Overview
The surge in AI computation is driving cluster sizes from hundreds of thousands to millions of GPUs, with over 70 million GPUs expected to be deployed in the next five years. This expansion underscores the network's central role in connecting GPUs efficiently: within an AI cluster, the network effectively becomes the computer.
Key Findings
Operating large-scale AI clusters is challenging because the number of interconnections and the bandwidth they require grow exponentially, driving up cost, power use, and latency. AI back-end networks are refreshed approximately every two years, a faster pace than traditional enterprise environments, and switch ports are projected to shift to 3.2 Tbps by 2030 to meet rising demand.
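The refresh arithmetic can be made concrete with a short sketch. It assumes that each two-year refresh doubles the port speed and that the starting point is 400 Gbps in 2024. Both are illustrative assumptions, not figures stated in the report, and the function names are hypothetical.

```python
import math


def doublings_needed(start_gbps: int, target_gbps: int) -> int:
    """Number of speed doublings required to reach target_gbps from start_gbps."""
    return math.ceil(math.log2(target_gbps / start_gbps))


def refresh_timeline(start_gbps: int, target_gbps: int,
                     start_year: int, cadence_years: int = 2):
    """List (year, port speed in Gbps) pairs, assuming one doubling per refresh."""
    steps = doublings_needed(start_gbps, target_gbps)
    return [(start_year + i * cadence_years, start_gbps * 2 ** i)
            for i in range(steps + 1)]


# Assumed starting point: 400 Gbps ports in 2024 (illustrative only).
timeline = refresh_timeline(400, 3200, start_year=2024)
for year, speed in timeline:
    print(f"{year}: {speed} Gbps")
```

Under these assumptions, going from 400 Gbps to 3.2 Tbps takes three doublings, which means three generations of electrical switch hardware after the starting one.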
Technology or Trend Analysis
Optical Circuit Switches (OCS) provide direct optical paths that bypass packet-switched routing. Because they eliminate optical-electrical-optical conversions, they reduce both latency and power consumption. Unlike electrical switches, OCS devices are speed-agnostic, so they avoid the frequent hardware upgrades otherwise needed as bandwidth standards evolve from 400 Gbps to 3.2 Tbps.
OCS has been employed in wide-area networks by tier-one operators for over a decade. It relies on micro-electromechanical systems (MEMS) and liquid crystal on silicon (LCoS) technologies that have demonstrated long-term reliability in carrier networks.
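A toy model may help show what "circuit" and "speed-agnostic" mean here. The sketch below treats an OCS as a one-to-one mapping from input ports to output ports that passes whatever arrives without inspecting it. The class and method names are hypothetical and do not correspond to any vendor's API.

```python
class OpticalCircuitSwitch:
    """Toy OCS model: a one-to-one map of input ports to output ports."""

    def __init__(self, num_ports: int):
        self.num_ports = num_ports
        self._circuits = {}  # input port -> output port

    def connect(self, in_port: int, out_port: int) -> None:
        """Set up a dedicated light path from in_port to out_port."""
        for port in (in_port, out_port):
            if not 0 <= port < self.num_ports:
                raise ValueError(f"port {port} out of range")
        if in_port in self._circuits:
            raise ValueError(f"input port {in_port} already in use")
        if out_port in self._circuits.values():
            raise ValueError(f"output port {out_port} already in use")
        self._circuits[in_port] = out_port

    def disconnect(self, in_port: int) -> None:
        """Tear down the circuit on in_port, if one exists."""
        self._circuits.pop(in_port, None)

    def forward(self, in_port: int, signal):
        """Pass the signal through unchanged; no headers are read or rewritten."""
        out_port = self._circuits.get(in_port)
        if out_port is None:
            return None  # no circuit configured, so the light goes nowhere
        return out_port, signal


ocs = OpticalCircuitSwitch(num_ports=4)
ocs.connect(0, 2)
result_400g = ocs.forward(0, "400G signal")
result_3200g = ocs.forward(0, "3.2T signal")
```

The same configured path carries both signals, since the model never looks at the payload or its data rate. A packet switch, by contrast, would have to convert each signal to the electrical domain and make a per-packet forwarding decision.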
Forecast or Analyst Outlook
OCS technology offers a scalable and efficient interconnect fabric suited to the unpredictable east-west traffic and bandwidth growth of AI and high-performance computing (HPC) back-end networks. Industry developments include new OCS products tailored to AI data centers, which indicates growing exploration of OCS as an option for AI network infrastructure.
This transition suggests that evolving AI networks must prioritize OCS integration to meet the computational and operational demands of large GPU clusters.
Conclusion
The report concludes that as AI infrastructure diverges from traditional data center designs, network evolution must outpace GPU advancements, with OCS identified as a field-proven technology capable of meeting the rigorous needs of extensive AI clusters. This Analyst Signals brief reflects a neutral, fact-based summary of the original research note.