
DDN launches Inferno to enhance AI inference performance

At NVIDIA GTC 2025, DDN launched DDN Inferno, an appliance aimed at accelerating inference for real-time artificial intelligence (AI) applications. Inferno targets two main challenges: latency and cost. DDN claims response times below one millisecond and compute costs 10 times lower than prior solutions. The appliance is designed to raise graphics processing unit (GPU) utilization to 99% and eliminate data bottlenecks.

By integrating with NVIDIA Spectrum-X’s AI-optimized networking, DDN Inferno provides enterprises with improved AI workflows and infrastructure scalability. It supports multimodal AI workloads, including language models and real-time analytics, whether deployed on-premises or in the cloud.

Omar Orqueda, senior vice president of Infinia Engineering at DDN, stated, “Real-time AI isn’t just about speed—it’s about removing every barrier between data and intelligence.”

For more information, visit ddn.com.