MangoBoost Achieves Records in MLPerf Inference v5.0 with AMD MI300X GPUs

MangoBoost, a provider of system solutions for Artificial Intelligence (AI) efficiency, has achieved record results in MLPerf Inference v5.0 with its Mango LLMBoost™ software on AMD MI300X GPUs. This achievement marks the first-ever multi-node MLPerf inference result on these GPUs, demonstrating a significant performance advantage over prior results, including those using NVIDIA H100 GPUs.

Running on 32 MI300X GPUs across four server nodes, Mango LLMBoost™ delivered the highest inference performance yet: 103,182 tokens per second (TPS) in the offline scenario and 93,039 TPS in the server scenario, both above the previous best of 82,749 TPS on NVIDIA hardware. The setup also offers a substantial cost advantage. Because MI300X GPUs are more affordable than their NVIDIA counterparts, potential cost savings reach up to 62%.
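To put the throughput figures in perspective, the short Python sketch below computes the relative gain over the previous best and the average throughput per GPU. It uses only the numbers quoted above. The variable and function names are illustrative, and the per-GPU value is a simple average that assumes all 32 GPUs contribute equally. The text does not say which scenario the 82,749 TPS figure was measured in, so the server-scenario comparison is indicative only.

```python
# Figures quoted in the text (tokens per second).
OFFLINE_TPS = 103_182
SERVER_TPS = 93_039
PREVIOUS_BEST_TPS = 82_749  # scenario not specified in the text
NUM_GPUS = 32               # 32 MI300X GPUs across four nodes


def relative_gain(new: float, old: float) -> float:
    """Return the fractional improvement of `new` over `old`."""
    return new / old - 1.0


offline_gain = relative_gain(OFFLINE_TPS, PREVIOUS_BEST_TPS)
server_gain = relative_gain(SERVER_TPS, PREVIOUS_BEST_TPS)

# Simple averages; assumes every GPU contributes equally.
offline_per_gpu = OFFLINE_TPS / NUM_GPUS
server_per_gpu = SERVER_TPS / NUM_GPUS

print(f"Offline vs previous best: {offline_gain:+.1%}")
print(f"Server vs previous best:  {server_gain:+.1%}")
print(f"Offline TPS per GPU: {offline_per_gpu:.1f}")
print(f"Server TPS per GPU:  {server_per_gpu:.1f}")
```

Under these assumptions, the offline result is roughly 25% above the previous best and the server result roughly 12% above it, at about 3,224 and 2,907 TPS per GPU respectively.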

Beyond raw performance, Mango LLMBoost™ supports over 50 open models and is designed to scale seamlessly. It can be deployed in cloud environments or on-premises, making it suitable for a range of enterprise needs. Key features include efficient model distribution and runtime optimization.

This milestone was achieved through a collaboration with AMD, leveraging the ROCm software stack, which enhances the scalability and efficiency of the AI inference solution. Beyond the MLPerf results, Mango LLMBoost™ has shown superior cost-efficiency and performance across cloud configurations, further establishing its role in the AI infrastructure sector.