Akamai launches Akamai Cloud Inference for enhanced AI performance

Akamai has introduced Akamai Cloud Inference, a service designed to improve the performance of artificial intelligence (AI) applications. The service aims to deliver a threefold increase in throughput while reducing latency by up to 60% and costs by as much as 86% compared with traditional hyperscale infrastructure.
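To make those percentages concrete, the short Python sketch below applies them to a baseline. The baseline numbers (100 requests per second, 200 ms latency, 1,000 in monthly cost) and the function name are hypothetical, chosen only to show the arithmetic; they are not Akamai measurements. The latency and cost figures are the upper bounds quoted above ("up to" and "as much as"), so real results could be smaller.

```python
def apply_claimed_improvements(throughput_rps, latency_ms, monthly_cost):
    """Apply the upper-bound figures quoted in the announcement to a baseline.

    All inputs are hypothetical baseline values supplied by the caller.
    """
    return {
        # "threefold increase in throughput"
        "throughput_rps": throughput_rps * 3,
        # "reducing latency by up to 60%" leaves 40% of the baseline
        "latency_ms": latency_ms * (100 - 60) / 100,
        # "costs by as much as 86%" leaves 14% of the baseline
        "monthly_cost": monthly_cost * (100 - 86) / 100,
    }


# Hypothetical baseline: 100 req/s, 200 ms latency, 1,000 per month.
result = apply_claimed_improvements(100.0, 200.0, 1000.0)
print(result)
```

Under these assumed inputs, the best case works out to 300 requests per second, 80 ms latency, and a monthly cost of 140.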

According to Adam Karon, Chief Operating Officer at Akamai, moving AI data closer to end users is challenging, particularly for legacy cloud models. While the training of large language models (LLMs) will continue in major data centers, inference will take place at the edge, where the infrastructure Akamai has built over decades gives it a distinct advantage.

Akamai Cloud Inference offers several features, enabling engineers and developers to deploy AI applications closer to end users. The solution combines versatile compute options, including traditional CPUs and advanced GPUs, alongside Application-Specific Integrated Circuit (ASIC) VPUs to handle various AI inference tasks efficiently. Akamai integrates with Nvidia's AI Enterprise ecosystem to enhance performance through tools such as Triton and TensorRT.

Additionally, the service includes a data management framework, developed in partnership with VAST Data, that improves access to real-time data, which is essential for quick inference and responsiveness. This infrastructure is designed to handle the large and diverse datasets that AI applications require.

Moreover, the containerization of AI workloads allows for demand-based scaling, promoting resilience and flexibility across hybrid and multicloud environments. Utilizing Kubernetes, Akamai Cloud Inference facilitates quick AI-ready deployments, streamlining the rollout of models through various open-source projects.
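The article does not say which autoscaling mechanism Akamai Cloud Inference uses. As a general illustration of demand-based scaling in Kubernetes, the sketch below reproduces the replica calculation documented for the Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric). The function name, the replica bounds, and the sample metric values are hypothetical, and the sketch leaves out the tolerance band and stabilization behavior that the real autoscaler applies.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Sketch of the documented Kubernetes HPA scaling formula.

    Simplified: ignores the tolerance band and stabilization windows.
    min_replicas and max_replicas are illustrative defaults.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp the result to the configured replica bounds.
    return max(min_replicas, min(max_replicas, raw))


# Load above target: 4 replicas at 90 against a target of 60 scales out.
scale_out = desired_replicas(4, 90, 60)
# Load below target: 4 replicas at 30 against a target of 60 scales in.
scale_in = desired_replicas(4, 30, 60)
# A very large spike is capped at max_replicas.
capped = desired_replicas(10, 200, 50)
print(scale_out, scale_in, capped)
```

With these sample values the function scales from 4 replicas to 6 under higher load, down to 2 under lower load, and stops at the 10-replica ceiling during a spike.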

Akamai Cloud Inference also includes WebAssembly (Wasm) capabilities, which allow developers to run AI inference directly from serverless applications and improve performance for latency-sensitive applications.

As AI continues to evolve, Akamai notes the importance of shifting focus from training expansive models to deploying more lightweight solutions tailored to specific business needs. This transition is crucial as enterprises look to derive actionable insights from AI, particularly in operational scenarios requiring real-time data processing.

The Akamai Cloud platform, with its network of more than 4,200 points of presence (PoPs) worldwide, supports consistent, high-throughput performance across a range of applications.