Alluxio introduces distributed caching solution for AI workloads on Oracle Cloud Infrastructure

Alluxio said it has introduced a solution that combines its data acceleration capabilities with Oracle Cloud Infrastructure (OCI) to support AI training and inference workloads. The solution targets data access bottlenecks that can starve GPUs of data.

The release describes scenarios where AI workflows rely on object storage and face tradeoffs between keeping data in place and achieving high-performance access. It says traditional approaches can require moving large datasets to align with compute resources, adding operational complexity and cost, while Alluxio enables high-throughput, low-latency access without requiring data migration.

According to the release, Alluxio can be deployed alongside GPU environments on OCI, where it aggregates local NVMe storage into a distributed caching layer. The described results include sub-millisecond latency and terabytes per second of aggregate throughput, with access to data stored in OCI Object Storage or S3-compatible environments. The release also states support for standard interfaces such as POSIX and S3.

In the collaboration described, Fireworks AI uses Alluxio to support high-performance data access across distributed GPU environments, including OCI.

“To deliver fast, reliable inference at scale, we needed a more efficient way to manage data across our GPU infrastructure,” said Chenyu Zhao, cofounder at Fireworks AI. “With Alluxio, we’ve reduced data access times and improved overall system performance while maintaining flexibility across environments. Our infrastructure spans heterogeneous GPU environments, and we rely on efficient data access to maintain performance. By using Alluxio alongside GPU clusters, including those on OCI, we’ve built a distributed system capable of serving more than 2 PB of data daily, reducing replica download times for large models from 20 minutes to 2 minutes, and achieving up to 1 TB/s in aggregate throughput. This architecture allows us to maintain industry-leading inference performance without the operational burden of constantly moving data.”

“The goal is simple: maximize the value of every GPU,” said Haoyuan Li, CEO at Alluxio. “OCI provides some of the best GPU price-performance in the industry. By pairing that infrastructure with Alluxio’s distributed data acceleration layer, AI teams can keep GPUs fully utilized and scale compute wherever innovation demands.”

“Oracle Cloud Infrastructure is designed to deliver the performance, scalability, and cost efficiency required for today’s most demanding AI workloads,” said Sachin Menon, Vice President of Cloud Engineering at Oracle Cloud Infrastructure. “By working with partners like Alluxio, we can help customers reduce bottlenecks and run AI training and workloads with more consistent performance.”