Skip to main content

Cloud Native Computing Foundation graduates Dragonfly

The Cloud Native Computing Foundation graduated Dragonfly, an open source image and file distribution system, after the project demonstrated production readiness and adoption for container and Artificial Intelligence (AI) workloads.

Organizations including Ant Group, Alibaba, Datadog, DiDi, and Kuaishou used Dragonfly to power large-scale container and AI model distribution, the CNCF said. The project supported tens of millions of container launches per day, reduced storage bandwidth by up to 90 percent, and shortened launch time from minutes to seconds. Contributor counts rose from 45 individuals at five companies to 271 individuals across more than 130 companies, and total commits increased from roughly 800 to about 26,000.

Dragonfly delivered data distribution and acceleration using peer-to-peer technology. It Radio Access Network (RAN) on Kubernetes and was installed via Helm, with its official chart available on Artifact Hub. The project incorporated Prometheus for performance tracking, OpenTelemetry (OTel) for collecting and sharing data, and gRPC for communication, and it extended Harbor through the preheat feature while supporting AI model artifacts defined by the ModelPack specification.

Alibaba Group open-sourced Dragonfly in November 2017 and the project joined the CNCF Sandbox in October 2018; Dragonfly 1.0 became production-ready in November 2019. The Nydus subproject was open-sourced in January 2020, Dragonfly entered Incubation in April 2020, and Dragonfly 2.0 was released in 2021. To complete graduation, the team revised election policy, clarified the maintainer lifecycle, standardized the contribution process, defined a community ladder, added subproject guidelines, and completed a third-party security audit alongside a joint assessment with CNCF TAG Security.

“Dragonfly's graduation reflects the project's maturity, broad industry adoption and critical role in scaling cloud native infrastructure,” said Chris Aniszczyk, CTO, CNCF. “The combination of Dragonfly and Nydus substantially shortens launch times for container images and AI models, enhancing system resilience and efficiency.” said Jiang Liu, Nydus maintainer.

The project said it would accelerate AI model weight distribution using Remote Direct Memory Access (DMA) (RDMA), optimize image layout to reduce data loading time for large-scale AI workloads, introduce load-aware two-phase scheduling, and add automatic updates and fault recovery to stabilize components during traffic bursts while controlling back-to-source traffic.