pnda
PNDA (Platform for Network Data Analytics) is an open-source big data platform (data analytics) for scalable collection, storage, and analysis of network and service telemetry.
- Unified platform for ingesting, storing, and processing large-scale network and IT telemetry (data analytics)
- Supports batch and streaming data pipelines based on a Hadoop ecosystem stack (big data infrastructure)
- Provides a reference architecture for network analytics applications and machine learning (ML) workloads (analytics platform)
- Integrates with common messaging, processing, and storage components such as Kafka, HDFS, and Spark (data pipeline orchestration)
- Targets communications service provider (CSP) and enterprise use cases such as fault analysis, capacity planning, and service assurance (network operations analytics)
More About pnda
PNDA (Platform for Network Data Analytics) is an open-source big data platform (data analytics) under LF Networking that focuses on large-scale network and service analytics for communications service providers and enterprises. It addresses the problem of ingesting, normalizing, storing, and analyzing high-volume, heterogeneous telemetry data generated by modern IP networks, virtualized infrastructure, and cloud-native services. The project provides an opinionated architecture and tooling to deploy a complete analytics stack capable of handling both streaming and batch processing workloads.
At its core, PNDA defines a reference architecture and deployment model built on a Hadoop ecosystem stack (big data infrastructure), typically including Apache Kafka for ingestion (data streaming), Apache Spark for processing (distributed analytics), and HDFS for storage (distributed file system). The platform focuses on providing an integrated data pipeline that supports time-series metrics, logs, events, and other machine data from network elements, virtual network functions, and IT platforms. Through this stack, PNDA enables construction of analytics jobs and ML workflows for use cases such as fault detection, anomaly detection, capacity utilization analysis, and service quality monitoring.
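The data flow described above can be sketched with in-memory stand-ins. This is a toy illustration only: it uses no Kafka, HDFS, or Spark API, and the function names (`ingest`, `store`, `batch_average`, `stream_alerts`), the event fields, and the hourly partitioning are all assumptions made for the example, not part of PNDA. It shows the same ingested events feeding both a batch-style aggregation and a streaming-style threshold check.

```python
from collections import defaultdict
from statistics import mean


def ingest(events):
    """Stand-in for a message bus topic: hand events over one at a time."""
    for event in events:
        yield event


def store(stream):
    """Stand-in for distributed storage: group events by (source, hour)."""
    partitions = defaultdict(list)
    for event in stream:
        hour = event["timestamp"] // 3600  # hypothetical hourly partitioning
        partitions[(event["source"], hour)].append(event)
    return partitions


def batch_average(partitions, metric):
    """Stand-in for a batch job: average one metric per stored partition."""
    return {
        key: mean(event[metric] for event in events)
        for key, events in partitions.items()
    }


def stream_alerts(stream, metric, threshold):
    """Stand-in for a streaming job: flag events as they arrive."""
    for event in stream:
        if event[metric] > threshold:
            yield event


# Hypothetical telemetry samples; timestamps are seconds since an arbitrary start.
EVENTS = [
    {"source": "router-1", "timestamp": 0, "cpu": 20.0},
    {"source": "router-1", "timestamp": 1800, "cpu": 40.0},
    {"source": "router-1", "timestamp": 3600, "cpu": 95.0},
    {"source": "switch-7", "timestamp": 60, "cpu": 10.0},
]

averages = batch_average(store(ingest(EVENTS)), "cpu")
alerts = list(stream_alerts(ingest(EVENTS), "cpu", 90.0))
```

In a real deployment, each stand-in would be replaced by the corresponding platform component. The point of the sketch is only that batch and streaming jobs read from one shared ingestion path.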
PNDA also provides common services to simplify application development and operations. These include mechanisms for data ingestion and schema management (data integration), a standardized data model for storing multi-source telemetry (data modeling), and APIs to access processed and raw data (data access). The platform is designed to be deployed on standard hardware or virtualized/cloud environments (infrastructure platform), with tooling for installation, configuration, and lifecycle management. The goal is to offer a reusable environment so that multiple analytics applications can share the same underlying big data infrastructure.
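One common way to store multi-source telemetry under a single data model is to wrap every payload in a small, fixed envelope and leave the payload itself opaque. The sketch below illustrates that idea. The class name `TelemetryEnvelope`, its field names, and the JSON payload encoding are hypothetical choices for this example and should not be read as PNDA's actual schema.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class TelemetryEnvelope:
    """Hypothetical common wrapper shared by all telemetry sources."""
    source: str        # logical data source, e.g. "syslog" or "netflow"
    host: str          # device or host that produced the record
    timestamp_ms: int  # event time in milliseconds since the Unix epoch
    raw: bytes         # original payload, kept opaque to the platform


def wrap(source, host, timestamp_ms, payload):
    """Validate the envelope fields and serialize the payload to bytes."""
    if not source or not host:
        raise ValueError("source and host must be non-empty")
    if timestamp_ms < 0:
        raise ValueError("timestamp_ms must be non-negative")
    raw = json.dumps(payload, sort_keys=True).encode("utf-8")
    return TelemetryEnvelope(source, host, timestamp_ms, raw)


def unwrap(envelope):
    """Decode the payload back into a dictionary."""
    return json.loads(envelope.raw.decode("utf-8"))


record = wrap("syslog", "10.0.0.1", 1700000000000, {"severity": 3, "msg": "link down"})
payload = unwrap(record)
```

Because the envelope fields are the same for every source, storage and routing logic can work from `source` and `timestamp_ms` alone, while each analytics application decodes only the payloads it understands.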
In enterprise and service provider environments, PNDA is used as a base platform for network operations analytics (NetOps analytics), security monitoring (security analytics), and service assurance (operations support). It allows operators to consolidate previously siloed monitoring and logging systems into a single data lake and analytics environment. By standardizing on a Hadoop-based stack, PNDA aligns with many enterprise data engineering practices and can integrate with existing BI tools, reporting systems, and ML frameworks that can access data stored in HDFS or produced by Spark jobs.
From an ecosystem and interoperability perspective, PNDA is positioned within LF Networking (open networking ecosystem) as a platform that can consume data from software-defined networking (SDN) controllers, network functions virtualization (NFV) infrastructure managers, and other network orchestration systems, while exposing analytics results back to operational support systems and automation platforms. In an enterprise directory, it fits under big data platforms for network analytics, observability and monitoring infrastructure, and operations support tooling. The focus on open-source components and reference architectures allows organizations to adapt the deployment to their environment while maintaining a consistent model for network and service data analytics.