Apache InLong

Apache InLong is an open-source, one-stop framework for ingesting and integrating massive data (data integration, streaming data infrastructure). It supports the collection, aggregation, and distribution of multi-source data across diverse storage and computing systems.

  • End-to-end data ingestion framework for multi-source, massive data (data integration)
  • Supports collection, aggregation, and distribution of data streams across heterogeneous systems (streaming data infrastructure)
  • Provides pluggable components for data access, message queuing, storage, and management (modular data pipeline)
  • Offers a management console and configuration mechanisms for defining and operating data ingestion tasks (data pipeline orchestration)
  • Designed for high-throughput, scalable data streaming and batch ingestion scenarios (big data infrastructure)

More About Apache InLong

Apache InLong is a project of The Apache Software Foundation. It focuses on the reliable collection, aggregation, and distribution of massive data from multiple sources into diverse storage and computing systems. The project targets enterprises that need a unified, manageable way to move large-scale data streams from business systems, logs, databases, or applications into downstream data platforms.

The framework is designed around a pluggable, modular architecture (modular data pipeline) that typically includes components for data access, message queuing, storage, and management. It supports the construction of end-to-end data ingestion pipelines that can handle both streaming and batch data. Through its abstraction of data flows, InLong enables users to define logical sources, sinks, and processing paths that can be deployed to underlying distributed messaging and storage technologies, as documented in its official materials.
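To make the idea of logical sources, sinks, and pluggable components concrete, here is a minimal Python sketch. It is not InLong's API: `DataFlow`, `Registry`, and the names `app_logs` and `warehouse` are hypothetical, chosen only to illustrate how a flow definition can stay separate from the implementations plugged in behind it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

# All names below are hypothetical illustrations, not part of Apache InLong.
Record = dict


@dataclass
class DataFlow:
    """A logical flow: one named source feeding one or more named sinks."""
    name: str
    source: str
    sinks: List[str]


class Registry:
    """Maps logical source and sink names to pluggable implementations."""

    def __init__(self) -> None:
        self.sources: Dict[str, Callable[[], Iterable[Record]]] = {}
        self.sinks: Dict[str, Callable[[Record], None]] = {}

    def run(self, flow: DataFlow) -> int:
        """Read every record from the flow's source and fan it out to each sink."""
        count = 0
        for record in self.sources[flow.source]():
            for sink_name in flow.sinks:
                self.sinks[sink_name](record)
            count += 1
        return count


registry = Registry()
collected: List[Record] = []

# Plug in an in-memory source and sink; a real deployment would plug in connectors.
registry.sources["app_logs"] = lambda: iter([{"event": "login"}, {"event": "click"}])
registry.sinks["warehouse"] = collected.append

flow = DataFlow(name="behavior", source="app_logs", sinks=["warehouse"])
delivered = registry.run(flow)
```

Because the flow refers to sources and sinks only by name, swapping the in-memory source for a different implementation would not require changing the flow definition itself, which is the property a modular pipeline design aims for.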

Apache InLong provides a management console and configuration mechanisms (data pipeline orchestration) to help operators and developers define, manage, and monitor data ingestion tasks. These tools support lifecycle operations for data streams, including creation, modification, and suspension of data flows, as well as visibility into the status of tasks and components. This management layer targets enterprise needs around governance and operational control for large-scale data movement.
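The lifecycle operations described above (creation, modification, suspension, and status visibility) can be pictured as a small state machine. The following Python sketch is an assumption-laden illustration, not InLong's management API: the `Status` values, the `ALLOWED` transition table, and the rule that a task must be suspended before it is modified are all hypothetical.

```python
from enum import Enum


class Status(Enum):
    # Hypothetical states; a real management layer defines its own.
    CREATED = "created"
    RUNNING = "running"
    SUSPENDED = "suspended"


# Hypothetical transition table: which states may follow which.
ALLOWED = {
    Status.CREATED: {Status.RUNNING},
    Status.RUNNING: {Status.SUSPENDED},
    Status.SUSPENDED: {Status.RUNNING},
}


class StreamTask:
    """A data ingestion task with a status and an editable configuration."""

    def __init__(self, name: str, config: dict) -> None:
        self.name = name
        self.config = dict(config)
        self.status = Status.CREATED

    def transition(self, target: Status) -> None:
        """Move to the target status, rejecting transitions not in ALLOWED."""
        if target not in ALLOWED[self.status]:
            raise ValueError(f"cannot go from {self.status.value} to {target.value}")
        self.status = target

    def modify(self, **changes) -> None:
        """Update the configuration; assumed here to require a non-running task."""
        if self.status is Status.RUNNING:
            raise ValueError("suspend the task before modifying it")
        self.config.update(changes)


task = StreamTask("behavior_logs", {"batch_size": 100})
task.transition(Status.RUNNING)
task.transition(Status.SUSPENDED)
task.modify(batch_size=500)
task.transition(Status.RUNNING)
```

Keeping the allowed transitions in one table makes the operational rules easy to audit, which matters when many operators manage many flows.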

In enterprise environments, InLong is used to connect heterogeneous systems (system integration) such as online business applications, log collection agents, and data warehouses or lakes. It supports high-throughput ingestion (big data infrastructure) suitable for scenarios like behavioral log collection, business event streaming, and data synchronization across regions or business units. The framework is aligned with technologies from the broader Apache ecosystem, and its documentation describes how it can be deployed on distributed clusters and integrated with standard big data components.

From an interoperability standpoint, Apache InLong is built to interface with multiple kinds of data sources and sinks (data connectivity). Its modular design lets organizations plug in connectors or adapters for their existing infrastructure, so they can standardize data ingestion without fully replacing current systems. In an enterprise directory, InLong therefore fits as a data ingestion and streaming integration platform, under categories such as data integration, streaming data infrastructure, big data ingestion, and pipeline orchestration.