Latency
Latency is the elapsed time between a request or stimulus and the corresponding response in a digital system. It is typically measured in milliseconds and is used to characterize performance in networks, applications, storage, and computing.
Expanded Explanation
1. Technical Function and Core Characteristics
Latency quantifies delay across a path that can include processors, memory, storage, and networks. It is commonly reported as one-way latency, round-trip time (RTT), or end-to-end delay, and is usually expressed in milliseconds or microseconds.
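As a minimal sketch of what measuring latency means in practice, the snippet below times a single operation with a monotonic clock and reports each sample in milliseconds. The helper name `measure_latency_ms` and the sleeping stand-in operation are hypothetical, chosen only for illustration.

```python
import time

def measure_latency_ms(operation, repeats=5):
    """Return the elapsed wall-clock time of each call to `operation`, in milliseconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter_ns()           # monotonic, high-resolution clock
        operation()
        elapsed_ns = time.perf_counter_ns() - start
        samples.append(elapsed_ns / 1_000_000)   # nanoseconds -> milliseconds
    return samples

# Hypothetical stand-in for a real request: sleep for roughly 10 ms.
samples = measure_latency_ms(lambda: time.sleep(0.01))
print([round(s, 2) for s in samples])
```

Timing a request and its response from the caller's side in this way yields round-trip or end-to-end latency. One-way latency generally requires synchronized clocks at both ends.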
In networks, latency includes propagation, transmission, processing, and queuing delays. In compute and storage, latency covers the time required for instruction execution, memory access, input or output operations, and data retrieval or commit operations.
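The four network delay components above add up to the one-way delay of a packet. The following is a rough sketch under stated assumptions: the function name and parameters are illustrative, and the default signal speed of 200,000 km/s is a common approximation for optical fiber rather than a measured value.

```python
def one_way_delay_ms(distance_km, packet_bytes, link_bps,
                     processing_ms=0.0, queuing_ms=0.0,
                     signal_speed_km_per_s=200_000):
    """Sum the four classic delay components for a single link, in milliseconds."""
    # Propagation: time for the signal to travel the physical distance.
    propagation_ms = distance_km / signal_speed_km_per_s * 1000
    # Transmission: time to push every bit of the packet onto the link.
    transmission_ms = packet_bytes * 8 / link_bps * 1000
    # Processing and queuing are supplied by the caller (assumed, not modeled).
    return propagation_ms + transmission_ms + processing_ms + queuing_ms

delay = one_way_delay_ms(distance_km=1000, packet_bytes=1500,
                         link_bps=100_000_000,
                         processing_ms=0.05, queuing_ms=0.2)
print(f"{delay:.2f} ms")  # 5.00 + 0.12 + 0.05 + 0.20 = 5.37 ms
```

In this example propagation dominates, which is why distance tends to set the floor for wide-area latency regardless of link speed.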
2. Enterprise Usage and Architectural Context
Enterprises track latency as a core performance metric for networked applications, cloud services, databases, APIs, and microservices. Service Level Objectives (SLOs) and Service Level Agreements (SLAs) often specify latency thresholds for end-user transactions and machine-to-machine interactions.
Architects manage latency through network design, Traffic Engineering (TE), workload placement, data locality, and caching strategies. Observability platforms, synthetic testing, and protocol-level instrumentation expose latency distributions such as p50, p95, and p99 to support capacity planning and incident response.
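To illustrate how percentile summaries such as p50, p95, and p99 are derived from raw samples, here is a small sketch using the nearest-rank method. The sample values are invented for illustration, and real observability platforms may use interpolation or histogram approximations instead.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of values at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))  # 1-based rank
    return ordered[rank - 1]

# Hypothetical request latencies in milliseconds, with one slow outlier.
latencies_ms = [12, 15, 11, 14, 13, 240, 16, 12, 13, 15]
summary = {f"p{p}": percentile(latencies_ms, p) for p in (50, 95, 99)}
print(summary)  # {'p50': 13, 'p95': 240, 'p99': 240}
```

The single 240 ms outlier leaves the median at 13 ms but determines p95 and p99, which is why tail percentiles are tracked alongside the median.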
3. Related or Adjacent Technologies
Latency closely relates to throughput, jitter, and packet loss in network engineering. It also relates to storage input/output operations per second (IOPS), query response time in databases, and tail latency behavior in distributed systems.
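Jitter can be understood as variation in latency between successive measurements. The sketch below uses one simple definition, the mean absolute difference between consecutive samples. This is an assumption made for illustration, and specific protocols and tools define jitter in other ways, such as smoothed running estimates.

```python
def mean_jitter_ms(latencies_ms):
    """Mean absolute difference between consecutive latency samples."""
    if len(latencies_ms) < 2:
        return 0.0  # jitter is undefined for fewer than two samples; report zero
    diffs = [abs(b - a) for a, b in zip(latencies_ms, latencies_ms[1:])]
    return sum(diffs) / len(diffs)

# Hypothetical latency series in milliseconds.
steady = [20, 21, 20, 21, 20]
bursty = [20, 60, 20, 60, 20]
print(mean_jitter_ms(steady))  # 1.0
print(mean_jitter_ms(bursty))  # 40.0
```

Two paths can therefore differ far more in jitter than in average latency, which matters for real-time traffic that depends on steady delivery.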
Standards bodies and industry groups define latency metrics and test methods for domains such as 5G, Time-Sensitive Networking (TSN), and real-time industrial control. Performance engineering practices incorporate latency into benchmarking, load testing, and Quality of Service (QoS) mechanisms.
4. Business and Operational Significance
Latency affects user experience, transaction completion rates, and the behavior of real-time applications such as voice, video, trading, and industrial automation. High or variable latency can contribute to timeouts, retries, and Service Level Objective (SLO) breaches.
Operations teams use latency metrics for incident detection, Root Cause Analysis (RCA), and change validation. Procurement, network planning, and cloud migration decisions often evaluate latency characteristics of connectivity options, data center locations, and service providers.