Network Automation Has an Identity Crisis, & That’s a Good Thing
NANOG 96 conversations centered on network automation’s shift away from configuration pushing toward service delivery, with agentic Artificial Intelligence (AI) framed as a reasoning tool constrained by allowed actions. For enterprise operators, the discussion clarifies what to measure and how to govern execution.
Research Overview
The post draws on discussions at NANOG 96 in San Francisco and a panel moderated by Ethan Banks, featuring Justin Ryburn and Bill Lapcevic. The central theme is that automation’s scope has broadened, even as common definitions have lagged behind.
The author uses that setting to argue that automation work is oriented toward delivering services at scale rather than treating device configuration as the end goal. The framing connects the evolution of automation tooling to how operators may adopt AI reasoning in network operations.
Key Findings
The post states that device configuration is “incidental” to the work automation supports, based on a premise set by the moderator. It also describes automation as having evolved from scripts to integrations and then to orchestration and service lifecycle management.
A second finding is the mismatch between how network practitioners and business stakeholders define “service.” The author contrasts network primitives used by technical teams with outcomes sought by applications and business groups, describing this gap as a point where automation initiatives stall.
Technical Breakdown
The post defines an AI agent as a software process that connects a library to a Large Language Model (LLM), with reasoning handled by the model. It says the agent is given a prompt, a set of tools it can use, and a goal, then executes steps while calling those tools.
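The agent described above (a prompt, a set of tools, a goal, and a loop that calls those tools) can be sketched in a few lines. This is a minimal illustration, not the post's implementation: the `Agent` class, the tool names, and `scripted_model` are all hypothetical, and the model is replaced by a fixed script so the example runs without any LLM.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# A tool is any callable the agent may invoke (hypothetical shape).
Tool = Callable[[str], str]
# The model picks the next (tool_name, argument), or None when it is done.
Model = Callable[[str, str, List[str]], Optional[Tuple[str, str]]]


@dataclass
class Agent:
    prompt: str
    goal: str
    tools: Dict[str, Tool]
    model: Model
    history: List[str] = field(default_factory=list)

    def run(self, max_steps: int = 5) -> List[str]:
        """Ask the model for a step, call the chosen tool, record the result."""
        for _ in range(max_steps):
            decision = self.model(self.prompt, self.goal, self.history)
            if decision is None:
                break
            name, arg = decision
            result = self.tools[name](arg)
            self.history.append(f"{name}({arg}) -> {result}")
        return self.history


def scripted_model(prompt: str, goal: str, history: List[str]):
    """Stand-in for an LLM call: replays a fixed two-step plan (assumption)."""
    plan = [
        ("get_interface_state", "et-0/0/1"),
        ("open_ticket", "et-0/0/1 is down"),
    ]
    return plan[len(history)] if len(history) < len(plan) else None


# Hypothetical tools returning canned values.
tools = {
    "get_interface_state": lambda ifname: "down",
    "open_ticket": lambda summary: "TICKET-1",
}

agent = Agent(
    prompt="You are a network operations assistant.",
    goal="Report any down interface.",
    tools=tools,
    model=scripted_model,
)
history = agent.run()
```

In a real deployment the `model` callable would wrap an LLM request; everything else in the loop stays ordinary software.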
The author distinguishes between the model's strength at higher-level reasoning and its limitations with environment-specific details, citing the specifics of a mixed Juniper and Arista fabric and edge cases as places where failures can occur. The discussion then turns to error-message variation in pipelines: rather than hard-coding a rigid condition for each case, reasoning is allowed to handle the many forms an error can take.
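The contrast between rigid condition handling and reasoning over error variants might look like the sketch below. It rests on assumptions: the error strings are invented for illustration and are not real vendor output, the category names are hypothetical, and `stub_classifier` is a keyword-based stand-in for where a model call would sit.

```python
from typing import Callable

# Rigid handling: only an exact, previously seen string is recognized.
RIGID_HANDLERS = {"commit failed: database locked": "retry"}


def rigid_handle(message: str) -> str:
    return RIGID_HANDLERS.get(message, "fail_pipeline")


# Reasoned handling: classify the message into a small fixed set of
# categories, then map each category to a pipeline action.
CATEGORY_ACTIONS = {
    "lock_contention": "retry",
    "auth_failure": "escalate",
    "unknown": "fail_pipeline",
}


def stub_classifier(message: str) -> str:
    """Placeholder for a model call (assumption); uses simple keywords."""
    text = message.lower()
    if "lock" in text:
        return "lock_contention"
    if "denied" in text or "authentication" in text:
        return "auth_failure"
    return "unknown"


def reasoned_handle(
    message: str, classify: Callable[[str], str] = stub_classifier
) -> str:
    category = classify(message)
    # Any category outside the known set falls back to failing safely.
    return CATEGORY_ACTIONS.get(category, "fail_pipeline")


variant = "Error: configuration database is locked by another session"
rigid_result = rigid_handle(variant)
reasoned_result = reasoned_handle(variant)
```

The classifier's output is still mapped onto a fixed set of actions, so an unexpected answer from the reasoning step cannot trigger an unexpected pipeline behavior.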
Operational Impact
The post reports a roughly 50/50 split among NANOG attendees on whether agentic AI is ready for production networks, with skepticism linked to the need for determinism and predictable outcomes. The post answers that concern by placing determinism “between the agent and the network,” restricting the agent to an allowed set of actions such as templating, ticket creation, and messaging.
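One way to picture determinism sitting between the agent and the network is an allowlist gate: the agent may request any action, but only registered, deterministic handlers ever run. The sketch below is an assumption-laden illustration; the handler names, the template text, and the returned identifiers are hypothetical.

```python
from typing import Callable, Dict


class ActionNotAllowed(Exception):
    """Raised when the agent requests an action outside the allowed set."""


def render_template(params: Dict[str, str]) -> str:
    # Deterministic templating: the same inputs always give the same text.
    return "interface {name}\n description {description}".format(**params)


def create_ticket(params: Dict[str, str]) -> str:
    return f"ticket created: {params['summary']}"


def send_message(params: Dict[str, str]) -> str:
    return f"message sent to {params['channel']}"


# The only actions the agent can trigger (hypothetical registry).
ALLOWED_ACTIONS: Dict[str, Callable[[Dict[str, str]], str]] = {
    "render_template": render_template,
    "create_ticket": create_ticket,
    "send_message": send_message,
}


def execute(action: str, params: Dict[str, str]) -> str:
    """Run an agent-requested action only if it is on the allowlist."""
    handler = ALLOWED_ACTIONS.get(action)
    if handler is None:
        raise ActionNotAllowed(f"action not permitted: {action}")
    return handler(params)


rendered = execute(
    "render_template", {"name": "Ethernet1", "description": "uplink"}
)
```

A request such as `execute("push_raw_cli", {...})` raises `ActionNotAllowed`, so the agent's reasoning can vary while the set of possible effects on the network stays fixed.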
For rollout practices, the author compares agentic AI adoption to prior automation maturity models that started with testing, validation, human approvals, read-only operation, and bounded autonomy. The post also describes a customer example in which Distributed Denial of Service (DDoS) detection and mitigation reached full automation only after confidence was built through hundreds of events.
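The staged rollout described above can be expressed as a small policy check. This is a sketch under stated assumptions: the three mode names, the `may_execute` and `next_mode` helpers, and the promotion rule are hypothetical, and the promotion threshold is left as a required parameter because the post gives no specific number beyond "hundreds of events."

```python
from enum import Enum


class Mode(Enum):
    READ_ONLY = 1
    HUMAN_APPROVAL = 2
    BOUNDED_AUTONOMY = 3


def may_execute(
    mode: Mode, is_write: bool, approved: bool = False, in_bounds: bool = False
) -> bool:
    """Decide whether an agent-proposed action may run in the current mode."""
    if not is_write:
        return True  # reads are permitted in every mode
    if mode is Mode.READ_ONLY:
        return False
    if mode is Mode.HUMAN_APPROVAL:
        return approved  # a person must sign off on each write
    return in_bounds  # bounded autonomy: only pre-approved kinds of change


def next_mode(mode: Mode, clean_events: int, threshold: int) -> Mode:
    """Promote one stage once enough events have been handled cleanly."""
    if clean_events < threshold or mode is Mode.BOUNDED_AUTONOMY:
        return mode
    return Mode(mode.value + 1)


can_write_read_only = may_execute(Mode.READ_ONLY, is_write=True)
can_write_approved = may_execute(Mode.HUMAN_APPROVAL, is_write=True, approved=True)
```

The operator chooses `threshold` to match their own risk tolerance; the point of the sketch is only that trust is earned per stage rather than granted up front.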
The post also describes a broader shift framed as moving from silos to integration and from activity-focused work to outcome-oriented operation. It asserts that AI reasoning can correlate data streams such as syslog, configuration changes, and traffic spikes to answer questions across layers, for example about application latency tied to network telemetry and recent changes.
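As a rough illustration of the inputs to that kind of cross-layer question, the sketch below gathers events from several streams that fall in a window before a latency incident. It is plain time-window filtering, not AI reasoning, and stands in for a tool an agent might call; the `Event` shape, the source labels, and the sample data are all invented for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass(frozen=True)
class Event:
    source: str  # e.g. "syslog", "config_change", "traffic" (hypothetical)
    timestamp: datetime
    detail: str


def events_before(
    incident_time: datetime, events: List[Event], window: timedelta
) -> List[Event]:
    """Return events within `window` before the incident, oldest first."""
    start = incident_time - window
    hits = [e for e in events if start <= e.timestamp <= incident_time]
    return sorted(hits, key=lambda e: e.timestamp)


# Invented sample data for illustration only.
incident = datetime(2025, 1, 1, 12, 30)
events = [
    Event("config_change", datetime(2025, 1, 1, 12, 25), "ACL updated"),
    Event("traffic", datetime(2025, 1, 1, 11, 0), "spike on uplink"),
    Event("syslog", datetime(2025, 1, 1, 12, 20), "link flap"),
]
related = events_before(incident, events, timedelta(minutes=15))
related_sources = [e.source for e in related]
```

A reasoning step would then take `related` as context and judge which of the nearby events plausibly explains the latency.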
Blog Signals brief is a fact-based summary of the vendor blog.