NSS Labs introduces an AIPS test methodology to evaluate runtime AI security
NSS Labs says enterprise AI security evaluation has not kept pace with the way copilots and agentic systems are being connected to data and business workflows, leaving buyers to make decisions without verified evidence of control effectiveness. The update spotlights two research papers and a new AIPS test methodology aimed at runtime protections.
Research Overview
The blog frames AI security as extending beyond the underlying model to the surrounding system components that determine what happens in production. It describes this as an area where risk can persist if data access, instructions, tool calling, and permissions are not controlled.
NSS Labs also introduces a research series intended to help buyers evaluate AI security. It cites two white papers that address what enterprises should consider and how to structure evaluation questions and criteria.
Key Findings
In the first paper, NSS Labs argues that real exposure includes the environment around the model, such as data the system can touch and the ways instructions can be manipulated. The blog states that mismanagement of those elements can lead to incorrect outcomes even when the model is aligned.
The second paper, according to the blog, provides a buyer-oriented set of evaluation questions, red flags, and criteria to distinguish meaningful controls from claims that are not supported by measurable results. The blog positions runtime guardrails as a central theme of this work.
Technical Breakdown
The blog describes runtime guardrails as controls placed around the model rather than changes to the training process or model itself. It states these mechanisms enforce policy, limit access, constrain agent behavior, and support evidence through observability and audit trails.
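To make the idea of a control placed around the model concrete, the following is a minimal sketch of a runtime guardrail that wraps an agent's tool calls. It is not taken from NSS Labs or any vendor product: the `Guardrail` class, its field names, and the substring-based policy check are all hypothetical, and real products would use far more robust detection than phrase matching.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Guardrail:
    """Hypothetical runtime wrapper around an agent's tool calls."""

    allowed_tools: Set[str]          # tools the agent may invoke (limit access)
    blocked_phrases: List[str]       # lowercase phrases the policy rejects
    audit_log: List[dict] = field(default_factory=list)

    def _record(self, tool: str, decision: str, reason: str) -> None:
        # Every decision is logged, allowed or denied (audit trail).
        self.audit_log.append(
            {"tool": tool, "decision": decision, "reason": reason}
        )

    def call_tool(
        self,
        tool: str,
        argument: str,
        tools: Dict[str, Callable[[str], str]],
    ) -> str:
        # Constrain agent behavior: only allowlisted tools may run.
        if tool not in self.allowed_tools:
            self._record(tool, "deny", "tool not in allowlist")
            raise PermissionError(f"tool {tool!r} is not permitted")

        # Enforce policy: a naive, case-insensitive phrase check (assumption).
        lowered = argument.lower()
        for phrase in self.blocked_phrases:
            if phrase in lowered:
                self._record(tool, "deny", f"blocked phrase: {phrase}")
                raise PermissionError("argument violates policy")

        self._record(tool, "allow", "passed policy checks")
        return tools[tool](argument)
```

In this sketch the model itself is untouched. The wrapper decides whether a requested action runs and records each decision, which mirrors the enforce, limit, constrain, and audit roles the blog attributes to runtime guardrails.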
The blog then outlines a new comprehensive test and validation methodology for AI Protections System (AIPS) products. It says the methodology evaluates AIPS capabilities across dimensions including protection against prompt injection, prevention of harmful or unauthorized output, evasion handling, resilience under stress and adverse conditions, policy and filter efficacy, agentic behavior and tool invocation security, observability and auditability, and performance impact.
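One way a buyer might organize results along the dimensions listed above is to tally pass rates per dimension. The sketch below is purely illustrative: the blog does not describe how NSS Labs scores or weights its tests, so the dimension identifiers, the `pass_rates` function, and the simple pass/fail model are all assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Identifiers are hypothetical labels for the dimensions the blog names.
DIMENSIONS = (
    "prompt_injection",
    "harmful_or_unauthorized_output",
    "evasion_handling",
    "resilience",
    "policy_and_filter_efficacy",
    "agentic_and_tool_security",
    "observability_and_auditability",
    "performance_impact",
)


def pass_rates(results: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Return the fraction of passed test cases for each tested dimension.

    `results` is an iterable of (dimension, passed) pairs. Dimensions with
    no test cases are omitted rather than reported as zero.
    """
    totals: Dict[str, int] = defaultdict(int)
    passed: Dict[str, int] = defaultdict(int)
    for dimension, ok in results:
        if dimension not in DIMENSIONS:
            raise ValueError(f"unknown dimension: {dimension!r}")
        totals[dimension] += 1
        passed[dimension] += int(ok)
    return {d: passed[d] / totals[d] for d in DIMENSIONS if totals[d]}
```

Reporting each dimension separately, rather than as one blended score, keeps a weak result in one area (for example, tool invocation security) from being hidden by strong results elsewhere.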
Operational Impact
NSS Labs says the test dimensions are intended to reflect realistic risks for enterprise deployments where AI systems connect to users, enterprise data, tools, APIs, and business processes. The stated goal is to give enterprise buyers, security leaders, and vendors a repeatable and technically rigorous basis for measuring AIPS performance under conditions reflecting real-world use and abuse scenarios.
The blog notes that final reports are expected after testing is completed later in the year. It also points readers to the two white papers as starting points for what matters in evaluation and what “good” should look like when assessing results.
Overall, the blog describes a shift toward measurable runtime evaluation for AI Protections System controls, supported by two white papers and a new AIPS testing methodology covering prompt injection, output authorization, evasion, resilience, policy efficacy, agent/tool security, auditability, and performance impact. This “Blog Signals brief” is a fact-based summary of the vendor blog.