In-process guardrails help developers build agents that try to behave well. Infrastructure-level enforcement proves agents did behave well. Production deployments need both — and it's critical to understand the difference before you buy.
The enterprise AI conversation has shifted. A year ago, the question was whether to deploy AI agents. Today, the question is how to deploy them safely, at scale, with evidence that satisfies regulators, auditors, and boards.
The market has responded with a wave of products that claim to govern what agents do. Microsoft published an open-source Agent Governance Toolkit. Okta, SailPoint, and CyberArk have all announced agent-related capabilities. NVIDIA's NeMo Guardrails, Guardrails AI, LiteLLM, and Portkey each address pieces of the puzzle. Every major platform vendor now has a slide about controlling AI agent behavior.
But most of these solutions operate at the same architectural layer — and that layer, while necessary, is not sufficient for production enterprise deployments. Understanding why requires a distinction that the market hasn't clearly drawn yet.
Layer 1: In-Process Governance
In-process governance means embedding policy enforcement inside the agent's runtime. The governance logic runs in the same process as the agent. When the agent attempts an action — calling a tool, accessing data, spawning a sub-agent — the governance middleware intercepts the request, evaluates it against policy rules, and allows or denies it.
This is what Microsoft's Agent Governance Toolkit does, and it does it well. Sub-millisecond policy evaluation. Deterministic enforcement (not probabilistic, not prompt-based). Four-tier privilege rings inspired by OS hardware protection. Cryptographic agent identity. SRE primitives for reliability engineering.
This is also what NeMo Guardrails does for conversational constraints, what Guardrails AI does for output validation, and what LiteLLM and Portkey do for LLM call management. They all operate within or alongside the agent process, filtering, validating, or routing at the application layer.
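The interception pattern these tools share can be sketched in a few lines. The policy fields, decorator, and tool names below are purely illustrative, not any vendor's actual API:

```python
# Minimal sketch of in-process governance: middleware that intercepts
# tool calls inside the agent's own process and evaluates them against
# local policy rules before execution.
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Policy:
    allowed_tools: set[str]   # tools this agent may invoke
    denied_args: set[str]     # argument values that are always blocked


class PolicyViolation(Exception):
    pass


def governed(policy: Policy, tool_name: str):
    """Wrap a tool so every invocation is checked in-process."""
    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        def wrapper(*args, **kwargs):
            if tool_name not in policy.allowed_tools:
                raise PolicyViolation(f"tool '{tool_name}' not permitted")
            if any(a in policy.denied_args for a in args):
                raise PolicyViolation("blocked argument value")
            return fn(*args, **kwargs)
        return wrapper
    return decorator


policy = Policy(allowed_tools={"search"}, denied_args={"/etc/passwd"})

@governed(policy, "search")
def search(query: str) -> str:
    return f"results for {query}"

@governed(policy, "delete_file")
def delete_file(path: str) -> str:
    return f"deleted {path}"
```

Note that the check lives in the same process as the agent: remove the decorator, and the tool runs unchecked. That is precisely the limitation discussed below.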
Value
Speed and developer ergonomics. Sub-millisecond latency. Native framework integration. Configuration as code, checked into version control alongside the agent.
Limitation
The enforcement boundary is the agent process. Every agent must include the middleware. Every agent must load the right policies. If a developer deploys an agent without the middleware, that agent is ungoverned.
In-process governance is the seatbelt. It protects the occupant when correctly used. But it can be unbuckled.
Layer 2: Infrastructure-Level Enforcement
Infrastructure-level enforcement means deploying governance as network services that sit between agents and the resources they access. The enforcement happens at the network boundary — at the API gateway, the tool invocation proxy, the LLM traffic interceptor. The agent doesn't need to opt into governance. If it wants to reach a tool, an LLM, or a data source, it goes through the enforcement point. There is no alternative path.
This is what firewalls do for network security. This is what API gateways do for microservice architectures. This is what Policy Enforcement Points (PEPs) do in traditional authorization architectures — a concept that's been battle-tested for decades in enterprise identity management.
At this layer, a dedicated Policy Decision Point evaluates authorization requests over a network protocol. The PDP doesn't run inside the agent — it runs as a separate service, with its own lifecycle, its own availability characteristics, and its own security boundary. The wire protocol is standard (in the case of OpenID AuthZEN, it's a public specification). The enforcement point blocks or allows the action based on the PDP's response.
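The shape of that evaluation call can be sketched as follows. The PDP here is a local stub so the example is self-contained; a real PEP would POST the same subject/action/resource JSON to the PDP's AuthZEN evaluation endpoint over HTTPS, and the attribute values are illustrative:

```python
# A sketch of a Policy Enforcement Point (PEP) consulting a Policy
# Decision Point (PDP) using the OpenID AuthZEN evaluation request
# shape: subject, action, resource, plus optional context.
import json


def build_evaluation_request(agent_id: str, tool: str, action: str) -> dict:
    return {
        "subject": {"type": "agent", "id": agent_id},
        "action": {"name": action},
        "resource": {"type": "tool", "id": tool},
        "context": {},
    }


def stub_pdp(request: dict) -> dict:
    # Stand-in for the remote PDP: allow only the 'search' tool.
    allowed = request["resource"]["id"] == "search"
    return {"decision": allowed}


def enforce(agent_id: str, tool: str, action: str = "invoke") -> bool:
    request = build_evaluation_request(agent_id, tool, action)
    # Round-trip through JSON to mimic the wire format.
    response = stub_pdp(json.loads(json.dumps(request)))
    return response["decision"] is True
```

The key property: the agent never sees `stub_pdp`'s logic. The decision is made by a separate service, and changing the policy changes behavior for every agent at once.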
Universal Enforcement
An agent built with LangChain, one built with CrewAI, one built with the OpenAI Agents SDK, and a bespoke agent written in Go all pass through the same enforcement point. No agent-specific configuration required.
Independence of Proof
When a Layer 2 system records an action, the record is produced by an independent enforcement point — a separate service with its own cryptographic signing key. This is the difference between a driver recording their speed and a speed camera photographing the license plate.
Why Most Vendors Sell You Only One Layer
Building an in-process SDK is a well-understood engineering problem. You define interfaces, implement middleware, publish packages, and write framework adapters. Microsoft did this in months — and shipped five language SDKs. It's hard work, but the architecture is familiar.
Building infrastructure-level enforcement for AI agents is a different engineering challenge. You need a PDP that can evaluate rich authorization requests at single-digit millisecond latency. You need specialized PEPs for different traffic types — LLM interactions have different authorization semantics than tool invocations. You need an identity model that integrates with existing enterprise directories, not one that creates a parallel identity system. You need a credential isolation architecture where tokens never enter the agent's process memory. You need budget enforcement at the point of execution. And you need a proof chain — cryptographic receipts that independently verify every governed action.
This is identity governance infrastructure. It builds on decades of authorization architecture — XACML, OAuth 2.0, SAML, SCIM, and now AuthZEN. It requires deep domain expertise in enterprise identity, delegation models, and regulatory compliance. It's not something you build from scratch in a quarter.
What Production Deployments Actually Need
Identity
Enterprises already have Azure AD, Okta, or Ping. Agents should participate in the same identity chain via OAuth 2.0 Token Exchange (RFC 8693) and DPoP (RFC 9449) — not a parallel system with custom DIDs and proprietary trust scores that creates administrative overhead and audit gaps.
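For illustration, here is roughly what an RFC 8693 token-exchange request body looks like when an agent acts on a user's behalf. The token values and audience are placeholders; a real client would POST this form body to the IdP's token endpoint:

```python
# Building an OAuth 2.0 Token Exchange (RFC 8693) request body.
# The subject_token is the delegating user's token; the actor_token
# identifies the agent acting on that user's behalf.
from urllib.parse import urlencode


def token_exchange_body(user_token: str, agent_token: str, audience: str) -> str:
    params = {
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "subject_token": user_token,
        "subject_token_type": "urn:ietf:params:oauth:token-type:access_token",
        "actor_token": agent_token,
        "actor_token_type": "urn:ietf:params:oauth:token-type:access_token",
        "audience": audience,
    }
    return urlencode(params)
```

Because the resulting token carries both the user and the agent identities, downstream audit records show who delegated to whom — no parallel identity system required.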
Authorization
The same PDP should govern agents, APIs, and UI sessions through a single policy surface. OpenID AuthZEN provides a standard evaluation API. A network-accessible PDP means one policy change propagates to every governed surface — not YAML changes in every agent repository.
Audit
Structured logging is useful for debugging. It is not evidence. Regulators need signed receipts (JWS) with per-agent hash chains providing independently verifiable proof — not log entries that require trust in the logging infrastructure and post-hoc reconstruction.
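The hash-chain construction underneath such receipts can be sketched as below. Signing each receipt as a JWS is omitted to keep the example self-contained; the chain alone already makes tampering with any past entry detectable:

```python
# Per-agent hash chain: each receipt embeds the hash of the previous
# one, so altering any past entry invalidates everything after it.
import hashlib
import json


def receipt_hash(receipt: dict) -> str:
    canonical = json.dumps(receipt, sort_keys=True).encode()
    return hashlib.sha256(canonical).hexdigest()


def append_receipt(chain: list[dict], action: str, decision: str) -> None:
    prev = receipt_hash(chain[-1]) if chain else "0" * 64
    chain.append({"action": action, "decision": decision, "prev": prev})


def verify_chain(chain: list[dict]) -> bool:
    prev = "0" * 64
    for receipt in chain:
        if receipt["prev"] != prev:
            return False
        prev = receipt_hash(receipt)
    return True
```

An auditor holding only the chain and the enforcement point's public signing key can verify it without trusting the logging pipeline — the property plain structured logs cannot provide.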
Budget Control
An in-process SDK has no native mechanism for budget enforcement. An infrastructure-level enforcement point can perform budget pre-checks, apply stream-time caps during LLM generation, and return HTTP 402 when budgets are exhausted. Finance teams get real-time spend visibility.
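A minimal sketch of such a budget pre-check at the enforcement point, with illustrative field names and amounts:

```python
# Budget pre-check at the gateway: before a governed call proceeds,
# check remaining budget and return HTTP 402 (Payment Required) when
# the proposed spend would exceed the limit.
from dataclasses import dataclass


@dataclass
class Budget:
    limit_usd: float
    spent_usd: float = 0.0


def precheck(budget: Budget, estimated_cost_usd: float) -> tuple[int, str]:
    """Return (http_status, message) for a proposed spend."""
    if budget.spent_usd + estimated_cost_usd > budget.limit_usd:
        return 402, "budget exhausted"
    budget.spent_usd += estimated_cost_usd
    return 200, "allowed"
```

Because the check runs at the point of execution rather than in a nightly cost report, an exhausted budget stops the next call, not next month's invoice.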
Tool Integrity
An in-process scanner detects known patterns of tool tampering — heuristic detection, analogous to antivirus scanning. An infrastructure-level gateway validates cryptographic schema pins — RS256-signed attestations analogous to code signing. When supply chain integrity is at stake, you need the signature.
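Schema pinning can be sketched as a canonical-hash comparison. A real gateway would verify an RS256-signed attestation over the pin rather than a bare hash match; the signature step is omitted here to keep the example self-contained:

```python
# Tool schema pinning: canonicalize the tool's declared schema, hash
# it, and compare against the pin recorded at registration time. Any
# post-registration change to the schema breaks the pin.
import hashlib
import json


def schema_pin(schema: dict) -> str:
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()


def tool_unchanged(schema: dict, pinned: str) -> bool:
    return schema_pin(schema) == pinned
```

The distinction from heuristic scanning: a scanner must recognize the tampering, whereas a pin mismatch catches any change at all, known pattern or not.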
The Defense-in-Depth Architecture
The strongest deployment architecture uses both layers:
Layer 1 (In-Process)
Catches obvious policy violations at sub-millisecond latency. Provides fast feedback to developers during testing. Enforces capability boundaries within the agent's execution context. It's the seatbelt.
Layer 2 (Infrastructure)
Enforces policy at the network boundary regardless of agent implementation. Provides the cryptographic proof chain. Integrates with enterprise identity. Governs budgets. Pins tool schemas. It's the traffic enforcement system.
The two layers are complementary, not competing. A Layer 1 denial prevents a wasted network call. A Layer 2 denial prevents an ungoverned action. A Layer 1 log entry aids debugging. A Layer 2 receipt satisfies an auditor.
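A compressed sketch of how the two layers compose, with both checks stood in by trivial functions:

```python
# Defense in depth: the in-process check runs first (fast path, avoids
# a wasted network round-trip); the infrastructure check is
# authoritative regardless of what the agent's own middleware decided.
def layer1_allows(tool: str, local_allowlist: set) -> bool:
    return tool in local_allowlist        # in-process, sub-millisecond

def layer2_allows(tool: str) -> bool:
    return tool != "transfer_funds"       # stand-in for a remote PDP decision

def governed_invoke(tool: str, local_allowlist: set) -> str:
    if not layer1_allows(tool, local_allowlist):
        return "denied (layer 1, no network call made)"
    if not layer2_allows(tool):
        return "denied (layer 2, enforced at the boundary)"
    return "allowed"
```

Note that a misconfigured or missing allowlist widens Layer 1 but never bypasses Layer 2 — the boundary holds even when the seatbelt is unbuckled.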
This is exactly the pattern we've built for. Our open-source adapters on PyPI (authzen-policy-backend and aria-agentkit) implement Microsoft's AGT extension point interfaces natively. A developer using the toolkit can add infrastructure-level enforcement with a pip install and three lines of configuration. No code changes. No framework migration. The seatbelt and the traffic system, working in concert.
How to Evaluate Vendors Against This Framework
When you're evaluating runtime execution control for your AI agent deployments, ask these questions:
Where does enforcement happen? In the agent's process, at the network boundary, or both?
What identity standard does it use? Standard OAuth/OIDC, or a parallel identity system?
What does it produce as evidence? Structured logs, or cryptographically signed receipts with hash chains?
Does it enforce budgets? Pre-execution checks with real-time enforcement, or post-hoc cost reporting?
What's the wire protocol? Proprietary, or a public standard like AuthZEN?
Can it govern agents you didn't build? Can third-party agents be governed without adopting your SDK?
The Bigger Picture
The market for runtime execution control is in its earliest innings. The fact that Microsoft shipped an open-source toolkit with serious engineering depth is a net positive — it raises the floor for the entire industry. The fact that multiple vendors are competing in this space means enterprises have choices.
But the architectural distinction between in-process governance and infrastructure-level enforcement is not a marketing nuance. It's a fundamental design decision that determines what you can prove, who you can govern, and how your deployment scales. Enterprises putting AI agents into production — where regulatory compliance, financial controls, and supply chain integrity matter — need both layers.
The seatbelt and the traffic system. The antivirus and the code signature. The diary and the notarized document. Both layers. Always.
EmpowerNow ARIA provides infrastructure-level runtime execution control for AI agents — authorization at the network boundary, cryptographic proof of every governed action, and budget enforcement at the point of execution. Our open-source adapters integrate natively with Microsoft's Agent Governance Toolkit. Learn more about ARIA →