The Real Reason AI Agents Fail in Production Has Nothing to Do With the Model


Forrester puts the number at 15%. That is the share of enterprise AI projects that successfully scale from pilot to production. The other 85% stall somewhere between a promising demo and anything a real user actually benefits from at scale.

When teams diagnose why, the answer rarely points to the model. The models have been capable for a while. What fails is the infrastructure around them, specifically the identity, authorization, and governance layer that determines whether an agent can operate safely across real enterprise systems.

This post lays out what that infrastructure requires and what it takes to build it correctly.

Identity has to travel with the request

The most common way enterprise teams deploy AI agents today is with service accounts that have broad permissions. The agent authenticates once and operates as itself. The audit trail records that the agent acted, but it does not record who requested it, whether the person was authorized to make the request, or whether the action fell within the scope of what the initiating user should have been able to authorize.

That gap matters the moment a compliance team asks the question any auditor will ask: who authorized this action?

The way to close that gap is to treat identity as a per-request property rather than a session-level one. The identity token of the user who initiates an action needs to travel with the request through every hop in the chain: to the agent, to the tools the agent invokes, and to the data systems those tools access. At every layer, the system enforcing the access policy sees the original human identity.

This means that a junior analyst’s entitlements remain junior analyst entitlements when an AI agent performs the work on their behalf. A contractor’s access scope does not expand because an agent is orchestrating the workflow. The audit trail that compliance requires reflects what actually happened.
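A minimal sketch of per-request identity propagation may help make this concrete. Everything in it is an assumption made for illustration: the `RequestContext` type, the `read:<table>` entitlement strings, and the `agent`, `reporting_tool`, and `read_table` functions are hypothetical names, not part of any real product or SDK. The point it shows is that each hop adds itself to the chain while the initiating user's identity and entitlements stay unchanged, so the final data access is checked against the human, not the agent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestContext:
    """Hypothetical per-request identity envelope passed through every hop."""
    user_id: str              # the human who initiated the request
    entitlements: frozenset   # what that human is allowed to do
    chain: tuple = ()         # actors the request has passed through so far

    def hop(self, actor: str) -> "RequestContext":
        # A hop records itself but never alters the user or their entitlements.
        return RequestContext(self.user_id, self.entitlements, self.chain + (actor,))


def read_table(ctx: RequestContext, table: str) -> str:
    # The data layer checks the original user's entitlements, not the agent's.
    if f"read:{table}" not in ctx.entitlements:
        raise PermissionError(f"{ctx.user_id} may not read {table} (via {ctx.chain})")
    return f"rows from {table}"


def reporting_tool(ctx: RequestContext, table: str) -> str:
    return read_table(ctx.hop("reporting_tool"), table)


def agent(ctx: RequestContext, table: str) -> str:
    return reporting_tool(ctx.hop("agent"), table)


analyst = RequestContext("junior.analyst", frozenset({"read:sales"}))
print(agent(analyst, "sales"))   # allowed: the analyst holds read:sales

try:
    agent(analyst, "payroll")    # denied: the agent adds no extra access
except PermissionError as err:
    print("denied:", err)
```

In a real deployment the context would more likely be a signed token validated at each hop rather than an in-process object; the sketch only illustrates the shape of the idea.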

Policy needs to evaluate context at runtime

Permission tables are necessary infrastructure for AI governance. They are also a starting point, not a complete solution.

A permission table tells you whether a user has access to a resource. It answers the question at the level of the role or the session. What it cannot answer are the contextual questions that agentic systems generate constantly: 

  • Is this request consistent with the user’s normal behavior? 
  • Is the volume of data being retrieved proportionate to the stated task? 
  • Is this action being taken within the expected operational context?

These are runtime questions. They require a policy that can evaluate context as a request arrives.

An effective governance layer operates at two levels simultaneously:

  1. The first is entitlement: does this user have permission to invoke this tool against this system? 
  2. The second is context: does the current request, given everything known about the user, the agent, the timing, and the data involved, satisfy the runtime policy in force? 

Both conditions must be met before any tool executes. When that runtime layer is in place, AI agents can handle genuinely complex, multi-step workflows in production, and security teams have the visibility they need to stand behind those deployments.
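The two-level check described above can be sketched roughly as follows. This is an illustration under stated assumptions, not a reference implementation: the `Request` fields, the `ENTITLEMENTS` table, and the `MAX_ROWS` and `BUSINESS_HOURS` limits are all hypothetical stand-ins for whatever entitlement store and runtime policy an organization actually uses.

```python
from dataclasses import dataclass


@dataclass
class Request:
    user: str
    tool: str
    system: str
    rows_requested: int   # volume of data the call would retrieve
    hour: int             # hour of day (0-23) when the request arrived


# Hypothetical entitlement table: (user, tool, system) triples that are permitted.
ENTITLEMENTS = {("analyst", "export", "crm")}

# Hypothetical runtime limits standing in for a real contextual policy.
MAX_ROWS = 10_000
BUSINESS_HOURS = range(8, 19)   # 08:00 through 18:59


def is_entitled(req: Request) -> bool:
    # Level 1: may this user invoke this tool against this system at all?
    return (req.user, req.tool, req.system) in ENTITLEMENTS


def context_ok(req: Request) -> bool:
    # Level 2: is this particular request proportionate and in the expected context?
    return req.rows_requested <= MAX_ROWS and req.hour in BUSINESS_HOURS


def authorize(req: Request) -> bool:
    # Both levels must pass before any tool executes.
    return is_entitled(req) and context_ok(req)
```

Under these assumptions, `authorize(Request("analyst", "export", "crm", 500, 10))` passes, while the same entitled user requesting 500,000 rows, or making the request at 23:00, is refused at the context level.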

What MCP makes possible

The Model Context Protocol (MCP) is becoming the standard interface between LLMs and the tools they can invoke. Anthropic introduced it, and major enterprise software vendors are adopting it because it solves a real problem. 

Before MCP, every tool integration required custom plumbing between the model and the system. MCP standardizes that interface. It gives agents a consistent way to discover and call tools, and it gives tools a consistent way to receive and respond to requests.

Authorization is what organizations add on top. The architecture that makes this work sits between the LLM and the tools it can invoke. When an agent selects a tool and sends a request, the authorization layer evaluates the full context before execution: user identity, entitlement, runtime policy, and the provenance chain showing which agent is acting and on whose behalf. 

Only when all of that passes does the tool execute, and the outcome is logged with full context at every layer.

The audit trail this produces is the kind compliance teams can actually use: “This user, with these entitlements, authorized this agent to perform this action at this time, and the policy in force was evaluated as follows.”
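As a rough sketch of where such a layer sits, the wrapper below checks entitlement and runtime policy before a tool runs and appends an audit record either way. It does not use the MCP SDK or any real product API; `guarded_call`, `AUDIT_LOG`, `lookup_order`, and `small_requests_only` are hypothetical names chosen for illustration, and the in-memory list stands in for a real audit sink.

```python
from datetime import datetime, timezone

AUDIT_LOG = []   # stand-in for a durable audit store


def guarded_call(tool, args, *, user, entitlements, agent, policy):
    """Check entitlement and runtime policy, log the decision, then run the tool."""
    entitled = tool.__name__ in entitlements
    policy_ok = bool(entitled and policy(user, tool.__name__, args))
    record = {
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,              # the human on whose behalf the agent acts
        "agent": agent,            # which agent is acting
        "tool": tool.__name__,
        "args": args,
        "entitled": entitled,
        "policy_ok": policy_ok,
        "outcome": "denied",
    }
    AUDIT_LOG.append(record)
    if not policy_ok:
        raise PermissionError(f"{user} via {agent}: {tool.__name__} denied")
    result = tool(**args)
    record["outcome"] = "executed"   # same dict object, so the log entry updates
    return result


def lookup_order(order_id: str) -> dict:
    # Hypothetical tool; a real one would call an order system.
    return {"order_id": order_id, "status": "shipped"}


def small_requests_only(user, tool_name, args):
    # Hypothetical runtime policy: allow only calls with at most two arguments.
    return len(args) <= 2


result = guarded_call(
    lookup_order,
    {"order_id": "A-1"},
    user="analyst",
    entitlements={"lookup_order"},
    agent="support-agent",
    policy=small_requests_only,
)
```

Each entry in `AUDIT_LOG` then carries the user, the agent, the tool, the entitlement result, the policy result, and the outcome, which is roughly the information the audit statement above calls for.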

Governance is the foundation for production AI

Organizations that treat the governance layer as foundational infrastructure rather than a compliance obligation are the ones that close the gap between a promising demo and something that delivers real value at real scale.

Agents operating without proper identity and authorization infrastructure are agents that organizations cannot trust with consequential work, and agents that cannot be trusted with consequential work stay in the demo environment indefinitely. That is where 85% of enterprise AI projects end up.

Getting to the other 15% requires building the identity, authorization, and runtime governance layer first. Then the agents can do the work they were designed to do.

SnapLogic is the Agentic Integration Company.