Skip to Content

AI & MCP

Where Claude falls short in AI security

Cameron McClellan

Cameron McClellan

May 15, 2026 - 9 min read

Where Claude falls short in AI security

Claude’s enterprise plan is a genuine step forward compared with unmanaged AI tool usage. It ties developer identity to a corporate SSO provider, gives compliance teams programmatic access to conversation logs, and lets administrators publish an approved list of MCP servers. For organizations previously running Claude Code against personal Anthropic accounts with no oversight, that’s a meaningful starting point, but it doesn’t constitute a security posture.

Claude Enterprise gives security teams visibility into how developers use AI, but doesn’t govern what AI agents do after the model responds: tool calls, MCP server invocations, data reads, and system actions that happen at machine speed, often without human review before they execute. That gap matters more now that AI is running agentic workflows in production.

This article looks at where Claude’s native controls end, and what that means for security teams weighing whether they can afford to wait.

What security controls does Claude Enterprise include?

Claude Enterprise includes SSO and identity via SAML 2.0 and OIDC, with SCIM for automated provisioning and deprovisioning. Developers log in with corporate credentials and SCIM means a departing employee’s Claude access is revoked with their Anthropic workspace seat.

The Compliance API gives compliance teams real-time programmatic access to conversation content (prompts and completions), with deletion of chats, files, and projects. The MCP server allowlist lets administrators publish approved servers and deploy them to developer machines. Spend controls and zero data retention  (available under the Enterprise agreement on request) round out the package.

Why Claude Enterprise doesn’t cover agentic tool calls

Anthropic’s own documentation  makes the boundary explicit: as of May 2026, Cowork and agent activity is excluded from all three compliance mechanisms (Audit Logs, Compliance API, and Data Exports) across every plan tier, including Enterprise. The audit trail ends the moment an AI agent makes a tool call.

Consider what a typical agentic Claude Code workflow looks like in practice. A developer asks the agent to investigate a failing test. The agent reads the test file, queries the database for related records, checks the Git log for recent changes, runs the test suite, and posts a summary to Slack. Five tool calls, none of which appear in the Compliance API. If a customer record was leaked through that Slack message, the audit log would only show the conversation (the prompt in, the model’s text response out). The actual tool call that sent data to Slack happened after the model responded and wouldn’t appear anywhere in the log.

The diagram below shows what Claude Enterprise captures and what falls outside its scope.

Two-column diagram showing what Claude Enterprise captures (Compliance API, Audit Logs, Data Exports, MCP Allowlist) versus what is not captured (tool call arguments and results, data reads, agent actions, Cowork activity). The not-captured column is visually muted, illustrating the audit trail gap at the tool-call layer.

The MCP server allowlist has a similar structural limit. It’s a configuration file deployed to employee laptops that restricts which servers users are permitted to add, and it cannot enforce that restriction at the protocol layer. Specifically, it has no way to:

  • block an employee who adds a server to their local ~/.claude.json
  • revoke access in real time when an employee changes teams
  • detect when a registered server has been modified to include malicious tool descriptions

MCP security risks Claude Enterprise doesn’t address

Claude’s controls were designed for the conversation layer. In the gap between a model response and the tool calls that follow it, several attack surfaces have no coverage:

  • Prompt injection — an attacker embeds hidden instructions in content the agent reads (documents, issues, tickets); the agent follows them as legitimate tasks, and neither the injected instructions nor the resulting tool calls appear in audit logs (demonstrated against the GitHub MCP server, May 2025 )
  • Tool poisoning — MCP server descriptions can include hidden instructions that redirect agent behavior on every invocation, whether through shadow AI installs and unsanctioned tools outside IT approval or post-approval rug pulls on registered servers; the MCP gateway inspects tool descriptions at the protocol layer on every call, a config-file allowlist does not
  • Supply chain compromise — MCP packages can be malicious before installation; in early 2026 LiteLLM (3.4M daily downloads) was compromised on PyPI, harvesting cloud credentials and SSH keys from every affected system
  • Environment and config injection — repository config and environment variables can redirect agent traffic before trust mechanisms activate (CVE-2025-59536, CVE-2026-21852; both patched, the structural pattern has not)
  • Multi-agent propagation — in chained agent pipelines, a compromised agent can pass injected instructions to downstream agents with broader permissions

The following diagram maps each attack vector to the tool-call layer and the effects that reach the agent.

Architecture diagram showing five attack vectors (prompt injection, tool poisoning, supply chain compromise, config injection, multi-agent propagation) entering through the unmonitored tool-call layer to reach the Claude Code agent. The agent box shows three effects: credential theft, data exfiltration, and unauthorized actions.

We cover these threats in more detail in What is AI security?.

The compliance deadline for AI agents

Enterprise security teams asking “can we wait on this?” are operating in a regulatory environment that is moving faster than most AI governance programs.

EU AI Act

The EU AI Act  Annex III enforcement deadline in August 2026 requires organizations using AI in high-risk categories (employment decisions, access to essential public and private services, law enforcement, and others) to have documented oversight, audit trails covering AI decision-making, and the ability to intervene in AI system operation. Article 25 makes a deployer responsible for provider-level obligations when they modify an AI system’s intended purpose, a threshold worth assessing with legal counsel for any deployment that connects AI to internal data through MCP. Under this framing, tool-call audit logs are the documentation that satisfies the oversight requirement.

SOC 2 and ISO 27001

SOC 2  and ISO 27001  auditors are not yet requiring tool-call logs by name, but they are asking about AI systems in scope and what controls govern their access to sensitive data. An organization that can show conversation-level logs but not what data an AI agent read or wrote through tool calls is leaving a gap that a reasonable auditor will treat as a finding.

Incident rate

A 2026 Gravitee survey  of over 900 executives and practitioners found that 88% of organizations reported confirmed or suspected AI agent security incidents in the prior 12 months. According to the Harvard Law School Forum on Corporate Governance , 72% of S&P 500 companies disclosed at least one material AI risk in 2025. The Cloud Security Alliance  found that only 26% of organizations have comprehensive AI security governance policies in place. Organizations treating AI tool governance as a future-quarter project are already reporting incidents.

How an AI control plane closes the security gap

The AI control plane closes the gap between Claude’s native controls and enterprise security requirements. The MCP gateway (also called an AI gateway) is the enforcement point at the tool-call layer.

Control
SSO and user identity
Claude Enterprise
AI control plane
Conversation audit logs
Claude Enterprise
AI control plane
MCP tool-call logs (arguments + results)
Claude Enterprise
AI control plane
PII redaction at the tool layer
Claude Enterprise
AI control plane
Prompt injection blocking
Claude Enterprise
AI control plane
Per-tool access control (not just per-server)
Claude Enterprise
AI control plane
Shadow MCP detection and blocking
Claude Enterprise
Allowlist only
AI control plane
Protocol enforcement
Dynamic access revocation (identity-based)
Claude Enterprise
AI control plane
Tool description inspection against poisoning
Claude Enterprise
AI control plane
Cross-session incident reconstruction
Claude Enterprise
AI control plane

The diagram below shows the four layers a control plane applies to every tool call between an agent and the systems it can reach.

Architecture diagram showing an AI control plane between callers (Claude Code, Cursor, Copilot, autonomous agents) and destinations (MCP servers, enterprise databases). The control plane contains four layers: identity and access, MCP gateway, policy and threat detection, and observability and audit.

Claude’s controls live in configuration files on employee machines, enforced by the client. A control plane runs as infrastructure: a gateway through which every tool call is routed, inspected, and logged, with policy enforced server-side rather than dependent on the state of any individual employee’s laptop. In practice:

  • When a developer connects an unapproved MCP server, it’s blocked at the protocol layer, not just missing from a config file.
  • When a developer changes teams, their tool-call permissions update the moment their group membership changes in the identity provider.
  • When an auditor asks what data an agent accessed last quarter, there’s a structured log of every tool call: tool name, arguments, result, duration, and the identity behind it. That log is the foundation of AI observability: the ability to reconstruct exactly what an agent did and why.

Claude Code is a capable AI coding assistant, but its enterprise controls cover only the conversation layer, and production agentic workflows need governance at the tool-call layer.

If you’re a security team trying to understand what your organization’s exposure looks like before the August 2026 compliance deadline, the Speakeasy AI control plane is where that conversation starts.

Frequently asked questions

No. As of May 2026, Anthropic’s own documentation explicitly excludes Cowork and agent activity from all three compliance mechanisms (Audit Logs, Compliance API, and Data Exports) across every plan tier, including Enterprise, and the audit trail ends the moment an AI agent makes a tool call. Cowork and agent activity, including tool calls, is not captured in the Compliance API .

An MCP server allowlist is a configuration file deployed to employee machines that restricts which servers users are permitted to configure. It cannot block an employee who adds a server to their local ~/.claude.json alongside the approved list, and it cannot detect when an approved server has been modified after deployment. Protocol-level enforcement routes every tool call through a gateway that applies policy server-side, blocking at the network layer regardless of what’s configured on the employee’s laptop.

You can, but the native controls cover only the conversation layer. If your workflows involve tool calls to internal databases, APIs, or external services (which most production agentic workflows do), the gaps are:

  • no audit trail for tool-call actions
  • no way to enforce which tools a given user can call
  • no detection for prompt injection or tool poisoning attacks targeting the tool layer

Whether that exposure is acceptable depends on the sensitivity of the data the agent can reach and your organization’s regulatory obligations.

The Annex III enforcement deadline in August 2026 requires organizations using AI in high-risk categories to maintain documented oversight, audit trails covering AI decision-making, and the ability to intervene in AI system operation. Article 25 extends provider-level obligations to deployers who modify an AI system’s intended purpose, which includes connecting it to internal data sources through MCP. Conversation logs alone are unlikely to satisfy the oversight requirement; auditors will want documentation of what the agent actually did.

No. Prompt injection attacks embed hidden instructions in content the agent reads, such as documents, issues, tickets, or API responses, and the agent follows them as legitimate tasks. Blocking prompt injection requires inspection at the point where the agent reads external content, which is a function of the MCP gateway.

An AI control plane is the governing infrastructure layer between AI agents and the systems they’re allowed to reach. Rather than relying on configuration files on employee machines, a control plane routes every tool call through a central gateway that can inspect, log, redact, and block based on policy. It enforces access control at the protocol layer, integrates with identity providers for real-time permission updates, and produces a structured audit log of every tool call: arguments, results, duration, and the identity behind it. It operates independently of which AI client or model the employee is using.

Last updated on

AI everywhere.