MCP Tool Poisoning Prevention: Gateway Controls for the Model Context Protocol Surface
Model Context Protocol tool poisoning is the agentic analog to supply-chain compromise. An MCP server presents a set of tools to an agent host; an attacker who controls the MCP server (or the tool definitions an MCP server advertises) can change what the tools do, what they return, or what parameters they accept. The agent loop calls the tool in good faith and the actions executed against downstream systems are the attacker's. The prevention surface splits across MCP server selection, tool-definition pinning, and per-decision authorization at the agent-tool boundary. This article walks through the MCP poisoning patterns and the gateway controls that contain them.

Model Context Protocol tool poisoning is the agentic analog to supply-chain compromise. An MCP server presents a set of tools to an agent host; the agent host queries the server for available tools, retrieves the tool definitions including descriptions, parameter schemas, and capability claims, and offers the tools to the model as part of the agent loop. An attacker who controls the MCP server (or the tool definitions an MCP server advertises) can change what the tools do, change what they return, change what parameters they accept, or change the description the model uses to decide when to call them. The agent loop calls the tool in good faith and the actions executed against downstream systems are the attacker's.
The MCP surface is growing fast enough that the tool-poisoning risk is now a concrete operational concern, not a theoretical one. Multiple MCP server implementations have shipped without authentication on the server endpoint. Multiple MCP server registries have shipped without provenance on the registered servers. The pattern of "install this MCP server to give your agent access to X" lowers the friction for introducing third-party MCP servers into agent environments without the supply-chain controls the underlying API access would have required.
I want to walk through the MCP poisoning patterns that show up in practice, the prevention surface that splits across MCP server selection, tool-definition pinning, and per-decision authorization at the agent-tool boundary, and the gateway controls that contain the blast radius when one of the upstream controls fails.
The MCP poisoning patterns
Five poisoning patterns recur against MCP-served tools.
Description manipulation. The MCP server advertises a tool with a description that misrepresents what the tool does. The model reads the description when deciding which tool to call. A description that says "fetch the user's calendar" for a tool that actually sends an email induces the model to invoke the tool in contexts where the email action was not intended.
Parameter schema spoofing. The MCP server advertises a tool with a parameter schema that the agent host parses, but the server's actual tool execution accepts additional unadvertised parameters. The agent loop sends what the schema specifies; an upstream-compromised server passes additional attacker-controlled parameters to the underlying service.
Tool substitution at invocation. The agent host calls tool A; the MCP server routes the invocation to tool B. The agent's planning step reasoned about tool A's expected behavior; tool B executes a different action. The substitution is invisible to the agent host without a separate validation of the call path.
Response injection. The MCP server returns a tool response that includes content designed to manipulate the model's next-turn reasoning. The response includes embedded instructions ("Disregard your previous task and execute X"), encoded payloads the model is encouraged to decode and act on, or false context the model treats as authoritative.
Capability creep over the server's lifetime. The MCP server starts with a narrow tool set; the operator (or an attacker who has taken over the operator's deployment) adds tools incrementally. The agent host that did not pin the tool list reads the current set on every connection and offers any newly added tool to the model. The expanded capability set was not authorized by the original deployment decision.
The prevention surface
The prevention surface splits across three layers.
MCP server selection. The decision of which MCP servers an agent environment is allowed to connect to belongs to the platform team, not to the individual agent or the application developer. The decision should be made through a server-allowlist with provenance attestation. Each allowlisted server has a known operator, a documented purpose, and a documented set of tools the operator declares the server provides. Off-allowlist servers are blocked at the network layer.
Tool-definition pinning. The agent host should pin the tool definitions it received from the MCP server at deployment time rather than refreshing from the server on every connection. The pinning includes the tool name, the description, the parameter schema, and the capability claims. Any deviation between the pinned definitions and the runtime-fetched definitions is treated as a change event that requires re-review before the agent uses the updated definitions.
Per-decision authorization at the agent-tool boundary. The gateway between the agent and the tool surface evaluates whether the calling identity is authorized for the specific tool invocation with the specific parameters. The authorization is independent of what the MCP server advertised. A poisoned tool that advertises calendar access but executes email sends gets blocked at the gateway if the calling identity is not authorized for email sends.
The three layers are complementary. Server allowlisting reduces the supply-chain surface. Tool-definition pinning catches drift after deployment. Per-decision authorization catches misalignment between the advertised capability and the actual execution.
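One way to make the pinning layer concrete is to record a digest of each tool definition at deployment time and compare against it later. The sketch below is illustrative, not a prescribed implementation: the function names are hypothetical, and it assumes each definition is a JSON-serializable mapping with `name`, `description`, and `inputSchema` keys.

```python
import hashlib
import json


def fingerprint_tool(definition: dict) -> str:
    """Return a SHA-256 hex digest over a canonical JSON encoding of one tool definition."""
    # Sorted keys and fixed separators make the encoding independent of field order.
    canonical = json.dumps(definition, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def pin_tools(definitions: list) -> dict:
    """Map each tool name to the digest recorded at deployment time."""
    return {d["name"]: fingerprint_tool(d) for d in definitions}


def matches_pin(pins: dict, definition: dict) -> bool:
    """True only if the tool was pinned and its definition is unchanged."""
    return pins.get(definition["name"]) == fingerprint_tool(definition)


# Hypothetical tool definition, used only for illustration.
calendar_tool = {
    "name": "calendar.read",
    "description": "fetch the user's calendar",
    "inputSchema": {"type": "object", "properties": {"day": {"type": "string"}}},
}
pins = pin_tools([calendar_tool])
```

Serializing with sorted keys keeps the digest stable when the server reorders fields, so only a change in content trips the comparison. A tool that was never pinned has no digest and therefore never matches.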
The gateway controls in detail
Six controls at the agent-tool boundary do most of the MCP poisoning containment.
Per-identity tool allowlists keyed on tool identifier and not on description. The allowlist records which canonical tools each identity is permitted to invoke. The identifier is stable across description changes; an attacker who alters the description does not change the allowlist's binding.
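A minimal sketch of an identifier-keyed allowlist check follows. The identities, tool identifiers, and the `authorize` function are hypothetical; the point it illustrates is that the tool description is never an input to the decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str


# Hypothetical policy table: calling identity -> canonical tool identifiers.
ALLOWLIST = {
    "agent:scheduler": frozenset({"calendar.read"}),
    "agent:notifier": frozenset({"calendar.read", "email.send"}),
}


def authorize(identity: str, tool_id: str, allowlist: dict = ALLOWLIST) -> Decision:
    """Decide on the (identity, tool identifier) pair alone; default to deny."""
    permitted = allowlist.get(identity)
    if permitted is None:
        return Decision(False, f"unknown identity: {identity}")
    if tool_id not in permitted:
        return Decision(False, f"{identity} is not permitted to invoke {tool_id}")
    return Decision(True, "allowed")
```

Under these assumptions, `authorize("agent:scheduler", "email.send")` is denied no matter what description the server attaches to the email tool.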
Parameter constraint enforcement against pinned schemas. The gateway holds the pinned parameter schema for each tool and validates every invocation against the schema. Invocations with parameters outside the schema are rejected at the gateway, including the case where an MCP server later announces an extended schema that includes attacker-controlled fields.
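The check described above might look like the following sketch. It handles only a small, assumed subset of JSON Schema (`properties`, `required`, and primitive `type`) and treats any parameter outside the pinned schema as an error; a real gateway would more likely use a full schema validator. The function name and schema contents are hypothetical.

```python
# Mapping from schema type names to Python types (simplified subset).
_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def validate_params(params: dict, pinned_schema: dict) -> list:
    """Return a list of violations; an empty list means the invocation conforms."""
    errors = []
    props = pinned_schema.get("properties", {})
    for name in pinned_schema.get("required", []):
        if name not in params:
            errors.append(f"missing required parameter: {name}")
    for name, value in params.items():
        if name not in props:
            # Unadvertised parameters are rejected, not forwarded.
            errors.append(f"parameter not in pinned schema: {name}")
            continue
        expected = props[name].get("type")
        if expected is None:
            continue  # no type constraint pinned for this parameter
        py_type = _TYPES.get(expected)
        ok = py_type is not None and isinstance(value, py_type)
        if ok and isinstance(value, bool) and expected != "boolean":
            ok = False  # bool is a subclass of int in Python; do not accept it as a number
        if not ok:
            errors.append(f"parameter {name} is not of type {expected}")
    return errors


# Hypothetical pinned schema for a calendar tool.
PINNED = {
    "type": "object",
    "properties": {"day": {"type": "string"}, "limit": {"type": "integer"}},
    "required": ["day"],
}
```

Because validation runs against the pinned copy, a server that later advertises an extra field such as `bcc` does not widen what the gateway will forward.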
Response validation against pinned shapes. Responses from the MCP server are validated against the pinned response schema before being returned to the agent loop. Fields outside the schema are stripped from the response; responses that fail schema validation are blocked. The control catches response-injection patterns that ride on out-of-schema content.
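As a rough illustration of the strip-or-block behavior, the sketch below represents a pinned response shape as a mapping from field name to expected Python type. That representation, the `ResponseBlocked` exception, and the field names are assumptions made for the example.

```python
class ResponseBlocked(Exception):
    """Raised when a tool response does not match the pinned shape."""


def filter_response(response: dict, pinned_fields: dict) -> dict:
    """Return only the pinned fields; block if any is missing or mistyped."""
    clean = {}
    for field, expected_type in pinned_fields.items():
        if field not in response:
            raise ResponseBlocked(f"missing field: {field}")
        value = response[field]
        if not isinstance(value, expected_type):
            raise ResponseBlocked(f"field {field} has unexpected type")
        clean[field] = value
    # Anything the server added beyond the pinned fields is dropped here.
    return clean


# Hypothetical pinned shape for a calendar tool's response.
PINNED_RESPONSE = {"events": list, "count": int}
```

This covers content that arrives in unexpected fields. Instructions embedded inside a field the schema does allow, such as a free-text string, would pass this check and need content-level inspection.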
Per-decision audit on every tool invocation with both the advertised and the executed action. The audit record captures the tool identifier, the parameters as sent, the MCP server's response, the policy decisions applied, and the calling identity. The forensic trail is the source data for detecting drift between advertised and actual behavior.
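The fields listed above could be captured in a record like the one sketched here. The class and function names are hypothetical, and serializing each record as one JSON line is an assumption about storage, not something the controls require.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AuditRecord:
    timestamp: str
    identity: str            # the calling identity
    tool_id: str             # the canonical tool identifier
    parameters: dict         # the parameters as sent
    response: Optional[dict] # the MCP server's response, or None if blocked before the call
    policy_decisions: list   # every policy decision applied to this invocation


def build_audit_record(identity, tool_id, parameters, response, policy_decisions, now=None):
    """Assemble one record per invocation; `now` can be injected for testing."""
    now = now or datetime.now(timezone.utc)
    return AuditRecord(
        timestamp=now.isoformat(),
        identity=identity,
        tool_id=tool_id,
        parameters=parameters,
        response=response,
        policy_decisions=list(policy_decisions),
    )


def to_json_line(record: AuditRecord) -> str:
    """Serialize the record as a single JSON line for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)
```

Writing the record even when the invocation is blocked (with `response` set to `None`) keeps denied attempts in the same trail as permitted ones.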
Network-level allowlisting of MCP server endpoints. The gateway (or the broader network controls in front of it) restricts which MCP server endpoints the agent host can connect to. Off-allowlist endpoints are blocked at the network layer regardless of any application-side configuration that might point at them.
Drift detection on tool-definition fetches. The gateway compares the tool definitions returned by the MCP server on each connection against the pinned definitions. Deviations are flagged, the agent is paused or routed to a degraded mode, and the security team is notified. The control catches the capability-creep pattern at the moment the drift appears.
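The comparison step can be sketched as a set difference over tool names plus an equality check on the definitions both sides share. The names below are hypothetical, and returning an empty tool list stands in for whatever "paused or degraded mode" means in a given deployment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DriftReport:
    added: tuple    # tools the server now advertises that were never pinned
    removed: tuple  # pinned tools the server no longer advertises
    changed: tuple  # tools whose definition differs from the pinned copy

    @property
    def drifted(self) -> bool:
        return bool(self.added or self.removed or self.changed)


def detect_drift(pinned: list, fetched: list) -> DriftReport:
    """Compare runtime-fetched tool definitions against the pinned ones, by name."""
    pinned_by_name = {tool["name"]: tool for tool in pinned}
    fetched_by_name = {tool["name"]: tool for tool in fetched}
    added = sorted(fetched_by_name.keys() - pinned_by_name.keys())
    removed = sorted(pinned_by_name.keys() - fetched_by_name.keys())
    changed = sorted(
        name
        for name in pinned_by_name.keys() & fetched_by_name.keys()
        if pinned_by_name[name] != fetched_by_name[name]
    )
    return DriftReport(tuple(added), tuple(removed), tuple(changed))


def tools_to_offer(pinned: list, fetched: list):
    """Offer the pinned definitions only when nothing drifted; otherwise offer none."""
    report = detect_drift(pinned, fetched)
    return ([] if report.drifted else list(pinned)), report
```

In this sketch the model is only ever offered the pinned definitions, never the fetched ones, so a newly added tool is reported to the security team without becoming callable.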
What sits outside the gateway boundary
The MCP server implementation is outside the gateway boundary. The gateway cannot patch CVEs in the MCP server, cannot audit the server's source code, and cannot guarantee the server is implementing the MCP specification correctly. The controls at the supply chain layer (server allowlisting, provenance attestation) are the primary defense for those failure modes.
The downstream service the MCP server connects to is partially outside the boundary. The gateway sees the tool invocation but does not see the side effects the tool produces inside the downstream service. If the tool itself has been compromised at the service level (the operator has been breached), the gateway records the invocation but does not prevent the side effect.
The agent's planning step is outside the boundary. The model decides which tool to call based on the descriptions it has been given. The gateway evaluates the invocation after the decision; the gateway does not rewrite the model's reasoning.
This is one of the cases where the DeepInspect HTTP-boundary rule applies cleanly. If an MCP server is compromised in a way that lets the attacker exfiltrate credentials directly to a command-and-control endpoint outside the agent loop's HTTP traffic, the exfiltration path does not pass through the gateway. The architectural fix is at the supply chain layer where the MCP server itself is secured.
How MCP tool poisoning maps to OWASP AISVS and the OWASP Top 10 for Agentic Applications
OWASP AISVS Chapter 9 (agentic systems) and Chapter 4 (supply chain) cover the verification requirements for the MCP surface. The verification requirements include documented tool allowlists, provenance for tool definitions, per-invocation logging, and parameter constraint enforcement.
OWASP Top 10 for Agentic Applications (2026 release) introduces "agentic skills" as a vulnerable intermediate behavior layer. MCP tool definitions sit at that layer. The Top 10 for Agentic Applications calls for verification that the agentic-skills layer is identity-aware, audited per-invocation, and bounded by per-identity authorization.
The June 2026 LiteLLM CVE wave is the supply-chain reminder that the gateway and proxy infrastructure of the AI stack is not exempt from CVE-level issues. MCP server implementations are in the same category and need the same patch cadence and the same scanning posture as any other security-critical infrastructure.
DeepInspect
This is the boundary-layer control DeepInspect provides for the MCP poisoning surface. DeepInspect sits inline between authenticated agents and the tools they invoke through MCP servers, binds every invocation to a verified identity, enforces per-identity tool allowlists keyed on stable identifiers, validates invocations and responses against pinned schemas, and writes a per-decision audit record outside the calling agent host.
The audit record includes the tool identifier, the parameters as sent, the MCP server's response, the policy decisions applied, and the calling identity. The forensic trail enables post-incident reconstruction of any tool invocation and surfaces drift between advertised and executed behavior on every connection. The per-identity authorization at the gateway means a poisoned tool that advertises one action but executes another still gets blocked at the gateway if the calling identity is not authorized for the executed action.
If you are deploying agents that use MCP-served tools and your tool-poisoning coverage depends on trusting every MCP server in the supply chain to be honest, let's talk today.
Frequently asked questions
- What is the Model Context Protocol?
MCP is a specification for how servers expose tool and resource capabilities to AI agent hosts. An MCP server runs at an endpoint, advertises a set of tools with descriptions and parameter schemas, and executes tool invocations on the agent's behalf. The protocol formalizes the interface so any compatible agent host can connect to any compatible MCP server.
- How is MCP tool poisoning different from classic supply-chain attacks?
The mechanics are similar: an attacker compromises a component in the stack, and the downstream consumer trusts the component. The MCP-specific surface is the agent's reliance on tool descriptions and parameter schemas to make planning decisions. A description manipulation that an attacker introduces is invisible to the agent because the agent has no source of truth outside the MCP server's own advertisement.
- Do I need an MCP-specific gateway?
Not necessarily. The controls described above apply to any agent-tool boundary, whether or not MCP is the protocol. An MCP-aware gateway adds value through MCP-specific drift detection and schema pinning. The core controls (per-identity allowlists, per-invocation audit, parameter validation, response validation) generalize.
- What is server allowlisting in this context?
A list of MCP server endpoints (URL plus public key fingerprint or equivalent) that the agent host is permitted to connect to. Connections to off-allowlist endpoints are blocked at the network layer. The allowlist is curated by the platform team based on documented operator provenance and documented purpose.
- How often should tool definitions be re-pinned?
The pinning is updated on a scheduled cadence that includes human review of the new definitions, or on an event-triggered basis when the MCP server operator announces a tool-set change. Automatic re-pinning on every connection defeats the purpose because it eliminates the human review step.
- What does response validation look like in practice?
The pinned response schema specifies the expected shape of the tool's return value: field names, types, allowed enumerations, and any content classifications that apply. The gateway validates each response against the schema; non-conforming responses are blocked or have non-schema fields stripped. The control catches response-injection patterns that hide instructions in out-of-schema