How is Kong AI Gateway different from DeepInspect?

Kong AI Gateway is a family of plugins on the Kong data plane that handle multi-provider LLM routing, semantic caching, token rate limiting, and prompt template injection. The AI Prompt Guard plugin filters prompts against regex allow and deny lists. DeepInspect is an identity-bound policy enforcement layer that classifies prompt data, evaluates per-route policy bundles, and commits per-decision audit records formatted for EU AI Act Article 12 review and sector audit requirements. Kong's logs satisfy the existence requirement of an audit. DeepInspect's audit records satisfy the traceability requirement that Article 12 and the Fannie Mae LL-2026-04 review apply.

Can Kong AI Gateway replace DeepInspect for a regulated workload?

For unregulated workloads, Kong's AI plugins cover the operational layer that a platform team needs. For regulated workloads, Kong's operational logs fall short of the per-decision audit record format the regulator expects. The Kong log lacks natural-person identity attribution at the model API call (the AI Prompt Guard plugin operates on the prompt text, not the identity context), the data classification outcome beyond a regex match, the policy version that evaluated the decision, and the cryptographic integrity signature that decouples the audit record from the application that took the action.

Can DeepInspect replace Kong AI Gateway?

For deployments that already have multi-provider operational routing handled elsewhere, DeepInspect forwards cleared requests to a configurable upstream and can address a specific provider directly. For deployments that want the operational features Kong AI Gateway provides (semantic caching, token rate limiting, prompt template injection, per-consumer attribution across many providers), DeepInspect does not replace Kong. The two layers compose, which is the common production pattern.

How does the deployment topology work when both are in production?

The application calls DeepInspect. DeepInspect evaluates the policy and commits the audit record. DeepInspect forwards the cleared request to Kong AI Gateway. Kong's AI Proxy plugin selects an upstream provider based on its routing rules and forwards the request. The response flows back through Kong to DeepInspect to the application. DeepInspect sees the prompt content and the identity context. Kong sees the request after DeepInspect has cleared it. The audit record carries both DeepInspect's policy outcome and Kong's routing outcome so the operator can reconstruct the full request path.

Where does the AI Prompt Guard plugin fit if DeepInspect is in front of Kong?

The AI Prompt Guard plugin still runs at the Kong layer. The plugin operates on whatever prompt content reaches Kong, which after the DeepInspect layer has already classified and policy-evaluated the request is the cleared traffic. The regex allow and deny lists in the Prompt Guard plugin can be a second filter for content categories that Kong's operator owns separately from the security team's policy bundle. The composition does not remove the plugin; it adds the regulatory audit layer above it.

DeepInspect vs Kong AI Gateway: Where Each One Fits and Where the Two Layers Compose

Kong AI Gateway is the AI-focused extension of the Kong API Gateway, the open-source data plane that handles HTTP traffic at scale. The AI Gateway plugins ship on top of Kong's existing proxy and add multi-provider LLM routing, semantic caching, prompt templates, prompt guards, AI-aware rate limiting, and per-consumer token attribution. The product targets the operational concerns of running LLM traffic through a controlled gateway. DeepInspect sits at the same HTTP position but answers a different question. It enforces identity-bound policy on prompt content, classifies prompt data against the regulated data types the organization recognizes, and commits a per-decision audit record that a reviewer for EU AI Act Article 12 or a Fannie Mae LL-2026-04 review accepts. The two products are compatible and compose in production deployments.

I want to walk through what Kong AI Gateway does, what DeepInspect does, where the responsibilities overlap, and how the two layers compose for regulated workloads.

TL;DR

Kong AI Gateway extends Kong's proxy with AI traffic plugins: routing across LLM providers, semantic caching, prompt templates, basic prompt guards, and token attribution. DeepInspect enforces identity-bound policy on prompt content and produces per-decision audit records formatted for regulatory review. Regulated deployments run DeepInspect in front of Kong AI Gateway, which preserves Kong's operational benefits and adds the regulatory audit layer that Kong was not designed to provide.

Kong AI Gateway: what it is and where it sits

Kong AI Gateway ships as a set of plugins on the Kong data plane. The core plugin is the AI Proxy, which normalizes the API surface across OpenAI, Anthropic, Bedrock, Azure OpenAI, Vertex, Cohere, Mistral, and other supported providers. The plugin translates inbound OpenAI-compatible requests to the upstream provider's native format and rewrites responses to the OpenAI shape. The application code uses one API surface and the gateway handles the routing.

The Kong AI plugin family covers operational concerns. Semantic caching reuses responses across similar prompts when the cache hit threshold is met. Prompt template plugins let an administrator inject system prompts at the gateway layer. The AI Prompt Guard plugin filters prompts against allow and deny lists at the regex level. The AI Rate Limiting Advanced plugin attributes token consumption per consumer and applies rate limits in tokens rather than requests. The AI Request Transformer and AI Response Transformer let an administrator use a separate LLM to rewrite inbound or outbound traffic.

The architectural sweet spot for Kong AI Gateway is the operational layer of a multi-provider LLM deployment. A team running Kong already for non-AI HTTP traffic gets the AI plugins on the same operator surface, which is attractive for platform teams that prefer a single data plane.

What DeepInspect is and where it sits

DeepInspect sits at the AI request boundary as a separate enforcement layer. It evaluates identity-bound policy on every request, classifies prompt data against the regulated data types the organization recognizes, and commits a per-decision audit record with cryptographic integrity. The decisions are deterministic, fail-closed, and independent of the model's behavior.

The feature set covers identity attribution at the model API call (natural-person identity attached from the application's identity primitive, not the API key alone), per-route policy enforcement (different rules for the support route, the developer route, the legal route), prompt-level data classification (PII, PHI, MNPI, source code, source-licensed content, regulated jurisdictional data), policy decisions that pass, block, or modify the request, and the per-decision audit record format that downstream audit pipelines consume.

The architectural sweet spot for DeepInspect is the regulated workload. An organization that is the data controller for prompts crossing into a model provider needs evidence that satisfies the deployer obligations under Article 26, the audit obligations under Article 12, the lender record obligations under Fannie Mae LL-2026-04, and the sector-specific regimes (HIPAA, DORA, FedRAMP, ISO 42001) that the workload is subject to.

Where the two products overlap

Both products run as an HTTP proxy in front of LLM endpoints. Both products can attach metadata to the request and write a log of what passed through. The overlap is at the surface level. The underlying responsibilities differ.

Kong AI Gateway's Prompt Guard plugin filters prompts against an allow list and a deny list of regex patterns. The filter operates on the raw prompt text. A pattern match either rejects the prompt or strips the matching content. This satisfies a basic content control use case for unregulated workloads.

DeepInspect's policy evaluation operates on the prompt content, the natural-person identity, the route context, the data classification outcome, and the policy version active at the time of the decision. The audit record carries each of these. The audit format is what a regulator reviewing the deployment under Article 12 expects to see.

Both products produce records of requests. Only one of them produces records of policy decisions with the metadata that regulatory review applies.

Feature comparison

| Feature | Kong AI Gateway | DeepInspect | |---|---|---| | HTTP proxy for LLM traffic | Yes | Yes | | Multi-provider routing | Yes (AI Proxy plugin) | Forwards to a configured upstream | | Semantic caching | Yes | Out of scope | | Token-based rate limiting | Yes | Out of scope | | Prompt regex filter | Yes (AI Prompt Guard) | Yes, plus classification | | Identity attribution at model API call | API consumer level | Natural-person from identity provider | | Per-route policy bundle | Plugin level | Yes, per-route policy bundle | | Prompt data classification | Regex only | Classification engine for PII, PHI, MNPI | | Per-decision audit record | Operational log | Cryptographically signed audit record | | Article 12 audit format | Operational log | Yes | | Fannie Mae LL-2026-04 lender record format | Operational log | Yes | | Self-hosted | Yes (Kong OSS or Kong Konnect) | Yes |

Pick Kong AI Gateway if

Pick Kong AI Gateway if the primary need is multi-provider LLM routing with operational features like caching, token rate limiting, and prompt template injection, and the regulatory audit requirement is satisfied elsewhere or the workload does not trigger a regulatory audit requirement. Kong is the strongest choice for teams already running Kong as their HTTP data plane that want the AI plugins on the same operator surface.

Pick Kong AI Gateway if the operational surface for the LLM traffic should match the operational surface for the rest of the API traffic. Teams with a mature Kong operations practice prefer the single data plane.

Pick DeepInspect if

Pick DeepInspect if the workload is subject to EU AI Act Article 12, Fannie Mae LL-2026-04, HIPAA, DORA, FedRAMP, ISO 42001, or any sector regime that requires identity-bound per-decision audit records. DeepInspect produces the record format that the regulator accepts. Kong's operational logs satisfy the existence requirement but fall short of the traceability requirement.

Pick DeepInspect if the organization is the data controller for prompts crossing into model providers and the security team needs prompt-level data classification, identity attribution from the natural-person identity primitive, and policy enforcement that fails closed.

Pick both if the deployment needs operational LLM traffic management and regulatory audit records. The composition pattern works in production today.

Composition pattern in production

The deployment topology that runs in production combines the two layers. The application calls DeepInspect (the addressable endpoint that the application points its OpenAI SDK at). DeepInspect verifies the caller's identity from the application's identity primitive, applies the data classification rules, evaluates the policy bundle for the route, commits the per-decision audit record, and forwards the cleared request to Kong AI Gateway. Kong's AI Proxy plugin routes the request to the upstream provider (OpenAI, Anthropic, Bedrock, etc.) based on its operational rules and returns the response. DeepInspect commits the response handling decision and forwards to the application.

The audit record carries the natural-person identity, the route, the policy version, the data classification outcome, the policy decision outcome, the upstream provider that Kong selected, the model and version that served the request, and the integrity signature. The operational log carries the Kong routing decision, the semantic cache hit or miss, the token consumption attribution, and the per-consumer rate limit state. The two layers compose without overlap.

Pricing approach

Kong AI Gateway is available under the Kong OSS data plane, which is open source under the Apache 2.0 license. Self-hosted deployment is free. The Kong Konnect control plane and Kong Enterprise plugins have their own pricing that Kong publishes separately.

DeepInspect's pricing is communicated through sales conversations and depends on the deployment regime, the workload volume, and the audit-record retention requirements. The cost is meaningfully lower than the cost of an audit miss under EU AI Act Article 12, Fannie Mae LL-2026-04, or a sector regime.

DeepInspect

DeepInspect sits between calling applications and any LLM endpoint over HTTP. It evaluates identity-bound policy on every request, classifies prompt data against the regulated data types the organization recognizes, commits per-decision audit records with cryptographic integrity, and produces the record format that EU AI Act Article 12 and Fannie Mae LL-2026-04 reviewers accept. The architecture composes with Kong AI Gateway by sitting in front of it, which preserves Kong's operational benefits and adds the regulatory audit layer that Kong was not designed to provide.

The composition gives organizations the multi-provider routing they want from Kong and the per-decision audit records they need for the workload to survive regulatory review. The audit pipeline consumes one record format regardless of which upstream provider Kong's AI Proxy selected for any given request, which keeps the regulatory review tractable.

If you are running Kong AI Gateway today and the EU AI Act August 2 deadline is on your roadmap, let's talk.