How is MLflow AI Gateway different from DeepInspect?

MLflow AI Gateway is an open-source MLflow component that centralizes LLM provider credentials behind named routes that MLflow client code calls. When the calling code runs inside an MLflow run, the MLflow tracking integration captures the call. DeepInspect is an identity-bound policy enforcement layer at the HTTP request boundary that classifies prompt data, evaluates per-route policy bundles, and commits per-decision audit records formatted for EU AI Act Article 12 review and sector audit requirements. MLflow AI Gateway covers experimentation-side standardization. DeepInspect covers the regulatory audit obligation.

Can MLflow AI Gateway replace DeepInspect for a regulated workload?

For workloads that run primarily as offline experimentation inside MLflow and where the audit format the regulator accepts is the MLflow tracking event, possibly. For workloads that include production traffic from end-user-facing applications to LLM endpoints, the MLflow tracking event lacks the natural-person identity attribution, the policy version that evaluated the decision, the data classification outcome, and the cryptographic integrity signature that the regulatory audit format expects. Without those fields, the production-side record falls short of the format a reviewer under EU AI Act Article 12 or Fannie Mae LL-2026-04 will accept.

Can DeepInspect replace MLflow AI Gateway?

For deployments where the application code already addresses LLM providers directly with provider-managed credentials, DeepInspect adds the identity-bound policy and audit record layer without needing MLflow AI Gateway for the routing. For deployments that want the MLflow-anchored experimentation workflow with named routes, centralized credentials, and MLflow tracking integration, DeepInspect does not replace MLflow AI Gateway. The two compose.

How does the deployment topology work when both are in production?

Production application traffic addresses DeepInspect. DeepInspect evaluates the policy and commits the audit record, then forwards the cleared request to the upstream, which may be MLflow AI Gateway (which then brokers the call to the provider) or the provider directly. MLflow-internal experimentation traffic from MLflow runs addresses MLflow AI Gateway by route name, and the MLflow tracking event captures the run-level metadata. The DeepInspect audit pipeline then ingests those tracking events to consolidate the audit trail for offline workloads.

What about the MLflow tracking event versus DeepInspect's audit record?

The MLflow tracking event is designed to capture the experimentation metadata for the MLflow run: the route, the call parameters, the response, the latency, the cost attribution. The DeepInspect audit record is designed to capture the regulatory audit metadata for the policy decision: the natural-person identity, the policy version, the data classification outcome, the policy decision outcome, and the cryptographic integrity signature. The two record formats serve different audit pipelines and can coexist for a workload that needs both the MLflow-side experimentation tracking and the regulatory-side audit obligation.

DeepInspect vs MLflow AI Gateway: Where Model Routing Stops and Policy Enforcement Starts

MLflow AI Gateway (formerly MLflow Deployments) is the open-source MLflow component that lets an ML platform team register LLM provider endpoints under a single MLflow control surface. The gateway accepts a YAML configuration that defines named routes, each route pointing at an upstream provider (OpenAI, Anthropic, MLflow-served custom model, Bedrock, Cohere, MosaicML, PaLM, HuggingFace TGI, and others) with the credentials and the default parameters. The MLflow client SDK calls the named route, and the gateway brokers the call to the upstream provider. The product targets the operational concerns of standardizing LLM access inside an MLflow-anchored ML platform.

DeepInspect sits at the HTTP request boundary and answers a different question. It enforces identity-bound policy on prompt content, classifies prompt data against the regulated data types the organization recognizes, and commits a per-decision audit record that a reviewer under EU AI Act Article 12 or Fannie Mae LL-2026-04 can accept.

I want to walk through what MLflow AI Gateway does, what DeepInspect does, and where the two layers compose for regulated AI workloads.

TL;DR

MLflow AI Gateway centralizes LLM endpoint registration for MLflow-anchored teams: routes, provider credentials, default parameters, and key rotation through MLflow's configuration surface. DeepInspect enforces identity-bound policy on prompt content for any LLM endpoint and produces per-decision audit records formatted for regulatory review. Regulated deployments use DeepInspect in front of MLflow AI Gateway (or in front of the provider directly) for policy enforcement and the cross-route audit record.

MLflow AI Gateway: what it is and where it sits

MLflow AI Gateway ships as an open-source MLflow component. The gateway process reads a YAML configuration that defines named routes, each one specifying a provider type (openai, anthropic, mosaicml, bedrock, palm, ai21labs, cohere, mlflow-model-serving, huggingface-text-generation-inference), the provider credentials (typically environment variables or secret references), and the model endpoint. The MLflow client calls the route by name, and the gateway brokers the request to the upstream.

The feature set centers on standardization. The gateway lets one team manage provider credentials and rotate them without touching application code. Route definitions encode the provider-specific parameters (model name, default temperature, max tokens) so the client call is parameter-light. The gateway emits MLflow tracking events for the calls when the calling code is inside an MLflow run. The deployment is a single Python process behind a configurable network address.
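To make the "parameter-light" point concrete, here is a minimal Python sketch of the route abstraction. It is not MLflow's configuration schema or client API: the `Route` class, the `ROUTES` table, the `build_upstream_request` helper, the route name `support-chat`, and the model name are all hypothetical, chosen only to show how route-level defaults and a credential reference can live in one place while the caller supplies little more than a route name and a prompt.

```python
import os
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Route:
    """Hypothetical stand-in for one named gateway route (not MLflow's schema)."""
    name: str
    provider: str
    model: str
    credential_env: str  # name of the env var holding the key, never the key itself
    defaults: dict = field(default_factory=dict)


# Hypothetical route table; a real gateway would load this from its YAML config.
ROUTES = {
    "support-chat": Route(
        name="support-chat",
        provider="openai",
        model="example-chat-model",
        credential_env="OPENAI_API_KEY",
        defaults={"temperature": 0.2, "max_tokens": 256},
    ),
}


def build_upstream_request(route_name: str, prompt: str, **overrides) -> dict:
    """Merge route defaults with per-call overrides into one upstream payload."""
    route = ROUTES[route_name]
    params = {**route.defaults, **overrides}  # per-call values win over defaults
    return {
        "provider": route.provider,
        "model": route.model,
        "prompt": prompt,
        "params": params,
        # Only report whether the credential is present; do not expose it.
        "has_credential": route.credential_env in os.environ,
    }
```

A caller would write `build_upstream_request("support-chat", "Summarize this ticket")` and inherit the temperature and token limit from the route. Rotating the key means changing the environment variable the route references, with no change to calling code.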

The architectural sweet spot for MLflow AI Gateway is the ML platform team that has standardized on MLflow as the experiment and model tracking surface. Teams running LLM evaluations, prompt experimentation, and offline batch inference inside MLflow get the route abstraction on the same operator surface as the rest of their MLflow workflow.

What DeepInspect is and where it sits

DeepInspect sits at the HTTP request boundary, addressable from any application that calls any LLM endpoint over HTTP. It evaluates identity-bound policy on every request, classifies prompt data against the regulated data types the organization recognizes, and commits a per-decision audit record with cryptographic integrity. The decisions are deterministic, fail-closed, and independent of the model's behavior.

The feature set covers identity attribution at the model API call from the application's identity primitive (the natural-person identity, the tenant, the role, the route context), per-route policy enforcement for different application surfaces (the support route, the developer route, the legal route, the underwriting route), prompt-level data classification (PII, PHI, MNPI, source code, source-licensed content, regulated jurisdictional data), policy decisions that pass, block, or modify the request, and the per-decision audit record format that downstream audit pipelines consume.
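The pass, block, or modify behavior described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not DeepInspect's implementation: the regex detectors, the `POLICIES` table, the label names, and the `evaluate` function are hypothetical, and a real classification engine would be far more thorough than two regular expressions. The point is the shape of a deterministic, fail-closed decision: an unknown route or an unlisted label results in a block.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Toy detectors (assumption): two regexes standing in for a classification engine.
DETECTORS = {
    "PII_EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PII_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

# Hypothetical per-route policy bundles: classification label -> action.
POLICIES = {
    "support": {"version": "support-v1",
                "rules": {"PII_EMAIL": "modify", "PII_SSN": "block"}},
    "developer": {"version": "developer-v1",
                  "rules": {"PII_EMAIL": "block", "PII_SSN": "block"}},
}


@dataclass(frozen=True)
class Decision:
    outcome: str             # "pass", "block", or "modify"
    policy_version: str
    labels: tuple
    prompt: Optional[str]    # prompt to forward; None when blocked


def classify(prompt: str) -> tuple:
    """Return the sorted labels whose detector matches the prompt."""
    return tuple(sorted(l for l, rx in DETECTORS.items() if rx.search(prompt)))


def evaluate(route: str, prompt: str) -> Decision:
    policy = POLICIES.get(route)
    if policy is None:
        # Fail closed: no policy bundle for the route means no forwarding.
        return Decision("block", "none", (), None)
    labels = classify(prompt)
    # Labels without an explicit rule default to "block" (fail closed again).
    actions = {label: policy["rules"].get(label, "block") for label in labels}
    if "block" in actions.values():
        return Decision("block", policy["version"], labels, None)
    if "modify" in actions.values():
        redacted = prompt
        for label, action in actions.items():
            if action == "modify":
                redacted = DETECTORS[label].sub(f"[{label}]", redacted)
        return Decision("modify", policy["version"], labels, redacted)
    return Decision("pass", policy["version"], labels, prompt)
```

Because the decision depends only on the route, the prompt, and the policy table, the same input always yields the same outcome, independent of anything the model later does.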

The architectural sweet spot for DeepInspect is the regulated production workload. An organization that is the data controller for prompts crossing into a model provider needs evidence that satisfies the deployer obligations under EU AI Act Article 26, the record-keeping obligations under Article 12, the lender record obligations under Fannie Mae LL-2026-04, and the sector-specific regimes (HIPAA, DORA, FedRAMP, ISO 42001) that the workload is subject to.

Where the two products overlap

Both products run as an HTTP layer that brokers calls to LLM providers. Both products can centralize provider credentials. The overlap is at the surface level. The underlying responsibilities differ.

MLflow AI Gateway's authentication is provider-level (the API key the gateway holds for the upstream) and route-level on the client side (the route name the client calls). The MLflow tracking event captures the calling MLflow run, the route, and the basic call metadata. The gateway log records the request fingerprint.

DeepInspect's authentication is identity-token based against the organization's identity provider. The identity carries the natural-person identity, the tenant, the role, and the route context. The audit record carries the policy version, the data classification outcome, the policy decision outcome, and the cryptographic integrity signature. The audit record is structured for regulatory review.
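As a rough illustration of what a per-decision record with an integrity signature can look like, the Python sketch below signs a record with HMAC-SHA256 over a canonical JSON encoding. The field names, the `sign_record` and `verify_record` helpers, and the choice of HMAC are assumptions made for the example; the source does not specify DeepInspect's actual record schema or signing scheme, and a production system might well use asymmetric signatures instead.

```python
import hashlib
import hmac
import json


def _canonical(record: dict) -> bytes:
    """Stable byte encoding so the same record always hashes the same way."""
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_record(record: dict, key: bytes) -> dict:
    """Return a copy of the record with an HMAC-SHA256 signature attached."""
    signature = hmac.new(key, _canonical(record), hashlib.sha256).hexdigest()
    return {**record, "signature": signature}


def verify_record(signed: dict, key: bytes) -> bool:
    """Recompute the signature over every field except the signature itself."""
    record = {k: v for k, v in signed.items() if k != "signature"}
    expected = hmac.new(key, _canonical(record), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, str(signed.get("signature", "")))


# Hypothetical record carrying the fields the text lists.
example = {
    "identity": "user-123",
    "tenant": "acme",
    "role": "support-agent",
    "route": "support",
    "policy_version": "support-v1",
    "classification": ["PII_EMAIL"],
    "decision": "modify",
}
```

Signing `example` and then changing any field, such as flipping `decision` from `modify` to `pass`, makes `verify_record` return `False`, which is the property a reviewer relies on when checking that a record was not altered after the fact.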

Both products produce records of requests. Only one of them produces records of policy decisions with the metadata that regulatory review requires.

Feature comparison

| Feature | MLflow AI Gateway | DeepInspect |
|---|---|---|
| HTTP proxy for LLM traffic | Yes | Yes |
| Multi-provider routing | Yes (named routes per provider) | Forwards to a configured upstream |
| Provider credential centralization | Yes | Out of scope |
| MLflow tracking event integration | Yes | Out of scope |
| Identity attribution at model API call | Route-level | Natural-person from IdP |
| Per-route policy bundle | Route name only | Yes, policy bundle per route |
| Prompt data classification | None | Classification engine for PII, PHI, MNPI |
| Per-decision audit record | Tracking event | Cryptographically signed audit record |
| Article 12 audit format | Tracking event plus translation | Native format |
| Fannie Mae LL-2026-04 lender record format | Tracking event plus translation | Native format |
| Self-hosted | Yes (open-source) | Yes |

Pick MLflow AI Gateway if

Pick MLflow AI Gateway if the team's primary need is standardizing LLM provider credentials and routes inside an MLflow-anchored workflow, and the regulatory audit requirement is satisfied elsewhere or the workload does not trigger a regulatory audit requirement. The MLflow tracking integration is the largest single benefit for teams that already run MLflow as the experiment and model tracking surface.

Pick MLflow AI Gateway if the workload is offline batch inference or LLM evaluation experimentation inside MLflow, with results flowing into MLflow runs for tracking and comparison. The MLflow-resident workflow gets first-class support.

Pick DeepInspect if

Pick DeepInspect if the workload is subject to EU AI Act Article 12, Fannie Mae LL-2026-04, HIPAA, DORA, FedRAMP, ISO 42001, or any sector regime that requires identity-bound per-decision audit records. DeepInspect produces the record format that the regulator accepts. MLflow's tracking events cover experiment metadata but fall short of the regulatory audit format.

Pick DeepInspect if the organization is the data controller for prompts crossing into model providers and the security team needs prompt-level data classification, identity attribution from the natural-person identity primitive, and policy enforcement that fails closed.

Pick both if the deployment needs MLflow-anchored experimentation and regulated-workload audit. The composition pattern works in production today.

Composition pattern in production

The deployment topology that runs in production combines the two layers based on which surface owns the call. For application traffic from production services to LLM endpoints, the application points its HTTP client at DeepInspect. DeepInspect verifies the caller's identity, applies the data classification rules, evaluates the policy bundle for the route, commits the per-decision audit record, and forwards the cleared request to the upstream endpoint. The upstream may be MLflow AI Gateway (which then brokers the call to the provider) or the provider directly.
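The ordering of steps in the production path matters, so here is a minimal Python sketch of it. Everything in it is hypothetical: `handle_request` and the five injected callables are placeholders for whatever identity provider, classifier, policy engine, audit store, and upstream client a deployment actually uses, and the sketch handles only pass and block (a modify outcome is omitted for brevity). It shows two properties taken from the description above: the audit record is committed before anything is forwarded, and any failure along the way results in a block.

```python
def handle_request(token, route, prompt, *, verify_identity, classify,
                   evaluate, commit_audit, forward):
    """Sketch of the production path: decide, record the decision, then forward."""
    identity, labels = None, ()
    try:
        identity = verify_identity(token)           # who is calling
        labels = tuple(classify(prompt))            # what the prompt contains
        outcome = evaluate(route, identity, labels) # policy bundle for the route
    except Exception:
        # Fail closed: an error in any step is treated as a block decision.
        outcome = "block"

    # The decision is recorded whether or not the request goes upstream.
    commit_audit({
        "identity": identity,
        "route": route,
        "labels": list(labels),
        "outcome": outcome,
    })

    if outcome != "pass":
        return {"status": "blocked"}
    # Upstream may be MLflow AI Gateway or the provider directly.
    return forward(route, prompt)
```

Injecting the steps as callables keeps the sketch independent of any specific product API and makes the control flow easy to exercise with stubs.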

For MLflow-internal experimentation traffic from MLflow runs to LLM endpoints, the MLflow run calls MLflow AI Gateway by route name. The gateway brokers the call to the provider. The MLflow tracking event captures the run-level metadata. The DeepInspect audit pipeline ingests the MLflow tracking events to consolidate the audit trail for offline workloads, attaching the policy version and the data classification metadata to the experimentation traffic records.

The audit pipeline carries the natural-person identity (or the MLflow run principal for experimentation traffic), the route, the policy version, the data classification outcome, the policy decision outcome, the upstream provider that served the request, and the integrity signature.
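One way to picture a single record format fed from both surfaces is the Python sketch below. It is an assumption-laden illustration: the `AuditRecord` fields mirror the list above, but the input dictionaries, their key names (`user_id`, `run_id`, `provider`, and so on), and both converter functions are hypothetical, since the source does not document either the DeepInspect schema or the shape of the ingested tracking events. The integrity signature is left out here to keep the example short.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AuditRecord:
    principal: str
    principal_type: str      # "natural_person" or "mlflow_run"
    route: str
    policy_version: str
    classification: tuple
    decision: str
    upstream_provider: str


def from_production_decision(decision: dict) -> AuditRecord:
    """Map a (hypothetical) production policy decision to the common format."""
    return AuditRecord(
        principal=decision["user_id"],
        principal_type="natural_person",
        route=decision["route"],
        policy_version=decision["policy_version"],
        classification=tuple(decision["labels"]),
        decision=decision["outcome"],
        upstream_provider=decision["provider"],
    )


def from_tracking_event(event: dict, policy_version: str,
                        labels: tuple, outcome: str) -> AuditRecord:
    """Map a (hypothetical) ingested tracking event to the common format.

    The policy version, labels, and outcome are supplied by the audit
    pipeline, because the tracking event itself does not carry them.
    """
    return AuditRecord(
        principal=event["run_id"],
        principal_type="mlflow_run",
        route=event["route"],
        policy_version=policy_version,
        classification=tuple(labels),
        decision=outcome,
        upstream_provider=event["provider"],
    )
```

Both converters return the same type, so a downstream consumer can call `asdict(record)` and process production and experimentation records with one code path, distinguishing them only by `principal_type`.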

Pricing approach

MLflow AI Gateway is open-source under the Apache 2.0 license as part of MLflow. Self-hosted deployment is free. The Databricks Managed MLflow offering bundles the MLflow control surface and has its own Databricks-side pricing.

DeepInspect's pricing is communicated through sales conversations and depends on the deployment regime, the workload volume, and the audit-record retention requirements. The cost is meaningfully lower than the cost of an audit miss under EU AI Act Article 12, Fannie Mae LL-2026-04, or a sector regime.

DeepInspect

DeepInspect sits between calling applications and any LLM endpoint over HTTP. It evaluates identity-bound policy on every request, classifies prompt data against the regulated data types the organization recognizes, commits per-decision audit records with cryptographic integrity, and produces the record format that EU AI Act Article 12 and Fannie Mae LL-2026-04 reviewers accept. The architecture composes with MLflow AI Gateway by sitting in front of it for production traffic and by ingesting MLflow tracking events for the experimentation-side audit consolidation.

The composition gives organizations the MLflow-anchored experimentation workflow they have built around the MLflow tracking surface and the per-decision audit records they need for the production workload to survive regulatory review. The audit pipeline consumes one record format regardless of whether the call originated from a production service or an MLflow experimentation run.

If you are running MLflow AI Gateway today and the EU AI Act August 2 deadline applies to the production workload, let's talk.