Does the inspection layer change the user experience of the copilot?

The inspection layer adds under 50 ms to the request path. LLM inference takes 500 ms to 5 seconds, so the overhead is invisible relative to the model's response time. The user experience changes only when the policy decision blocks or modifies the request. A block surfaces as a structured error the copilot's UI handles (a message that the data is outside the user's scope). A modify surfaces as a redaction the user sees (PII removed from the prompt context with an indicator that redaction happened). Both outcomes are deliberate, and the copilot's UI handles each one explicitly.

How does the inspection layer handle copilots that use multiple upstream LLMs?

The inspection layer sits in front of each LLM endpoint the copilot calls. The record schema is the same across models. A copilot that routes between OpenAI for chat, Anthropic for long context, and Bedrock for SOC 2 boundaries produces a single audit record series with the upstream model and version stamped on each record. The downstream consumers (SIEM, GRC archive) read the series with a fixed parser regardless of which model served the request.

What does the inspection layer do when the copilot retrieves PHI into the prompt context?

The inspection layer's classifier detects the PHI in the retrieved content. The policy evaluates the user's PHI authorization against the detected content. A user with PHI authorization passes, and the record captures the PHI classification. A user without PHI authorization receives a block (the request does not forward) or a modify (the PHI is redacted from the prompt before forwarding). The record series carries both the detection result and the decision outcome.
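The three outcomes described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the product's classifier: the regex `MRN_PATTERN`, the `phi_authorized` flag, and the `redact_on_deny` switch that chooses between block and modify are all hypothetical names introduced here.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for a PHI classifier: flags strings shaped like "MRN-1234567".
MRN_PATTERN = re.compile(r"\bMRN-\d{7}\b")


@dataclass(frozen=True)
class Decision:
    outcome: str                      # "pass", "block", or "modify"
    phi_detected: bool                # detection result, recorded for every outcome
    forwarded_prompt: Optional[str]   # None when the request does not forward


def decide_phi(prompt: str, phi_authorized: bool, redact_on_deny: bool) -> Decision:
    """Evaluate the user's PHI authorization against what the classifier found."""
    phi_detected = bool(MRN_PATTERN.search(prompt))
    if not phi_detected or phi_authorized:
        # Nothing sensitive, or the user may see it: forward unchanged.
        return Decision("pass", phi_detected, prompt)
    if redact_on_deny:
        # Modify: strip the PHI and forward the redacted prompt.
        return Decision("modify", True, MRN_PATTERN.sub("[REDACTED-PHI]", prompt))
    # Block: the request does not forward at all.
    return Decision("block", True, None)
```

Each `Decision` keeps the detection result next to the outcome, which mirrors the point that the record series carries both.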

How does the inspection layer integrate with the SSO and the identity provider?

The application authenticates the user with the SSO and propagates the user's identity to the inspection layer through an established pattern. Common patterns include attaching the user identifier as a request header, including the user identity in a signed JWT, or running the inspection layer behind a service mesh that propagates the user identity through mTLS context. The inspection layer's policy evaluates the identity the application supplies. The application's identity propagation is the upstream obligation the inspection layer relies on.

What audit records does the copilot vendor produce versus the inspection layer?

The copilot vendor produces application-level records of user sessions and tool invocations. The records live inside the application boundary and serve the vendor's operational needs. The inspection layer produces per-decision records at the HTTP boundary outside the application. The records carry the seven fields the regulator expects (identity, route, data classification, policy version, decision outcome, model and version, integrity metadata). The two record series compose: the application records cover the application's internal state, and the inspection layer records cover the regulatory audit obligations.

AI Security for Internal Copilots: The Identity, Data, and Audit Controls a Production Deployment Has To Run

Internal copilots reach across the organization's data surfaces from a single authenticated session. A finance copilot reaches the data warehouse, the BI tools, and the company's financial reporting system. A sales copilot reaches the CRM, the call recordings, and the proposal repository. A support copilot reaches the ticketing system, the customer database, and the internal knowledge base. The deployment authenticates the user at the copilot's front door, attaches the user's identity to the session, and then makes a series of model calls and tool invocations on the user's behalf. The boundary between "the user is authenticated" and "the specific request, against the specific data, is permitted" is the post-authentication gap. The gap is where the data leaks live, and the gap is where the regulatory record-keeping has to be produced.

I want to walk through the request-time data the copilot reads, the identity-aware policy decisions the deployment has to commit at the request boundary, the audit record format that survives EU AI Act, HIPAA, and DORA review, and the architectural pattern that closes the post-authentication gap.

The data the internal copilot reads at request time

The copilot reads four classes of data at each request. The session identity (who is asking) carries the natural-person identifier from the SSO, the user's role and group memberships, the tenant, and any session-scoped authorization scopes. The request content (what is being asked) carries the prompt text, the system prompt, the model selection, the tool list the application surfaces, and any structured fields the application supplies. The retrieved context (what the copilot pulled in to answer) carries the documents, the database rows, the call recordings, or the ticket history the retrieval layer fetched. The state context (what state the policy needs) carries the policy version, the rate-limit counters, and any session-scoped metadata.
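One way to picture the four classes is as a typed input to the policy decision. The sketch below is illustrative only: the class and field names are hypothetical, and the fields are a subset of what the paragraph above lists.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SessionIdentity:            # who is asking
    user_id: str                  # natural-person identifier from the SSO
    roles: Tuple[str, ...]
    groups: Tuple[str, ...]
    tenant: str
    scopes: Tuple[str, ...] = ()  # session-scoped authorization scopes


@dataclass(frozen=True)
class RequestContent:             # what is being asked
    prompt: str
    system_prompt: str
    model: str
    tools: Tuple[str, ...] = ()


@dataclass(frozen=True)
class RetrievedContext:           # what the copilot pulled in to answer
    documents: Tuple[str, ...] = ()


@dataclass(frozen=True)
class StateContext:               # what state the policy needs
    policy_version: str
    rate_limit_remaining: int


@dataclass(frozen=True)
class PolicyInput:
    identity: SessionIdentity
    request: RequestContent
    retrieved: RetrievedContext
    state: StateContext


# Same agent, same prompt, different retrieved record: two distinct policy inputs.
agent = SessionIdentity("u-17", ("support",), ("tier-1",), "tenant-1")
ask = RequestContent("what's on this customer's account", "You are a support copilot.", "model-x")
state = StateContext("policy-v1", 100)
first = PolicyInput(agent, ask, RetrievedContext(("customer:1001",)), state)
second = PolicyInput(agent, ask, RetrievedContext(("customer:2002",)), state)
print(first == second)  # False
```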

The four classes compose the picture the policy decision evaluates against. The same user role can ask the same prompt, and the decision varies with the retrieved context the application supplied. A support agent who asks "what's on this customer's account" with one customer ID in the retrieval context can receive a different decision than a support agent who asks the same prompt with a different customer ID. The identity-aware policy evaluates the specific combination at the specific moment.

The retrieval context is the part most internal copilot deployments treat as application-internal. The prompt and the response cross an HTTP boundary the inspection layer reads. The retrieval context lives inside the application's retrieval pipeline. The inspection layer reads what the application sends to the model, which includes the retrieval context as part of the prompt. The policy evaluates the retrieval content with the same identity context the prompt carries.

The identity-aware policy decisions

The decisions the deployment commits at the request boundary cover three orthogonal axes. The first is user-against-data. A user with role A can read data class X but not data class Y. A user in tenant 1 can read data tagged tenant 1 but not data tagged tenant 2. A user with no PHI authorization cannot retrieve PHI into the prompt context.

The second is user-against-action. A user with role A can call the refund tool for amounts up to $500. A user in tenant 1 can write into tenant 1 records but not tenant 2 records. A user with no MNPI authorization cannot draft a public disclosure that contains pre-announcement earnings data.

The third is data-against-model. A prompt that contains PHI cannot route to a model whose data processing terms do not cover PHI. A prompt that contains source code subject to export controls cannot route to a model whose inference happens outside the controlled jurisdiction. A prompt that contains MNPI cannot route to a model whose data handling does not exclude training-data inclusion.

The three axes compose. A user (role A, tenant 1, no PHI authorization) asks (a refund tool call) against (data tagged tenant 1, with no PHI in the retrieval context, routing to a model with appropriate processing terms). The combination produces a pass. A change on any axis (a PHI flag in the retrieval, a different tenant) produces a block or a modify.
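The composition can be sketched as three independent checks whose most restrictive result wins. That precedence rule (block over modify over pass), the `Request` fields, and the `REFUND_LIMIT_USD` table are assumptions made for this example, not a description of a specific policy engine.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Request:
    role: str
    tenant: str
    phi_authorized: bool
    tool: str
    amount_usd: float
    data_tenant: str
    data_has_phi: bool
    model_covers_phi: bool


REFUND_LIMIT_USD = {"A": 500.0}  # hypothetical per-role refund limits
SEVERITY = {"pass": 0, "modify": 1, "block": 2}


def user_against_data(r: Request) -> str:
    if r.data_tenant != r.tenant:
        return "block"            # cross-tenant read
    if r.data_has_phi and not r.phi_authorized:
        return "modify"           # redact PHI the user may not see
    return "pass"


def user_against_action(r: Request) -> str:
    if r.tool == "refund" and r.amount_usd > REFUND_LIMIT_USD.get(r.role, 0.0):
        return "block"            # refund above the role's limit
    return "pass"


def data_against_model(r: Request) -> str:
    if r.data_has_phi and not r.model_covers_phi:
        return "block"            # model's processing terms do not cover PHI
    return "pass"


def evaluate(r: Request) -> str:
    """Run all three axes and return the most restrictive outcome."""
    outcomes = [user_against_data(r), user_against_action(r), data_against_model(r)]
    return max(outcomes, key=SEVERITY.__getitem__)


base = Request("A", "tenant-1", False, "refund", 200.0, "tenant-1", False, True)
print(evaluate(base))                                    # pass
print(evaluate(replace(base, data_has_phi=True)))        # modify
print(evaluate(replace(base, data_tenant="tenant-2")))   # block
```

Keeping each axis as its own function means a change on one axis flips the combined outcome without touching the others.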

The audit record format

The record at decision time covers the seven fields the regulator and the customer auditor consume. The identity carries the natural-person identifier (the user) and the agent identifier (the copilot session) when the copilot acts on behalf of the user. The route carries the route identifier (which copilot function, which retrieval pipeline) and the policy bundle binding at decision time. The data classification carries the inspection layer's classifier output on the prompt and the retrieval context (PII, PHI, MNPI, source code, regulated identifiers). The policy version carries the version of the policy bundle the policy decision point read at decision time. The decision outcome carries pass, block, or modify with the rule identifier that produced the outcome. The model and version carries the upstream LLM the request forwarded to. The integrity metadata carries the cryptographic signature and the hash chain link.
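A record with these fields and a hash chain link might look like the sketch below. It is an illustration under assumptions: the field names are hypothetical, and an HMAC-SHA256 tag stands in for the cryptographic signature (a production system would more plausibly use an asymmetric signature with managed keys).

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # prev_hash used for the first record in a chain


def _canonical(body: dict) -> bytes:
    # Stable serialization so the same fields always hash to the same value.
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def _hash_and_sign(fields: dict, prev_hash: str, key: bytes):
    record_hash = hashlib.sha256(_canonical(dict(fields, prev_hash=prev_hash))).hexdigest()
    signature = hmac.new(key, record_hash.encode("ascii"), hashlib.sha256).hexdigest()
    return record_hash, signature


def append_record(chain: list, fields: dict, key: bytes) -> dict:
    """Link the new record to the previous one, sign it, and append it."""
    prev_hash = chain[-1]["integrity"]["record_hash"] if chain else GENESIS
    record_hash, signature = _hash_and_sign(fields, prev_hash, key)
    record = dict(fields, integrity={
        "prev_hash": prev_hash, "record_hash": record_hash, "signature": signature,
    })
    chain.append(record)
    return record


def verify_chain(chain: list, key: bytes) -> bool:
    """Recompute every hash and signature; any edit or reordering fails."""
    prev_hash = GENESIS
    for record in chain:
        fields = {k: v for k, v in record.items() if k != "integrity"}
        record_hash, signature = _hash_and_sign(fields, prev_hash, key)
        integrity = record["integrity"]
        if (integrity["prev_hash"] != prev_hash
                or integrity["record_hash"] != record_hash
                or not hmac.compare_digest(integrity["signature"], signature)):
            return False
        prev_hash = record_hash
    return True


def example_fields(outcome: str) -> dict:
    return {
        "identity": {"user": "u-17", "agent_session": "s-42"},
        "route": {"id": "support-copilot/account-lookup", "policy_bundle": "bundle-7"},
        "data_classification": ["PII"],
        "policy_version": "2025.1",
        "decision": {"outcome": outcome, "rule_id": "R-12"},
        "model": {"name": "model-x", "version": "1"},
    }
```

Because each record's hash covers the previous record's hash, altering or removing an earlier record invalidates every record after it.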

Reviewers working from EU AI Act Article 12, DORA Article 19, Fannie Mae LL-2026-04, NIST AI RMF, HIPAA 45 CFR 164.312, and ISO 42001 all read this same format. A single audit pipeline produces records that satisfy each regime without per-regime transformation.

The post-authentication gap most internal copilots leave open

Most internal copilots authenticate the user at the front door and propagate the user's identity into the application's session state. The model API call and the downstream tool calls happen with the application's service account, not the user's identity. The records the application writes carry the application's identity. The records the model provider writes carry the application's identity. The records the downstream tool writes carry the application's service account.

The reviewer who asks "what did user X retrieve from the customer database at 14:32 through the copilot" reads either the application's internal logs (which lack the model selection and the retrieval context) or the downstream tool logs (which lack the user identity). Neither answers the question completely. The reconstruction requires joining streams that were never designed to be joined.

The Meta March 18 Sev-1 incident captured this gap publicly. An internal AI agent exposed sensitive user and company data to engineers who shouldn't have seen it. The agent was fully authenticated. The user was fully authenticated. The agent's downstream calls carried the agent's credentials and the data the agent retrieved did not pass through a layer that evaluated the user's authorization against the specific data. The post-authentication gap is the structural reason the incident was possible.

The architectural pattern that closes the gap

The pattern is an inspection layer at the HTTP boundary between the copilot's application and each upstream the copilot calls. The layer reads the prompt, the retrieval context (as part of the prompt), the identity the application propagates, and the policy state. The layer evaluates the user-against-data, user-against-action, and data-against-model policies. The layer applies pass, block, or modify, and commits the per-decision audit record before the response forwards.
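The ordering in this pattern (decide, commit the record, then forward) can be sketched as a single handler. The callables `evaluate`, `commit`, and `upstream`, the request keys, and the shape of the structured error are hypothetical placeholders for whatever the deployment actually uses.

```python
from typing import Callable, Optional, Tuple


def inspect_and_forward(
    request: dict,
    evaluate: Callable[[dict], Tuple[str, Optional[str]]],
    commit: Callable[[dict], None],
    upstream: Callable[[str], str],
) -> dict:
    """Evaluate policy, commit the audit record, and only then touch the upstream."""
    # evaluate returns (outcome, prompt_to_forward); the prompt may be redacted.
    outcome, prompt = evaluate(request)

    # The record is committed before anything is forwarded or returned.
    commit({"user": request["user"], "route": request["route"], "outcome": outcome})

    if outcome == "block" or prompt is None:
        # Structured error the copilot UI can render; nothing reaches the upstream.
        return {
            "status": 403,
            "error": {"code": "policy_block",
                      "message": "The requested data is outside your scope."},
        }

    # "pass" forwards the original prompt, "modify" forwards the redacted one.
    return {"status": 200, "body": upstream(prompt), "redacted": outcome == "modify"}
```

Committing before the upstream call means a crash mid-request still leaves a record of the decision, at the cost of the record not containing the response itself.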

The layer requires the application to propagate the user's identity in the request. Standard patterns include attaching the user identifier as a header, including the user identity in the request body, or running an authentication step at the inspection layer that maps the application's auth token to the user identity. The application's identity propagation is the upstream obligation. The policy evaluation and the audit record commit are the inspection layer's obligation.

The pattern works for HTTP-based LLM APIs (OpenAI, Anthropic, Bedrock, Azure OpenAI, on-prem Llama), HTTP-based tool APIs (internal services, CRMs, ticketing systems), and HTTP-based retrieval APIs (vector stores, search backends). The deployment runs the inspection layer in front of the upstreams whose calls produce decisions the regulator expects to audit.

The regulatory profile internal copilots have to satisfy

EU AI Act Article 12 expects records over the lifetime of the system with input data, identity, and the period of use. Internal copilots that touch employee data, customer data, or regulated workflows produce decisions the records cover. The August 2, 2026 high-risk system requirement reaches deployments that operate in classified high-risk domains (employment, credit, public services, health).

HIPAA 45 CFR 164.312 expects access records (who, what, when) for PHI access. Internal copilots that retrieve PHI into the prompt context (a clinical copilot, a payor copilot) produce PHI access events the records cover. The Business Associate Agreement chain has to cover the inspection layer and the upstream model providers.

DORA Article 19 expects records of operational events with timestamp and identity for financial services workflows. Internal copilots in banking, insurance, and asset management that touch operationally significant data fall inside the scope.

Fannie Mae LL-2026-04 expects records that establish audit trails for AI-assisted lending decisions. Internal lender copilots that assist underwriters or loan officers produce decisions the records cover.

The audit record fields the inspection layer commits cover all of these regimes because the regimes overlap on identity, time, decision outcome, and integrity.

DeepInspect

This is the gap DeepInspect closes for internal copilots. DeepInspect sits inline between the copilot's application and any HTTP upstream the copilot calls. The inspection layer reads the prompt, the retrieval context, and the user identity the application propagates. The layer evaluates identity-bound policy against the data classification and the policy state. The layer applies pass, block, or modify, and commits the per-decision audit record to durable, append-only storage with a cryptographic integrity signature before the response forwards.

The record series carries the natural-person identifier the user authenticated with, the copilot session identifier, the route, the data classification, the policy version, the decision outcome, the upstream model and version, and integrity metadata. The series satisfies EU AI Act Article 12, DORA Article 19, Fannie Mae LL-2026-04, HIPAA 45 CFR 164.312, NIST AI RMF, and ISO 42001 record-keeping obligations from the same pipeline. End-to-end inspection-layer overhead measures under 50 ms in production.

If you are running an internal copilot in production and the audit reviewer asks who saw what data through the copilot, let's talk today.