DeepInspect for Heads of Security: AI Risk as a Production Control
Heads of Security own the production controls that prevent damage at machine speed. AI traffic is the data channel where the controls have to operate. The Mandiant 22-second handoff window and the IBM shadow AI numbers determine what counts as a working control today.

Heads of Security own the production controls. The CISO sits with the board. The Head of Security sits with the SOC, the incident response team, and the security engineering function. The question on the Head of Security's desk this year is not whether AI security is a priority. The question is what counts as a working control on AI traffic. Two facts frame the answer: the median handoff time from initial access to a secondary threat group is now 22 seconds, per Mandiant's M-Trends 2026 report, and AI traffic is a first-class data channel that DLP cannot see. I want to walk through the architectural facts that determine what works, where the controls live, and how the operational posture changes when AI is in production.
The threat landscape that defines the control set
Three numbers shape the Head of Security's working model.
22-second handoff
Mandiant's M-Trends 2026 report, based on 500,000+ hours of frontline incident response, found that the median time between initial access and handoff to a secondary threat group collapsed from over 8 hours in 2022 to 22 seconds in 2025. The control that fires after the fact does not prevent the damage. The control that fires before the request reaches the model does. The architecture choice is between log-and-alert (forensic) and inline enforcement (preventive).
89% year-over-year growth in AI-enabled attacks
Foresiet reported that AI-enabled attacks grew 89% year-over-year in early 2026. A single automated AI tool compromised 600+ FortiGate firewalls across 55 countries with zero human operator involvement. The attacker tempo is machine speed. The defender tempo, if it is not also machine speed, falls behind.
$670,000 incremental cost of shadow AI breaches
IBM's Cost of a Data Breach Report found that one in five breached organizations experienced breaches linked to shadow AI, that shadow AI breaches cost $670,000 more on average, and that they take 247 days to detect. The data channel is invisible to the controls most security teams operate today.
Why network DLP is blind to AI traffic
When an engineer pastes source code into ChatGPT, the data travels as an HTTPS POST to api.openai.com. Network DLP sees encrypted web traffic. The prompt content, which is the actual data, is not visible to DLP unless TLS inspection is configured for AI provider domains specifically and the API payload is parsed. Even with TLS inspection, the classification has to happen at the prompt layer, not the document layer.
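To make "the API payload is parsed" concrete, here is a minimal sketch of pulling prompt text out of an already-decrypted chat-style request body. It assumes the JSON shape used by chat completion APIs (a `messages` array whose `content` is either a string or a list of typed parts). The function name `extract_prompt_text` is hypothetical, and other providers or endpoints may use different shapes.

```python
import json


def extract_prompt_text(body: bytes) -> list:
    """Collect the text fragments from the `messages` array of a chat-style request."""
    payload = json.loads(body)
    texts = []
    for message in payload.get("messages", []):
        content = message.get("content")
        if isinstance(content, str):
            # Simple case: content is a plain string.
            texts.append(content)
        elif isinstance(content, list):
            # Multi-part case: keep only the parts typed as text.
            texts.extend(
                part.get("text", "")
                for part in content
                if isinstance(part, dict) and part.get("type") == "text"
            )
    return texts
```

The extracted strings, not the HTTP envelope, are what a prompt-layer classifier has to inspect.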
Legacy DLP fails on AI traffic in three structural ways. Identity correlation: API calls authenticated with personal keys do not map to corporate identity. Data classification: DLP classifies documents, not prompt context windows. Policy enforcement: per Netwrix, only 37% of organizations have any AI-related governance policies, and 97% of those that suffered AI-related breaches lacked proper access controls for AI services.
The control set that works
Three controls compose into a working production posture.
Identity-aware policy enforcement at the AI request boundary
The enforcement layer evaluates each AI request against per-route, per-role policies with the verified human or agent identity attached as a claim. The decision is deterministic. The evaluation happens before the request reaches the model. The control is the same architectural shape as a per-request authorization decision in a zero-trust IAM deployment, applied to AI traffic.
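A per-route, per-role decision of this kind can be sketched as a pure function over the request and a policy table. This is an illustration under stated assumptions, not DeepInspect's implementation: the `AIRequest` fields, the `POLICY` table, the route string, and the role names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AIRequest:
    subject: str         # verified human or agent identity, taken from the identity claim
    role: str            # role asserted by the identity provider
    route: str           # AI route being called
    classification: str  # highest data classification found in the prompt


# Hypothetical policy table: (route, role) -> classifications that role may send.
POLICY = {
    ("/v1/chat/completions", "engineer"): {"public", "internal"},
    ("/v1/chat/completions", "analyst"): {"public"},
}


def decide(request: AIRequest, policy=POLICY) -> str:
    """Return "allow" or "deny". Unknown (route, role) pairs are denied by default."""
    allowed = policy.get((request.route, request.role), frozenset())
    return "allow" if request.classification in allowed else "deny"
```

The same inputs always produce the same outcome, which is what makes the decision deterministic and auditable. Default-deny for unlisted pairs is a design choice in this sketch.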
Prompt-level data classification with inline redaction or block
The enforcement layer inspects the prompt content at the moment of the request and applies the classification policy to the prompt itself, not to the source document the prompt may have been pasted from. PII, NPI, PHI, and other sensitive classifications produce a redact or deny outcome before the prompt reaches the model.
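The redact-or-deny outcome can be sketched with a small pattern-based classifier. The patterns, the labels, and the `BLOCK_ON` set below are illustrative assumptions. A production classifier needs far broader coverage than two regular expressions.

```python
import re

# Illustrative patterns only (assumed for this sketch).
PATTERNS = {
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}
BLOCK_ON = {"US_SSN"}  # hypothetical: labels that deny instead of redact


def classify_and_redact(prompt: str):
    """Return (outcome, labels, text); outcome is "allow", "redact", or "deny"."""
    labels = sorted(name for name, rx in PATTERNS.items() if rx.search(prompt))
    if any(label in BLOCK_ON for label in labels):
        # Denied prompts are never forwarded, so no text is returned.
        return "deny", labels, ""
    redacted = prompt
    for name in labels:
        redacted = PATTERNS[name].sub(f"[{name}]", redacted)
    return ("redact" if labels else "allow"), labels, redacted
```

The classification runs on the prompt string itself, so it applies regardless of which document the text was pasted from.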
Per-decision audit record committed before the response returns
Every decision the enforcement layer makes produces a signed, tamper-evident audit record with identity, role, policy version, classification, resource, outcome, and timestamp. The record is committed before the response returns to the application. The application that ran the request does not have custody of the write path, so it cannot alter or omit the record of its own behavior. The self-attestation problem is solved by construction.
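One way to make such records tamper-evident is to chain each record to its predecessor and authenticate the result. The sketch below uses HMAC-SHA256 from the Python standard library as a stand-in for a signature. The key handling, the `prev_sig` field, and the function names are assumptions, and a real deployment might use asymmetric signatures with keys held in a KMS or HSM.

```python
import hashlib
import hmac
import json

SIGNING_KEY = b"demo-key-held-by-the-enforcement-layer"  # hypothetical key for the sketch


def _mac(body: dict, key: bytes) -> str:
    # Canonical JSON so the same fields always produce the same bytes.
    payload = json.dumps(body, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def sign_record(fields: dict, prev_sig: str, key: bytes = SIGNING_KEY) -> dict:
    """Link the record to its predecessor, then authenticate the whole record."""
    record = dict(fields, prev_sig=prev_sig)
    record["sig"] = _mac(record, key)
    return record


def verify_chain(records: list, key: bytes = SIGNING_KEY) -> bool:
    """Recompute every MAC and check each link to the previous record."""
    prev_sig = ""
    for record in records:
        body = {k: v for k, v in record.items() if k != "sig"}
        if body.get("prev_sig") != prev_sig:
            return False
        if not hmac.compare_digest(_mac(body, key), record.get("sig", "")):
            return False
        prev_sig = record["sig"]
    return True
```

Editing a field breaks that record's MAC, and dropping a record breaks the link in the one that follows it, so both kinds of tampering are detectable on verification.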
Where the post-authentication gap shows up
Authentication answers who is calling. Authorization at the AI call layer answers whether this specific request, by this specific authenticated user, against this specific data classification, is permitted at this moment. Most deployments solve the first and skip the second. The Meta March 18 incident (an internal AI agent exposed sensitive data to engineers who were fully authenticated and had no business reading it) is the canonical example.
The Head of Security's working posture should treat AI authorization as a per-request control, not a session-level grant. The architecture that satisfies this requirement is the same architecture that satisfies the record-keeping evidence requirement in Article 12 of the EU AI Act.
Vendor and embedded AI in scope
The disclosure obligation under Fannie Mae LL-2026-04 and the EU AI Act follows the deployer. Vendor SaaS that embeds model calls under the hood puts the deployer on the hook for evidence the deployer cannot produce directly. The Head of Security's contracting posture should require vendor-side audit records the deployer can request on demand. Without that, the security review of vendor AI usage has a gap the deployer carries.
DeepInspect
This is the architecture DeepInspect provides for the security organization. DeepInspect sits at the AI request boundary as an external enforcement layer between authenticated users and agents and any LLM. Every request is evaluated against per-route, per-role policies using the identity context the application supplies. Every decision produces a signed audit record committed before the response returns to the application.
For the Head of Security, this is the production control on AI traffic. Inline. Identity-aware. Deterministic. The audit records feed the SOC's investigative tooling. The policy configuration is operated by the security team alongside the AI platform team. Vendor AI usage that traverses the enforcement layer produces the records the procurement and compliance teams need.
Frequently asked questions
- What is the highest-priority AI security control to land first?
The single control with the largest risk reduction is identity-aware enforcement at the AI request boundary with prompt-level data classification and a per-decision audit record. The control closes the post-authentication gap (which is the Meta-style risk), produces the Article 12 evidence (which is the regulatory risk), and surfaces shadow AI traffic the deployer was previously blind to (which is the $670,000 shadow AI risk). Programs that ship this control first usually find the other AI-specific controls become incremental hardening on a working architecture.
- How do we measure that the AI security control set is actually working?
Three metrics tell the security program whether the controls are working. First, the percentage of organizational AI traffic that traverses the enforcement layer (should approach 100% for sanctioned AI usage, and the gap is the shadow AI surface to address next). Second, the count of policy-denied requests per unit time, broken down by violation type (PII, NPI, role mismatch, model not approved for classification). Third, the time to reconstruct a specific AI interaction from the audit records (should be under one minute from query to record). Programs that have built the layer hit these numbers. Programs that have not built it cannot measure them.
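The first two metrics reduce to simple arithmetic over request counts and audit records. The sketch below assumes the total AI request count comes from some independent source, such as egress proxy logs, and that each audit record carries hypothetical `outcome` and `violation` fields.

```python
from collections import Counter


def coverage_pct(enforced_requests: int, total_ai_requests: int) -> float:
    """Percentage of observed AI requests that traversed the enforcement layer."""
    if total_ai_requests == 0:
        return 0.0
    return 100.0 * enforced_requests / total_ai_requests


def denials_by_type(records: list) -> Counter:
    """Count denied requests per violation type for one reporting window."""
    return Counter(r["violation"] for r in records if r["outcome"] == "deny")
```

For example, 950 enforced requests out of 1,000 observed gives 95.0% coverage, and the remaining 5% is the shadow AI surface to investigate.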
- Is the SOC ready to operate AI traffic as a monitored channel?
Most SOCs are not ready today. The investigation playbooks were written for endpoint, network, and identity telemetry. AI request telemetry is a new feed. The integration work includes onboarding the per-decision audit records into the SIEM, building correlation rules that connect AI activity to user behavior anomalies, and updating the incident response runbooks for AI-specific scenarios. The work is incremental. Teams that build the enforcement layer first have a feed to integrate. Teams that have not built the layer have nothing to onboard.
- How does this fit alongside our existing zero trust program?
Zero trust assumes no implicit trust based on network location and evaluates each request against identity, device, and policy. AI request enforcement is a per-request authorization decision at the AI call layer that uses the same identity context as the zero trust IAM stack. The two compose. The IAM stack supplies the verified identity. The AI enforcement layer evaluates the per-request decision against AI-specific policy. Programs that already operate a zero trust posture can extend it to AI traffic by adding the AI request boundary as a new policy decision point.
- What does the incident response runbook for an AI-specific incident look like?
The runbook covers four moments. First, the policy violation alert fires (a denied request that should not have happened, a redaction that hit unexpected content, an attempt by an authorized user to bypass policy). Second, the responder queries the per-decision audit records for the user, the resource, and the time window. Third, the responder correlates the AI activity with adjacent identity, endpoint, and network telemetry. Fourth, the responder acts: revoke or scope the user's AI authorization, update the policy version, and communicate with the affected business owner. The records are the spine of the runbook. Programs without the records have a runbook of guesses.
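The second step of the runbook, querying records by user, resource, and time window, can be sketched as a filter over the audit records. The field names and the sample data are hypothetical. In practice the query would run against the SIEM or the audit store instead of an in-memory list.

```python
from datetime import datetime, timedelta, timezone


def query_records(records, identity, resource, start, end):
    """Return one user's records for a resource within [start, end], oldest first."""
    hits = [
        r for r in records
        if r["identity"] == identity
        and r["resource"] == resource
        and start <= r["timestamp"] <= end
    ]
    return sorted(hits, key=lambda r: r["timestamp"])


# Usage with made-up records: pull 30 minutes either side of an alert.
alert_time = datetime(2026, 1, 15, 12, 0, tzinfo=timezone.utc)
route = "/v1/chat/completions"
records = [
    {"identity": "alice", "resource": route, "timestamp": alert_time, "outcome": "deny"},
    {"identity": "alice", "resource": route, "timestamp": alert_time - timedelta(minutes=10), "outcome": "allow"},
    {"identity": "alice", "resource": route, "timestamp": alert_time + timedelta(hours=2), "outcome": "allow"},
    {"identity": "bob", "resource": route, "timestamp": alert_time, "outcome": "deny"},
]
window = query_records(
    records, "alice", route,
    alert_time - timedelta(minutes=30), alert_time + timedelta(minutes=30),
)
```

The returned timeline is what the responder then correlates with identity, endpoint, and network telemetry in the third step.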