SOC 2 AI Controls: Which Trust Services Criteria a Policy Gateway Actually Evidences
SOC 2 does not have an AI-specific control category, but every AI deployment inside a Type II audit surfaces control gaps under the same five Trust Services Criteria. The auditor questions center on who accessed the model, what data flowed through it, whether policy enforcement is deterministic, and whether the audit trail is tamper-evident. Application-controlled logs fail the CC7 evidence bar. The fix is architectural.

The AICPA has not published an AI-specific SOC 2 supplement. Every SOC 2 Type II audit that includes AI systems in scope tests the AI deployment against the same five Trust Services Criteria that apply to any other production system. The pressure points shift. Auditors ask which identities called which model with which data, whether policy enforcement is deterministic, and whether the audit trail is independent of the application under test. Application-controlled logs, which pass a straight infrastructure SOC 2, tend to fail the AI-in-scope version. I want to walk through which Trust Services Criteria the auditor tests hardest when AI is in scope, and where the evidence has to come from.
The evidence a SOC 2 Type II auditor accepts for a boundary-crossing AI request is not the application's own log of what it did.
Which criteria stress under AI
The 2017 Trust Services Criteria are organized into five categories: Security, Availability, Processing Integrity, Confidentiality, and Privacy. Every SOC 2 report, Type I or Type II, includes Security. The other four are optional depending on the report scope. AI in production stresses Security, Confidentiality, and Processing Integrity hardest.
Under Security, the Common Criteria (CC) series carries most of the work. CC6 covers logical access. CC7 covers system operations. CC8 covers change management. Each of these series generates questions that a standard application deployment answers with existing evidence, and that an AI deployment answers with evidence the application does not produce today.
Confidentiality adds a layer on top when the AI system processes data the report scopes as confidential. Processing Integrity comes into play when the AI system's output is a decision the customer relies on (credit adjudication, insurance claim triage, clinical decision support), because the auditor tests whether processing was authorized, complete, accurate, and timely.
CC6.1 logical access requires per-request identity
CC6.1 requires the entity to implement logical access controls over information assets to protect them from security events. For a conventional application, the evidence is the SSO configuration, the role-based access matrix, and the access review log.
For an AI system, the evidence has to answer the question the auditor asks: which identity called which model with which data. A calling application that authenticates to OpenAI or Anthropic with a shared service account cannot answer that question. The credential identifies the application, not the natural person or agent on whose behalf the application acted. The auditor's follow-up question is direct: if this credential were revoked, which users would lose access? A shared service credential fails the specificity test.
The AI agent identity pillar walks through the identity binding pattern. In short, every AI request needs a verified identity claim on the wire, propagated to the model call and to the audit log.
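As a rough illustration of that binding, the sketch below attaches a verified identity claim to a single model call and produces the envelope that would travel to both the model call and the audit log. Every name here (`IdentityClaim`, `bind_request`, `affected_by_revocation`, and all field names) is hypothetical and chosen for illustration; it is not a published schema or any vendor's API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional
import uuid


@dataclass(frozen=True)
class IdentityClaim:
    """A verified identity claim; all field names are illustrative."""
    subject: str                        # natural person or agent, e.g. "user:alice"
    issuer: str                         # whoever verified the claim
    on_behalf_of: Optional[str] = None  # set when an agent acts for a user


def bind_request(claim: IdentityClaim, model: str, data_classes: list) -> dict:
    """Build the envelope that accompanies one model call and its audit record."""
    if not claim.subject:
        # No verified subject means no attributable request, so refuse it.
        raise ValueError("refusing model call without a verified subject")
    return {
        "request_id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "subject": claim.subject,
        "on_behalf_of": claim.on_behalf_of,
        "issuer": claim.issuer,
        "model": model,
        "data_classes": sorted(data_classes),
    }


def affected_by_revocation(envelopes: list, subject: str) -> list:
    """List the request IDs attributable to one subject, directly or via an agent."""
    return [e["request_id"] for e in envelopes
            if subject in (e["subject"], e["on_behalf_of"])]
```

With a shared service account, `affected_by_revocation` would have nothing to filter on, because every envelope would carry the same application-level subject.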
CC7.2 system monitoring requires tamper-evident audit trails
CC7.2 requires the entity to monitor system components and the operation of those components for anomalies indicative of malicious acts, natural disasters, and errors. The evidence is the SIEM configuration, alert routing, and incident tickets.
For AI in scope, the auditor's specific test is whether the audit trail of AI decisions can be trusted. Two failure modes recur. First, the application controls its own logs and can suppress or rewrite them. Second, the log lands in a store the application service account has write access to, which means the same identity that made the decision can rewrite the record of the decision.
The AI audit log immutability piece covers the storage-layer contract auditors accept. Object Lock, WORM buckets, and append-only stores with independent access credentials satisfy the tamper-evident bar. The AI audit log hashing patterns piece covers the cryptographic side.
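One common way to make a log tamper-evident on the cryptographic side is a hash chain, where each entry commits to the hash of the entry before it. The sketch below is a minimal version of that general technique using SHA-256. It is an assumption that this resembles what the referenced hashing piece describes, and the function names and entry layout are illustrative.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def record_hash(prev_hash: str, record: dict) -> str:
    """Hash the previous entry's hash together with a canonical form of the record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(chain: list, record: dict) -> None:
    """Append a record, linking it to the hash of the entry before it."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"record": record, "prev_hash": prev,
                  "hash": record_hash(prev, record)})


def verify(chain: list) -> bool:
    """Recompute every link; any edited or removed entry breaks the chain."""
    prev = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev:
            return False
        if entry["hash"] != record_hash(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True
```

A chain on its own only detects edits; someone with write access to the whole store could recompute every hash. That is why the storage-layer controls above (Object Lock, WORM buckets, independent credentials) still matter, for example by keeping the latest hash somewhere the application cannot write.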
CC7.4 incident response requires per-decision reconstructability
CC7.4 requires the entity to respond to identified security incidents by executing a defined incident response program. The evidence includes the runbook, the on-call rotation, and post-incident reviews.
For AI in scope, the auditor's stress test is whether the incident response team can reconstruct what an AI system did during the window of an incident. If a prompt injection attack against a customer support agent successfully caused the agent to exfiltrate customer records, the response team needs to answer three questions: which user or agent identity initiated the sessions, which records the model retrieved, and which policy decision (or absence of one) permitted the retrieval.
Application logs answer only the first question, and often incompletely. Per-decision audit records at the AI request boundary answer all three. The AI audit logs format spec covers the fields.
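To make the three questions concrete, the sketch below filters per-decision records to an incident window and pulls out the identities, the retrieved records, and the policy decisions. The record fields (`subject`, `retrieved_ids`, `policy_id`, `decision`) are assumptions made for this example and are not taken from the format spec.

```python
from datetime import datetime


def reconstruct(records: list, start: datetime, end: datetime) -> dict:
    """Summarize what happened at the AI boundary between start and end."""
    in_window = [r for r in records
                 if start <= datetime.fromisoformat(r["timestamp"]) <= end]
    return {
        # Question 1: which user or agent identities initiated the sessions
        "identities": sorted({r["subject"] for r in in_window}),
        # Question 2: which records the model retrieved
        "records_retrieved": sorted({rid for r in in_window
                                     for rid in r.get("retrieved_ids", [])}),
        # Question 3: which policy decision, or absence of one (None), applied
        "decisions": [(r["request_id"], r.get("policy_id"), r.get("decision"))
                      for r in in_window],
    }
```

A `policy_id` of `None` in the output is itself a finding: it marks a request that reached the model with no policy evaluated.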
CC8.1 change management applies to policy, not just code
CC8.1 requires the entity to authorize, design, develop or acquire, configure, document, test, approve, and implement changes to infrastructure, data, software, and procedures. For a conventional application, the evidence is the code review, deployment approval, and rollback plan.
For AI in scope, the criterion applies to the policies that govern what the AI can and cannot do. If a policy change (adding an allowed model, expanding a data classification's permitted use, loosening a rate limit) ships without change-management approval, the AI system's behavior changes without an audit trail. The policy-as-code piece covers the pattern that puts AI policies through the same review pipeline as code.
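One way to check that a running policy went through change management is to compare a digest of the deployed policy against the digests recorded at approval time. The sketch below assumes policies are plain JSON-serializable dictionaries and that approvals are stored as records with `digest` and `approved_by` fields. Both shapes are hypothetical.

```python
import hashlib
import json


def policy_digest(policy: dict) -> str:
    """Digest of a canonical JSON form, so key order does not change the result."""
    canonical = json.dumps(policy, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_approved(policy: dict, approvals: list) -> bool:
    """True only if this exact policy content has a recorded approver."""
    digest = policy_digest(policy)
    return any(a["digest"] == digest and bool(a["approved_by"])
               for a in approvals)
```

For example, loosening `rate_limit_per_min` from 60 to 600 changes the digest, so the loosened policy fails `is_approved` until a new approval is recorded for it.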
Confidentiality and PII
The Confidentiality criterion, when included in the report, requires the entity to identify and maintain confidential information consistent with its objectives. For AI, the specific test is whether confidential data flows through the model call and whether the flow is authorized.
The LLM DLP pillar covers the mechanism. In audit language, the evidence chain runs from data classification (which fields are confidential) to policy (which classes of data can enter which models) to enforcement (the request either passes or is blocked at the AI boundary) to record (the audit log shows which requests carried which classifications). Application-layer redaction produces the enforcement step but not the record step, so the evidence chain breaks.
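The four links of that chain can be sketched in a few lines: a classification-to-model policy table, a deterministic evaluation, and an audit record written for every decision, allowed or blocked. The policy table, model names, and record fields below are made up for illustration, and a plain list stands in for the audit store.

```python
# Policy: which data classifications may enter which models (illustrative values).
POLICY = {
    "public": {"model-a", "model-b"},
    "confidential": {"model-a"},
    "restricted": set(),
}


def evaluate(subject: str, model: str, data_classes: list, audit_log: list) -> str:
    """Decide allow/block for one request and record the decision either way."""
    # Unknown classifications fall through to an empty set, so they are denied.
    denied = sorted(c for c in data_classes if model not in POLICY.get(c, set()))
    decision = "block" if denied else "allow"
    # The record step: written on every request, not only on blocks.
    audit_log.append({
        "subject": subject,
        "model": model,
        "data_classes": sorted(data_classes),
        "denied_classes": denied,
        "decision": decision,
    })
    return decision
```

The point of the sketch is the last step. Redacting or blocking in the application without the `audit_log.append` leaves enforcement with no record, which is the break in the chain described above.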
DeepInspect
This is the gap DeepInspect closes. DeepInspect sits inline between the users or agents in your environment and the LLM APIs they call. For every request, the system binds the calling identity, evaluates policy against data classification and model authorization, and lands a per-decision record in an append-only audit store with write access separated from the application service account. For a SOC 2 Type II with AI in scope, that record is the evidence chain the auditor tests against CC6.1, CC7.2, CC7.4, and CC8.1 at once.
The audit evidence is designed for the SIEM formats auditors accept: OCSF, ECS, and CEF. The AI audit log SIEM formats piece covers the field mapping.
Book a technical deep dive at deepinspect.ai.
Frequently asked questions
- Does SOC 2 require an AI-specific control?
No. The AICPA has not published an AI supplement to the 2017 Trust Services Criteria. The AI deployment is tested against the same criteria as any other production system, which means the audit stress lands on CC6, CC7, and CC8 hardest.
- If our AI provider is SOC 2 compliant, are we covered?
Partially. OpenAI, Anthropic, and AWS Bedrock all carry their own SOC 2 reports, which cover their internal controls. The deployer's SOC 2 covers how the deployer's environment uses those providers. The auditor asks about the deployer's identity binding, audit trail, policy enforcement, and change management, none of which the provider's SOC 2 attests to.
- What is the difference between SOC 2 Type I and Type II for AI?
Type I attests that controls are designed appropriately at a point in time. Type II attests that controls operated effectively over a period, typically six or twelve months. The AI-in-scope stress is on Type II because the auditor tests operational evidence across the reporting period, which surfaces gaps that a point-in-time review misses.
- Can we exclude AI from SOC 2 scope?
Sometimes. If the AI system does not process customer data covered by the report and does not touch systems in scope, some organizations exclude AI. That exclusion is under pressure from customers and prospects who now include AI use as a diligence question. The AI security vendor evaluation criteria piece covers the buyer-side questions to expect.
- How does SOC 2 relate to ISO 42001?
SOC 2 is an attestation report against Trust Services Criteria. ISO 42001 is a management system standard for artificial intelligence. The controls overlap where CC6, CC7, and CC8 map to ISO 42001 Annex A controls on data governance, monitoring, and change management. The ISO 42001 vs ISO 27001 piece covers the ISO side.
- Where does policy-as-code fit in the SOC 2 evidence chain?
Under CC8.1. Policy changes that go through code review, testing, and deployment produce the change-management artifacts the auditor accepts. Policies that live in a UI configuration inside the AI gateway and change without version control produce a gap. The policy-as-code piece covers the pattern.