SOC 2 AI Controls: Mapping the Trust Services Criteria to AI Deployments

SOC 2 reports cover five Trust Services Categories: security, availability, processing integrity, confidentiality, and privacy. AI deployments touch all five. The evidence an auditor expects under the AICPA criteria has to be operational, not architectural. Application logs and policy documents fail. The records that pass are per request.

By Parminder Singh · Founder & CEO, DeepInspect Inc.
Compliance & Regulation · soc-2 · ai-governance · compliance · trust-services-criteria · audit · aicpa
A SOC 2 report under the AICPA Trust Services Criteria covers five categories: security, availability, processing integrity, confidentiality, and privacy. Security is the only required category. The other four are added when they are relevant to the services and commitments the report describes. AI deployments touch all five. The auditor evaluates whether the controls described in the report were suitably designed and operated effectively during the report period. Application logs and policy documents pass the design test. They fail the operational test once a Type II auditor starts asking questions about AI.

I want to walk through how each Trust Services Category applies to AI deployments, where the typical evidence falls short, and what records survive the auditor's testing.

Security and the common criteria

The security category is governed by the common criteria CC1 through CC9. The criteria cover the control environment, communication and information, risk assessment, monitoring of controls, control activities, logical and physical access controls, system operations, change management, and risk mitigation.

For AI deployments, the criteria that get specific testing under Type II audits are CC6 (logical and physical access controls) and CC7 (system operations). CC6.1 covers logical access security over protected information assets. CC6.6 covers logical access measures that protect against threats from sources outside the system boundary. CC6.7 covers transmission, movement, and removal of information. CC7.2 covers monitoring of system components for anomalies. CC7.3 covers evaluation of identified events and incidents.

The CC6.1 evidence the auditor expects shows the access controls applied to AI requests: who is permitted to submit prompts to which AI services, and which roles are authorized for which data classifications. The auditor will sample requests during the report period and trace the access control evaluation that took place at each one.
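To make the sampling concrete, here is a minimal Python sketch of a per-request access evaluation that returns the decision together with the inputs that produced it. The role names, the service name, the classification labels, and the `ROLE_POLICY` table are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass

# Hypothetical policy: role -> AI service -> data classifications allowed.
ROLE_POLICY = {
    "support_agent": {"chat-summarizer": {"public", "internal"}},
    "analyst": {"chat-summarizer": {"public", "internal", "confidential"}},
}


@dataclass(frozen=True)
class AccessDecision:
    user: str
    role: str
    service: str
    classification: str
    allowed: bool
    reason: str


def evaluate_access(user: str, role: str, service: str, classification: str) -> AccessDecision:
    """Evaluate one AI request and keep the inputs alongside the outcome."""
    services = ROLE_POLICY.get(role)
    if services is None:
        return AccessDecision(user, role, service, classification, False, "unknown role")
    allowed_classes = services.get(service)
    if allowed_classes is None:
        return AccessDecision(user, role, service, classification, False,
                              "service not permitted for role")
    if classification not in allowed_classes:
        return AccessDecision(user, role, service, classification, False,
                              "classification not permitted for role")
    return AccessDecision(user, role, service, classification, True, "permitted")
```

Because each decision carries the role, service, and classification it was evaluated against, a sampled request can be traced back to the rule that allowed or denied it.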

CC7.2 evidence covers the monitoring that runs against AI activity. The auditor expects records of monitoring actions, alerts generated, and the response to each alert. A monitoring program that produces no alerts during the report period is itself a finding because the auditor cannot test the response procedure.
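One way to check that every alert has a recorded response is sketched below in Python. The `Alert` fields and the `unanswered_alerts` helper are hypothetical names chosen for this example, not part of any monitoring product.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Alert:
    alert_id: str
    raised_at: datetime
    rule: str
    responded_at: Optional[datetime] = None
    response_note: Optional[str] = None


def unanswered_alerts(alerts: list[Alert]) -> list[str]:
    """Return the IDs of alerts that have no recorded response."""
    return [a.alert_id for a in alerts if a.responded_at is None]
```

Running a check like this before the audit window closes shows which alerts would leave the auditor without a response to test.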

Availability and the supplemental criteria

The availability category adds criteria A1.1 through A1.3 covering capacity planning, environmental protections, and recovery procedures. AI deployments raise availability questions when the AI workflow supports a business process that has its own availability requirement.

A1.1 covers capacity. The evidence shows that capacity planning for AI inference accounts for peak workload, vendor rate limits, and fallback paths when the primary AI service is unavailable. A1.2 covers environmental protections, which for AI deployments translates into the resilience of the cloud or on-premise infrastructure hosting the AI gateway. A1.3 covers recovery and the testing of recovery procedures.

The audit testing samples availability incidents during the report period. For each incident, the auditor asks for the detection timeline, the response actions, the recovery timeline, and the lessons-learned documentation. Availability incidents involving AI vendors are common because foundation model providers have had multiple multi-hour outages over the past 18 months. The auditor expects evidence of the entity's response to those outages.

Processing integrity and what the auditor tests

The processing integrity category adds criteria PI1.1 through PI1.5 covering the completeness, validity, accuracy, timeliness, and authorization of system processing. AI deployments raise processing integrity questions because model outputs are probabilistic.

PI1.4 covers timeliness. The auditor evaluates whether AI processing meets the timeliness expectations the entity has communicated to users. PI1.5 covers authorization. The auditor evaluates whether AI processing reflects the authorization established by the entity. The authorization piece is where most AI deployments fail. The entity has a written policy authorizing AI use for specific tasks. The operational evidence does not show which task the AI was actually used for at the moment of each request.

Processing integrity testing samples specific AI requests and traces each one to the authorization record. A request submitted to summarize customer service tickets should have an authorization record showing that the customer service workflow is approved for AI summarization. A request submitted to score credit decisions should fail because credit scoring is not on the approved use list. The auditor expects the operational record to align with the policy authorization. Application logs that record the API call without the use case classification fail this test.
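The trace from request to authorization can be sketched in a few lines of Python. The `APPROVED_USE_CASES` set and the `use_case` field are assumptions made for this example. The point is that the use case classification is captured on the request itself and written into the decision record.

```python
# Hypothetical approved-use list taken from the entity's written AI policy.
APPROVED_USE_CASES = {"ticket_summarization"}


def authorize_use_case(request: dict) -> dict:
    """Attach an allow/deny outcome and the reason to the request record."""
    use_case = request.get("use_case")
    if use_case is None:
        outcome, reason = "deny", "missing use case classification"
    elif use_case in APPROVED_USE_CASES:
        outcome, reason = "allow", "use case on approved list"
    else:
        outcome, reason = "deny", "use case not on approved list"
    return {**request, "outcome": outcome, "reason": reason}
```

A request with no use case at all is denied rather than passed through, which is the case a bare API-call log cannot distinguish.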

Confidentiality and the AI vendor relationship

The confidentiality category adds criteria C1.1 and C1.2 covering identification, classification, and protection of confidential information. AI deployments are a primary confidentiality concern because prompts and responses leave the entity's environment.

C1.1 covers identification and classification. The auditor evaluates whether the entity has identified what information is confidential and classified it. For AI usage, the testing asks whether the entity can show that confidential information was classified before it appeared in a prompt. C1.2 covers disposal of confidential information. For AI usage, the testing covers what happens to prompts and responses at the AI vendor.

The evidence the auditor expects under C1.1 is per-request classification of prompt content. The entity should have an automated control that classifies the prompt before it leaves the entity's environment, records the classification, and applies the policy for that classification. The evidence under C1.2 is the contractual scope with the AI vendor, the vendor's retention practices for prompts and responses, and the deletion records that prove confidential information was disposed of within the agreed retention.
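As a rough illustration of classifying a prompt before it leaves the environment, the Python sketch below uses two regular expressions. The patterns, the labels, and the `gate_prompt` helper are assumptions for the example. A real classifier would need far broader coverage than an email and a US Social Security number pattern.

```python
import re

# Illustrative patterns only; not a complete detector for confidential data.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def classify_prompt(prompt: str) -> dict:
    """Label the prompt and record which patterns matched."""
    findings = sorted(name for name, rx in PATTERNS.items() if rx.search(prompt))
    label = "confidential" if findings else "unclassified"
    return {"classification": label, "findings": findings}


def gate_prompt(prompt: str, allowed_labels: set[str]) -> dict:
    """Classify first, then decide whether the prompt may be forwarded."""
    result = classify_prompt(prompt)
    result["forwarded"] = result["classification"] in allowed_labels
    return result
```

The returned dictionary is the kind of per-request classification record the C1.1 testing asks for: the label, what triggered it, and what the policy did with it.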

Privacy and the personal information lifecycle

The privacy category adds criteria P1 through P8 covering notice, choice and consent, collection, use, retention and disposal, access, disclosure to third parties, quality, and monitoring and enforcement. Privacy is the category most affected by AI deployments because personal information routinely appears in prompts.

P3 covers collection. The auditor evaluates whether the entity collects personal information consistent with its privacy notice. AI workflows that draw personal information into prompts from customer records, support tickets, or workforce inputs are a collection activity. The testing asks whether the privacy notice covers that collection and whether the entity records what was collected.

P4 covers use, retention, and disposal. AI vendors that retain prompts for abuse monitoring or model improvement extend retention beyond what the entity may have disclosed to data subjects. The auditor expects evidence of the vendor's retention practices, the entity's documentation of that retention, and its consistency with the privacy notice.

P6 covers disclosure and notification, including disclosure to third parties. AI vendors are third parties. Each prompt that contains personal information is a disclosure of that information to the vendor. The auditor expects an accounting of those disclosures aligned with the privacy notice.
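If each request record carries a flag for personal information, the accounting of disclosures reduces to a grouped count. The Python sketch below assumes hypothetical `vendor` and `contains_personal_info` fields on each record.

```python
from collections import Counter


def disclosure_accounting(records: list[dict]) -> dict:
    """Count prompts that contained personal information, grouped by vendor."""
    counts = Counter(
        r["vendor"] for r in records if r.get("contains_personal_info")
    )
    return dict(counts)
```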

Where the audit evidence has to come from

For each Trust Services Criterion the report covers, the auditor needs operational records spanning the report period. The records have to be specific to the AI usage, not borrowed from other system controls.

Per AI request, the record set the auditor can work with contains: the workforce member or agent identity, the role and authorization in effect, the AI service called, the data classification of the prompt, the policy version that governed the decision, the decision outcome, the response classification, and the timestamp. The record is committed independently of the application that made the request and signed at creation to prevent later modification.

That record set supports access control testing under CC6, monitoring testing under CC7, processing integrity testing under PI1.5, confidentiality testing under C1.1 and C1.2, and privacy testing under P3, P4, and P6. One record set, five Trust Services Categories.
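A minimal Python sketch of a record that is signed at creation follows. It assumes HMAC-SHA256 over a canonical JSON encoding with a key held outside the calling application. That is one possible scheme, not a statement of how any particular product signs its records, and an asymmetric signature would serve equally well. The field names mirror the list above but are otherwise hypothetical.

```python
import hashlib
import hmac
import json


def canonical_bytes(record: dict) -> bytes:
    """Serialize with sorted keys so the same record always yields the same bytes."""
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_record(record: dict, key: bytes) -> dict:
    """Wrap the record with an HMAC-SHA256 signature computed at creation."""
    signature = hmac.new(key, canonical_bytes(record), hashlib.sha256).hexdigest()
    return {"record": record, "signature": signature}


def verify_record(signed: dict, key: bytes) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = hmac.new(key, canonical_bytes(signed["record"]), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signed["signature"])


# Example record using the fields described above (values are made up).
example = {
    "identity": "user-1234",
    "role": "support_agent",
    "service": "chat-summarizer",
    "prompt_classification": "internal",
    "policy_version": "v7",
    "decision": "allow",
    "response_classification": "internal",
    "timestamp": "2024-03-01T09:00:00Z",
}
```

Any change to a field after signing makes `verify_record` return `False`, which is what lets an auditor rely on a sampled record as it was written.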

The Type I to Type II transition

A SOC 2 Type I report covers the design of controls at a point in time. A Type II report covers the operating effectiveness of controls over a report period, typically 6 or 12 months. Many service organizations operate at Type I for the first cycle and move to Type II in the second cycle.

For AI controls, the transition surfaces the operational gap. Type I asks whether the entity designed controls to address each criterion. A written policy, a system description, and a control matrix typically pass Type I. Type II asks whether the controls operated effectively across the report period. The auditor samples instances and traces each one through the control. Type II testing is where application-logs-only evidence fails.

The entities that pass Type II without rework are the ones that built the operational evidence layer before the report period began. Adding the evidence layer mid-period leaves gaps the auditor cannot test through.
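A quick way to see a mid-period gap is to list the days in the report period that have no evidence records. The Python sketch below is an assumption-laden illustration: `days_without_evidence` is a hypothetical helper, and a day with no AI requests at all would also show up here, so the output is a starting point for review rather than a list of deficiencies.

```python
from datetime import date, timedelta


def days_without_evidence(period_start: date, period_end: date,
                          record_dates: set[date]) -> list[date]:
    """Return each day in the inclusive period that has no evidence record."""
    day, missing = period_start, []
    while day <= period_end:
        if day not in record_dates:
            missing.append(day)
        day += timedelta(days=1)
    return missing
```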

The relationship to other frameworks

SOC 2 is widely accepted as evidence of mature security controls. The Trust Services Criteria overlap with ISO 27001 Annex A controls, the NIST Cybersecurity Framework, and the HITRUST CSF. For AI deployments, the SOC 2 evidence layer also covers significant parts of EU AI Act Article 12, DORA Article 28, and NIST AI RMF Measure function expectations.

The infrastructure that produces SOC 2-grade AI evidence is the same infrastructure that produces evidence for other regulated frameworks. The Trust Services Criteria are mapped to AICPA assertions about service organization controls. The per-request record format is the implementation that satisfies the assertions.

DeepInspect

This is the architecture DeepInspect was built to provide. DeepInspect sits at the AI request boundary as a stateless proxy between authenticated users and agents and any LLM endpoint. Per-route policies enforce which AI vendor receives which workload, which roles can submit which classifications of data, and which retention scope applies to each request. Every decision produces a signed audit record covering identity, role, classification, vendor selected, policy version, decision outcome, and timestamp.

The record format satisfies SOC 2 Trust Services Criteria evidence requirements across all five categories. The same records support EU AI Act Article 12, DORA register data, NIST AI RMF Measure function, HIPAA audit controls, and ISO 27001 audit-logging requirements.

If you are preparing for a SOC 2 Type II report and your AI evidence is application logs and policy documents, the operational testing will surface the gap. Book a demo today.

Frequently asked questions

Do we need a separate SOC 2 for AI usage?

No. A single SOC 2 report covers all the services in scope. The AI controls become a section of the system description and a set of controls in the control matrix. The auditor tests the AI controls as part of the overall engagement. Entities that operate multiple distinct AI workloads sometimes split them across separate system descriptions when the data classifications and control sets differ materially.

Which Trust Services Criteria are most affected by AI?

Security (CC6 access controls and CC7 monitoring), confidentiality (C1.1 classification and C1.2 disposal), and privacy (P3 collection, P4 retention, and P6 disclosure to third parties) carry the largest AI-specific testing burden. Processing integrity (PI1.4 timeliness and PI1.5 authorization) and availability (A1.1 capacity, A1.3 recovery) apply when the AI workflow supports a business process with its own integrity or availability requirements.

How long is the SOC 2 Type II report period?

Type II reports typically cover 6 or 12 months. The first Type II after a Type I usually covers a 6-month period to compress the audit calendar. Subsequent Type II reports cover 12 months. The auditor expects operational evidence across the full report period. Adding controls during the period creates a gap the auditor flags as a control deficiency for the months before the control was operational.

Can the AI vendor's SOC 2 substitute for ours?

The AI vendor's SOC 2 covers the vendor's services. It does not cover the customer's use of those services. The customer's SOC 2 covers the customer's controls, including the controls that govern how the customer interacts with the AI vendor. The two reports are complementary. The customer should request and review the AI vendor's SOC 2 as part of its third-party risk assessment, and that review is itself a control in the customer's own SOC 2.

What is the difference between SOC 2 and SOC 3?

SOC 2 is a detailed report with controls testing results, intended for distribution under non-disclosure. SOC 3 is a public-facing summary report with the auditor's opinion but without detailed testing results. Most entities produce SOC 2 for customers and prospects who sign NDAs. Some entities produce SOC 3 alongside for public website use. Both reports cover the same Trust Services Criteria. The audit evidence requirements are identical.