The AI Vendor Security Questionnaire: 38 Questions Procurement Should Actually Ask
Most AI vendor security questionnaires are SOC 2 templates with two AI questions tacked on. The result is a procurement process that surfaces well-formatted SOC 2 reports while leaving the AI-layer risks unmapped. This article walks through 38 questions that surface what the vendor actually does at the AI request boundary: model coverage, identity context, per-decision audit, policy enforcement, prompt-injection handling, data residency, regulatory alignment, and incident response. The questions assume the vendor is supplying an AI-using service, not a model. Each question includes the answer pattern a defensible vendor produces and the answer pattern that should trigger a deeper review.

The questions a CISO or compliance lead should actually ask sit at a different layer than the SOC 2 template covers: the AI request boundary, the model coverage, the per-decision audit, the policy enforcement, the prompt-injection handling, the data residency, the regulatory alignment, and the incident response that fires when something at the AI layer breaks.
I want to walk through 38 questions a defensible AI vendor security review should include, grouped into seven categories.
Category 1: Model coverage and isolation (Questions 1-6)
The first category surfaces which models the vendor uses, how the vendor isolates customer data from the model provider's training pipelines, and what happens when the vendor changes its model.
1. Which model providers and which specific model versions does the service call?
A defensible answer names the providers (OpenAI, Anthropic, Google, AWS Bedrock, Azure OpenAI, self-hosted) and the specific model versions in production. A vague answer ("we use leading LLM providers") should trigger a deeper review.
2. How does the vendor pin model versions and notify customers of model changes?
A defensible vendor pins versions per-tenant or per-route and gives customers a notice window before changing the underlying model. A vendor that silently rotates models leaves the customer exposed to behavior changes that affect compliance posture.
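To make the pinning and notice-window idea concrete, here is a minimal sketch of what a per-route pin table and a notice check could look like. The route names, model identifiers, and the 30-day window are all hypothetical; a real vendor's mechanism will differ, and the point is only what to ask the vendor to show.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class ModelPin:
    provider: str
    model_version: str


# Hypothetical per-route pins; names are placeholders, not real model IDs.
PINS = {
    "support-chat": ModelPin("provider-a", "model-x-2025-01-15"),
    "contract-review": ModelPin("provider-b", "model-y-v3"),
}

# Assumed contractual notice window before a pinned model may change.
NOTICE_WINDOW = timedelta(days=30)


def resolve_model(route: str) -> ModelPin:
    """Return the pinned model for a route; refuse routes with no pin."""
    try:
        return PINS[route]
    except KeyError:
        raise LookupError(f"no pinned model for route {route!r}") from None


def change_is_permitted(notified_on: date, effective_on: date) -> bool:
    """A model change is allowed only if the customer had the full notice window."""
    return effective_on - notified_on >= NOTICE_WINDOW
```

A route without a pin raises instead of falling back to a default model, which is the behavior that keeps silent rotation from happening by accident.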
3. What contractual terms with the model provider prevent customer data from being used for model training?
Most major model providers offer enterprise-tier APIs with no-training guarantees. The vendor's contract should reference the specific term. A vendor that cannot point to the specific clause is operating on assumption.
4. Are customer prompts and responses subject to the model provider's retention?
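The answer depends on the model provider's contract and tier. Retention defaults differ across providers and change over time: some enterprise arrangements offer zero data retention, while others keep prompts and responses for a limited window (30 days is a common figure) for abuse or safety review unless that is waived. The vendor should know the current answer for each provider in use and be able to point to the governing term.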
5. How does the vendor isolate tenants at the model call layer?
The vendor uses one of three patterns: per-tenant API keys, per-tenant deployments, or shared keys with logical isolation. The last pattern carries the most cross-tenant risk. Ask for the isolation diagram.
6. Can the customer route AI traffic through the customer's own provider account?
A vendor that allows BYO model account (the customer brings their own OpenAI workspace, their own Bedrock account, etc.) puts the customer in direct control of the model provider's data terms. A vendor that requires its own model account makes the customer dependent on the vendor's contract with the provider.
Category 2: Identity and authorization (Questions 7-12)
7. How does the service receive and verify the end-user identity making a request?
Most AI services receive an API key from the customer application, with the end-user identity passed as a payload field. Verification depends on the customer trusting the application to forward the identity correctly. Ask whether the vendor supports OIDC, SAML, or signed identity tokens that the vendor verifies directly.
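The difference between a forwarded payload field and a token the vendor verifies directly can be sketched in a few lines. This is a deliberately simplified illustration using an HMAC-signed token from the Python standard library; it is not OIDC, SAML, or JWT, and the function names and claim fields are assumptions made for the example. A production service would normally verify tokens issued by the customer's identity provider with asymmetric keys.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_identity(claims: dict, key: bytes) -> str:
    """Issuer side: serialize the claims and attach an HMAC-SHA256 signature."""
    payload = _b64encode(json.dumps(claims, sort_keys=True).encode("utf-8"))
    signature = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).digest()
    return f"{payload}.{_b64encode(signature)}"


def verify_identity(token: str, key: bytes, now: Optional[float] = None) -> dict:
    """Vendor side: return the claims only if signature and expiry both check out."""
    payload, _, signature = token.partition(".")
    expected = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).digest()
    try:
        presented = _b64decode(signature)
    except ValueError:
        raise ValueError("malformed signature") from None
    if not hmac.compare_digest(expected, presented):
        raise ValueError("signature mismatch")
    claims = json.loads(_b64decode(payload))
    current = time.time() if now is None else now
    if not isinstance(claims, dict) or claims.get("exp", 0) <= current:
        raise ValueError("token expired or missing exp")
    return claims
```

The signature is checked before the payload is parsed, so a request carrying an unverifiable identity never reaches policy evaluation.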
8. What identity attributes are available for policy decisions?
The richer the identity context, the more granular the policy. A vendor that only sees a tenant ID and a user ID cannot enforce role-based or department-based policy. A vendor that consumes the customer's SSO claims (role, department, clearance) can enforce policy at the granularity the customer needs.
9. How does the vendor enforce least privilege at the AI request layer?
The answer should describe the per-route, per-role policy model the vendor implements. A vendor that says "we follow least privilege" without specifying the policy primitives is using a phrase, not a control.
10. How does the service handle service-account and agent identities?
Agent identities are the operative pattern in agentic deployments. The vendor should describe how it treats agent identities, what authorization model applies, and how it distinguishes an authenticated agent from a compromised one.
11. What is the authentication mechanism between the customer application and the service?
The answer should reference an industry-standard authentication pattern: mTLS, OAuth 2.0 with rotating tokens, signed JWT with short expiry. API keys without rotation are a procurement red flag.
12. How are administrative actions on the vendor's service authenticated and audited?
The vendor's administrative actions are the highest-risk path. The answer should describe MFA-required administrative access, separate-channel audit logs for admin actions, and SCIM-based access provisioning.
Category 3: Per-decision audit and traceability (Questions 13-18)
13. For each AI request, what is captured in the audit record?
The defensible record includes the requester identity, the policy version in force, the data classification of the prompt, the model called, the decision outcome, and a timestamp. A vendor that records only "request processed" produces evidence that fails the EU AI Act Article 12 traceability test.
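As a rough picture of what such a record could look like, the sketch below models the six fields named above as an immutable structure. The field names and the allow/deny vocabulary are assumptions for illustration; a vendor's actual schema will have its own names and likely more fields.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AuditRecord:
    requester_id: str         # verified end-user or agent identity
    policy_version: str       # policy in force when the decision was made
    data_classification: str  # classification assigned to the prompt
    model: str                # model the request was routed to
    decision: str             # "allow" or "deny" in this sketch
    timestamp: str            # UTC, ISO 8601


def make_record(requester_id: str, policy_version: str, data_classification: str,
                model: str, decision: str,
                now: Optional[datetime] = None) -> AuditRecord:
    """Build one per-decision record, rejecting outcomes outside the vocabulary."""
    if decision not in ("allow", "deny"):
        raise ValueError(f"unknown decision {decision!r}")
    moment = now if now is not None else datetime.now(timezone.utc)
    return AuditRecord(requester_id, policy_version, data_classification,
                       model, decision, moment.isoformat())


if __name__ == "__main__":
    record = make_record("user-17", "policy-v7", "pii", "model-x", "deny")
    print(json.dumps(asdict(record), indent=2))
```

When reviewing a vendor's sample record, the useful check is whether every one of these fields is present for every decision, including denied requests.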
14. Are the audit records tamper-evident?
Tamper-evident records carry a cryptographic signature or a hash chain that lets an inspector confirm the record has not been modified. The defensible answer references the specific tamper-evidence mechanism. "We use append-only storage" is a weak proxy.
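A hash chain is the simpler of the two mechanisms to illustrate. In the sketch below, each entry stores the hash of the previous entry, so editing any earlier record breaks verification from that point on. The entry layout and function names are assumptions for the example. Note that a chain alone does not stop someone with write access from rewriting the entire chain; in practice the latest hash would also need to be signed or anchored somewhere the writer cannot reach.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, record: dict) -> str:
    # Canonical JSON so the same record always hashes to the same value.
    body = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain: list, record: dict) -> None:
    """Add a record, linking it to the hash of the entry before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"record": record, "prev_hash": prev_hash,
                  "hash": _digest(prev_hash, record)})


def verify(chain: list) -> bool:
    """Recompute every link; any modified or reordered entry fails the check."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True
```

An inspector only needs the exported entries and the `verify` logic to confirm integrity, which is the property that append-only storage by itself does not provide.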
15. Who has write access to the audit records?
If the application that produces the AI decision also writes the audit log, the customer has a self-attestation problem. The defensible architecture separates the write path from the application. The audit record exists regardless of what the application does after the fact.
16. How long are audit records retained, and who controls the retention period?
Retention should be a customer-configurable policy. Six months is the EU AI Act Article 26 floor for deployer logs. HIPAA requires six years. Financial services regimes typically require seven. The vendor should support the customer's regulatory regime.
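When several regimes apply, the longest floor governs. The arithmetic is simple enough to sketch; the regime keys are invented labels, and the seven-year figure is the "typical" value from the text rather than a citation of any specific rule.

```python
# Retention floors from the paragraph above, in months.
# "financial_services" uses the typical seven-year figure, not a specific rule.
RETENTION_FLOOR_MONTHS = {
    "eu_ai_act_art26": 6,
    "hipaa": 72,
    "financial_services": 84,
}


def required_retention_months(regimes) -> int:
    """The longest floor among the applicable regimes governs."""
    unknown = set(regimes) - RETENTION_FLOOR_MONTHS.keys()
    if unknown:
        raise KeyError(f"no retention floor recorded for: {sorted(unknown)}")
    return max((RETENTION_FLOOR_MONTHS[r] for r in regimes), default=0)


def retention_is_sufficient(regimes, configured_months: int) -> bool:
    """Check a vendor's configured retention against the customer's regimes."""
    return configured_months >= required_retention_months(regimes)
```

For example, a customer subject to both the EU AI Act deployer obligation and HIPAA needs at least 72 months, so a vendor capped at 12 months of retention fails the check.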
17. Can the customer export the raw audit records?
The customer needs the ability to export records for regulatory inspection, internal audit, or migration. A vendor that locks audit records inside its own dashboard creates a single point of failure for compliance evidence.
18. Does the audit record include the policy state at the moment of the decision?
The policy state at the moment of decision is the difference between a defensible record and a useless one. A record that captures "request blocked" without the policy that made the block cannot reconstruct the decision under inspection. The defensible record pins the policy version to each decision.
Category 4: Policy enforcement and prompt-injection handling (Questions 19-24)
19. What policy primitives does the service expose?
The primitives are the building blocks the customer's compliance team will reason with. The defensible set includes per-route policy, per-role policy, per-data-classification policy, per-model-target policy, and per-tenant policy. A vendor with only on-off enforcement at the tenant level is operating at a coarse granularity.
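One way to picture these primitives is as rules that can constrain any combination of route, role, data classification, model target, and tenant. The sketch below is a hypothetical rule model with first-match-wins ordering and a default of deny; real policy engines vary in both syntax and evaluation order, so treat the names and semantics as assumptions.

```python
from dataclasses import dataclass
from typing import Optional

FIELDS = ("route", "role", "classification", "model", "tenant")


@dataclass(frozen=True)
class Rule:
    effect: str  # "allow" or "deny"
    route: Optional[str] = None
    role: Optional[str] = None
    classification: Optional[str] = None
    model: Optional[str] = None
    tenant: Optional[str] = None

    def matches(self, request: dict) -> bool:
        """A field left as None is a wildcard; set fields must match exactly."""
        for name in FIELDS:
            wanted = getattr(self, name)
            if wanted is not None and request.get(name) != wanted:
                return False
        return True


def decide(rules, request: dict) -> str:
    """First matching rule wins; a request that matches nothing is denied."""
    for rule in rules:
        if rule.matches(request):
            return rule.effect
    return "deny"
```

With these primitives a compliance team can write something like "PHI never goes to the external model, but clinicians may use the summarize route otherwise" as two rules, which tenant-level on-off enforcement cannot express.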
20. How does the service handle policy ambiguity or evaluation failure?
The defensible posture is fail-closed: when the policy evaluation cannot complete, the request is denied. The customer should ask for the specific default and the configuration options. Fail-open defaults are a procurement red flag.
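Fail-closed behavior comes down to how errors and unexpected results are handled around the policy evaluator. A minimal sketch, assuming a hypothetical `evaluate` callable that returns "allow" or "deny":

```python
def evaluate_fail_closed(evaluate, request: dict) -> dict:
    """Wrap a policy evaluator so that any failure or ambiguity results in deny."""
    try:
        decision = evaluate(request)
    except Exception as exc:  # timeout, engine outage, malformed policy, etc.
        return {"decision": "deny",
                "reason": f"policy evaluation failed: {type(exc).__name__}"}
    if decision not in ("allow", "deny"):
        # An unrecognized or missing result is treated as ambiguity, not consent.
        return {"decision": "deny", "reason": "ambiguous policy result"}
    return {"decision": decision, "reason": "policy evaluated"}
```

The reason string is kept so the audit record can distinguish a policy denial from a denial caused by an evaluation failure, which matters when reconstructing an incident.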
21. How does the service detect and respond to prompt injection?
The defensible answer includes signal-level detection (classifiers, pattern matching), policy-level enforcement (data classification escalation, route fail-closed), and audit-level evidence (the detection signal recorded with the decision). A vendor that says "our model is fine-tuned to resist prompt injection" is relying on the model rather than enforcing at the request layer.
22. How does the service handle indirect prompt injection in retrieved content?
The retrieved content path is the most common indirect injection surface. The defensible answer describes a content classification step on retrieved data, with the classifier output feeding the policy decision. A vendor that processes retrieved content without classification is exposed.
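The shape of that flow (classify each retrieved chunk, feed the result into the decision, keep the signal as evidence) can be sketched as follows. The regular expressions are toy patterns chosen for illustration and are easy to evade; a real deployment would rely on trained classifiers and a richer policy response than a flat deny.

```python
import re

# Illustrative patterns only; real detection would not rely on a short regex list.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"disregard (the )?system prompt", re.IGNORECASE),
    re.compile(r"you are now\b", re.IGNORECASE),
]


def classify_chunk(text: str) -> list:
    """Return the patterns that fired for one retrieved chunk."""
    return [p.pattern for p in SUSPICIOUS_PATTERNS if p.search(text)]


def decide_on_retrieval(chunks) -> dict:
    """Deny if any chunk is flagged, and keep the signals for the audit record."""
    signals = {}
    for index, chunk in enumerate(chunks):
        hits = classify_chunk(chunk)
        if hits:
            signals[index] = hits
    return {"decision": "deny" if signals else "allow", "signals": signals}
```

The returned `signals` map is the part to look for in a vendor's answer: it is what lets the audit record show why a request carrying retrieved content was blocked.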
23. What is the latency overhead of the policy enforcement?
For a real-time deployment, the latency budget for enforcement is typically under 100ms. The vendor should be able to cite a specific number from internal benchmarking, and the customer should be able to verify it in a proof-of-concept.
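During a proof-of-concept, the customer can check the vendor's number with a small timing harness. The sketch below compares the 95th-percentile latency with and without enforcement against a budget. The 100 ms default comes from the text; the function names are hypothetical, and the difference of two p95 values is only an approximation of the true overhead distribution.

```python
import time


def p95(samples_ms) -> float:
    """Nearest-rank 95th percentile of a list of latency samples."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer arithmetic
    return ordered[rank - 1]


def time_calls_ms(fn, runs: int = 200) -> list:
    """Call fn repeatedly and record wall-clock latency in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def within_budget(baseline_ms, enforced_ms, budget_ms: float = 100.0) -> bool:
    """Approximate enforcement overhead as the gap between the two p95 values."""
    return p95(enforced_ms) - p95(baseline_ms) <= budget_ms
```

In use, `baseline_ms` would come from timing direct model calls and `enforced_ms` from timing the same calls through the vendor's enforcement path, both on the customer's own traffic.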
24. Can the customer bring its own policy?
Customer-authored policy is the path to a defensible compliance position. The customer's compliance team should be able to express the policy in a primitive the vendor implements, without relying on the vendor to interpret the regulatory regime correctly. A vendor that only supports vendor-curated policy leaves the regulatory interpretation in vendor hands.
Category 5: Data residency and processing locations (Questions 25-29)
25. In which geographies is the customer data processed?
The defensible answer is specific by region and by data type. A vendor that says "we operate in multiple regions" leaves the customer exposed under data-transfer and localization rules such as GDPR, the Data Protection Act in the UK, and sector-specific regimes.
26. Can the customer pin processing to a specific region?
EU customers typically need EU-only processing, especially under the EU AI Act high-risk regime. The vendor should support region pinning at the model-call layer, the audit-storage layer, and the support-access layer.
27. Does the model provider's data processing also stay in the pinned region?
A vendor that pins its own processing to the EU but calls a model API that processes in the US has not satisfied region pinning. The vendor should be able to demonstrate the model call routes to a region-aligned endpoint.
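Region pinning only holds if every hop stays in the region, so the check is a comparison across layers rather than a single setting. A minimal sketch, with hypothetical layer names and region labels:

```python
def region_violations(pinned_region: str, layers: dict) -> list:
    """Return the layers whose processing region differs from the pinned region."""
    return sorted(name for name, region in layers.items()
                  if region != pinned_region)


if __name__ == "__main__":
    # Hypothetical deployment: the vendor is pinned to the EU,
    # but the model API it calls processes in the US.
    deployment = {
        "vendor_processing": "eu",
        "model_endpoint": "us",
        "audit_storage": "eu",
        "support_access": "eu",
    }
    print(region_violations("eu", deployment))
```

Asking the vendor to fill in a table like `deployment` for each provider in use is a quick way to surface the gap this question is aimed at.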
28. What is the data classification taxonomy the service applies to prompts and responses?
The defensible taxonomy includes PII, PHI, payment card data, source code, intellectual property, and customer-defined classes. A vendor that classifies only on a "sensitive / not sensitive" binary cannot support most regulated workflows.
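To show why a multi-class taxonomy is more useful than a binary flag, the sketch below returns a set of labels per prompt and accepts customer-defined classes. The patterns are simplified assumptions (an email address stands in for PII, and card numbers are confirmed with a Luhn checksum); production classifiers cover far more cases.

```python
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def luhn_valid(number: str) -> bool:
    """Luhn checksum, used to separate card numbers from other digit runs."""
    digits = [int(c) for c in number if c.isdigit()]
    if len(digits) < 13:
        return False
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def classify(text: str, custom_patterns=None) -> set:
    """Return every class label that applies to the text."""
    labels = set()
    if EMAIL.search(text):
        labels.add("pii")
    if any(luhn_valid(m.group()) for m in CARD_CANDIDATE.finditer(text)):
        labels.add("payment_card")
    for label, pattern in (custom_patterns or {}).items():
        if re.search(pattern, text):
            labels.add(label)  # customer-defined class
    return labels
```

Because the result is a set, a single prompt can carry several classes at once, and each class can map to a different policy outcome.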
29. How does the service handle data subject rights under GDPR for prompts that contain personal data?
The defensible answer describes the process for handling access, rectification, erasure, and portability requests against data that flowed through prompts. The audit records support the process; a vendor without per-decision records cannot satisfy the rights without manual reconstruction.
Category 6: Regulatory alignment (Questions 30-34)
30. How does the service support EU AI Act Article 12 traceability?
The defensible answer references the per-decision audit records, the retention configuration, the export capability, and the deployer-vs-provider obligation split. A vendor that says "we are EU AI Act ready" without specifying the Article 12 mechanism is producing a marketing answer.
31. How does the service support HIPAA audit requirements for AI-mediated decisions on PHI?
The defensible answer references the BAA, the per-decision audit records that include PHI access metadata, the six-year retention, and the access controls that align with the HIPAA Security Rule.
32. How does the service support the NIST AI Risk Management Framework?
The defensible answer maps the service's capabilities to the GOVERN, MAP, MEASURE, and MANAGE functions, with specific reference to which artifacts the service produces for each function.
33. How does the service support Fannie Mae LL-2026-04 governance requirements for AI-supported lending decisions?
The defensible answer (for vendors selling into lending) references the per-decision records the lender needs to demonstrate AI-supported decisions and the policy enforcement that bounds the AI's influence on the decision.
34. What is the vendor's position on emerging state-level AI legislation (Colorado SB 26-189, Texas TRAIGA, California AI Transparency Act)?
A vendor with active monitoring of state-level legislation produces an answer with specific reference to the act and the service's alignment. A vendor without a structured monitoring program is operating on assumption.
Category 7: Incident response (Questions 35-38)
35. What is the vendor's incident response process for an AI-layer compromise?
The defensible answer references the AI-layer incident classes (prompt injection in production, indirect prompt injection, agent tool-call escalation, model extraction, LLM-driven post-exploitation), the detection signals, the containment actions, and the customer notification timeline.
36. How is the customer notified of an incident affecting the customer's data or workloads?
The notification timeline should be specified by SLA in the contract. The notification channel should be more reliable than email-only. The customer should receive enough detail to begin its own incident response.
37. How does the vendor coordinate with the customer's SOC during an incident?
The defensible answer references a runbook, a contact path, and the data the vendor can share with the customer's SOC during an incident. A vendor without a SOC coordination plan delays the customer's response.
38. What is the vendor's process for post-incident review and customer post-mortem?
The defensible answer includes a written post-mortem within a specified time window, with the root cause, the remediation, and the changes the vendor will make to prevent recurrence. Post-mortems are part of the vendor's evidence trail and should be available to the customer's audit and compliance teams.
DeepInspect
This is the architecture DeepInspect is built on. DeepInspect sits at the AI request boundary as a stateless proxy between authenticated users or agents and the LLM endpoints, enforces identity-bound policy on every request, and records per-decision audit records with policy version, identity context, data classification, and decision outcome.
For procurement teams running an AI vendor security review, DeepInspect's architecture maps the questions above to concrete answers: named model providers and pinned versions, signed identity tokens, customer-authored policy, tamper-evident per-decision records, region pinning, and an AI-layer incident response playbook with SOC coordination.
If you are running the procurement gate on an AI deployment and the questions above surface gaps in the vendor's answers, book a demo today.
Beyond the questionnaire
The questionnaire is not the end of the procurement gate. The defensible procurement process includes a proof-of-concept that runs the vendor's service against the customer's own traffic, an audit-evidence review that confirms the audit records meet the customer's regulatory regime, and a contractual review that pins the data terms, the retention, the region, and the incident response SLA.
The vendor's answers to the questionnaire are the input to the process. The proof-of-concept and the contract are where the answers get tested.
Frequently asked questions
- How long should the questionnaire take to complete?
A defensible vendor with mature AI-layer practices completes a 38-question review in two to four weeks, including supporting documentation. A vendor that takes longer is either still building the answers or is unable to map the questions to specific architecture. Both are signals.
- Should the questionnaire be the same across all AI vendor categories?
The questions are organized so the customer can scope to the vendor's category. A model-using SaaS vendor needs all 38. A pure model provider needs the first six and the data residency category. An agent framework vendor needs the first six, the policy enforcement category, and the incident response category.
- How does this questionnaire interact with SIG, CAIQ, or other industry frameworks?
The questionnaire is meant to complement SIG and CAIQ, not replace them. The SIG and CAIQ frameworks cover the cloud security baseline. The 38 questions above cover the AI-specific layer the standard frameworks do not address.
- What if the vendor refuses to answer specific questions?
A vendor refusal is itself an answer. The procurement team records the refusal, the vendor's stated reason, and the customer's residual risk. A vendor that refuses to answer questions about model coverage, audit records, or incident response is a higher residual-risk choice than a vendor that answers transparently.
- How often should the questionnaire be re-run for an existing vendor?
Annually at minimum, with a triggered re-run when the vendor announces a material change (new model provider, new region, new policy primitive, post-incident remediation). The annual cadence aligns with most enterprise vendor management programs.
- What about open-source AI components in the vendor's stack?
The vendor's open-source components are part of the supply chain that the AI-layer incident response has to cover. The vendor should be able to enumerate the open-source AI components in the stack and the monitoring it applies to known CVEs (the Marimo CVE-2026-39987 is the recent case). A vendor without supply-chain visibility leaves the customer exposed.