HIPAA-compliant LLMs: what the deployer has to produce when OCR shows up

HIPAA does not approve LLMs. HIPAA places obligations on covered entities and business associates around how PHI gets used, accessed, and audited. When OCR opens a complaint review of a clinical AI deployment, the questions are specific: who accessed PHI in what context, with what authorization, with what evidence. This piece walks through what HIPAA actually requires from an AI deployment, what a Business Associate Agreement does and does not cover, and the architecture that produces the audit artifact OCR will ask for.

By Parminder Singh · Founder & CEO, DeepInspect Inc.
Compliance & Regulation · hipaa, healthcare, phi, compliance, audit, clinical-ai

A hospital runs an LLM-backed clinical decision support tool. A clinician submits a patient's chart to the model for diagnosis suggestions. The model returns three differential diagnoses. The clinician acts on one. Six months later, the Office for Civil Rights opens a complaint review after the patient's PHI surfaces in an unrelated data leak. OCR asks the hospital for the audit record of every AI interaction with the patient's chart in the prior 12 months. The hospital's application produces a list of sessions. The list shows the clinician's user ID and the timestamp. It does not show the prompt, the model version, the data classification, or whether the chart was shared beyond the hospital's BAA-covered providers. The session log does not satisfy the request.

I want to walk through what HIPAA actually requires of an AI deployment, what the Business Associate Agreement does and does not cover, where the gap shows up under OCR review, and the architectural shape of the artifact a hospital has to produce.

What HIPAA obliges around AI

HIPAA's Privacy Rule and Security Rule both predate generative AI. The text of 45 CFR Parts 160 and 164 does not mention LLMs. The obligations apply anyway because AI processing PHI is a use or disclosure under the Privacy Rule and an access of ePHI under the Security Rule.

Three obligations bear on AI deployments specifically.

Minimum necessary. Section 164.502(b) limits use and disclosure of PHI to the minimum necessary for the intended purpose. Sending an entire chart to a model when only the medication list is required violates the standard. The deployer has to evidence that the prompt contained the minimum necessary content.

Access controls. Section 164.312(a) requires technical access controls on ePHI, including a unique identifier for each user so that activity can be traced to a person. The AI request that touched ePHI has to carry the identity of the natural person on whose behalf the access happened. The application service credential identifying the EHR is not the access identity OCR is looking for.

Audit controls. Section 164.312(b) requires hardware, software, and procedural mechanisms that record and examine activity in information systems containing ePHI. The audit record has to be examinable, not merely stored. The per-AI-decision record falls under this obligation.

What a BAA covers and what it does not

The Business Associate Agreement is the contract that makes a vendor a HIPAA business associate. The BAA obliges the vendor to:

  • Use PHI only as permitted by the BAA.
  • Apply safeguards consistent with the Security Rule.
  • Report breaches to the covered entity.
  • Make PHI available to the individual, to the covered entity, and to HHS on request.
  • Return or destroy PHI at termination.

What the BAA does not do: shift the covered entity's audit and disclosure obligations to the vendor. The hospital still has to produce its own evidence under OCR review. The BAA is necessary but not sufficient.

Many LLM vendors sign BAAs at the enterprise plan tier. OpenAI's Enterprise plan, Anthropic's enterprise tier, Microsoft Azure OpenAI, and Amazon Bedrock all support BAA arrangements; check each vendor's current terms for which services are in scope. The signed BAA is the entry condition for clinical use. It is not the audit artifact.

The OCR review pattern

OCR reviews proceed by questions. The questions on an AI-related complaint review have followed a predictable pattern across the cases that have surfaced publicly. The covered entity must produce, in writing:

  1. The list of AI tools that processed the affected PHI in the relevant period.
  2. The BAA status of each tool.
  3. The minimum-necessary justification for the data sent to each tool.
  4. The identity of the natural person on whose behalf each AI call was made.
  5. The policy that governed each call (data classification, redaction rules, model selection).
  6. The audit record showing the request, the response, and the action taken.

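The application's session log produces only partial evidence for items 1 and 4: it shows that a tool was used and which user ID was logged in, not what was sent or on whose behalf. Items 2, 3, 5, and 6 require an audit artifact the application does not produce.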
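One way to make the gap concrete is to check a log entry against the six items. The sketch below is illustrative only: the field names and the mapping from items to fields are assumptions, not an OCR schema, and it treats a field's presence as enough to answer an item, which is more generous than a real review would be.

```python
# Hypothetical mapping from the six OCR items to the record fields
# that would evidence them. Field names are assumptions for illustration.
OCR_ITEM_FIELDS = {
    1: ["tool"],
    2: ["baa_status"],
    3: ["minimum_necessary_basis"],
    4: ["principal"],
    5: ["policy_version"],
    6: ["request", "response", "action_taken"],
}


def unanswered_items(record: dict) -> list[int]:
    """Return the OCR item numbers this record cannot evidence."""
    missing = []
    for item, fields in OCR_ITEM_FIELDS.items():
        # An item is unanswered if any of its required fields is absent or empty.
        if any(not record.get(field) for field in fields):
            missing.append(item)
    return missing


# A typical application session log entry: user ID and timestamp, little else.
session_log_entry = {
    "tool": "cds-assistant",
    "principal": "clinician-4821",
    "timestamp": "2025-01-15T10:32:00Z",
}

print(unanswered_items(session_log_entry))  # prints [2, 3, 5, 6]
```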

Cloud Radix found that 57% of healthcare professionals use unauthorized AI

Cloud Radix's 2026 survey reported that 57% of healthcare professionals use unauthorized AI tools to process PHI: SOAP notes, diagnostic plans, billing summaries. None of those tools sit under a BAA. None produce the audit artifact. None apply the minimum-necessary test. The hospital owns the disclosure obligation anyway because the workflow is happening inside the hospital's environment.

The shadow AI surface is the dominant compliance exposure in healthcare. The remediation is not a policy memo telling clinicians to stop. The remediation is making the sanctioned path the path of least resistance and instrumenting the entire AI surface with audit.

What an architecture has to do

Four properties answer the OCR question set.

Coverage of all AI traffic, sanctioned and shadow. The hospital needs visibility into every AI request touching PHI, including the LLM calls inside vendor SaaS that the hospital does not directly control. A gateway in the network path captures the sanctioned traffic. A combination of egress monitoring plus identity-bound DNS-level controls catches the shadow path that does not go through the gateway.
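As a rough sketch of the egress side, the classifier below labels outbound URLs as sanctioned (going through the gateway), shadow (going straight to a known AI endpoint), or other. The gateway hostname is hypothetical and the AI host list is a tiny illustrative sample; a real deployment would maintain a much larger, regularly updated list and would tie each observation to a user identity.

```python
from urllib.parse import urlparse

# Hypothetical hostname for the hospital's sanctioned AI gateway.
SANCTIONED_GATEWAY_HOSTS = {"ai-gateway.hospital.example"}

# Illustrative sample of public AI API hosts; not an exhaustive list.
KNOWN_AI_HOSTS = {"api.openai.com", "api.anthropic.com"}


def classify_egress(url: str) -> str:
    """Label an outbound URL as 'sanctioned', 'shadow', or 'other'."""
    host = (urlparse(url).hostname or "").lower()
    if host in SANCTIONED_GATEWAY_HOSTS:
        return "sanctioned"
    if host in KNOWN_AI_HOSTS:
        # Direct call to an AI provider that bypassed the gateway.
        return "shadow"
    return "other"


for url in (
    "https://ai-gateway.hospital.example/v1/chat",
    "https://api.openai.com/v1/chat/completions",
    "https://intranet.hospital.example/schedule",
):
    print(url, "->", classify_egress(url))
```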

Identity at the request layer. The clinician's identity has to attach to the request before it crosses the boundary to the LLM. SSO at the application is not enough; the identity has to be on the AI call itself.
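A minimal sketch of what "identity on the AI call itself" could look like: the gateway refuses to forward a request unless it is bound to a natural person rather than a service account. The `Principal` type, the `on_behalf_of` field, and the error class are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Principal:
    user_id: str
    is_service_account: bool


class MissingIdentityError(Exception):
    """Raised when an AI call is not bound to a natural person."""


def bind_identity(ai_request: dict, principal: Optional[Principal]) -> dict:
    """Return a copy of the request carrying the clinician's identity.

    Rejects requests with no principal or with only a service credential,
    since neither identifies the person on whose behalf PHI is accessed.
    """
    if principal is None or principal.is_service_account:
        raise MissingIdentityError("AI call must carry a natural person's identity")
    return {**ai_request, "on_behalf_of": principal.user_id}


clinician = Principal(user_id="clinician-4821", is_service_account=False)
outbound = bind_identity({"purpose": "differential_diagnosis"}, clinician)
print(outbound["on_behalf_of"])  # prints clinician-4821
```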

Per-decision audit log. Every AI request produces a record with the principal, the model and endpoint, the prompt after minimum-necessary redaction, the response after policy treatment, the policy version, and the decision outcome. Retain it for at least six years, which matches HIPAA's documentation retention period under 164.316(b)(2); state medical-record retention laws vary and can require longer.
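The record described above could be shaped roughly as follows. This is a sketch, not a prescribed schema: the class and field names are assumptions that mirror the list in the paragraph, and the six-year default is a parameter so a longer obligation can be substituted.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditRecord:
    timestamp: datetime
    principal: str          # natural person on whose behalf the call was made
    model: str
    endpoint: str
    prompt_redacted: str    # prompt after minimum-necessary redaction
    response_treated: str   # response after policy treatment
    policy_version: str
    decision: str           # e.g. "allowed", "redacted", "blocked"

    def retain_until(self, years: int = 6) -> datetime:
        """Earliest date the record may be purged under a years-long retention rule."""
        try:
            return self.timestamp.replace(year=self.timestamp.year + years)
        except ValueError:
            # Feb 29 in a leap year maps to Feb 28 when the target year is not a leap year.
            return self.timestamp.replace(year=self.timestamp.year + years, day=28)

    def to_json(self) -> str:
        data = asdict(self)
        data["timestamp"] = self.timestamp.isoformat()
        return json.dumps(data, sort_keys=True)


record = AuditRecord(
    timestamp=datetime(2025, 1, 15, 10, 32, tzinfo=timezone.utc),
    principal="clinician-4821",
    model="example-model-v1",
    endpoint="https://llm.vendor.example/v1/chat",
    prompt_redacted="Medications: [list]; allergies: [list]",
    response_treated="Three differential diagnoses returned",
    policy_version="min-necessary-v1",
    decision="allowed",
)
print(record.retain_until().date())  # prints 2031-01-15
```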

Minimum-necessary enforcement at the gateway. The policy that decides what content goes to the model runs at the gateway, not at the application. The deployer can evidence the minimum-necessary determination as a policy-enforced check, not a clinician-by-clinician judgment call.
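As an illustration of a policy-enforced check, the sketch below filters a chart down to the fields allowed for a declared purpose and reports what was dropped, so the determination can be written to the audit record. The purposes, field names, and policy version string are hypothetical; a real policy would come from the hospital's own minimum-necessary analysis.

```python
# Hypothetical policy: which chart fields each purpose may send to the model.
POLICY_VERSION = "min-necessary-v1"
ALLOWED_FIELDS = {
    "medication_review": {"medications", "allergies"},
    "differential_diagnosis": {"symptoms", "vitals", "labs", "medications"},
}


def apply_minimum_necessary(chart: dict, purpose: str) -> tuple[dict, list[str]]:
    """Return (fields to send, names of fields withheld) for the given purpose."""
    if purpose not in ALLOWED_FIELDS:
        # No policy for this purpose means nothing is sent.
        raise ValueError(f"no minimum-necessary policy for purpose: {purpose}")
    allowed = ALLOWED_FIELDS[purpose]
    kept = {name: value for name, value in chart.items() if name in allowed}
    dropped = sorted(name for name in chart if name not in allowed)
    return kept, dropped


chart = {
    "name": "Jane Doe",
    "ssn": "000-00-0000",
    "medications": ["metformin"],
    "allergies": ["penicillin"],
    "symptoms": ["fatigue"],
}
kept, dropped = apply_minimum_necessary(chart, "medication_review")
print(sorted(kept))  # prints ['allergies', 'medications']
print(dropped)       # prints ['name', 'ssn', 'symptoms']
```

Because the filter runs in one place with a versioned policy, the same check applies to every clinician, and the dropped-field list doubles as evidence of the determination.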

Special cases

Cross-state and cross-provider clinical AI. When the LLM is hosted by a provider with operations in multiple states or outside the US, the BAA covers the cross-border data flow. The audit record has to evidence which data left which jurisdiction. The gateway's per-decision log answers the question because the destination endpoint is on the record.

Research use of clinical AI. Research access to PHI under 164.512(i) has its own consent and IRB obligations. AI used in research has to produce evidence of the consent regime that authorized the use. The audit log carries the research authorization as part of the policy version on the record.

Patient-facing AI assistants. A chatbot that interacts with a patient about scheduling, symptoms, or billing accesses PHI in real time. The patient's identity, the disclosure made to the patient, and the data the chatbot fed to the model all have to land on the audit record.

Penalties and willful neglect tier

HIPAA's civil penalty tiers run from a minimum of $137 per violation for unknowing violations up to roughly $2.067 million, the ceiling that applies to willful neglect not corrected and the calendar-year cap for identical violations. These are inflation-adjusted figures that are revised periodically, so check the current Federal Register. A breach of unsecured PHI affecting 500 or more individuals triggers prompt OCR notification, media notification where more than 500 residents of a state or jurisdiction are affected, and the high-visibility settlement risk that has produced multi-million-dollar resolutions in past cases.

The pattern OCR has followed: settlements scale with the entity's failure to produce an audit artifact. A hospital that can produce a clean per-decision log and demonstrate the policy that governed the AI access is in a stronger negotiating position than one that produces session logs and a BAA folder.

DeepInspect

DeepInspect is the policy gateway in the AI request path. The deployment pattern for a HIPAA-covered entity is a proxy or sidecar in front of clinical AI traffic, vendor-embedded AI usage, and the egress paths that catch shadow AI. Identity resolves at the gateway through the hospital's SSO. Minimum-necessary policy evaluates at the gateway. The per-decision audit log writes to an append-only signed store with six-year retention.
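To show what "append-only signed" can mean in practice, here is a generic sketch of an HMAC-chained log: each entry's signature covers the previous entry's signature, so editing or removing an earlier record breaks verification. This is a common tamper-evidence technique and an assumption for illustration, not a description of DeepInspect's implementation; the class name and in-memory storage are hypothetical.

```python
import hashlib
import hmac
import json


class SignedAppendOnlyLog:
    """In-memory, HMAC-chained log; a sketch of tamper evidence, not durable storage."""

    def __init__(self, key: bytes):
        self._key = key
        self._entries = []

    def _sign(self, prev_signature: str, payload: str) -> str:
        message = (prev_signature + payload).encode("utf-8")
        return hmac.new(self._key, message, hashlib.sha256).hexdigest()

    def append(self, record: dict) -> dict:
        prev = self._entries[-1]["signature"] if self._entries else ""
        payload = json.dumps(record, sort_keys=True)
        entry = {"payload": payload, "prev": prev, "signature": self._sign(prev, payload)}
        self._entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute the chain; False if any entry was altered, removed, or reordered."""
        prev = ""
        for entry in self._entries:
            expected = self._sign(prev, entry["payload"])
            if entry["prev"] != prev or not hmac.compare_digest(expected, entry["signature"]):
                return False
            prev = entry["signature"]
        return True


log = SignedAppendOnlyLog(key=b"demo-key-do-not-use-in-production")
log.append({"principal": "clinician-4821", "decision": "allowed"})
log.append({"principal": "clinician-1177", "decision": "blocked"})
print(log.verify())  # prints True
```

In a real system the key would live in a key management service and the entries in write-once storage; the chain only makes tampering detectable, it does not prevent it.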

For a hospital or health system running clinical or operational LLMs and facing the question of what OCR will ask for, DeepInspect produces the per-decision audit artifact the review expects.

If you are a covered entity mapping your AI program against HIPAA, let's talk today.

Frequently asked questions

Is an LLM HIPAA-compliant by itself?

No. HIPAA does not approve products. The covered entity is HIPAA-obligated. The product is a tool the entity uses. A signed BAA with the model provider is the entry condition; the audit artifact is the deployer's responsibility.

Does the BAA cover audit?

The BAA covers the vendor's obligation to maintain its own safeguards and report breaches. The deployer's audit obligation is independent. The deployer needs an audit artifact regardless of the vendor's BAA.

What about HHS-OIG and the FTC?

HHS-OIG enforces fraud and abuse in federal health programs. The FTC has been active on AI in healthcare through its 2026 enforcement actions on diagnostic claims and patient consent. The audit artifact is useful evidence for both agencies, not just OCR.

How does this interact with the EU AI Act for a multinational health system?

A multinational system has to satisfy HIPAA in the US, GDPR plus the EU AI Act in the EU, and sector-specific laws in each jurisdiction. The architecture that satisfies all of them at the request layer is the policy gateway with the per-decision audit log. The legal mapping varies; the technical artifact is consistent.

What is the shortest path to a defensible deployment?

Four steps. One: sign BAAs with every model and vendor that processes PHI. Two: route all clinical AI traffic through a gateway. Three: instrument identity at the gateway so the clinician on whose behalf the call was made is on every record. Four: retain the per-decision log for the longest applicable obligation. The log retained in step four is the artifact OCR will ask for.