Is AI RMF mandatory for US federal contracts?

The framework is voluntary as published. The GSA AI Acquisition Resource Guide (updated 2025) names AI RMF as the baseline reference for federal AI procurement, and OMB Memorandum M-24-10 requires federal agencies to map AI governance practices to AI RMF. The practical effect is that AI RMF compliance is required to compete for most federal AI contracts, even though the framework itself remains voluntary.

How does AI RMF differ from ISO/IEC 42001?

ISO/IEC 42001 is the AI management system standard, certifiable through an accredited assessor. AI RMF is a risk-management framework, not certifiable. The two are complementary: ISO 42001 organizes the management system, AI RMF organizes the risk treatment within the management system. A deployer that holds ISO 42001 certification typically uses AI RMF as the risk-control catalog inside the management system documentation.

Does the Generative AI Profile change anything for gateway implementers?

The Generative AI Profile (NIST AI 600-1, July 2024) adds 12 GenAI-specific risks on top of the AI RMF baseline. The risks include data leakage, harmful content generation, intellectual property exposure, and prompt injection. Each maps to a gateway control already named in the mapping table. The Profile does not add new gateway requirements; it makes the existing controls more explicit for GenAI deployments.

What evidence does a NIST Tier 2 assessment require from the gateway?

A Tier 2 assessment asks for documentary evidence of each subcategory the deployer claims to satisfy. For the gateway-anchored subcategories, the assessor will ask for: the policy bundle hash history, a sample of chained audit records spanning the assessment window, the operator attestation chain for policy deployments, and the reconciliation between the request inventory and the AI system register.
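The reconciliation item in that list can be sketched as a set comparison. This is a minimal illustration, not a prescribed procedure: the function name `reconcile`, the route strings, and the two result labels are assumptions made for the example.

```python
def reconcile(observed_routes, registered_routes):
    """Compare routes seen in gateway traffic with routes in the AI system register."""
    observed = set(observed_routes)
    registered = set(registered_routes)
    return {
        # Seen in traffic but missing from the register: an inventory gap.
        "unregistered": sorted(observed - registered),
        # Registered but never seen in the window: candidate for review or retirement.
        "idle": sorted(registered - observed),
    }


# Hypothetical route names, for illustration only.
result = reconcile(
    observed_routes=["vendor-a/model-x", "vendor-b/model-y", "vendor-a/model-x"],
    registered_routes=["vendor-a/model-x", "vendor-c/model-z"],
)
print(result)
```

An empty `unregistered` list is the outcome an assessor would presumably want to see; anything in it is a system running without a register entry.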

NIST AI RMF Mapping for AI Gateways: How the Four Functions Land on Request-Layer Controls

NIST released the AI Risk Management Framework (AI RMF 1.0) on January 26, 2023. The companion Generative AI Profile (NIST AI 600-1) shipped in July 2024 and extends the framework to GenAI-specific risks. The framework is voluntary, but Fannie Mae LL-2026-04 names AI RMF Pillar 3 (Manage) as the evidence standard for action lineage, and the GSA AI Acquisition Resource Guide treats AI RMF as the baseline for federal AI procurement. Together, the two make AI RMF a de facto requirement for any deployer touching federal contracts or US mortgage lending.

I want to walk through each of the four functions, the specific subcategories that map to runtime gateway controls, and the audit record each control produces.

The four functions

AI RMF organizes 72 subcategories, grouped into 19 categories, under four top-level functions:

Govern. Policies, accountability structures, risk-management roles. Mostly organizational, not technical.
Map. Inventory of AI systems, intended uses, deployment contexts, dependencies.
Measure. Continuous testing, performance metrics, drift monitoring.
Manage. Risk treatment, incident response, action lineage, decommissioning.

Map, Measure, and Manage land on the gateway. Govern lives in the policy documents the deployer signs.

Map: inventory and traceability

Map asks the deployer to answer "what AI is running, where, and on whose behalf." The relevant subcategories:

MAP 1.5: AI system context is documented

The gateway is the inventory authority. Every request that touches an LLM endpoint flows through the gateway, so the gateway's request log is the authoritative AI usage inventory. Application teams cannot deploy a new AI integration that bypasses the gateway without leaving a discoverable trace.

The audit record field that satisfies MAP 1.5 is route (model vendor and model ID) combined with subject_type (which application, agent, or human initiated the request).
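As a rough sketch, the per-request record could be modeled like this. The field names follow the mapping table in this article; the types, the example values, and the `subject_id` field are assumptions added for illustration, not a published schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditRecord:
    route: str             # model vendor and model ID; the "vendor/model" format is an assumption
    subject_type: str      # "application", "agent", or "human"
    subject_id: str        # hypothetical: identifier of the verified subject
    decision: str          # example values such as "allow" or "deny" are assumptions
    reason_code: str
    policy_version: str
    data_class: str
    timestamp_start: float  # seconds since the epoch (assumed unit)
    timestamp_end: float


def inventory_key(record):
    """The MAP 1.5 view of a record: what ran, and on whose behalf."""
    return (record.route, record.subject_type)


example = AuditRecord(
    route="vendor-a/model-x",
    subject_type="application",
    subject_id="loan-intake-service",
    decision="allow",
    reason_code="OK",
    policy_version="v1",
    data_class="pii",
    timestamp_start=100.0,
    timestamp_end=100.25,
)
print(inventory_key(example))
```

Freezing the dataclass is a deliberate choice here: an audit record should not be mutable after it is written.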

MAP 4.1: AI system components and dependencies are mapped

The gateway names the model endpoint each request was forwarded to. Across a quarter of traffic, the request log produces a complete picture of which models the organization actually uses, by team, by use case, and by data class. This is the same view the AI inventory NIST asks for, generated as a byproduct of normal traffic.
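One way to picture that byproduct inventory is a simple count over the request log. The sketch below assumes each log row is a dict; `route` and `data_class` come from the audit record, while the `team` field and all values are hypothetical.

```python
from collections import Counter

# Hypothetical request log rows; a real log would carry the full audit record.
request_log = [
    {"route": "vendor-a/model-x", "team": "underwriting", "data_class": "pii"},
    {"route": "vendor-a/model-x", "team": "underwriting", "data_class": "pii"},
    {"route": "vendor-b/model-y", "team": "support", "data_class": "public"},
]


def route_inventory(log):
    """Count requests per (route, team, data_class) combination."""
    return Counter((r["route"], r["team"], r["data_class"]) for r in log)


inventory = route_inventory(request_log)
# The distinct models actually in use, regardless of team or data class.
models_in_use = sorted({route for route, _, _ in inventory})
print(inventory)
print(models_in_use)
```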

Measure: testing and monitoring

Measure asks "is the AI system performing within the bounds the deployer attested to." Most of the subcategories sit inside the model team's evaluation pipeline. Two land on the gateway:

MEASURE 2.6: Performance and reliability are tracked across deployment

The gateway records latency, error rate, and decision rate for each policy version. Trend the numbers across the policy lifecycle and the deployer has a continuous performance signal that holds up under a NIST Tier 2 assessment.
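To make the trend concrete, here is a minimal sketch that groups records by policy version and derives latency from `timestamp_end - timestamp_start`, as the mapping table suggests. The decision values `"allow"`, `"deny"`, and `"error"` and the metric names are assumptions for the example.

```python
from collections import defaultdict
from statistics import mean


def performance_by_policy(records):
    """Summarize latency, error rate, and deny rate per policy version."""
    grouped = defaultdict(list)
    for r in records:
        grouped[r["policy_version"]].append(r)

    summary = {}
    for version, rows in grouped.items():
        n = len(rows)
        summary[version] = {
            "mean_latency_s": mean(r["timestamp_end"] - r["timestamp_start"] for r in rows),
            "error_rate": sum(r["decision"] == "error" for r in rows) / n,
            "deny_rate": sum(r["decision"] == "deny" for r in rows) / n,
        }
    return summary


# Hypothetical records with only the fields this metric needs.
records = [
    {"policy_version": "v1", "timestamp_start": 100.0, "timestamp_end": 100.25, "decision": "allow"},
    {"policy_version": "v1", "timestamp_start": 200.0, "timestamp_end": 200.75, "decision": "deny"},
    {"policy_version": "v2", "timestamp_start": 300.0, "timestamp_end": 300.5, "decision": "error"},
]
summary = performance_by_policy(records)
print(summary)
```

A production version would likely report percentiles rather than a mean, since tail latency is usually what degrades first.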

MEASURE 4.2: Risk indicators are tracked and reported

Risk indicators include classification verdicts (PII rate, PHI rate, secret-in-prompt rate, prompt-injection detection rate). The gateway emits each verdict as a field in the per-request record. The aggregate is the risk indicator series MEASURE 4.2 asks for.
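A sketch of that aggregation, assuming one verdict per record in the `data_class` field and a daily reporting grain. Both are simplifications; a real classifier may emit several verdicts per request, and the class labels here are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timezone


def risk_indicator_series(records, data_class):
    """Daily share of requests whose verdict matches data_class, keyed by UTC date."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for r in records:
        day = datetime.fromtimestamp(r["timestamp_start"], tz=timezone.utc).date().isoformat()
        totals[day] += 1
        if r["data_class"] == data_class:
            hits[day] += 1
    return {day: hits[day] / totals[day] for day in sorted(totals)}


def ts(year, month, day, hour):
    """Helper for the example: a UTC timestamp in epoch seconds."""
    return datetime(year, month, day, hour, tzinfo=timezone.utc).timestamp()


records = [
    {"timestamp_start": ts(2026, 1, 5, 9), "data_class": "pii"},
    {"timestamp_start": ts(2026, 1, 5, 14), "data_class": "none"},
    {"timestamp_start": ts(2026, 1, 6, 10), "data_class": "pii"},
]
pii_series = risk_indicator_series(records, "pii")
print(pii_series)
```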

Manage: action lineage and incident response

Manage is where the gateway carries the most weight. Three subcategories anchor the implementation:

MANAGE 1.3: Risk responses are documented for AI system actions

Every per-request audit record names the decision, the reason code, the policy version, and the verified subject. A regulator, an internal auditor, or an incident response team can reconstruct the action lineage from the chained records without depending on application logs.
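The "chained" property is what makes the lineage trustworthy, so it is worth sketching. The example below shows one common construction, a SHA-256 hash chain over canonical JSON. It is an assumption about how such a chain could work, not a description of any particular product, and it omits the writer signature the mapping table mentions.

```python
import copy
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first record


def record_hash(record, prev_hash):
    """Hash the previous link together with a canonical JSON form of the record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_record(chain, record):
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"record": record, "prev_hash": prev_hash, "hash": record_hash(record, prev_hash)})


def verify_chain(chain):
    """Replay the chain; any edited, dropped, or reordered record breaks it."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != record_hash(entry["record"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True


chain = []
append_record(chain, {"decision": "deny", "reason_code": "PII_IN_PROMPT", "policy_version": "v1"})
append_record(chain, {"decision": "allow", "reason_code": "OK", "policy_version": "v1"})

tampered = copy.deepcopy(chain)
tampered[0]["record"]["decision"] = "allow"  # rewrite history after the fact

print(verify_chain(chain), verify_chain(tampered))
```

Because each hash covers the previous one, changing a single old record invalidates every link that follows it, which is why an auditor can sample the chain rather than trust application logs.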

This is the Pillar 3 evidence Fannie Mae LL-2026-04 (effective August 6, 2026) names by reference. The lender that deploys an AI model for a credit decision has to produce the action lineage on demand within 24 hours of a regulator request. The gateway's chained record set is the operational answer.

MANAGE 2.3: Incident response plans cover AI-specific failure modes

The gateway is the inspection point for AI-specific incidents (prompt injection, data exfiltration through completions, refusal-rate drift, unauthorized model access). The per-request record carries the forensic detail. The incident responder reading the record after an event can name the subject, the route, the data class, and the policy verdict without an additional discovery phase.

MANAGE 4.1: Decommissioning is documented

When a model endpoint is retired or a use case is sunsetted, the gateway is where the policy change is enforced. The transition record (last-traffic timestamp, decommissioning policy version, redirect target) sits inside the same chained log set as the operational records. The decommissioning audit is a query against the existing infrastructure, not a separate documentation exercise.
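As an illustration of "a query against the existing infrastructure", the sketch below derives a transition summary from ordinary request records. The function name, the cutoff parameter, and the decision values are assumptions for the example; in practice this would more likely be a SQL query over the chained log.

```python
def decommission_audit(records, retired_route, cutoff_ts):
    """Summarize traffic on a retired route relative to the decommissioning cutoff."""
    on_route = [r for r in records if r["route"] == retired_route]
    allowed = [r for r in on_route if r["decision"] == "allow"]
    after_cutoff = [r for r in on_route if r["timestamp_start"] >= cutoff_ts]
    return {
        "route": retired_route,
        # Last time the retired endpoint actually served a request.
        "last_traffic": max((r["timestamp_end"] for r in allowed), default=None),
        # Should be zero if the decommissioning policy was enforced.
        "allowed_after_cutoff": sum(r["decision"] == "allow" for r in after_cutoff),
        "denied_after_cutoff": sum(r["decision"] == "deny" for r in after_cutoff),
    }


# Hypothetical records; timestamps are epoch seconds.
records = [
    {"route": "vendor-a/model-old", "decision": "allow", "timestamp_start": 100.0, "timestamp_end": 101.0},
    {"route": "vendor-a/model-old", "decision": "allow", "timestamp_start": 200.0, "timestamp_end": 202.0},
    {"route": "vendor-a/model-old", "decision": "deny", "timestamp_start": 300.0, "timestamp_end": 300.5},
    {"route": "vendor-a/model-new", "decision": "allow", "timestamp_start": 310.0, "timestamp_end": 312.0},
]
audit = decommission_audit(records, "vendor-a/model-old", cutoff_ts=250.0)
print(audit)
```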

The mapping in a table

| AI RMF subcategory | Gateway control | Audit field |
|---|---|---|
| MAP 1.5 | Per-request route logging | route, subject_type |
| MAP 4.1 | Aggregate route inventory | route over time |
| MEASURE 2.6 | Latency/error tracking | timestamp_end - timestamp_start, decision |
| MEASURE 4.2 | Classification verdict emission | data_class, reason_code |
| MANAGE 1.3 | Per-decision chained audit | full record + writer_signature |
| MANAGE 2.3 | Incident-ready forensic record | full record + chain replay |
| MANAGE 4.1 | Policy transition record | policy_version series |

What Govern needs from the gateway

The Govern function sits with the organization, not the gateway. But the gateway produces two artifacts the Govern documentation references:

Policy bundle hash history. Every policy version that was in effect at any point in the system lifetime, hashed and signed. This is the evidence the AI risk officer attests to in the Govern register.
Operator attestation. Each deploy of a new policy carries the attestation of the operator who deployed it (verified identity, timestamp, change summary). The Govern function's accountability section references this attestation chain directly.
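Both artifacts can be pictured with a short sketch: a deterministic hash over the policy bundle, wrapped in an entry that names the operator. The file names, the entry fields, and the hashing layout are assumptions for illustration, and the signing step is left out entirely.

```python
import hashlib


def bundle_hash(files):
    """Hash a policy bundle given as {path: bytes}, independent of dict order."""
    h = hashlib.sha256()
    for path in sorted(files):
        h.update(path.encode("utf-8"))
        h.update(b"\0")  # separator so path and content cannot run together
        h.update(hashlib.sha256(files[path]).digest())
    return h.hexdigest()


def deployment_entry(files, operator, deployed_at, change_summary):
    """One row of the hash history, carrying the operator attestation fields."""
    return {
        "bundle_hash": bundle_hash(files),
        "operator": operator,          # verified identity, per the text above
        "deployed_at": deployed_at,    # timestamp; ISO 8601 string assumed here
        "change_summary": change_summary,
    }


# Hypothetical bundle contents.
bundle_v1 = {"routes.yaml": b"allow: [vendor-a/model-x]\n", "classifiers.yaml": b"pii: block\n"}
bundle_v1_reordered = {"classifiers.yaml": b"pii: block\n", "routes.yaml": b"allow: [vendor-a/model-x]\n"}
bundle_v2 = {"routes.yaml": b"allow: [vendor-a/model-x]\n", "classifiers.yaml": b"pii: redact\n"}

entry = deployment_entry(bundle_v1, "operator@example.com", "2026-01-05T09:00:00Z", "initial policy")
print(entry["bundle_hash"])
```

Sorting the paths before hashing means the same bundle always yields the same hash, so the history can be re-derived from the stored policy files during an assessment.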

What MEASURE asks of the model team

Three subcategories under Measure sit with the model team and are out of scope for the gateway: MEASURE 1.1 (test set design), MEASURE 2.4 (bias and fairness metrics on training data), and MEASURE 3.2 (interpretability artifacts). A complete AI RMF posture requires both surfaces: the gateway carries runtime evidence, and the model team carries pre-production evidence.

DeepInspect

DeepInspect implements MAP 1.5, MAP 4.1, MEASURE 2.6, MEASURE 4.2, MANAGE 1.3, MANAGE 2.3, and MANAGE 4.1 in one deployment. The audit record schema is the schema named above. The chained record set is queryable through the standard SQL interface for any time window, any subject, any route. The mapping document is published in the platform's compliance section and aligns each AI RMF subcategory to the platform control that satisfies it.

For deployers that have to produce AI RMF evidence for Fannie Mae LL-2026-04 by August 6, 2026, the platform is operational in 8 to 12 weeks from contract signature.

If you are facing the August deadline, let's talk.