NIST AI Risk Management Framework: GOVERN, MAP, MEASURE, MANAGE at the request layer
The NIST AI Risk Management Framework organizes AI risk into four functions: GOVERN, MAP, MEASURE, MANAGE. The framework is voluntary in name and effectively mandatory for federal contractors, critical infrastructure operators, and any organization whose AI program will be measured against US guidance. The text reads at a higher level of abstraction than implementation. This piece walks through each function with the artifact a real organization has to produce, then maps the artifacts to the request-layer architecture that produces them.
The NIST AI Risk Management Framework 1.0 was published in January 2023 and has been refined through profile documents (Generative AI Profile, July 2024) and ongoing companion work. The text reads at the level of strategy. The implementation reads at the level of artifacts and architecture. The artifacts that the four functions actually produce, and the runtime architecture that produces them, are what separate organizations that have a framework on paper from organizations that pass a measurement.
NIST is also teeing up the Control Overlays for Securing AI Systems (COSAiS), including overlays for Single-Agent and Multi-Agent systems, plus the AI RMF Profile for Critical Infrastructure. Federal contractors and critical infrastructure operators should expect to be measured against the overlays. The runtime architecture that satisfies the AI RMF today is the same architecture the overlays will assume.
The sections below take each function in turn: first the artifact a real organization produces, then the request-layer architecture that makes the artifact maintainable at scale.
GOVERN: the organizational scaffolding
GOVERN is the function that establishes accountability, policies, and the connection between AI risk and organizational risk. The artifacts are policy documents and assigned accountabilities.
What GOVERN really produces:
- AI usage policy signed by executive leadership. The policy defines acceptable use, prohibited use, oversight requirements, and incident reporting.
- Accountability assignments for the AI program. Named individuals for governance, security, compliance, and operations.
- Risk appetite statement for AI. The acceptable level of risk, expressed in operational terms, not just "moderate appetite."
- Tiered review mapping the categories of AI use to the level of review (executive committee, AI review board, technical review).
GOVERN does not directly touch the request layer. The connection to runtime is that GOVERN sets the policies that MEASURE and MANAGE then evaluate against. A policy that the runtime cannot evaluate is a policy on paper. A policy the runtime can evaluate is a policy with teeth.
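To make "a policy the runtime can evaluate" concrete, here is a minimal sketch of usage-policy clauses expressed as data rather than prose. The `UsageRule` type, the rule values, and the default-deny behavior are all illustrative assumptions, not something the AI RMF prescribes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRule:
    """One clause of a hypothetical AI usage policy in machine-readable form."""
    use_case: str
    data_classification: str
    allowed: bool
    review_tier: str  # e.g. "technical review", "AI review board"


# Hypothetical rules; real ones would be derived from the signed policy.
RULES = [
    UsageRule("customer-support", "public", True, "technical review"),
    UsageRule("customer-support", "pii", True, "AI review board"),
    UsageRule("credit-decisioning", "pii", False, "executive committee"),
]


def evaluate(use_case: str, data_classification: str) -> UsageRule:
    """Return the matching rule, or a default-deny rule when the policy is silent."""
    for rule in RULES:
        if (rule.use_case, rule.data_classification) == (use_case, data_classification):
            return rule
    return UsageRule(use_case, data_classification, False, "AI review board")


if __name__ == "__main__":
    decision = evaluate("customer-support", "pii")
    print(decision.allowed, decision.review_tier)
```

The design choice worth noting is the default-deny fallback: a use case the policy never mentions is treated as unapproved and sent to review instead of silently allowed.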
MAP: identifying where AI is in use and what risk it creates
MAP is the function where the organization documents its AI deployments and the risks each one creates. The artifacts are the AI inventory and the per-deployment risk assessment.
What MAP really produces:
- AI inventory listing every model, vendor, and deployment in production and development. Includes vendor-embedded AI inside SaaS products.
- Use case classification for each deployment: the function, the user population, the data classification, the impact on individuals or operations.
- Risk register entries linking each deployment to the risk categories: bias, privacy, security, reliability, accountability.
- Stakeholder map for each deployment: provider, deployer, end user, affected populations.
MAP fails most often on the inventory. Vendor-embedded AI in SaaS products often does not appear in IT inventories because the SaaS product was not procured as AI. The inventory has to extend to embedded AI usage.
The runtime connection: a gateway in the AI request path observes every AI call the organization routes through it. The gateway's traffic logs become the source-of-truth inventory. The MAP exercise stops being a manual update and becomes a derived view of the runtime telemetry.
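A rough sketch of what "inventory as a derived view" could look like, assuming the gateway emits one log record per call. The record fields (`app`, `provider`, `model`, `ts`) and the sample values are hypothetical, not a real gateway's log schema.

```python
from collections import defaultdict

# Hypothetical gateway traffic records; field names are illustrative only.
traffic_log = [
    {"app": "support-bot", "provider": "vendor-a", "model": "model-x", "ts": "2025-01-03T10:00:00Z"},
    {"app": "support-bot", "provider": "vendor-a", "model": "model-x", "ts": "2025-01-05T09:30:00Z"},
    {"app": "crm-saas", "provider": "vendor-b", "model": "embedded-assistant", "ts": "2025-01-04T14:15:00Z"},
]


def derive_inventory(records):
    """Group log records into one inventory entry per (app, provider, model)."""
    inventory = defaultdict(lambda: {"calls": 0, "first_seen": None, "last_seen": None})
    for record in records:
        entry = inventory[(record["app"], record["provider"], record["model"])]
        entry["calls"] += 1
        ts = record["ts"]  # same-format UTC ISO 8601 strings sort chronologically
        if entry["first_seen"] is None or ts < entry["first_seen"]:
            entry["first_seen"] = ts
        if entry["last_seen"] is None or ts > entry["last_seen"]:
            entry["last_seen"] = ts
    return dict(inventory)


if __name__ == "__main__":
    for key, entry in derive_inventory(traffic_log).items():
        print(key, entry)
```

An inventory built this way only covers traffic the gateway actually sees, so it complements the procurement-side review of vendor-embedded AI. It does not replace that review.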
MEASURE: evaluating the AI for risk
MEASURE is the function where the organization actually evaluates each AI system for the risks MAP identified. The artifacts are test plans, evaluation reports, and ongoing measurement results.
What MEASURE really produces:
- Pre-deployment evaluations: bias testing, resilience testing, security testing, performance benchmarks.
- Red-teaming reports: adversarial testing of the deployment for prompt injection, jailbreaks, and policy bypasses.
- Monitoring plans: what gets measured continuously, with what thresholds, by what mechanism.
- Measurement results: the ongoing data that proves the system is still operating inside the risk envelope.
MEASURE breaks down on the continuous side. Pre-deployment evaluation is finite and project-shaped. Continuous measurement is operational and requires the runtime to emit the data needed to evaluate.
The runtime connection: per-decision audit data is the input to continuous measurement. The gateway's per-request record contains the principal, the model, the prompt characteristics, and the decision outcome. Continuous measurement queries that data for drift, bias indicators, policy hits, and incident signal.
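As an illustration, a continuous-measurement check over per-decision records might look like the sketch below. The `DecisionRecord` fields follow the list above, but the concrete field names, the outcome labels, and the block-rate threshold are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionRecord:
    """Hypothetical per-request audit record."""
    principal: str
    model: str
    prompt_chars: int   # one example of a prompt characteristic
    contains_pii: bool  # another prompt characteristic
    outcome: str        # assumed labels: "allow", "redact", "block"


def block_rate(records, model):
    """Fraction of requests to `model` that policy blocked (0.0 if no traffic)."""
    selected = [r for r in records if r.model == model]
    if not selected:
        return 0.0
    return sum(r.outcome == "block" for r in selected) / len(selected)


def breaches_threshold(records, model, max_block_rate):
    """True when the observed block rate exceeds the monitoring plan's threshold."""
    return block_rate(records, model) > max_block_rate


if __name__ == "__main__":
    records = [
        DecisionRecord("alice", "model-x", 120, False, "allow"),
        DecisionRecord("bob", "model-x", 900, True, "redact"),
        DecisionRecord("carol", "model-x", 300, False, "block"),
        DecisionRecord("dave", "model-x", 80, False, "allow"),
    ]
    print(block_rate(records, "model-x"))
    print(breaches_threshold(records, "model-x", max_block_rate=0.2))
```

The same pattern extends to other indicators a monitoring plan names, for example redaction rate per data classification or outcome rates split by user population.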
MANAGE: acting on the risk
MANAGE is the function where the organization actually does something about the risk MEASURE surfaced. The artifacts are incident response procedures, mitigation actions, and the decision records for risk treatment.
What MANAGE really produces:
- Incident response procedures for AI-specific incidents: model failure, data leak through a model, prompt injection success, policy bypass.
- Mitigation actions in response to measurement findings: retraining, policy updates, deployment rollback, model swap.
- Risk treatment decisions documented for each material risk: accept, mitigate, transfer, avoid.
- Communication plan for AI incidents to affected stakeholders.
MANAGE breaks down when the runtime cannot enforce the mitigation. The team decides to disable a model for a specific use case. The application keeps calling the model anyway because the policy is in the wiki and the runtime is in the code.
The runtime connection: the gateway is the enforcement point. A policy decision (block this model for this use case, require redaction on this data classification, route this traffic to a different provider) is a policy change at the gateway. The change is auditable, reversible, and effective immediately.
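One way to get "auditable, reversible, and effective immediately" is an append-only policy history, sketched below. `PolicyStore` and its methods are hypothetical names for illustration. A production gateway would also persist the history and control who may change it.

```python
from datetime import datetime, timezone


class PolicyStore:
    """Append-only policy history: each change is a new version, never an overwrite."""

    def __init__(self, initial_blocked=()):
        self._versions = []
        self._append(frozenset(initial_blocked), "system", "initial policy")

    def _append(self, blocked, actor, reason):
        self._versions.append({
            "version": len(self._versions) + 1,
            "blocked": blocked,  # set of (model, use_case) pairs
            "actor": actor,
            "reason": reason,
            "at": datetime.now(timezone.utc).isoformat(),
        })

    @property
    def current(self):
        return self._versions[-1]

    def block(self, model, use_case, actor, reason):
        """Block a model for one use case; takes effect on the next is_allowed call."""
        self._append(self.current["blocked"] | {(model, use_case)}, actor, reason)

    def rollback(self, actor, reason):
        """Re-apply the previous version as a new version so history stays intact."""
        if len(self._versions) < 2:
            raise ValueError("no earlier version to roll back to")
        self._append(self._versions[-2]["blocked"], actor, reason)

    def is_allowed(self, model, use_case):
        return (model, use_case) not in self.current["blocked"]

    def history(self):
        return [(v["version"], v["actor"], v["reason"]) for v in self._versions]


if __name__ == "__main__":
    store = PolicyStore()
    store.block("model-x", "credit-decisioning", "risk-officer", "bias finding")
    print(store.is_allowed("model-x", "credit-decisioning"))
    store.rollback("risk-officer", "finding remediated")
    print(store.is_allowed("model-x", "credit-decisioning"))
    print(store.history())
```

Recording a rollback as a new version, rather than deleting the change it undoes, keeps the full sequence of risk-treatment decisions available as evidence.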
How the four functions stack at the request layer
The pattern that holds across the four functions: the framework on paper depends on the runtime to produce the evidence. A framework without a runtime is a check-box exercise. A runtime without a framework is uncoordinated engineering.
The gateway sits at the intersection. GOVERN sets the policies. MAP uses gateway traffic as the source of truth for inventory. MEASURE uses gateway audit data as the input for continuous measurement. MANAGE uses the gateway as the enforcement point for mitigation actions.
End-to-end gateway overhead measures under 50ms in production tests. Against an LLM inference baseline of 500ms to 5 seconds, that adds at most about 10% to the fastest calls and about 1% to the slowest.
How AI RMF maps to other frameworks
A gateway-based runtime that satisfies AI RMF MEASURE and MANAGE also satisfies request-layer obligations across:
- EU AI Act: Article 12 (record-keeping through automatic logging over the system's lifetime), Article 19 (retention of automatically generated logs), Article 26 (deployer obligations).
- HIPAA: 164.312(a) access controls, 164.312(b) audit controls.
- DORA: Article 9 (protection and prevention), Article 28 (ICT third-party risk, including the register of information).
- Fannie Mae LL-2026-04: per-decision record of AI-influenced loan decisions.
- NIST AI agent identity Pillars 1-3: identity, authorization, audit at the request boundary.
The mapping is structural. The runtime artifact the gateway produces (the per-decision audit log with identity, policy, and decision context) is the common artifact across the frameworks.
COSAiS overlays and Critical Infrastructure Profile
The COSAiS Single-Agent and Multi-Agent overlays NIST is preparing will add specific guidance for those system types. They are expected to cover the agentic-skills layer that the OWASP Top 10 for Agentic Applications 2026 identifies. A runtime that already enforces and audits on the tool-call boundary should be positioned to meet the overlay obligations without architectural change.
The AI RMF Profile for Critical Infrastructure will translate the framework to critical-infrastructure operators (energy, finance, transportation, healthcare, water). The profile will set the runtime expectations for operators whose AI usage has direct impact on essential services.
DeepInspect
DeepInspect is the policy gateway in the AI request path. The gateway resolves identity at the request boundary, evaluates policy against the request and the policy version, writes a per-decision audit record, and returns the treated response. The same artifact serves MAP, MEASURE, and MANAGE.
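The four steps in that description can be sketched generically as follows. This is not DeepInspect's API. The `Gateway` class, its fields, the outcome labels, and the email-only redaction are assumptions chosen to keep the example small.

```python
import re
from dataclasses import dataclass, field

# Deliberately simple email pattern, standing in for a real data-classification step.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


@dataclass
class Gateway:
    """Hypothetical request-path gateway: identity, policy, audit, treated response."""
    tokens: dict          # token -> principal
    blocked_models: set
    policy_version: str
    audit_log: list = field(default_factory=list)

    def handle(self, token, model, prompt, call_model):
        principal = self.tokens.get(token)        # 1. resolve identity
        response = None
        if principal is None:
            outcome = "deny-unauthenticated"
        elif model in self.blocked_models:        # 2. evaluate policy
            outcome = "block"
        else:
            treated_prompt = EMAIL.sub("[REDACTED]", prompt)
            outcome = "redact" if treated_prompt != prompt else "allow"
            response = EMAIL.sub("[REDACTED]", call_model(model, treated_prompt))
        self.audit_log.append({                   # 3. write per-decision record
            "principal": principal,
            "model": model,
            "policy_version": self.policy_version,
            "outcome": outcome,
        })
        return response                           # 4. return treated response


if __name__ == "__main__":
    def fake_model(model, prompt):
        # Stand-in for a real provider call.
        return f"echo: {prompt}"

    gw = Gateway(tokens={"t1": "alice"}, blocked_models={"model-y"}, policy_version="v7")
    print(gw.handle("t1", "model-x", "mail bob@example.com", fake_model))
    print(gw.handle("t1", "model-y", "hello", fake_model))
    print(gw.audit_log)
```

Every request produces an audit record, including the denied and blocked ones, which is what lets the same log serve as inventory, measurement input, and enforcement evidence.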
For a federal contractor, critical infrastructure operator, or any organization whose AI program will be measured against NIST guidance, DeepInspect produces the artifacts the framework expects, at a runtime overhead low enough to support continuous measurement.
If you are mapping your AI deployment against AI RMF and want a concrete walk-through of the gateway pattern, let's talk today.
Frequently asked questions
- Is the NIST AI RMF mandatory?
The framework is voluntary in its statutory base. In practice, federal contractors are increasingly measured against it through FAR clause incorporation and individual agency requirements. Critical infrastructure operators face similar measurement pressure from sector-specific regulators.
- What is the relationship between the AI RMF and ISO 42001?
ISO/IEC 42001 is the certifiable management system standard for AI. The AI RMF is the risk management framework. The two complement each other: ISO/IEC 42001 sets the management-system requirements; the AI RMF sets the risk-management approach. The runtime evidence that satisfies AI RMF MEASURE also feeds the ISO/IEC 42001 audit.
- How often does the AI RMF need to be re-applied?
Continuously for MANAGE and for the monitoring side of MEASURE. Periodically for MAP (when new deployments come online or vendor-embedded AI changes) and for the evaluation side of MEASURE (when the threat landscape or the model changes). GOVERN is reviewed on the cadence the organization sets for its top-level policies.
- What is the gap most AI RMF implementations have?
The gap is between GOVERN and MEASURE / MANAGE. The policy exists. The runtime cannot evaluate the policy. The result is a paper framework with no operational teeth. The fix is to move the policy decision and the audit record to a runtime that can actually evaluate and record.