EU AI Act Article 72: Post-Market Monitoring as a Runtime Architecture Requirement
Article 72 of the EU AI Act requires providers of high-risk AI systems to set up and document a post-market monitoring system that actively and systematically collects data on the performance of the AI throughout its lifetime. The monitoring has to feed back into the risk management process under Article 9 and into the technical documentation under Article 11. The architectural requirement is for a runtime evidence pipeline, not for periodic reporting. Most providers run product analytics and call it post-market monitoring, and the regulator will not accept that under inspection.

Article 72 of the EU AI Act requires providers of high-risk AI systems to set up and document a post-market monitoring system that actively and systematically collects, documents, and analyzes data on the performance of the AI system throughout its lifetime. The monitoring system has to enable the provider to evaluate continuous compliance with the high-risk requirements set out in Chapter III, Section 2. The output of the monitoring feeds the risk management process under Article 9 and the technical documentation under Article 11. The post-market monitoring requirement applies from the date the high-risk obligations take effect, August 2, 2026.
The text reads as a process. The architecture that satisfies it runs at the request layer.
I want to walk through what Article 72 actually mandates, where standard product analytics fall short, and the runtime pipeline that produces the evidence the regulator and the provider's own risk management process both expect.
Mandate
Article 72 has three operative requirements. First, the provider must establish a post-market monitoring system proportionate to the nature of the AI technologies and the risks of the high-risk AI system. Second, the monitoring system must actively and systematically collect, document, and analyze relevant data on the system's performance. Third, the data has to be analyzed to evaluate continuous compliance with Chapter III, Section 2 requirements and to feed back into the risk management process.
The Commission is expected to adopt an implementing act that sets out a template for the post-market monitoring plan. The template governs the form of the plan, not the substance of the obligation. The substance is structured evidence about the AI system's runtime behavior, analyzed against the compliance requirements and fed back into the risk management process.
What "actively and systematically" rules out
The wording rules out passive observation. A provider that captures application logs and looks at them occasionally is not doing active, systematic collection. The monitoring has to be a deliberate pipeline, with defined metrics, defined cadence, and defined feedback loops to the risk management process.
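One way to make "defined metrics, defined cadence, and defined feedback loops" concrete is to hold the monitoring plan as data rather than as prose. The sketch below is illustrative only: the metric names, thresholds, cadences, and risk IDs are hypothetical placeholders, not values taken from the regulation or from any Commission template.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class MetricSpec:
    """One monitored metric, as a monitoring plan might define it."""
    name: str               # hypothetical metric name
    cadence: timedelta      # how often the analysis has to run
    threshold: float        # value that triggers the feedback loop
    higher_is_worse: bool   # direction in which the threshold is crossed
    risk_id: str            # hypothetical risk-register entry the metric feeds

    def breached(self, value: float) -> bool:
        if self.higher_is_worse:
            return value > self.threshold
        return value < self.threshold


# Illustrative plan; every name and number here is a placeholder.
PLAN = [
    MetricSpec("outcome_drift_tvd", timedelta(days=1), 0.10, True, "RISK-007"),
    MetricSpec("holdout_accuracy", timedelta(days=7), 0.90, False, "RISK-002"),
]


def breaches(plan, observed):
    """Return the specs whose observed value crosses the planned threshold."""
    return [s for s in plan if s.name in observed and s.breached(observed[s.name])]
```

Each metric carries its own cadence and the risk-register entry it feeds, so the link from an observation to the Article 9 process is written down rather than implied.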
The lifetime of the system
The monitoring has to run over the lifetime of the system, not just at major milestones. A provider that runs an analysis at launch and another at a periodic conformity re-assessment has two data points, not a monitoring pipeline. The requirement is continuous.
Feedback into Article 9 risk management
The risk management system under Article 9 has to consume the post-market monitoring output. When the monitoring detects a deterioration in performance, an emerging anomaly, or a use case the system was not designed for, the risk register updates, the mitigations adjust, and the technical documentation under Article 11 reflects the change. The flow from monitoring to risk management to documentation has to be a real process.
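The flow from monitoring to risk register to documentation can be sketched as a single function. This is a minimal illustration under stated assumptions: `RiskEntry`, `Finding`, the status values, and the risk IDs are hypothetical, and a real register would usually live in a governance tool rather than in a dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class RiskEntry:
    risk_id: str
    description: str
    status: str = "mitigated"            # hypothetical status vocabulary
    evidence: list = field(default_factory=list)


@dataclass(frozen=True)
class Finding:
    risk_id: str
    metric: str
    value: float
    threshold: float
    observed_on: str                     # ISO date of the analysis run


def apply_findings(register, findings):
    """Reopen affected risks and return the documentation items to revise."""
    doc_updates = []
    for f in findings:
        entry = register.get(f.risk_id)
        if entry is None:
            # Monitoring surfaced a risk that was not identified at design time.
            entry = RiskEntry(f.risk_id, f"Unregistered risk surfaced by {f.metric}")
            register[f.risk_id] = entry
        entry.status = "under_review"
        entry.evidence.append(f)
        doc_updates.append(
            f"{f.risk_id}: {f.metric}={f.value} vs threshold {f.threshold} on {f.observed_on}"
        )
    return doc_updates
```

The returned list stands in for the change notes that would be carried into the Article 11 technical documentation. The finding itself is attached to the risk entry as evidence.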
Compliance gap
Most providers running AI in production already collect telemetry. The telemetry rarely satisfies Article 72.
Product analytics measures usage, not performance
Product analytics dashboards count requests, users, error rates, and conversion. The metrics serve product decisions. They do not measure the AI system's compliance with the high-risk requirements. Article 72 monitoring has to measure performance characteristics relevant to the high-risk requirements: accuracy, resilience under adversarial inputs, fairness across demographic groups, drift relative to the baseline used in the conformity assessment, and the runtime behavior of the safety-relevant components.
The records lack identity and policy context
A monitoring pipeline that aggregates request counts and error rates without identity context cannot evaluate fairness across user groups or detect demographic drift. A pipeline that does not capture the policy version in effect at each decision cannot link a performance change to a policy change. The evidence the regulator needs to reconstruct the system's behavior is granular per-decision evidence, not aggregated counts.
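A per-decision record with identity and policy context might look like the sketch below. The field names are assumptions chosen for illustration; the regulation does not prescribe a schema. The identity field holds a pseudonymous group attribute rather than a raw identifier, which is one possible design choice among several.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class DecisionRecord:
    decision_id: str
    timestamp: str            # UTC, ISO 8601
    subject_group: str        # pseudonymous identity attribute, not a raw identifier
    policy_version: str       # policy in effect when the decision was made
    data_classification: str
    outcome: str


def new_record(decision_id, subject_group, policy_version,
               data_classification, outcome, now=None):
    """Build a record, stamping it with the current UTC time by default."""
    now = now or datetime.now(timezone.utc)
    return DecisionRecord(decision_id, now.isoformat(), subject_group,
                          policy_version, data_classification, outcome)


def to_json_line(record):
    """Serialize one record as a stable, append-friendly JSON line."""
    return json.dumps(asdict(record), sort_keys=True)
```

Because every record carries the policy version and an identity attribute, later analyses can group decisions by either one, which aggregated counts cannot support.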
Risk register updates are not driven by the monitoring
Article 9 requires the risk management system to revisit and update the risks identified at design time in light of the system's runtime behavior. Most engineering teams maintain a risk register that gets updated at quarterly reviews, based on incidents that surfaced through other channels. The monitoring system rarely feeds the register directly. The Article 72 requirement is that the monitoring drives the risk management, not that the two run on parallel tracks.
Vendor-supplied AI is monitored at the wrong layer
When a high-risk AI system embeds vendor LLMs or vendor models, the post-market monitoring obligation runs against the deployed system as a whole. Monitoring only the provider's own application layer leaves the vendor's contribution to the system's behavior outside the evidence. The provider has to monitor the system at the layer where the AI decisions actually emerge, which is the AI request boundary.
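Monitoring at the AI request boundary can be approximated by wrapping the vendor call so that every request leaves a record, whether it succeeds or fails. The sketch below assumes a generic `call_model` callable and an in-memory `sink` list; both are hypothetical stand-ins and do not correspond to any specific vendor SDK.

```python
from typing import Callable


def monitored(call_model: Callable[[str], str], sink: list, policy_version: str):
    """Wrap a vendor call so every request leaves a per-decision record."""

    def wrapper(prompt: str, subject_group: str) -> str:
        record = {
            "subject_group": subject_group,
            "policy_version": policy_version,
            "prompt_chars": len(prompt),   # size only; the prompt is not stored
        }
        try:
            response = call_model(prompt)
        except Exception as exc:
            # Vendor failures are part of the system's behavior, so record them too.
            record["outcome"] = "vendor_error"
            record["error_type"] = type(exc).__name__
            sink.append(record)
            raise
        record["outcome"] = "completed"
        record["response_chars"] = len(response)
        sink.append(record)
        return response

    return wrapper
```

The wrapper records vendor errors before re-raising them, so the evidence covers the vendor's contribution to the deployed system's behavior and not only the provider's own application code.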
Mandate vs Compliance
Article 72's text reads as a process. The infrastructure to satisfy it produces structured runtime evidence at a granularity that supports the risk management feedback loop.
What the regulator will ask for
A market surveillance authority checking Article 72 compliance can be expected to ask the provider for the post-market monitoring plan, the data actually collected, the analyses run on that data, and the documented adjustments to the risk management and the technical documentation that resulted from those analyses. The records have to span the lifetime of the system in production, not just a recent window.
What surviving the inspection actually requires
A post-market monitoring system that survives Article 72 inspection has three layers. The runtime evidence layer captures per-decision records with identity context, policy version, data classification, and outcome. The analysis layer runs structured queries against the evidence, computing performance metrics, fairness metrics, drift metrics, and policy effectiveness metrics on the cadence the monitoring plan specifies. The feedback layer takes the analysis outputs and writes back into the risk register, the technical documentation, and any provider-side incident response.
The runtime evidence layer is where most providers have the largest gap. The other two layers can be built on top of evidence that exists; they cannot be built on top of evidence that was never collected.
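Linkage to Articles 20 and 79 and corrective actions
Article 20 requires providers to take corrective action when they have reason to consider that a high-risk AI system they have placed on the market is not in conformity with the regulation. Article 79 lets market surveillance authorities require corrective action when they find that a system presents a risk. Both workflows run on the output of the post-market monitoring. A monitoring system that does not detect non-compliance produces no corrective actions; a regulator that finds a non-compliance the monitoring missed will assess the monitoring system as deficient. The monitoring and corrective-action obligations depend on each other.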
DeepInspect
This is the runtime evidence layer DeepInspect produces. DeepInspect sits at the AI request boundary as a stateless proxy between authenticated users or agents and the LLM endpoints, enforces identity-bound policy on every request, and records a per-decision audit record that includes the identity, the policy version, the data classification, the decision outcome, and a tamper-evident signature.
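To illustrate what "tamper-evident" can mean for an audit record, the sketch below chains HMAC-SHA256 signatures so that changing any earlier record invalidates every later signature. This is a generic technique shown under stated assumptions; it is not a description of DeepInspect's actual signing scheme, and the function names and record fields are hypothetical.

```python
import hashlib
import hmac
import json


def sign_record(record: dict, prev_sig: str, key: bytes) -> str:
    """Sign a record together with the previous signature, forming a chain."""
    payload = json.dumps(record, sort_keys=True).encode() + prev_sig.encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def append_record(log: list, record: dict, key: bytes) -> None:
    """Append a record whose signature depends on everything before it."""
    prev_sig = log[-1]["sig"] if log else ""
    log.append({"record": record, "sig": sign_record(record, prev_sig, key)})


def verify_chain(log: list, key: bytes) -> bool:
    """Recompute every signature; any edited or reordered entry fails."""
    prev_sig = ""
    for entry in log:
        expected = sign_record(entry["record"], prev_sig, key)
        if not hmac.compare_digest(expected, entry["sig"]):
            return False
        prev_sig = entry["sig"]
    return True
```

A chain like this detects edits and reordering but not truncation of the most recent entries, so a production design would also need to anchor the latest signature somewhere outside the log.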
For post-market monitoring under Article 72, the per-decision records are the evidence the analysis layer runs against. Fairness across demographic groups can be computed by aggregating decisions by identity attributes. Drift relative to baseline can be computed by comparing the distribution of decisions over time. Policy effectiveness can be computed by linking policy versions to the decisions they governed. The monitoring plan can specify the metrics; DeepInspect's records are the source data.
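The three computations described above can be sketched over plain per-decision records. The record fields (`subject_group`, `policy_version`, `outcome`) and the outcome label `"allowed"` are assumptions for illustration, and total variation distance is one possible drift measure that a monitoring plan might choose, not one the regulation names.

```python
from collections import Counter


def rate_by(records, key, positive="allowed"):
    """Share of records with the positive outcome, grouped by `key`."""
    totals, hits = Counter(), Counter()
    for r in records:
        totals[r[key]] += 1
        if r["outcome"] == positive:
            hits[r[key]] += 1
    return {k: hits[k] / totals[k] for k in totals}


def total_variation(baseline, current):
    """Total variation distance between two samples of outcome labels (0 to 1)."""
    if not baseline or not current:
        raise ValueError("both samples must be non-empty")
    b, c = Counter(baseline), Counter(current)
    nb, nc = len(baseline), len(current)
    return 0.5 * sum(abs(b[k] / nb - c[k] / nc) for k in set(b) | set(c))


# Fairness view:            rate_by(records, "subject_group")
# Policy-effectiveness view: rate_by(records, "policy_version")
# Drift view:               total_variation(baseline_outcomes, current_outcomes)
```

The same `rate_by` helper serves both the fairness and the policy views because each record carries both attributes; only the grouping key changes.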
If you are placing a high-risk AI system on the EU market and your post-market monitoring depends on product analytics rather than on structured runtime evidence, an inspection after August 2, 2026 will surface the gap. Book a demo today.
Beyond Article 72
The post-market monitoring pattern appears under different names in adjacent regimes. The NIST AI Risk Management Framework MEASURE function expects continuous measurement of AI system performance. ISO/IEC 42001 requires a management system that runs ongoing performance evaluation. The Fannie Mae LL-2026-04 framework requires lenders to monitor AI-assisted decisions for governance purposes.
The architecture that satisfies Article 72 produces the evidence each of these regimes expects. The vocabulary differs across regulators. The underlying requirement is the same: structured runtime evidence at a granularity that supports the feedback loop into risk management.
Frequently asked questions
- What metrics are required under Article 72?
The regulation does not specify a metrics catalog. The provider's monitoring plan specifies the metrics, calibrated to the AI system and the risks identified in the Article 9 risk management. For most high-risk systems, the metrics cover accuracy and adversarial resilience on production data, fairness across the protected groups relevant to the use case, drift relative to the baseline used in the conformity assessment, the rate at which the safety-relevant components are triggered, and the effectiveness of the policies that govern those components. The Commission's implementing act, when adopted, is expected to provide a template for the monitoring plan that these metrics fit into.
- How often does the monitoring data have to be analyzed?
The regulation requires the provider to evaluate continuous compliance, with the monitoring scaled to the risks of the AI system. It does not fix a cadence. A real-time use case (credit scoring at the point of application) typically calls for analysis on a daily or near-real-time cadence. A batch use case (periodic risk re-scoring of a customer portfolio) can be analyzed on the batch cadence. The cadence has to be defined in the monitoring plan and has to be sufficient to detect emerging issues before they cause harm at scale.
- Does the monitoring system have to detect every issue automatically?
The monitoring system has to actively and systematically collect and analyze the data. It does not have to detect every issue automatically. The analyses can produce signals that human reviewers act on. The reviewers have to be assigned, trained, and authorized to take action. The combination of automated metrics, dashboards, and human review is what the regulation expects.
- How does Article 72 interact with vendor-supplied AI?
A provider running a high-risk AI system that embeds vendor AI inherits the monitoring obligation for the deployed system as a whole. The vendor's monitoring of its own service is necessary but does not satisfy the provider's Article 72 obligation. The provider has to monitor the AI behavior at the level the system delivers to users, which means monitoring at the AI request boundary in the deployed system. The vendor's contribution to that behavior is part of what the monitoring measures.
- What is the relationship between Article 72 and Article 12?
Article 12 requires the system to log events automatically. Article 72 requires the provider to collect, analyze, and act on data about the system's performance. The Article 12 logs are part of the data the Article 72 monitoring runs against. A provider that has satisfied Article 12 has the source data; satisfying Article 72 requires building the analysis and feedback layers on top of that data. A provider that has not satisfied Article 12 cannot satisfy Article 72 because the data nee