EU AI Act Foundation Models: How the Regulation Treats Pre-Training, Fine-Tuning, and Substantial Modification

The EU AI Act does not use the term "foundation model" in its operative text. The regulation treats the underlying systems as general-purpose AI (GPAI) models, sets their provider obligations in Article 53, and presumes systemic risk at 10^25 training FLOPs under Article 51(2). Fine-tuning and integration into downstream systems are handled separately by Article 25. The result is a layered obligation set that depends on whether the model is pre-trained, fine-tuned, or repurposed into a high-risk system.

By Parminder Singh · Founder & CEO, DeepInspect Inc.
Compliance & Regulation · Tags: eu-ai-act, foundation-models, gpai, compliance, ai-governance, regulation

The EU AI Act does not use the phrase "foundation model" in its operative text. The regulation treats the underlying systems as general-purpose AI (GPAI) models, sets their provider obligations in Article 53, presumes systemic risk at 10^25 training FLOPs under Article 51(2), and handles fine-tuning and integration into downstream systems separately under Article 25. The result is a layered obligation set: the pre-training entity carries GPAI provider obligations, the fine-tuner can become a GPAI provider for the modified model or the provider of a downstream system depending on the change, and the entity integrating the model into a high-risk system carries the Article 16 through 27 obligations on top. The GPAI obligations took effect August 2, 2025. The high-risk obligations take effect August 2, 2026.

I want to walk through the three-layer structure the Act imposes on foundation models, where each set of obligations attaches, and what the enterprise integrating a third-party model into its own products owes.

Mandate

The EU AI Act distinguishes three actors across the foundation model life cycle, and each one carries a distinct obligation set.

Layer 1: the pre-training GPAI provider (Articles 51-55)

The entity that pre-trains and releases the foundation model is the GPAI provider. Article 53 obligations include technical documentation under Annex XI, downstream-provider information disclosure under Annex XII, a Union copyright compliance policy, and a published summary of the training content that follows the AI Office template. Article 51(2) presumes the model is systemic-risk if cumulative training compute exceeds 10^25 FLOPs, and Article 52 sets the notification and designation procedure. Systemic-risk providers also owe model evaluations, adversarial testing, serious-incident reporting, and cybersecurity obligations under Article 55.

Layer 2: the fine-tuner

A fine-tuner that materially modifies a GPAI model can become a GPAI provider for the fine-tuned model. The decisive question is whether the fine-tuning is substantial. The Code of Practice operationalizes this in 2026 with thresholds related to additional training compute, modification of model weights, and changes to the model's intended capability profile. Where the fine-tuned model retains significant generality, the fine-tuner inherits the Article 53 obligations for the new model. Where the fine-tuning collapses the generality into a single-task model, the GPAI classification falls away but the fine-tuner may become the provider of a downstream system.

Layer 3: the downstream high-risk integrator (Article 25)

Article 25 recharacterizes a downstream entity as the provider of a high-risk AI system when the entity puts its name on the system, makes a substantial modification, or modifies the intended purpose into the high-risk category. An enterprise that takes a foundation model, builds a clinical decision support tool with it, and places the tool on the market is the provider of a high-risk AI system under Article 25. The full Article 16 through 27 obligations attach: risk management, data governance, technical documentation, record-keeping, transparency, human oversight, accuracy and resilience, cybersecurity, quality management system, conformity assessment, and post-market monitoring.

Multiple layers can apply to the same entity

A large fintech may take a pre-trained foundation model, fine-tune it on proprietary data, and integrate the fine-tuned model into a high-risk credit-decisioning system. The fintech is the fine-tuned-model GPAI provider for Layer 2 and the high-risk system provider for Layer 3. The pre-trained model's original provider retains Layer 1 obligations for the pre-trained release.
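The way layers stack on a single entity can be sketched as a small decision function. This is a simplified illustration, not a legal test: the `Entity` fields and the `applicable_layers` helper are hypothetical names, and each boolean stands in for a judgment that in practice requires legal analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """Hypothetical description of one actor in the model supply chain."""
    pretrains_model: bool           # trains and releases the base model
    fine_tunes_substantially: bool  # its modification is judged substantial
    result_stays_general: bool      # fine-tuned model keeps significant generality
    places_high_risk_system: bool   # markets a high-risk system built on the model


def applicable_layers(e: Entity) -> list[str]:
    """Return every layer of the three-layer structure that applies to e."""
    layers = []
    if e.pretrains_model:
        layers.append("Layer 1: GPAI provider (pre-trained model)")
    if e.fine_tunes_substantially and e.result_stays_general:
        layers.append("Layer 2: GPAI provider (fine-tuned model)")
    if e.places_high_risk_system:
        layers.append("Layer 3: high-risk system provider")
    return layers


# The fintech from the example: no pre-training, substantial fine-tuning,
# a model that stays general, and a high-risk credit-decisioning system.
fintech = Entity(
    pretrains_model=False,
    fine_tunes_substantially=True,
    result_stays_general=True,
    places_high_risk_system=True,
)
for layer in applicable_layers(fintech):
    print(layer)
```

The layers are checked independently rather than as mutually exclusive branches, which mirrors the point that one entity can hold Layer 2 and Layer 3 obligations at once.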

Compliance gap

The layered structure produces a documentation chain that depends on each upstream layer producing what the downstream layer needs.

The Annex XII downstream disclosure must support the high-risk integration

Article 53 requires the GPAI provider to make information available to downstream providers integrating the model. Annex XII lists the categories of information: the model's intended tasks and the type of users, the acceptable use policies, the model's general technical specifications, the date and length of training, the type and provenance of data, the computational resources, and the consumption profile. Downstream providers preparing high-risk conformity files draw on this information. Where the upstream disclosure is sparse, the downstream file is incomplete by reference.

Fine-tuning creates a documentation gap

A fine-tuner who does not produce its own training-data summary, evaluation results, and updated technical documentation leaves the downstream integrator with a gap. The downstream high-risk system carries Article 12 record-keeping at the operational boundary, but the fine-tuning context that affects the system's behavior is undocumented. An auditor reviewing the high-risk conformity file may push the finding upstream to the fine-tuner where the documentation is sparse.

The operational record-keeping is independent of the upstream layers

Article 12 record-keeping at the high-risk system level is the integrator's obligation regardless of the upstream layers' documentation quality. The high-risk system must produce per-decision records with identity, classification, policy state, and decision outcome. The records are operational. They do not depend on the upstream provider documenting the pre-training data or the fine-tuner documenting the adjustment.
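A minimal sketch of such a per-decision record, with a check for the four fields named above, might look like the following. The `DecisionRecord` class, its field names, and the `missing_fields` helper are assumptions made for illustration; the Act does not prescribe a schema.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

# The four elements named in the text; the field names are illustrative.
REQUIRED_FIELDS = ("identity", "classification", "policy_version", "outcome")


@dataclass
class DecisionRecord:
    identity: str        # who (user or agent) made the request
    classification: str  # data classification applied to the request
    policy_version: str  # policy state in force at decision time
    outcome: str         # what was decided, e.g. "allow" or "deny"
    timestamp: str = ""  # filled in at creation if not supplied

    def __post_init__(self) -> None:
        if not self.timestamp:
            self.timestamp = datetime.now(timezone.utc).isoformat()


def missing_fields(record: DecisionRecord) -> list[str]:
    """List required fields that are empty or whitespace-only."""
    data = asdict(record)
    return [name for name in REQUIRED_FIELDS if not data[name].strip()]


complete = DecisionRecord("user:alice", "confidential", "policy-v12", "allow")
partial = DecisionRecord("", "confidential", "", "deny")

print(missing_fields(complete))
print(missing_fields(partial))
```

Nothing in the record refers to the upstream provider's training data or the fine-tuner's documentation, so the integrator can produce it however thin the upstream paperwork is.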

Copyright compliance flows through the chain

Article 53(1)(c) requires the GPAI provider to put in place a policy to comply with Union copyright law, with attention to the Article 4(3) DSM Directive opt-out for text and data mining. A downstream provider integrating the model inherits residual risk where the upstream provider's policy is weak or unverifiable, even though the formal obligation sits with the GPAI provider.

Mandate vs Compliance

The text of the Act layers obligations across the model life cycle. What survives a regulatory review across all three layers is the infrastructure underneath that text: the documentation and records each layer can actually produce on request.

Disclosure test

A national competent authority opens an investigation into a high-risk system that produced harm. The authority asks the downstream integrator for the conformity file, the per-decision records under Article 12, and the upstream GPAI documentation under Annex XII. A compliant integrator produces all three within the regulatory deadline. A non-compliant integrator produces a partial file with upstream gaps and downstream operational logs missing identity, classification, or policy state.

Vendor liability across layers

Each layer is the regulated party for its own obligations. Contractual indemnification between layers does not transfer the regulatory finding. The pre-training provider remains the regulated party for its Article 53 obligations even where the downstream integrator agrees to indemnify. The downstream integrator remains the regulated party for Article 25 high-risk obligations regardless of upstream agreements.

Compliance gap

The compliance gap at the operational boundary is the same regardless of which layer the integrator sits in. The high-risk system must produce contemporaneous, identity-bound, classification-aware, tamper-evident per-decision records. The architecture that produces these records sits at the AI request boundary, not inside the application or the model.

DeepInspect

This is the architecture the high-risk downstream provider needs at the operational boundary regardless of the upstream layers. DeepInspect sits at the AI request boundary as an external enforcement layer that operates as a stateless proxy between authenticated users or agents and any foundation model endpoint. Every HTTP request to a pre-trained or fine-tuned foundation model passes through the proxy. The proxy evaluates per-route, per-role policies using identity context the calling application supplies. The per-decision audit record is committed by the proxy, independent of the foundation model provider and independent of the calling application.
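The per-route, per-role evaluation described above can be illustrated with a small lookup. This is a generic sketch under stated assumptions, not DeepInspect's implementation: the `Request` type, the route names, the roles, the `POLICY` table, and the default-deny behavior are all hypothetical.

```python
from typing import NamedTuple


class Request(NamedTuple):
    route: str       # path of the model endpoint being called
    role: str        # role supplied by the calling application
    data_class: str  # classification applied to the prompt


# Hypothetical policy table: (route, role) -> data classes allowed through.
POLICY = {
    ("/models/general-chat", "analyst"): {"public", "internal"},
    ("/models/general-chat", "clinician"): {"public"},
    ("/models/clinical-ft", "clinician"): {"public", "internal", "patient"},
}


def evaluate(req: Request) -> str:
    """Return "allow" or "deny" for one request.

    Assumption: any (route, role) pair without a policy entry is denied.
    """
    allowed = POLICY.get((req.route, req.role))
    if allowed is None:
        return "deny"
    return "allow" if req.data_class in allowed else "deny"


print(evaluate(Request("/models/clinical-ft", "clinician", "patient")))
print(evaluate(Request("/models/general-chat", "clinician", "patient")))
print(evaluate(Request("/models/clinical-ft", "analyst", "public")))
```

Because the function reads only the request and the policy table, it keeps no per-session state, which is consistent with the stateless-proxy design the paragraph describes.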

The record contains a verified identity for the requester, the role and authorization context, the data classification applied to the prompt, the foundation model and version actually called, the policy version that governed the decision, the decision outcome, and a cryptographic signature that prevents post-hoc modification. The proxy enforces per-route policies on which foundation models a given role can call, which categories of data can flow to which model, and which policy bypasses require independent approval. The Article 12 record-keeping at the high-risk system level is satisfied at the operational boundary, regardless of how the underlying foundation model was trained or fine-tuned.
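To show how tamper evidence works in principle, the sketch below attaches an HMAC-SHA256 tag to a record and verifies it later. Note the substitution: an HMAC is a symmetric message authentication code, used here only because it ships in the Python standard library. A production system would more likely use an asymmetric signature with managed keys. The record fields and function names are illustrative assumptions.

```python
import hashlib
import hmac
import json


def _canonical(record: dict) -> bytes:
    """Serialize deterministically so the same record yields the same bytes."""
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode()


def sign_record(record: dict, key: bytes) -> dict:
    """Return a copy of the record with an HMAC-SHA256 tag attached."""
    tag = hmac.new(key, _canonical(record), hashlib.sha256).hexdigest()
    return {**record, "signature": tag}


def verify_record(signed: dict, key: bytes) -> bool:
    """Recompute the tag over every field except the tag itself."""
    record = {k: v for k, v in signed.items() if k != "signature"}
    expected = hmac.new(key, _canonical(record), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signed.get("signature", ""))


key = b"demo-key-not-for-production"  # hypothetical key for the example
signed = sign_record(
    {
        "identity": "user:alice",
        "role": "clinician",
        "data_class": "patient",
        "model": "example-model@v1",  # hypothetical model identifier
        "policy_version": "policy-v12",
        "outcome": "deny",
    },
    key,
)

tampered = {**signed, "outcome": "allow"}  # post-hoc edit to the outcome

print(verify_record(signed, key))
print(verify_record(tampered, key))
```

Any change to a signed field, such as flipping the outcome after the fact, makes verification fail unless the editor also holds the key.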

If you are facing the August deadline, let's talk.

Beyond the EU AI Act

The same operational architecture supports adjacent obligations on foundation model integration. The NIST AI RMF puts the integrator inside the Map, Measure, and Manage functions for any model the integrator uses, including third-party foundation models. UK AI guidance and emerging US state laws including Colorado SB 169 and California SB 1047 treat the integrator as the regulated entity for operational decisions. Each regime accepts the same architectural record: an inspection layer on the AI request path that writes a tamper-evident, identity-bound, classification-aware record at decision time.

Frequently asked questions

Why does the EU AI Act not use the term "foundation model"?

The Act was drafted as a horizontal regulation that needed terminology durable across architecture changes. The drafters used "general-purpose AI model" to capture both the current foundation models and future architectures that display significant generality. The term applies whether the model is a transformer, a state-space model, a mixture-of-experts system, or a future architecture not yet deployed.

Does fine-tuning automatically make me a GPAI provider?

No. Fine-tuning that does not materially change the model's capabilities does not make the fine-tuner a GPAI provider. The Code of Practice operationalizes the substantial-modification threshold in 2026. Light parameter-efficient fine-tuning, RLHF on small datasets, and LoRA adapters that adjust narrow behaviors typically do not cross the threshold. Full continued pre-training on a large new corpus does cross it.

What happens if I integrate a foundation model into a non-high-risk system?

The downstream Article 25 obligations do not attach because the system is not high-risk. The GPAI provider's Article 53 obligations still apply to the foundation model. The downstream entity may carry transparency obligations under Article 50 if the system interacts with natural persons. The Article 12 record-keeping mandate does not attach to non-high-risk systems, but adjacent regimes (DORA, GDPR) may impose record requirements regardless.

How do I know if my foundation model is systemic-risk?

The Article 51(2) threshold is cumulative training compute above 10^25 floating-point operations. Providers are required to notify the AI Office when training compute crosses the threshold. The AI Office can also designate a model as systemic-risk below the threshold based on capabilities or reach. A provider that believes a designation is incorrect can challenge it under Article 52(2), but the rebuttal burden sits with the provider.
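For a rough sense of where a model sits relative to the 10^25 threshold, a common rule of thumb for dense transformer models estimates training compute as about 6 × parameters × training tokens. The Act does not prescribe this formula; it is an approximation used here as an assumption, and the model sizes below are made-up examples.

```python
SYSTEMIC_RISK_THRESHOLD_FLOPS = 1e25  # threshold discussed above


def estimate_training_flops(parameters: float, tokens: float) -> float:
    """Rule-of-thumb estimate: roughly 6 FLOPs per parameter per token."""
    return 6.0 * parameters * tokens


def presumed_systemic_risk(parameters: float, tokens: float) -> bool:
    """True if the estimated compute exceeds the threshold."""
    return estimate_training_flops(parameters, tokens) > SYSTEMIC_RISK_THRESHOLD_FLOPS


# Hypothetical models: 70B and 400B parameters, each trained on 15T tokens.
for name, params, tokens in [
    ("70B model", 70e9, 15e12),
    ("400B model", 400e9, 15e12),
]:
    flops = estimate_training_flops(params, tokens)
    print(f"{name}: {flops:.2e} FLOPs, presumed systemic-risk: "
          f"{presumed_systemic_risk(params, tokens)}")
```

Under this estimate the 70B example lands at about 6.3 × 10^24 FLOPs, below the threshold, while the 400B example lands at about 3.6 × 10^25, above it. An estimate this close to the line is a prompt for a proper compute accounting, not a substitute for one.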

What does the GPAI provider have to disclose to me as a downstream integrator?

Annex XII lists the categories: intended tasks and users, acceptable use policies, technical specifications, training date and duration, type and provenance of training data, computational resources, energy consumption, and known limitations. The Code of Practice operationalizes the format and detail level. Providers signing the Code commit to specific disclosure templates that the AI Office accepts as a presumption of complia