Open Source LLM Guardrails: The Libraries Available, Where They Sit, and What They Cannot Replace
Open source LLM guardrails libraries cover prompt-side and response-side filtering inside the application or inference path. Llama Guard, NeMo Guardrails, Guardrails AI, LMQL, Outlines, and Rebuff each occupy a different position in the stack and produce different control surfaces. This piece walks through the libraries available, the architectural position each one takes, the controls they produce, and the regulatory profile that requires an external inspection layer on top of any of them.

Open source LLM guardrails libraries took shape over 2023 to 2025 as a response to prompt injection, data leakage, and adversarial input patterns inside the application's inference path. Meta released Llama Guard. NVIDIA released NeMo Guardrails. Guardrails AI shipped an open source SDK for output validation. LMQL framed guardrails as a constrained-generation language. Rebuff focused on prompt injection detection. The libraries cover different points in the request path, produce different kinds of records, and compose differently with an external inspection layer. The choice between them is an architectural choice about which layer in the stack the guardrails operate at, not a feature comparison.
I want to walk through the open source guardrails libraries that show up in production deployments, the architectural position each one takes, the control they produce, and the regulatory profile that requires an external inspection layer above any of them.
Llama Guard: model-side prompt and response classification
Llama Guard is a Meta-released model fine-tuned for classifying prompts and responses against a taxonomy of unsafe content. The application runs Llama Guard as an additional inference call on the prompt before it reaches the primary model and on the response before it returns to the caller. The classification produces a label (safe, or unsafe with a category) that the application acts on.
The position is model-side, inside the inference path. The classifier call is part of the application that calls the primary model. The classification quality depends on the Llama Guard model version and the taxonomy the application configures. Llama Guard produces no native audit record series. Any logging the application does is application-controlled.
Llama Guard is useful for catching common unsafe content patterns at the model layer before the primary model responds. It is not a substitute for an inspection layer at the HTTP request boundary because the classification call runs inside the application boundary, and records written inside that boundary do not pass the regulator's write-path independence test.
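The pre- and post-classification pattern described above can be sketched in a few lines. This is a minimal illustration, not Meta's reference code: `classify` and `generate` are hypothetical callables standing in for the Llama Guard call and the primary model call, and the reply format assumed here (a first line of `safe` or `unsafe`, with category codes on the next line) should be checked against the model card of the version in use.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Verdict:
    safe: bool
    categories: List[str]


def parse_verdict(raw: str) -> Verdict:
    """Parse a classifier reply of the assumed form 'safe' or 'unsafe' plus a category line."""
    lines = [ln.strip() for ln in raw.strip().splitlines() if ln.strip()]
    if lines and lines[0].lower() == "safe":
        return Verdict(True, [])
    # Fail closed: an empty or unrecognized reply is treated as unsafe.
    cats = [c.strip() for c in lines[1].split(",")] if len(lines) > 1 else []
    return Verdict(False, cats)


def guarded_call(
    prompt: str,
    classify: Callable[[str], str],   # hypothetical: runs the classifier model
    generate: Callable[[str], str],   # hypothetical: runs the primary model
) -> str:
    # Classify the prompt before it reaches the primary model.
    pre = parse_verdict(classify(prompt))
    if not pre.safe:
        return f"[blocked prompt: {','.join(pre.categories) or 'unspecified'}]"
    response = generate(prompt)
    # Classify the response before it returns to the caller.
    post = parse_verdict(classify(response))
    if not post.safe:
        return f"[blocked response: {','.join(post.categories) or 'unspecified'}]"
    return response
```

Both classification calls and the decision to block run in the application process, which is why any record of them is application-controlled.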
NeMo Guardrails: programmable conversational rails
NeMo Guardrails is an NVIDIA-released SDK for defining conversational rails in a domain-specific language (Colang). The library runs inside the application and evaluates the user input, the model response, and the conversation context against the rails the application defines. The rails can route the conversation, refuse certain topics, sanitize input, or call out to external tools.
The position is application-side. The library is part of the application that orchestrates the conversation. The library produces a runtime trace the application can log. The trace is application-controlled.
The library is useful for shaping the conversation flow in domains where the application has strong intent constraints (a customer support bot, a medical triage assistant). The library is not a substitute for an inspection layer at the HTTP request boundary because the same application that runs the rails also writes the records.
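To make the idea of a rail concrete, here is a toy sketch in plain Python. It is not Colang and does not use the NeMo Guardrails API; `refuse_topic`, `run_rails`, and the trace format are hypothetical names chosen for illustration. It shows a rail that short-circuits the conversation on a restricted topic and a runtime trace built by the same process that runs the rails.

```python
from typing import Callable, List, Optional, Tuple

# A rail returns a canned reply to short-circuit the turn, or None to let it continue.
Rail = Callable[[str], Optional[str]]


def refuse_topic(keywords: List[str], reply: str) -> Rail:
    """Build a rail that refuses when any keyword appears in the user input."""
    def rail(user_input: str) -> Optional[str]:
        text = user_input.lower()
        return reply if any(k in text for k in keywords) else None
    return rail


def run_rails(
    user_input: str,
    rails: List[Tuple[str, Rail]],
    model: Callable[[str], str],  # hypothetical primary model call
) -> Tuple[str, List[str]]:
    trace: List[str] = []  # written by the application itself
    for name, rail in rails:
        outcome = rail(user_input)
        trace.append(f"{name}:{'fired' if outcome is not None else 'passed'}")
        if outcome is not None:
            return outcome, trace
    trace.append("model:called")
    return model(user_input), trace


rails = [("no_diagnosis", refuse_topic(["diagnose"], "I can't provide a diagnosis."))]
reply, trace = run_rails("Can you diagnose this rash?", rails, lambda s: "model reply")
```

The trace here is just a list the application appends to, which mirrors the point above: the component that enforces the rail is also the component that writes the record.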
Guardrails AI: output validation SDK
Guardrails AI is an open source SDK for validating LLM outputs against typed schemas and content checks. The SDK runs inside the application and validates the model response before the application acts on it. Validators cover structured-output conformance (JSON schemas), content checks (regex, classifier outputs), and rerun-on-failure flows.
The position is application-side, response-side. The SDK validates the response after the model has generated it. The SDK does not inspect the prompt at the request boundary and does not produce a request-time record. The SDK produces a validation result the application acts on.
The SDK is useful for enforcing structured-output contracts and for catching content issues in the response before the application acts on it. The SDK does not produce a per-decision audit record series at the request boundary and is not a substitute for an inspection layer.
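The validate-then-rerun flow can be illustrated with the standard library alone. The sketch below does not use the Guardrails AI SDK; the schema, the email check, and the `generate_validated` helper are assumptions made up for this example. It checks structure first, then content, and re-asks the model with the rejection reason when validation fails.

```python
import json
import re
from typing import Callable, Optional, Tuple

# Hypothetical output contract for a support-ticket triage response.
REQUIRED = {"ticket_id": str, "priority": str}
ALLOWED_PRIORITIES = {"low", "medium", "high"}
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def validate(raw: str) -> Tuple[Optional[dict], str]:
    """Return (parsed, 'ok') on success or (None, reason) on failure."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None, "not valid JSON"
    if not isinstance(data, dict):
        return None, "top-level value is not an object"
    for field, typ in REQUIRED.items():
        if not isinstance(data.get(field), typ):
            return None, f"missing or mistyped field: {field}"
    if data["priority"] not in ALLOWED_PRIORITIES:
        return None, "priority outside allowed set"
    if EMAIL.search(raw):
        return None, "response contains an email address"
    return data, "ok"


def generate_validated(prompt: str, model: Callable[[str], str], max_attempts: int = 3) -> dict:
    reason = ""
    for _ in range(max_attempts):
        # On a retry, tell the model why the previous attempt was rejected.
        hint = f"\nPrevious attempt rejected: {reason}" if reason else ""
        data, reason = validate(model(prompt + hint))
        if data is not None:
            return data
    raise ValueError(f"no valid output after {max_attempts} attempts: {reason}")
```

Everything here runs after generation and inside the application, so it yields a validation result rather than a request-time record.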
LMQL and Outlines: constrained generation
LMQL and Outlines are open source libraries for constrained generation. The libraries shape the model's output at generation time by restricting the token space the model samples from to outputs that match a grammar or a type. The result is that the model produces only outputs that conform to the constraint.
The position is inside the inference path. The libraries operate at the model's sampling step and are part of the inference stack. The libraries produce no audit record series at the request boundary.
The libraries are useful for enforcing strict output structure (a specific JSON shape, a specific enum). The libraries operate on the model's output and do not inspect the prompt at the request boundary, do not evaluate identity-bound policy, and do not produce records the regulator consumes.
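A toy decoder shows what restricting the token space means in practice. This is neither LMQL nor Outlines code: the vocabulary, the `score` function standing in for model logits, and the greedy loop are all assumptions for illustration. At each step, any token that would take the output off the path to an allowed enum value is masked out before the highest-scoring token is chosen.

```python
from typing import Callable, List


def allowed_next(prefix: str, choices: List[str], vocab: List[str]) -> List[str]:
    """Tokens that keep prefix + token a prefix of at least one allowed choice."""
    return [t for t in vocab if t and any(c.startswith(prefix + t) for c in choices)]


def constrained_decode(
    score: Callable[[str, str], float],  # hypothetical stand-in for model logits
    choices: List[str],
    vocab: List[str],
) -> str:
    out = ""
    # Greedy decoding; stops at the first complete choice, so this toy version
    # assumes no allowed choice is a prefix of another.
    while out not in choices:
        candidates = allowed_next(out, choices, vocab)
        if not candidates:
            raise ValueError("vocabulary cannot spell any allowed choice")
        out += max(candidates, key=lambda t: score(out, t))
    return out


vocab = ["app", "rove", "rej", "ect", "maybe", "de", "ny"]
choices = ["approve", "reject"]
# The unconstrained model would prefer "maybe", which is not an allowed value.
score = lambda prefix, tok: {"maybe": 10.0, "rej": 2.0, "app": 1.0}.get(tok, 0.0)
decision = constrained_decode(score, choices, vocab)
```

The constraint acts only on what the model emits. Nothing in the loop looks at who sent the prompt or what policy applies to them.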
Rebuff: prompt injection detection
Rebuff is an open source library focused on prompt injection detection. The library applies multiple detection layers (heuristics, vector-similarity to known injection patterns, an LLM-based detector, and a canary-word check) and produces a detection signal the application acts on.
The position is application-side, request-side. The library inspects the prompt before the request reaches the primary model and produces a detection score the application acts on. The library does not produce a per-decision audit record series at the request boundary.
The library is useful as a detection layer inside the application. Detection is not the same as enforcement. The application has to act on the signal. The signal generation and the action are both inside the application boundary. The library does not satisfy the regulator's expectation for an independent record series at the HTTP request boundary.
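Two of the detection layers mentioned above, heuristics and a canary-word check, are simple enough to sketch without any dependency. The code below does not use the Rebuff API; the patterns, the threshold, the canary format, and the `handle` function are hypothetical. It also makes the detection-versus-enforcement distinction visible: the score is only a signal until application code decides what to do with it.

```python
import re
import secrets
from typing import Callable, Tuple

# Illustrative patterns only; a real detector would need a far broader set.
INJECTION_PATTERNS = [
    re.compile(r"ignore\s+(?:all\s+|any\s+|the\s+)?(?:previous|prior|above)\s+instructions", re.I),
    re.compile(r"reveal\s+(?:your|the)\s+(?:system\s+)?prompt", re.I),
]


def heuristic_score(prompt: str) -> float:
    """Fraction of known injection patterns found in the prompt (detection signal)."""
    hits = sum(1 for p in INJECTION_PATTERNS if p.search(prompt))
    return hits / len(INJECTION_PATTERNS)


def add_canary(system_prompt: str) -> Tuple[str, str]:
    """Embed a random marker in the system prompt so leaks can be detected."""
    canary = secrets.token_hex(8)
    return f"{system_prompt}\n<!-- {canary} -->", canary


def canary_leaked(response: str, canary: str) -> bool:
    return canary in response


def handle(prompt: str, model: Callable[[str, str], str], threshold: float = 0.5) -> str:
    # Enforcement: the application decides what to do with the signal.
    if heuristic_score(prompt) >= threshold:
        return "[blocked: likely prompt injection]"
    guarded, canary = add_canary("You are a support assistant.")
    response = model(guarded, prompt)
    if canary_leaked(response, canary):
        return "[blocked: system prompt leaked]"
    return response
```

Signal generation (`heuristic_score`, `canary_leaked`) and the action (`handle`) live in the same process, which is the boundary issue the paragraph above describes.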
What the libraries collectively cover and what they leave open
The libraries collectively cover model-side prompt classification (Llama Guard), conversational rails (NeMo Guardrails), output validation (Guardrails AI), constrained generation (LMQL, Outlines), and prompt injection detection (Rebuff). Each library produces controls that operate inside the application or the inference path. The controls are useful as defense-in-depth layers.
The libraries leave four obligations open. The first is identity-bound policy at the request boundary. The libraries do not evaluate the natural-person identity, the agent identity, or the role context against the request. The second is independent audit records. The libraries produce records inside the application boundary, which fails the regulator's write-path independence test. The third is policy versioning. The libraries do not manage versioned policy bundles of the kind a policy administration point maintains. The fourth is integrity-stamped audit records. The libraries do not produce cryptographically signed records with hash-chain integrity.
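The fourth obligation, signed records with hash-chain integrity, can be illustrated with a short sketch. It is a generic construction under stated assumptions, not any product's record format: each record's hash covers the previous record's hash, and an HMAC over that hash stands in for the signature. HMAC is symmetric, so a real deployment would more likely use an asymmetric signature with the key held outside the application. The field names are hypothetical.

```python
import hashlib
import hmac
import json
from typing import Dict, List

GENESIS = "0" * 64  # placeholder hash that precedes the first record


def _digest(prev_hash: str, payload: Dict) -> str:
    # Canonical JSON so the same payload always hashes to the same value.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode()).hexdigest()


def append_record(chain: List[Dict], payload: Dict, key: bytes) -> Dict:
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    digest = _digest(prev_hash, payload)
    signature = hmac.new(key, digest.encode(), hashlib.sha256).hexdigest()
    record = {"payload": payload, "prev_hash": prev_hash, "hash": digest, "signature": signature}
    chain.append(record)
    return record


def verify_chain(chain: List[Dict], key: bytes) -> bool:
    prev_hash = GENESIS
    for rec in chain:
        digest = _digest(prev_hash, rec["payload"])
        expected = hmac.new(key, digest.encode(), hashlib.sha256).hexdigest()
        if rec["prev_hash"] != prev_hash or rec["hash"] != digest:
            return False  # payload edited, or a record removed or reordered
        if not hmac.compare_digest(rec["signature"], expected):
            return False  # record not produced by the key holder
        prev_hash = digest
    return True


key = b"demo-key-held-outside-the-application"
chain: List[Dict] = []
append_record(chain, {"decision": "block", "rule_id": "R-001"}, key)
append_record(chain, {"decision": "pass", "rule_id": "R-002"}, key)
```

Editing any earlier payload changes its hash, which breaks every later link, so tampering is detectable by anyone who can recompute the chain.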
The regulator's expectation is for an inspection layer at the HTTP request boundary that satisfies the four obligations. The libraries compose with the inspection layer: the inspection layer is the primary control layer, the libraries are defense-in-depth layers inside the application.
What the regulatory profile expects
EU AI Act Article 12 expects records over the lifetime of the system that ensure traceability and that include input data, identity of natural persons, and the period of use. The records have to be produced by a system independent of the application. DORA Article 19 expects records of operational events with timestamps and identity. The records have to support audit replay. Fannie Mae LL-2026-04 expects records that establish audit trails for AI-assisted lending decisions. The records have to be retainable under the lender's retention obligations.
The open source libraries produce records inside the application. The records lack the cryptographic integrity stamp the regulator expects. The records lack the write-path independence the regulator expects. The records lack the field set (route, policy version, decision outcome with rule identifier) the regulator consumes. The libraries cover useful defense-in-depth layers and are not a substitute for the inspection layer the regulatory regime expects.
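The field set named above (route, policy version, decision outcome with rule identifier) is easier to picture with a small example. The sketch below is hypothetical throughout: the rule shape, the default-deny fallback, and the field names are illustrative and are not a regulatory schema or any vendor's format. It evaluates a request's identity and role against a versioned rule set and emits a decision record carrying those fields.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List

POLICY_VERSION = "example-v1"  # hypothetical version label for the rule set


@dataclass(frozen=True)
class Rule:
    rule_id: str
    route: str
    allowed_roles: FrozenSet[str]


@dataclass(frozen=True)
class Request:
    user_id: str
    role: str
    route: str


def decide(request: Request, rules: List[Rule]) -> Dict[str, str]:
    # Default deny: a route with no matching rule is blocked.
    outcome, rule_id = "block", "default-deny"
    for rule in rules:
        if rule.route == request.route:
            outcome = "pass" if request.role in rule.allowed_roles else "block"
            rule_id = rule.rule_id
            break
    return {
        "identity": request.user_id,
        "role": request.role,
        "route": request.route,
        "policy_version": POLICY_VERSION,
        "decision": outcome,
        "rule_id": rule_id,
    }


rules = [Rule("R-001", "/v1/chat", frozenset({"underwriter"}))]
record = decide(Request("u2", "intern", "/v1/chat"), rules)
```

Because the decision is a pure function of the request and the versioned rules, the same inputs replay to the same outcome, which is what makes the rule identifier and policy version useful in a record.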
DeepInspect
This is the gap DeepInspect closes. DeepInspect sits inline between authenticated users or agents and any LLM over HTTP. The inspection layer evaluates identity-bound policy at the request boundary, applies pass, block, or modify decisions, and commits a per-decision audit record to durable, append-only storage with a cryptographic integrity signature before the response is forwarded. The record series carries identity, route, policy version, data classification outcome, decision outcome, model and version, and integrity metadata in a format that EU AI Act Article 12, DORA Article 19, Fannie Mae LL-2026-04, NIST AI RMF, and HIPAA 45 CFR 164.312 reviewers consume.
The open source libraries continue to operate inside the application as defense-in-depth layers. Llama Guard catches unsafe model output before it returns to the caller. NeMo Guardrails shapes the conversation flow. Guardrails AI validates output structure. The inspection layer at the HTTP request boundary handles the policy and the audit record series the regulator expects.
If you are running open source guardrails and want to add the inspection layer at the request boundary the regulator expects, book a demo today.
Frequently asked questions
- Can open source guardrails libraries satisfy EU AI Act Article 12 record-keeping?
The libraries produce records inside the application boundary. EU AI Act Article 12 expects records over the lifetime of the system that include input data, identity, and the period of use, produced by a system independent of the application. The libraries fail the write-path independence test by construction. The libraries are useful as defense-in-depth layers inside the application. They do not stand alone as the primary control layer Article 12 expects.
- Which open source guardrails library is the most production-ready?
Each library targets a different position in the stack. Llama Guard is production-ready as a model-side classification layer. NeMo Guardrails is production-ready as a conversational-flow layer in domains with strong intent constraints. Guardrails AI is production-ready as an output-validation layer. The choice depends on what control the application needs. None of the libraries replaces the inspection layer at the HTTP request boundary, so the criterion a buyer should apply is which library composes with the inspection layer, not which library is sufficient alone.
- Can a deployment skip the inspection layer and rely only on open source guardrails?
A deployment that wants to satisfy EU AI Act Article 12, DORA Article 19, Fannie Mae LL-2026-04, NIST AI RMF, or HIPAA 45 CFR 164.312 record-keeping cannot skip the inspection layer at the request boundary. The open source libraries produce records inside the application and fail the write-path independence test. A deployment that does not have a regulatory record-keeping obligation can run the libraries as defense-in-depth layers without the inspection layer, but the libraries do not replace the regulatory record series.
- Do the libraries compose with the inspection layer at the request boundary?
Yes, and the composition is the production pattern. The inspection layer evaluates identity-bound policy and commits the audit record at the HTTP request boundary. The libraries operate inside the application as defense-in-depth layers. The inspection layer produces the record series the regulator consumes. The libraries shape the conversation, validate the output, and catch model-side patterns before the response returns to the caller. The two layers cover different obligations and the deployment runs both.
- How does Rebuff's prompt injection detection compare to an inspection-layer policy?
Rebuff produces a detection signal the application acts on. The signal generation and the action are both inside the application. The inspection layer at the request boundary produces a deterministic policy decision with a versioned rule identifier and commits the decision to the audit record before the response is forwarded. The two layers can compose: Rebuff inside the application as a detection signal, the inspection layer at the request boundary as the enforcement and audit point. The composition produces the defense-in-depth posture and the record series the regulator consumes.