
Platform & Architecture

97 posts on platform & architecture.

AI Agent Control Plane: Identity, Authorization, and Action Lineage

An AI agent control plane is the architectural layer that authorizes agent actions, enforces identity-bound policy on each action, and records action lineage for audit. The pattern emerged because the chatbot architecture (one prompt, one response, one log) does not cover the action surface autonomous agents produce. This piece walks through the control plane primitives, the integration points with the agent framework, and the performance characteristics the layer needs to maintain under production load.

ai-agents, agentic-ai, ai-control-plane, authorization, ai-security, architecture

AI Gateway Performance Benchmark: What to Measure and How

AI gateway performance benchmarks compare proxy products on latency, throughput, and behavior under load. The benchmarks that matter for production deployment are p95 and p99 latency under realistic concurrency, tail-latency behavior when policy evaluation gets expensive, throughput ceiling per node, and behavior under upstream provider degradation. This piece walks through the benchmark methodology that produces production-actionable numbers and the comparison points worth tracking.

ai-gateway, performance, benchmark, latency, engineering

AI Gateway Architecture: The Components That Sit Between an Enterprise Caller and an LLM Endpoint

An AI gateway architecture has six core components: TLS termination, identity binding, request inspection, policy evaluation, the model router, and the audit record emitter. Each component is a placement decision that ties to a regulatory obligation or an operational property. This piece walks through the components, the placement decisions, and how the gateway integrates with the corporate IdP and the SIEM.

ai-gateway-architecture, ai-gateway, ai-security, inline-enforcement, audit-logs

Zero Trust LLM: How the Zero-Trust Principles Apply to AI Request Flows

Zero trust applied to LLM traffic means three things at the architectural level. Identity is verified at every request, not just at the session. Authorization is evaluated per request against the user, agent, role, and resource. The audit record is written independently of the application or the model that handled the request. The three principles map directly to the inspection-layer pattern that closes the post-authentication gap in AI deployments.

zero-trust, llm-security, inline-enforcement, ai-policy-enforcement, identity, audit-logs

AI Gateway Latency: Why Sub-50ms Overhead Sits Below the Noise Floor of LLM Inference

LLM inference takes 500 ms to 5 seconds per response. A well-engineered AI gateway adds under 50 ms of overhead in internal testing. That gap between inference time and gateway overhead, at least 10x and up to 100x at the slow end of inference, is the architectural fact that makes inline enforcement viable for regulated production AI. The latency budget across policy evaluation, prompt classification, identity validation, and audit commit fits inside the 50 ms envelope under realistic load.

ai-gateway, latency, inline-enforcement, performance, ai-policy-enforcement, audit-logs

Model Context Protocol Security: How the MCP Transport Layer Changes the Inspection Boundary

The Model Context Protocol standardizes how an LLM client connects to tool servers and exchanges context, tool calls, and tool results. The transport layer carries the agent identity, the tool call payloads, and the tool return values. The inspection boundary an MCP deployment has to cover is the HTTP leg between the MCP client and the MCP server. This piece walks through the transport modes MCP supports, the inspection target on each, the identity-aware policy decisions the deployment makes per call, and the audit record format that survives an Article 12 review.

mcp, model-context-protocol, agent-security, ai-tooling, inline-enforcement, audit-logs

Zero Trust AI: Per-Request Evaluation at the Model Boundary

Zero trust applied to AI means evaluating every model request against verified identity, current policy, and prompt-level classification. The architectural pattern is an enforcement proxy at the HTTP AI request boundary. The post-authentication gap is the most common failure mode in current deployments.

zero-trust, ai-security, identity-and-authorization, policy-enforcement, inline-enforcement, architecture

22-Second Breach Windows: Why AI Enforcement Must Be Inline

Mandiant M-Trends 2026 measured median attack handoff at 22 seconds. At that tempo, log-and-alert fails as a control. Inline enforcement at the AI request boundary makes the policy decision before the request reaches the model. Enforcement overhead under 50 ms is invisible against model inference that takes 500 ms to 5 seconds.

ai-security, inline-enforcement, policy-enforcement, cybersecurity, architecture, zero-trust

Model Guardrails Are Probabilistic, Not Enforceable Controls

Model guardrails are trained behaviors inside the inference process. They degrade under fine-tuning, adversarial prompting, and role-play framing. External enforcement at the AI request boundary produces deterministic controls and identity-bound audit records that guardrails alone cannot.

ai-security, llm-security, prompt-injection, policy-enforcement, architecture, inline-enforcement

AI Agent Identity: NIST Pillar 1 in Production Deployments

NIST Pillar 1 names verified agent identity as the foundation of the AI agent identity and authorization framework. Per-agent identifiers, delegated authority from the authorizing user, and structured propagation to the model API call are the production requirements. Static service credentials fail the test.

agentic-ai, identity-and-authorization, nist-ai-rmf, ai-security, architecture, zero-trust

AI Security for Engineering Copilots: The Identity, Source-Code, and Audit Controls a Production Deployment Has To Run

Engineering copilots reach across the source repository, the build infrastructure, the package registry, and the production credential store. The decisions the copilot supports cross export-control boundaries, customer source-code confidentiality terms, and the secret-handling rules the security team has built. This piece walks through the identity-aware policy decisions an engineering copilot deployment has to make at the request boundary, the audit record format that survives SOC 2 Type II and customer audits, and the architectural pattern that closes the gap.

engineering-copilot, source-code, secrets-management, ai-security, soc-2, audit-logs

Securing the Inference Lifecycle: The Five Stages Where the Enforcement Layer Has To Sit

The AI inference lifecycle is the sequence the application runs every time the model produces a response. Most security programs cover model training and post-deployment monitoring but leave the inference path itself uninstrumented. This piece walks through the five stages of the inference lifecycle, the control points each stage exposes at the request boundary, the per-decision audit record the deployment has to commit, and the architectural pattern that closes the inference-time gaps a 2022-era AppSec program leaves open.

inference-lifecycle, ai-security, inline-enforcement, audit-logs, ai-architecture, policy-enforcement