Can a small security team run this framework, or does it need dedicated headcount?

A two-person security team can run the discovery framework if they treat it as the primary project for the six weeks. The framework was designed assuming a CISO plus one senior security engineer, with part-time support from compliance, legal, and IT operations during specific weeks. Larger organizations may run the framework with a dedicated AI security working group, which compresses the timeline marginally but adds coordination overhead. The bottleneck is rarely team size. It is access to the data sources (DNS logs, CASB, SSO, SaaS audit APIs) and decision authority for the sanctioned-tool register.

What if we have no CASB?

The CASB layer is helpful but optional for the discovery framework. Without a CASB, Week 2 substitutes the native audit APIs of the major SaaS tools the organization uses. Microsoft 365 exposes Copilot usage through its audit log API. Google Workspace exposes AI activity through the Reports API. Most major SaaS platforms now offer a way to extract AI feature usage records. The integration is more labor-intensive than a CASB single-pane configuration, but it produces comparable data. Discover the available APIs in Week 1, then build the extraction in Week 2.

How do we handle senior leaders whose use case is approved but whose requests the framework would block?

The framework supports per-role policy from Week 4 forward. Senior leaders whose role requires access to data classes that would otherwise be blocked get a documented exception bound to their identity and role. The exception lives in the policy. It is not a permanent bypass of inspection. Their requests still produce audit records, and those records show the elevated permission and the policy version that authorized it. This is the model that satisfies the disclosure-on-demand requirements under regulatory regimes and provides the documentary defense if the elevated access is later questioned.
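A minimal sketch of how an identity-bound exception could be evaluated while still producing an audit record. The policy version string, the data class names, the example identity, and the function name evaluate are all illustrative assumptions, not part of the framework or any product.

```python
# Hypothetical policy data (assumptions for illustration only).
POLICY_VERSION = "policy-v3"
BLOCKED_CLASSES = {"financial-forecast", "phi"}
# Exceptions are bound to an (identity, role) pair, not to a network path.
EXCEPTIONS = {("cfo@example.com", "CFO"): {"financial-forecast"}}


def evaluate(user, role, data_class):
    """Return an audit record for every request, elevated or not."""
    elevated = data_class in EXCEPTIONS.get((user, role), set())
    allowed = data_class not in BLOCKED_CLASSES or elevated
    return {
        "user": user,
        "role": role,
        "data_class": data_class,
        "outcome": "allow" if allowed else "block",
        "elevated": elevated,            # shows the exception was used
        "policy_version": POLICY_VERSION,  # shows which policy authorized it
    }
```

The point of the design is that the exception changes the outcome but not the inspection path: the elevated request is still evaluated and still recorded.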

Does this framework apply to AI agents and autonomous systems, not just human users?

Yes, with calibration. AI agents and autonomous systems generate AI traffic at higher volumes than human users and use service-account credentials rather than personal SSO. The framework's Week 4 identity mapping addresses this by requiring agent identity (NIST Pillar 1 framing) and delegated authority (NIST Pillar 2) for every agent. Agent traffic gets inspected and recorded the same way human traffic does. The volume scaling matters at the inspection layer: an agent that issues 10,000 requests per minute requires inspection infrastructure sized for that throughput.
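As a rough sizing illustration for the 10,000 requests per minute figure above, the arithmetic can be sketched as follows. The per-instance capacity of 100 requests per second and the 50% headroom are assumed numbers chosen for the example; real capacity has to be measured for the inspection tool in use.

```python
import math


def inspection_instances(requests_per_minute, per_instance_rps, headroom=0.5):
    """Estimate how many inspection instances a given agent load needs.

    per_instance_rps and headroom are assumptions supplied by the caller.
    """
    required_rps = requests_per_minute / 60 * (1 + headroom)
    return math.ceil(required_rps / per_instance_rps)


# 10,000 requests/minute is roughly 167 requests/second before headroom.
instances = inspection_instances(10_000, per_instance_rps=100)
```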

What happens when the discovery surface changes after Week 6?

New AI tools enter the organization continuously. The Week 6 governance handoff includes a recurring discovery cadence (typically quarterly) that re-runs the Week 1 DNS and expense baseline, the Week 3 employee survey (annually rather than quarterly to manage survey fatigue), and the Week 5 inspection coverage review. Tools that appear in the new baseline are routed through the governance committee for the approve / contract / prohibit decision. The discovery framework completes once. The discovery cadence runs forever.
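The quarterly re-run reduces to a comparison between the fresh baseline and the existing register. A minimal sketch, assuming both are available as collections of tool names (the function name and sample tool names are hypothetical):

```python
def tools_needing_review(new_baseline, register):
    """Tools seen in the new baseline that have no register entry yet."""
    return sorted(set(new_baseline) - set(register))


# Hypothetical example: one new tool appeared since the last run.
queue = tools_needing_review(
    new_baseline=["ChatGPT", "Claude", "NewNotetaker"],
    register=["ChatGPT", "Claude"],
)
```

Each name in the resulting queue would go to the governance committee for the approve / contract / prohibit decision.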

Shadow AI Discovery Framework: The Six-Week Path From Blind to Inventoried

86% of IT leaders are completely blind to employee AI interactions (Cloud Radix). Only 37% of organizations have any detection or governance policies in place for AI usage (Netwrix). The standard organizational response is to buy a shadow AI discovery tool, deploy it, get a long list of alerts, and watch the list sit in a backlog because the team has no framework to act on it. The tool is not the gap. The framework is.

I want to walk through a working six-week discovery framework that begins with the data the organization already has and adds inspection only after the surface is mapped.

Week 1: DNS resolution and expense reports

The first week produces the discovery baseline from data the organization is already collecting. DNS resolution logs from the corporate resolver show which AI provider domains have been resolved by which devices over the last 90 days. Expense reports show which AI subscriptions have been reimbursed (often by department, sometimes by individual). SSO sign-in logs show which AI tools employees have authenticated to with their corporate identity.

This three-source baseline produces the first inventory of the AI tools in active use across the organization. The list is incomplete (personal-account access from corporate devices does not appear in SSO logs; out-of-pocket subscriptions do not appear in expense data; mobile data does not appear in corporate DNS), but it captures the bulk of the sanctioned and semi-sanctioned surface in approximately three to five business days.

The deliverable from Week 1 is a spreadsheet listing each AI tool, the count of employees who appear to use it, the most active department, and a flag for whether the tool has a corporate contract. The CISO walks this spreadsheet through the next leadership meeting.
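A minimal sketch of how the three sources could be merged into that spreadsheet, assuming each source has already been exported to simple records. The field names, the domain-to-tool mapping, and the function name build_baseline are illustrative assumptions; real exports will need their own parsing.

```python
from collections import Counter, defaultdict

# Hypothetical mapping from resolved domain to tool name (assumption).
DOMAIN_TO_TOOL = {"chatgpt.com": "ChatGPT", "claude.ai": "Claude"}


def build_baseline(dns_rows, expense_rows, sso_rows, contracted_tools):
    """Merge DNS, expense, and SSO evidence into one row per AI tool."""
    users = defaultdict(set)            # tool -> distinct people seen
    departments = defaultdict(Counter)  # tool -> evidence count per department

    for row in dns_rows:  # assumed keys: device_owner, department, domain
        tool = DOMAIN_TO_TOOL.get(row["domain"])
        if tool is None:
            continue  # not a known AI provider domain
        users[tool].add(row["device_owner"])
        departments[tool][row["department"]] += 1

    for row in expense_rows + sso_rows:  # assumed keys: user, department, tool
        users[row["tool"]].add(row["user"])
        departments[row["tool"]][row["department"]] += 1

    return [
        {
            "tool": tool,
            "apparent_users": len(users[tool]),
            "most_active_department": departments[tool].most_common(1)[0][0],
            "has_corporate_contract": tool in contracted_tools,
        }
        for tool in sorted(users)
    ]
```

Counting distinct people across all three sources avoids double-counting an employee who appears in both the DNS and SSO logs.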

Week 2: CASB and SaaS audit logs

Week 2 expands the surface from network-level signal to application-level signal. CASB integration with the organization's primary SaaS tenants (Microsoft 365, Google Workspace, Salesforce, Notion, ServiceNow, Slack) exposes AI feature usage inside the SaaS. The CASB shows which users have generated Copilot prompts in Word, which have used Gemini in Sheets, which have invoked Notion AI.

The application-level signal closes a gap the DNS baseline missed: AI usage that flows through enterprise SaaS does not show up as a DNS resolution of an AI provider domain because the inference happens through the SaaS vendor's backend. Without this layer, the organization typically underestimates AI usage by a factor of two to five, depending on how much enterprise SaaS it runs.

Week 2 also surfaces vendor-embedded AI inside SaaS tools the organization may not have explicitly opted into. Examples include customer support platforms that summarize tickets with LLMs, sales tools that generate email drafts, and project management tools that produce status summaries. The audit logs name the feature, the user, and the activity volume.

Week 3: Employee survey and use case inventory

Week 3 adds the data the technical signals miss. An anonymous employee survey asks which AI tools each respondent uses for work, which tasks they use AI for, and what data they typically include in prompts. The survey design matters. Asking "do you ever use unauthorized AI tools" produces under-reporting. Asking "for which work tasks have you found AI tools helpful in the last 90 days" produces honest answers about activity that the respondents would never frame as a policy violation.

The survey output is cross-checked against the technical baseline, and gaps surface. Suppose 30% of survey respondents report using ChatGPT for work, but the DNS log shows only 12% of corporate devices resolving chatgpt.com. The gap is the personal-device-and-mobile-data path. The organization now knows the approximate size of the unmonitored surface.
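Using the 30% and 12% figures from the paragraph above, the gap can be turned into a rough headcount estimate. This is a sketch under two stated assumptions: that the share of devices approximates the share of employees, and a hypothetical headcount of 1,000.

```python
def unmonitored_gap(survey_share, dns_share, headcount):
    """Estimate the usage that is invisible to corporate DNS.

    Assumes survey respondents and corporate devices are comparable
    populations, which is a simplification.
    """
    gap = max(survey_share - dns_share, 0.0)
    return {
        "gap_points": round(gap * 100, 1),          # percentage points
        "estimated_employees": round(gap * headcount),
    }


estimate = unmonitored_gap(survey_share=0.30, dns_share=0.12, headcount=1000)
```

Under these assumptions the gap is 18 percentage points, or about 180 of 1,000 employees using the tool over a path the DNS logs cannot see.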

The use case inventory is the more important survey output. The list of work tasks employees are using AI for becomes the input to the sanctioned-tool decision: which use cases the organization is willing to support with a contracted tool, which should be prohibited because the underlying data class is restricted, and which need additional review.

Week 4: Identity and access mapping

Week 4 maps the identity model for AI access. Every sanctioned tool gets a documented owner, an SSO integration plan, and a data-class permission list. The deliverable is a register that names each tool, the contract status (enterprise tier, no contract, individual subscriptions), the data classes the tool may handle, and the authentication path users must follow.

This week is where the friction surfaces. Some tools the survey shows employees rely on do not offer enterprise tiers. Others offer enterprise tiers at price points the organization does not currently fund. Others have BAA-eligible tiers for HIPAA-covered entities but require contract renegotiation. The register tracks the decisions explicitly: approve, contract-needed, prohibit, deferred.
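One way to represent a register entry and its decision state is sketched below. The class and field names are illustrative assumptions; only the four decision values and the listed register contents come from the text above.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    APPROVE = "approve"
    CONTRACT_NEEDED = "contract-needed"
    PROHIBIT = "prohibit"
    DEFERRED = "deferred"


@dataclass
class RegisterEntry:
    tool: str
    owner: str
    contract_status: str  # e.g. "enterprise tier", "no contract"
    auth_path: str        # e.g. "SSO"
    decision: Decision
    allowed_data_classes: set = field(default_factory=set)


def may_handle(entry, data_class):
    """A tool may handle a data class only if approved and explicitly listed."""
    return (
        entry.decision is Decision.APPROVE
        and data_class in entry.allowed_data_classes
    )
```

Keeping the decision as an explicit enumerated field means a tool with no decision recorded cannot silently pass as approved.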

The identity mapping also addresses the shared-credential anti-pattern. Engineering teams that have been using a single OpenAI API key for production workloads get a per-developer key plan. Marketing teams that have been sharing a ChatGPT Team account get individual SSO-bound accounts. The audit records produced in later weeks become attributable to a named person because the credential model supports it.

Week 5: Inline traffic inspection

Week 5 deploys inline inspection in front of the highest-volume sanctioned AI endpoints, typically OpenAI and one internal model. The deployment runs in logging-only mode for the full week. The inspection layer records every prompt, every response, and every policy decision without enforcing.

The logging-only week produces three calibration outputs. The false-positive rate per detection rule (how often the rule fires on legitimate traffic) tells the team which rules need tuning. The data class distribution (what data is actually appearing in prompts) tells the team which prohibited classes are most active. The user activity distribution (which teams generate the most prompts, which generate the riskiest prompts) tells the team where to focus enforcement first.
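The three calibration outputs can be computed from the logging-only records with simple counting. The sketch below assumes each record carries a reviewer label marking whether the flagged traffic was legitimate; that label, the field names, and the function name calibrate are assumptions for illustration.

```python
from collections import Counter


def calibrate(records):
    """Summarize logging-only records.

    Each record is assumed to be a dict with keys:
    'rule', 'data_class', 'team', and 'legitimate' (a reviewer's label).
    """
    fired = Counter(r["rule"] for r in records)
    false_positives = Counter(r["rule"] for r in records if r["legitimate"])
    # Share of each rule's firings that landed on legitimate traffic.
    fp_rate = {rule: false_positives[rule] / fired[rule] for rule in fired}
    data_classes = Counter(r["data_class"] for r in records)
    teams = Counter(r["team"] for r in records)
    return fp_rate, data_classes, teams
```

Rules with a high rate are candidates for tuning before Week 6, and the two distributions indicate where enforcement would have the most effect.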

The week is bounded so the calibration data does not turn into a months-long observation project. At the end of the week, the team has the data to switch enforcement on with a defensible expectation of impact.

Week 6: Enforcement mode and governance handoff

Week 6 switches inline inspection from logging-only to enforcement for the calibrated rule set. Blocks fire. Audit records commit. The user experience changes for the first time. The week also runs the governance handoff: the discovery program transitions from a security-led project to a permanent governance process owned by the AI governance committee (typically CISO, CCO or General Counsel, and a designated business sponsor).

The governance committee inherits the inventory register, the policy document, the audit log access, and the review cadence. The committee meets monthly to approve new tools, retire approved tools that are no longer in use, and review enforcement metrics. The discovery framework completes. The governance program continues.

This six-week sequence works because each week builds on the previous week's output without creating dependencies on tools or controls the organization has not yet deployed. The result at week six is an inventoried, monitored, enforced shadow AI surface that produces audit-ready evidence.

DeepInspect

This is the architecture DeepInspect was built to provide. The inline inspection deployed in weeks 5 and 6 of the framework is the layer that produces enforcement and evidence. DeepInspect sits at the AI request boundary as a stateless proxy in front of any HTTP-based LLM endpoint. Every request is evaluated against organizational policy. Every decision produces a per-decision audit record containing identity, policy version, data class, outcome, and a tamper-evident signature.
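To illustrate what a tamper-evident signature on an audit record can mean in general, the sketch below signs a record with an HMAC over its canonical JSON form. This is a generic technique shown under assumptions, not a description of DeepInspect's actual record format or signing scheme; the function names and sample values are hypothetical.

```python
import hashlib
import hmac
import json


def sign_record(record, key):
    """Return a copy of the record with an HMAC-SHA256 signature attached."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode()
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {**record, "signature": signature}


def verify_record(signed, key):
    """Recompute the signature and compare in constant time."""
    body = {k: v for k, v in signed.items() if k != "signature"}
    expected = sign_record(body, key)["signature"]
    return hmac.compare_digest(expected, signed.get("signature", ""))


key = b"demo-key-not-for-production"  # placeholder key (assumption)
signed = sign_record(
    {"identity": "user@example.com", "policy_version": "policy-v3",
     "data_class": "internal", "outcome": "allow"},
    key,
)
```

Any later change to a field in the record makes verification fail, which is the property an auditor relies on.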

The discovery framework can complete with any inline inspection tool. The post-discovery operational steady state is where the proxy's audit record quality, performance characteristics, and policy expressiveness become the metrics that determine whether the program scales to additional AI endpoints and additional regulated data classes.

For organizations starting the discovery framework against the August 2 EU AI Act deadline or any of the sector-specific 2026 mandates, the enforcement layer is the architectural choice that determines whether the inventory becomes a regulatory defense or a documentation exercise. Book a demo today.