Audience: Engineering teams building governed AI agents with the OpenAI Agents SDK, platform architects implementing agentic scaffolding for enterprise deployments, compliance officers mapping agent governance to SOC 2 and EU AI Act requirements, and CISOs evaluating independent evidence layers for GPT-based systems.

1. What OpenAI's Governance Guide Requires

In February 2026, OpenAI published "Building Governed AI Agents: A Practical Guide to Agentic Scaffolding." The guide argues that most AI agent deployments fail not because the LLM underperforms, but because the scaffolding around it lacks governance. Its core message: governance must be infrastructure, not a launch-time afterthought.

The guide defines a five-layer governance stack:

Additionally, OpenAI's Trusted Access for Cyber program (May 2026) extends GPT-5.5-Cyber to government, critical infrastructure, and financial institutions, requiring phishing-resistant authentication and advanced account security for all participants.

2. The Independent Witness Gap

OpenAI's cookbook provides excellent scaffolding for building governed agents. However, the governance evidence it produces (traces, guardrail logs, policy evaluations) lives inside the application layer that runs on OpenAI's API. If a prompt injection causes an agent to escape its application sandbox, it can modify the local logging files or compromise the inline guardrail wrapper.

The structural problem: When the governance layer runs in the same runtime as the AI model, the system can be manipulated to alter its own audit trail. Auditors and assessors working against the EU AI Act, the NIST AI RMF, and SOC 2 generally expect evidence from a system that is architecturally separated from the model runtime. This is not a limitation of OpenAI's scaffolding. It is an inherent property of any inline governance architecture.

SWT3 complements OpenAI's scaffolding by providing an independent, out-of-band witness layer. It records what the scaffolding did, whether it matched declared policy, and produces tamper-evident evidence stored outside the application runtime.

3. Requirement-to-Procedure Mapping

OpenAI Requirement | SWT3 Procedure | What SWT3 Witnesses | Coverage
Non-human agent identity | AI-ID.1 | Agent ID, HMAC-SHA256 signing, per-agent isolation | Full
Observability and tracing | AI-INF.1, AI-AUDIT.1 | Inference fingerprints, audit trail integrity | Full
Zero data retention | Clearing levels 0-3 | Progressive content stripping, hash-only transmission | Full
Input guardrails (pre-flight) | AI-GRD.1, AI-GRD.2 | Guardrail presence, threat detection results | Partial
Output guardrails (post-flight) | AI-GRD.1, AI-GRD.3 | Output classification, policy version at evaluation time | Partial
Tool authorization | AI-TOOL.1 | Tool name, input/output hashes, allowlist compliance | Full
Multi-agent handoffs | AI-MULTI.1, AI-CHAIN.1 | Agent delegation, chain trust, handoff boundaries | Full
PII detection/redaction | Clearing engine (CL2+) | Content cleared before transmission, PII never in anchor | Full
Centralized policy config | .swt3.yaml | Policy version hash in every anchor | Full
Trace export to internal systems | OTel exporter | OpenTelemetry spans routed to Datadog, Splunk, ELK | Full
Scope-limited agent roles | AI-ACC.1 | Authorization decisions, resource access records | Full
Prompt injection detection | AI-GRD.1, AI-SEC.1 | Guardrail status, security scan scores | Partial

Coverage key: Full = SWT3 directly witnesses the control and produces auditable evidence. Partial = SWT3 witnesses whether the control was active; the primary defense (prompt filtering, PII scanning) requires OpenAI's guardrail library or equivalent tooling.

4. Non-Human Agent Identity

AI-ID.1

Agent Identity Binding

OpenAI requires: Every AI agent should have a distinct non-human identity in your identity management system, enabling attribution: when something goes wrong, you can trace exactly which agent acted, under what permissions, initiated by which user, at what time.

How SWT3 addresses it: The agent_id field is included in every witness anchor, binding each recorded action to a specific agent instance. The HMAC-SHA256 payload_signature provides cryptographic proof of which SDK instance minted the anchor. Unlike application-layer identity (which can be spoofed if the runtime is compromised), SWT3's signing key is held outside the model context and the signature is verifiable independently.

What to show the examiner: Query the witness ledger filtered by agent_id to see every action taken by a specific agent. Verify any anchor's payload_signature using the agent's signing key to confirm provenance.
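The verification step can be illustrated with Python's standard hmac module. This is a minimal sketch, not SWT3's actual signing scheme: the canonical JSON serialization, the anchor fields other than agent_id and payload_signature, and the helper names are all assumptions made for illustration.

```python
import hashlib
import hmac
import json


def canonical_bytes(anchor: dict) -> bytes:
    """Serialize the anchor deterministically, excluding the signature field.

    Sorted-key compact JSON is an assumed canonical form, not SWT3's.
    """
    unsigned = {k: v for k, v in anchor.items() if k != "payload_signature"}
    return json.dumps(unsigned, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_anchor(anchor: dict, key: bytes) -> str:
    """Return the hex HMAC-SHA256 of the anchor's canonical form."""
    return hmac.new(key, canonical_bytes(anchor), hashlib.sha256).hexdigest()


def verify_anchor(anchor: dict, key: bytes) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign_anchor(anchor, key)
    return hmac.compare_digest(expected, anchor.get("payload_signature", ""))


# Illustrative key only; a real signing key would come from a secret store.
key = b"demo-signing-key"
anchor = {
    "agent_id": "deal-screening-agent",
    "procedure": "AI-ID.1",
    "timestamp": "2026-05-01T12:00:00Z",
}
anchor["payload_signature"] = sign_anchor(anchor, key)

genuine_ok = verify_anchor(anchor, key)

# Changing any signed field invalidates the signature.
tampered = dict(anchor, agent_id="some-other-agent")
tampered_ok = verify_anchor(tampered, key)
```

The point of the sketch is that verification needs only the anchor and the key, so it can run on a machine that never hosted the agent.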

5. Observability and Tracing

AI-INF.1 + AI-AUDIT.1

Inference Witnessing and Audit Trail

OpenAI requires: Capture all LLM calls, tool executions, and handoffs in structured format using trace context managers.

How SWT3 addresses it: Every witness() call produces a deterministic fingerprint that captures the model, prompt hash, response hash, and timestamp. The fingerprint formula is locked and cross-language verified across 5 SDKs. Unlike OpenAI's tracing (which lives in the application runtime), SWT3 anchors are transmitted to an independent ledger the moment they are created. The cycle_id field groups related inferences into named traces, comparable to OpenAI's trace() context manager but with independent, tamper-evident storage.

What to show the examiner: Compare OpenAI trace IDs with SWT3 cycle_id values. Both should cover the same execution paths, with SWT3 providing the independent corroboration that the OpenAI traces were not modified after the fact.
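To make "deterministic fingerprint" concrete, the sketch below hashes the four inputs named above in a fixed order. The actual SWT3 fingerprint formula is not given in this document, so the field order, the newline separator, and the function names here are assumptions for illustration only.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Hex SHA-256 digest of a UTF-8 string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def fingerprint(model: str, prompt_hash: str, response_hash: str, timestamp: str) -> str:
    """Hypothetical fingerprint: hash the inputs in a fixed order.

    A newline separator keeps field boundaries unambiguous as long as
    no field contains a newline (true for hex digests and ISO timestamps).
    """
    material = "\n".join([model, prompt_hash, response_hash, timestamp])
    return sha256_hex(material)


prompt_hash = sha256_hex("Summarize the Q3 pipeline.")
response_hash = sha256_hex("Q3 pipeline summary ...")
ts = "2026-05-01T12:00:00Z"

fp_a = fingerprint("gpt-4.1", prompt_hash, response_hash, ts)
fp_b = fingerprint("gpt-4.1", prompt_hash, response_hash, ts)  # same inputs
fp_c = fingerprint("gpt-4.1", prompt_hash, sha256_hex("altered response"), ts)
```

Identical inputs always yield the same digest, and any change to the response produces a different one. That property is what lets a verifier recompute a fingerprint independently, in any language.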

6. Zero Data Retention Compliance

Clearing Engine (Levels 0-3)

Progressive Content Clearing

OpenAI requires: Support for organizations prohibiting OpenAI data storage. Disable tracing entirely or export spans to internal systems only. Redact PII before internal storage.

How SWT3 addresses it: The clearing engine provides four levels of content stripping. At Level 0 (Analytics), full content is retained for internal analysis. At Level 1 (Standard, default), raw prompts and responses are replaced with irreversible SHA-256 hashes. At Level 2 (Sensitive), model identifiers are generalized and additional metadata is stripped. At Level 3 (Classified), even the model name is hashed. At all levels, the witness fingerprint remains verifiable. The jurisdiction and legal_basis fields ensure that data handling respects GDPR, CCPA, and sector-specific retention requirements.

What to show the examiner: Present anchors at the organization's configured clearing level. For Level 1 and above, demonstrate that no raw prompt or response content exists in the witness ledger. The fingerprint proves the inference happened; the clearing level proves the content was handled according to policy.
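The four clearing levels can be sketched as a single function that strips progressively more content. This is an illustrative model of the levels described above, not the clearing engine itself: the record field names, the "keep the text before the first hyphen" generalization rule, and the metadata key are assumptions.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Hex SHA-256 digest of a UTF-8 string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def clear_record(record: dict, level: int) -> dict:
    """Return a copy of the record cleared to the given level (0-3)."""
    if level not in (0, 1, 2, 3):
        raise ValueError(f"unknown clearing level: {level}")
    cleared = dict(record)
    cleared["clearing_level"] = level
    if level >= 1:
        # Level 1: replace raw prompt and response with SHA-256 hashes.
        cleared["prompt_hash"] = sha256_hex(cleared.pop("prompt"))
        cleared["response_hash"] = sha256_hex(cleared.pop("response"))
    if level >= 2:
        # Level 2: generalize the model identifier (assumed rule) and
        # drop additional metadata.
        cleared["model"] = record["model"].split("-")[0]
        cleared.pop("metadata", None)
    if level >= 3:
        # Level 3: even the model name is hashed.
        cleared["model"] = sha256_hex(record["model"])
    return cleared


record = {
    "model": "gpt-4.1",
    "prompt": "Customer SSN is on file; summarize the account.",
    "response": "Account summary ...",
    "metadata": {"user_agent": "internal-tool"},
}

level0 = clear_record(record, 0)
level1 = clear_record(record, 1)
level2 = clear_record(record, 2)
level3 = clear_record(record, 3)
```

Because each level builds on the previous one, a record cleared at Level 2 or 3 also carries only hashes of the prompt and response.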

7. Input Guardrails (Pre-Flight)

AI-GRD.1 + AI-GRD.2

Guardrail Presence and Threat Detection

OpenAI requires: Block invalid queries before LLM processing using InputGuardrail functions. Built-in guardrails include: Moderation, Jailbreak, Prompt Injection Detection, PII, Secret Keys, URL Filter, and Off Topic Prompts.

How SWT3 addresses it: SWT3 does not perform prompt filtering. OpenAI's guardrail library handles the detection and blocking. What SWT3 does is witness whether the guardrails were active. AI-GRD.1 records which guardrails were configured at inference time. AI-GRD.2 records the classification result of input evaluation. If a guardrail trips and blocks an inference, the witness anchor records the blocked attempt, the guardrail that triggered, and the threat classification. This creates independent evidence that the governance layer was operational.

What to show the examiner: Query AI-GRD.1 anchors across a time range. Every inference should have a corresponding guardrail attestation. Gaps indicate periods where the governance layer was not active.
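The gap check can be sketched as a set difference over exported anchors. The anchor shape is an assumption: in particular, inference_id is a hypothetical correlation field used here to pair an AI-INF.1 anchor with its AI-GRD.1 attestation, and the real ledger may link them differently.

```python
def find_unattested_inferences(anchors: list[dict]) -> list[str]:
    """Return inference IDs that have no matching AI-GRD.1 attestation."""
    inferences = {a["inference_id"] for a in anchors if a["procedure"] == "AI-INF.1"}
    attested = {a["inference_id"] for a in anchors if a["procedure"] == "AI-GRD.1"}
    return sorted(inferences - attested)


# Sample export: inference "inf-3" ran without a guardrail attestation.
anchors = [
    {"procedure": "AI-INF.1", "inference_id": "inf-1"},
    {"procedure": "AI-GRD.1", "inference_id": "inf-1"},
    {"procedure": "AI-INF.1", "inference_id": "inf-2"},
    {"procedure": "AI-GRD.1", "inference_id": "inf-2"},
    {"procedure": "AI-INF.1", "inference_id": "inf-3"},
]

gaps = find_unattested_inferences(anchors)
```

An empty result over the audit period is the evidence an examiner is looking for; any returned ID marks an inference to investigate.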

8. Output Guardrails (Post-Flight)

AI-GRD.1 + AI-GRD.3

Output Classification and Policy Binding

OpenAI requires: Redact or block sensitive data in LLM responses. Detect credit cards, SSN, email, IP addresses, medical licenses in outputs.

How SWT3 addresses it: AI-GRD.3 records the policy version that was active when the output was evaluated, creating a cryptographic link between the output guardrail configuration and the specific inference. The clearing engine provides an additional layer: even if an output guardrail misses a PII element, the clearing engine strips raw response content from the witness anchor before it leaves the deployment infrastructure. At Level 1 and above, the response is represented only as a SHA-256 hash in the witness record.

What to show the examiner: Present the policy_version hash from witness anchors and correlate it with the guardrail configuration in version control. This proves which output rules were in effect for each inference.

9. Tool Authorization

AI-TOOL.1

Tool Call Witnessing

OpenAI requires: Agent access to tools must be explicitly granted. Each agent has a distinct tool set. Functions decorated with @function_tool.

How SWT3 addresses it: wrapTool() (wrap_tool() in the Python SDK) intercepts every tool call and records the tool name, input hash, output hash, execution latency, and result status. The .swt3.yaml policy declares which tools each agent is authorized to use via mcp_policy.tool_allowlist. Any tool call that falls outside the allowlist is still witnessed but flagged as a policy violation. This creates independent evidence of whether the agent respected its declared tool boundaries, regardless of what the application-layer scaffolding permitted.

What to show the examiner: Export AI-TOOL.1 anchors and compare the tool names against the declared .swt3.yaml allowlist. Any discrepancy between what the scaffolding permitted and what the policy declared is immediately visible.
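The comparison between witnessed tool calls and the declared allowlist can be sketched as follows. The anchor fields and the per-agent allowlist shape (a dict of agent ID to allowed tool names) are assumptions for illustration; the actual .swt3.yaml schema may differ.

```python
def flag_tool_violations(anchors: list[dict], allowlist: dict[str, set[str]]) -> list[dict]:
    """Return AI-TOOL.1 anchors whose tool is not allowed for that agent."""
    violations = []
    for anchor in anchors:
        if anchor["procedure"] != "AI-TOOL.1":
            continue
        # An agent with no declared allowlist is treated as having no tools.
        allowed = allowlist.get(anchor["agent_id"], set())
        if anchor["tool_name"] not in allowed:
            violations.append(anchor)
    return violations


# Hypothetical allowlist, as it might be loaded from mcp_policy.tool_allowlist.
allowlist = {"deal-screening-agent": {"search_deals", "fetch_filing"}}

anchors = [
    {"procedure": "AI-TOOL.1", "agent_id": "deal-screening-agent", "tool_name": "search_deals"},
    {"procedure": "AI-TOOL.1", "agent_id": "deal-screening-agent", "tool_name": "send_email"},
    {"procedure": "AI-INF.1", "agent_id": "deal-screening-agent"},
]

violations = flag_tool_violations(anchors, allowlist)
```

Treating an undeclared agent as having an empty allowlist is a deliberate fail-closed choice in this sketch.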

10. Multi-Agent Handoffs

AI-MULTI.1 + AI-CHAIN.1

Agent Delegation and Chain Witnessing

OpenAI requires: Route queries to specialist agents based on domain expertise. Each specialist has scoped instructions limiting operational scope. Handoffs managed via handoff_description parameters.

How SWT3 addresses it: AI-MULTI.1 records each agent-to-agent delegation with source agent, destination agent, task description, and trust level. AI-CHAIN.1 records the chain of custody across the full reasoning path. The cycle_id links all steps in a multi-agent workflow into a single auditable trace. The Trust Mesh protocol verifies agent credentials at each handoff boundary, ensuring that a specialist agent cannot receive delegated work without presenting valid identity.

What to show the examiner: Reconstruct a multi-agent workflow by querying all anchors with a shared cycle_id. Each handoff boundary includes the trust level and credential verification result from both sides.
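Reconstructing a workflow from a shared cycle_id amounts to a group-and-sort over exported anchors. The sketch below assumes each anchor carries cycle_id, procedure, and an ISO 8601 UTC timestamp in one consistent format (so that string order matches time order). The source_agent and dest_agent fields mirror the Quick Start parameters, but the exported record shape is an assumption.

```python
from collections import defaultdict


def group_by_cycle(anchors: list[dict]) -> dict[str, list[dict]]:
    """Group anchors by cycle_id and order each group by timestamp."""
    traces = defaultdict(list)
    for anchor in anchors:
        traces[anchor["cycle_id"]].append(anchor)
    for steps in traces.values():
        steps.sort(key=lambda a: a["timestamp"])
    return dict(traces)


def handoff_path(steps: list[dict]) -> list[tuple[str, str]]:
    """Extract (source, destination) pairs from AI-MULTI.1 anchors, in order."""
    return [
        (s["source_agent"], s["dest_agent"])
        for s in steps
        if s["procedure"] == "AI-MULTI.1"
    ]


anchors = [
    {"cycle_id": "c-42", "procedure": "AI-MULTI.1", "timestamp": "2026-05-01T12:00:05Z",
     "source_agent": "deal-screening-agent", "dest_agent": "risk-agent"},
    {"cycle_id": "c-42", "procedure": "AI-MULTI.1", "timestamp": "2026-05-01T12:00:01Z",
     "source_agent": "triage-agent", "dest_agent": "deal-screening-agent"},
    {"cycle_id": "c-42", "procedure": "AI-INF.1", "timestamp": "2026-05-01T12:00:03Z"},
    {"cycle_id": "c-99", "procedure": "AI-INF.1", "timestamp": "2026-05-01T12:01:00Z"},
]

traces = group_by_cycle(anchors)
path = handoff_path(traces["c-42"])
```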

11. Centralized Policy Configuration

.swt3.yaml Trust Mesh

Policy-as-Code with Version Binding

OpenAI requires: Define governance once, apply organization-wide. Package as pip-installable module via Git repository. Track policy changes with audit trail.

How SWT3 addresses it: The .swt3.yaml configuration file declares all governance policies in a single, version-controlled file: clearing level, required procedures, tool allowlists/blocklists, trust mesh settings, signing requirements, and industry profile. The swt3 doctor CLI validates the configuration against the schema before deployment. Every witness anchor includes a policy_version hash, creating a cryptographic link between each inference and the exact policy configuration that was active. The swt3 init command generates a starter configuration from 14 industry-specific templates.

What to show the examiner: Present the .swt3.yaml file from version control alongside the policy_version hashes in the witness ledger. The hash chain proves that policy was consistently applied and any configuration change is traceable.
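The correlation between version control and the ledger can be sketched as a hash lookup. This document does not specify how policy_version is derived, so the sketch assumes it is the SHA-256 of the raw .swt3.yaml bytes. The config contents shown are placeholders rather than real schema.

```python
import hashlib


def policy_version(config_bytes: bytes) -> str:
    """Assumed derivation: hex SHA-256 over the raw config file bytes."""
    return hashlib.sha256(config_bytes).hexdigest()


def anchors_with_unknown_policy(anchors: list[dict], known_versions: set[str]) -> list[dict]:
    """Return anchors whose policy_version matches no committed config revision."""
    return [a for a in anchors if a["policy_version"] not in known_versions]


# Two revisions of the config as they might be read from version control.
rev_1 = b"clearing_level: 1\n"
rev_2 = b"clearing_level: 2\n"
known = {policy_version(rev_1), policy_version(rev_2)}

anchors = [
    {"fingerprint": "fp-1", "policy_version": policy_version(rev_1)},
    {"fingerprint": "fp-2", "policy_version": policy_version(rev_2)},
    {"fingerprint": "fp-3", "policy_version": policy_version(b"clearing_level: 0\n")},
]

unknown = anchors_with_unknown_policy(anchors, known)
```

Any anchor returned by the check ran under a configuration that was never committed, which is the discrepancy an examiner would want explained.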

12. Trace Export to Internal Systems

OTel Exporter

OpenTelemetry Integration

OpenAI requires: Custom trace processors that export spans to internal monitoring without OpenAI storage. Integration points: Datadog, Splunk, ELK, internal databases.

How SWT3 addresses it: The SWT3 OTel exporter converts witness anchors into OpenTelemetry spans that route directly to any OTel-compatible collector. Each span includes the procedure ID, fingerprint, factors, verdict, and clearing level as span attributes. This means organizations can correlate SWT3 witness evidence with their existing observability stack without storing any data at OpenAI or at TeNova. Combined with the clearing engine, the OTel export respects data residency requirements by stripping content before it enters the telemetry pipeline.

What to show the examiner: Show the OTel collector configuration and a sample span from Datadog/Splunk/ELK containing SWT3 witness attributes. Demonstrate that the evidence flows from the agent to the internal system without traversing any third-party infrastructure.
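The anchor-to-span mapping can be sketched as a function that flattens an anchor into span attributes. The swt3.* attribute names and the anchor shape are hypothetical; the real exporter's naming is not given here. OpenTelemetry attribute values must be primitives or sequences of primitives, so the sketch serializes factors to a JSON string.

```python
import json


def anchor_to_span_attributes(anchor: dict) -> dict:
    """Flatten a witness anchor into OTel-compatible span attributes.

    Only the five fields named above are exported, so no other anchor
    content reaches the telemetry pipeline through this mapping.
    """
    return {
        "swt3.procedure_id": anchor["procedure"],
        "swt3.fingerprint": anchor["fingerprint"],
        "swt3.factors": json.dumps(anchor["factors"], sort_keys=True),
        "swt3.verdict": anchor["verdict"],
        "swt3.clearing_level": anchor["clearing_level"],
    }


anchor = {
    "procedure": "AI-INF.1",
    "fingerprint": "ab12cd34",
    "factors": {"model": "gpt-4.1", "guardrails_active": True},
    "verdict": "pass",
    "clearing_level": 1,
    "internal_note": "never exported",
}

attributes = anchor_to_span_attributes(anchor)
```

With the OpenTelemetry Python API, a dict like this could be passed to a span's set_attributes() method before the span is exported to the collector.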

13. Quick Start

Add SWT3 witnessing to an OpenAI Agents SDK deployment:

# Python
pip install swt3-ai openai-agents

import hashlib

from swt3_ai import SWT3Witness


def sha256(text: str) -> str:
    # Helper (not part of the SDK): hex SHA-256 digest of a UTF-8 string
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


witness = SWT3Witness(
    tenant_id="YOUR_TENANT",
    agent_id="deal-screening-agent",
    signing_key_env="SWT3_SIGNING_KEY"
)

# Witness each inference in your agent loop
# (prompt and response are the strings exchanged with the model)
result = witness.witness(
    model="gpt-4.1",
    prompt_hash=sha256(prompt),
    response_hash=sha256(response),
    procedure="AI-INF.1"
)

# Witness tool calls from @function_tool decorated functions
result = witness.wrap_tool("search_deals", input_data, output_data)

# Witness agent handoffs
result = witness.witness_chain(
    source_agent="triage-agent",
    dest_agent="deal-screening-agent",
    procedure="AI-MULTI.1"
)

# TypeScript
npm install @tenova/swt3-ai openai

import { createHash } from 'node:crypto';
import { SWT3Witness } from '@tenova/swt3-ai';

// Helper (not part of the SDK): hex SHA-256 digest of a UTF-8 string
const sha256 = (text: string): string =>
  createHash('sha256').update(text, 'utf8').digest('hex');

const witness = new SWT3Witness({
  tenantId: 'YOUR_TENANT',
  agentId: 'deal-screening-agent',
  signingKeyEnv: 'SWT3_SIGNING_KEY'
});

// Witness each inference
// (prompt and response are the strings exchanged with the model)
const inferenceResult = await witness.witness({
  model: 'gpt-4.1',
  promptHash: sha256(prompt),
  responseHash: sha256(response),
  procedure: 'AI-INF.1'
});

// Witness tool calls
const toolResult = await witness.wrapTool('search_deals', inputData, outputData);

Positioning note: SWT3 complements OpenAI's scaffolding. OpenAI's guardrail library performs the detection and blocking. SWT3 witnesses that the guardrails were active, records what they detected, and stores the evidence independently. The scaffolding provides the governance. SWT3 provides the proof that the governance was operational. Both are necessary for regulated deployments.

14. Coverage Summary

OpenAI Requirement | SWT3 Procedures | Coverage
Non-human agent identity | AI-ID.1 | Full
Observability and tracing | AI-INF.1, AI-AUDIT.1 | Full
Zero data retention | Clearing levels 0-3 | Full
Input guardrails (pre-flight) | AI-GRD.1, AI-GRD.2 | Partial
Output guardrails (post-flight) | AI-GRD.1, AI-GRD.3 | Partial
Tool authorization | AI-TOOL.1 | Full
Multi-agent handoffs | AI-MULTI.1, AI-CHAIN.1 | Full
PII detection/redaction | Clearing engine (CL2+) | Full
Centralized policy config | .swt3.yaml | Full
Trace export to internal systems | OTel exporter | Full
Scope-limited agent roles | AI-ACC.1 | Full
Prompt injection detection | AI-GRD.1, AI-SEC.1 | Partial

9 of 12 requirements: Full coverage through direct witnessing and evidence production.

3 of 12 requirements: Partial coverage. SWT3 witnesses whether the control was active; the primary defense (prompt filtering, PII scanning, output guardrails) requires OpenAI's guardrail library or equivalent tooling.