Audience: Engineering teams building governed AI agents with the OpenAI Agents SDK, platform architects implementing agentic scaffolding for enterprise deployments, compliance officers mapping agent governance to SOC 2 and EU AI Act requirements, and CISOs evaluating independent evidence layers for GPT-based systems.
1. What OpenAI's Governance Guide Requires
In February 2026, OpenAI published "Building Governed AI Agents: A Practical Guide to Agentic Scaffolding." The guide argues that most AI agent deployments fail not because the LLM underperforms, but because the scaffolding around it lacks governance. Its core message: governance must be infrastructure, not a launch-time afterthought.
The guide defines a five-layer governance stack:
- Pre-Execution: Input guardrails that block invalid queries before LLM processing.
- Execution: Agent role scoping with narrow instructions and explicit tool access per agent.
- Post-Execution: Output guardrails that detect and redact sensitive data in responses.
- Audit: Tracing with custom processors for zero-data-retention compliant observability.
- Distribution: Centralized policy configuration packaged as pip-installable modules.
Additionally, OpenAI's Trusted Access for Cyber program (May 2026) extends GPT-5.5-Cyber to government, critical infrastructure, and financial institutions, requiring phishing-resistant authentication and advanced account security for all participants.
2. The Independent Witness Gap
OpenAI's cookbook provides excellent scaffolding for building governed agents. However, the governance evidence it produces (traces, guardrail logs, policy evaluations) lives inside the same application layer that calls OpenAI's API. If a prompt injection lets an agent escape its application sandbox, the agent can modify local log files or compromise the inline guardrail wrapper, and the evidence can no longer be trusted.
SWT3 complements OpenAI's scaffolding by providing an independent, out-of-band witness layer. It records what the scaffolding did, whether it matched declared policy, and produces tamper-evident evidence stored outside the application runtime.
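To make "tamper-evident" concrete, here is a minimal sketch of a hash-chained ledger. It is an illustration of the general technique only: the source does not describe how SWT3 stores anchors, and the names `GENESIS`, `entry_hash`, `append`, and `verify` are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the chain


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with a canonical form of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(ledger: list, payload: dict) -> None:
    """Append a payload, linking it to the hash of the previous entry."""
    prev = ledger[-1]["hash"] if ledger else GENESIS
    ledger.append({"payload": payload, "prev": prev, "hash": entry_hash(prev, payload)})


def verify(ledger: list) -> bool:
    """Recompute every link; any edited payload or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in ledger:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True


ledger = []
append(ledger, {"procedure": "AI-INF.1", "verdict": "pass"})
append(ledger, {"procedure": "AI-TOOL.1", "verdict": "pass"})
```

Because each entry commits to the one before it, altering an earlier record after the fact is detectable by anyone who re-runs `verify`, provided the ledger is held somewhere the agent runtime cannot rewrite.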
3. Requirement-to-Procedure Mapping
| OpenAI Requirement | SWT3 Procedure | What SWT3 Witnesses | Coverage |
|---|---|---|---|
| Non-human agent identity | AI-ID.1 | Agent ID, HMAC-SHA256 signing, per-agent isolation | Full |
| Observability and tracing | AI-INF.1, AI-AUDIT.1 | Inference fingerprints, audit trail integrity | Full |
| Zero data retention | Clearing levels 0-3 | Progressive content stripping, hash-only transmission | Full |
| Input guardrails (pre-flight) | AI-GRD.1, AI-GRD.2 | Guardrail presence, threat detection results | Partial |
| Output guardrails (post-flight) | AI-GRD.1, AI-GRD.3 | Output classification, policy version at evaluation time | Partial |
| Tool authorization | AI-TOOL.1 | Tool name, input/output hashes, allowlist compliance | Full |
| Multi-agent handoffs | AI-MULTI.1, AI-CHAIN.1 | Agent delegation, chain trust, handoff boundaries | Full |
| PII detection/redaction | Clearing engine (CL2+) | Content cleared before transmission, PII never in anchor | Full |
| Centralized policy config | .swt3.yaml | Policy version hash in every anchor | Full |
| Trace export to internal systems | OTel exporter | OpenTelemetry spans routed to Datadog, Splunk, ELK | Full |
| Scope-limited agent roles | AI-ACC.1 | Authorization decisions, resource access records | Full |
| Prompt injection detection | AI-GRD.1, AI-SEC.1 | Guardrail status, security scan scores | Partial |
Coverage key: Full = SWT3 directly witnesses the control and produces auditable evidence. Partial = SWT3 witnesses whether the control was active; the primary defense (prompt filtering, PII scanning) requires OpenAI's guardrail library or equivalent tooling.
4. Non-Human Agent Identity
Agent Identity Binding
OpenAI requires: Every AI agent should have a distinct non-human identity in your identity management system, enabling attribution: when something goes wrong, you can trace exactly which agent acted, under what permissions, initiated by which user, at what time.
How SWT3 addresses it: The agent_id field is included in every witness anchor, binding each recorded action to a specific agent instance. The HMAC-SHA256 payload_signature provides cryptographic proof of which SDK instance minted the anchor. Unlike application-layer identity (which can be spoofed if the runtime is compromised), SWT3's signing key is held outside the model context and the signature is verifiable independently.
Filter witness anchors by agent_id to see every action taken by a specific agent. Verify any anchor's payload_signature using the agent's signing key to confirm provenance.
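The signature check described above can be sketched with Python's standard `hmac` module. This is a sketch under assumptions, not the SDK's implementation: the canonical JSON serialization and the helper names `canonical_bytes`, `sign_anchor`, and `verify_anchor` are hypothetical, and only the `agent_id` and `payload_signature` field names come from the text.

```python
import hashlib
import hmac
import json


def canonical_bytes(anchor: dict) -> bytes:
    """Serialize the anchor deterministically, excluding the signature field itself."""
    unsigned = {k: v for k, v in anchor.items() if k != "payload_signature"}
    return json.dumps(unsigned, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_anchor(anchor: dict, key: bytes) -> dict:
    """Return a copy of the anchor with an HMAC-SHA256 payload_signature attached."""
    signature = hmac.new(key, canonical_bytes(anchor), hashlib.sha256).hexdigest()
    return {**anchor, "payload_signature": signature}


def verify_anchor(anchor: dict, key: bytes) -> bool:
    """Recompute the HMAC and compare it to the stored signature in constant time."""
    expected = hmac.new(key, canonical_bytes(anchor), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, anchor.get("payload_signature", ""))


key = b"demo-key-not-for-production"  # in practice, loaded from outside the model context
anchor = sign_anchor({"agent_id": "deal-screening-agent", "procedure": "AI-ID.1"}, key)
tampered = {**anchor, "agent_id": "some-other-agent"}
```

Changing any signed field, including `agent_id`, invalidates the signature, which is what binds an action to the agent instance that holds the key.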
5. Observability and Tracing
Inference Witnessing and Audit Trail
OpenAI requires: Capture all LLM calls, tool executions, and handoffs in structured format using trace context managers.
How SWT3 addresses it: Every witness() call produces a deterministic fingerprint that captures the model, prompt hash, response hash, and timestamp. The fingerprint formula is locked and cross-language verified across 5 SDKs. Unlike OpenAI's tracing (which lives in the application runtime), SWT3 anchors are transmitted to an independent ledger the moment they are created. The cycle_id field groups related inferences into named traces, comparable to OpenAI's trace() context manager but with independent, tamper-evident storage.
Compare OpenAI trace exports against SWT3 anchors grouped by their cycle_id values. Both should cover the same execution paths, with SWT3 providing the independent corroboration that the OpenAI traces were not modified after the fact.
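A deterministic fingerprint of this kind can be illustrated as a hash over the four named inputs. The actual locked formula is not given in the source, so the newline-joined layout below is purely an assumption, and `sha256_hex` and `fingerprint` are hypothetical helpers.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Hex SHA-256 digest of a UTF-8 string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def fingerprint(model: str, prompt_hash: str, response_hash: str, timestamp: str) -> str:
    """Assumed layout: SHA-256 over the four fields joined by newlines, in a fixed order."""
    material = "\n".join([model, prompt_hash, response_hash, timestamp])
    return sha256_hex(material)


ts = "2026-01-01T00:00:00Z"
fp_a = fingerprint("gpt-4.1", sha256_hex("hello"), sha256_hex("world"), ts)
fp_b = fingerprint("gpt-4.1", sha256_hex("hello"), sha256_hex("world"), ts)
fp_c = fingerprint("gpt-4.1", sha256_hex("hello"), sha256_hex("altered"), ts)
```

The same inputs always yield the same fingerprint, and any change to the response hash yields a different one. That property is what allows independent SDKs to cross-verify each other's anchors.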
6. Zero Data Retention Compliance
Progressive Content Clearing
OpenAI requires: Support for organizations prohibiting OpenAI data storage. Disable tracing entirely or export spans to internal systems only. Redact PII before internal storage.
How SWT3 addresses it: The clearing engine provides four levels of content stripping. At Level 0 (Analytics), full content is retained for internal analysis. At Level 1 (Standard, default), raw prompts and responses are replaced with irreversible SHA-256 hashes. At Level 2 (Sensitive), model identifiers are generalized and additional metadata is stripped. At Level 3 (Classified), even the model name is hashed. At all levels, the witness fingerprint remains verifiable. The jurisdiction and legal_basis fields ensure that data handling respects GDPR, CCPA, and sector-specific retention requirements.
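The four clearing levels can be sketched as a single function that strips progressively more content. This is an illustrative model of the behavior described above, not the clearing engine itself: the record layout, the `metadata` key, and the rule for generalizing a model identifier (keeping the text before the first hyphen) are all assumptions.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Hex SHA-256 digest of a UTF-8 string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def clear_record(record: dict, level: int) -> dict:
    """Return a copy of the record cleared to the requested level (0 to 3)."""
    if level not in (0, 1, 2, 3):
        raise ValueError(f"unknown clearing level: {level}")
    cleared = dict(record)
    if level >= 1:
        # Level 1 (Standard): raw prompt and response replaced by hashes.
        cleared["prompt"] = sha256_hex(record["prompt"])
        cleared["response"] = sha256_hex(record["response"])
    if level >= 2:
        # Level 2 (Sensitive): generalize the model identifier, drop extra metadata.
        cleared["model"] = record["model"].split("-")[0]  # assumed generalization rule
        cleared.pop("metadata", None)
    if level >= 3:
        # Level 3 (Classified): even the model name is hashed.
        cleared["model"] = sha256_hex(record["model"])
    return cleared


record = {
    "model": "gpt-4.1",
    "prompt": "Summarize the deal memo",
    "response": "The memo proposes...",
    "metadata": {"user": "analyst-7"},
}
```

Calling `clear_record(record, 1)` keeps the model name but removes the raw text, while `clear_record(record, 3)` leaves only hashes.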
7. Input Guardrails (Pre-Flight)
Guardrail Presence and Threat Detection
OpenAI requires: Block invalid queries before LLM processing using InputGuardrail functions. Built-in guardrails include: Moderation, Jailbreak, Prompt Injection Detection, PII, Secret Keys, URL Filter, and Off Topic Prompts.
How SWT3 addresses it: SWT3 does not perform prompt filtering. OpenAI's guardrail library handles the detection and blocking. What SWT3 does is witness whether the guardrails were active. AI-GRD.1 records which guardrails were configured at inference time. AI-GRD.2 records the classification result of input evaluation. If a guardrail trips and blocks an inference, the witness anchor records the blocked attempt, the guardrail that triggered, and the threat classification. This creates independent evidence that the governance layer was operational.
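The division of labor (the guardrail library decides, the witness records) might look like the sketch below. The `GuardrailWitness` record and `witness_input_check` function are hypothetical, as is the "threat"/"clean" classification vocabulary. The guardrail names are taken from the list above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GuardrailWitness:
    configured_guardrails: tuple   # what was active at inference time (AI-GRD.1)
    classification: str            # result of input evaluation (AI-GRD.2)
    blocked: bool                  # whether the inference was stopped
    triggered_by: Optional[str] = None


def witness_input_check(configured: list, results: dict) -> GuardrailWitness:
    """Record the outcome of guardrails that already ran; no filtering happens here.

    `results` maps a guardrail name to True if that guardrail tripped.
    """
    tripped = [name for name in configured if results.get(name, False)]
    return GuardrailWitness(
        configured_guardrails=tuple(configured),
        classification="threat" if tripped else "clean",
        blocked=bool(tripped),
        triggered_by=tripped[0] if tripped else None,
    )


configured = ["Moderation", "Jailbreak", "Prompt Injection Detection", "PII"]
clean = witness_input_check(configured, {})
blocked = witness_input_check(configured, {"Jailbreak": True})
```

Even a clean pass produces a record, so an auditor can later distinguish "guardrails ran and found nothing" from "guardrails were never configured".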
8. Output Guardrails (Post-Flight)
Output Classification and Policy Binding
OpenAI requires: Redact or block sensitive data in LLM responses. Detect credit cards, SSN, email, IP addresses, medical licenses in outputs.
How SWT3 addresses it: AI-GRD.3 records the policy version that was active when the output was evaluated, creating a cryptographic link between the output guardrail configuration and the specific inference. The clearing engine provides an additional layer: even if an output guardrail misses a PII element, the clearing engine strips raw response content from the witness anchor before it leaves the deployment infrastructure. At Level 1 and above (see Section 6), the response is represented only as a SHA-256 hash in the witness record.
Extract the policy_version hash from witness anchors and correlate it with the guardrail configuration in version control. This proves which output rules were in effect for each inference.
9. Tool Authorization
Tool Call Witnessing
OpenAI requires: Agent access to tools must be explicitly granted. Each agent has a distinct tool set. Functions decorated with @function_tool.
How SWT3 addresses it: wrapTool() intercepts every tool call and records the tool name, input hash, output hash, execution latency, and result status. The .swt3.yaml policy declares which tools each agent is authorized to use via mcp_policy.tool_allowlist. Any tool call that falls outside the allowlist is still witnessed but flagged as a policy violation. This creates independent evidence of whether the agent respected its declared tool boundaries, regardless of what the application-layer scaffolding permitted.
Compare the witnessed tool calls against the .swt3.yaml allowlist. Any discrepancy between what the scaffolding permitted and what the policy declared is immediately visible.
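The witness-but-flag behavior can be sketched as a plain Python decorator. This is not the SDK's `wrapTool()`; the `witnessed_tool` decorator, the in-memory `LEDGER`, the `ALLOWLIST` mapping, and the `delete_deal` tool are hypothetical stand-ins, while the recorded fields mirror those listed above.

```python
import functools
import hashlib
import json
import time

ALLOWLIST = {"deal-screening-agent": {"search_deals"}}  # stand-in for the declared policy
LEDGER = []  # stand-in for the independent witness ledger


def _hash(value) -> str:
    """Hash a JSON-serializable value so raw inputs and outputs are never stored."""
    encoded = json.dumps(value, sort_keys=True, default=str).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def witnessed_tool(agent_id: str):
    """Record every call to the wrapped tool; flag calls outside the allowlist."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            started = time.perf_counter()
            status, output = "ok", None
            try:
                output = func(*args, **kwargs)
                return output
            except Exception:
                status = "error"
                raise
            finally:
                LEDGER.append({
                    "agent_id": agent_id,
                    "tool": func.__name__,
                    "input_hash": _hash({"args": args, "kwargs": kwargs}),
                    "output_hash": _hash(output),
                    "latency_ms": (time.perf_counter() - started) * 1000,
                    "status": status,
                    "policy_violation": func.__name__ not in ALLOWLIST.get(agent_id, set()),
                })
        return wrapper
    return decorator


@witnessed_tool("deal-screening-agent")
def search_deals(query: str) -> list:
    return [f"deal matching {query}"]


@witnessed_tool("deal-screening-agent")
def delete_deal(deal_id: str) -> bool:
    return True


search_deals("fintech")
delete_deal("d-1")  # executes, but is recorded as a policy violation
```

The wrapper never blocks a call. It only records, which matches the text: enforcement stays with the scaffolding, and the witness shows whether the scaffolding and the declared policy agreed.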
10. Multi-Agent Handoffs
Agent Delegation and Chain Witnessing
OpenAI requires: Route queries to specialist agents based on domain expertise. Each specialist has scoped instructions limiting operational scope. Handoffs managed via handoff_description parameters.
How SWT3 addresses it: AI-MULTI.1 records each agent-to-agent delegation with source agent, destination agent, task description, and trust level. AI-CHAIN.1 records the chain of custody across the full reasoning path. The cycle_id links all steps in a multi-agent workflow into a single auditable trace. The Trust Mesh protocol verifies agent credentials at each handoff boundary, ensuring that a specialist agent cannot receive delegated work without presenting valid identity.
Retrieve the full multi-agent workflow by its cycle_id. Each handoff boundary includes the trust level and credential verification result from both sides.
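Reconstructing a handoff chain from anchors that share a cycle_id could look like the sketch below. The anchor layout, the `seq` ordering field, and the `pricing-agent` name are assumptions for illustration. Only `cycle_id`, the procedure ID, and the source and destination agents come from the text.

```python
def handoff_path(anchors: list, cycle_id: str) -> list:
    """Return the ordered list of agents for one cycle, or raise if the chain has a gap."""
    steps = sorted(
        (a for a in anchors if a["cycle_id"] == cycle_id),
        key=lambda a: a["seq"],
    )
    if not steps:
        return []
    path = [steps[0]["source_agent"]]
    for step in steps:
        # Each delegation must start from the agent that received the previous one.
        if step["source_agent"] != path[-1]:
            raise ValueError(f"broken chain at seq {step['seq']}")
        path.append(step["dest_agent"])
    return path


anchors = [
    {"cycle_id": "c-1", "seq": 2, "procedure": "AI-MULTI.1",
     "source_agent": "deal-screening-agent", "dest_agent": "pricing-agent"},
    {"cycle_id": "c-1", "seq": 1, "procedure": "AI-MULTI.1",
     "source_agent": "triage-agent", "dest_agent": "deal-screening-agent"},
    {"cycle_id": "c-2", "seq": 1, "procedure": "AI-MULTI.1",
     "source_agent": "triage-agent", "dest_agent": "pricing-agent"},
]
```

A gap in the chain (a delegation whose source never received work) surfaces as an error rather than a silently incomplete trace.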
11. Centralized Policy Configuration
Policy-as-Code with Version Binding
OpenAI requires: Define governance once, apply organization-wide. Package as pip-installable module via Git repository. Track policy changes with audit trail.
How SWT3 addresses it: The .swt3.yaml configuration file declares all governance policies in a single, version-controlled file: clearing level, required procedures, tool allowlists/blocklists, trust mesh settings, signing requirements, and industry profile. The swt3 doctor CLI validates the configuration against the schema before deployment. Every witness anchor includes a policy_version hash, creating a cryptographic link between each inference and the exact policy configuration that was active. The swt3 init command generates a starter configuration from 14 industry-specific templates.
Pull the .swt3.yaml file from version control alongside the policy_version hashes in the witness ledger. The hash chain proves that policy was consistently applied and any configuration change is traceable.
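The correlation between a committed configuration and the anchors it governed can be sketched as follows. The source does not say how policy_version is derived, so hashing the raw file text with SHA-256 is an assumption, and the sample configuration lines are placeholders rather than the real .swt3.yaml schema.

```python
import hashlib


def policy_version(config_text: str) -> str:
    """Assumed derivation: SHA-256 over the exact bytes of the policy file."""
    return hashlib.sha256(config_text.encode("utf-8")).hexdigest()


def anchors_under_policy(anchors: list, config_text: str) -> list:
    """Select the anchors that were minted while this exact configuration was active."""
    expected = policy_version(config_text)
    return [a for a in anchors if a["policy_version"] == expected]


# Two revisions of a policy file as they might appear in version control (placeholder content).
config_v1 = "clearing_level: 1\n"
config_v2 = "clearing_level: 2\n"

anchors = [
    {"id": "a1", "policy_version": policy_version(config_v1)},
    {"id": "a2", "policy_version": policy_version(config_v2)},
    {"id": "a3", "policy_version": policy_version(config_v2)},
]
```

An anchor whose hash matches no committed revision would indicate that a configuration ran in production without passing through version control.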
12. Trace Export to Internal Systems
OpenTelemetry Integration
OpenAI requires: Custom trace processors that export spans to internal monitoring without OpenAI storage. Integration points: Datadog, Splunk, ELK, internal databases.
How SWT3 addresses it: The SWT3 OTel exporter converts witness anchors into OpenTelemetry spans that route directly to any OTel-compatible collector. Each span includes the procedure ID, fingerprint, factors, verdict, and clearing level as span attributes. This means organizations can correlate SWT3 witness evidence with their existing observability stack without storing any data at OpenAI or at TeNova. Combined with the clearing engine, the OTel export respects data residency requirements by stripping content before it enters the telemetry pipeline.
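The anchor-to-span mapping can be illustrated without any OpenTelemetry dependency by building the attribute dictionary a span would carry. The `swt3.*` attribute names, the anchor layout, and the `to_span_attributes` helper are hypothetical. The five attributes themselves are the ones listed above.

```python
def to_span_attributes(anchor: dict) -> dict:
    """Flatten a witness anchor into span attributes; raw content is never copied."""
    return {
        "swt3.procedure_id": str(anchor["procedure"]),
        "swt3.fingerprint": str(anchor["fingerprint"]),
        "swt3.verdict": str(anchor["verdict"]),
        "swt3.clearing_level": int(anchor["clearing_level"]),
        # Span attribute arrays hold a single primitive type, so factors become strings.
        "swt3.factors": [str(f) for f in anchor.get("factors", [])],
    }


anchor = {
    "procedure": "AI-INF.1",
    "fingerprint": "ab" * 32,
    "verdict": "pass",
    "clearing_level": 1,
    "factors": [1, 0, 1],
    "prompt": "raw text that must not reach the telemetry pipeline",
}
attributes = to_span_attributes(anchor)
```

Because the function copies only an explicit set of fields, content such as the raw prompt cannot leak into the telemetry pipeline by accident.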
13. Quick Start
Add SWT3 witnessing to an OpenAI Agents SDK deployment:
Python:

    pip install swt3-ai openai-agents

Then, in your agent code:

    import hashlib

    from swt3_ai import SWT3Witness

    def sha256(text: str) -> str:
        """Helper (not part of the SDK): hex SHA-256 digest of a string."""
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    witness = SWT3Witness(
        tenant_id="YOUR_TENANT",
        agent_id="deal-screening-agent",
        signing_key_env="SWT3_SIGNING_KEY",
    )

    # Witness each inference in your agent loop
    # (prompt and response are the strings from your own LLM call)
    result = witness.witness(
        model="gpt-4.1",
        prompt_hash=sha256(prompt),
        response_hash=sha256(response),
        procedure="AI-INF.1",
    )

    # Witness tool calls from @function_tool decorated functions
    result = witness.wrap_tool("search_deals", input_data, output_data)

    # Witness agent handoffs
    result = witness.witness_chain(
        source_agent="triage-agent",
        dest_agent="deal-screening-agent",
        procedure="AI-MULTI.1",
    )
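TypeScript:

    npm install @tenova/swt3-ai openai

Then, in your agent code:

    import { createHash } from 'node:crypto';
    import { SWT3Witness } from '@tenova/swt3-ai';

    // Helper (not part of the SDK): hex SHA-256 digest of a string.
    const sha256 = (text: string): string =>
      createHash('sha256').update(text).digest('hex');

    const witness = new SWT3Witness({
      tenantId: 'YOUR_TENANT',
      agentId: 'deal-screening-agent',
      signingKeyEnv: 'SWT3_SIGNING_KEY'
    });

    // Witness each inference
    const result = await witness.witness({
      model: 'gpt-4.1',
      promptHash: sha256(prompt),
      responseHash: sha256(response),
      procedure: 'AI-INF.1'
    });

    // Witness tool calls (separate name: `result` is already declared above)
    const toolResult = await witness.wrapTool('search_deals', inputData, outputData);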
14. Coverage Summary
| OpenAI Requirement | SWT3 Procedures | Coverage |
|---|---|---|
| Non-human agent identity | AI-ID.1 | Full |
| Observability and tracing | AI-INF.1, AI-AUDIT.1 | Full |
| Zero data retention | Clearing levels 0-3 | Full |
| Input guardrails (pre-flight) | AI-GRD.1, AI-GRD.2 | Partial |
| Output guardrails (post-flight) | AI-GRD.1, AI-GRD.3 | Partial |
| Tool authorization | AI-TOOL.1 | Full |
| Multi-agent handoffs | AI-MULTI.1, AI-CHAIN.1 | Full |
| PII detection/redaction | Clearing engine (CL2+) | Full |
| Centralized policy config | .swt3.yaml | Full |
| Trace export to internal systems | OTel exporter | Full |
| Scope-limited agent roles | AI-ACC.1 | Full |
| Prompt injection detection | AI-GRD.1, AI-SEC.1 | Partial |
9 of 12 requirements: Full coverage through direct witnessing and evidence production.
3 of 12 requirements: Partial coverage. SWT3 witnesses whether the control was active; the primary defense (prompt filtering, PII scanning, output guardrails) requires OpenAI's guardrail library or equivalent tooling.