Who this is for: Infrastructure architects, compliance officers, and MLOps teams operating GPU clusters for AI training and inference. Applicable to any organization running dedicated GPU infrastructure that serves regulated markets (EU, US federal, financial services, healthcare, defense).

Context: GPU clusters purpose-built for AI training and inference are becoming the defining infrastructure of the decade. They produce predictions the way power plants produce electricity. And like power plants, they need metering. The EU AI Act (enforcement August 2, 2026) requires automatic logging of AI system behavior (Art. 12). Today, no standard compliance layer exists for GPU inference infrastructure.

Contents

1. What is an AI Factory
2. The Regulatory Landscape
3. The Compliance Gap
4. SWT3 as the Compliance Metering Layer
5. Architecture: Where SWT3 Sits
6. Procedure Mapping for AI Factories
7. Deployment Options
8. Industry Profiles
9. Getting Started

1. What is an AI Factory

An AI factory is a large-scale GPU cluster purpose-built for training and serving AI models. Unlike general-purpose cloud compute, AI factories are optimized for one thing: turning data into predictions at scale.

The core components of a modern AI factory:

At scale, a single AI factory can serve millions of inference requests per day across dozens of models. Sovereign AI deployments (national GPU clusters in France, Japan, India, Singapore, Saudi Arabia) are multiplying this pattern globally.

2. The Regulatory Landscape

AI factories are subject to a growing web of regulations that demand evidence of responsible AI operation. The common thread: regulators want to know what the AI did, when, why, and whether it was operating within bounds.

Regulation | Key Requirement | AI Factory Impact
EU AI Act Art. 12 | Automatic logging of AI system behavior | Every inference must produce a traceable record
EU AI Act Art. 9 | Risk management system | Guardrail decisions must be documented
EU AI Act Art. 11 | Technical documentation | Model identity, weights, and versions must be tracked
EU AI Act Art. 15 | Accuracy, robustness, cybersecurity | Drift detection and adversarial testing evidence required
EU AI Act Art. 50 | Transparency for AI-generated content | Content provenance marking at the inference layer
EO 14110 Sec. 4.2 | Dual-use foundation model reporting | Training compute and model capability documentation
NIST AI RMF | MAP / MEASURE / MANAGE / GOVERN | Continuous monitoring of AI system behavior
CMMC Level 2+ | CUI protection, access control, audit | Classified inference requires hardware attestation
SR 11-7 | Model risk management | Drift detection, baseline comparison, validation evidence
Colorado AI Act | Algorithmic impact assessment | Automated decision documentation (effective Jan 2027)

Enforcement timeline: Most EU AI Act obligations apply from August 2, 2026. The Colorado AI Act takes effect January 1, 2027. CMMC 2.0 rulemaking is final. SR 11-7 examinations are ongoing. The compliance window is closing.

3. The Compliance Gap

AI factories have world-class infrastructure for running models. They have no standard infrastructure for proving what those models did.

You would never run a data center without audit logs. AI factories are the new data centers, and most of them have no compliance audit trail.

4. SWT3 as the Compliance Metering Layer

SWT3 is a cryptographic witnessing protocol that sits alongside the inference pipeline and produces compliance evidence without interfering with model serving.

What SWT3 is:

What SWT3 is not:

How It Works

For every inference event, SWT3 produces a Witness Anchor: a compact, cryptographically signed record containing:

The anchor fingerprint formula is fixed, so the Python, TypeScript, Rust, C#, and Ruby SDKs produce identical fingerprints for the same inputs. Raw prompts and responses never leave the inference host. Only hashes are transmitted.
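As a rough illustration of the hash-only idea, the sketch below builds a record from SHA-256 digests and derives a fingerprint from a canonical JSON encoding. The field names, the canonicalization, and the `mint_anchor` helper are assumptions made for this example; the actual SWT3 fingerprint formula and its signing step are not reproduced here.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    """Return the hex SHA-256 digest of raw bytes."""
    return hashlib.sha256(data).hexdigest()


def mint_anchor(prompt: str, response: str, model_id: str, timestamp_ms: int) -> dict:
    """Build a hash-only record (hypothetical layout, signing omitted)."""
    record = {
        "prompt_hash": sha256_hex(prompt.encode("utf-8")),
        "response_hash": sha256_hex(response.encode("utf-8")),
        "model_id": model_id,
        "timestamp_ms": timestamp_ms,
    }
    # Canonical JSON (sorted keys, no whitespace) so that any implementation
    # hashing the same fields produces the same bytes and the same digest.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    record["fingerprint"] = sha256_hex(canonical.encode("utf-8"))
    return record


anchor = mint_anchor("What is 2+2?", "4", "demo-model-v1", 1700000000000)
```

Only digests appear in `anchor`; the prompt and response text stay on the host. Cross-language agreement depends on every SDK serializing the fields to identical bytes, which is why the sketch pins the encoding explicitly.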

5. Architecture: Where SWT3 Sits

SWT3 runs parallel to the inference path. It observes completion events and mints anchors in background threads. The inference pipeline is never blocked.

                 AI FACTORY INFERENCE PATH

+-----------+     +-------------------+     +-----------+
|           |     |                   |     |           |
|  Client   +---->+  Triton / Dynamo  +---->+ Response  |
|  Request  |     |  Inference Server |     |           |
|           |     |                   |     |           |
+-----------+     +--------+----------+     +-----------+
                           |
                  (completion event)
                           |
                  +--------v----------+
                  |                   |
                  |   SWT3 Witness    |  (background thread)
                  |     Adapter       |
                  |                   |
                  +--------+----------+
                           |
                   (SHA-256 anchor)
                           |
                  +--------v----------+
                  |                   |
                  |    Compliance     |
                  |      Ledger       |
                  |                   |
                  +-------------------+
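The non-blocking pattern described here can be sketched with a queue and a worker thread from the standard library. This is a minimal illustration under assumed names (`on_completion`, `mint_worker`, `minted`), not the adapter's actual internals: the serving path only enqueues, and hashing happens off the request path.

```python
import hashlib
import queue
import threading

anchor_queue = queue.Queue()  # completion events waiting to be witnessed
minted = []                   # stand-in for the compliance ledger


def mint_worker() -> None:
    """Drain completion events and hash them away from the serving path."""
    while True:
        payload = anchor_queue.get()
        if payload is None:  # sentinel: shut the worker down
            break
        minted.append(hashlib.sha256(payload).hexdigest())


def on_completion(response_bytes: bytes) -> None:
    """Called by the serving path; it enqueues and returns immediately."""
    anchor_queue.put_nowait(response_bytes)


worker = threading.Thread(target=mint_worker, daemon=True)
worker.start()
on_completion(b"response-1")
on_completion(b"response-2")
anchor_queue.put(None)  # request shutdown after the pending events
worker.join()
```

An unbounded queue keeps the request path from ever waiting, at the cost of memory growth if the worker falls behind; a production design would need a policy for that case.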

Three integration points for AI factories:

Triton BLS Middleware

Decorate any Triton Python backend with @witness_execute(). The decorator intercepts the execute() method, hashes request/response tensors, and mints an anchor after each batch completes.

from swt3_ai.adapters.triton import witness_execute

class TritonPythonModel:
    @witness_execute()
    def execute(self, requests):
        # your inference logic unchanged
        return responses
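To show what such a decorator does conceptually, here is a simplified stand-in. `witness_execute_sketch`, `ANCHORS`, and `EchoModel` are hypothetical names for this example and do not reflect the `swt3_ai` implementation; real tensors would be hashed from their raw bytes rather than from `repr()`.

```python
import functools
import hashlib

ANCHORS = []  # stand-in for a ledger client


def witness_execute_sketch():
    """Hypothetical decorator factory that records hashes around execute()."""
    def decorator(execute):
        @functools.wraps(execute)
        def wrapper(self, requests):
            responses = execute(self, requests)  # inference runs unchanged
            for request, response in zip(requests, responses):
                ANCHORS.append({
                    "request_hash": hashlib.sha256(repr(request).encode()).hexdigest(),
                    "response_hash": hashlib.sha256(repr(response).encode()).hexdigest(),
                })
            return responses  # caller sees exactly what execute() returned
        return wrapper
    return decorator


class EchoModel:
    @witness_execute_sketch()
    def execute(self, requests):
        return [r.upper() for r in requests]


result = EchoModel().execute(["a", "b"])
```

The wrapper returns the original responses untouched, so instrumenting a backend does not change what clients receive.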

Dynamo Async Generator

Wrap streaming inference endpoints with @witness_endpoint(). Chunks pass through untouched in real-time. The anchor mints after stream completion from accumulated data.

from swt3_ai.adapters.dynamo import witness_endpoint

@witness_endpoint()
@dynamo_endpoint()
async def generate(self, request):
    async for chunk in self.backend.generate(request):
        yield chunk
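The pass-through behavior can be sketched as an async generator wrapper that yields each chunk immediately while updating a running digest, then records the result once the stream ends. `witness_stream_sketch` and `STREAM_ANCHORS` are assumed names for illustration only, and chunks are assumed to be strings.

```python
import asyncio
import functools
import hashlib

STREAM_ANCHORS = []  # stand-in for a ledger client


def witness_stream_sketch(endpoint):
    """Hypothetical wrapper: forward chunks at once, hash after completion."""
    @functools.wraps(endpoint)
    async def wrapper(*args, **kwargs):
        digest = hashlib.sha256()
        async for chunk in endpoint(*args, **kwargs):
            digest.update(chunk.encode("utf-8"))  # accumulate incrementally
            yield chunk                            # pass through untouched
        STREAM_ANCHORS.append(digest.hexdigest())  # mint once, after the stream
    return wrapper


@witness_stream_sketch
async def generate(prompt):
    for token in prompt.split():
        yield token


async def collect():
    return [chunk async for chunk in generate("hello streaming world")]


chunks = asyncio.run(collect())
```

In this sketch the anchor is only recorded when the consumer reads the stream to the end; a client that disconnects early would need separate handling (for example a `try`/`finally` that marks the stream as truncated).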

OpenShell OCSF Consumer

For agent runtimes that produce OCSF v1.7.0 events, the OpenShell adapter consumes structured logs and mints anchors from five event classes: network activity, process activity, file activity, configuration changes, and security findings.

from swt3_ai.adapters.openshell import OpenShellWitness

observer = OpenShellWitness()
observer.process_event(ocsf_event)  # mints anchor from event
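A consumer of this kind can be pictured as a filter plus a canonical hash. The sketch below assumes each event is a dict carrying a `class_name` field; the class labels in `WITNESSED_CLASSES` are illustrative guesses for the five categories above and should be checked against the OCSF schema, and `anchor_from_event` is a hypothetical helper, not the adapter's API.

```python
import hashlib
import json

# Illustrative labels for the five witnessed classes; confirm the exact
# class names and UIDs against the OCSF schema used by your runtime.
WITNESSED_CLASSES = {
    "Network Activity",
    "Process Activity",
    "File System Activity",
    "Device Config State Change",
    "Security Finding",
}


def anchor_from_event(event: dict):
    """Return a hash-only anchor for a witnessed class, else None."""
    class_name = event.get("class_name")
    if class_name not in WITNESSED_CLASSES:
        return None
    # Sorted keys make the digest independent of field order in the log line.
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return {
        "class_name": class_name,
        "time": event.get("time"),
        "event_hash": hashlib.sha256(canonical.encode("utf-8")).hexdigest(),
    }


event = {"class_name": "Process Activity", "time": 1700000000000, "process": {"name": "curl"}}
event_anchor = anchor_from_event(event)
ignored = anchor_from_event({"class_name": "Authentication", "time": 1})
```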

6. Procedure Mapping for AI Factories

Each AI factory compliance concern maps to one or more SWT3 procedures from the Unified Compliance Taxonomy.

AI Factory Concern | Procedure | What It Witnesses | Regulation
Inference provenance | AI-INF.1 | Prompt hash, response hash, model ID, timestamp | Art. 12
Model identity | AI-MDL.1 | Model weight file integrity (SHA-256) | Art. 11
Adapter stack | AI-MDL.6 | LoRA/QLoRA adapter versions and hashes | Art. 11
Quantization | AI-MDL.7 | Quantization method, bit depth, calibration | Art. 15
Hardware attestation | AI-HW.1 | GPU inventory, health, topology | SI-7, CMMC
Hardware root of trust | AI-HW.3 | TPM 2.0 / Pluton PCR register state | CMMC
Guardrail decisions | AI-GRD.1 | Policy check result, rule version | Art. 9
Agent tool calls | AI-TOOL.1 | Tool name, input hash, output hash | Art. 14
Access control | AI-ACC.1 | Credential routing, authorization gate | AC-2
Model drift | AI-DRIFT.1 | Statistical divergence from baseline | Art. 15, SR 11-7
Adversarial testing | AI-REDTEAM.1 | Red team scope, findings, methodology | Art. 9(8)
Supply chain | AI-SBOM.1 | Component manifest, dependency hashes | EO 14028
Content provenance | AI-MARK.1 | AI-generated content marking | Art. 50
Audit trail integrity | AI-AUDIT.1 | Tamper-evident log with Merkle rollup | Art. 12
Data governance | AI-DATA.1 | Input classification, data source | Art. 10

Key Procedure Cards

AI-INF.1

Inference Request Witnessing

Every inference request produces a witness anchor binding the prompt hash, response hash, model identifier, and millisecond-precision timestamp into a single cryptographic fingerprint. This is the foundational procedure for Article 12 automatic logging compliance.

Assessor Tip

Request the tenant's daily Merkle root for any date range. Each root covers all inference anchors for that day. Verify any individual anchor's inclusion via the proof API.

AI-HW.1

Hardware Attestation

Witnesses the GPU/accelerator inventory, health status, memory topology, and interconnect configuration of the inference host. For AI factories, this provides the link between a compliance record and the physical hardware that produced it.

Assessor Tip

Cross-reference AI-HW.1 anchors with AI-INF.1 anchors to trace any inference back to the specific hardware that served it. The hardware attestation anchor includes GPU UUIDs.

AI-DRIFT.1

Model Drift Detection

Witnesses statistical divergence metrics comparing current model behavior against a registered baseline. For AI factories serving multiple model versions, drift detection provides continuous evidence that models are operating within validated bounds.

Assessor Tip

For SR 11-7 examinations, drift anchors provide the continuous monitoring evidence that model risk management requires. Look for the baseline_id and divergence_score in the anchor factors.
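The source does not say which divergence metric SWT3 records. As one common choice in model risk management, the sketch below computes a Population Stability Index (PSI) between a baseline and a current score distribution. The bin values and the `baseline-2024-q4` identifier are made up for the example; only the `baseline_id` and `divergence_score` factor names come from the text above.

```python
import math


def population_stability_index(baseline, current, eps=1e-6):
    """PSI between two binned distributions given as lists of proportions."""
    if len(baseline) != len(current):
        raise ValueError("distributions must use the same bins")
    psi = 0.0
    for b, c in zip(baseline, current):
        b, c = max(b, eps), max(c, eps)  # avoid log(0) on empty bins
        psi += (c - b) * math.log(c / b)
    return psi


baseline_bins = [0.25, 0.25, 0.25, 0.25]  # registered baseline distribution
current_bins = [0.10, 0.20, 0.30, 0.40]   # distribution observed in production

drift_factors = {
    "baseline_id": "baseline-2024-q4",  # hypothetical identifier
    "divergence_score": population_stability_index(baseline_bins, current_bins),
}
```

PSI is zero when the two distributions match and grows as they diverge; the threshold at which a score counts as out of bounds is a validation decision rather than part of the formula.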

AI-AUDIT.1

Audit Trail Integrity

Witnesses the integrity of the compliance audit trail itself. Daily Merkle rollups bind all anchors for a tenant into a single root hash. Any tampering with historical records invalidates the Merkle proof chain.

Assessor Tip

The Merkle root is the single artifact that proves the entire day's compliance record is intact. Request it via the /api/v1/merkle endpoint.
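For readers unfamiliar with Merkle proofs, the following is a minimal sketch of one common construction: a binary SHA-256 tree that duplicates the last node on odd levels. SWT3's actual tree layout, leaf encoding, and proof format are not specified in this guide, so the function names and structure here are assumptions.

```python
import hashlib


def h(data: bytes) -> bytes:
    """SHA-256 digest as raw bytes."""
    return hashlib.sha256(data).digest()


def merkle_root_and_proof(leaves, index):
    """Return (root, proof); proof is a list of (sibling_hash, sibling_is_left)."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        sibling = index ^ 1          # neighbor within the same pair
        proof.append((level[sibling], sibling < index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify_inclusion(leaf, proof, root):
    """Recompute the path from a leaf to the root using the sibling hashes."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


day_anchors = [b"anchor-0", b"anchor-1", b"anchor-2", b"anchor-3", b"anchor-4"]
root, proof = merkle_root_and_proof(day_anchors, 2)
```

An assessor holding only `root` can check a single anchor with `verify_inclusion(b"anchor-2", proof, root)` without seeing the other anchors. Changing any leaf changes the root, which is what makes the daily rollup tamper-evident.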

7. Deployment Options

SWT3 integrates with AI factories at two levels, depending on the deployment model and compliance requirements.

Option 1: SDK Direct

Install the Python SDK and add a decorator to the inference backend.

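# Shell: install the SDK
pip install swt3-ai

# Shell: set the connection string
export SWT3_DSN=https://axm_live_xxx@sovereign.tenova.io/MY_ENCLAVE

# Python: add one decorator to the Triton backend
from swt3_ai.adapters.triton import witness_execute

class TritonPythonModel:
    @witness_execute()
    def execute(self, requests):
        # your inference logic unchanged
        return responses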

Best for: Single model, direct code access, fastest integration path.

Option 2: Sidecar Container

Deploy a separate container alongside the inference server that consumes completion events from the Triton metrics endpoint or Dynamo event stream. No changes to the inference code.

Best for: Multi-model Kubernetes clusters, existing CI/CD pipelines, teams that cannot modify model code.

Configuration

Both deployment options use the same configuration pattern:

# Option A: Connection string (recommended)
SWT3_DSN=https://axm_live_xxx@sovereign.tenova.io/MY_ENCLAVE

# Option B: Individual environment variables
SWT3_ENDPOINT=https://sovereign.tenova.io
SWT3_API_KEY=axm_live_xxx
SWT3_TENANT_ID=MY_ENCLAVE

# Optional: Set clearing level (0=Analytics, 1=Standard, 2=Sensitive, 3=Classified)
SWT3_CLEARING_LEVEL=1

If no configuration is set, all SWT3 adapters operate as transparent no-ops. You can instrument your code today and activate witnessing when ready.
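The resolution order implied above (connection string first, then individual variables, otherwise inactive) might look roughly like this. `resolve_config` and the returned dict keys are hypothetical; the sketch only illustrates how a DSN of the documented shape splits into endpoint, key, and tenant using the standard library.

```python
import os
from urllib.parse import urlsplit


def resolve_config(env=None):
    """Return connection settings, or None when witnessing should be a no-op."""
    env = os.environ if env is None else env
    dsn = env.get("SWT3_DSN")
    if dsn:
        parts = urlsplit(dsn)
        return {
            "endpoint": f"{parts.scheme}://{parts.hostname}",
            "api_key": parts.username,
            "tenant_id": parts.path.lstrip("/"),
        }
    keys = ("SWT3_ENDPOINT", "SWT3_API_KEY", "SWT3_TENANT_ID")
    if all(env.get(k) for k in keys):
        return {
            "endpoint": env["SWT3_ENDPOINT"],
            "api_key": env["SWT3_API_KEY"],
            "tenant_id": env["SWT3_TENANT_ID"],
        }
    return None  # nothing configured: adapters stay inert


config = resolve_config({"SWT3_DSN": "https://axm_live_xxx@sovereign.tenova.io/MY_ENCLAVE"})
```

Returning `None` rather than raising is what lets instrumented code run unchanged in environments where witnessing has not been switched on yet. This sketch ignores any port in the DSN.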

8. Industry Profiles

SWT3 ships 14 industry compliance profiles that pre-configure the required procedures, clearing levels, and signing tiers for specific regulatory environments. Four profiles are directly relevant to AI factory operators:

Profile | Framework | Clearing Level | Key Procedures
defense-govcon | CMMC + NIST 800-171 | 3 (Classified) | AI-HW.1, AI-HW.3, AI-INF.1, AI-ACC.1, AI-SBOM.1
fintech-model-risk | SR 11-7 | 2 (Sensitive) | AI-DRIFT.1, AI-BASE.1, AI-INF.1, AI-AUDIT.1
healthcare-clinical | HIPAA | 2 (Sensitive) | AI-CONSENT.1, AI-DATA.1, AI-INF.1, AI-EXPL.1
autonomous-systems | Safety-critical | 3 (Classified) | AI-SAFE.1, AI-ROBUST.1, AI-HW.1, AI-CHAIN.1

Profiles are YAML files that can be loaded via swt3 init --profile defense-govcon or by setting SWT3_PROFILE in the environment.

9. Getting Started

Three steps to compliance-metered inference:

Step 1: Install

# Python
pip install swt3-ai

# TypeScript
npm install @tenova/swt3-ai

Step 2: Instrument

# Triton backend (one decorator)
from swt3_ai.adapters.triton import witness_execute

class TritonPythonModel:
    @witness_execute()
    def execute(self, requests):
        return responses

# Or Dynamo endpoint (one decorator)
from swt3_ai.adapters.dynamo import witness_endpoint

@witness_endpoint()
@dynamo_endpoint()
async def generate(self, request):
    async for chunk in self.backend.generate(request):
        yield chunk

Step 3: Verify

# Run the zero-config demo to see anchors locally
python -m swt3_ai.demo

# Verify an anchor fingerprint
# https://sovereign.tenova.io/verify

For connected mode (cloud ledger), set the SWT3_DSN environment variable and anchors will stream to the compliance ledger automatically.
