How to Attest AI Compliance Across Multi-Silicon Kubernetes Clusters

Contents

1. Quick Start
2. How It Works
3. GKE with Google TPU
4. EKS with AWS Trainium
5. AKS with NVIDIA GPU
6. On-Prem and Mixed Silicon
7. Upgrading to Cloud Mode
8. Log Aggregation Patterns
9. Regulatory Mapping

Audience: Platform engineers, DevOps teams, and compliance officers responsible for AI workloads on Kubernetes. No prior SWT3 experience required.

1. Quick Start

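The swt3-witness Helm chart deploys a DaemonSet that runs on every schedulable node in your cluster (tainted nodes need matching tolerations, covered in the platform sections of this guide). It auto-discovers accelerator hardware and emits attestation evidence as structured JSON.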

Install (local mode, zero config)

helm install swt3 oci://ghcr.io/tenova-labs/charts/swt3-witness

That's it. No API keys, no accounts, and no outbound network calls at runtime. Every node the DaemonSet is scheduled on now emits hardware attestation anchors to stdout.

Verify

# View attestation output
kubectl logs -l app.kubernetes.io/name=swt3-witness -f

# Check node coverage
kubectl get pods -l app.kubernetes.io/name=swt3-witness -o wide

# Health check (port-forward blocks, so run curl from a second terminal)
kubectl port-forward ds/swt3-swt3-witness 9090:9090
curl http://localhost:9090/health

What the output looks like

{
  "swt3_witness": true,
  "procedure": "AI-HW.1",
  "anchor_fingerprint": "96b7d56c0245",
  "anchor_epoch": 1750507200,
  "factor_a": 8,
  "factor_b": 1,
  "factor_c": 1,
  "clearing_level": 1,
  "agent_id": "witness-node-gpu-01",
  "silicon_vendor": "nvidia",
  "discovery_method": "nvidia-smi",
  "topology": "DGX-H100",
  "accelerator_count": 8,
  "gpu_count": 8,
  "total_memory_mb": 655360,
  "hostname_hash": "a1b2c3d4e5f6",
  "timestamp": "2026-06-24T12:00:00.000Z"
}

The swt3_witness: true field lets your log pipeline (Fluentd, Vector, Datadog) filter SWT3 attestation lines from other container output.

2. How It Works

The DaemonSet pod on each node runs a discovery cycle on a configurable interval (default: 1 hour). Each cycle:

1. Discovers accelerator hardware using six vendor-specific paths, tried in priority order.
2. Hashes all identifiers (GPU UUIDs, bus IDs, hostnames) with SHA-256 at the point of discovery. Raw values never leave the pod.
3. Mints an AI-HW.1 anchor with a cryptographic fingerprint using the locked SWT3 formula.
4. Emits the anchor as structured JSON to stdout (local mode) or flushes it to the SWT3 clearing house (cloud mode).
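The hashing step in the cycle above can be sketched in a few lines of Python. This is an illustration only, not the chart's implementation: the 12-character truncation is inferred from the sample `hostname_hash` value, the absence of a salt is an assumption, and the key names `gpu_uuid`, `bus_id`, and `hostname` are hypothetical. The fingerprint formula itself is not reproduced because this guide does not specify it.

```python
import hashlib

def hash_identifier(raw, length=12):
    """Return a truncated SHA-256 hex digest of a raw hardware identifier.

    The 12-character default mirrors the sample output; the real truncation
    length and any salting scheme are assumptions.
    """
    digest = hashlib.sha256(raw.encode("utf-8")).hexdigest()
    return digest[:length]

def scrub(record, sensitive_keys=("gpu_uuid", "bus_id", "hostname")):
    """Replace sensitive fields with hashed counterparts (hypothetical key names)."""
    cleaned = {}
    for key, value in record.items():
        if key in sensitive_keys:
            # Only the hash is kept; the raw value is dropped from the record.
            cleaned[key + "_hash"] = hash_identifier(str(value))
        else:
            cleaned[key] = value
    return cleaned
```

Calling `scrub({"hostname": "node-gpu-01", "gpu_count": 8})` yields a record with `hostname_hash` and `gpu_count` but no raw hostname, which is the property the cycle relies on.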

Discovery Priority

Priority	Silicon	Method	Status	What It Detects
1	NVIDIA	`nvidia-smi`	Live	GPU name, memory, driver version, topology (NVL72, DGX, HGX), interconnect (NVSwitch, NVLink, PCIe)
2	Google TPU	`TPU_NAME` env var	Live	TPU version, worker count. GKE sets this automatically on TPU VMs.
3	AMD	`rocm-smi`	Live	MI300X, MI325X, MI250 series. Memory, bus ID.
4	AWS	`neuron-ls`	Live	Trainium2, Trainium, Inferentia. NeuronCore count.
5	Intel	`hl-smi`	Live	Gaudi3, Gaudi2 series. Memory, serial number.
6	Any	PCI fallback	Live	Reads `/sys/bus/pci/devices` for 3D controllers and processing accelerators. Vendor identification via PCI vendor ID.

All six discovery paths are live in v0.5.8. Each node reports its silicon vendor, discovery method, and per-accelerator detail. Non-accelerator nodes produce a valid anchor attesting "no accelerator detected": the absence of hardware is also an auditable fact.
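A priority-ordered probe like the one in the table could look roughly like the following Python sketch. It is not the chart's code. Checking that a binary is on PATH stands in for actually running the tool, stopping at the first match is one reading of "tried in priority order", and every label other than "nvidia", "google-tpu", "nvidia-smi", and "tpu-env" is an assumption.

```python
import os
import shutil

def _tool_present(name):
    """Build a probe that checks whether a vendor CLI is on PATH."""
    return lambda: shutil.which(name) is not None

# (vendor label, method label, probe) in the priority order from the table.
# Labels for AMD, AWS, Intel, and the PCI fallback are hypothetical.
PROBES = [
    ("nvidia", "nvidia-smi", _tool_present("nvidia-smi")),
    ("google-tpu", "tpu-env", lambda: bool(os.environ.get("TPU_NAME"))),
    ("amd", "rocm-smi", _tool_present("rocm-smi")),
    ("aws", "neuron-ls", _tool_present("neuron-ls")),
    ("intel", "hl-smi", _tool_present("hl-smi")),
    ("unknown", "pci-fallback", lambda: os.path.isdir("/sys/bus/pci/devices")),
]

def first_available(probes=PROBES):
    """Return (vendor, method) for the first probe that succeeds, else None."""
    for vendor, method, probe in probes:
        if probe():
            return vendor, method
    return None
```

Passing a custom `probes` list makes the ordering logic easy to exercise without any accelerator hardware present.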

Anchor Factors

Factor	Meaning	Example
`factor_a`	Accelerator count	8 (eight GPUs detected)
`factor_b`	Health status	1 (accelerators present and responding)
`factor_c`	Topology code	1 (multi-gpu / DGX class)
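To make the factor table concrete, the sketch below maps the three factors of a parsed anchor to their documented meanings and flags a disagreement between `factor_a` and `accelerator_count`. The function names are hypothetical, and the valid ranges of `factor_b` and `factor_c` are not stated in this guide, so the sketch does not validate them.

```python
def describe_factors(anchor):
    """Map the three anchor factors to the meanings given in the table."""
    return {
        "accelerator_count": anchor["factor_a"],
        "health_status": anchor["factor_b"],
        "topology_code": anchor["factor_c"],
    }

def count_mismatch(anchor):
    """True when factor_a disagrees with the reported accelerator_count."""
    return anchor["factor_a"] != anchor.get("accelerator_count")

# Subset of the sample anchor from the Quick Start section.
sample = {
    "factor_a": 8,
    "factor_b": 1,
    "factor_c": 1,
    "accelerator_count": 8,
}
summary = describe_factors(sample)
```

A check like `count_mismatch` is cheap to run in a log pipeline and catches records that were truncated or altered in transit.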

3. GKE with Google TPU

GKE automatically sets TPU_NAME and TPU_WORKER_HOSTNAMES environment variables on TPU node VMs. The DaemonSet detects these and reports TPU version, chip count, memory, and topology (tpu-single or tpu-pod).

# Deploy to a GKE cluster
helm install swt3 oci://ghcr.io/tenova-labs/charts/swt3-witness

TPU nodes report silicon_vendor: "google-tpu", discovery_method: "tpu-env", and per-chip accelerator detail. Memory is inferred from TPU generation (v4: 32 GB, v5e: 16 GB, v5p: 95 GB, v6e/Trillium: 32 GB).
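The memory inference described above amounts to a lookup by TPU generation. A minimal Python sketch follows, assuming the per-chip figures listed in the paragraph and a conversion of 1 GB = 1024 MB (consistent with the `total_memory_mb` value in the sample anchor). The function name and the generation keys are hypothetical.

```python
# Per-chip memory in GB, taken from the figures quoted in this section.
TPU_MEMORY_GB = {"v4": 32, "v5e": 16, "v5p": 95, "v6e": 32}

def tpu_total_memory_mb(generation, chip_count):
    """Infer total memory in MB for a TPU node from its generation and chip count.

    Raises KeyError for a generation that is not in the lookup table.
    """
    per_chip_gb = TPU_MEMORY_GB[generation.lower()]
    return per_chip_gb * 1024 * chip_count
```

For example, a four-chip v5p node works out to 95 x 1024 x 4 = 389120 MB under these assumptions.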

4. EKS with AWS Trainium

The DaemonSet discovers Trainium and Inferentia devices via neuron-ls --json-output. It reports model name, memory, device ID, and PCI bus address per NeuronCore.

# Deploy with tolerations for Neuron device plugin taints
helm install swt3 oci://ghcr.io/tenova-labs/charts/swt3-witness \
  --set 'tolerations[0].key=aws.amazon.com/neuron' \
  --set 'tolerations[0].operator=Exists' \
  --set 'tolerations[0].effect=NoSchedule'

5. AKS with NVIDIA GPU

Azure Kubernetes Service GPU node pools typically include the NVIDIA device plugin and container toolkit. Setting runtimeClassName to nvidia gives the DaemonSet pod access to nvidia-smi for detailed GPU information.

# Deploy with NVIDIA runtime for full discovery
helm install swt3 oci://ghcr.io/tenova-labs/charts/swt3-witness \
  --set runtimeClassName=nvidia

Without the NVIDIA runtime, the DaemonSet uses PCI fallback to detect NVIDIA GPUs via /sys/bus/pci/devices (vendor ID 0x10de). This provides device identification but not detailed memory or topology data. For full GPU detail, set runtimeClassName: nvidia.

With NVIDIA runtime

The anchor includes topology: "DGX-H100" and gpu_count: 8, plus full topology and interconnect data from nvidia-smi.
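The PCI fallback can be approximated with a short sysfs scan. The Python sketch below reads the `vendor` and `class` files under each device directory and keeps 3D controllers (class prefix 0x0302) and processing accelerators (class prefix 0x12). It is an illustration, not the chart's implementation; the function name and returned field names are hypothetical, and a real witness would hash the bus ID before emitting it.

```python
import os

NVIDIA_VENDOR_ID = "0x10de"
# PCI class prefixes: 0x0302 = 3D controller, 0x12 = processing accelerator.
ACCELERATOR_CLASS_PREFIXES = ("0x0302", "0x12")

def scan_pci(root="/sys/bus/pci/devices"):
    """List accelerator-class PCI devices found under a sysfs-style tree."""
    found = []
    if not os.path.isdir(root):
        return found
    for entry in sorted(os.listdir(root)):
        device_dir = os.path.join(root, entry)
        try:
            with open(os.path.join(device_dir, "vendor")) as handle:
                vendor = handle.read().strip().lower()
            with open(os.path.join(device_dir, "class")) as handle:
                pci_class = handle.read().strip().lower()
        except OSError:
            continue  # Skip entries that lack readable vendor/class files.
        if pci_class.startswith(ACCELERATOR_CLASS_PREFIXES):
            found.append({
                "bus_id": entry,  # Raw value shown for clarity; hash before emitting.
                "vendor_id": vendor,
                "is_nvidia": vendor == NVIDIA_VENDOR_ID,
            })
    return found
```

Because `root` is a parameter, the scan can be pointed at a temporary directory that mimics sysfs, which is convenient for testing on machines without accelerators.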

6. On-Prem and Mixed Silicon

The DaemonSet deploys across all nodes regardless of silicon vendor. Each node runs all six discovery paths and reports whichever accelerators are present. In mixed clusters, NVIDIA, AMD, and Intel nodes each report vendor-specific detail. The silicon_vendor field reads "mixed" when multiple vendors are detected on the same node.

# Deploy across all nodes
helm install swt3 oci://ghcr.io/tenova-labs/charts/swt3-witness

# Alternative (run instead of the command above): tolerate all taints so tainted GPU nodes are covered
helm install swt3 oci://ghcr.io/tenova-labs/charts/swt3-witness \
  --set 'tolerations[0].operator=Exists'
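The "mixed" rule for silicon_vendor, and a per-vendor roll-up across a mixed fleet, can be sketched as follows. This is an illustration under assumptions: the "none" label for nodes without accelerators is hypothetical (this guide does not state what the field reads in that case), and both function names are invented for the example.

```python
from collections import Counter

def node_vendor_label(detected_vendors):
    """Collapse the vendors detected on one node into a single label."""
    unique = set(detected_vendors)
    if not unique:
        return "none"  # Hypothetical label for a node with no accelerators.
    if len(unique) == 1:
        return next(iter(unique))
    return "mixed"

def fleet_summary(anchors):
    """Sum accelerator_count per silicon_vendor across parsed anchors."""
    totals = Counter()
    for anchor in anchors:
        totals[anchor["silicon_vendor"]] += anchor["accelerator_count"]
    return dict(totals)
```

Feeding `fleet_summary` the latest anchor from each node gives a quick inventory of how many accelerators each vendor contributes to the cluster.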

7. Upgrading to Cloud Mode

Local mode runs indefinitely with zero external dependencies. When you want anchors to persist in the SWT3 clearing house for independent verification and auditor access, upgrade to cloud mode:

helm upgrade swt3 oci://ghcr.io/tenova-labs/charts/swt3-witness \
  --set config.mode=cloud \
  --set cloud.apiKey=axm_YOUR_KEY \
  --set cloud.tenantId=YOUR_TENANT

Cloud mode flushes anchors to the clearing house and continues emitting JSON to stdout, so your log pipeline keeps working. The clearing house adds independent verification, auditor portal access, and Merkle rollup integrity.

Configuration Reference

Value	Default	Description
`config.mode`	`local`	local (stdout) or cloud (clearing house)
`config.interval`	`3600`	Seconds between attestation cycles
`config.clearingLevel`	`1`	Data clearing level (0-3)
`config.agentId`	auto	Agent identity tag (auto-generates from hostname)
`cloud.apiKey`	empty	SWT3 API key (required for cloud mode)
`cloud.tenantId`	empty	Tenant identifier (required for cloud mode)
`cloud.signingKey`	empty	HMAC-SHA256 signing key (optional)
`runtimeClassName`	empty	Set to `nvidia` for full GPU discovery
`sysMount.enabled`	`true`	Mount /sys read-only for PCI fallback
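Before running helm upgrade, it can help to sanity-check a values dictionary against the constraints in the table. The Python sketch below is a hypothetical pre-flight check, not part of the chart: it assumes values are loaded into nested dictionaries and only enforces what the table states (mode is local or cloud, cloud mode needs apiKey and tenantId, clearingLevel is 0 to 3, interval is a positive number of seconds).

```python
def validate_values(values):
    """Return a list of problems found in a Helm values dictionary."""
    errors = []
    config = values.get("config", {})
    cloud = values.get("cloud", {})

    mode = config.get("mode", "local")
    if mode not in ("local", "cloud"):
        errors.append("config.mode must be 'local' or 'cloud'")
    if mode == "cloud":
        # The table marks both of these as required for cloud mode.
        for key in ("apiKey", "tenantId"):
            if not cloud.get(key):
                errors.append("cloud.%s is required in cloud mode" % key)

    level = config.get("clearingLevel", 1)
    if not 0 <= level <= 3:
        errors.append("config.clearingLevel must be between 0 and 3")

    interval = config.get("interval", 3600)
    if interval <= 0:
        errors.append("config.interval must be a positive number of seconds")
    return errors
```

An empty list means the values satisfy the documented constraints; anything else is a human-readable list of what to fix.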

8. Log Aggregation Patterns

Every attestation anchor includes "swt3_witness": true. Use this field to filter SWT3 evidence from other container output in your log pipeline.

Fluentd

<filter kubernetes.**swt3-witness**>
  @type grep
  <regexp>
    key log
    pattern /swt3_witness/
  </regexp>
</filter>

Vector

[transforms.swt3_filter]
type = "filter"
inputs = ["kubernetes_logs"]
condition = '.swt3_witness == true'

Datadog

Use a log processing pipeline with a JSON parser and filter on @swt3_witness:true.

kubectl (quick check)

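# Non-JSON lines are skipped; with -l, kubectl shows only the last 10 lines per pod unless --tail is set
kubectl logs -l app.kubernetes.io/name=swt3-witness --all-containers --tail=-1 \
  | jq -R 'fromjson? | select(.swt3_witness? == true)'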
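When jq is not available, or when the filter needs to live inside a larger script, the same selection can be done in Python. The sketch below is a minimal example; the function name `iter_anchors` is hypothetical. It tolerates non-JSON lines and JSON values that are not objects, which mixed container output often contains.

```python
import json

def iter_anchors(lines):
    """Yield parsed SWT3 anchors from an iterable of raw log lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # Not a JSON object; skip plain-text log output.
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # Looked like JSON but was malformed or truncated.
        if isinstance(record, dict) and record.get("swt3_witness") is True:
            yield record
```

To use it on live output, pipe kubectl logs into a script that calls `iter_anchors(sys.stdin)` and prints or forwards each record.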

9. Regulatory Mapping

Hardware attestation evidence produced by this chart maps to specific regulatory requirements:

EU AI Act Article 12

Record-keeping. High-risk AI systems must technically allow for the automatic recording of events (logs) over their lifetime. The DaemonSet contributes timestamped, cryptographically fingerprinted records of the accelerator hardware present at each attestation cycle.
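Since the evidence here is a series of timestamped records, one practical check is whether any node skipped attestation cycles. The Python sketch below groups anchors by agent_id and reports consecutive records spaced further apart than the configured interval allows. The function names and the 1.5x slack factor are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime

def parse_ts(value):
    """Parse an ISO 8601 timestamp such as 2026-06-24T12:00:00.000Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))

def find_gaps(anchors, interval_s=3600, slack=1.5):
    """Return (agent_id, earlier, later, seconds) for over-long gaps between anchors."""
    by_agent = defaultdict(list)
    for anchor in anchors:
        by_agent[anchor["agent_id"]].append(parse_ts(anchor["timestamp"]))
    gaps = []
    for agent, stamps in by_agent.items():
        stamps.sort()
        # Compare each timestamp with the next one for the same agent.
        for earlier, later in zip(stamps, stamps[1:]):
            delta = (later - earlier).total_seconds()
            if delta > interval_s * slack:
                gaps.append((agent, earlier.isoformat(), later.isoformat(), delta))
    return gaps
```

With the default one-hour interval, records at 12:00, 13:00, and 16:00 from the same agent produce one gap entry covering the three-hour silence.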

NIST AI RMF MG-3.2

Resource provisioning documentation. Infrastructure inventory for AI systems must be maintained. AI-HW.1 anchors provide continuous, verifiable hardware inventory without manual documentation.

EO 14110 Section 4

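AI infrastructure transparency. The order called for federal AI deployments to maintain records of compute resources. It was rescinded in January 2025, so confirm which current guidance applies to your deployment. The clearing house provides independent verification of hardware attestation evidence.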

For cross-silicon compliance across competing accelerator architectures, see the companion guide: How to Maintain Compliance When AI Workloads Span NVIDIA, Google TPU, and AMD Silicon.

Security Model

The DaemonSet runs with minimal privileges:

runAsNonRoot: true -- no root access
readOnlyRootFilesystem: true -- no writes to container filesystem
allowPrivilegeEscalation: false -- no privilege elevation
capabilities: drop ALL -- zero Linux capabilities
/sys mounted read-only -- read-only sysfs for PCI discovery

The DaemonSet is less privileged than the NVIDIA device plugin already running on GPU-enabled nodes. All hardware identifiers are SHA-256 hashed at discovery time. Raw GPU UUIDs, bus IDs, and hostnames never leave the pod.

In local mode, no network calls are made. In cloud mode, outbound HTTPS to sovereign.tenova.io:443 is required.
