Document Type: Security Threat Model
Audience: Security Teams, Auditors, C3PAOs, NBs
Threat Vectors: 9 Analyzed
1. Protocol Overview
SWT3 is a cryptographic witnessing protocol that mints tamper-evident anchors for AI system events. The protocol has three layers that can be attacked independently:
- Client layer: SDKs running in the deployer's environment (Python, TypeScript, Rust, C#, Ruby)
- Transport layer: HTTPS ingestion endpoints receiving witness payloads
- Ledger layer: PostgreSQL database storing anchors, Merkle rollups, and verdict rules
The locked fingerprint formula is: SHA-256(WITNESS:{tenant}:{procedure}:{fa}:{fb}:{fc}:{ts_ms}) truncated to 12 hex characters.
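As an illustration, the formula can be sketched in a few lines of Python. The UTF-8 encoding, the lowercase hex digest, rendering factors with str(), keeping the leading 12 characters as the truncation, and the example field values are all assumptions made for this sketch; the spec remains the authority on exact serialization.

```python
import hashlib


def swt3_fingerprint(tenant, procedure, fa, fb, fc, ts_ms, length=12):
    """Derive a truncated fingerprint from the witness preimage.

    Assumptions (not confirmed by this document): UTF-8 encoding,
    factors rendered with str(), and truncation that keeps the
    leading hex characters of the SHA-256 digest.
    """
    preimage = f"WITNESS:{tenant}:{procedure}:{fa}:{fb}:{fc}:{ts_ms}"
    digest = hashlib.sha256(preimage.encode("utf-8")).hexdigest()
    return digest[:length]


# Hypothetical example values.
fp = swt3_fingerprint("tenant-a", "AI-GRD.1", 1, 1, 0, 1_700_000_000_000)
print(fp)
```

Because every input is public and the function is deterministic, any verifier holding the same factors can re-derive the fingerprint without access to the ledger.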
2. Trust Boundaries
| Boundary | Trust Assumption | Violated By |
| --- | --- | --- |
| SDK to Server | TLS 1.3 encrypts transit; HMAC-SHA256 signature authenticates origin | Signing key compromise, TLS downgrade |
| Server to Ledger | Service-role credentials, Row Level Security per tenant | Service-role credential leak, RLS bypass |
| Ledger to Verifier | Public fingerprint formula allows independent re-derivation | Formula change, factor tampering before hashing |
| SDK Internal | Factors are captured honestly at the point of observation | Compromised SDK, malicious deployer, adversarial middleware |
3. Threat Analysis
T-1: Fingerprint Truncation Collision
Attack: The fingerprint is truncated to 12 hex characters (48 bits). An attacker could attempt to find a second set of inputs that produces the same truncated hash, allowing them to substitute a FAIL anchor with a PASS anchor sharing the same fingerprint.
| Severity | MEDIUM |
| Pre-image resistance | Truncation does not weaken SHA-256 itself, but it bounds the cost of a targeted match: finding any input that produces a specific 48-bit output requires approximately 2^48 (281 trillion) hash operations, rather than the 2^256 needed against the full digest. |
| Birthday bound | The probability that some pair of fingerprints collides is about 39% at 2^24 (16.7 million) anchors per tenant and reaches 50% at approximately 1.18 × 2^24 (19.8 million). At current volumes (thousands per day), this threshold is years away for any single tenant. |
| Mitigation Status | MITIGATED |
Mitigations:
- The full anchor token includes the epoch timestamp, procedure ID, and verdict, all of which must match. An attacker cannot just match the fingerprint; they must also match the structured anchor format.
- The payload_signature (HMAC-SHA256, full 256-bit) independently authenticates the payload. Even if fingerprints collide, signatures will not.
- The Merkle rollup captures the full fingerprint in the leaf hash. Post-rollup tampering breaks the tree root.
- If volume approaches 2^24 per tenant, fingerprint length can be extended (spec allows up to 64 hex characters). This is a non-breaking change since verifiers accept variable-length fingerprints.
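The birthday figures for a 48-bit fingerprint can be checked numerically. The sketch below uses the standard approximation p ≈ 1 − exp(−n(n−1)/2N) for n uniformly distributed fingerprints in a space of N = 2^48 values; the function names are illustrative only.

```python
import math

SPACE = 2 ** 48  # 12 hex characters = 48 bits


def collision_probability(n, space=SPACE):
    """Approximate P(at least one collision) among n uniform fingerprints."""
    return -math.expm1(-n * (n - 1) / (2 * space))


def anchors_for_probability(p, space=SPACE):
    """Approximate anchor count at which the collision probability reaches p."""
    return math.sqrt(2 * space * math.log(1 / (1 - p)))


print(f"P(collision) at 2^24 anchors: {collision_probability(2 ** 24):.3f}")
print(f"Anchors for 50% collision:    {anchors_for_probability(0.5):,.0f}")
```

Under this approximation the probability at 2^24 anchors is roughly 0.39, and the 50% point falls near 19.8 million anchors. Extending the fingerprint by one hex character multiplies the space by 16 and the 50% point by 4.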
T-2: Replay Attack (Valid Signature Replay)
Attack: An attacker intercepts a valid, signed witness payload and re-submits it to the ingestion endpoint. The signature is valid, so the server accepts the duplicate anchor.
| Severity | MEDIUM |
| Mitigation Status | MITIGATED |
Mitigations:
- Every fingerprint includes timestamp_ms (millisecond precision). Replayed payloads produce duplicate fingerprints, which are detectable by uniqueness constraints on the ledger.
- The ingestion endpoint enforces a timestamp window. Payloads with timestamps older than 5 minutes are rejected as stale.
- TLS 1.3 prevents network-level interception. An attacker would need to compromise the SDK process or the signing key to capture valid payloads.
- The daily Merkle rollup creates a tamper-evident boundary. Replayed anchors after rollup would alter the next day's tree root.
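The first two mitigations can be sketched together as an ingestion-side check. This is a minimal illustration, not the server's implementation: the ReplayGuard class is hypothetical, the in-memory set stands in for a ledger uniqueness constraint, and scoping uniqueness per tenant is an assumption.

```python
import time

STALE_WINDOW_MS = 5 * 60 * 1000  # the 5-minute window described above


class ReplayGuard:
    """Reject stale payloads and duplicate fingerprints.

    The set is an in-memory stand-in for a database uniqueness
    constraint on (tenant, fingerprint); both are assumptions.
    """

    def __init__(self):
        self._seen = set()

    def accept(self, tenant, fingerprint, ts_ms, now_ms=None):
        if now_ms is None:
            now_ms = int(time.time() * 1000)
        if now_ms - ts_ms > STALE_WINDOW_MS:
            return False  # older than the window: rejected as stale
        key = (tenant, fingerprint)
        if key in self._seen:
            return False  # same fingerprint already recorded: replay
        self._seen.add(key)
        return True


guard = ReplayGuard()
print(guard.accept("tenant-a", "0123456789ab", ts_ms=1_000, now_ms=2_000))  # first submission
print(guard.accept("tenant-a", "0123456789ab", ts_ms=1_000, now_ms=2_000))  # replayed copy
```

The two checks cover different windows: the uniqueness check catches a replay at any time, while the staleness check bounds how long a captured payload remains submittable at all.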
T-3: Clock Skew and Time Manipulation
Attack: An attacker manipulates the system clock on the machine running the SDK to backdate or future-date anchors, creating false evidence of compliance at a specific point in time.
| Severity | HIGH |
| Mitigation Status | PARTIAL |
Mitigations:
- The server records its own received_at timestamp on ingestion. Auditors can compare fingerprint_timestamp_ms (client-claimed) against received_at (server-observed). Significant divergence flags manipulation.
- The 5-minute timestamp window on ingestion rejects payloads with grossly incorrect timestamps.
- The AI-HW.1 hardware attestation can include NTP synchronization status in future versions.
Residual risk: Within the 5-minute window, an attacker with local clock control can adjust timestamps by up to 5 minutes. For most compliance use cases, this precision is sufficient. For high-frequency trading or sub-second audit requirements, deployers should implement NTP monitoring independently.
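An auditor's comparison of client-claimed and server-observed time might look like the following sketch. The record layout, the received_at_ms field (received_at converted to epoch milliseconds), and the 30-second threshold are hypothetical choices for illustration; the document does not define what counts as significant divergence.

```python
def clock_divergence_ms(fingerprint_timestamp_ms, received_at_ms):
    """Server-observed time minus client-claimed time.

    Positive values mean the client timestamp lags the server clock
    (possible backdating); negative values mean it runs ahead.
    """
    return received_at_ms - fingerprint_timestamp_ms


def flag_suspect_anchors(anchors, threshold_ms=30_000):
    """Return anchors whose divergence exceeds the (hypothetical) threshold."""
    return [
        a for a in anchors
        if abs(clock_divergence_ms(a["fingerprint_timestamp_ms"],
                                   a["received_at_ms"])) > threshold_ms
    ]


# Hypothetical ledger rows.
rows = [
    {"id": 1, "fingerprint_timestamp_ms": 1_000_000, "received_at_ms": 1_000_250},
    {"id": 2, "fingerprint_timestamp_ms": 1_000_000, "received_at_ms": 1_240_000},
]
print([a["id"] for a in flag_suspect_anchors(rows)])
```

Ordinary network latency and SDK buffering also produce positive divergence, so a threshold tuned too tightly will flag legitimate anchors.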
T-4: Signing Key Compromise
Attack: An attacker obtains a tenant's HMAC-SHA256 signing key. They can now mint arbitrary anchors with valid signatures, fabricating compliance evidence for any procedure.
| Severity | CRITICAL |
| Mitigation Status | PARTIAL |
Mitigations:
- Signing keys are stored encrypted at rest using AES-256-GCM with an envelope key on the server side.
- Row Level Security ensures keys are only accessible to the owning tenant.
- The protocol supports key_id for key rotation. Compromised keys can be revoked and replaced without invalidating historical anchors signed with the old key.
- AI-REV.1 (Anchor Revocation) allows bulk revocation of anchors minted during the compromised period.
- The Merkle rollup provides a tamper-evident cutoff. Anchors minted before the rollup are protected; anchors after compromise can be identified by the revocation window.
Residual risk: Between key compromise and detection, the attacker can mint valid anchors. Detection depends on the deployer's key management practices and anomaly detection. The protocol spec recommends signing key rotation every 90 days (configurable). In a future version, OIDC ephemeral signing is intended to eliminate long-lived keys entirely.
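How key_id interacts with signature checking at ingestion can be sketched with Python's standard hmac module. The key store, the revocation set, and the function names are hypothetical; the sketch only shows that a signature is verified against the key named by key_id and that a revoked key stops authenticating new submissions.

```python
import hashlib
import hmac


def sign_payload(key, payload):
    """HMAC-SHA256 over the raw payload bytes, hex-encoded."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_payload(keys, revoked, key_id, payload, signature):
    """Verify a new submission against the key named by key_id.

    keys maps key_id -> secret bytes and revoked is a set of key_ids;
    both are hypothetical stand-ins for the server's key store.
    """
    if key_id in revoked or key_id not in keys:
        return False
    expected = sign_payload(keys[key_id], payload)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature)


keys = {"k1": b"old-secret", "k2": b"new-secret"}
payload = b'{"procedure":"AI-GRD.1"}'
sig = sign_payload(keys["k1"], payload)
print(verify_payload(keys, set(), "k1", payload, sig))   # key active
print(verify_payload(keys, {"k1"}, "k1", payload, sig))  # key revoked
```

Rejecting a revoked key at ingestion is separate from how historical anchors are treated; those remain verifiable against the old key unless AI-REV.1 revokes them.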
T-5: Factor Manipulation Before Witnessing
Attack: A compromised SDK or malicious middleware intercepts factor values before they reach the fingerprint computation. For example, changing factor_b from 0 (guardrail missing) to 1 (guardrail active) before AI-GRD.1 is minted. The resulting anchor has a valid fingerprint and signature, but the factors are false.
| Severity | CRITICAL |
| Mitigation Status | PARTIAL |
Mitigations:
- The SDK is open-source (Apache 2.0). Deployers and auditors can inspect the factor computation logic and verify it has not been tampered with.
- The server independently re-derives the fingerprint from the submitted factors. If factors are altered after fingerprinting but before submission, the server detects the mismatch and rejects the payload.
- Supply chain integrity: SDK packages are published to 5 registries (npm, PyPI, crates.io, NuGet, RubyGems) with checksums. Deployers can pin versions and verify hashes.
- AI-MDL.5 (Weight File Integrity) can be applied to the SDK binary itself, creating a recursive attestation chain.
Residual risk: If the attacker controls the runtime environment where the SDK executes, they can manipulate factors before hashing. This is the fundamental "garbage in, garbage out" limitation of any cryptographic witnessing system. SWT3 guarantees that once factors are hashed, the record is immutable. It cannot guarantee that the factors were honest at the point of capture. This is why AI-HW.1 (hardware attestation) and AI-TRUST.1 (mutual trust verification) exist: they extend the trust boundary closer to the physical layer.
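The supply chain mitigation (pinning versions and verifying hashes) reduces to comparing a downloaded artifact's digest against a pinned value. The sketch below is a generic illustration using SHA-256; the function names and the idea of checking a single local file are assumptions, and in practice the package manager's own hash-pinning mechanism would normally do this.

```python
import hashlib
import hmac


def file_sha256(path, chunk_size=65536):
    """Stream a file through SHA-256 and return the hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_pinned_digest(path, pinned_hex):
    """True only if the artifact's digest equals the pinned digest."""
    return hmac.compare_digest(file_sha256(path), pinned_hex.lower())
```

A check like this detects a tampered package before installation. It does not help once an attacker controls the runtime, which is the residual risk described above.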
T-6: Insider with Ledger Access
Attack: A database administrator or service-role credential holder directly modifies, deletes, or inserts rows in the witness ledger, bypassing all application-level controls.
| Severity | CRITICAL |
| Mitigation Status | MITIGATED |
Mitigations:
- Daily Merkle tree rollup computes a root hash over all anchors for the day. Any modification, deletion, or insertion of rows after rollup breaks the tree root. Auditors can verify the Merkle root independently.
- Merkle leaf hashes use domain separation (SWT3:LEAF: prefix) to prevent cross-domain collision attacks.
- Inclusion proofs allow verifying a specific anchor's membership in the daily tree without accessing the full ledger.
- Database audit logging tracks all administrative operations. Combined with the Merkle boundary, unauthorized changes are detectable within 24 hours.
- The public verify endpoint re-derives fingerprints from factors. An insider who modifies factors will produce a fingerprint mismatch when the anchor is independently verified.
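The rollup and inclusion-proof mechanics can be illustrated with a small Merkle tree. Only the SWT3:LEAF: prefix comes from this document; the SWT3:NODE: interior prefix, the rule of duplicating the last node on odd levels, and hashing the fingerprint string as the leaf content are assumptions made so the sketch is complete.

```python
import hashlib

LEAF_PREFIX = b"SWT3:LEAF:"  # from the document
NODE_PREFIX = b"SWT3:NODE:"  # hypothetical interior-node prefix


def leaf_hash(fingerprint):
    return hashlib.sha256(LEAF_PREFIX + fingerprint.encode("utf-8")).digest()


def node_hash(left, right):
    return hashlib.sha256(NODE_PREFIX + left + right).digest()


def merkle_root_and_proof(fingerprints, index):
    """Return (root, proof) for the leaf at index.

    proof is a list of (sibling_hash, sibling_is_left) pairs, ordered
    from the leaf level up to the root.
    """
    if not fingerprints:
        raise ValueError("cannot build a tree with no leaves")
    level = [leaf_hash(fp) for fp in fingerprints]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # assumption: duplicate the last node
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = [node_hash(level[i], level[i + 1])
                 for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify_inclusion(fingerprint, proof, root):
    """Recompute the path from a leaf to the root using only the proof."""
    acc = leaf_hash(fingerprint)
    for sibling, sibling_is_left in proof:
        acc = node_hash(sibling, acc) if sibling_is_left else node_hash(acc, sibling)
    return acc == root


day = ["a1b2c3d4e5f6", "0f0e0d0c0b0a", "123456789abc", "deadbeef0001", "cafebabe0002"]
root, proof = merkle_root_and_proof(day, 3)
print(verify_inclusion("deadbeef0001", proof, root))
```

The proof contains one sibling hash per tree level, so an auditor can confirm membership with a logarithmic number of hashes and without reading the other anchors. Changing, removing, or adding any leaf after the rollup yields a different root.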
T-7: Tenant Impersonation
Attack: An attacker submits witness payloads using another tenant's ID, polluting their compliance ledger with false records.
| Severity | HIGH |
| Mitigation Status | MITIGATED |
Mitigations:
- API key authentication on all ingestion endpoints. The API key is bound to a specific tenant; the server overrides any tenant_id in the payload with the key's tenant.
- HMAC-SHA256 payload signatures are tenant-specific. Cross-tenant signature verification fails automatically.
- Row Level Security on the ledger ensures queries only return rows matching the authenticated tenant.
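The tenant override in the first mitigation can be sketched as a small ingestion step. The key-to-tenant mapping and the function name are hypothetical; the point is that the tenant recorded on the anchor is derived from the authenticated key and never from the payload.

```python
def bind_tenant(api_key, payload, key_tenants):
    """Return a copy of payload whose tenant_id comes from the API key.

    key_tenants maps API key -> tenant and is a hypothetical stand-in
    for the server's key registry.
    """
    tenant = key_tenants.get(api_key)
    if tenant is None:
        raise PermissionError("unknown API key")
    bound = dict(payload)
    bound["tenant_id"] = tenant  # any client-supplied value is discarded
    return bound


registry = {"key-123": "tenant-a"}
forged = {"tenant_id": "tenant-b", "procedure": "AI-GRD.1"}
print(bind_tenant("key-123", forged, registry)["tenant_id"])
```

Overriding rather than rejecting means a forged tenant_id can only ever write into the attacker's own ledger.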
T-8: Denial of Service on Ingestion
Attack: An attacker floods the ingestion endpoint with high volumes of witness payloads, preventing legitimate anchors from being recorded and creating gaps in the compliance timeline.
| Severity | MEDIUM |
| Mitigation Status | MITIGATED |
Mitigations:
- Rate limiting per API key (configurable per tier).
- Batch endpoint accepts up to 500 anchors per request, reducing connection overhead.
- SDK buffer with on_flush callback stores anchors locally until delivery is confirmed. If the server is unavailable, anchors are retained client-side and retried.
- Reverse proxy provides TLS termination, HTTP/2, and connection limiting before traffic reaches the application.
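The client-side buffering behaviour can be sketched as follows. AnchorBuffer and the send callable are hypothetical, and the semantics given to on_flush here (invoked once per batch after delivery is confirmed) are an assumption; only the callback name and the 500-anchor batch limit come from this document.

```python
from collections import deque


class AnchorBuffer:
    """Hold anchors locally until the server confirms delivery."""

    def __init__(self, send, on_flush=None, batch_size=500):
        self._send = send          # callable(batch) -> True when confirmed
        self._on_flush = on_flush  # assumed: called after each confirmed batch
        self._batch_size = batch_size
        self._pending = deque()

    def add(self, anchor):
        self._pending.append(anchor)

    def __len__(self):
        return len(self._pending)

    def flush(self):
        """Try to deliver pending anchors; return how many were confirmed."""
        delivered = 0
        while self._pending:
            count = min(self._batch_size, len(self._pending))
            batch = [self._pending[i] for i in range(count)]
            try:
                confirmed = self._send(batch)
            except OSError:  # includes ConnectionError and timeouts
                confirmed = False
            if not confirmed:
                break  # keep everything that is still pending for a later retry
            for _ in range(count):
                self._pending.popleft()
            delivered += count
            if self._on_flush is not None:
                self._on_flush(batch)
        return delivered
```

Anchors leave the buffer only after confirmation, so an outage delays recording rather than dropping it. A production buffer would also need a size cap and persistence across process restarts, which this sketch omits.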
T-9: Clearing Level Downgrade
Attack: An attacker or misconfigured SDK submits anchors at Clearing Level 0 (full context) when the tenant's policy requires Level 2 or 3, leaking sensitive AI context (prompts, model names, provider details) into the ledger.
| Severity | HIGH |
| Mitigation Status | PARTIAL |
Mitigations:
- The server can enforce a minimum clearing level per tenant (configurable). Payloads below the minimum are rejected or upgraded server-side.
- The SDK defaults to the clearing level specified in the configuration. Changing clearing level requires modifying the SDK configuration, not a runtime parameter.
- CJT fields (jurisdiction, legal_basis, purpose_class) survive all clearing levels, so compliance metadata is never lost regardless of clearing level.
Residual risk: If the server-side minimum clearing level enforcement is not configured, a misconfigured SDK can leak context. Deployers operating in classified or sensitive environments must verify their tenant's minimum clearing level setting.
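Server-side enforcement of a minimum clearing level could take roughly the following shape. This is a sketch under several assumptions: the clearing_level field name, the context field names (prompt, model_name, provider), and the rule that an upgrade strips all of them are hypothetical, since this document does not specify what each level removes. The CJT field names are taken from the text above.

```python
# Hypothetical names for the sensitive context carried at low clearing levels.
CONTEXT_FIELDS = ("prompt", "model_name", "provider")


def enforce_clearing_level(payload, tenant_minimum, upgrade=False):
    """Reject or upgrade a payload that is below the tenant's minimum level.

    A missing clearing_level is treated as 0 (full context) so that an
    unlabelled payload is never assumed to be safe.
    """
    level = payload.get("clearing_level", 0)
    if level >= tenant_minimum:
        return payload
    if not upgrade:
        raise ValueError(
            f"clearing level {level} is below tenant minimum {tenant_minimum}"
        )
    # Assumption: upgrading drops every context field in one step.
    upgraded = {k: v for k, v in payload.items() if k not in CONTEXT_FIELDS}
    upgraded["clearing_level"] = tenant_minimum
    return upgraded


leaky = {
    "clearing_level": 0,
    "prompt": "example prompt text",
    "model_name": "example-model",
    "jurisdiction": "EU",
    "legal_basis": "consent",
    "purpose_class": "support",
}
print(sorted(enforce_clearing_level(leaky, tenant_minimum=2, upgrade=True)))
```

Rejecting is the more conservative policy because the sensitive context never reaches the ledger path; upgrading keeps the compliance timeline continuous at the cost of the server briefly handling the context.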
4. Summary Matrix
| ID | Threat | Severity | Status | Primary Mitigation |
| --- | --- | --- | --- | --- |
| T-1 | Fingerprint truncation collision | MEDIUM | MITIGATED | Payload signature (full 256-bit), Merkle rollup, extensible length |
| T-2 | Replay attack | MEDIUM | MITIGATED | Timestamp uniqueness, 5-min window, TLS 1.3 |
| T-3 | Clock skew / time manipulation | HIGH | PARTIAL | Server-side received_at comparison, timestamp window |
| T-4 | Signing key compromise | CRITICAL | PARTIAL | AES-256-GCM at rest, key rotation, AI-REV.1 bulk revocation |
| T-5 | Factor manipulation before witnessing | CRITICAL | PARTIAL | Open-source SDK, server re-derivation, supply chain integrity |
| T-6 | Insider with ledger access | CRITICAL | MITIGATED | Daily Merkle rollup, domain-separated leaves, inclusion proofs |
| T-7 | Tenant impersonation | HIGH | MITIGATED | API key binding, tenant-scoped signatures, RLS |
| T-8 | Denial of service on ingestion | MEDIUM | MITIGATED | Rate limiting, SDK buffer, connection limiting |
| T-9 | Clearing level downgrade | HIGH | PARTIAL | Server-side minimum enforcement, SDK configuration default |
5. Residual Risk Statement
Three threat categories carry residual risk that cannot be fully mitigated by protocol design alone:
- T-4 (Signing key compromise): Mitigated by key rotation and revocation, but the window between compromise and detection remains a deployer responsibility. Future OIDC ephemeral signing will eliminate long-lived keys.
- T-5 (Factor manipulation): The fundamental "garbage in" problem. SWT3 guarantees immutability after hashing but cannot guarantee input honesty. AI-HW.1 and AI-TRUST.1 push the trust boundary closer to hardware but do not eliminate it entirely.
- T-3 (Clock manipulation): Server-side timestamps provide a secondary reference, but precision finer than the 5-minute ingestion window depends on client clock integrity. NTP monitoring is a deployer responsibility.
All three residual risks are documented as deployer responsibilities in the Deployer Responsibility Matrix.
6. Threat Model Maintenance
This threat model is reviewed and updated:
- With every major protocol version (spec version bump)
- When new procedures are added that introduce new trust boundaries
- When a security incident or vulnerability is reported
- At minimum every 90 days