Audience: You are an assessor or auditor evaluating an AI system's risk management practices against the NIST AI RMF (AI 100-1). This walkthrough maps AI RMF functions and categories to SWT3 procedures that generate cryptographic compliance evidence. Each category includes verification steps and common findings to accelerate your assessment.

Protocol note: SWT3 is an industry-agnostic cryptographic witness protocol. The evidence described in this guide is identical regardless of the AI system's application domain, deployment model, or organizational context. This walkthrough maps that universal evidence to NIST AI RMF functions and categories so assessors can verify risk management practices using independently verifiable cryptographic proof.

Contents

1. Overview
2. How to Use This Walkthrough
3. GOVERN Function
   GOVERN 1.1 -- Dual-Use and Oversight Policies
   GOVERN 1.2 -- Technical Environment and Provenance
   GOVERN 1.3 -- Supply Chain and Multi-Agent Governance
   GOVERN 1.4 -- Audit and Logging
   GOVERN 1.5 -- Trust, Authorization, and Guardrails
   GOVERN 1.7 -- Transparency and Documentation
   GOVERN 2.1 -- Roles and Responsibilities
   GOVERN 2.2 -- Governance Mechanisms
   GOVERN 4.1 -- Human Oversight Mechanisms
   GOVERN 6.1 -- Policy Enforcement
4. MAP Function
   MAP 1.1 -- System Identification and Inventory
   MAP 2.1 -- Risk Identification
   MAP 2.3 -- Fairness, Explainability, and Data Provenance
   MAP 3.5 -- Data Governance
   MAP 4.1 -- Data Lineage
   MAP 5.2 -- Impact Assessment
5. MEASURE Function
   MEASURE 2.5 -- Performance, Fairness, and Explainability Metrics
   MEASURE 2.6 -- Drift, Robustness, and Model Integrity
   MEASURE 3.1 -- Red Team and Supply Chain Testing
6. MANAGE Function
   MANAGE 1.3 -- Model Lifecycle
   MANAGE 2.2 -- Cybersecurity
   MANAGE 2.3 -- Safety and Security Controls
   MANAGE 2.4 -- Access Control and Revocation
   MANAGE 3.1 -- Incident Response
   MANAGE 3.2 -- Autonomous Operations and Incident Management
   MANAGE 4.1 -- Human Oversight, Post-Market, and Violations
7. Anchor Anatomy
8. Assessment Resources

1. Overview

The NIST AI Risk Management Framework (AI 100-1) organizes AI governance into four functions: GOVERN, MAP, MEASURE, and MANAGE. Each function contains categories and subcategories that describe organizational practices for trustworthy AI.

The SWT3 protocol generates cryptographic witness anchors for 80 AI-specific procedures. This walkthrough maps all 80 of those procedures to 26 NIST AI RMF categories across all four functions (10 GOVERN, 6 MAP, 3 MEASURE, and 7 MANAGE). For each category, you will find the specific procedures that produce evidence, what to verify, and what findings to watch for.

Coverage summary: 80 SWT3 procedures map to 26 AI RMF categories. GOVERN receives the deepest coverage (37 procedures across 10 categories), reflecting the framework's emphasis on organizational governance as the foundation for trustworthy AI.

2. How to Use This Walkthrough

Step 1: Identify the AI RMF categories in scope for your assessment.

Not all categories apply to every system. A narrow-purpose classifier may only touch MAP 1.1 and MEASURE 2.5, while a general-purpose model deployed across business units may require full coverage.

Step 2: For each category, locate the SWT3 procedures in this guide.

Each category section lists the procedures that generate relevant evidence. Check the organization's witness ledger for anchors matching those procedure IDs.

Step 3: Verify anchors using the public verification endpoint.

Navigate to /verify and enter any SWT3 anchor fingerprint to confirm it has not been tampered with. Batch verification is available for enclave-wide integrity checks.

Step 4: Document gaps where procedures exist but no anchors have been minted.

A procedure without a corresponding anchor indicates the organization has not yet generated evidence for that requirement. This is a finding, not necessarily a failure -- the practice may exist without being witnessed.
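
Steps 2 and 4 amount to a set comparison between the procedures this guide lists for each in-scope category and the procedure IDs that actually appear in the ledger. The following is a minimal Python sketch, assuming the ledger has already been exported as a plain list of procedure IDs. The export format and the `find_gaps` helper are illustrative and not part of SWT3; the mapping is an excerpt of the tables in this guide.

```python
# Excerpt of the category-to-procedure mapping from this guide.
CATEGORY_PROCEDURES = {
    "GOVERN 1.1": ["AI-DUALUSE.1", "AI-GOV.1", "AI-HITL.1"],
    "GOVERN 1.4": ["AI-AUDIT.1", "AI-AUDIT.2", "AI-LOG.1"],
    "MAP 2.1": ["AI-RISK.1"],
}


def find_gaps(in_scope, anchored_procedures):
    """Return {category: [procedure IDs with no anchor]} for in-scope categories."""
    anchored = set(anchored_procedures)
    gaps = {}
    for category in in_scope:
        missing = [p for p in CATEGORY_PROCEDURES[category] if p not in anchored]
        if missing:
            gaps[category] = missing
    return gaps


# Procedure IDs found in a made-up ledger export (assumption for illustration).
ledger_procedures = ["AI-GOV.1", "AI-HITL.1", "AI-AUDIT.1", "AI-LOG.1", "AI-RISK.1"]

gaps = find_gaps(["GOVERN 1.1", "GOVERN 1.4", "MAP 2.1"], ledger_procedures)
for category, missing in gaps.items():
    print(f"{category}: no anchors for {', '.join(missing)}")
```

Each entry in the result is a candidate finding to document, not an automatic failure: the practice may exist without having been witnessed.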

Assessor tip: Use the interactive assessment tool to filter procedures by the NIST AI RMF framework and track completion status during your evaluation.

3. GOVERN Function

The GOVERN function establishes and maintains the organizational policies, processes, and accountability structures needed for trustworthy AI. It is the largest function in the AI RMF and receives the broadest SWT3 procedure coverage because governance decisions ripple through every other function.

SWT3 maps 37 procedures across 10 GOVERN categories. Evidence in this function tends to be organizational rather than technical, so look for anchors that witness policy versions, approval gates, and role assignments rather than model metrics.

GOVERN 1.1 -- Dual-Use and Oversight Policies

What this requires: The organization must establish policies addressing the potential for AI systems to be repurposed for unintended uses, including dual-use scenarios. Governance structures must define oversight responsibilities and human-in-the-loop requirements for high-risk applications.

Procedure | Title | What It Witnesses
AI-DUALUSE.1 | Dual-Use Risk Classification | Records the classification decision for systems with potential dual-use applications, including risk tier and justification.
AI-GOV.1 | Governance Policy Version | Witnesses the active governance policy version, including approval authority and effective date.
AI-HITL.1 | Human-in-the-Loop Declaration | Records whether human oversight is required for the system and at what decision points it is enforced.

How to verify

1. Request the organization's AI governance policy and confirm a current AI-GOV.1 anchor exists with a matching policy version hash.

2. For systems with dual-use potential, verify that AI-DUALUSE.1 was minted before operational deployment, not retroactively.

3. Confirm the HITL declaration in AI-HITL.1 matches the actual operational workflow. Interview operators to validate.

Common finding: Organizations frequently have governance policies but no AI-DUALUSE.1 classification on record. If the system has potential for repurposing beyond its stated intent, the absence of a dual-use classification is a material gap. The assessor determines which repurposing scenarios are relevant based on the system's capabilities and deployment context.

GOVERN 1.2 -- Technical Environment and Provenance

What this requires: The organization must document the technical environment in which AI systems operate, including hardware, software dependencies, model provenance, and adapter configurations. This supports reproducibility and incident investigation.

Procedure | Title | What It Witnesses
AI-ENV.1 | Runtime Environment | Records the runtime environment configuration, including OS, framework versions, and dependency manifests.
AI-ENV.2 | Deployment Environment | Witnesses the deployment target (cloud region, container image, orchestrator) at the time of release.
AI-HW.1 | Hardware Attestation | Records the compute hardware (GPU model, memory, accelerator) used during inference or training.
AI-HW.3 | Hardware Runtime Profile | Witnesses runtime hardware utilization, thermal state, and error rates during model execution.
AI-MDL.5 | Model Weights Hash | Records the SHA-256 hash of model weight files, establishing a tamper-evident baseline.
AI-MDL.6 | Adapter Stack | Witnesses the LoRA, QLoRA, or fine-tuning adapter configuration active at inference time.

How to verify

1. Compare the AI-MDL.5 weight hash in the ledger against a fresh hash of the model files currently in production. Any mismatch indicates undocumented model changes.

2. Verify AI-ENV.1 and AI-ENV.2 anchors exist for each deployment environment. Organizations running in multiple regions should have per-region anchors.

3. If the system uses fine-tuned adapters, confirm AI-MDL.6 anchors record the adapter lineage (base model, training dataset reference, adapter version).
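
The weight comparison in step 1 can be reproduced with standard tooling. The following is a minimal Python sketch, assuming the anchored value is the hex SHA-256 digest of a single weight file. How SWT3 treats models split across several files is not described here, so sharded models would need a per-file comparison or the organization's documented combination rule. The helper names are illustrative.

```python
import hashlib


def sha256_file(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large weight files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def weights_match(path, anchored_hash):
    """Compare a fresh hash of the production file with the AI-MDL.5 value.

    The anchored hash is normalized (whitespace stripped, lowercased) because
    hex digests copied from reports often differ only in case.
    """
    return sha256_file(path) == anchored_hash.strip().lower()
```

A `False` result from `weights_match` for a file taken from production is the mismatch described in step 1 and should be treated as an undocumented model change until explained.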

Assessor tip: AI-MDL.5 (weight hashing) is the single most important provenance control. If the organization can demonstrate continuous weight integrity via a chain of AI-MDL.5 anchors, they have strong evidence for model tamper detection.

GOVERN 1.3 -- Supply Chain and Multi-Agent Governance

What this requires: AI supply chains must be documented, including third-party model providers, data sources, and downstream consumers. For multi-agent systems, the organization must define how agent-to-agent interactions are governed and what chain-of-custody rules apply.

Procedure | Title | What It Witnesses
AI-CHAIN.1 | Chain-of-Custody Start | Records the beginning of a witnessed inference chain, establishing a cycle_id for correlation.
AI-CHAIN.2 | Chain-of-Custody Link | Witnesses each subsequent step in a multi-model or multi-agent pipeline, linking to the parent cycle_id.
AI-GOV.6 | Supply Chain Attestation | Records the declared provenance of third-party models, datasets, and API dependencies.
AI-MULTI.1 | Multi-Agent Coordination | Witnesses the coordination protocol between agents, including role assignments and escalation rules.

How to verify

1. For multi-agent systems, request the full chain of AI-CHAIN.1 and AI-CHAIN.2 anchors for a sample transaction. Verify that cycle_ids form a complete, unbroken chain from start to finish.

2. Check that AI-GOV.6 supply chain attestations exist for every third-party model in the system's AI SBOM.

3. For AI-MULTI.1, confirm that the witnessed coordination protocol matches the actual runtime behavior by comparing with system logs.
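
The chain check in step 1 can be sketched as a reachability test over the sampled anchors. This Python example rests on assumptions about the export: each anchor is a dictionary with a `procedure`, its own `cycle_id`, and, for AI-CHAIN.2 links, a `parent_cycle_id`. These field names and this linking model are hypothetical; consult the organization's actual anchor schema before relying on them.

```python
def check_chain(anchors):
    """Return a list of problems; an empty list means the sampled chain is unbroken.

    `anchors` should contain only the AI-CHAIN.1 and AI-CHAIN.2 anchors
    for one sampled transaction.
    """
    starts = [a for a in anchors if a["procedure"] == "AI-CHAIN.1"]
    if len(starts) != 1:
        return [f"expected exactly one AI-CHAIN.1 start, found {len(starts)}"]

    problems = []
    known_ids = {a["cycle_id"] for a in anchors}
    children = {}
    for a in anchors:
        if a["procedure"] != "AI-CHAIN.2":
            continue
        parent = a.get("parent_cycle_id")
        if parent not in known_ids:
            problems.append(f"{a['cycle_id']}: parent {parent!r} not in ledger sample")
        children.setdefault(parent, []).append(a["cycle_id"])

    # Walk from the start and collect every anchor reachable through parent links.
    reachable, stack = set(), [starts[0]["cycle_id"]]
    while stack:
        current = stack.pop()
        if current in reachable:
            continue
        reachable.add(current)
        stack.extend(children.get(current, []))

    for orphan in sorted(known_ids - reachable):
        problems.append(f"{orphan}: not reachable from the chain start")
    return problems
```

An anchor reported as unreachable corresponds to an unwitnessed handoff somewhere upstream of it, which is the gap this category is meant to surface.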

Common finding: Organizations using orchestration frameworks (LangChain, CrewAI, AutoGen) often have no chain-of-custody witnessing. Each agent call is an unwitnessed handoff. If the system routes between multiple models or agents, the absence of AI-CHAIN.1/CHAIN.2 anchors is a significant gap.

GOVERN 1.4 -- Audit and Logging

What this requires: The organization must maintain audit trails for AI system decisions, including who accessed the system, what inputs were provided, and what outputs were generated. Logging must be tamper-evident and time-stamped.

Procedure | Title | What It Witnesses
AI-AUDIT.1 | Audit Trail Integrity | Witnesses the integrity of the AI audit log, including log completeness and tamper-detection status.
AI-AUDIT.2 | External Timestamp Attestation | Records an RFC 3161 timestamp from an external Timestamp Authority, proving the audit record existed at a specific time.
AI-LOG.1 | Inference Logging | Witnesses that inference inputs and outputs are being captured, including the log retention policy in effect.

How to verify

1. Request a sample of AI-AUDIT.1 anchors and verify they are generated on a regular cadence (daily or per-session).

2. For organizations claiming tamper-evident logging, check for AI-AUDIT.2 anchors with RFC 3161 timestamps. These provide independent third-party proof of log existence.

3. Verify AI-LOG.1 confirms active inference logging. Check the retention period factor -- many organizations log inferences but purge them too quickly for meaningful audit.
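
The cadence check in step 1 reduces to looking for gaps between consecutive anchor timestamps. The following is a minimal Python sketch, assuming the anchor timestamps have been exported as `datetime` values. The sample data and the six-hour tolerance on a daily cadence are illustrative assumptions, not SWT3 requirements.

```python
from datetime import datetime, timedelta


def cadence_gaps(timestamps, max_gap):
    """Return (earlier, later) pairs of consecutive anchors spaced more than max_gap apart."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_gap]


# Made-up AI-AUDIT.1 mint times: daily at 02:00, with March 4 and 5 missing.
minted = [datetime(2025, 3, day, 2, 0) for day in (1, 2, 3, 6, 7)]

for earlier, later in cadence_gaps(minted, timedelta(days=1, hours=6)):
    print(f"No AI-AUDIT.1 anchor between {earlier:%Y-%m-%d} and {later:%Y-%m-%d}")
```

The same helper applies to any procedure with an expected cadence; only `max_gap` changes.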

Assessor tip: AI-AUDIT.2 with an RFC 3161 timestamp is the strongest form of audit evidence because the timestamp comes from an external, independent authority. If the organization has these, prioritize verifying the TSA certificate chain.

GOVERN 1.5 -- Trust, Authorization, and Guardrails

What this requires: AI systems must implement authorization controls, trust boundaries, and operational guardrails. Access to model capabilities must be gated, and the system must prevent unauthorized or unsafe outputs through input/output filtering.

Procedure | Title | What It Witnesses
AI-AUTO.2 | Autonomous Generation Depth | Records the depth of autonomous generation (e.g., recursive code generation), including halt conditions.
AI-CONSENT.1 | Consent Verification | Witnesses that user consent was obtained before AI processing, including the consent mechanism and scope.
AI-GOV.4 | Policy Compliance Gate | Records a pre-inference policy check result, confirming the request was authorized against the active policy.
AI-GRD.1 | Input Guardrail | Witnesses input filtering decisions, including blocked prompts and the guardrail rule that triggered.
AI-GRD.2 | Output Guardrail | Records output filtering decisions, including content that was blocked or modified before delivery.
AI-TRUST.1 | Trust Verification | Witnesses the trust verification result for an agent or service requesting model access.
AI-TRUST.2 | Credential Presentation | Records the credential presented by an agent, including credential type, issuer, and expiration.

How to verify

1. Check that AI-GRD.1 and AI-GRD.2 anchors exist and are being generated at inference time, not just during initial deployment.

2. For systems requiring consent under applicable regulations, verify AI-CONSENT.1 anchors exist for each data subject interaction or confirm a batch consent mechanism is witnessed.

3. Verify AI-GOV.4 (policy compliance gate) fires before inference. The anchor timestamp must precede the corresponding inference anchor. If the gate fires after inference, it is not a gate -- it is a post-hoc check.

4. For agent-to-agent communication, verify that AI-TRUST.1 and AI-TRUST.2 form a matched pair for each trust negotiation.
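
The ordering rule in step 3 can be checked mechanically once each gate anchor is paired with the inference it authorized. This Python sketch assumes the pairing is done through a shared `request_id` and that each anchor carries a `minted` timestamp. Both field names are hypothetical, and the sample data is made up.

```python
from datetime import datetime


def audit_gates(gate_anchors, inference_anchors):
    """Classify each inference by whether its AI-GOV.4 anchor precedes it."""
    gate_times = {g["request_id"]: g["minted"] for g in gate_anchors}
    findings = {}
    for inference in inference_anchors:
        request_id = inference["request_id"]
        gate_time = gate_times.get(request_id)
        if gate_time is None:
            findings[request_id] = "no gate anchor"
        elif gate_time < inference["minted"]:
            findings[request_id] = "gate"
        else:
            # A check witnessed at or after the inference cannot have blocked it.
            findings[request_id] = "post-hoc check"
    return findings


gate_anchors = [
    {"request_id": "r1", "minted": datetime(2025, 3, 1, 12, 0, 0)},
    {"request_id": "r2", "minted": datetime(2025, 3, 1, 12, 5, 3)},
]
inference_anchors = [
    {"request_id": "r1", "minted": datetime(2025, 3, 1, 12, 0, 1)},
    {"request_id": "r2", "minted": datetime(2025, 3, 1, 12, 5, 2)},
    {"request_id": "r3", "minted": datetime(2025, 3, 1, 12, 6, 0)},
]

findings = audit_gates(gate_anchors, inference_anchors)
for request_id, verdict in findings.items():
    print(request_id, verdict)
```

Anything other than "gate" in the result deserves a follow-up question about where the policy check sits in the request path.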

Common finding: Guardrails (AI-GRD.1, AI-GRD.2) exist in the system configuration but are not witnessed at runtime. The guardrail may be active, but without anchors, there is no cryptographic proof it was enforced for any specific inference.

GOVERN 1.7 -- Transparency and Documentation

What this requires: Organizations must maintain comprehensive documentation of AI system capabilities, limitations, intended uses, and known risks. Transparency measures must include model cards, data sheets, and clear labeling of AI-generated content.

Procedure | Title | What It Witnesses
AI-CHR.1 | Agent Charter | Records the agent's declared purpose, constraints, and operational boundaries.
AI-DATA.2 | Training Data Attestation | Witnesses the training data provenance, including dataset identifiers and preprocessing steps.
AI-GRD.3 | Policy Version Binding | Records which specific policy version governs a model's guardrail configuration.
AI-LIC.1 | License Compliance | Witnesses the license terms under which a model or dataset is used, including open-weight governance obligations.
AI-MARK.1 | AI Content Marking | Records whether AI-generated content is labeled as such, including the marking mechanism.
AI-SKILL.1 | Skill Manifest | Witnesses the declared capabilities (skills) of an AI agent, providing a machine-readable capability inventory.
AI-TOOL.1 | Tool Usage | Records which external tools an agent invoked, including tool identity and the inputs provided.
AI-TRANS.1 | Transparency Report | Witnesses the publication or update of a transparency report, including scope and reporting period.
AI-WATERMARK.1 | Output Watermark | Records whether AI-generated outputs carry a digital watermark and the watermarking method used.

How to verify

1. Check that AI-CHR.1 (agent charter) exists for every agent in the system. The charter should predate the agent's first inference anchor.

2. For systems using open-weight models (LLaMA, Mistral, Falcon), verify that AI-LIC.1 records the license type and any use restrictions. Many open-weight licenses prohibit specific applications.

3. Confirm AI-MARK.1 anchors demonstrate active content marking. For systems that fall under the EU AI Act's transparency obligations, marking AI-generated content is a legal requirement under Article 50.

4. Review AI-TOOL.1 anchors for completeness. If the agent can invoke 12 tools but only 3 have witnessed usage, ask why the other 9 are not covered.

Framework intersection: GOVERN 1.7 transparency requirements overlap significantly with EU AI Act Article 11 (Technical Documentation) and Article 50 (Transparency Obligations for Providers and Deployers of Certain AI Systems). If the organization is also subject to the EU AI Act, AI-MARK.1 and AI-TRANS.1 evidence serves both frameworks.

GOVERN 2.1 -- Roles and Responsibilities

What this requires: The organization must define and document roles and responsibilities for AI risk management, including who is accountable for model performance, who approves deployment decisions, and who monitors ongoing operations.

Procedure | Title | What It Witnesses
AI-GOV.2 | Responsibility Assignment | Records the assignment of AI governance roles, including the responsible individual and their scope of authority.
AI-INF.3 | Inference Authorization | Witnesses the identity of the individual or service that authorized a specific inference or batch of inferences.

How to verify

1. Confirm AI-GOV.2 anchors name specific individuals or role-holders, not generic teams. Accountability requires named parties.

2. For high-risk systems, verify AI-INF.3 records show that authorized personnel (not just service accounts) approved inference operations.

GOVERN 2.2 -- Governance Mechanisms

What this requires: The organization must implement mechanisms (boards, committees, automated checks) that enforce governance decisions throughout the AI lifecycle.

Procedure | Title | What It Witnesses
AI-GOV.7 | Governance Board Decision | Records decisions made by an AI governance board or ethics committee, including the decision rationale and vote outcome.

How to verify

1. Check that AI-GOV.7 anchors exist for major governance decisions (model deployment approval, risk acceptance, decommissioning).

2. Verify the cadence of governance board meetings by examining the timestamp distribution of AI-GOV.7 anchors.

GOVERN 4.1 -- Human Oversight Mechanisms

What this requires: The organization must implement effective mechanisms for human oversight of AI systems, including the ability to intervene, override, or shut down AI operations when necessary.

Procedure | Title | What It Witnesses
AI-HITL.3 | Human Override Record | Records instances where a human overrode an AI decision, including the override reason and the original AI output.

How to verify

1. Request AI-HITL.3 anchors and verify that overrides are being recorded. A system with zero overrides over months of operation may indicate the override mechanism is unused or unavailable.

2. Cross-reference override frequency with the system's risk tier. High-risk systems with very few overrides warrant closer examination of the oversight process.

GOVERN 6.1 -- Policy Enforcement

What this requires: Governance policies must be actively enforced, not merely documented. The organization must demonstrate that policy violations are detected, escalated, and resolved.

Procedure | Title | What It Witnesses
AI-GOV.5 | Policy Enforcement Action | Records an enforcement action taken in response to a policy violation, including the violation type and remediation.

How to verify

1. Verify AI-GOV.5 anchors exist. The absence of any enforcement actions may indicate either excellent compliance or a failure to detect violations.

2. Cross-reference AI-GOV.5 records with AI-VIO.1 (violation detection) anchors from the MANAGE function to confirm violations flow through to enforcement.

4. MAP Function

The MAP function focuses on understanding the AI system's context, identifying risks, and characterizing the system's capabilities and limitations. It is the foundation for risk-informed decisions in MEASURE and MANAGE. SWT3 maps 16 procedures across 6 MAP categories.

Evidence in this function should demonstrate that the organization understands what the system does, who it affects, and what can go wrong. Look for anchors that document system identity, data provenance, and impact assessments.

MAP 1.1 -- System Identification and Inventory

What this requires: The organization must maintain an inventory of AI systems, including unique identifiers, intended purposes, deployment status, and technical specifications. Each system must be identifiable and trackable throughout its lifecycle.

Procedure | Title | What It Witnesses
AI-GOV.3 | System Registration | Records the registration of an AI system in the organization's inventory, including purpose and classification.
AI-ID.1 | Agent Identity | Witnesses the cryptographic identity of an AI agent, binding it to a persistent identifier (agent_id).
AI-MDL.4 | Model Card | Records the model card contents, including architecture, training data summary, and performance benchmarks.
AI-SBOM.1 | AI Software Bill of Materials | Witnesses the full dependency tree of the AI system, including model providers, libraries, and data sources.

How to verify

1. Confirm every AI system in scope has an AI-ID.1 anchor. The agent_id should be consistent across all subsequent anchors for that system.

2. Verify AI-SBOM.1 exists and is current (minted within the last 90 days or after any dependency change).

3. Check that AI-MDL.4 model card anchors cover all models in production, not just the primary model.

Common finding: Organizations maintain an internal spreadsheet of AI systems but have no AI-GOV.3 or AI-ID.1 anchors. Without cryptographic identity binding, there is no tamper-evident link between the system in the inventory and the system generating inferences.

MAP 2.1 -- Risk Identification

What this requires: The organization must identify and document risks associated with AI systems, including technical risks (model failure, data poisoning), societal risks (bias, discrimination), and operational risks (availability, misuse).

Procedure | Title | What It Witnesses
AI-RISK.1 | Risk Assessment | Records a risk assessment outcome, including identified risks, likelihood, impact, and mitigation status.

How to verify

1. Confirm AI-RISK.1 anchors are generated at deployment and on a recurring schedule (quarterly minimum for high-risk systems).

2. Review the risk categories captured. A risk assessment that only covers technical risks but ignores fairness, privacy, and societal impact is incomplete under the AI RMF.

MAP 2.3 -- Fairness, Explainability, and Data Provenance

What this requires: The organization must characterize the AI system's behavior with respect to fairness, explainability, and data quality. This includes understanding how the model makes decisions, whether it exhibits bias, and whether its data sources are reliable.

Procedure | Title | What It Witnesses
AI-EXPL.2 | Explainability Method | Records the explainability technique applied (SHAP, LIME, attention maps) and whether it is available to end users.
AI-FAIR.2 | Bias Detection Method | Witnesses the bias detection methodology used, including protected attributes tested and statistical tests applied.
AI-FAIR.3 | Bias Mitigation Action | Records specific actions taken to mitigate detected bias, including the technique and its measured effect.
AI-INF.1 | Inference Record | Witnesses an individual inference event, including model version, clearing level, and response metadata.
AI-MDL.2 | Model Version | Records the active model version at inference time, enabling traceability from output to specific model state.
AI-RAG.1 | RAG Provenance | Witnesses the retrieval-augmented generation context, including source documents retrieved and relevance scores.

How to verify

1. For systems making consequential decisions about individuals, verify AI-FAIR.2 anchors document the protected attributes tested. The absence of fairness testing evidence is a material gap. The assessor determines which use cases are consequential based on the system's context.

2. Check that AI-EXPL.2 records the explainability method. Systems that claim to be "explainable" should have anchored evidence of which technique is used and when.

3. For RAG systems, verify AI-RAG.1 anchors exist. These prove the system can trace its outputs to specific source documents, which is critical for factual accuracy claims.

Assessor tip: AI-INF.1 and AI-MDL.2 together form the minimum viable inference trail. If the organization has these two, you can trace any output back to a specific model version at a specific time. Start your evidence review here.

MAP 3.5 -- Data Governance

What this requires: The organization must implement data governance practices that address data quality, representativeness, and fitness for purpose. Data used to train, test, or operate AI systems must be documented and managed.

Procedure | Title | What It Witnesses
AI-DATA.1 | Data Quality Attestation | Witnesses the data quality assessment outcome, including completeness, accuracy, and representativeness metrics.
AI-DATA.4 | Synthetic Data Declaration | Records whether synthetic data was used in training or testing, including the generation method and proportion.

How to verify

1. Confirm AI-DATA.1 anchors exist for each training dataset and are refreshed when data is updated or augmented.

2. If the organization uses synthetic data, verify AI-DATA.4 documents the percentage of synthetic vs. real data and the generation methodology.

MAP 4.1 -- Data Lineage

What this requires: The organization must maintain data lineage records that trace data from source to model input, including transformations, filtering, and aggregation steps.

Procedure | Title | What It Witnesses
AI-DATA.3 | Data Lineage Record | Records the full lineage of a dataset, including source systems, transformation pipeline, and version history.

How to verify

1. Request AI-DATA.3 anchors and trace the lineage from raw source to model input. Every transformation step should be documented.

2. Verify the lineage record includes data retention and deletion policies, not just the processing pipeline.

MAP 5.2 -- Impact Assessment

What this requires: The organization must conduct impact assessments for AI systems, particularly those that affect individuals' rights, safety, or opportunities. Assessments should consider both intended and unintended consequences.

Procedure | Title | What It Witnesses
AI-DPIA.1 | Data Protection Impact Assessment | Records the completion of a DPIA, including the assessment scope, identified risks, and mitigation measures.
AI-IMPACT.1 | Societal Impact Assessment | Witnesses a broader impact assessment covering societal, environmental, and equity dimensions.

How to verify

1. Verify AI-DPIA.1 was completed before deployment, not retroactively. The anchor timestamp should predate the first AI-INF.1 inference anchor.

2. For high-impact systems, check that AI-IMPACT.1 covers environmental and equity dimensions, not just privacy and security.

Framework intersection: AI-DPIA.1 evidence satisfies both NIST AI RMF MAP 5.2 and GDPR Article 35 (DPIA). If the organization processes EU personal data, this procedure generates dual-framework evidence from a single witness event.

5. MEASURE Function

The MEASURE function addresses quantitative and qualitative assessment of AI system performance, reliability, and trustworthiness. Evidence in this function is inherently technical, covering metrics, drift detection, and adversarial testing. SWT3 maps 14 procedures across 3 MEASURE categories.

This is where the "show your work" principle applies most directly. Governance policies (GOVERN) and risk identification (MAP) set expectations; MEASURE proves the system meets them.

MEASURE 2.5 -- Performance, Fairness, and Explainability Metrics

What this requires: The organization must define and regularly evaluate metrics for AI system performance, fairness, and explainability. Metrics must be documented, tracked over time, and compared against acceptable thresholds.

Procedure | Title | What It Witnesses
AI-EXPL.1 | Explainability Score | Records the quantitative explainability score for a model or inference, including the scoring methodology.
AI-FAIR.1 | Fairness Metric | Witnesses the measured fairness metric (disparate impact ratio, equalized odds, demographic parity) and the threshold applied.
AI-INF.2 | Inference Quality | Records quality metrics for inferences (confidence score, latency, token count), enabling performance trend analysis.
AI-PERF.1 | Performance Benchmark | Witnesses the results of a formal performance benchmark, including the benchmark suite, dataset, and scores achieved.
AI-RAG.2 | RAG Relevance Score | Records the relevance scoring of retrieved documents in a RAG pipeline, measuring retrieval quality.
AI-SKILL.2 | Memory Context | Witnesses the memory/context window utilization of an agent, including what was retained and what was discarded.

How to verify

1. Request a time series of AI-FAIR.1 anchors and verify the organization has defined acceptable thresholds. Measuring fairness without a threshold is measurement without accountability.

2. Check AI-PERF.1 benchmark results against the model card (AI-MDL.4). Performance claims in the model card should be supported by anchored benchmark evidence.

3. For RAG systems, verify AI-RAG.2 relevance scores are within acceptable ranges. Consistently low relevance scores indicate retrieval quality issues that affect output accuracy.
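
To make step 1 concrete, the sketch below computes one of the fairness metrics named in the table, the disparate impact ratio (the favorable-outcome rate of a protected group divided by that of a reference group). It then scans a time series for periods that fall below the threshold recorded alongside each measurement. The counts, the 0.80 threshold, and the tuple layout of the series are illustrative assumptions; the organization's own AI-FAIR.1 anchors define the real metric and threshold.

```python
def disparate_impact_ratio(protected_favorable, protected_total,
                           reference_favorable, reference_total):
    """Ratio of favorable-outcome rates: protected group over reference group."""
    protected_rate = protected_favorable / protected_total
    reference_rate = reference_favorable / reference_total
    return protected_rate / reference_rate


def threshold_breaches(series):
    """Return the periods whose ratio fell below the threshold recorded with it."""
    return [period for period, ratio, threshold in series if ratio < threshold]


# 30 of 100 favorable outcomes in the protected group vs. 50 of 100 in the reference group.
ratio = disparate_impact_ratio(30, 100, 50, 100)
print(f"Disparate impact ratio: {ratio:.2f}")

# Made-up time series of (period, measured ratio, threshold applied).
series = [("2025-Q1", 0.91, 0.80), ("2025-Q2", 0.84, 0.80), ("2025-Q3", 0.76, 0.80)]
print("Periods below threshold:", threshold_breaches(series))
```

A series in which the threshold column is empty or changes without a corresponding governance decision is itself worth noting, since it suggests the threshold was never formally defined.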

Common finding: Organizations track performance metrics in internal dashboards but do not generate SWT3 anchors for them. This means the metrics exist but are not cryptographically witnessed, so their integrity cannot be verified by an external assessor.

MEASURE 2.6 -- Drift, Robustness, and Model Integrity

What this requires: The organization must monitor AI systems for drift (changes in model behavior over time), assess robustness against adversarial inputs, and verify model integrity against known baselines.

Procedure | Title | What It Witnesses
AI-BASE.1 | Baseline Establishment | Records the establishment of a performance baseline, including the baseline metrics and the conditions under which they were measured.
AI-DRIFT.1 | Drift Detection | Witnesses a drift detection event, including the drift magnitude, direction, and whether thresholds were exceeded.
AI-MDL.3 | Model Validation | Records model validation results against acceptance criteria, including test datasets and pass/fail determination.
AI-MDL.7 | Quantization Record | Witnesses model quantization parameters (INT8, FP16, GPTQ), documenting precision-performance tradeoffs.
AI-ROBUST.1 | Robustness Test | Records the results of adversarial robustness testing, including attack types simulated and the model's resilience.
AI-SKILL.3 | Reward Model | Witnesses the reward model configuration and alignment metrics for RLHF-trained systems.

How to verify

1. Verify that AI-BASE.1 was established before deployment and that AI-DRIFT.1 anchors reference the baseline. Drift is meaningless without a baseline to drift from.

2. Check AI-DRIFT.1 frequency. For production systems, drift detection should run at least weekly. For high-frequency systems, daily or continuous monitoring is expected.

3. If the model has been quantized (common for edge deployment), verify AI-MDL.7 documents the quantization method and any accuracy impact measured against the full-precision baseline.

4. For RLHF systems, verify AI-SKILL.3 documents the reward model version and alignment metrics. Reward model changes can silently alter system behavior.
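
The baseline-to-drift relationship in step 1 can be reviewed with a short script once the anchors are exported. This Python sketch assumes each drift anchor carries a `baseline_id` pointing at the AI-BASE.1 anchor, a `threshold_exceeded` flag, and an optional `remediation_id`. All of these field names are hypothetical stand-ins for whatever the organization's anchor export actually provides.

```python
from datetime import date


def review_drift_chain(baseline, drift_anchors):
    """Return findings for drift anchors that do not properly hang off the baseline."""
    findings = []
    for anchor in drift_anchors:
        label = anchor["anchor_id"]
        if anchor.get("baseline_id") != baseline["anchor_id"]:
            findings.append(f"{label}: does not reference baseline {baseline['anchor_id']}")
        elif anchor["minted"] <= baseline["minted"]:
            findings.append(f"{label}: minted before the baseline it references")
        elif anchor["threshold_exceeded"] and not anchor.get("remediation_id"):
            findings.append(f"{label}: threshold exceeded with no remediation on record")
    return findings


# Made-up sample data for illustration.
baseline = {"anchor_id": "base-1", "minted": date(2025, 1, 10)}
drift_anchors = [
    {"anchor_id": "drift-1", "baseline_id": "base-1", "minted": date(2025, 1, 17),
     "threshold_exceeded": False},
    {"anchor_id": "drift-2", "baseline_id": None, "minted": date(2025, 1, 24),
     "threshold_exceeded": False},
    {"anchor_id": "drift-3", "baseline_id": "base-1", "minted": date(2025, 1, 31),
     "threshold_exceeded": True},
]

for finding in review_drift_chain(baseline, drift_anchors):
    print(finding)
```

Here `drift-2` is reported for lacking a baseline reference and `drift-3` for an exceeded threshold with no recorded follow-up.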

Assessor tip: The AI-BASE.1 to AI-DRIFT.1 chain is the strongest evidence of continuous monitoring. Ask for the full chain: baseline establishment, regular drift checks, and any remediation actions taken when drift thresholds were exceeded.

MEASURE 3.1 -- Red Team and Supply Chain Testing

What this requires: The organization must conduct adversarial testing (red teaming) of AI systems and assess supply chain risks through structured testing programs.

Procedure | Title | What It Witnesses
AI-REDTEAM.1 | Red Team Exercise | Records the execution and findings of an AI red team exercise, including attack scenarios tested and vulnerabilities discovered.
AI-SUPPLY.1 | Supply Chain Assessment | Witnesses a supply chain risk assessment, including third-party dependencies evaluated and risk ratings assigned.

How to verify

1. Verify AI-REDTEAM.1 anchors exist and were generated before production deployment or at regular intervals (annually minimum).

2. Check that AI-SUPPLY.1 covers all critical dependencies identified in the AI-SBOM.1 (from MAP 1.1). A supply chain assessment that misses key dependencies is incomplete.

3. Cross-reference red team findings with AI-GRD.1/GRD.2 (GOVERN 1.5) to verify that discovered vulnerabilities led to guardrail updates.

6. MANAGE Function

The MANAGE function addresses the ongoing operation, maintenance, and incident response for AI systems. Evidence here demonstrates that the organization does not just deploy and forget -- it actively manages risks throughout the system's operational life. SWT3 maps 13 procedures across 7 MANAGE categories.

MANAGE 1.3 -- Model Lifecycle

What this requires: The organization must manage AI models across their full lifecycle, including development, testing, deployment, monitoring, updating, and decommissioning.

Procedure | Title | What It Witnesses
AI-MDL.1 | Model Registration | Records the formal registration of a model for production use, including approval authority and deployment constraints.

How to verify

1. Verify AI-MDL.1 exists for every model in production. Cross-reference with the system inventory (AI-GOV.3) to ensure no unregistered models are operating.

2. Check the approval chain. The anchor should reference who authorized the model for production use.

MANAGE 2.2 -- Cybersecurity

What this requires: The organization must address cybersecurity risks specific to AI systems, including adversarial attacks, model theft, data poisoning, and prompt injection.

Procedure | Title | What It Witnesses
AI-CYBER.1 | AI-Specific Threat Assessment | Records an AI-specific cybersecurity threat assessment, including threats evaluated and countermeasures deployed.

How to verify

1. Verify AI-CYBER.1 covers AI-specific threats (prompt injection, model inversion, training data extraction), not just general IT security threats.

2. Check that the threat assessment is current. AI threat landscapes evolve rapidly; an assessment older than 6 months should be refreshed.
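The currency check in step 2 can be sketched by comparing the anchor's epoch with the assessment date. This is an illustration under stated assumptions: "6 months" is approximated as 183 days, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "6 months" is approximated as 183 days for this sketch.
MAX_AGE = timedelta(days=183)


def is_stale(anchor_epoch: int, as_of: datetime, max_age: timedelta = MAX_AGE) -> bool:
    """Return True if the anchor was minted more than max_age before as_of."""
    minted = datetime.fromtimestamp(anchor_epoch, tz=timezone.utc)
    return as_of - minted > max_age


# Epoch taken from the example anchor in Anchor Anatomy (minted 1 June 2026 UTC).
epoch = 1780300000
print(is_stale(epoch, as_of=datetime(2026, 9, 1, tzinfo=timezone.utc)))
print(is_stale(epoch, as_of=datetime(2027, 1, 1, tzinfo=timezone.utc)))
```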

MANAGE 2.3 -- Safety and Security Controls

What this requires: The organization must implement safety and security controls proportional to the AI system's risk level. Controls must be documented, tested, and maintained.

Procedure | Title | What It Witnesses
AI-SAFE.1 | Safety Boundary | Records the safety boundaries defined for the AI system, including operational limits and fail-safe conditions.
AI-SEC.1 | Security Control Attestation | Witnesses the security control posture of the AI system, including encryption, access controls, and network isolation.
AI-SEC.2 | Vulnerability Assessment | Records the results of an AI-focused vulnerability assessment, including findings and remediation status.

How to verify

1. Confirm AI-SAFE.1 defines concrete, measurable boundaries (not vague "best effort" language). Safety boundaries should specify what happens when limits are reached.

2. Verify AI-SEC.1 and AI-SEC.2 are generated on a regular cadence. Security attestation should align with the organization's overall vulnerability management cycle.

Common finding: Organizations conduct general IT vulnerability assessments but do not assess AI-specific vulnerabilities (model poisoning, inference manipulation, embedding attacks). AI-SEC.2 should cover threats unique to AI systems, not just the infrastructure they run on.

MANAGE 2.4 -- Access Control and Revocation

What this requires: The organization must control access to AI systems and maintain the ability to revoke access or retract AI outputs when necessary.

Procedure | Title | What It Witnesses
AI-ACC.1 | Access Control Decision | Records an access control decision for the AI system, including the requester, resource, and authorization result.
AI-REV.1 | Anchor Revocation | Witnesses the revocation of a previously issued witness anchor, including the revocation reason code and the target fingerprint.

How to verify

1. Verify AI-ACC.1 anchors demonstrate active access control enforcement, not just policy documentation.

2. Check for AI-REV.1 anchors. If the organization has never issued a revocation, ask whether it has a documented revocation procedure and whether that procedure has been tested. When reviewing revocations, note that SWT3 supports seven reason codes: unspecified, model_recall, policy_violation, data_contamination, consent_withdrawal, regulatory_order, error_correction.

3. If revocations exist, verify the revoked anchors are flagged in the public verification endpoint. Navigate to /verify and confirm revoked anchors display their revocation status.

Assessor tip: AI-REV.1 is the "undo button" for the SWT3 protocol. Organizations that have exercised revocation demonstrate a mature incident response capability. Ask for the revocation log and review the reason codes used.
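Reviewing a revocation log by reason code can be sketched as a simple tally. The seven reason codes come from this guide; the log structure (a list of dictionaries with a `reason_code` key) and the function name are assumptions made for illustration, not a documented export format.

```python
from collections import Counter

# The seven reason codes listed in this walkthrough.
REASON_CODES = frozenset({
    "unspecified", "model_recall", "policy_violation", "data_contamination",
    "consent_withdrawal", "regulatory_order", "error_correction",
})


def summarize_revocations(log: list[dict]) -> tuple[Counter, list[dict]]:
    """Tally revocations per reason code and collect entries with unknown codes."""
    counts: Counter = Counter()
    invalid: list[dict] = []
    for entry in log:
        code = entry.get("reason_code")
        if code in REASON_CODES:
            counts[code] += 1
        else:
            invalid.append(entry)
    return counts, invalid


# Hypothetical log entries; the field names are assumptions.
log = [
    {"target": "a3f8c92b1d07", "reason_code": "error_correction"},
    {"target": "0b1c2d3e4f5a", "reason_code": "unspecified"},
    {"target": "9f8e7d6c5b4a", "reason_code": "unspecified"},
    {"target": "112233445566", "reason_code": "typo"},
]

counts, invalid = summarize_revocations(log)
print(counts.most_common(), invalid)
```

A log dominated by `unspecified` is worth a follow-up question, since it gives little insight into why anchors were withdrawn.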

MANAGE 3.1 -- Incident Response

What this requires: The organization must have an incident response plan specific to AI systems, including procedures for detecting, containing, and recovering from AI-related incidents.

Procedure | Title | What It Witnesses
AI-IR.1 | Incident Response Activation | Records the activation of an AI incident response plan, including the incident classification and initial containment actions.

How to verify

1. Verify an AI-specific incident response plan exists and is distinct from the general IT incident response plan.

2. Check for AI-IR.1 anchors from tabletop exercises or drills, not just real incidents. A tested plan is more credible than an untested one.

MANAGE 3.2 -- Autonomous Operations and Incident Management

What this requires: For AI systems with autonomous decision-making capabilities, the organization must define operational boundaries and maintain incident management processes that account for autonomous behavior.

Procedure | Title | What It Witnesses
AI-AUTO.1 | Autonomous Decision Record | Records an autonomous decision made without human intervention, including the decision rationale and confidence level.
AI-INCIDENT.1 | AI Incident Record | Witnesses the details of an AI-related incident, including impact, root cause, and corrective actions taken.

How to verify

1. For autonomous systems, verify AI-AUTO.1 anchors are generated for each autonomous decision. The volume should match the system's actual decision rate.

2. Check AI-INCIDENT.1 records for completeness: root cause analysis, corrective actions, and follow-up verification should all be documented.

3. Cross-reference AI-INCIDENT.1 with AI-IR.1 to verify that incidents triggered the incident response plan as expected.
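The completeness check in step 2 can be sketched as a required-field scan over incident records. The field names below are hypothetical labels for the elements this section names (impact, root cause, corrective actions, follow-up verification); the actual witness data layout is not specified here.

```python
# Hypothetical field names standing in for the elements named in this section.
REQUIRED_FIELDS = ("impact", "root_cause", "corrective_actions", "follow_up_verification")


def missing_fields(record: dict) -> list[str]:
    """Return the required fields that are absent or empty in an incident record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        # Treat None, empty strings, whitespace, and empty lists as missing.
        if value is None or (isinstance(value, str) and not value.strip()) or value == []:
            missing.append(field)
    return missing


# Illustrative record: root cause left blank, follow-up never recorded.
incident = {
    "impact": "Incorrect eligibility decisions for a subset of requests",
    "root_cause": "  ",
    "corrective_actions": ["Rolled back to previous model version"],
}

print(missing_fields(incident))
```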

MANAGE 4.1 -- Human Oversight, Post-Market, and Violations

What this requires: The organization must maintain human oversight mechanisms, conduct post-market monitoring of deployed AI systems, and track policy violations or system failures.

Procedure | Title | What It Witnesses
AI-HITL.2 | Human Review Outcome | Records the outcome of a human review of AI outputs, including the reviewer's assessment and any corrections applied.
AI-PMM.1 | Post-Market Monitoring | Witnesses a post-market monitoring report, including performance metrics observed in production and any anomalies detected.
AI-VIO.1 | Violation Detection | Records the detection of a policy or safety violation by the AI system, including the violation type and severity.

How to verify

1. Verify AI-PMM.1 anchors exist on a regular cadence (monthly or quarterly). Post-market monitoring is not a one-time activity.

2. Cross-reference AI-VIO.1 (violation detection) with AI-GOV.5 (GOVERN 6.1, policy enforcement). Violations that are detected but never followed by an enforcement action indicate a governance gap.

3. For AI-HITL.2, check how review volume is distributed across reviewers. If one person reviews all AI outputs, examine whether the review is meaningful or a rubber-stamp process.

Common finding: Post-market monitoring (AI-PMM.1) is the most frequently missing procedure in production AI systems. Organizations invest heavily in pre-deployment testing but have no structured process for monitoring the system after it is live. This is a critical gap for the MANAGE function.
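The reviewer check in step 3 can be sketched as a share-of-reviews calculation. The record structure (a `reviewer` key per AI-HITL.2 entry) and the 80% threshold are assumptions chosen for illustration; this guide does not prescribe either.

```python
from collections import Counter


def reviewer_shares(reviews: list[dict]) -> dict[str, float]:
    """Return each reviewer's share of all recorded reviews, largest first."""
    counts = Counter(r["reviewer"] for r in reviews)
    total = sum(counts.values())
    return {name: n / total for name, n in counts.most_common()}


def concentrated_reviewers(reviews: list[dict], threshold: float = 0.8) -> list[str]:
    """Return reviewers whose share meets or exceeds the (assumed) threshold."""
    return [name for name, share in reviewer_shares(reviews).items() if share >= threshold]


# Hypothetical sample: nine reviews by one person, one by another.
reviews = [{"reviewer": "reviewer-a"}] * 9 + [{"reviewer": "reviewer-b"}]

print(reviewer_shares(reviews))
print(concentrated_reviewers(reviews))
```

A flagged reviewer is a prompt for further questions, not a finding in itself.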

7. Anchor Anatomy

Every SWT3 Witness Anchor follows a deterministic format that encodes the deployment tier, infrastructure provider, procedure namespace, specific procedure, verdict, timestamp, and a fingerprint consisting of the first 12 characters of a SHA-256 hash. Here is an annotated example for an AI RMF-relevant procedure:

SWT3-E-AWS-AI-DRIFT1-PASS-1780300000-a3f8c92b1d07

Segment | Value | Meaning
SWT3-E | Tier | E = Enclave deployment tier (self-hosted, full control)
AWS | Provider | Infrastructure provider where the AI system operates
AI | UCT Namespace | AI procedures namespace within the Unified Control Taxonomy
DRIFT1 | Procedure | AI-DRIFT.1, drift detection (MEASURE 2.6)
PASS | Verdict | The procedure's acceptance criteria were met
1780300000 | Epoch | Unix timestamp when the anchor was minted
a3f8c92b1d07 | Fingerprint | First 12 characters of SHA-256 hash over witness data

Assessor tip: To verify any anchor, paste its fingerprint at /verify. The verifier will confirm the anchor exists in the ledger, display its full metadata, and flag any revocation status. Batch verification is available for enclave-wide integrity checks.
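The segment breakdown above can be sketched as a small parser. This is not the SWT3 SDK: the pattern is inferred from the single annotated example, so the character classes allowed for tier, provider, and verdict are assumptions, as are the `Anchor` class and `parse_anchor` function names.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timezone

# Pattern inferred from the one example anchor; other tiers or providers may differ.
ANCHOR_RE = re.compile(
    r"SWT3-(?P<tier>[A-Z])"
    r"-(?P<provider>[A-Z0-9]+)"
    r"-(?P<namespace>[A-Z]+)"
    r"-(?P<name>[A-Z]+)(?P<number>\d+)"
    r"-(?P<verdict>[A-Z]+)"
    r"-(?P<epoch>\d+)"
    r"-(?P<fingerprint>[0-9a-f]{12})"
)


@dataclass(frozen=True)
class Anchor:
    tier: str
    provider: str
    procedure_id: str
    verdict: str
    minted_at: datetime
    fingerprint: str


def parse_anchor(text: str) -> Anchor:
    """Split an anchor string into its segments, or raise ValueError."""
    m = ANCHOR_RE.fullmatch(text.strip())
    if m is None:
        raise ValueError(f"not a recognizable SWT3 anchor: {text!r}")
    return Anchor(
        tier=m["tier"],
        provider=m["provider"],
        # Rebuild the dotted procedure ID, e.g. AI + DRIFT1 -> AI-DRIFT.1
        procedure_id=f"{m['namespace']}-{m['name']}.{m['number']}",
        verdict=m["verdict"],
        minted_at=datetime.fromtimestamp(int(m["epoch"]), tz=timezone.utc),
        fingerprint=m["fingerprint"],
    )


anchor = parse_anchor("SWT3-E-AWS-AI-DRIFT1-PASS-1780300000-a3f8c92b1d07")
print(anchor.procedure_id, anchor.verdict, anchor.minted_at.isoformat())
```

Parsing only confirms that an anchor is well-formed. Confirming that it exists in the ledger and has not been revoked still requires the verifier at /verify.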

8. Assessment Resources

Interactive Assessment Tools

Assessment tracker -- Filter by NIST AI RMF and track procedure completion during your evaluation.

Assessment checklist -- Printable checklist of all AI RMF-mapped procedures with pass/fail columns.

Anchor verifier -- Verify individual anchors or run enclave-wide integrity checks.

Related Guides

Assessment Playbook -- Operational playbook for conducting AI compliance assessments across all frameworks.

NIST CI AI Profile Crosswalk -- Mapping between the NIST Cybersecurity for IoT/AI profile and SWT3 procedures.

Assessor Evidence Matrix -- Complete evidence matrix showing what each procedure produces and where to find it.

SDK documentation: For technical integration details, visit the SDK documentation page. The SWT3 SDK is available for Python (pip install swt3-ai), TypeScript (npm install @tenova/swt3-ai), Rust, C#, and Ruby.