The Golden Rule of SoC Crypto IP Verification — A Complete Golden Reference Guide

📅 May 19, 2026 · 🔐 Cryptographic Algorithms · 🧪 RTL Verification · 🏛️ NIST / XKCP / OpenSSL

When embedding crypto IP — AES, SHA-2/3, HMAC/KMAC, RSA — into an SoC, the first question is deceptively simple: "How do you prove this RTL is bit-accurate?" The answer is a verifiable software reference model — a Golden Reference — combined with test vectors sanctioned by a standards body. Passing NIST CAVP vectors is the baseline, not the finish line. In practice, the real debugging leverage comes from a clean C reference that exposes intermediate values at every round. This guide covers: the authoritative standard for each algorithm, the trust hierarchy of golden reference candidates, where to obtain them, required test vector categories, and the critical pitfall of reaching for OpenSSL as a golden model.

🧭 Verification Workflow — The Full Pipeline at a Glance

Crypto RTL verification is a five-stage chain: "FIPS spec → golden C code → test vectors → DPI-C comparison → regression automation." A single stage driven by assumption — rather than a verified reference — breaks bit-level correctness across the entire chain.


flowchart TD
  A([FIPS Standard Spec]) --> B["Golden C Reference<br/>NIST · XKCP · Fiat"]
  B --> C["Test Vectors<br/>KAT · MCT · ACVP JSON"]
  C --> D{"DPI-C Round<br/>Intermediate match?"}
  D -->|YES| E([RTL Verification Passed])
  D -->|NO| F["Bug Localization<br/>State Comparison"]
  style A fill:#3498db,stroke:#2980b9,color:#ffffff
  style B fill:#e8f8f5,stroke:#16a085
  style C fill:#fef9e7,stroke:#f39c12
  style D fill:#fef9e7,stroke:#f39c12
  style E fill:#eafaf1,stroke:#27ae60,color:#1e8449
  style F fill:#fdedec,stroke:#e74c3c,color:#c0392b

🔁 Diagram summary: Starting from the FIPS spec, a golden C model is built and exercised against KAT, MCT, and ACVP JSON vectors. DPI-C then bridges the golden model and the RTL, comparing per-round intermediate state. Match → verification passes; mismatch → the failing round pinpoints the bug.

📚 1. Algorithm Standards and Golden Reference Map

Each algorithm has a recognized primary reference. For SHA-3 and KMAC, the Keccak team's XKCP is the de facto standard. For AES and RSA, OpenSSL is the industry baseline — but using OpenSSL directly as a golden model for RTL comparison is a trap. The reason is covered in depth below.

| Algorithm | Standard | Primary Golden | Oracle Candidate |
|---|---|---|---|
| AES | FIPS 197 | Rijndael C + NIST Intermediate Values | OpenSSL (AES-NI disabled build) |
| SHA-2 | FIPS 180-4 | NIST C Reference Code | OpenSSL EVP_Digest* |
| SHA-3 / SHAKE | FIPS 202 | XKCP `ref` (Keccak team) | OpenSSL 3.x |
| HMAC | FIPS 198-1 | NIST examples + OpenSSL HMAC() | PyCryptodome (supplemental) |
| KMAC | SP 800-185 | XKCP (includes cSHAKE) | OpenSSL 3.0+ KMAC EVP |
| RSA | FIPS 186-4 (→ 186-5) | Fiat-Crypto or custom bignum + NIST examples | OpenSSL bignum |

💡 Key insight: For SHA-3 and KMAC, XKCP is the de facto standard. For AES and RSA, OpenSSL is the industry baseline. However, OpenSSL is not suitable as a golden model for direct RTL comparison; that distinction is the central point of this guide.

📥 2. Where to Obtain Them — Reference Source Catalog

🏛️ 2.1 NIST Official Sources (Tier 1 — acquire unconditionally)

CSRC project pages — The algorithm-specific "Examples with Intermediate Values" PDFs are the most direct path to RTL debugging. For AES, each PDF lists the 16-byte state after every round. For SHA-256, it covers the full message schedule W[t]. For Keccak, every θ/ρ/π/χ/ι step result is given byte-by-byte — enabling 1:1 comparison against RTL waveforms.

CAVP legacy ZIP/RSP: AES (AESAVS), SHS (SHA-2), SHA-3, RSA (RSA2VS) — plain-text bundles with tens of thousands of KAT, MCT, and MMT vectors per algorithm.
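To make the MCT idea concrete, here is a minimal Python sketch of a SHA-256 Monte Carlo chain in the style described by the SHAVS validation document: three copies of the seed are concatenated, hashed 1000 times in a sliding window, and the last digest becomes the next seed. The function name `sha256_mct` and its `outer`/`inner` parameters are hypothetical, and `hashlib` stands in for the device under test; treat it as an illustration of the chaining, not a replacement for the official vectors.

```python
import hashlib

def sha256_mct(seed: bytes, outer: int = 100, inner: int = 1000) -> list[bytes]:
    """Return one checkpoint digest per outer iteration of a SHAVS-style chain."""
    checkpoints = []
    for _ in range(outer):
        md = [seed, seed, seed]                    # MD0 = MD1 = MD2 = Seed
        for _ in range(inner):
            msg = md[0] + md[1] + md[2]            # Mi = MD[i-3] || MD[i-2] || MD[i-1]
            md = [md[1], md[2], hashlib.sha256(msg).digest()]
        seed = md[2]                               # last digest seeds the next iteration
        checkpoints.append(seed)
    return checkpoints

# Small run for illustration; a full MCT would use the default counts.
demo = sha256_mct(bytes(32), outer=3, inner=10)
```

Because every digest feeds the next message, a single wrong bit anywhere in the datapath changes every later checkpoint, which is why MCT catches errors that isolated KATs can miss.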

ACVP (current JSON standard): NIST is migrating from RSP to structured JSON. Parsing ACVP JSON in a UVM or Python testbench is far simpler than hand-parsing RSP files. New projects should default to ACVP JSON.
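As a rough illustration of why the JSON form is easier to consume, the sketch below walks a vector set shaped like an ACVP SHA2-256 prompt and checks each case against an expected-results file. The two inline documents are hand-trimmed stand-ins (real ACVP files carry more fields and a version header), the helper names are hypothetical, and `hashlib` plays the role that the RTL or DPI-C result would play in a testbench.

```python
import hashlib
import json

# Trimmed stand-ins for an ACVP prompt file and its expected-results file.
PROMPT = json.loads("""{
  "algorithm": "SHA2-256",
  "testGroups": [{"tgId": 1, "testType": "AFT", "tests": [
    {"tcId": 1, "len": 0,  "msg": ""},
    {"tcId": 2, "len": 24, "msg": "616263"}
  ]}]
}""")
EXPECTED = json.loads("""{
  "testGroups": [{"tgId": 1, "tests": [
    {"tcId": 1, "md": "E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855"},
    {"tcId": 2, "md": "BA7816BF8F01CFEA414140DE5DAE2223B00361A396177A9CB410FF61F20015AD"}
  ]}]
}""")

def index_by_tcid(vector_set: dict, field: str) -> dict:
    """Flatten all test groups into {tcId: value of field}."""
    return {t["tcId"]: t[field] for g in vector_set["testGroups"] for t in g["tests"]}

def run_sha256_set(prompt: dict, expected: dict, digest_fn) -> list[int]:
    """Return the tcIds whose digest differs from the expected result."""
    want = index_by_tcid(expected, "md")
    failures = []
    for tc_id, msg_hex in index_by_tcid(prompt, "msg").items():
        got = digest_fn(bytes.fromhex(msg_hex)).hex().upper()
        if got != want[tc_id].upper():
            failures.append(tc_id)
    return failures

def oracle(msg: bytes) -> bytes:
    return hashlib.sha256(msg).digest()   # stand-in for the DUT result

failures = run_sha256_set(PROMPT, EXPECTED, oracle)
```

Keying everything on `tcId` keeps the prompt and the expected results loosely coupled, so the same loop can be reused when the digest comes from simulation instead of `hashlib`.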

🔑 2.2 XKCP — De Facto Standard for SHA-3 / KMAC

Repository: github.com/XKCP/XKCP. Maintained directly by the Keccak designers — Bertoni, Daemen, Peeters, and Van Assche — making it the highest-trust implementation of FIPS 202 and SP 800-185. If there is ever a discrepancy between XKCP and another library, XKCP is authoritative.

Usage rule: use only the `ref` folder as your golden model. The AVX-512 and ARMv8 optimized folders are for performance benchmarking only. Assembly-optimized paths diverge structurally from RTL — they exist to maximize throughput, not to mirror the spec's algorithmic steps, which limits their value for per-round debugging.
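The value of a spec-shaped reference is easiest to see in code. The following is a minimal Python sketch (not XKCP itself, and not the C model you would bind through DPI-C) of Keccak-f[1600] that follows the θ, ρ, π, χ, ι step order of FIPS 202 and records the full state after every round. The `trace` parameter and function names are hypothetical; the result is checked against `hashlib.sha3_256` rather than against hand-copied constants.

```python
import hashlib

MASK64 = (1 << 64) - 1

def rol64(a: int, n: int) -> int:
    n %= 64
    return ((a << n) | (a >> (64 - n))) & MASK64

def keccak_f1600(lanes, trace=None):
    """Run 24 rounds on lanes[x][y]; append (round, state copy) to trace if given."""
    lfsr = 1
    for rnd in range(24):
        # theta: column parities mixed into every lane
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4] for x in range(5)]
        d = [c[(x + 4) % 5] ^ rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi: rotate each lane and move it to its new position
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], rol64(current, (t + 1) * (t + 2) // 2)
        # chi: nonlinear step along each row
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5] & MASK64)
        # iota: round constant generated by an 8-bit LFSR
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
        if trace is not None:
            trace.append((rnd, [col[:] for col in lanes]))
    return lanes

def sha3_256(msg: bytes, trace=None) -> bytes:
    rate = 136                                   # bytes: (1600 - 2 * 256) / 8
    padded = bytearray(msg) + b"\x06"            # SHA-3 domain bits plus first pad bit
    padded += bytes(-len(padded) % rate)
    padded[-1] ^= 0x80                           # final pad bit
    lanes = [[0] * 5 for _ in range(5)]
    for off in range(0, len(padded), rate):
        block = padded[off:off + rate]
        for i in range(rate // 8):               # lane i sits at x = i % 5, y = i // 5
            lanes[i % 5][i // 5] ^= int.from_bytes(block[8 * i:8 * i + 8], "little")
        lanes = keccak_f1600(lanes, trace)
    return b"".join(lanes[i % 5][i // 5].to_bytes(8, "little") for i in range(4))

rounds = []
digest = sha3_256(b"abc", trace=rounds)          # rounds now holds 24 per-round snapshots
```

Each entry in `rounds` is the kind of checkpoint an RTL waveform can be compared against; an assembly-optimized path that fuses or reorders steps has no equivalent observation point.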

⚠️ 2.3 OpenSSL — For Oracle / Wrapper Use Only (Unsuitable as Golden)

FIPS 140-3 certification status: Module #4985 (March 11, 2025, OpenSSL 3.1.2) is the current recommended module. Module #4811 (September 24, 2024) was the first FIPS 140-3 certification. OpenSSL 3.5.4 (submitted October 9, 2025) is a next-generation module with PQC support, currently under review.

Gotcha: A call to EVP_EncryptInit dispatches at runtime to AES-NI, VAES, or AVX-512 code paths. There is no guarantee it follows the same algorithmic flow as your RTL. For RTL comparison, always use a clean C reference derived directly from the FIPS 197 spec — one you have audited and understand completely.

🛠️ 2.4 Supporting Tools — Wycheproof · AWS-LC · Fiat-Crypto

Wycheproof (Google → C2SP): The execution harness has been removed; it now serves as a JSON vector library. Essential for edge-case coverage: bad padding, weak keys, non-standard lengths, and other adversarial inputs that NIST CAVP deliberately excludes.

AWS-LC: BoringSSL fork with FIPS certification. Suitable when you need both modern performance (including PQC) and compliance coverage simultaneously.

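BoringSSL: Built primarily for Google's own products, with no API or ABI stability guarantees for outside users. Its BoringCrypto module has been FIPS-validated, but the moving interface makes it a poor anchor for SoC compliance work.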

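Fiat-Crypto: Generates formally verified big-integer modular arithmetic (including Montgomery multiplication) with machine-checked proofs of correctness. Its published targets are mainly elliptic-curve prime fields, so confirm modulus-size support before adopting it for RSA; where it applies, it is the highest-confidence golden reference option.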
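To show how a Wycheproof-style vector file is typically consumed, here is a small Python sketch. The sample data is generated in code and only mimics the shape of a Wycheproof MAC test file (`testGroups`, `tests`, `tcId`, `result`); the tags are computed with the standard `hmac` module, not copied from the real repository, and `verify_tag` is a hypothetical stand-in for the device under test.

```python
import hashlib
import hmac

def make_sample() -> dict:
    """Build an illustrative vector set shaped like a Wycheproof HMAC file."""
    key, msg = bytes(range(32)), b"soc"
    good = hmac.new(key, msg, hashlib.sha256).digest()
    flipped = bytes([good[0] ^ 0x01]) + good[1:]

    def case(tc_id, comment, tag, result):
        return {"tcId": tc_id, "comment": comment, "key": key.hex(),
                "msg": msg.hex(), "tag": tag.hex(), "result": result}

    return {"testGroups": [{"keySize": 256, "tagSize": 256, "tests": [
        case(1, "correct tag", good, "valid"),
        case(2, "first tag byte modified", flipped, "invalid"),
        case(3, "tag truncated to 16 bytes", good[:16], "invalid"),
    ]}]}

def verify_tag(key: bytes, msg: bytes, tag: bytes, tag_bytes: int) -> bool:
    """Stand-in DUT: accept only a full-length, matching tag."""
    expect = hmac.new(key, msg, hashlib.sha256).digest()[:tag_bytes]
    return len(tag) == tag_bytes and hmac.compare_digest(expect, tag)

def run_wycheproof(data: dict, verify) -> list[int]:
    """Return tcIds where the verifier disagrees with the labeled result."""
    wrong = []
    for group in data["testGroups"]:
        tag_bytes = group["tagSize"] // 8
        for t in group["tests"]:
            if t["result"] == "acceptable":      # neither outcome is an error
                continue
            ok = verify(bytes.fromhex(t["key"]), bytes.fromhex(t["msg"]),
                        bytes.fromhex(t["tag"]), tag_bytes)
            if ok != (t["result"] == "valid"):
                wrong.append(t["tcId"])
    return wrong

mismatches = run_wycheproof(make_sample(), verify_tag)
```

The important design point is that "invalid" cases must be rejected, not merely computed; a verifier that accepts everything passes all KATs and fails every negative vector here.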

📊 3. Golden Candidate Trust Comparison

Scored across four axes — RTL comparison fit, standards authority, debugging convenience (intermediate value access), and automation friendliness (JSON/DPI-C) — the candidates rank as follows.

| Candidate | Score |
|---|---|
| NIST Intermediate | 95 |
| XKCP ref (SHA-3) | 93 |
| Fiat-Crypto (RSA) | 88 |
| ACVP JSON | 85 |
| AWS-LC | 70 |
| OpenSSL (Oracle use) | 65 |
| PyCryptodome | 35 |
| BoringSSL | 25 |

* Scores are qualitative assessments: a composite of RTL comparison fit, standards authority, debugging convenience, and automation friendliness.

🧪 4. Required Test Vector Categories

A design is only "verified" when all seven categories below pass. Skip any one of them and a silent bug can be frozen into a production chip.

| # | Category | Verification Target | Source |
|---|---|---|---|
| 1 | KAT | Standard input → standard output match (minimum requirement) | NIST CAVP |
| 2 | MCT | Thousands-to-millions of chained operations; detects accumulated errors | CAVP / ACVP |
| 3 | MMT | State propagation consistency across multiple block boundaries | CAVP |
| 4 | Intermediate | Per-round state comparison, the core of SoC debugging | FIPS appendix examples |
| 5 | Edge / Invalid | Max/min/zero-length inputs, weak keys, bad padding | Wycheproof |
| 6 | RSA-specific | KeyGen / SigGen·SigVer (PKCS#1, PSS) / OAEP / timing | RSA2VS + Wycheproof |
| 7 | HMAC/KMAC | Three key-length branches: < block size, = block size, > block size | CAVP HMAC |
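Category 7 is easy to state and easy to get wrong in hardware, so a short sketch may help. The Python below implements HMAC-SHA-256 directly from the FIPS 198-1 construction, with the key-preparation step written out so the three branches are visible, and cross-checks it against the standard `hmac` module. The function name `hmac_sha256_ref` and the chosen key lengths are illustrative assumptions.

```python
import hashlib
import hmac

BLOCK = 64  # SHA-256 block size in bytes

def hmac_sha256_ref(key: bytes, msg: bytes) -> bytes:
    """HMAC-SHA-256 with the key-length handling spelled out."""
    if len(key) > BLOCK:
        key = hashlib.sha256(key).digest()       # longer than a block: hash first
    k0 = key.ljust(BLOCK, b"\x00")               # shorter: zero-pad; equal: unchanged
    ipad = bytes(b ^ 0x36 for b in k0)
    opad = bytes(b ^ 0x5C for b in k0)
    inner = hashlib.sha256(ipad + msg).digest()
    return hashlib.sha256(opad + inner).digest()

# One key per branch: shorter than, equal to, and longer than the block size.
results = {}
for key_len in (20, 64, 100):
    key = bytes(range(key_len))
    results[key_len] = hmac_sha256_ref(key, b"key-length branch check")
```

A common RTL bug is to zero-pad a long key instead of hashing it, which passes the first two cases and fails only the third; running all three lengths in regression is what exposes it.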

⚙️ 5. Co-Simulation Workflow — DPI-C Is the Right Choice

When bridging RTL and a golden model, DPI-C (Direct Programming Interface for C) is superior to Python ctypes for cycle-accurate per-round comparison. The standard four-stage pipeline looks like this.


graph LR
  A["Golden C<br/>State Exposed"] --> B["DPI-C Binding<br/>SystemVerilog"]
  B --> C["UVM Scoreboard<br/>Per-Round Compare"]
  C --> D["ACVP JSON<br/>Regression"]
  style A fill:#e8f8f5,stroke:#16a085
  style B fill:#eaf2f8,stroke:#2980b9
  style C fill:#fef9e7,stroke:#f39c12
  style D fill:#eafaf1,stroke:#27ae60

🔗 Diagram summary: The NIST/XKCP `ref` C code is instrumented to expose internal state, then bound into SystemVerilog via DPI-C. A UVM scoreboard compares RTL output against DPI results at every round. ACVP JSON drives regression automation — a four-stage pipeline that scales from a single design to a full regression suite.

5.1 Key Implementation Patterns

▶ Instrument the C reference to expose an inspection API — for example, void aes_get_state(int round, uint8_t state[16]). This makes the golden model and the RTL share the same observable state structure, enabling direct per-round comparison.

▶ In SystemVerilog, declare import "DPI-C" function void aes_get_state(...) to bind the C function directly into the testbench simulation context.

▶ The UVM scoreboard compares the RTL round output against the DPI call result at every round boundary. The first mismatch identifies the failing round, which narrows the search to that round's logic and its inputs.

▶ Limit Python to stimulus generation and regression loading. Keep the critical verification path in DPI-C. Mixing Python into the comparison loop introduces latency and reduces observability.
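The comparison logic behind these patterns can be sketched independently of any simulator. The Python below is a toy model only: `toy_round` is a made-up 16-byte state update (not AES), `dut_trace` imitates an RTL trace with an optional injected fault, and `first_mismatch` does what the scoreboard would do with states fetched through a call like `aes_get_state`. All of these names are hypothetical, and in a real flow the comparison would live in the UVM scoreboard rather than in Python.

```python
import hashlib

def toy_round(rnd: int, state: bytes) -> bytes:
    """Placeholder round function (NOT a real cipher): deterministic 16-byte update."""
    return hashlib.sha256(bytes([rnd]) + state).digest()[:16]

def golden_trace(block: bytes, rounds: int = 10) -> list[bytes]:
    """State after each round, as an instrumented reference would expose it."""
    trace, state = [], block
    for rnd in range(1, rounds + 1):
        state = toy_round(rnd, state)
        trace.append(state)
    return trace

def dut_trace(block: bytes, rounds: int = 10, fault_round=None) -> list[bytes]:
    """Imitation of the RTL side; flips one bit at fault_round and keeps going."""
    trace, state = [], block
    for rnd in range(1, rounds + 1):
        state = toy_round(rnd, state)
        if rnd == fault_round:
            state = bytes([state[0] ^ 0x01]) + state[1:]
        trace.append(state)
    return trace

def first_mismatch(golden: list[bytes], dut: list[bytes]):
    """Return the 1-based round of the first differing state, or None if all match."""
    if len(golden) != len(dut):
        raise ValueError("traces cover a different number of rounds")
    for rnd, (g, d) in enumerate(zip(golden, dut), start=1):
        if g != d:
            return rnd
    return None

block = bytes(16)
clean = first_mismatch(golden_trace(block), dut_trace(block))
faulty = first_mismatch(golden_trace(block), dut_trace(block, fault_round=7))
```

Because the corrupted state feeds every later round, the final output alone only says "wrong"; comparing round by round is what turns that into "wrong starting at round 7".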

5.2 Library Pitfall Comparison

| Library | Intermediate Value Access | RTL Comparison Fit |
|---|---|---|
| PyCryptodome | ❌ Black box | Not suitable |
| OpenSSL | ⚠️ ASM dispatch | Oracle use only |
| NIST example C | ✅ Explicit | Optimal |
| XKCP ref | ✅ State easily exposed | Optimal |

🚨 Recommendation: Triple-verify the same vector against (a) a clean C reference, (b) OpenSSL or AWS-LC, and (c) XKCP `ref` for Keccak-based algorithms, to rule out library-dependent errors. Passing against only one source is not sufficient, because an error shared between the RTL and that one library goes undetected.

📖 6. Algorithm Quick Reference — Core Structures

🔐 AES (FIPS 197): 128-bit block, 128/192/256-bit keys. Round structure: SubBytes (S-Box) → ShiftRows → MixColumns → AddRoundKey, for 10/12/14 rounds depending on key length. Verification checkpoint: 16-byte state after each round.
#️⃣ SHA-2 (FIPS 180-4): SHA-224/256/384/512. Merkle-Damgård construction with Davies-Meyer compression. 64/80 message schedule words W[t]. Verification checkpoint: working variables a–h at each compression step.
🌀 SHA-3 (FIPS 202): Keccak-f[1600] sponge construction. Five permutation steps — θ, ρ, π, χ, ι — applied for 24 rounds. SHAKE128/256 produce variable-length output. Verification checkpoint: per-round snapshot of the 5×5×64 state array.
🔏 HMAC (FIPS 198-1): H((K⊕opad) ∥ H((K⊕ipad) ∥ M)). Verification checkpoint: correct key-length handling across three cases — key shorter than, equal to, and longer than the block size.
🗝️ KMAC (SP 800-185): Keyed hash built on cSHAKE. Provides explicit domain separation via a Customization String — more disciplined than raw SHA-3 for keyed operations. Verification checkpoint: domain-separation byte padding.
🔑 RSA (FIPS 186-4 → 186-5): c = m^e mod n; signing uses PKCS#1 v1.5 or PSS; encryption uses OAEP. Verifying correctness of bignum, Montgomery reduction, and CRT optimization is the hardest challenge in SoC crypto verification. Verification checkpoint: intermediate values in modular exponentiation.
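For the RSA entry, the two optimizations named above can be checked against plain modular exponentiation with very little code. The Python sketch below uses toy primes and textbook RSA without padding, purely to show the cross-check; the function names and parameters are hypothetical, and a real golden model would use full-size keys with PKCS#1 v1.5, PSS, or OAEP on top.

```python
# Toy parameters only: far too small for any real use.
P, Q, E = 61, 53, 17
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))                # private exponent (Python 3.8+)

def rsa_sign_plain(m: int) -> int:
    return pow(m, D, N)

def rsa_sign_crt(m: int) -> int:
    """CRT signing with Garner recombination; must equal the plain result."""
    d_p, d_q = D % (P - 1), D % (Q - 1)
    q_inv = pow(Q, -1, P)
    s_p, s_q = pow(m, d_p, P), pow(m, d_q, Q)    # two half-size exponentiations
    h = (q_inv * (s_p - s_q)) % P
    return s_q + h * Q

def redc(t: int, n: int, k: int, n_neg_inv: int) -> int:
    """Montgomery reduction: returns t * R^-1 mod n for R = 2^k and t < n * R."""
    mask = (1 << k) - 1
    m = ((t & mask) * n_neg_inv) & mask
    u = (t + m * n) >> k                         # exact division by R
    return u - n if u >= n else u

def mont_mul(a: int, b: int, n: int, k: int) -> int:
    """a * b mod n computed through the Montgomery domain (n odd, 2^k > n)."""
    r = 1 << k
    n_neg_inv = (-pow(n, -1, r)) % r
    a_bar, b_bar = (a * r) % n, (b * r) % n      # enter the Montgomery domain
    prod_bar = redc(a_bar * b_bar, n, k, n_neg_inv)
    return redc(prod_bar, n, k, n_neg_inv)       # leave the Montgomery domain

signature = rsa_sign_crt(65)
```

In a testbench the same idea applies at each level: `s_p`, `s_q`, and every `redc` output are natural intermediate checkpoints, so a mismatch can be traced to the exponentiation, the recombination, or a single reduction.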

🎯 7. Recommended Production Setup

| Algorithm | Golden (Primary) | Oracle | Edge / Regression |
|---|---|---|---|
| SHA-3 / KMAC | XKCP `ref` | OpenSSL 3.x | Wycheproof JSON |
| AES | Custom C from FIPS 197 Appendix | OpenSSL (AES-NI disabled) | AESAVS + Wycheproof |
| SHA-2 / HMAC | NIST C Reference | OpenSSL | CAVP RSP + ACVP JSON |
| RSA | Fiat-Crypto or custom bignum | OpenSSL | RSA2VS + Wycheproof PSS |

🛡️ Physical attack caveat: A software golden reference cannot prove resistance to side-channel attacks or fault injection. Power and timing leakage must be assessed separately using TVLA (Test Vector Leakage Assessment), a t-test applied to gate-level simulation waveforms or physical measurements. Functional verification and security verification are separate tracks with separate budgets.
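The t-test at the heart of TVLA is simple enough to sketch. The Python below computes Welch's t-statistic per sample point between a fixed-input trace set and a random-input trace set, and flags points whose magnitude exceeds 4.5, the threshold conventionally used in TVLA practice. The traces are synthetic Gaussian noise with an artificial offset at one point; the helper names, trace counts, and leak size are assumptions chosen for illustration, not measurements.

```python
import math
import random
from statistics import mean, variance

TVLA_THRESHOLD = 4.5  # conventional pass/fail bound on |t|

def welch_t(a: list[float], b: list[float]) -> float:
    """Welch's t-statistic for two samples with possibly unequal variance."""
    return (mean(a) - mean(b)) / math.sqrt(variance(a) / len(a) + variance(b) / len(b))

def leaky_points(fixed_traces, random_traces) -> list[int]:
    """Indices of sample points where the two trace sets differ significantly."""
    flagged = []
    for i in range(len(fixed_traces[0])):
        t = welch_t([tr[i] for tr in fixed_traces], [tr[i] for tr in random_traces])
        if abs(t) > TVLA_THRESHOLD:
            flagged.append(i)
    return flagged

def synthetic_traces(rng, count: int, points: int, leak_at=None, leak: float = 0.5):
    """Gaussian-noise traces; optionally add a data-dependent offset at one point."""
    traces = []
    for _ in range(count):
        trace = [rng.gauss(0.0, 1.0) for _ in range(points)]
        if leak_at is not None:
            trace[leak_at] += leak
        traces.append(trace)
    return traces

rng = random.Random(1)
fixed_set = synthetic_traces(rng, 2000, 8, leak_at=3)   # fixed-input class leaks at point 3
random_set = synthetic_traces(rng, 2000, 8)
flagged = leaky_points(fixed_set, random_set)
```

Note that nothing in this computation refers to the golden model's outputs: a design can match every functional vector bit for bit and still fail this test, which is why the two tracks need separate budgets.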

📅 8. Standards Compliance Timeline

| Date | Milestone |
|---|---|
| Sep 2024 | FIPS 140-3 #4811, first certification |
| Mar 2025 | #4985, OpenSSL 3.1.2 |
| Oct 2025 | OpenSSL 3.5.4, PQC support submitted |
| 2026+ | FIPS 186-5 transition accelerates |

🏁 The Four-Layer Golden Rule

The golden rule of SoC crypto IP verification is a four-layer combination: "NIST standard spec + clean C reference + DPI-C intermediate-value comparison + Wycheproof edge cases." OpenSSL is highly useful as a wrapper and oracle, but its platform-specific assembly dispatch — AES-NI, VAES, AVX-512 — makes it unsuitable as the bit-accurate golden model for RTL comparison.

For SHA-3/KMAC: golden = XKCP `ref`. For RSA: golden = Fiat-Crypto or custom code grounded in NIST examples. For AES/SHA-2: golden = C code that embeds the FIPS appendix intermediate values verbatim. Automate regression with ACVP JSON. This is the most robust setup as of 2026. And one thing not to overlook — functional verification and side-channel verification are entirely separate tracks. Budget for TVLA separately.

🧠 One-line summary: Standards: NIST. Golden: XKCP · Fiat · NIST examples. Oracle: OpenSSL. Edge cases: Wycheproof. Bridge: DPI-C. Regression: ACVP JSON — six things to remember.

📎 References

NIST CSRC — Cryptographic Standards

NIST ACVP-Server JSON Files

XKCP — Keccak Code Package

Keccak Team Official

Wycheproof (C2SP)

Fiat-Crypto

NIST CAVP

This document is for informational purposes only. Actual IP certification and compliance must follow the latest guidelines from the relevant certification authority. All SoC designs intended for production must undergo review by a qualified security evaluation lab before tape-out.

SoC Design
Semiconductor & SoC Design Notes

Materials on semiconductor and SoC design and verification — personally curated and reviewed before publication.

Written based on publicly available data and cited sources. Last updated: June 8, 2026.
