The Golden Rule of SoC Crypto IP Verification — A Complete Golden Reference Guide
📅 May 19, 2026 · 🔐 Cryptographic Algorithms · 🧪 RTL Verification · 🏛️ NIST / XKCP / OpenSSL
When embedding crypto IP — AES, SHA-2/3, HMAC/KMAC, RSA — into an SoC, the first question is deceptively simple: "How do you prove this RTL is bit-accurate?" The answer is a verifiable software reference model — a Golden Reference — combined with test vectors sanctioned by a standards body. Passing NIST CAVP vectors is the baseline, not the finish line. In practice, the real debugging leverage comes from a clean C reference that exposes intermediate values at every round. This guide covers: the authoritative standard for each algorithm, the trust hierarchy of golden reference candidates, where to obtain them, required test vector categories, and the critical pitfall of reaching for OpenSSL as a golden model.
🧭 Verification Workflow — The Full Pipeline at a Glance
Crypto RTL verification is a five-stage chain: "FIPS spec → golden C code → test vectors → DPI-C comparison → regression automation." A single stage driven by assumption — rather than a verified reference — breaks bit-level correctness across the entire chain.
flowchart TD
A([FIPS Standard Spec]) --> B[Golden C Reference<br/>NIST · XKCP · Fiat]
B --> C[Test Vectors<br/>KAT · MCT · ACVP JSON]
C --> D{DPI-C Round<br/>Intermediate match?}
D -->|YES| E([RTL Verification Passed])
D -->|NO| F[Bug Localization<br/>State Comparison]
style A fill:#3498db,stroke:#2980b9,color:#ffffff
style B fill:#e8f8f5,stroke:#16a085
style C fill:#fef9e7,stroke:#f39c12
style D fill:#fef9e7,stroke:#f39c12
style E fill:#eafaf1,stroke:#27ae60,color:#1e8449
style F fill:#fdedec,stroke:#e74c3c,color:#c0392b
🔁 Diagram summary: Starting from the FIPS spec, a golden C model is built and exercised against KAT, MCT, and ACVP JSON vectors. DPI-C then bridges the golden model and the RTL, comparing per-round intermediate state. Match → verification passes; mismatch → the failing round pinpoints the bug.
📚 1. Algorithm Standards and Golden Reference Map
Each algorithm has a recognized primary reference. For SHA-3 and KMAC, the Keccak team's XKCP is the de facto standard. For AES and RSA, OpenSSL is the industry baseline, but using OpenSSL directly as a golden model for RTL comparison is a trap: its runtime dispatch to optimized assembly hides the per-round state you need to compare. Section 2.3 covers the reason in depth.
| Algorithm | Standard | Primary Golden | Oracle Candidate |
|---|---|---|---|
| AES | FIPS 197 | Rijndael C + NIST Intermediate Values | OpenSSL (AES-NI disabled build) |
| SHA-2 | FIPS 180-4 | NIST C Reference Code | OpenSSL EVP_Digest* |
| SHA-3 / SHAKE | FIPS 202 | XKCP `ref` (Keccak team) | OpenSSL 3.x |
| HMAC | FIPS 198-1 | NIST examples + OpenSSL HMAC() | PyCryptodome (supplemental) |
| KMAC | SP 800-185 | XKCP (includes cSHAKE) | OpenSSL 3.0+ KMAC EVP |
| RSA | FIPS 186-4 (→ 186-5) | OpenSSL bignum + NIST examples | Fiat-Crypto (formally verified) |
📥 2. Where to Obtain Them — Reference Source Catalog
🏛️ 2.1 NIST Official Sources (Tier 1 — acquire unconditionally)
▶ CSRC project pages — The algorithm-specific "Examples with Intermediate Values" PDFs are the most direct path to RTL debugging. For AES, each PDF lists the 16-byte state after every round. For SHA-256, it covers the full message schedule W[t]. For Keccak, every θ/ρ/π/χ/ι step result is given byte-by-byte — enabling 1:1 comparison against RTL waveforms.
▶ CAVP legacy ZIP/RSP: AES (AESAVS), SHS (SHA-2), SHA-3, RSA (RSA2VS) — plain-text bundles with tens of thousands of KAT, MCT, and MMT vectors per algorithm.
▶ ACVP (current JSON standard): NIST is migrating from RSP to structured JSON. Parsing ACVP JSON in a UVM or Python testbench is far simpler than hand-parsing RSP files. New projects should default to ACVP JSON.
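As a rough illustration of why JSON vectors are easier to automate, here is a minimal Python sketch of a regression loader for an ACVP-style SHA-256 vector set. The field names (`testGroups`, `tgId`, `testType`, `tests`, `tcId`, `msg`, `len`, `md`) and the merged prompt-plus-expected-result layout are assumptions made for this example; check them against the files you actually download. `model_under_test` is a hypothetical stand-in, with `hashlib` playing the role of the model.

```python
import hashlib
import json

# Hypothetical ACVP-style vector set, inlined so the sketch is self-contained.
# Field names and the merged prompt/expected layout are assumptions.
SAMPLE = json.loads("""
{
  "algorithm": "SHA2-256",
  "testGroups": [
    {
      "tgId": 1,
      "testType": "AFT",
      "tests": [
        {"tcId": 1, "msg": "", "len": 0,
         "md": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"},
        {"tcId": 2, "msg": "616263", "len": 24,
         "md": "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"}
      ]
    }
  ]
}
""")


def iter_aft_cases(vector_set):
    """Yield (tgId, tcId, message, expected digest) for functional (AFT) groups."""
    for group in vector_set["testGroups"]:
        if group.get("testType") != "AFT":
            continue  # Monte Carlo groups need their own chained loop
        for case in group["tests"]:
            msg = bytes.fromhex(case["msg"])
            if len(msg) * 8 != case["len"]:
                raise ValueError("this sketch handles byte-aligned messages only")
            yield group["tgId"], case["tcId"], msg, bytes.fromhex(case["md"])


def run_regression(vector_set, digest_fn):
    """Return the (tgId, tcId) pairs whose digest does not match."""
    failures = []
    for tg_id, tc_id, msg, expected in iter_aft_cases(vector_set):
        if digest_fn(msg) != expected:
            failures.append((tg_id, tc_id))
    return failures


def model_under_test(msg):
    # Stand-in for the golden model or DUT wrapper (hypothetical name).
    return hashlib.sha256(msg).digest()


if __name__ == "__main__":
    print("failing cases:", run_regression(SAMPLE, model_under_test))
```

Returning the failing `(tgId, tcId)` pairs, rather than a single pass/fail flag, keeps the regression report traceable back to the vector file.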
🔑 2.2 XKCP — De Facto Standard for SHA-3 / KMAC
Repository: github.com/XKCP/XKCP. Maintained directly by the Keccak designers — Bertoni, Daemen, Peeters, and Van Assche — making it the highest-trust implementation of FIPS 202 and SP 800-185. If there is ever a discrepancy between XKCP and another library, XKCP is authoritative.
Usage rule: use only the `ref` folder as your golden model. The AVX-512 and ARMv8 optimized folders are for performance benchmarking only. Assembly-optimized paths diverge structurally from RTL — they exist to maximize throughput, not to mirror the spec's algorithmic steps, which limits their value for per-round debugging.
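To show what "mirroring the spec's algorithmic steps" buys you, the following Python sketch (not XKCP code) implements Keccak-f[1600] in spec order and records a snapshot after each step mapping. The `trace` argument and the snapshot tuple layout are illustrative choices for this example, ρ and π are logged as one combined step, and `hashlib.sha3_256` serves only as an oracle for the final digest.

```python
import hashlib

LANE_MASK = (1 << 64) - 1


def rol64(value, n):
    """Rotate a 64-bit lane left by n bits."""
    n %= 64
    return ((value << n) | (value >> (64 - n))) & LANE_MASK


def keccak_f1600(lanes, trace=None):
    """Apply Keccak-f[1600] in place to lanes[x][y]; log one snapshot per step."""
    def snap(rnd, step):
        if trace is not None:
            trace.append((rnd, step, [column[:] for column in lanes]))

    lfsr = 1
    for rnd in range(24):
        # theta: fold column parities into every lane
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ rol64(c[(x + 1) % 5], 1) for x in range(5)]
        for x in range(5):
            for y in range(5):
                lanes[x][y] ^= d[x]
        snap(rnd, "theta")
        # rho + pi: rotate each lane and move it to its new position
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], rol64(current, (t + 1) * (t + 2) // 2)
        snap(rnd, "rho_pi")
        # chi: the only non-linear step, applied along each row
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        snap(rnd, "chi")
        # iota: round constant generated by the spec's LFSR
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
        snap(rnd, "iota")
    return lanes


def sha3_256(msg, trace=None):
    """SHA3-256 over the sponge above (rate 136 bytes, domain suffix 0x06)."""
    rate = 136
    padded = bytearray(msg)
    padded.append(0x06)
    padded.extend(b"\x00" * (-len(padded) % rate))
    padded[-1] ^= 0x80
    lanes = [[0] * 5 for _ in range(5)]
    for off in range(0, len(padded), rate):
        block = padded[off:off + rate]
        for i in range(rate // 8):
            lanes[i % 5][i // 5] ^= int.from_bytes(block[8 * i:8 * i + 8], "little")
        keccak_f1600(lanes, trace)
    return b"".join(lanes[i % 5][i // 5].to_bytes(8, "little") for i in range(4))


if __name__ == "__main__":
    steps = []
    digest = sha3_256(b"abc", steps)
    print(digest == hashlib.sha3_256(b"abc").digest(), len(steps))
```

Each snapshot corresponds to one observable point in an RTL round pipeline, which is the property an assembly-optimized path with fused or reordered steps cannot offer.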
⚠️ 2.3 OpenSSL — For Oracle / Wrapper Use Only (Unsuitable as Golden)
FIPS 140-3 certification status: Module #4985 (March 11, 2025, OpenSSL 3.1.2) is the current recommended module. Module #4811 (September 24, 2024) was the first FIPS 140-3 certification. OpenSSL 3.5.4 (submitted October 9, 2025) is a next-generation module with PQC support, currently under review.
Gotcha: A call to EVP_EncryptInit dispatches at runtime to AES-NI, VAES, or AVX-512 code paths. There is no guarantee it follows the same algorithmic flow as your RTL. For RTL comparison, always use a clean C reference derived directly from the FIPS 197 spec — one you have audited and understand completely.
🛠️ 2.4 Supporting Tools — Wycheproof · AWS-LC · Fiat-Crypto
▶ Wycheproof (Google → C2SP): The execution harness has been removed; it now serves as a JSON vector library. Essential for edge-case coverage: bad padding, weak keys, non-standard lengths, and other adversarial inputs that NIST CAVP deliberately excludes.
▶ AWS-LC: BoringSSL fork with FIPS certification. Suitable when you need both modern performance (including PQC) and compliance coverage simultaneously.
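▶ BoringSSL: Maintained primarily for Google's own use, with no API or ABI stability guarantees for outside consumers. Its BoringCrypto core carries its own FIPS validations, but the project is not recommended as a dependency for SoC compliance work.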
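▶ Fiat-Crypto: Generates formally verified finite-field arithmetic (the modular big-integer routines used mainly in ECC) with machine-checked proofs of correctness. It is a high-confidence reference for modular arithmetic; before relying on it as the RSA golden, confirm that the generated code covers the modulus sizes your RSA core needs.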
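Returning to the Wycheproof vectors listed above: the key difference from CAVP-style files is the per-case `result` field, which tells the harness whether the device must accept or reject the input. The sketch below builds a small Wycheproof-style MAC group in memory and checks that behaviour. The group layout (`type`, `keySize`, `tagSize`, `tests`, `tcId`, `key`, `msg`, `tag`, `result`) follows the published MAC test schema as far as I know, but treat it as an assumption, and note that the two test cases are synthetic, not taken from the real vector files. `verify_tag` is a hypothetical stand-in for the device under test.

```python
import hashlib
import hmac


def make_sample_group():
    """Build a synthetic Wycheproof-style MAC test group (not real vectors)."""
    key = bytes(range(16))
    msg = b"soc-crypto"
    good = hmac.new(key, msg, hashlib.sha256).digest()
    bad = bytes([good[0] ^ 0x01]) + good[1:]  # single flipped bit in the tag
    return {
        "type": "MacTest",
        "keySize": 128,
        "tagSize": 256,
        "tests": [
            {"tcId": 1, "comment": "", "key": key.hex(),
             "msg": msg.hex(), "tag": good.hex(), "result": "valid"},
            {"tcId": 2, "comment": "flipped bit in tag", "key": key.hex(),
             "msg": msg.hex(), "tag": bad.hex(), "result": "invalid"},
        ],
    }


def verify_tag(key, msg, tag):
    """Stand-in for the device under test: HMAC-SHA-256 tag verification."""
    expected = hmac.new(key, msg, hashlib.sha256).digest()[:len(tag)]
    return hmac.compare_digest(expected, tag)


def run_group(group, verify):
    """Return the tcIds where accept/reject behaviour contradicts 'result'."""
    failures = []
    for case in group["tests"]:
        accepted = verify(bytes.fromhex(case["key"]),
                          bytes.fromhex(case["msg"]),
                          bytes.fromhex(case["tag"]))
        if case["result"] == "valid" and not accepted:
            failures.append(case["tcId"])
        elif case["result"] == "invalid" and accepted:
            failures.append(case["tcId"])
        # "acceptable" cases may go either way, so they never fail the run
    return failures


if __name__ == "__main__":
    print("failing tcIds:", run_group(make_sample_group(), verify_tag))
```

A device that accepts everything passes every KAT and still fails this harness on the `invalid` case, which is the class of bug edge-case vectors exist to catch.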
📊 3. Golden Candidate Trust Comparison
Scored across four axes — RTL comparison fit, standards authority, debugging convenience (intermediate value access), and automation friendliness (JSON/DPI-C) — the candidates rank as follows.
* Scores are qualitative assessments: a composite of RTL comparison fit, standards authority, debugging convenience, and automation friendliness.
🧪 4. Required Test Vector Categories
A design is only "verified" when all seven categories below pass. Skip any one of them and a silent bug can be frozen into a production chip.
| # | Category | Verification Target | Source |
|---|---|---|---|
| 1 | KAT | Standard input → standard output match (minimum requirement) | NIST CAVP |
| 2 | MCT | Thousands-to-millions of chained operations; detects accumulated errors | CAVP / ACVP |
| 3 | MMT | State propagation consistency across multiple block boundaries | CAVP |
| 4 | Intermediate | Per-round state comparison — the core of SoC debugging | FIPS appendix examples |
| 5 | Edge / Invalid | Max/min/zero-length inputs, weak keys, bad padding | Wycheproof |
| 6 | RSA-specific | KeyGen / SigGen·SigVer (PKCS#1, PSS) / OAEP / timing | RSA2VS + Wycheproof |
| 7 | HMAC/KMAC | Three key-length branches: < block size, = block size, > block size | CAVP HMAC |
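The three key-length branches in category 7 come straight from the FIPS 198-1 key preparation step: a key longer than the hash block is hashed first, and the result is then zero-padded to the block size like any shorter key. A minimal Python sketch for HMAC-SHA-256 follows; `hmac_sha256_ref` and `key_length_cases` are names made up for this example, and the keys are arbitrary byte patterns rather than CAVP vectors.

```python
import hashlib
import hmac

BLOCK = 64  # SHA-256 block size in bytes


def hmac_sha256_ref(key, msg):
    """HMAC-SHA-256 written out step by step, following FIPS 198-1."""
    if len(key) > BLOCK:
        key = hashlib.sha256(key).digest()  # branch: key longer than a block
    k0 = key.ljust(BLOCK, b"\x00")          # branches: shorter or equal
    ipad = bytes(b ^ 0x36 for b in k0)
    opad = bytes(b ^ 0x5C for b in k0)
    inner = hashlib.sha256(ipad + msg).digest()
    return hashlib.sha256(opad + inner).digest()


def key_length_cases(msg=b"key-length branch test"):
    """Build one (key, msg, tag) case per branch: short, equal, long."""
    cases = {}
    for name, klen in (("short", BLOCK - 1), ("equal", BLOCK), ("long", BLOCK + 1)):
        key = bytes(i & 0xFF for i in range(klen))
        cases[name] = (key, msg, hmac_sha256_ref(key, msg))
    return cases


if __name__ == "__main__":
    for name, (key, msg, tag) in key_length_cases().items():
        oracle = hmac.new(key, msg, hashlib.sha256).digest()
        print(name, len(key), tag == oracle)
```

Choosing key lengths one byte either side of the block size exercises the boundary where RTL key-preparation logic most often goes wrong.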
⚙️ 5. Co-Simulation Workflow — DPI-C Is the Right Choice
When bridging RTL and a golden model, DPI-C (Direct Programming Interface for C) is superior to Python ctypes for cycle-accurate per-round comparison. The standard four-stage pipeline looks like this.
graph LR
A[Golden C<br/>State Exposed] --> B[DPI-C Binding<br/>SystemVerilog]
B --> C[UVM Scoreboard<br/>Per-Round Compare]
C --> D[ACVP JSON<br/>Regression]
style A fill:#e8f8f5,stroke:#16a085
style B fill:#eaf2f8,stroke:#2980b9
style C fill:#fef9e7,stroke:#f39c12
style D fill:#eafaf1,stroke:#27ae60
🔗 Diagram summary: The NIST/XKCP `ref` C code is instrumented to expose internal state, then bound into SystemVerilog via DPI-C. A UVM scoreboard compares RTL output against DPI results at every round. ACVP JSON drives regression automation — a four-stage pipeline that scales from a single design to a full regression suite.
5.1 Key Implementation Patterns
▶ Instrument the C reference to expose an inspection API — for example, void aes_get_state(int round, uint8_t state[16]). This makes the golden model and the RTL share the same observable state structure, enabling direct per-round comparison.
▶ In SystemVerilog, declare import "DPI-C" function void aes_get_state(...) to bind the C function directly into the testbench simulation context.
▶ The UVM scoreboard compares the RTL round output against the DPI call result at every round boundary. The first mismatch pinpoints the exact failing round — there is no ambiguity about where the bug is.
▶ Limit Python to stimulus generation and regression loading. Keep the critical verification path in DPI-C. Mixing Python into the comparison loop introduces latency and reduces observability.
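The inspection-API pattern can be prototyped offline before committing to the C and DPI-C version. The Python sketch below is an illustration of the idea only, not the recommended comparison path: it instruments a SHA-256 reference to record the working variables after every round, then locates the first round where a second trace diverges. `sha256_traced`, `first_mismatch`, and `inject_fault` are hypothetical names, and the "RTL" trace is simulated by flipping one bit in a copy of the golden trace.

```python
import hashlib
import math
import struct

MASK = 0xFFFFFFFF


def _primes(n):
    """First n primes by trial division (n is small here)."""
    found, cand = [], 2
    while len(found) < n:
        if all(cand % p for p in found):
            found.append(cand)
        cand += 1
    return found


def _icbrt(n):
    """Exact floor of the cube root, by binary search."""
    lo, hi = 0, 1 << ((n.bit_length() + 2) // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


# FIPS 180-4 constants: fractional bits of cube and square roots of primes.
K = [_icbrt(p << 96) & MASK for p in _primes(64)]
H0 = [math.isqrt(p << 64) & MASK for p in _primes(8)]


def _rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK


def sha256_traced(msg):
    """Return (digest, trace); trace[block][t] holds (a..h) after round t."""
    padded = (msg + b"\x80" + b"\x00" * ((55 - len(msg)) % 64)
              + struct.pack(">Q", len(msg) * 8))
    state, trace = list(H0), []
    for off in range(0, len(padded), 64):
        w = list(struct.unpack(">16I", padded[off:off + 64]))
        for t in range(16, 64):  # message schedule W[t]
            s0 = _rotr(w[t - 15], 7) ^ _rotr(w[t - 15], 18) ^ (w[t - 15] >> 3)
            s1 = _rotr(w[t - 2], 17) ^ _rotr(w[t - 2], 19) ^ (w[t - 2] >> 10)
            w.append((w[t - 16] + s0 + w[t - 7] + s1) & MASK)
        a, b, c, d, e, f, g, h = state
        rounds = []
        for t in range(64):
            big_s1 = _rotr(e, 6) ^ _rotr(e, 11) ^ _rotr(e, 25)
            ch = (e & f) ^ (~e & g)
            t1 = (h + big_s1 + ch + K[t] + w[t]) & MASK
            big_s0 = _rotr(a, 2) ^ _rotr(a, 13) ^ _rotr(a, 22)
            maj = (a & b) ^ (a & c) ^ (b & c)
            t2 = (big_s0 + maj) & MASK
            a, b, c, d, e, f, g, h = (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g
            rounds.append((a, b, c, d, e, f, g, h))
        trace.append(rounds)
        state = [(x + y) & MASK for x, y in zip(state, (a, b, c, d, e, f, g, h))]
    return struct.pack(">8I", *state), trace


def first_mismatch(golden, dut):
    """Return (block, round) of the first differing state, or None."""
    for blk, (g_rounds, d_rounds) in enumerate(zip(golden, dut)):
        for t, (g_state, d_state) in enumerate(zip(g_rounds, d_rounds)):
            if g_state != d_state:
                return blk, t
    return None


def inject_fault(trace, block, rnd):
    """Copy a trace and flip one bit of register 'a' at one round (test aid)."""
    faulty = [list(rounds) for rounds in trace]
    regs = list(faulty[block][rnd])
    regs[0] ^= 1
    faulty[block][rnd] = tuple(regs)
    return faulty


if __name__ == "__main__":
    digest, golden = sha256_traced(b"abc")
    print(digest == hashlib.sha256(b"abc").digest())
    print(first_mismatch(golden, inject_fault(golden, 0, 17)))
```

In the production flow the same comparison runs inside the UVM scoreboard against states fetched over DPI-C; a script like this is better suited to triaging dumped traces after a failing regression.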
5.2 Library Pitfall Comparison
| Library | Intermediate Value Access | RTL Comparison Fit |
|---|---|---|
| PyCryptodome | ❌ Black box | Not suitable |
| OpenSSL | ⚠️ ASM dispatch | Oracle use only |
| NIST example C | ✅ Explicit | Optimal |
| XKCP ref | ✅ State easily exposed | Optimal |
📖 6. Algorithm Quick Reference — Core Structures
🎯 7. Recommended Production Setup
| Algorithm | Golden (Primary) | Oracle | Edge / Regression |
|---|---|---|---|
| SHA-3 / KMAC | XKCP `ref` | OpenSSL 3.x | Wycheproof JSON |
| AES | Custom C from FIPS 197 Appendix | OpenSSL (AES-NI disabled) | AESAVS + Wycheproof |
| SHA-2 / HMAC | NIST C Reference | OpenSSL | CAVP RSP + ACVP JSON |
| RSA | Fiat-Crypto or custom bignum | OpenSSL | RSA2VS + Wycheproof PSS |
📅 8. Standards Compliance Timeline
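▶ September 24, 2024: Module #4811, first certification
▶ March 11, 2025: Module #4985, OpenSSL 3.1.2
▶ October 9, 2025: OpenSSL 3.5.4 submitted with PQC support, under review
▶ Transition accelerates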
🏁 The Four-Layer Golden Rule
The golden rule of SoC crypto IP verification is a four-layer combination: "NIST standard spec + clean C reference + DPI-C intermediate-value comparison + Wycheproof edge cases." OpenSSL is highly useful as a wrapper and oracle, but its platform-specific assembly dispatch — AES-NI, VAES, AVX-512 — makes it unsuitable as the bit-accurate golden model for RTL comparison.
For SHA-3/KMAC: golden = XKCP `ref`. For RSA: golden = Fiat-Crypto or custom code grounded in NIST examples. For AES/SHA-2: golden = C code that embeds the FIPS appendix intermediate values verbatim. Automate regression with ACVP JSON. This is the most robust setup as of 2026. And one thing not to overlook — functional verification and side-channel verification are entirely separate tracks. Budget for TVLA separately.
📎 References
▶ NIST CSRC — Cryptographic Standards
This document is for informational purposes only. Actual IP certification and compliance must follow the latest guidelines from the relevant certification authority. All SoC designs intended for production must undergo review by a qualified security evaluation lab before tape-out.
Materials on semiconductor and SoC design and verification — personally curated and reviewed before publication.
Written based on publicly available data and cited sources. Last updated: June 8, 2026.