The Golden Rule of SoC Crypto IP Verification — A Complete Golden Reference Guide

📅 May 19, 2026 · 🔐 Cryptographic Algorithms · 🧪 RTL Verification · 🏛️ NIST / XKCP / OpenSSL

When embedding crypto IP — AES, SHA-2/3, HMAC/KMAC, RSA — into an SoC, the first question is deceptively simple: "How do you prove this RTL is bit-accurate?" The answer is a verifiable software reference model — a Golden Reference — combined with test vectors sanctioned by a standards body. Passing NIST CAVP vectors is the baseline, not the finish line. In practice, the real debugging leverage comes from a clean C reference that exposes intermediate values at every round. This guide covers: the authoritative standard for each algorithm, the trust hierarchy of golden reference candidates, where to obtain them, required test vector categories, and the critical pitfall of reaching for OpenSSL as a golden model.

🧭 Verification Workflow — The Full Pipeline at a Glance

Crypto RTL verification is a five-stage chain: "FIPS spec → golden C code → test vectors → DPI-C comparison → regression automation." A single stage driven by assumption — rather than a verified reference — breaks bit-level correctness across the entire chain.


flowchart TD
  A([FIPS Standard Spec]) --> B[Golden C Reference<br/>NIST · XKCP · Fiat]
  B --> C[Test Vectors<br/>KAT · MCT · ACVP JSON]
  C --> D{DPI-C Round<br/>Intermediate match?}
  D -->|YES| E([RTL Verification Passed])
  D -->|NO| F[Bug Localization<br/>State Comparison]
  style A fill:#3498db,stroke:#2980b9,color:#ffffff
  style B fill:#e8f8f5,stroke:#16a085
  style C fill:#fef9e7,stroke:#f39c12
  style D fill:#fef9e7,stroke:#f39c12
  style E fill:#eafaf1,stroke:#27ae60,color:#1e8449
  style F fill:#fdedec,stroke:#e74c3c,color:#c0392b

🔁 Diagram summary: Starting from the FIPS spec, a golden C model is built and exercised against KAT, MCT, and ACVP JSON vectors. DPI-C then bridges the golden model and the RTL, comparing per-round intermediate state. Match → verification passes; mismatch → the failing round pinpoints the bug.

📚 1. Algorithm Standards and Golden Reference Map

Each algorithm has a recognized primary reference. For SHA-3 and KMAC, the Keccak team's XKCP is the de facto standard. For AES and RSA, OpenSSL is the industry baseline, but using OpenSSL directly as a golden model for RTL comparison is a trap: its runtime dispatch to optimized assembly hides the per-round state that an RTL comparison needs (see Section 2.3).

Algorithm	Standard	Primary Golden	Oracle Candidate
AES	FIPS 197	Rijndael C + NIST Intermediate Values	OpenSSL (AES-NI disabled build)
SHA-2	FIPS 180-4	NIST C Reference Code	OpenSSL EVP_Digest*
SHA-3 / SHAKE	FIPS 202	XKCP `ref` (Keccak team)	OpenSSL 3.x
HMAC	FIPS 198-1	NIST examples + OpenSSL HMAC()	PyCryptodome (supplemental)
KMAC	SP 800-185	XKCP (includes cSHAKE)	OpenSSL 3.0+ KMAC EVP
RSA	FIPS 186-4 (→ 186-5)	OpenSSL bignum + NIST examples	Fiat-Crypto (formally verified)

💡 Key insight: For SHA-3 and KMAC, XKCP is the de facto standard. For AES and RSA, OpenSSL is the industry baseline. However, OpenSSL is not suitable as a golden model for direct RTL comparison — that distinction is the central point of this guide.

📥 2. Where to Obtain Them — Reference Source Catalog

🏛️ 2.1 NIST Official Sources (Tier 1 — acquire unconditionally)

▶ CSRC project pages — The algorithm-specific "Examples with Intermediate Values" PDFs are the most direct path to RTL debugging. For AES, each PDF lists the 16-byte state after every round. For SHA-256, it covers the full message schedule W[t]. For Keccak, every θ/ρ/π/χ/ι step result is given byte-by-byte — enabling 1:1 comparison against RTL waveforms.

▶ CAVP legacy ZIP/RSP: AES (AESAVS), SHS (SHA-2), SHA-3, RSA (RSA2VS) — plain-text bundles with tens of thousands of KAT, MCT, and MMT vectors per algorithm.
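The legacy RSP bundles are plain `Key = value` text, so a small loader is enough to feed them into a testbench. The sketch below assumes the SHAVS-style field names (`Len`, `Msg`, `MD`) and a bracketed section header; the inline sample is an illustrative excerpt, not a real CAVP file, and `hashlib` only stands in for the device under test.

```python
import hashlib

# Inline sample in a SHAVS-style RSP layout (field names Len, Msg, MD assumed).
SAMPLE_RSP = """\
#  Illustrative excerpt, not a real CAVP file
[L = 32]

Len = 0
Msg = 00
MD = e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855

Len = 24
Msg = 616263
MD = ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
"""

def parse_rsp(text):
    """Return one dict per vector; a blank line closes the current record."""
    vectors, current = [], {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            if current:
                vectors.append(current)
                current = {}
            continue
        if line.startswith("#") or line.startswith("["):
            continue  # comments and section headers such as [L = 32]
        key, _, value = line.partition("=")
        current[key.strip()] = value.strip()
    if current:
        vectors.append(current)
    return vectors

def message_bytes(vec):
    """Len is given in bits; Len = 0 is written with a placeholder Msg = 00."""
    nbits = int(vec["Len"])
    return bytes.fromhex(vec["Msg"])[: nbits // 8]

vectors = parse_rsp(SAMPLE_RSP)
for vec in vectors:
    digest = hashlib.sha256(message_bytes(vec)).hexdigest()
    print(vec["Len"], "PASS" if digest == vec["MD"] else "FAIL")
```

Other algorithms use different field names (for example key and ciphertext fields for AES), so the loader keeps every key it sees rather than hard-coding a schema.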

▶ ACVP (current JSON standard): NIST is migrating from RSP to structured JSON. Parsing ACVP JSON in a UVM or Python testbench is far simpler than hand-parsing RSP files. New projects should default to ACVP JSON.
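As a rough illustration of why JSON is easier to consume, the sketch below joins a prompt file and an expected-results file by test case ID. The field names (`testGroups`, `tgId`, `testType`, `tests`, `tcId`, `msg`, `len`, `md`) are assumed from the ACVP SHA-2 layout and the inline documents are simplified stand-ins, so check them against the files actually downloaded. The `dut` callable is hypothetical; here `hashlib` plays that role.

```python
import hashlib
import json

# Simplified stand-ins for an ACVP prompt / expected-results pair (assumed layout).
PROMPT = json.loads("""
{"algorithm": "SHA2-256",
 "testGroups": [{"tgId": 1, "testType": "AFT",
                 "tests": [{"tcId": 1, "msg": "", "len": 0},
                           {"tcId": 2, "msg": "616263", "len": 24}]}]}
""")
EXPECTED = json.loads("""
{"testGroups": [{"tgId": 1, "tests": [
  {"tcId": 1, "md": "E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855"},
  {"tcId": 2, "md": "BA7816BF8F01CFEA414140DE5DAE2223B00361A396177A9CB410FF61F20015AD"}]}]}
""")

def index_expected(expected):
    """Map tcId -> lowercase hex digest."""
    return {t["tcId"]: t["md"].lower()
            for g in expected["testGroups"] for t in g["tests"]}

def run_aft(prompt, expected, dut):
    """Run every AFT test through dut(msg) and return the failing tcIds."""
    want = index_expected(expected)
    failures = []
    for group in prompt["testGroups"]:
        if group.get("testType") != "AFT":
            continue  # MCT and other test types need their own drivers
        for test in group["tests"]:
            msg = bytes.fromhex(test["msg"])[: test["len"] // 8]
            if dut(msg).hex() != want[test["tcId"]]:
                failures.append(test["tcId"])
    return failures

failures = run_aft(PROMPT, EXPECTED, lambda m: hashlib.sha256(m).digest())
print("failing tcIds:", failures)
```

Keying results by `tcId` means a regression report can name the exact failing case instead of a line offset in a text file.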

🔑 2.2 XKCP — De Facto Standard for SHA-3 / KMAC

Repository: github.com/XKCP/XKCP. Maintained directly by the Keccak designers — Bertoni, Daemen, Peeters, and Van Assche — making it the highest-trust implementation of FIPS 202 and SP 800-185. If there is ever a discrepancy between XKCP and another library, XKCP is authoritative.

Usage rule: use only the `ref` folder as your golden model. The AVX-512 and ARMv8 optimized folders are for performance benchmarking only. Assembly-optimized paths diverge structurally from RTL — they exist to maximize throughput, not to mirror the spec's algorithmic steps, which limits their value for per-round debugging.

⚠️ 2.3 OpenSSL — For Oracle / Wrapper Use Only (Unsuitable as Golden)

FIPS 140-3 certification status: Module #4985 (March 11, 2025, OpenSSL 3.1.2) is the current recommended module. Module #4811 (September 24, 2024) was the first FIPS 140-3 certification. OpenSSL 3.5.4 (submitted October 9, 2025) is a next-generation module with PQC support, currently under review.

Gotcha: A call to EVP_EncryptInit dispatches at runtime to AES-NI, VAES, or AVX-512 code paths. There is no guarantee it follows the same algorithmic flow as your RTL. For RTL comparison, always use a clean C reference derived directly from the FIPS 197 spec — one you have audited and understand completely.

🛠️ 2.4 Supporting Tools — Wycheproof · AWS-LC · Fiat-Crypto

▶ Wycheproof (Google → C2SP): The execution harness has been removed; it now serves as a JSON vector library. Essential for edge-case coverage: bad padding, weak keys, non-standard lengths, and other adversarial inputs that NIST CAVP deliberately excludes.

▶ AWS-LC: BoringSSL fork with FIPS certification. Suitable when you need both modern performance (including PQC) and compliance coverage simultaneously.

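▶ BoringSSL: Maintained primarily for Google's own use, with no API or ABI stability guarantees. There is no FIPS certification for the library as a whole (only its BoringCrypto core module is validated), so it is not recommended for SoC compliance work.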

▶ Fiat-Crypto: Generates formally verified RSA/ECC bignum implementations with machine-checked proofs of correctness. The highest-confidence golden reference option for RSA.

📊 3. Golden Candidate Trust Comparison

Scored across four axes — RTL comparison fit, standards authority, debugging convenience (intermediate value access), and automation friendliness (JSON/DPI-C) — the candidates rank as follows.

[Bar chart; numeric scores not recoverable. Candidates in the order shown: NIST Intermediate, XKCP ref (SHA-3), Fiat-Crypto (RSA), ACVP JSON, AWS-LC, OpenSSL (Oracle use), PyCryptodome, BoringSSL (no FIPS)]

* Scores are qualitative assessments: a composite of RTL comparison fit, standards authority, debugging convenience, and automation friendliness.

🧪 4. Required Test Vector Categories

A design is only "verified" when all seven categories below pass. Skip any one of them and a silent bug can be frozen into a production chip.

#	Category	Verification Target	Source
1	KAT	Standard input → standard output match (minimum requirement)	NIST CAVP
2	MCT	Thousands-to-millions of chained operations; detects accumulated errors	CAVP / ACVP
3	MMT	State propagation consistency across multiple block boundaries	CAVP
4	Intermediate	Per-round state comparison — the core of SoC debugging	FIPS appendix examples
5	Edge / Invalid	Max/min/zero-length inputs, weak keys, bad padding	Wycheproof
6	RSA-specific	KeyGen / SigGen·SigVer (PKCS#1, PSS) / OAEP / timing	RSA2VS + Wycheproof
7	HMAC/KMAC	Three key-length branches: < block size, = block size, > block size	CAVP HMAC
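To make the MCT row concrete, here is a minimal sketch of the chained loop for SHA-2, written from the SHAVS pseudocode as I recall it: three running digests are concatenated and rehashed 1000 times, and the last digest seeds the next of 100 outer rounds. The function name and parameters are hypothetical, SHA-3 uses a different chaining rule, and real seeds and expected checkpoints must come from the CAVP or ACVP files.

```python
import hashlib

def sha2_mct(seed, hash_fn=hashlib.sha256, outer=100, inner=1000):
    """Chained Monte Carlo loop in the style of the SHAVS pseudocode."""
    checkpoints = []
    for _ in range(outer):
        md = [seed, seed, seed]            # MD0 = MD1 = MD2 = Seed
        for _ in range(inner):
            msg = md[0] + md[1] + md[2]    # M_i = MD_(i-3) || MD_(i-2) || MD_(i-1)
            md = [md[1], md[2], hash_fn(msg).digest()]
        seed = md[2]                       # last digest seeds the next outer round
        checkpoints.append(seed)
    return checkpoints

# Placeholder seed for illustration only; real runs take the seed from the vector file.
cps = sha2_mct(bytes(32))
print(len(cps), "checkpoints, first:", cps[0].hex())
```

Because each digest feeds the next input, a single wrong bit anywhere in the datapath changes every later checkpoint, which is why MCT catches errors that isolated KATs can miss.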

⚙️ 5. Co-Simulation Workflow — DPI-C Is the Right Choice

When bridging RTL and a golden model, DPI-C (Direct Programming Interface for C) is superior to Python ctypes for cycle-accurate per-round comparison. The standard four-stage pipeline looks like this.


graph LR
  A[Golden C<br/>State Exposed] --> B[DPI-C Binding<br/>SystemVerilog]
  B --> C[UVM Scoreboard<br/>Per-Round Compare]
  C --> D[ACVP JSON<br/>Regression]
  style A fill:#e8f8f5,stroke:#16a085
  style B fill:#eaf2f8,stroke:#2980b9
  style C fill:#fef9e7,stroke:#f39c12
  style D fill:#eafaf1,stroke:#27ae60

🔗 Diagram summary: The NIST/XKCP `ref` C code is instrumented to expose internal state, then bound into SystemVerilog via DPI-C. A UVM scoreboard compares RTL output against DPI results at every round. ACVP JSON drives regression automation — a four-stage pipeline that scales from a single design to a full regression suite.

5.1 Key Implementation Patterns

▶ Instrument the C reference to expose an inspection API — for example, void aes_get_state(int round, uint8_t state[16]). This makes the golden model and the RTL share the same observable state structure, enabling direct per-round comparison.

▶ In SystemVerilog, declare import "DPI-C" function void aes_get_state(...) to bind the C function directly into the testbench simulation context.

▶ The UVM scoreboard compares the RTL round output against the DPI call result at every round boundary. The first mismatch pinpoints the exact failing round — there is no ambiguity about where the bug is.

▶ Limit Python to stimulus generation and regression loading. Keep the critical verification path in DPI-C. Mixing Python into the comparison loop introduces latency and reduces observability.
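The inspection-API and first-mismatch ideas above can be modeled in a few lines. This is a language-neutral sketch written in Python for brevity; in the flow described here the comparison itself lives in the SystemVerilog scoreboard, and a script like this would only be used for offline triage of dumped traces. `RoundTrace` and `first_mismatch` are hypothetical names, and the trace contents are made up.

```python
class RoundTrace:
    """Stores one state snapshot per round, mirroring a get_state(round) API."""

    def __init__(self):
        self._states = []

    def record(self, state):
        self._states.append(bytes(state))

    def get_state(self, rnd):
        return self._states[rnd]

    def __len__(self):
        return len(self._states)

def first_mismatch(golden, rtl):
    """Return the first round whose snapshots differ, or None if all match."""
    common = min(len(golden), len(rtl))
    for rnd in range(common):
        if golden.get_state(rnd) != rtl.get_state(rnd):
            return rnd
    # Same prefix but different lengths: report the first missing round.
    return None if len(golden) == len(rtl) else common

# Hypothetical 4-round traces; the "RTL" side is corrupted at round 2.
golden, rtl = RoundTrace(), RoundTrace()
for rnd in range(4):
    state = bytes([rnd] * 16)
    golden.record(state)
    rtl.record(state if rnd != 2 else bytes([0xFF] * 16))

print("first failing round:", first_mismatch(golden, rtl))
```

Stopping at the first divergent round matters because every later round inherits the error and would otherwise bury the root cause in noise.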

5.2 Library Pitfall Comparison

Library	Intermediate Value Access	RTL Comparison Fit
PyCryptodome	❌ Black box	Not suitable
OpenSSL	⚠️ ASM dispatch	Oracle use only
NIST example C	✅ Explicit	Optimal
XKCP ref	✅ State easily exposed	Optimal

🚨 Recommendation: Triple-verify the same vector against (a) a clean C reference, (b) OpenSSL or AWS-LC, and (c) XKCP `ref` where the algorithm is SHA-3 or KMAC, to rule out library-dependent errors. Passing against only one source is not sufficient: if that one library is wrong, the RTL can match it and still be wrong.

📖 6. Algorithm Quick Reference — Core Structures

🔐 AES (FIPS 197): 128-bit block, 128/192/256-bit keys. An initial AddRoundKey is followed by 10/12/14 rounds (depending on key length) of SubBytes (S-Box) → ShiftRows → MixColumns → AddRoundKey; the final round omits MixColumns. Verification checkpoint: 16-byte state after each round.
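The per-round checkpoint can be illustrated with a compact AES-128 model that records the state after every AddRoundKey. This is a sketch written from the FIPS 197 round definitions, not the NIST reference code: the S-box is computed rather than tabulated, only 128-bit keys are handled, and all function names are hypothetical. The key and plaintext are the FIPS 197 Appendix C.1 example.

```python
def xtime(a):
    """Multiply by x in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    a <<= 1
    return (a ^ 0x11B) & 0xFF if a & 0x100 else a

def gmul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        a = xtime(a)
        b >>= 1
    return r

def gf_inv(x):
    r = 1
    for _ in range(254):          # x^254 is the inverse; yields 0 for x = 0
        r = gmul(r, x)
    return r

def rotl8(b, k):
    return ((b << k) | (b >> (8 - k))) & 0xFF

def affine(b):
    return b ^ rotl8(b, 1) ^ rotl8(b, 2) ^ rotl8(b, 3) ^ rotl8(b, 4) ^ 0x63

SBOX = [affine(gf_inv(x)) for x in range(256)]

def expand_key_128(key):
    """Return 11 round keys of 16 bytes each."""
    words = [list(key[4 * i:4 * i + 4]) for i in range(4)]
    rcon = 1
    for i in range(4, 44):
        temp = list(words[i - 1])
        if i % 4 == 0:
            temp = temp[1:] + temp[:1]            # RotWord
            temp = [SBOX[b] for b in temp]        # SubWord
            temp[0] ^= rcon
            rcon = xtime(rcon)
        words.append([a ^ b for a, b in zip(words[i - 4], temp)])
    return [sum(words[4 * r:4 * r + 4], []) for r in range(11)]

def mix_columns(s):
    out = []
    for c in range(4):
        a = s[4 * c:4 * c + 4]
        out += [gmul(a[r], 2) ^ gmul(a[(r + 1) % 4], 3) ^ a[(r + 2) % 4] ^ a[(r + 3) % 4]
                for r in range(4)]
    return out

def aes128_encrypt_trace(key, block):
    """trace[0] is the state entering round 1; trace[10] is the ciphertext."""
    rks = expand_key_128(key)
    s = [b ^ k for b, k in zip(block, rks[0])]
    trace = [bytes(s)]
    for rnd in range(1, 11):
        s = [SBOX[b] for b in s]                                              # SubBytes
        s = [s[(i % 4) + 4 * (((i // 4) + (i % 4)) % 4)] for i in range(16)]  # ShiftRows
        if rnd != 10:
            s = mix_columns(s)                                                # skipped in final round
        s = [b ^ k for b, k in zip(s, rks[rnd])]                              # AddRoundKey
        trace.append(bytes(s))
    return trace

key = bytes(range(16))
plaintext = bytes.fromhex("00112233445566778899aabbccddeeff")
trace = aes128_encrypt_trace(key, plaintext)
for rnd, state in enumerate(trace):
    print(f"round {rnd:2d}: {state.hex()}")
```

The state is kept as a flat 16-byte list in column-major order (index `row + 4 * col`), which matches the byte order used in the FIPS 197 example listings and makes the printed snapshots directly comparable.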

#️⃣ SHA-2 (FIPS 180-4): SHA-224/256/384/512. Merkle-Damgård construction with Davies-Meyer compression. The message schedule W[t] has 64 words for SHA-224/256 and 80 words for SHA-384/512. Verification checkpoint: working variables a–h at each compression step.
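A minimal SHA-256 sketch that records the working variables a through h after every compression step, as a model of what the golden reference should expose. It is written from the FIPS 180-4 definitions rather than taken from the NIST reference code. As a convenience that is my own choice, the constants are derived with integer arithmetic from the cube and square roots of the first primes (which is how the standard defines them) instead of being pasted in as tables; names such as `sha256_trace` are hypothetical.

```python
import hashlib
from math import isqrt

MASK = 0xFFFFFFFF

def icbrt(n):
    """Integer cube root by binary search."""
    lo, hi = 0, 1 << ((n.bit_length() + 2) // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

def first_primes(count):
    primes, n = [], 2
    while len(primes) < count:
        if all(n % p for p in primes):
            primes.append(n)
        n += 1
    return primes

PRIMES = first_primes(64)
K = [icbrt(p << 96) & MASK for p in PRIMES]       # fractional bits of cube roots
H0 = [isqrt(p << 64) & MASK for p in PRIMES[:8]]  # fractional bits of square roots

def rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK

def sha256_trace(message):
    """Return (digest, trace); trace holds (a..h) after each of the 64 steps per block."""
    padded = (message + b"\x80" + b"\x00" * ((55 - len(message)) % 64)
              + (8 * len(message)).to_bytes(8, "big"))
    h, trace = list(H0), []
    for off in range(0, len(padded), 64):
        w = [int.from_bytes(padded[off + 4 * t:off + 4 * t + 4], "big") for t in range(16)]
        for t in range(16, 64):
            s0 = rotr(w[t - 15], 7) ^ rotr(w[t - 15], 18) ^ (w[t - 15] >> 3)
            s1 = rotr(w[t - 2], 17) ^ rotr(w[t - 2], 19) ^ (w[t - 2] >> 10)
            w.append((w[t - 16] + s0 + w[t - 7] + s1) & MASK)
        a, b, c, d, e, f, g, hh = h
        for t in range(64):
            t1 = (hh + (rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25))
                  + ((e & f) ^ (~e & g)) + K[t] + w[t]) & MASK
            t2 = ((rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22))
                  + ((a & b) ^ (a & c) ^ (b & c))) & MASK
            a, b, c, d, e, f, g, hh = (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g
            trace.append((a, b, c, d, e, f, g, hh))
        h = [(x + y) & MASK for x, y in zip(h, (a, b, c, d, e, f, g, hh))]
    return b"".join(x.to_bytes(4, "big") for x in h), trace

digest, trace = sha256_trace(b"abc")
print(digest.hex())
print("t=0:", " ".join(f"{v:08x}" for v in trace[0]))
```

Cross-checking the final digest against `hashlib` confirms the model end to end, while the trace rows are what would be compared against the RTL at each step.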

🌀 SHA-3 (FIPS 202): Keccak-f[1600] sponge construction. Five permutation steps — θ, ρ, π, χ, ι — applied for 24 rounds. SHAKE128/256 produce variable-length output. Verification checkpoint: per-round snapshot of the 5×5×64 state array.
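The round structure can be sketched as a small Keccak-f[1600] model that snapshots the 5×5 lane array after each round, wrapped in a SHA3-256 sponge so the result can be checked against a library. This is written from the FIPS 202 step definitions in a compact style and is not the XKCP `ref` code; the round constants come from the LFSR rather than a table, and the function names are hypothetical.

```python
import hashlib

MASK64 = (1 << 64) - 1

def rol64(a, n):
    n %= 64
    return ((a << n) | (a >> (64 - n))) & MASK64

def keccak_f1600(lanes, snapshots=None):
    """lanes[x][y] are 64-bit ints; optionally record the state after each round."""
    lfsr = 1
    for _ in range(24):
        # theta
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], rol64(current, (t + 1) * (t + 2) // 2)
        # chi
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota (round constant bits generated by an 8-bit LFSR)
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
        if snapshots is not None:
            snapshots.append([col[:] for col in lanes])
    return lanes

def sha3_256(message, snapshots=None):
    rate = 136                                   # bytes; capacity is 512 bits
    padded = bytearray(message) + bytes(rate - (len(message) % rate))
    padded[len(message)] ^= 0x06                 # SHA-3 domain suffix + first pad bit
    padded[-1] ^= 0x80                           # final pad bit
    state = bytearray(200)
    for off in range(0, len(padded), rate):
        for i in range(rate):
            state[i] ^= padded[off + i]
        lanes = [[int.from_bytes(state[8 * (x + 5 * y):8 * (x + 5 * y) + 8], "little")
                  for y in range(5)] for x in range(5)]
        lanes = keccak_f1600(lanes, snapshots)
        for x in range(5):
            for y in range(5):
                state[8 * (x + 5 * y):8 * (x + 5 * y) + 8] = lanes[x][y].to_bytes(8, "little")
    return bytes(state[:32])

snaps = []
print(sha3_256(b"abc", snaps).hex())
print("rounds captured:", len(snaps))
```

For finer localization the same hook can be placed after each of θ, ρ/π, χ, and ι instead of once per round, at the cost of five times as many snapshots.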

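🔏 HMAC (FIPS 198-1): H((K0⊕opad) ∥ H((K0⊕ipad) ∥ M)), where K0 is the key brought to exactly one block: hashed first if it is longer than the block size, then zero-padded. Verification checkpoint: correct key-length handling across three cases, with the key shorter than, equal to, and longer than the block size.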
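The three key-length branches can be exercised with a hand-written HMAC-SHA-256 checked against Python's standard `hmac` module. This is a minimal sketch of the FIPS 198-1 construction; the function name and the key and message values are illustrative only.

```python
import hashlib
import hmac

BLOCK = 64  # SHA-256 block size in bytes

def hmac_sha256_by_hand(key, msg):
    # Branch 1: keys longer than one block are hashed first.
    if len(key) > BLOCK:
        key = hashlib.sha256(key).digest()
    # Branches 2 and 3: shorter keys are zero-padded; exact-size keys pass through.
    k0 = key.ljust(BLOCK, b"\x00")
    ipad = bytes(b ^ 0x36 for b in k0)
    opad = bytes(b ^ 0x5C for b in k0)
    inner = hashlib.sha256(ipad + msg).digest()
    return hashlib.sha256(opad + inner).digest()

msg = b"Sample message"
for key_len in (20, 64, 100):        # shorter than, equal to, longer than the block
    key = bytes(range(key_len))
    ours = hmac_sha256_by_hand(key, msg)
    ref = hmac.new(key, msg, hashlib.sha256).digest()
    print(key_len, "PASS" if ours == ref else "FAIL")
```

In RTL the long-key branch is the one most often left untested, because it requires an extra hash pass before the normal inner and outer passes begin.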

🗝️ KMAC (SP 800-185): Keyed hash built on cSHAKE. Provides explicit domain separation via a Customization String — more disciplined than raw SHA-3 for keyed operations. Verification checkpoint: domain-separation byte padding.
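The byte padding that carries the domain separation is built from four small encoding functions defined in SP 800-185. A sketch of them follows, together with the first rate block that KMAC128 absorbs for an empty customization string. The function names follow the standard's terminology; the rate of 168 bytes is the cSHAKE128 rate.

```python
def left_encode(x):
    """Length byte first, then x as big-endian bytes."""
    n = max(1, (x.bit_length() + 7) // 8)
    return bytes([n]) + x.to_bytes(n, "big")

def right_encode(x):
    """x as big-endian bytes, then the length byte."""
    n = max(1, (x.bit_length() + 7) // 8)
    return x.to_bytes(n, "big") + bytes([n])

def encode_string(s):
    """Prefix the string with its length in bits."""
    return left_encode(8 * len(s)) + s

def bytepad(x, w):
    """Prefix with left_encode(w), then zero-pad to a multiple of w bytes."""
    z = left_encode(w) + x
    return z + bytes(-len(z) % w)

RATE_128 = 168  # bytes, cSHAKE128 / KMAC128

# cSHAKE prefix block for function name N = "KMAC" and customization S = "".
prefix = bytepad(encode_string(b"KMAC") + encode_string(b""), RATE_128)
print(len(prefix), prefix[:12].hex())
```

Because this prefix occupies exactly one rate block, an RTL bug in the encoding shows up as a wrong state after the very first permutation, before any key or message byte is absorbed.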

🔑 RSA (FIPS 186-4 → 186-5): c = m^e mod n; signing uses PKCS#1 v1.5 or PSS; encryption uses OAEP. Verifying correctness of bignum, Montgomery reduction, and CRT optimization is the hardest challenge in SoC crypto verification. Verification checkpoint: intermediate values in modular exponentiation.
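To show what "intermediate values" means for RSA, here is a sketch of CRT-based private-key exponentiation that returns every intermediate alongside the result, checked against the direct computation. The parameters are a textbook toy key (p = 61, q = 53), far too small for real use, and the function name is hypothetical. `pow(q, -1, p)` needs Python 3.8 or later.

```python
def rsa_crt_trace(c, p, q, d):
    """Compute m = c^d mod (p*q) via CRT and expose the intermediates."""
    dp, dq = d % (p - 1), d % (q - 1)
    q_inv = pow(q, -1, p)             # modular inverse of q modulo p
    m1 = pow(c, dp, p)                # half-size exponentiation mod p
    m2 = pow(c, dq, q)                # half-size exponentiation mod q
    h = (q_inv * (m1 - m2)) % p       # Garner recombination step
    m = m2 + h * q
    return {"dp": dp, "dq": dq, "q_inv": q_inv, "m1": m1, "m2": m2, "h": h, "m": m}

# Toy key for illustration only.
p, q, e, d = 61, 53, 17, 2753
n = p * q
message = 65
c = pow(message, e, n)

steps = rsa_crt_trace(c, p, q, d)
for name, value in steps.items():
    print(f"{name:6s} = {value}")
print("matches direct c^d mod n:", steps["m"] == pow(c, d, n))
```

Comparing `m1`, `m2`, and `h` separately tells you whether a mismatch comes from one of the two exponentiation engines or from the recombination logic.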

🎯 7. Recommended Production Setup

Algorithm	Golden (Primary)	Oracle	Edge / Regression
SHA-3 / KMAC	XKCP `ref`	OpenSSL 3.x	Wycheproof JSON
AES	Custom C from FIPS 197 Appendix	OpenSSL (AES-NI disabled)	AESAVS + Wycheproof
SHA-2 / HMAC	NIST C Reference	OpenSSL	CAVP RSP + ACVP JSON
RSA	Fiat-Crypto or custom bignum	OpenSSL	RSA2VS + Wycheproof PSS

🛡️ Physical attack caveat: A software golden reference cannot prove resistance to side-channel attacks or fault injection. Power and timing leakage must be assessed separately using TVLA (Test Vector Leakage Assessment) — a t-test applied to gate-level simulation waveforms or physical measurements. Functional verification and security verification are separate tracks with separate budgets.
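The t-test behind TVLA can be sketched as Welch's t statistic computed independently at each sample point, comparing a fixed-input trace set against a random-input set. The 4.5 threshold is the value conventionally used with TVLA and is an assumption here, as are the function names and the tiny made-up traces; real assessments use many thousands of traces.

```python
from math import sqrt
from statistics import mean, variance

THRESHOLD = 4.5  # conventional TVLA pass/fail bound (assumed)

def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variance."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))

def tvla_max_t(fixed_traces, random_traces):
    """Largest |t| over all sample points; each trace is a list of samples."""
    n_points = len(fixed_traces[0])
    return max(
        abs(welch_t([tr[i] for tr in fixed_traces], [tr[i] for tr in random_traces]))
        for i in range(n_points)
    )

# Made-up power traces with two sample points each; point 1 leaks in the fixed set.
fixed_set = [[1.0, 2.0], [1.1, 2.1], [0.9, 1.9], [1.0, 2.0]]
random_set = [[1.0, 1.0], [1.1, 1.1], [0.9, 0.9], [1.0, 1.0]]

t_max = tvla_max_t(fixed_set, random_set)
print(f"max |t| = {t_max:.2f}:", "LEAK SUSPECTED" if t_max > THRESHOLD else "no evidence of leakage")
```

A result below the threshold only means no leakage was detected at this trace count; it is not a proof of resistance.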

📅 8. Standards Compliance Timeline

▶ Sep 2024: FIPS 140-3 certificate #4811 (first certification)

▶ Mar 2025: Certificate #4985 (OpenSSL 3.1.2)

▶ Oct 2025: OpenSSL 3.5.4 with PQC support submitted

▶ 2026+: FIPS 186-5 transition accelerates

🏁 The Four-Layer Golden Rule

The golden rule of SoC crypto IP verification is a four-layer combination: "NIST standard spec + clean C reference + DPI-C intermediate-value comparison + Wycheproof edge cases." OpenSSL is highly useful as a wrapper and oracle, but its platform-specific assembly dispatch — AES-NI, VAES, AVX-512 — makes it unsuitable as the bit-accurate golden model for RTL comparison.

For SHA-3/KMAC: golden = XKCP `ref`. For RSA: golden = Fiat-Crypto or custom code grounded in NIST examples. For AES/SHA-2: golden = C code that embeds the FIPS appendix intermediate values verbatim. Automate regression with ACVP JSON. This is the most robust setup as of 2026. And one thing not to overlook — functional verification and side-channel verification are entirely separate tracks. Budget for TVLA separately.

🧠 One-line summary: Standards: NIST. Golden: XKCP · Fiat · NIST examples. Oracle: OpenSSL. Edge cases: Wycheproof. Bridge: DPI-C. Regression: ACVP JSON — six things to remember.

📎 References

▶ NIST CSRC — Cryptographic Standards

▶ NIST ACVP-Server JSON Files

▶ XKCP — Keccak Code Package

▶ Keccak Team Official

▶ Wycheproof (C2SP)

▶ Fiat-Crypto

▶ NIST CAVP

This document is for informational purposes only. Actual IP certification and compliance must follow the latest guidelines from the relevant certification authority. All SoC designs intended for production must undergo review by a qualified security evaluation lab before tape-out.

Written based on publicly available data and cited sources. Last updated: June 8, 2026.
