miden-crypto/benches

Latest commit by Qyriad (b151773b0d): feat: implement concurrent Smt construction (#341)
* merkle: add parent() helper function on NodeIndex
* smt: add pairs_to_leaf() to trait
* smt: add sorted_pairs_to_leaves() and test for it
* smt: implement single subtree-8 hashing, w/ benchmarks & tests

This will be composed into depth-8-subtree-based computation of entire
sparse Merkle trees.

* merkle: add a benchmark for constructing 256-balanced trees

This is intended for comparison with the benchmarks from the previous
commit. This benchmark represents the theoretical perfect-efficiency
performance we could possibly (but impractically) get for computing
depth-8 sparse Merkle subtrees.

* smt: test that SparseMerkleTree::build_subtree() is composable

* smt: test that subtree logic can correctly construct an entire tree

This commit ensures that `SparseMerkleTree::build_subtree()` can
correctly compose into building an entire sparse Merkle tree, without
yet getting into potential complications concurrency introduces.

* smt: implement test for basic parallelized subtree computation w/ rayon

Building on the previous commit, this commit implements a test proving
that `SparseMerkleTree::build_subtree()` can be composed into itself not
just concurrently, but in parallel, without issue.

* smt: add from_raw_parts() to trait interface

This commit adds a new required method to the SparseMerkleTree trait,
to allow generic construction from pre-computed parts.

This will be used to add a generic version of `with_entries()` in a
later commit.
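A hypothetical sketch of what such a trait extension might look like (the trait name here reuses a simplified `SparseTree`, and the `ToyTree` type, node type, and exact signature are invented for illustration; the real `SparseMerkleTree` trait in the crate differs):

```rust
// Hypothetical shape of a trait with a from_raw_parts() constructor.
// Names and types are illustrative, not the actual miden-crypto API.
trait SparseTree: Sized {
    type Node;

    // Build the tree generically from pre-computed parts, skipping the
    // per-entry insertion work a normal constructor would do.
    fn from_raw_parts(inner_nodes: Vec<Self::Node>, root: Self::Node) -> Self;
}

// Minimal stand-in implementation to show how a concrete tree type
// would satisfy the trait.
struct ToyTree {
    nodes: Vec<u64>,
    root: u64,
}

impl SparseTree for ToyTree {
    type Node = u64;

    fn from_raw_parts(inner_nodes: Vec<u64>, root: u64) -> Self {
        ToyTree { nodes: inner_nodes, root }
    }
}

fn main() {
    // Pre-computed parts go straight into the tree, no insertions needed.
    let t = ToyTree::from_raw_parts(vec![1, 2, 3], 42);
    assert_eq!(t.root, 42);
    assert_eq!(t.nodes.len(), 3);
}
```

The point of such a method is that a parallel builder can compute all inner nodes up front and then hand them to the tree type in one step.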

* smt: add parallel constructors to Smt and SimpleSmt

What the previous few commits have been leading up to: SparseMerkleTree
now has a function to construct the tree from existing data in parallel.
This is significantly faster than the single-threaded equivalent.
Benchmarks incoming!
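The subtree composition described above can be sketched with plain `std::thread` and a toy merge function. This is only an illustration of the composability property, assuming power-of-two inputs: the crate uses RPO hashing and depth-8 subtrees, while `merge` and `build_subtree` here are invented stand-ins, not the crate's API.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::thread;

// Toy stand-in for a cryptographic 2-to-1 merge (NOT the crate's RPO hash).
fn merge(a: u64, b: u64) -> u64 {
    let mut h = DefaultHasher::new();
    (a, b).hash(&mut h);
    h.finish()
}

// Fold a power-of-two slice of leaves up to a single subtree root,
// pairwise per level. Each call is independent of every other call,
// which is what makes subtrees hashable on separate threads.
fn build_subtree(mut level: Vec<u64>) -> u64 {
    while level.len() > 1 {
        level = level.chunks(2).map(|p| merge(p[0], p[1])).collect();
    }
    level[0]
}

fn main() {
    // 256 leaves split into 32 subtrees of 8 leaves each (depth-3
    // subtrees here, standing in for the depth-8 subtrees in the crate).
    let leaves: Vec<u64> = (0..256).collect();

    // Sequential: fold the whole tree in one go.
    let sequential_root = build_subtree(leaves.clone());

    // Parallel: hash each 8-leaf subtree on its own thread, then fold
    // the resulting subtree roots into the final root.
    let handles: Vec<_> = leaves
        .chunks(8)
        .map(|c| {
            let c = c.to_vec();
            thread::spawn(move || build_subtree(c))
        })
        .collect();
    let roots: Vec<u64> = handles.into_iter().map(|h| h.join().unwrap()).collect();
    let parallel_root = build_subtree(roots);

    // Composability: both strategies must agree on the root.
    assert_eq!(sequential_root, parallel_root);
}
```

Because the chunk boundaries align with subtree boundaries, the parallel fold visits exactly the same tree as the sequential one, so the roots match.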

---------

Co-authored-by: krushimir <krushimir@reilabs.co>
Co-authored-by: krushimir <kresimir.grofelnik@reilabs.io>
2024-12-04 10:54:41 -08:00
hash.rs docs: update changelog and readme 2024-02-14 11:52:40 -08:00
merkle.rs feat: implement concurrent Smt construction (#341) 2024-12-04 10:54:41 -08:00
README.md docs: update changelog and readme 2024-02-14 11:52:40 -08:00
smt-subtree.rs feat: implement concurrent Smt construction (#341) 2024-12-04 10:54:41 -08:00
smt-with-entries.rs feat: implement concurrent Smt construction (#341) 2024-12-04 10:54:41 -08:00
smt.rs fix: clippy warnings (#280) 2024-02-21 20:55:02 -08:00
store.rs fix: clippy warnings (#280) 2024-02-21 20:55:02 -08:00

Miden VM Hash Functions

In the Miden VM, we make use of several different hash functions. Some of these are "traditional" hash functions, like BLAKE3, which are optimized for out-of-STARK performance, while others are algebraic hash functions, like Rescue Prime, which are optimized for performance inside the STARK. In what follows, we benchmark several such hash functions and compare them against constructions used by other proving systems. More precisely, we benchmark:

  • BLAKE3 as specified here and implemented here (with a wrapper exposed via this crate).
  • SHA3 as specified here and implemented here.
  • Poseidon as specified here and implemented here (but in pure Rust, without vectorized instructions).
  • Rescue Prime (RP) as specified here and implemented here.
  • Rescue Prime Optimized (RPO) as specified here and implemented in this crate.
  • Rescue Prime Extended (RPX), a variant of the xHash hash function, as implemented in this crate.

Comparison and Instructions

Comparison

We benchmark the above hash functions in two scenarios. The first is 2-to-1 hashing, (a, b) ↦ h(a, b), where a, b, and h(a, b) are digests of the corresponding hash function. The second is sequential hashing, where we take a sequence of 100 field elements and hash it into a single digest. The digests are 4 field elements in a prime field with modulus 2^{64} - 2^{32} + 1 (i.e., 32 bytes) for Poseidon, Rescue Prime, and RPO, and an array [u8; 32] for SHA3 and BLAKE3.
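The two scenarios can be sketched in Rust with a toy mixing function over the stated field. Only the modulus and the digest shape (4 field elements) come from the text above; `hash_2to1` and `hash_elements` are illustrative stand-ins, not RPO or any other real permutation.

```rust
// Prime modulus from the text above: p = 2^64 - 2^32 + 1.
const P: u64 = 0xffff_ffff_0000_0001;

// A digest is 4 field elements (4 * 8 bytes = 32 bytes).
type Digest = [u64; 4];

// Toy 2-to-1 mix (NOT a real algebraic hash): combines two digests
// into one, keeping every limb reduced mod p.
fn hash_2to1(a: Digest, b: Digest) -> Digest {
    let mut out = [0u64; 4];
    for i in 0..4 {
        // widen to u128 so sums and products cannot overflow before reduction
        let x = (a[i] as u128 + b[i] as u128) % P as u128;
        out[i] = ((x * 31 + i as u128 + 1) % P as u128) as u64;
    }
    out
}

// Scenario 2 shape: fold a stream of field elements into one digest
// by repeated 2-to-1 calls over 4-element blocks.
fn hash_elements(elems: &[u64]) -> Digest {
    let mut acc = [0u64; 4];
    for chunk in elems.chunks(4) {
        let mut block = [0u64; 4];
        block[..chunk.len()].copy_from_slice(chunk);
        acc = hash_2to1(acc, block);
    }
    acc
}

fn main() {
    // Scenario 1: (a, b) -> h(a, b) on two digests.
    let d = hash_2to1([1, 2, 3, 4], [5, 6, 7, 8]);
    assert!(d.iter().all(|&x| x < P));

    // Scenario 2: sequential hashing of 100 field elements.
    let seq: Vec<u64> = (0..100).collect();
    let digest = hash_elements(&seq);
    assert!(digest.iter().all(|&x| x < P));
}
```

The benchmarks below time exactly these two call shapes, with the toy mix replaced by each real hash function.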

Scenario 1: 2-to-1 hashing h(a,b)

| Function            | BLAKE3 | SHA3   | Poseidon | Rp64_256 | RPO_256 | RPX_256 |
| ------------------- | ------ | ------ | -------- | -------- | ------- | ------- |
| Apple M1 Pro        | 76 ns  | 245 ns | 1.5 µs   | 9.1 µs   | 5.2 µs  | 2.7 µs  |
| Apple M2 Max        | 71 ns  | 233 ns | 1.3 µs   | 7.9 µs   | 4.6 µs  | 2.4 µs  |
| Amazon Graviton 3   | 108 ns |        |          |          | 5.3 µs  | 3.1 µs  |
| AMD Ryzen 9 5950X   | 64 ns  | 273 ns | 1.2 µs   | 9.1 µs   | 5.5 µs  |         |
| AMD EPYC 9R14       | 83 ns  |        |          |          | 4.3 µs  | 2.4 µs  |
| Intel Core i5-8279U | 68 ns  | 536 ns | 2.0 µs   | 13.6 µs  | 8.5 µs  | 4.4 µs  |
| Intel Xeon 8375C    | 67 ns  |        |          |          | 8.2 µs  |         |

Scenario 2: Sequential hashing of 100 elements h([a_0,...,a_99])

| Function            | BLAKE3 | SHA3   | Poseidon | Rp64_256 | RPO_256 | RPX_256 |
| ------------------- | ------ | ------ | -------- | -------- | ------- | ------- |
| Apple M1 Pro        | 1.0 µs | 1.5 µs | 19.4 µs  | 118 µs   | 69 µs   | 35 µs   |
| Apple M2 Max        | 0.9 µs | 1.5 µs | 17.4 µs  | 103 µs   | 60 µs   | 31 µs   |
| Amazon Graviton 3   | 1.4 µs |        |          |          | 69 µs   | 41 µs   |
| AMD Ryzen 9 5950X   | 0.8 µs | 1.7 µs | 15.7 µs  | 120 µs   | 72 µs   |         |
| AMD EPYC 9R14       | 0.9 µs |        |          |          | 56 µs   | 32 µs   |
| Intel Core i5-8279U | 0.9 µs |        |          |          | 107 µs  | 56 µs   |
| Intel Xeon 8375C    | 0.8 µs |        |          |          | 110 µs  |         |

Notes:

  • On Graviton 3, RPO_256 and RPX_256 are run with SVE acceleration enabled.
  • On AMD EPYC 9R14, RPO_256 and RPX_256 are run with AVX2 acceleration enabled.

Instructions

Before you can run the benchmarks, you'll need to make sure you have Rust installed. After that, to run the benchmarks for RPO and BLAKE3, clone the current repository, and from the root directory of the repo run the following:

cargo bench hash

To run the benchmarks for Rescue Prime, Poseidon, and SHA3, clone the following repository as above, then check out the hash-functions-benches branch, and from the root directory of the repo run the following:

cargo bench hash