Your definitive guide to zkVMs

Researched and written by our awesome engineering team - Aashutosh Rathi, Prudhvi Rampey, zk_cat and Kautuk during their time at Succinct’s ZK-Residency program that took place in October 2024.

The goal of this article is to provide objective performance metrics and also to share our subjective experience of building with these toolkits. We cover:

  1. Developer Experience

    1. Docs

    2. Support / Community

    3. Code / API

    4. Happiness Quotient

  2. Features

  3. Performance


General-purpose zkVMs have become a powerful force in quickly building scalable and privacy-preserving applications. Builders no longer need to master low-level circuit construction or complex zk mathematics. By supporting popular languages like Rust and its vast ecosystem, zkVMs help builders reduce development time by orders of magnitude. The idea is simple but potent: Write once in Rust, prove anywhere.

But with the number of zkVMs out there, choosing the right one can be a challenge. We sought to answer some critical questions:

  • What zkVMs are available to use, and how do they perform under real-world conditions?

  • What is the developer experience like for each one?

  • What nuances arise in working with different zkVMs?

  • Can we prove arbitrary Wasm logic inside a zkVM? (spoiler: Yes!)

This is why we conducted yet another detailed benchmarking exercise, focusing not only on performance but also on the practicality and developer experience of each zkVM. We selected 8 real-world cryptographic algorithms that are computationally expensive to run directly on-chain and ran them inside 6 zkVMs: SP1, RISC0, Jolt, Nexus, Delphinus and Powdr. You can view the algorithms and benchmark numbers here.

Ready, Set, Bench

We benchmarked the zkVMs listed above on commonly used operations of varying complexity and input size.

We initially planned to include Valida, but its toolchain wasn’t functional when we began this project. Now that Valida has released an improved Rust toolchain, we’ll revisit it and may follow up with a future article.

We chose the following algorithms to benchmark against:

  • nth Prime

  • ECDSA Verification

  • BLS Verification

  • BLS Aggregation

  • Keccak

  • Poseidon

  • Merklization

  • Merkle Inclusion Proof

To provide a roughly fair comparison, all measures of performance are based on clock cycle counts - each cycle representing a single iteration of a virtual RISC-V processor loop. Clock cycle counts remain consistent across runs on similar hardware, so higher cycle counts generally indicate longer proving times. Except for Delphinus, all the zkVMs here are based on the RISC-V ISA.

To help visualize our results, we've plotted the data on logarithmic scales. The y-axis in each graph shows cycle counts, while the x-axis corresponds to input size (n). In all comparisons, a lower number is better.

Algorithm 1: nth Prime

This deceptively simple algorithm challenges zkVMs, especially when calculating large primes like the 100,000th prime number.

Most zkVMs performed similarly and cluster close together. The Nexus VM is the exception: it requires significantly more cycles than the rest, indicating a substantial performance gap. Jolt, meanwhile, runs out of memory (OOM, read: our archenemy) at smaller input sizes than the others. All zkVMs show increasing cycle counts as input size grows, a trend that continues across the other algorithms.
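For reference, the workload can be sketched as a naive trial-division search. This is a minimal illustration rather than the benchmark code from our repository: the names `nth_prime` and `is_prime` are placeholders chosen for this example, and in a zkVM this logic would live in the guest program.

```rust
/// Returns true if `x` is prime, using trial division by odd numbers up to sqrt(x).
fn is_prime(x: u64) -> bool {
    if x < 2 {
        return false;
    }
    if x % 2 == 0 {
        return x == 2;
    }
    let mut d = 3u64;
    while d * d <= x {
        if x % d == 0 {
            return false;
        }
        d += 2;
    }
    true
}

/// Returns the n-th prime, 1-indexed (the 1st prime is 2).
fn nth_prime(n: u32) -> u64 {
    assert!(n >= 1, "n is 1-indexed");
    let mut found = 0u32;
    let mut candidate = 1u64;
    while found < n {
        candidate += 1;
        if is_prime(candidate) {
            found += 1;
        }
    }
    candidate
}

fn main() {
    for n in [1u32, 10, 100, 1_000] {
        println!("prime #{} = {}", n, nth_prime(n));
    }
}
```

The loop is branch-heavy and has no cryptographic primitives a precompile could speed up, which is likely why most zkVMs land close together on it.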

Algorithm 2: Keccak hashing

Precompiles / Accelerators

Precompiles (SP1) and accelerators (RISC0) are built-in functions within a virtual machine that execute specialized operations more efficiently than executing the same logic as a standard contract or bytecode, thereby reducing computational overhead and cycle counts. Both SP1 and RISC0 ship many prebuilt libraries for common operations.

The presence of precompiles for keccak clearly provides a performance advantage, as seen with SP1 (orange line). While the Nexus VM (purple line) remains inefficient compared to other zkVMs, and RISC0 lags as input size increases, SP1’s implementation with precompiles significantly reduces cycle counts. Among zkVMs without precompiles, SP1, Jolt, Powdr, and Delphinus perform similarly, further highlighting the impact of precompiles in enhancing efficiency.

Algorithm 3: Merklization

Once again, Nexus VM consistently shows the highest cycle counts. Without precompiles, the other zkVMs cluster together in mid-range performance. For larger inputs, cryptographic accelerators provide significant improvements—particularly with SHA-2 precompiles in SP1 and RISC0.
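To make the workload concrete, here is a minimal sketch of merklization plus the matching inclusion-proof check (our eighth benchmark). It is an illustration under loud assumptions, not our benchmark code: the standard library's `DefaultHasher` stands in for a real hash such as SHA-2 or Keccak and is NOT cryptographically secure, `Digest` is shrunk to a `u64`, and every function name is a placeholder. Pairing an odd node with itself is also a simplification; production trees usually handle that case more carefully.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Placeholder digest type; a real tree would use a 32-byte cryptographic hash.
type Digest = u64;

fn hash_leaf(data: &[u8]) -> Digest {
    let mut h = DefaultHasher::new();
    0u8.hash(&mut h); // domain tag: leaf
    data.hash(&mut h);
    h.finish()
}

fn hash_pair(left: Digest, right: Digest) -> Digest {
    let mut h = DefaultHasher::new();
    1u8.hash(&mut h); // domain tag: inner node
    left.hash(&mut h);
    right.hash(&mut h);
    h.finish()
}

/// Builds every level of the tree; level 0 holds the leaf hashes.
/// A node without a sibling is paired with itself.
fn build_levels(leaves: &[&[u8]]) -> Vec<Vec<Digest>> {
    assert!(!leaves.is_empty(), "need at least one leaf");
    let mut levels = vec![leaves.iter().map(|l| hash_leaf(l)).collect::<Vec<Digest>>()];
    while levels.last().unwrap().len() > 1 {
        let prev = levels.last().unwrap();
        let next: Vec<Digest> = prev
            .chunks(2)
            .map(|c| hash_pair(c[0], *c.get(1).unwrap_or(&c[0])))
            .collect();
        levels.push(next);
    }
    levels
}

fn merkle_root(levels: &[Vec<Digest>]) -> Digest {
    levels.last().unwrap()[0]
}

/// Collects the sibling hashes on the path from leaf `index` to the root.
fn prove(levels: &[Vec<Digest>], mut index: usize) -> Vec<Digest> {
    let mut path = Vec::new();
    for level in &levels[..levels.len() - 1] {
        let sibling = index ^ 1;
        path.push(*level.get(sibling).unwrap_or(&level[index]));
        index /= 2;
    }
    path
}

/// Recomputes the root from a leaf and its sibling path.
fn verify(root: Digest, leaf: &[u8], mut index: usize, path: &[Digest]) -> bool {
    let mut acc = hash_leaf(leaf);
    for sibling in path {
        acc = if index % 2 == 0 {
            hash_pair(acc, *sibling)
        } else {
            hash_pair(*sibling, acc)
        };
        index /= 2;
    }
    acc == root
}

fn main() {
    let leaves: [&[u8]; 5] = [b"a", b"b", b"c", b"d", b"e"];
    let levels = build_levels(&leaves);
    let root = merkle_root(&levels);
    let path = prove(&levels, 2);
    println!("leaf 2 included: {}", verify(root, leaves[2], 2, &path));
}
```

Almost all of the work is repeated hashing, so swapping the software hash for a precompiled one removes most of the cycles, which matches what the SHA-2 precompile results show.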

Algorithm 4: BLS Signature verification

BLS signature verification plays a crucial role in various on-chain operations, from light client verification to account abstraction.

In these benchmarks, Nexus demonstrates higher cycle counts, while Jolt struggles with Out-of-Memory errors and fails to execute.

Among the remaining VMs, performance is tightly matched, except for Delphinus, which uses significantly more cycles. Notably, SP1 with precompiles again provides a remarkable performance boost, highlighting the effectiveness of precompiles.

Other Benchmarks

We’ve also gathered performance data on additional algorithms, including Poseidon hashing and BLS signature aggregation. Due to the volume and complexity of these results, we can’t include them all within this blog post. To provide comprehensive access to these metrics, we’ve made a separate spreadsheet available, ensuring interested readers can explore the full breadth of our findings.

Find it here.

Wasm is awsm?

The ability to prove arbitrary Wasm logic opens up zkVMs to a range of languages that compile to WebAssembly, such as Go, Python, and JavaScript. We conducted a benchmark to evaluate the feasibility of proving Wasm and to quantify the cycle count required.

To achieve this, we compiled Rust code to WebAssembly and tested two approaches:

  • Using wasmi, we interpreted WebAssembly within the zkVM

  • Leveraging Delphinus’s zkWASM VM to trace and prove Wasm instructions
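To give an intuition for the first approach, the sketch below is a toy stack-machine interpreter. It is not wasmi and does not execute real Wasm: the `Op` enum and the `run` function are made up for this example. The point it illustrates is that an interpreter running inside a zkVM pays for a fetch-and-dispatch step on every guest instruction, on top of the actual arithmetic.

```rust
/// A made-up three-instruction stack machine, loosely in the spirit of Wasm.
#[derive(Clone, Copy)]
enum Op {
    Push(i64),
    Add,
    Mul,
}

/// Runs `program` and returns (result, number of instructions dispatched).
fn run(program: &[Op]) -> (i64, u64) {
    let mut stack: Vec<i64> = Vec::new();
    let mut dispatched = 0u64;
    for op in program {
        dispatched += 1; // every instruction goes through this dispatch step
        match *op {
            Op::Push(v) => stack.push(v),
            Op::Add => {
                let b = stack.pop().expect("stack underflow");
                let a = stack.pop().expect("stack underflow");
                stack.push(a + b);
            }
            Op::Mul => {
                let b = stack.pop().expect("stack underflow");
                let a = stack.pop().expect("stack underflow");
                stack.push(a * b);
            }
        }
    }
    (stack.pop().expect("program left no result"), dispatched)
}

fn main() {
    // Computes (2 + 3) * 4 on the toy machine.
    let program = [Op::Push(2), Op::Push(3), Op::Add, Op::Push(4), Op::Mul];
    let (value, steps) = run(&program);
    println!("result = {}, dispatched instructions = {}", value, steps);
}
```

Natively, the same expression is a couple of arithmetic instructions; interpreted, each of the five operations also costs a match, stack pushes and pops, and bounds checks, all of which the zkVM has to prove.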

This graph illustrates the performance differences between native Rust, Wasmi and zkWASM for various algorithms.

Key insights include:

  • Performance Gap: Across all algorithms, native Rust outperforms both wasm competitors.

  • Impact of Cryptographic Complexity: For simpler algorithms like nth prime, the performance gap is about an order of magnitude. However, for complex cryptographic tasks like BLS verification, this gap expands to two orders of magnitude.

  • zkWASM Advantage: Delphinus’s zkWASM achieves cycle counts that are relatively closer to native Rust compared to running Wasmi within a general-purpose zkVM, making it the leading option for proving Wasm applications.

  • Performance Trade-Off in Intensive Tasks: Wasm's flexibility comes at a cost, with significant overhead in cryptographic and complex operations compared to native execution.
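When we say "an order of magnitude", we mean the base-10 logarithm of the ratio between two cycle counts. A small sketch of that arithmetic follows; the function name is a placeholder and the cycle counts in `main` are hypothetical round numbers, NOT measurements from our benchmark.

```rust
/// Gap between two cycle counts in orders of magnitude: log10(other / baseline).
fn orders_of_magnitude(baseline_cycles: u64, other_cycles: u64) -> f64 {
    (other_cycles as f64 / baseline_cycles as f64).log10()
}

fn main() {
    // Hypothetical cycle counts, used only to illustrate the arithmetic.
    let native = 1_000_000u64;
    let interpreted = 10_000_000u64;
    println!(
        "gap = {:.1} orders of magnitude",
        orders_of_magnitude(native, interpreted)
    );
}
```

On the logarithmic plots used throughout this post, a gap of one order of magnitude shows up as one major gridline between two curves.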

To mitigate these costs, optimizations could include compiling Wasm directly to RISC-V assembly rather than interpreting it and integrating precompiles through the Wasm Component model.

Letting devs cook

Building within a zkVM is much like working with an embedded system: keeping dependencies minimal is crucial. We prioritized lean crate setups by setting default-features = false and incrementally enabling only necessary features. Additionally, we had to be cautious with serialization/deserialization for complex inputs and carefully manage memory, particularly for heavy tasks.

Given these constraints, it's important to have a smooth setup process, thorough documentation, and responsive developer support to capitalize on the productivity gains that zkVMs promise. Here are some key insights on developer experience from a month of working with various zkVMs:

  • Reliability: SP1 and RISC0 provided the most reliable experiences—everything simply worked. Other VMs sometimes crashed or encountered out-of-memory (OOM) issues when handling larger input sizes.

  • Documentation: RISC0 excelled with clear, well-structured documentation, including concept explanations and detailed diagrams. SP1 and Jolt also provided solid resources, while other VMs fell short in this area.

  • SDK: Jolt had the most intuitive SDK, allowing us to annotate provable functions with a simple macro. The other zkVMs followed a similar SDK pattern, making them relatively easy to adapt to, and most of them are moving toward Jolt's approach to exposing guest program APIs. However, Delphinus’s initial setup was less straightforward, requiring us to dig into the codebase, which added complexity to the onboarding process.

  • Support: Delphinus led the way in developer support, with their team going above and beyond to resolve issues. SP1, RISC0 and Jolt also offered active community forums for troubleshooting. Even Nexus had good support. (Fun fact: the Nexus and Succinct teams are literally one block away from each other in SF.)

Overall, we had a lot of fun working with different VMs! We even reported multiple bugs and contributed to the development of these zkVMs throughout our project! (One inconvenience at a time 🕺🏻)

Scores

All the scores are based on testing results from October-November 2024. All the teams are exceptionally fast in shipping updates and

Summary

SP1

SP1 provides a smooth development experience with strong documentation, good APIs, and extensive precompile support. Its prover network is reliable, and the inclusion of standard library support and compatibility with multiple Rust crates simplifies development. Its default recursive STARK proof system can be slow, but the significantly lower cycle counts due to precompiles highlight its efficiency advantages.

  • Offers the most extensive set of precompiles.

  • Default proof system is a recursive STARK; Groth16 proving takes longer than on RISC0.

  • Better cycle tracking features than most.

  • Highly compatible with Rust crates, easing integration.

  • On-chain proving toolkit is available.

  • No direct WASM interface.

RISC0

RISC0 offers a smooth development experience, solid tooling, and decent documentation. Some precompiles and standard library support exist, but there are fewer precompiles than in SP1. While it lacks a polished GUI for its prover network and can be memory-heavy, RISC0 can produce Groth16 proofs more quickly. Its support is adequate, though not exceptional, and on-chain proving is possible.

  • Fewer precompiles than SP1.

  • Uses Groth16 for proofs, which yields smaller proofs.

  • Memory-heavy execution can lead to OOM issues.

  • No prover network web interface/GUI, though a network exists.

  • STARK-to-SNARK bridging only runs on x86.

  • On-chain proving supported but less polished.

Jolt

Jolt provides good developer experience and community support, but it faces severe memory issues leading to OOM errors. While it supports WASM verification and has tooling for stack configuration, the lack of precompiles and poor cycle tracing limit its efficiency. Despite some promise and ongoing fixes, its performance and reliability are currently lacking.

  • No precompiles, leading to higher cycle counts.

  • No Groth16 recursion, no on-chain verification.

  • Fixes have been proposed but are not yet merged.

Nexus

Nexus has good support and pluggable guest options but lacks a prover network, standard library, and efficient proof systems. Its Nova-based approach is slow, with large proofs and frequent OOM issues. The developer experience is subpar, and no precompiles or on-chain tooling are available. Parallel processing helps slightly, but overall it remains inefficient.

  • Uses Nova proofs, slow and large.

  • No prover network, all proving is manual.

  • No standard library support.

  • Parallel processing on multiple CPUs is possible.

  • Pluggable guest approach but poor DX.

  • Lacks any form of precompile optimizations.

Valida

Valida uses plonky3 proofs and can achieve high speed on its own operations, but the developer experience is complex. Its Rust SDK is immature, arithmetic operations are limited, and it only runs on Linux. Documentation exists but doesn’t translate into smooth usage. It lacks community support, features, and mod operations, restricting its utility.

  • Runs only on Linux.

  • Immature Rust SDK, limiting integration. (As of Nov 2024, an updated toolchain is available; we still need to test it.)

  • No mod operations.

  • Plonky3-based, can be very fast under constrained conditions.

  • Reading/writing IO is overly complex.

Delphinus

Delphinus uses direct Wasm and a GPU-based approach for proofs, resulting in very fast Wasm proving. However, circuit building is slow, and the developer experience requires custom Wasm development with limited documentation. While it supports importing Rust crates and comes with good customer support, proof generation is extremely slow, and certain features remain inaccessible. It provides standard library support but demands extensive setup and dedicated hardware.

  • Requires GPU for proof generation.

  • Direct Wasm approach for proving is fast but circuit building is slow.

  • Standard library supported.

  • Decent Rust crate integration.

  • Setup overhead is significant.

  • Has a “pro” plan for enhanced features and support.

Takeaways

  • Precompiles are essential: They significantly reduce cycle counts, and in turn, proving times. In this area, SP1 stands out as a clear leader.

  • Client-side proving isn’t quite there yet: zkVMs today are not ready for efficient client-side proving. This can change with modular VMs like Powdr providing swappable backends like the STWO prover.

  • SP1 and RISC0 stand out: As of November 2024, SP1 and RISC0 are the most mature zkVMs, offering dev tooling, access to prover networks, different proof types and much more. Other zkVMs are catching up though.

  • zkVMs are making zero-knowledge mainstream: We rapidly generated proofs for multiple complex algorithms, showing how zkVMs can tackle real-world challenges swiftly and securely.

  • Not a silver bullet: While zkVMs lower the barrier to entry, custom circuits will likely still yield better performance than general-purpose solutions.

  • Proving is still tough: Efficient proving often requires expensive, specialized hardware, presenting a resource challenge for many developers.

  • Prover networks have trade-offs: They can help scale proving but may introduce privacy considerations and complexity in implementation.

Final verdict

Both SP1 and RISC0 emerge as top-tier solutions. RISC0 offers strong documentation, a smooth developer experience, and efficient proof generation. However, SP1’s advantage lies in its broad precompile support and superior APIs, enabling lower cycle counts and a more streamlined workflow. This efficiency edge—alongside robust tooling, reliable prover networks, and compatibility with standard Rust crates—positions SP1 slightly ahead in the race, making it the standout choice for those prioritizing performance and developer convenience.

But…

While there are currently more zkVM projects than actual zk applications, this abundance of infrastructure initiatives serves as a foundational catalyst. As these zkVMs mature, they will lower the barriers to entry for developers, ultimately enabling a broader range of builders to leverage zero-knowledge proofs in their products and services. Over time, this will foster a thriving ecosystem, encouraging innovation and accelerating the adoption of zero-knowledge technologies across the industry.

Ultimately, general-purpose zkVMs are poised to lead the upcoming ZK revolution, setting new standards for efficiency, scalability, and accessibility!

What’s Next?

Succinct has released a whitepaper outlining a new, fully permissionless prover network protocol designed to enable anyone to contribute computational resources without centralized oversight.

Around the same time, RISC0 unveiled “Boundless,” its own prover network solution aiming to streamline distributed proving and verification at scale.

Additionally, Valida introduced a new Rust-based toolchain claiming significant performance improvements and better developer ergonomics.

We will be taking a closer look at these developments soon, as they promise to advance both efficiency and usability in zero-knowledge proof ecosystems.


Glossary

Benchmark detailed numbers - Spreadsheet

ZK Residency final presentation - Recording, Slides

You can find all the code used for benchmarking here - GitHub
