Writing Programs
This page explains how to author Rust programs that execute inside the Venus zkVM, and how to drive them with Venus's cargo-zisk CLI.
Project Skeleton
The fastest way to scaffold a new guest is the SDK helper:
This produces:
.
├── build.rs
├── Cargo.toml
├── .gitignore
├── guest
|   ├── src
|   |   └── main.rs
|   └── Cargo.toml
└── host
    ├── src
    |   └── main.rs
    ├── bin
    |   ├── compressed.rs
    |   ├── execute.rs
    |   ├── prove.rs
    |   ├── plonk.rs
    |   ├── verify-constraints.rs
    |   └── ziskemu.rs
    ├── build.rs
    └── Cargo.toml
The example program takes a number n as input and computes the SHA-256 hash n times.
Authoring a Guest
A Venus guest is a no_main Rust binary that depends on the ziskos runtime crate and registers its entry function with the ziskos::entrypoint! macro.
main.rs
// Compute SHA-256 hash `n` times sequentially.
#![no_main]
ziskos::entrypoint!(main);

use sha2::{Digest, Sha256};

fn main() {
    // Read the iteration count from the input.
    let n: u32 = ziskos::io::read();
    let mut hash = [0u8; 32];
    for _ in 0..n {
        // Each round hashes the previous round's digest.
        let mut hasher = Sha256::new();
        hasher.update(hash);
        hash = hasher.finalize().into();
    }
    // Expose the final digest as a public output.
    ziskos::io::commit(&hash);
}
Cargo.toml
[package]
name = "guest"
version = "0.1.0"
edition = "2021"
[dependencies]
sha2 = "0.10.8"
ziskos = { git = "https://github.com/cysic-labs/venus.git" }
Input / Output
Reading inputs:
let n: u32 = ziskos::io::read();
let my_data: MyStruct = ziskos::io::read(); // any `Deserialize` type
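Committing public outputs: pass a reference to the value to ziskos::io::commit, as the example guest does with ziskos::io::commit(&hash).
Committed values become public outputs that anyone verifying the proof can inspect.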
Building
You can run the guest natively (just like any Rust program) for development:
When you are ready to target the Venus zkVM, build with cargo-zisk:
The resulting ELF lands in target/elf/riscv64ima-zisk-zkvm-elf/release/<name> (or target/elf/riscv64ima-zisk-zkvm-elf/debug/<name> without --release).
Executing in the Emulator
ziskemu runs a guest ELF without generating a proof. Use it to validate behavior before committing time to a full prove run:
If you hit the step limit, raise it with -n:
Performance Metrics (-m)
Execution Statistics (-X)
Cost definitions:
AREA_PER_SEC: 1000000 steps
COST_MEMA_R1: 0.00002 sec
...
Total Cost: 12.81 sec
Main Cost: 4.27 sec 85308 steps
Mem Cost: 2.22 sec 222052 steps
...
Opcodes:
add: 1.12 sec (77 steps/op) (14569 ops)
xor: 1.06 sec (77 steps/op) (13774 ops)
...
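The per-opcode figures in the sample above appear to follow a simple relationship: the number of operations times the steps per operation, divided by AREA_PER_SEC. The following is a minimal sketch of that arithmetic. The formula is inferred from the sample numbers rather than taken from documented semantics, and the function names are hypothetical helpers, not part of ziskemu.

```rust
/// Steps that count as one second of cost, as printed under
/// "Cost definitions" in the sample output (AREA_PER_SEC: 1000000 steps).
const AREA_PER_SEC: f64 = 1_000_000.0;

/// Estimated cost in seconds for one opcode row.
/// Assumption: cost = ops * steps_per_op / AREA_PER_SEC (inferred from the sample).
fn opcode_cost_secs(ops: u64, steps_per_op: u64) -> f64 {
    (ops * steps_per_op) as f64 / AREA_PER_SEC
}

/// Percentage of the total cost attributable to one line item.
fn share_of_total(cost_secs: f64, total_secs: f64) -> f64 {
    100.0 * cost_secs / total_secs
}
```

With the sample values, opcode_cost_secs(14569, 77) gives about 1.12 sec for add and opcode_cost_secs(13774, 77) gives about 1.06 sec for xor, matching the printed rows. share_of_total(4.27, 12.81) shows that the Main Cost line accounts for roughly a third of the total.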
Generating a Proof
Once the guest runs correctly in the emulator, you can produce a real proof. The repo's Makefile shows the canonical end-to-end flow; the CLI invocations below are what those targets call under the hood.
Step 1 -- ROM Setup
Required once after the guest ELF is built (and any time it changes):
- -e -- ELF path.
- -k -- proving key directory.
ROM setup files are generated in ./build/provingKey (or $HOME/.zisk/cache if you installed the binaries to ~/.zisk/bin). Use cargo-zisk clean to drop the cache.
Step 2 -- Verify Constraints (Optional)
A fast sanity check that all circuit constraints are satisfied, without producing a full proof:
cargo-zisk verify-constraints \
    -e target/elf/riscv64ima-zisk-zkvm-elf/release/guest \
    -i host/tmp/input.bin \
    -k ./build/provingKey
If everything is correct, you will see:
[INFO ] CstrVrfy: All global constraints were successfully verified
[INFO ] CstrVrfy: All constraints were verified
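If a host-side script captures this output, a small check for the two success lines can gate the slower proving step. The sketch below assumes the log text matches the lines shown above; the exact wording may differ between versions, and constraints_verified is a hypothetical helper rather than anything shipped with Venus.

```rust
/// Returns true when the captured log contains both success messages
/// shown for a clean `cargo-zisk verify-constraints` run.
/// Assumption: the message text is stable; adjust the constants if it changes.
fn constraints_verified(log: &str) -> bool {
    const GLOBAL_OK: &str = "All global constraints were successfully verified";
    const ALL_OK: &str = "All constraints were verified";

    let has_global = log.lines().any(|line| line.contains(GLOBAL_OK));
    let has_all = log.lines().any(|line| line.contains(ALL_OK));
    has_global && has_all
}
```

Matching on substrings rather than whole lines keeps the check tolerant of the log prefix, such as the [INFO ] tag and component name.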
Step 3 -- Generate the Proof
cargo-zisk prove \
    -e target/elf/riscv64ima-zisk-zkvm-elf/release/guest \
    -i host/tmp/input.bin \
    -k ./build/provingKey \
    -o proof -a -y
- -e -- ELF path.
- -i -- input file.
- -k -- proving key directory.
- -o -- output directory.
- -a -- produce a final aggregated proof.
- -y -- verify the proof immediately after generation.
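A host program can assemble the same invocation with std::process::Command instead of a shell script. This is a minimal sketch that uses only the flags listed above; prove_command and render are hypothetical helper names, and the sketch builds the command without running it.

```rust
use std::process::Command;

/// Build (but do not run) the `cargo-zisk prove` invocation from Step 3.
fn prove_command(elf: &str, input: &str, proving_key: &str, out_dir: &str) -> Command {
    let mut cmd = Command::new("cargo-zisk");
    cmd.arg("prove")
        .args(["-e", elf])         // ELF path
        .args(["-i", input])       // input file
        .args(["-k", proving_key]) // proving key directory
        .args(["-o", out_dir])     // output directory
        .arg("-a")                 // produce a final aggregated proof
        .arg("-y");                // verify right after generation
    cmd
}

/// Render the program and its arguments as one line, for logging.
fn render(cmd: &Command) -> String {
    let mut parts = vec![cmd.get_program().to_string_lossy().into_owned()];
    parts.extend(cmd.get_args().map(|a| a.to_string_lossy().into_owned()));
    parts.join(" ")
}
```

Calling .status() on the returned Command would spawn the prover and wait for it to finish. Rendering the command first makes it easy to log exactly what is about to run.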
Successful output ends with:
Step 4 -- Verify the Proof
Concurrent Proof Generation (MPI)
Venus proofs can be generated using multiple processes concurrently to cut wall-clock time. Processes are launched via standard MPI (Message Passing Interface) and may run on the same server or across machines:
mpirun --bind-to none \
    -np <num_processes> \
    -x OMP_NUM_THREADS=<num_threads_per_process> \
    -x RAYON_NUM_THREADS=<num_threads_per_process> \
    target/release/cargo-zisk <args>
- <num_processes> -- how many processes to launch.
- <num_threads_per_process> -- threads per process, set via OMP_NUM_THREADS / RAYON_NUM_THREADS.
- --bind-to none -- let the OS schedule processes across cores for better load balancing.
Rule of thumb: <num_processes> * <num_threads_per_process> should match the number of available CPU cores (or 2x with hyperthreading). Memory usage scales linearly with <num_processes> (~25 GB per process).
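The rule of thumb can be turned into a small sizing helper. The sketch below caps the process count by available memory (using the approximate 25 GB per process figure) and by core count, then splits the cores evenly. MpiPlan and plan are hypothetical names, and the numbers are only as accurate as the rule of thumb itself.

```rust
/// Approximate memory per MPI process, from the rule of thumb (~25 GB).
const GB_PER_PROCESS: u64 = 25;

#[derive(Debug, PartialEq)]
struct MpiPlan {
    /// Value for `-np`.
    processes: u64,
    /// Value for OMP_NUM_THREADS and RAYON_NUM_THREADS.
    threads_per_process: u64,
}

/// Pick a process count no larger than requested that fits in memory and
/// does not exceed the core count, then divide the cores among processes.
/// Returns None when not even one process fits.
fn plan(cores: u64, ram_gb: u64, wanted_processes: u64) -> Option<MpiPlan> {
    let by_memory = ram_gb / GB_PER_PROCESS;
    let processes = wanted_processes.min(by_memory).min(cores);
    if processes == 0 {
        return None;
    }
    Some(MpiPlan {
        processes,
        threads_per_process: cores / processes,
    })
}
```

For example, on a 64-core machine with 256 GB of RAM, asking for 8 processes yields 8 processes with 8 threads each. Asking for 16 is cut down to 10 by the memory limit, with 6 threads each. On a hyperthreaded machine, pass the logical core count if you want to follow the 2x variant of the rule.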
GPU Proof Generation
Venus's GPU backend ships with several Cysic-contributed optimizations: cudaGraph integration, expression-evaluation kernel tuning, and shared-memory optimization for intermediate buffers.
The default make build already enables GPU support by running cargo build --release --features gpu. If you build manually, pass the same --features gpu flag.
Notes:
- GPU support is only available for NVIDIA GPUs.
- The CUDA Toolkit must be installed.
- Compile Venus directly on the server where it will run; the binary is optimized for the local GPU architecture.
- GPU memory is typically more limited than system memory. When combining GPU proving with MPI concurrency, ensure each process has enough VRAM headroom.