Understanding zk-SNARKs
A visual, intuition-first guide to Zero-Knowledge Succinct Non-Interactive Arguments of Knowledge - explained for engineers who want to truly understand how this cryptographic magic works.
📖 What You'll Learn
By the end of this guide, you'll understand how someone can prove they know a secret (like a password, a solution, or private data) without revealing anything about that secret - and you'll understand the mathematical machinery that makes this seemingly impossible feat possible.
Table of Contents
Prerequisites
You should be comfortable with basic programming concepts. We'll build up the math gradually - you don't need a cryptography background, but familiarity with concepts like hash functions and public-key cryptography helps.
Part 1 Why Zero-Knowledge Proofs Matter
NOTE: Skip this section if you already have some context on zero-knowledge proofs and their use cases.
Imagine you need to prove you're over 21 to enter a bar. Currently, you show your entire ID - revealing your name, address, exact birthdate, and more. But logically, the bouncer only needs one bit of information: "Is this person 21 or older?"
Zero-knowledge proofs solve exactly this kind of problem. They let you prove a statement is true without revealing why it's true or any additional information.
The "Where's Waldo?" Analogy
Suppose I claim I've found Waldo in a complex puzzle, and you want me to prove it without showing you where he is (so you can still enjoy finding him yourself).
Here's how: I take a huge piece of cardboard - much larger than the puzzle page - and cut a small Waldo-shaped hole in it. I position it over the puzzle so only Waldo is visible through the hole. You can verify I found Waldo, but you have no idea where on the page he is because the cardboard obscures everything else.
That's zero-knowledge in action: I proved my knowledge without revealing the actual information (Waldo's location).
The Three Properties of Zero-Knowledge Proofs
Completeness
If the statement is true and both parties follow the protocol honestly, the verifier will be convinced. A valid proof always succeeds.
Soundness
If the statement is false, no cheating prover can convince the verifier (except with negligible probability). You can't fake a proof.
Zero-Knowledge
The verifier learns nothing beyond the fact that the statement is true. No information about the secret "leaks" through the proof.
Real-World Applications
🔐 Privacy-Preserving Authentication
Prove you know a password without sending it. Even if someone intercepts the proof, they can't extract your password or replay the authentication.
💰 Private Cryptocurrency (Zcash)
Prove a transaction is valid (you have the funds, aren't double-spending) without revealing the sender, receiver, or amount. The blockchain verifies the proof without seeing the transaction details.
📊 Verifiable Computation
Outsource computation to an untrusted server and verify the result is correct without re-doing the work. The server provides a proof that the computation was performed correctly.
🗳️ Anonymous Voting
Prove you're an eligible voter and that your vote is valid, without revealing your identity or how you voted. The system can verify the election result without compromising voter privacy.
What Makes zk-SNARKs Special?
Let's break down the acronym:
zk = Zero-Knowledge
The verifier learns nothing beyond the validity of the statement.
S = Succinct
Proofs are tiny (a few hundred bytes) and quick to verify (milliseconds), regardless of how complex the computation being proved.
N = Non-Interactive
No back-and-forth between prover and verifier. The prover sends one message (the proof), and the verifier can check it independently.
ARK = Argument of Knowledge
Not only is the statement true, but the prover actually knowsthe secret witness that makes it true (not just that one exists).
The Central Challenge
How do we prove knowledge of something without revealing it? This sounds paradoxical. The answer lies in a beautiful mathematical construction usingpolynomials.
In the next part, we'll discover why polynomials are the perfect "medium of proof" - they have a remarkable property that lets us compress complex statements into simple checks.
Quick Check: Understanding the Basics
🤔 What does "zero-knowledge" mean?
Part 2 Polynomials - The Secret Weapon
Why polynomials? What makes them special for zero-knowledge proofs? The answer lies in a seemingly simple property that has profound implications.
The Key Insight: Polynomials Rarely Intersect
Think about what this means. A polynomial of degree 10 has only 10 possible intersection points with any other degree-10 polynomial. If we're working with numbers up to some large prime p, the probability of randomly hitting one of those intersection points is at most10/p - astronomically small.
Two degree-3 polynomials intersect at most 3 points. Outside these points, they diverge completely.
🎮 Explore this interactively →
How This Enables Proofs
Suppose a prover claims to know a polynomial p(x) that satisfies certain conditions. The verifier picks a random pointr and asks: "What is p(r)?"
If the prover really knows the correct polynomial, they can answer correctly. But if they're trying to cheat with a different polynomial, they'd have to be incredibly lucky to hit one of the few intersection points.
Simple Polynomial Verification Protocol
From Computation to Polynomial
Here's where it gets interesting. Any computation can be expressed as a polynomial equation. If the computation is correct, the polynomial equation holds. If someone cheated, the equation fails.
The Factory Inspection Analogy
Imagine you own a factory with 1000 assembly stations. Each station must follow specific rules. An inspector could check every station (expensive!), or they could randomly sample a few stations.
If the factory is compliant, every sample passes. If even one station is non-compliant, there's a chance the random sample catches it. With enough samples, the probability of catching cheaters approaches 100%.
Polynomials are like this, but better. Checking one random point is equivalent to sampling the entire polynomial. If any "part" is wrong, the single check will (almost certainly) catch it.
The Target Polynomial
Let's make this concrete. Suppose we have a computation withn steps. We create what's called atarget polynomial:
This polynomial is zero at exactly x = 1, 2, 3, ..., n - one root for each step of our computation.
Now, if our computation is correct, we can construct polynomials that represent the constraints. If all constraints are satisfied, those polynomials will be divisible by t(x).
Proof by Division
If p(x) encodes a correct computation, then:
for some quotient polynomial h(x). The prover provides both p(x) and h(x), and the verifier checks:
If this equality holds at a random point, the polynomials are almost certainly equal everywhere. The prover couldn't have faked it.
Interactive: Polynomial Division Check
Let's see polynomial division in action. We have a target polynomialt(x) = (x-1)(x-2) = x² - 3x + 2
Why This Matters for ZK Proofs
By encoding computation as polynomials and using random evaluation points, we get:
- Compression: A polynomial of degree 1,000,000 can be verified by checking just one point
- Soundness: Cheating requires guessing the random point, which is essentially impossible
- Structure: Complex computations map neatly to polynomial relationships
But we still have problems to solve. If the prover knows the evaluation point ahead of time, they could craft a fake polynomial that only works at that point. We need to hide the evaluation point - but then how can the prover evaluate their polynomial at it?
The answer: homomorphic encryption. That's what we'll explore in Part 3.
Part 3 Making It Non-Interactive & Zero-Knowledge
We have a problem. Our polynomial verification requires the verifier to send a random challenge r to the prover. This is interactive. For blockchain applications, we need the proof to be verifiable by anyone, at any time, without interaction.
The Hidden Evaluation Trick
What if we could have the prover evaluate their polynomial at a secret points that even the prover doesn't know?
This sounds impossible, but homomorphic encryption makes it real. The key insight: we can encrypt s in a way that allows computation on it without revealing it.
Homomorphic Encryption (Simplified)
We use an encryption function E with a special property:
You can multiply encrypted values, and the result is the encryption of their sum! This means you can compute on encrypted data without decrypting it.
🔐 How It Works
We use a generator g and define encryption as:
Because of the properties of exponents:
Going from gx back tox is the "discrete logarithm problem" - computationally infeasible for large numbers.
The Trusted Setup
Here's the clever construction. Before the proof system is used, atrusted setup generates encrypted powers of a secrets:
These are published as the Common Reference String (CRS). The secret s itself is destroyed - nobody knows it, not even the setup participants.
The trusted setup creates encrypted values that everyone uses. The original secret is destroyed.
Evaluating Polynomials on Encrypted Points
Here's the magic. Suppose the prover has polynomialp(x) = 3x² + 2x + 1. They want to computeE(p(s)) without knowing s.
Using the CRS values:
E(p(s)) = E(3s² + 2s + 1)
= E(s²)³ · E(s)² · E(1)¹ // Using homomorphic property
= (g^s²)³ · (g^s)² · g¹ // Substituting encryption
= g^(3s² + 2s + 1) // This equals E(p(s))!
The prover can evaluate their polynomial at the secret points by using only the encrypted values in the CRS. They never learn s itself.
The Verification
Remember, we need to verify that p(x) = t(x) · h(x). The prover provides E(p(s)) andE(h(s)).
The verifier checks:
But wait - how can we check multiplication of encrypted values? That's wherecryptographic pairings come in.
Cryptographic Pairings: The Missing Piece
A pairing is a function e that lets us check multiplicative relationships in encrypted form:
This lets us verify that encrypted values have the right relationship without decrypting them.
🎯 How Pairings Enable Verification
To check if p(s) = t(s) · h(s):
The left side equals e(g,g)^{p(s)}. The right side equals e(g,g)^{t(s)·h(s)}. They're equal only if p(s) = t(s) · h(s).
Making It Actually Zero-Knowledge
We've achieved non-interactive verification, but there's still a problem. The encrypted polynomial evaluations might leak information about the prover's secret values.
The solution: randomization. The prover adds random "blinding" factors to their proof:
Where δ is random. Sincet(x) is zero at all the important points, this doesn't change the validity of the proof. But it completely randomizes the polynomial, making it impossible to extract information from the proof.
Restricting the Prover: The α-Shift
There's one more security concern. What if the prover doesn't use the polynomial they claim? What if they just make up numbers that happen to pass the check?
We add an ingenious constraint. The CRS includes "shifted" versions of each power:
The prover must provide both E(p(s)) andE(α·p(s)). The verifier checks that they're related by factor α:
This only works if the prover actually computed both values from the same polynomial using the CRS values. They can't just make up numbers.
Non-Interactive Verification Protocol
The Toxic Waste Problem
This is why it's called "toxic waste" - it's dangerous if it leaks.
In practice, we use multi-party ceremonies. Many independent participants each contribute randomness. As long as at least one participant is honest and destroys their contribution, the final CRS is secure.
Now we have all the cryptographic machinery. But how do we turn an actual program into a polynomial? That's Part 4.
Part 4 From Programs to Polynomials (QAP)
We've established that polynomials can encode computations and be efficiently verified. But how do we convert a real program - with variables, operations, and logic - into polynomial form?
The answer is a construction called a Quadratic Arithmetic Program (QAP).
Step 1: Flatten the Computation
First, we break down the program into simple operations. Each operation has the form:
Any arithmetic circuit can be expressed this way. Let's trace through an example:
function compute(x):
return x * x * x + x + 5
// Flattened into simple operations:
sym_1 = x * x // x²
y = sym_1 * x // x³
sym_2 = y + x // x³ + x
out = sym_2 + 5 // x³ + x + 5
Each line becomes a constraint that must be satisfied. But we need to express additions within our multiplication framework. We rewrite:
sym_1 = x * x // constraint 1
y = sym_1 * x // constraint 2
sym_2 = (y + x) * 1 // constraint 3 (multiply by 1)
out = (sym_2 + 5) * 1 // constraint 4 (multiply by 1)
Step 2: R1CS (Rank-1 Constraint System)
Now we need to express each constraint mathematically. Every constraint has the form:
We encode this using three vectors: L (left), R (right), and O (output). These vectors act as "selectors" - they pick which variables participate in each side of the equation.
📊 Understanding R1CS Vectors
Step 1: List all variables in a fixed order:
Variable: [ 1, x, out, sym_1, y, sym_2 ]
↑ ↑ ↑ ↑ ↑ ↑
const input output temp temp temp
Step 2: For each constraint, create L, R, O vectors.
A 1 in position i means "include variable i".
Constraint 1: x × x = sym_1
L = [0, 1, 0, 0, 0, 0]
↑ ↑ ↑ ↑ ↑ ↑
1 x out sym_1 y sym_2
// This means: 0×(1) + 1×(x) + 0×(out) + 0×(sym_1) + 0×(y) + 0×(sym_2) = x
// What's on the RIGHT of ×? → x (also at index 1)
R = [0, 1, 0, 0, 0, 0] // Same as L, selects x
// What's the OUTPUT? → sym_1 (which is at index 3)
O = [0, 0, 0, 1, 0, 0] // The 1 is at position 3, selects sym_1
Verification with actual values:
If x = 3, then sym_1 = 9. Our variable values are [1, 3, 35, 9, 27, 30]:
R · values = 0×1 + 1×3 + 0×35 + 0×9 + 0×27 + 0×30 = 3
O · values = 0×1 + 0×3 + 0×35 + 1×9 + 0×27 + 0×30 = 9
// Check: L × R = O?
3 × 3 = 9 ✓
📊 A More Complex Constraint
Constraint 3: (y + x) × 1 = sym_2
This involves addition on the left side, and multiplication by constant 1.
L = [0, 1, 0, 0, 1, 0] // 1s at index 1 (x) AND index 4 (y)
↑ ↑ ↑ ↑ ↑ ↑
1 x out sym_1 y sym_2
// This computes: 0×(1) + 1×(x) + 0×(out) + 0×(sym_1) + 1×(y) + 0×(sym_2) = x + y
// RIGHT side is just the constant 1
R = [1, 0, 0, 0, 0, 0] // 1 at index 0 (the constant)
// OUTPUT is sym_2
O = [0, 0, 0, 0, 0, 1] // 1 at index 5 (sym_2)
Verification: With x=3, y=27, sym_2=30:
R · values = 1×1 + 0×3 + 0×35 + 0×9 + 0×27 + 0×30 = 1 // just the constant
O · values = 0×1 + 0×3 + 0×35 + 0×9 + 0×27 + 1×30 = 30
// Check: L × R = O?
30 × 1 = 30 ✓
Step 3: From R1CS to QAP
We have R1CS vectors for each constraint. But we want to check ALL constraints with a SINGLE verification. This is where polynomials come in.
What is Lagrange Interpolation?
Lagrange interpolation is a technique to find a polynomial that passes through specific points. Given points, it finds the unique polynomial of minimum degree that hits all of them.
🎯 Simple Example
Suppose we want a polynomial that:
- Equals 1 when x = 1
- Equals 0 when x = 2
Lagrange interpolation gives us: f(x) = 2 - x
f(1) = 2 - 1 = 1 ✓
f(2) = 2 - 2 = 0 ✓
That's it! It's just "find the polynomial that passes through these points."
Applying This to R1CS
Remember our constraints? Let's use a simpler example with just 2 constraints:
// Variables: [1, x, output] (indices 0, 1, 2)
// Constraint 1 (at evaluation point x=1): x × x = output
L₁ = [0, 1, 0] // selects x
R₁ = [0, 1, 0] // selects x
O₁ = [0, 0, 1] // selects output
// Constraint 2 (at evaluation point x=2): output × 1 = 9
// (This checks that output equals the public value 9)
L₂ = [0, 0, 1] // selects output
R₂ = [1, 0, 0] // selects constant 1
O₂ = [9, 0, 0] // constant 9 (coefficient 9 on the "1" variable)
The Key Idea: Evaluation Points
Here's the trick that makes everything work. We assign each constraint an"evaluation point":
Constraint 2 → evaluation point t = 2
Constraint 3 → evaluation point t = 3
...and so on
Now, for each variable, we want ONE polynomial that "remembers" how that variable participates in ALL constraints. Let's trace through variable x:
📊 Building a Polynomial for Variable x (Left Side)
Step 1: Look at x's coefficient in each constraint's L vector:
↑ ↑ ↑ (x participates!)
1 x out
Constraint 2: L₂ = [0, 0, 1] → x's coefficient = 0
↑ ↑ ↑ (x doesn't participate)
1 x out
Step 2: We want a polynomial L_x(t) that gives us these coefficients:
When we plug in t = 2 (constraint 2's point): L_x(2) should = 0
Step 3: Lagrange interpolation finds this polynomial:
// Verify it works:
L_x(1) = 2 - 1 = 1 ✓ (matches x's coef in constraint 1)
L_x(2) = 2 - 2 = 0 ✓ (matches x's coef in constraint 2)
The polynomial "stores" all the information! Plug in t=1, get constraint 1's data. Plug in t=2, get constraint 2's data.
We do this for EVERY variable, for L, R, and O:
L_1(t) = 0 // constant 1: coef 0 in both constraints
L_x(t) = 2 - t // x: coef 1 in constraint 1, coef 0 in constraint 2
L_out(t) = t - 1 // output: coef 0 in constraint 1, coef 1 in constraint 2
// R polynomials (right side participation):
R_1(t) = t - 1 // constant 1: coef 0 in constraint 1, coef 1 in constraint 2
R_x(t) = 2 - t // x: coef 1 in constraint 1, coef 0 in constraint 2
R_out(t) = 0 // output: doesn't appear on right in either
// O polynomials (output participation):
O_1(t) = 9·(t - 1) // for the "= 9" in constraint 2
O_x(t) = 0 // x: doesn't appear in output
O_out(t) = 2 - t // output: coef 1 in constraint 1, coef 0 in constraint 2
The QAP Check - How It All Comes Together
Now the prover has secret values. Let's say x = 3, so output = 9. The witness vector is w = [1, 3, 9].
Step 1: Combine the polynomials using the witness values:
L(t) = 1·L_1(t) + 3·L_x(t) + 9·L_out(t)
// R(t) = sum of (witness_i × R_i(t)) for all variables
R(t) = 1·R_1(t) + 3·R_x(t) + 9·R_out(t)
// O(t) = sum of (witness_i × O_i(t)) for all variables
O(t) = 1·O_1(t) + 3·O_x(t) + 9·O_out(t)
Step 2: Check that L(t) × R(t) - O(t) = 0 at all constraint points:
L(1) = 1·L_1(1) + 3·L_x(1) + 9·L_out(1)
= 1·(0) + 3·(1) + 9·(0) = 3 // This is x!
R(1) = 1·R_1(1) + 3·R_x(1) + 9·R_out(1)
= 1·(0) + 3·(1) + 9·(0) = 3 // This is x!
O(1) = 1·O_1(1) + 3·O_x(1) + 9·O_out(1)
= 1·(0) + 3·(0) + 9·(1) = 9 // This is output!
// Check: L(1) × R(1) - O(1) = 3 × 3 - 9 = 0 ✓
// At t = 2 (constraint 2: output × 1 = 9):
L(2) = ... = 9 // This is output!
R(2) = ... = 1 // This is the constant 1!
O(2) = ... = 9 // This is 9!
// Check: L(2) × R(2) - O(2) = 9 × 1 - 9 = 0 ✓
Step 3: The magic of polynomial divisibility:
🔑 The Key Insight
Since L(t) × R(t) - O(t) = 0 at t=1 AND t=2, the polynomialL(t) × R(t) - O(t) has roots at 1 and 2.
A polynomial with roots at 1 and 2 must be divisible by (t-1)(t-2).
We call t(t) = (t-1)(t-2) the target polynomial.
L(t) × R(t) - O(t) = t(t) × h(t) // for some quotient h(t)
// If computation is WRONG:
L(t) × R(t) - O(t) ≠ t(t) × h(t) // division has remainder!
t(t) divide L(t)×R(t) - O(t)?"If yes → ALL constraints are satisfied → computation is correct!
If no → at least one constraint failed → prover is cheating!
R1CS constraints become polynomials via Lagrange interpolation. Checking all constraints becomes one divisibility check.
Example: Proving Knowledge of x Where x³ + x + 5 = 35
Interactive: QAP Construction
We want to prove we know x such that x³ + x + 5 = 35 (the answer is x = 3). Let's trace through the constraint values:
Why This Encoding Works
Completeness
If the prover knows a valid witness (values satisfying all constraints), the polynomial equation holds, and they can produce a valid proof.
Soundness
If any constraint is violated, the polynomial won't be divisible by t(x). The prover can't create a valid h(x) quotient.
Zero-Knowledge
By adding random multiples of t(x), the prover hides their witness while preserving the divisibility property.
Putting It All Together
The complete transformation pipeline:
In Part 5, we'll put together the complete zk-SNARK protocol and see exactly what the prover computes and what the verifier checks.
Part 5 The Complete zk-SNARK Protocol
Let's bring everything together into the full protocol. We'll walk through exactly what happens at each stage.
Phase 1: Setup (Done Once Per Circuit)
🔧 Trusted Setup Ceremony
Input: The circuit (QAP polynomials)
Output: Proving key (pk) and Verification key (vk)
Generate Secrets
Choose random toxic waste: s (evaluation point), α (shift), β, γ, δ (security parameters)
Compute Encrypted Powers
Create CRS elements: E(1), E(s), E(s²), ..., E(s^d) and their α-shifted versions
Embed Circuit Structure
Compute encrypted evaluations of all L_i, R_i, O_i polynomials at the secret point s
Destroy Secrets
Securely delete s, α, β, γ, δ. Only the encrypted values remain.
Phase 2: Proving
🔐 Proof Generation
Input: Public inputs, private witness, proving key (pk)
Output: Proof π
Evaluate Polynomials
Using the witness values, compute L(s), R(s), O(s) via the encrypted values in pk
Compute Quotient
Calculate h(x) = (L(x)·R(x) - O(x)) / t(x) and evaluate h(s)
Add Randomness
Choose random r, z and add blinding factors for zero-knowledge
Construct Proof
Output π = (A, B, C) - three elliptic curve points encoding the encrypted polynomial evaluations
Phase 3: Verification
✓ Proof Verification
Input: Public inputs, proof π, verification key (vk)
Output: Accept or Reject
The verifier performs one pairing equation check:
This single equation verifies everything: polynomial divisibility, consistency checks, and public input binding. If it holds, accept. Otherwise, reject.
What Makes This Remarkable
✅ Proof Size
~200 bytes regardless of computation complexity
✅ Verification Time
~10ms regardless of what's being proved
✅ Privacy
Witness is completely hidden, only validity is revealed
⚠️ Trusted Setup
Requires ceremony; toxic waste must be destroyed
Security Properties
🛡️ Computational Soundness
A cheating prover would need to solve the discrete logarithm problem or find collisions in the pairing - both computationally infeasible.
🎭 Perfect Zero-Knowledge
The randomization (blinding factors) ensures that proofs are statistically indistinguishable from random, even to an adversary with unlimited computational power.
📚 Knowledge Extraction
The α-shift mechanism ensures the prover actually "knows" the witness. There's an extractor algorithm that could extract the witness given access to the prover's internal state.
Practical Considerations
The Full Picture
Recap: The Journey
We discovered:
- Polynomials can encode computations
- Random evaluation compresses verification
- Homomorphic encryption hides the evaluation point
- Pairings enable encrypted verification
- QAP transforms programs into polynomial equations
Where to Go From Here
- Groth16: The most efficient zk-SNARK implementation, used in Zcash and many other projects
- PLONK: A universal setup (one ceremony works for all circuits)
- zk-STARKs: No trusted setup required, post-quantum secure, but larger proofs
- Circom/SnarkJS: Practical tools for building zk-SNARK circuits
🎓 You Made It!
Understanding zk-SNARKs is a significant achievement. You now grasp one of the most elegant and powerful constructions in modern cryptography. This knowledge enables you to understand how privacy-preserving blockchains work, how verifiable computation functions, and opens the door to building zero-knowledge applications.