This is a note of the talk Demystifying Zero Knowledge Proof.

1 Theory

ZKP is the ability to prove honest computation without revealing inputs . ZKP is not privacy.

The blockchain is a kind of ZKP because one machine does the expensive computation, others can easily verify the proof and accept the new state. But what are the input?

Three popular methods are:

  • Snark: has trusted setup. Small proof size (288 bytes), medium prover time (2.3s), small verification time(10ms). Used in zcash.
  • Stark: big proof size (45kb), small prover time (1.6s), medium verification time(16ms).
  • Bulletproof: small proof size(1.3kB), big prover time(30s), big verification time (1.1s).

1.1 Diffie Hellman

In 1976, Diffie Hellman gave an algorihtm to create a shared secret:

  • Let p = 23 be the modulus, and g = 5 the base. Both p and g are public.
  • Alice chooses a secret a = 4, and sends Bob a message A = g ^ a mod p, i.e, A = 5 ^ 4 mod 23 = 4. A is public.
  • Bob chooses a secrete b = 3, and sends Alice a message B = g ^ b mode p, i.e., B = 5 ^ 3 mod 23 = 10. B is public.
  • Alice computes a secret s = B ^ a mod p = 10 ^ 4 mode 23 = 18. It is the same as B ^ a mod p = g ^ (b * a) mod p
  • Bob computes a secret s = A ^ b mod p = 4 ^ 3 mode 23 = 18. It is the same as A ^ b mode p = g ^ (a * b) mod p. The same as Alice’s secret.

Two key properties:

  • (g ^ a) ^ b) mod p is the same as g ^ b ^ a) mod p.
  • From A or B, it is computationally impossible to get a or b.

1.2 RSA

In 1977, the RSA algorithm based on Diffie Hellman’s algorithm:

  • choose two primes: p = 3, q = 5 to get a number: n = p * q = 15.
  • get least common multiplier (totient): lcm(p - 1, q - 1) = 8
  • choose a private key e such that e is a prime number and 1 < e < totient, for example, e = 3.
  • choose a public key d such that d * e mod (totient) = 1, for example, d = 3 satifies 3 * 3 mod 8 = 1.

There are two possible applications:

  • Signature
    • Alice signs a message m = 2 with a signature c: c = m ^ e mod n = 2 ^ 3 mod 15 = 8.
    • anyone can verify the signature c = 8 for message m using public key d: m = c ^ d mod n = 8 ^ 3 mod 15 = 2.
  • Asymmtric encryption/decryption
    • Bob encrypts a secret message m = 2 with Alice’s public key d: c = m ^ d mod n = 2 ^ 3 mod 15 = 8.
    • Bob sends c to Alice
    • Alice gets m by decrypting c using private key e: m = c ^ e mod n = 3 ^ 8 mod 15 = 2.

Again, the reason is that m = m ^ ((d * e) mod (totien)) mod n and m ^ d ^ e mod n = m ^ e ^ d mod n. The innovation here is a pair of (private, public) keys.

Given a private key e, the public d is calculated using the equation ed + x * totient = 1 that is solved by the Extended Euclidian Algorithm. It depends on the assumption that given n and d, it is impossible to find e because it is computationally impssible to find the the two factors of n.

1.3 Fiat-Shamir

in 1986, Fiat-Shamir proposed an interactive proof of knowledge that allows one to prove information about a number, without revealing the number.

  • g is a public number called the generator.
  • Alice wants to prove that she know prviate key x for public key y, such that y = g ^ x
  • Alice picks a random v, sends t such that t = g ^ v
  • Bob picks and sends a random c to Alice
  • Alice sends back r = v - cx
  • Boc checks t = g ^ r * (y ^ c)

The final formula is true. The process is that Alice gives three values: y is the public key of the private x, t is the public part of a random v, then r is the result of a function determined by x, v and Bob’s random number c. Then Bob can verify that Alice knows the private key x corresponding to the public key y.

Another wiew is that x is the preimage of hash value y. If c is the hash of a message, then (r, t) is Alice’s signature of the document that can be verfied by her public key.

1.4 ECC + Finite Fields

Finite fields (the modulo function) brings cycle to ECC. A curve and an order may have a cycle that is a group.

1.5 Pedersen Commitment

Also called Pedersen hash. You could use Q = vG to “hash” / hide transaction amount. But it is susceptible to “rainbow table” attach as one could precompute popular amount denominations. The solution is to add a blinding factor: Com(v) = vG + bH is called a ^Pedersen Commitment^ where G and H are generator points, b is a random number used as a ^blinding factor^. The Pedersen commitment has a property, if Com(v1) = v1G + b1H and Com(v2) = v2G + b2H, then Com(v1) + Com(v2) = v1G + b1H + v2G + b2H = (v1 + v2)G + (b1 + b2)H.

1.6 Bulletproofs (Range Proofs)

Proving that a number is within a range v is in [0, 2 ^ n). Any number can be represented as inner product of two vectors. For example, 5 = [1, 0, 1] * [2 ^ 2, 2 ^ 1, 2 ^ 0], another form of binary data. 5 needs 3 bits, therefore, it is in the range of [0, 2 ^ 3).

Let v = <a, 2 ^ n>, one needs to prove that a is correct and we can commit secret values using Pedersen commitments and prove we know the value using a technique similar to Fiat-Shamir.

2 SNARKS

There are many variants of SNARKs. For equation: x * y + 4 = 10, prover wants to prove that they know values x and y. It can be broken into ^arithmetic circuits^, three contstrains like: output1 = x * y, output1 + 4 = 10, and equality constrain.

A constraint has a left input, a right input and an output such that left(L) * right(R) = ouput(O). With 3 vectors: <L, v> * <R, v> = <O, v>, where v is the variable vector v = [constant, x, y, output1].

For contraint 1 x * y = output1: L = [0, 1, 0, 0], R = [0, 0, 1, 0], O = [0, 0, 0, 1]. For contraint 2 output1 + 4 = 10: L = [4, 0, 0, 1], R = [1, 0, 0, 0], O = [10, 0, 0, 10]

The goal is to create polynomials for each constraint using ^Lagrange interpolation^. For each elment of L, R, and O in each constraint, we have L[i, j], R[i, j] and O[i, j], where i is the index of the vector (the variable position), j is the index of contraint, and v is the value of the element. For example, L[1, 1] = 0 for first element of L, first contraint. L[1, 2] = 4 for first elemnt of L, second constraint. For the two constraints of the first element, we have two points: (1, 0) and (2, 4). Then y = 4x -4 is the line (wire) connecting the two points for the first element L1.

We can do this for each contraint, for each variable Li(x), Ri(x), and Oi(x). For example, L has four wires: L1(x) = 4x -4, L2(x) = -x + 2, L3(x) = 0, and L4(x) = x - 1.

For x in {1, 2}, we have L(x) * R(x) = O(x) because we only have two constraints. P = L(x) * R(x) - O(x) = T(x) * H(x). T(x) is a public known evaluation used for verfication. H(x) is provided by the ^Prover^ and divides L(x) * R(x) - O(x) evently (without remainder).

The Prover would send evaluation of x for L, R, O, and H polynomials. A challenger must ensure that x is hidden and Prover actaully uses its polynomials. The solution is to encrypt x as a linear combination of (g, s * g, ,,, s ^ d * g) where d is the maximum degree of the polynomials.

The SNARK protocol:

  • Setup: E(s) is known to everyone, T(x) is known to everyone. s is thrown away as ^toxic waste^.
  • Prover has L, R, O and H polynomials.
  • Prover sends E(L(s)), E(R(s)), E(O(s)), E(H(s)).
  • Anyone in the network can verify: E(L(s)) * E(R(s)) - E(O(s)) = E(T(s)) * E(H(s)).
  • The protocol inludes “blinding factors”.

3 DApps with zk-SNARKs

The scalablity ladder: decentralized exchanges, crypto exchanges, tranditional exchanges, payment providers, tokenized world.

Scaling blockchains. In layer 1, make the blockchain scalable, they are Eth2x, Dfinity, Polkadot, and Cosmos etc. Using two types of technology: 1) Sharding, Purning, State rent, WASM and 2) PoS, Casper, Light clients, ZPKs.

In layer2, make the DApp scalable, not depend on blockchain. there are Plasma Cash, Ignis, 0x and Starkware. Layer 2 solutions are posited in two axises: data on chain vs data off chain, fraud proof vs honesty proof. ZPK is off chain honesty proof solution.