This is part 2 of a note on the paper Why and How zk-SNARK Works. This second half describes a general-purpose ZKP.

1 Computation

If you take two polynomials f(x) and g(x) and multiply or add them, for example h(x) = f(x) * g(x), then evaluating h(x) at any x = r gives the product (or sum) of the evaluations f(r) and g(r).
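
A quick sanity check of this property (a minimal sympy sketch with arbitrary example polynomials):

from sympy import symbols, expand

x = symbols('x')
f = 2*x**2 + 3           # arbitrary example polynomial f(x)
g = x - 5                # arbitrary example polynomial g(x)
h = expand(f * g)        # h(x) = f(x) * g(x)

r = 7
assert h.subs(x, r) == f.subs(x, r) * g.subs(x, r)          # multiplication carries over
assert (f + g).subs(x, r) == f.subs(x, r) + g.subs(x, r)    # addition carries over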

In a computation of the form left_operand operator right_operand = output, if we can represent the operand values as polynomials, then from the above arithmetic properties we can obtain the result of the operation: l(x) operator r(x) = o(x) at x = a, which gives an operation polynomial l(x) operator r(x) - o(x) = 0, where a is a root and (x - a) is a cofactor. Here, t(x) = x - a is the target polynomial.

For example, for the computation 3 * 2 = 6, we have l(x) = 3x, r(x) = 2x, and o(x) = 6x, which evaluate to the corresponding values at a = 1, which is a root of l(x) * r(x) - o(x) = 6x^2 - 6x = 6x * (x - 1) = 0. Therefore, if the prover provides such polynomials l(x), r(x), o(x), the verifier will accept the proof as valid, since l(x) * r(x) - o(x) is divisible by t(x).
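
This single-operation example can be checked directly (a minimal sympy sketch):

from sympy import symbols, div, expand

x = symbols('x')
l, r, o = 3*x, 2*x, 6*x
p = expand(l*r - o)              # 6*x**2 - 6*x
t = x - 1                        # target polynomial with root a = 1
h, remainder = div(p, t, x)
assert remainder == 0            # p(x) is divisible by t(x)
print(h)                         # 6*x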

There are two reasons that p(x) = l(x) * r(x) - o(x) cannot be used directly. First, multiplication of encrypted values is not possible, because the pairing can only be used once and it is already needed for the “polynomial restriction” check. Second, the prover may change the polynomial. Therefore the knowledge-of-polynomial protocol must be adjusted. For a single multiplication operation, the relation is l(s) * r(s) - o(s) = t(s) * h(s). Because the subtraction is expensive, the relation is rearranged as l(s) * r(s) = t(s) * h(s) + o(s). The verifier receives pi = (g^l, g^r, g^o, g^h, g^l', g^r', g^o') and checks (an unencrypted sketch of the valid operation check follows the list):

  • Polynomial restriction check
    • e(g^l', g) = e(g^l, g^a)
    • e(g^r', g) = e(g^r, g^a)
    • e(g^o', g) = e(g^o, g^a)
  • Valid operation check: e(g^l, g^r) = e(g^t, g^h) * e(g^o, g)
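
Ignoring encryption, pairings and the alpha-shift, the valid operation check corresponds in the exponents to l(s) * r(s) = t(s) * h(s) + o(s). A minimal unencrypted sketch of that relation, reusing the 3 * 2 = 6 example (here s is just a stand-in number, not a trusted-setup secret):

from sympy import symbols

x = symbols('x')
l, r, o = 3*x, 2*x, 6*x
t = x - 1
h = 6*x                          # quotient from the division above
s = 123456789                    # stand-in evaluation point (the real s stays secret)

lhs = l.subs(x, s) * r.subs(x, s)
rhs = t.subs(x, s) * h.subs(x, s) + o.subs(x, s)
assert lhs == rhs                # the relation the pairing equation checks in the exponents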

The zero-knowledge component is still missing from the above checks. We also need to extend the scheme to prove multiple operations.

2 Operand and Output Polynomials

Multiple operations can be encoded in the same polynomials: a * b * c * d becomes a * b = r1; r1 * c = r2; r2 * d = r3, and so forth. For example, 2 * 1 * 3 * 2 is expressed as:

  • 2 * 1 = 2
  • 2 * 3 = 6
  • 6 * 2 = 12

The three formulas can be represented as polynomials, where x takes the values 1, 2, 3 for the three operations: l(x) passes through (1, 2), (2, 2), (3, 6); r(x) passes through (1, 1), (2, 3), (3, 2); and o(x) passes through (1, 2), (2, 6), (3, 12).

A method to construct operand and output polynomials that pass through specified points is called interpolation. Several approaches are available: a system of equations with unknown coefficients, Newton polynomials, Neville’s algorithm, Lagrange polynomials, the Fast Fourier transform, etc.

For example, in the system-of-equations method, there exists a unique polynomial of degree at most n with yet-unknown coefficients which passes through the given n + 1 points {(x_i, y_i)} for i in {1, ..., n + 1}, such that the polynomial evaluated at each x_i equals y_i.

In the above case, l(x) passes through (1, 2), (2, 2), (3, 6), so it has degree 2 with three unknown coefficients: ax^2 + bx + c = y. Solving the system gives l(x) = 2x^2 - 6x + 6, which passes through the specified points. Similarly, r(x) = (-3x^2 + 13x - 8) / 2 and o(x) = x^2 + x. We also have l(x) * r(x) = -3x^4 + 22x^3 - 56x^2 + 63x - 24 and l(x) * r(x) - o(x) = -3x^4 + 22x^3 - 57x^2 + 62x - 24.
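
These interpolations can be reproduced with a computer algebra system (a sympy sketch; it uses library interpolation instead of solving the equation system by hand):

from sympy import symbols, interpolate, expand

x = symbols('x')
l = interpolate([(1, 2), (2, 2), (3, 6)], x)     # 2*x**2 - 6*x + 6
r = interpolate([(1, 1), (2, 3), (3, 2)], x)     # -3*x**2/2 + 13*x/2 - 4
o = interpolate([(1, 2), (2, 6), (3, 12)], x)    # x**2 + x

print(expand(l * r))        # -3*x**4 + 22*x**3 - 56*x**2 + 63*x - 24
print(expand(l * r - o))    # -3*x**4 + 22*x**3 - 57*x**2 + 62*x - 24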

The polynomial p(x) = l(x) * r(x) - o(x) has roots at x in {1, 2, 3}, so t(x) = (x - 1)(x - 2)(x - 3) and h(x) = (l(x) * r(x) - o(x)) / t(x) = -3x + 4. This allows us to prove multiple operations at once.
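
This division can be verified directly (a sympy sketch):

from sympy import symbols, div, expand

x = symbols('x')
p = -3*x**4 + 22*x**3 - 57*x**2 + 62*x - 24      # l(x) * r(x) - o(x) from above
t = expand((x - 1) * (x - 2) * (x - 3))
h, remainder = div(p, t, x)
assert remainder == 0
print(h)                                         # -3*x + 4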

3 Variable Polynomial

If variables are used in our protocol, then because a prover can set any coefficients of a polynomial, he can break consistency by setting different values of a variable in different operations. We must ensure that any variable has a single value across every operation it is used in.

For example, x^2 - 3x + 3 passes through (1, 1), (2, 1) for the value a = 1, and 2x^2 - 6x + 6 passes through (1, 2), (2, 2) for the value a = 2. The first polynomial is proportional to the second with the ratio 1 : 2. Consequently, if a verifier needs to force the prover to set the same value in all operations, it should only be possible to modify the proportion, not the individual coefficients; the alpha-shift enforces this rule. It also means the prover can only scale the polynomial by some value, not modify it by adding or subtracting arbitrary values. Since any value a can be derived as a * 1, the polynomial should evaluate to 1 for every corresponding operation, i.e., l(x) = 1 for x = 1 and x = 2, so that a * l(x) evaluates to a.
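
A small sketch of this single-variable assignment (sympy; the value a = 5 is an arbitrary choice):

from sympy import symbols

x = symbols('x')
l = x**2 - 3*x + 3                     # evaluates to 1 at both operations (x = 1 and x = 2)
assert l.subs(x, 1) == 1 and l.subs(x, 2) == 1

a = 5                                  # arbitrary value to assign
assert all((a * l).subs(x, i) == a for i in (1, 2))    # a * l(x) assigns the same value everywhere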

If there are multiple variables, we can construct, for each variable, an operand variable polynomial that evaluates to 1 if the variable is used as an operand in the corresponding operation and to 0 otherwise. If a is used in operations {1, 2} and b is used in {3}, then l_a(x) passes through {(1, 1), (2, 1), (3, 0)} and l_b(x) passes through {(1, 0), (2, 0), (3, 1)}. Then we are able to set the value of each variable separately and simply add the results to obtain the operand polynomial. For example, if a = 3 and b = 2, we have 3 * l_a(x) + 2 * l_b(x). Let’s denote such a composite operand polynomial with an upper-case letter: L(x) = a * l_a(x) + b * l_b(x). The prover is not able to modify the provided variable polynomials by changing their coefficients, except by “assigning” values, because the necessary encrypted powers of s are not available separately with their alpha-shifts. The prover is also unable to add another polynomial (prevented by the alpha-shift) or multiply by another polynomial (encrypted multiplication is not possible in the pre-pairing space).
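
A small sketch of composing variable polynomials (sympy, continuing the example with a = 3 and b = 2):

from sympy import symbols, interpolate

x = symbols('x')
l_a = interpolate([(1, 1), (2, 1), (3, 0)], x)   # a is a left operand in operations 1 and 2
l_b = interpolate([(1, 0), (2, 0), (3, 1)], x)   # b is a left operand in operation 3

a, b = 3, 2
L = a * l_a + b * l_b
assert [L.subs(x, i) for i in (1, 2, 3)] == [3, 3, 2]    # left operands are a, a, b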

So far we have used evaluations of the unassigned variable polynomials to 1 or 0 as a means to signify whether the variable is used in an operation. More generally, other coefficients can be used, because we can interpolate polynomials through any necessary points. For variables v_i in {v_1, v_2, ..., v_n}, an operation has the form (c_l * v_l) operator (c_r * v_r) = c_o * v_o. In this polynomial representation, every operand at some distinct x is a sum of all operand variable polynomials such that only the single used variable has a non-zero value and all others are zero.

For example, if L(x) = a * l_a(x) + b * l_b(x) + c * l_c(x), then a * l_a(x) passes through {(1, a), (2, 0), (3, 0)}, b * l_b(x) passes through {(1, 0), (2, b), (3, 0)}, and c * l_c(x) passes through {(1, 0), (2, 0), (3, c)}, so that L(x) passes through {(1, a), (2, b), (3, c)}.

We can include any number of variables in each operand of an operation. For example, L(x) = (a + c) * l_a(x) + b * l_b(x) + c * l_c(x) passes through {(1, a + c), (2, b), (3, c)}. So instead of (c_l * v_l) operator (c_r * v_r) = c_o * v_o, an operation becomes (c_l_a * a + c_l_b * b + ...) operator (c_r_a * a + c_r_b * b + ...) = (c_o_a * a + c_o_b * b + ...).
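
The same composition works with symbolic values and several variables per operand (a sympy sketch of the L(x) example above):

from sympy import symbols, interpolate, simplify

x, a, b, c = symbols('x a b c')
l_a = interpolate([(1, 1), (2, 0), (3, 0)], x)
l_b = interpolate([(1, 0), (2, 1), (3, 0)], x)
l_c = interpolate([(1, 0), (2, 0), (3, 1)], x)

L = (a + c) * l_a + b * l_b + c * l_c
assert [simplify(L.subs(x, i)) for i in (1, 2, 3)] == [a + c, b, c]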

We have been focusing on multiplication and still need addition, subtraction and division. To compute a + b, we can express it as (a + b) * 1 = r, because our construction requires each operand to be a combination of coefficient * variable terms; here the value 1 is expressed as c_one * v_one, where the coefficient c_one = 1 is hardwired and the variable v_one is enforced to equal 1 through constraints. Subtraction is similar to addition but comes with a negative coefficient. The division a / b = r can be expressed as the multiplication b * r = a.
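
For instance (a tiny plain-Python sketch with hypothetical values; the variable one would be enforced to equal 1 by its own constraint):

# Addition a + b = r encoded as the multiplicative constraint (1*a + 1*b) * (1*one) = 1*r.
a, b = 4, 7
one = 1                    # the "one" variable, enforced to equal 1 elsewhere
r = a + b
assert (1*a + 1*b) * (1*one) == 1*r

# Division a / b = r re-expressed as the multiplication constraint b * r = a.
a, b, r = 12, 3, 4
assert b * r == a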

Note: the operation’s construction is also called a “constraint”, because the polynomial construction representing an operation does not compute the result per se; rather, it checks that the prover already knows valid variables (including the result) for the operation, i.e., the prover is constrained to provide consistent values, whatever they are.

4 Example Computation

Consider the following algorithm:

def calc(w, a, b):
  if w:
    return a * b
  else:
    return a + b

The algorithm can be expressed as f(w, a, b) = w * (a * b) + (1 - w) * (a + b), assuming w is either 0 or 1. In mathematical form it is w * (a * b) + (1 - w) * (a + b) = v, where v captures the result of the evaluation. It can be rearranged to have only two multiplications: w * (a * b - a - b) = v - a - b. To add a constraint that requires w to be binary, we use (w - 1)(w - 0) = 0, i.e., 1w * 1w = 1w. We therefore have three formulas (checked in the sketch after the list):

  • 1a * 1b = 1m
  • 1w * (1m + (-1)a + (-1)b) = 1v + (-1)a + (-1)b
  • 1w * 1w = 1w
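
A quick check that the three constraints hold for concrete inputs (a plain-Python sketch; the prover would supply w, a, b and the derived m, v):

def constraints_hold(w, a, b):
    # Witness values the prover would supply alongside the inputs.
    m = a * b
    v = w * (a * b) + (1 - w) * (a + b)
    return (
        1*a * 1*b == 1*m                                            # 1a * 1b = 1m
        and 1*w * (1*m + (-1)*a + (-1)*b) == 1*v + (-1)*a + (-1)*b  # second constraint
        and 1*w * 1*w == 1*w                                        # w is binary
    )

assert constraints_hold(1, 3, 2)    # multiplication branch: v = 6
assert constraints_hold(0, 3, 2)    # addition branch:       v = 5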

This totals 5 variables, with 2 used in left operands, 4 in right operands and 5 in outputs:

  • L(x) = a * l_a(x) + w * l_w(x)
  • R(x) = m * r_m(x) + a * r_a(x) + b * r_b(x) + w * r_w(x)
  • O(x) = m * o_m(x) + v * o_v(x) + a * o_a(x) + b * o_b(x) + w * o_w(x)

where each variable polynomial must evaluate to the corresponding coefficient for each of the 3 operations, or to 0 if the variable is not present in that operation’s operand or output:

  • l_a(1) = 1, l_a(2) = 0, l_a(3) = 0: l_a(x) = (1/2)x^2 - (5/2)x + 3
  • l_w(1) = 0, l_w(2) = 1, l_w(3) = 1: l_w(x) = (-1/2)x^2 + (5/2)x - 2
  • r_m(1) = 0, r_m(2) = 1, r_m(3) = 0: r_m(x) = -x^2 + 4x - 3
  • r_a(1) = 0, r_a(2) = -1, r_a(3) = 0: r_a(x) = x^2 - 4x + 3
  • r_b(1) = 1, r_b(2) = -1, r_b(3) = 0: r_b(x) = (3/2)x^2 - (13/2)x + 6
  • r_w(1) = 0, r_w(2) = 0, r_w(3) = 1: r_w(x) = (1/2)x^2 - (3/2)x + 1
  • o_m(1) = 1, o_m(2) = 0, o_m(3) = 0: o_m(x) = (1/2)x^2 - (5/2)x + 3
  • o_v(1) = 0, o_v(2) = 1, o_v(3) = 0: o_v(x) = -x^2 + 4x - 3
  • o_a(1) = 0, o_a(2) = -1, o_a(3) = 0: o_a(x) = x^2 - 4x + 3
  • o_b(1) = 0, o_b(2) = -1, o_b(3) = 0: o_b(x) = x^2 - 4x + 3
  • o_w(1) = 0, o_w(2) = 0, o_w(3) = 1: o_w(x) = (1/2)x^2 - (3/2)x + 1

The target polynomial is t(x) = (x - 1)(x - 2)(x - 3). We are now ready to prove the computation through polynomials.

First, choose input values for the function, for example w = 1, a = 3, b = 2. Second, calculate the values of the intermediary variables from the operations: m = a * b = 6 and v = w * (m - a - b) + a + b = 6. Then assign the variable values to the variable polynomials:

  • L(x) = a * l_a(x) + w * l_w(x) = x^2 - 5x + 7
  • R(x) = m * r_m(x) + a * r_a(x) + b * r_b(x) + w * r_w(x) = (1/2)x^2 - (5/2)x + 4
  • O(x) = m * o_m(x) + v * o_v(x) + a * o_a(x) + b * o_b(x) + w * o_w(x) = (5/2)x^2 - (25/2)x + 16

We have h(x) = (L(x) * R(x) - O(x)) / t(x) = (1/2)x - 2. This division would leave a remainder if inconsistent variable values were used. That is how knowledge of the variable values for a correct execution of the computation is proven at the level of polynomials.
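
The whole example can be reproduced end to end (a sympy sketch for the witness w = 1, a = 3, b = 2; polynomial and variable names follow the text):

from sympy import symbols, interpolate, div, expand

x = symbols('x')

def poly(points):
    # Interpolate a variable polynomial through its (operation index, coefficient) points.
    return interpolate(points, x)

# Variable polynomials for left operands, right operands and outputs.
l_a, l_w = poly([(1, 1), (2, 0), (3, 0)]), poly([(1, 0), (2, 1), (3, 1)])
r_m, r_a = poly([(1, 0), (2, 1), (3, 0)]), poly([(1, 0), (2, -1), (3, 0)])
r_b, r_w = poly([(1, 1), (2, -1), (3, 0)]), poly([(1, 0), (2, 0), (3, 1)])
o_m, o_v = poly([(1, 1), (2, 0), (3, 0)]), poly([(1, 0), (2, 1), (3, 0)])
o_a, o_b = poly([(1, 0), (2, -1), (3, 0)]), poly([(1, 0), (2, -1), (3, 0)])
o_w = poly([(1, 0), (2, 0), (3, 1)])

# Witness: inputs and intermediary values.
w, a, b = 1, 3, 2
m = a * b                        # 6
v = w * (m - a - b) + a + b      # 6

L = a*l_a + w*l_w
R = m*r_m + a*r_a + b*r_b + w*r_w
O = m*o_m + v*o_v + a*o_a + b*o_b + w*o_w

t = expand((x - 1) * (x - 2) * (x - 3))
h, remainder = div(expand(L*R - O), t, x)
assert remainder == 0            # divisible only for a consistent witness
print(expand(L))                 # x**2 - 5*x + 7
print(expand(R))                 # x**2/2 - 5*x/2 + 4
print(expand(O))                 # 5*x**2/2 - 25*x/2 + 16
print(h)                         # x/2 - 2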