Quadratic Arithmetic Program

1. zk-SNARK 和 QAP
- 1.1. QAP 的定义
- 1.2. zk-SNARK 中为什么使用 QAP
2. 转换为 QAP
3. Summary

1. zk-SNARK 和 QAP

零知识证明 zk-SNARK 在区块链中的应用最近几年比较活跃，其中往往用到 QAP（Quadratic Arithmetic Program）。

考虑问题：Verify 怎么验证 Prover 执行了下面代码，但 Prover 在证明过程中并不想泄露自己的输入值。这怎么才能实现呢？

function calc(w, arg1, arg2)
    if w then
        return arg1 * arg2
    else
        return arg1 + arg2
    end if
end function

zk-SNARK 的做法是：先把程序转换为等价的 QAP，如果 Prover 给出了 QAP 的一个解，则说明 Prover 确实执行了程序（注：QAP 问题本身是公开的，它的求解很困难，但验证某个解是否正确是容易的）。当然，由于 Prover 不想泄露自己的解，所以他给出的解是某种加密的形式，我们还要仔细设计一下验证过程。

本文仅仅介绍如何把程序转换为等价的 QAP，并不会介绍 Prover 如何加密解，也不介绍 Verify 如何验证这个加密解的有效性。

注：QSP/QAP 的概念由 Rosario Gennaro 等在 2012 年的论文 Quadratic Span Programs and Succinct NIZKs without PCPs 中提出。

1.1. QAP 的定义

所谓 QAP 就是 给定一系列的多项式以及一个目标多项式，是否能根据这一系列的多项式，求得它们的一个线性组合，刚好可以整除目标多项式。 即如下描述：

给定一系列多项式 \(\{l_i(x), r_i(x), o_i(x)\}_{i=0}^n\) ，目标多项式 \(t(x)\)
求一个线性组合 \(\{u_i\}_{i=1}^n\) ，使得下式成立 \[\left(l_0(x) + \sum_{i=1}^n u_i \cdot l_i(x)\right) \cdot \left(r_0(x) + \sum_{i=1}^n u_i \cdot r_i(x)\right) - \left(o_0(x) + \sum_{i=1}^n u_i \cdot o_i(x)\right) = t(x)h(x)\]

注：如果定义 \(u_0 = 1\) ，那么上面也可以表达为：求一个线性组合 \(\{u_i\}_{i=0}^n\) ，使得下式成立 \[\left(\sum_{i=0}^n u_i \cdot l_i(x)\right) \cdot \left(\sum_{i=0}^n u_i \cdot r_i(x)\right) - \left(\sum_{i=0}^n u_i \cdot o_i(x)\right) = t(x)h(x)\]
可记为 \(L(x) \cdot R(x) - O(x) = t(x) h(x)\)

这个问题如果给出了一个解 \(\{u_i\}_{i=0}^n\) （由于 \(u_0=1\) ，也可以省略这个固定值，把解记为 \(\{u_i\}_{i=1}^n\) ，解也称为 witness），那么验证很简单，直接除 \(t(x)\) 看能否整除即可验证解是否正确，但是想求解就很难。

zk-SNARK 的思路就是：将原问题的解转化为一个 QAP 问题的解，然后公开 QAP 问题，这样的话拥有解的人公布自己的 witness（当然是某种加密形式），其他人来验证解是否成立。

1.2. zk-SNARK 中为什么使用 QAP

在 zk-SNARK 中，Verifier 需要“快速地”验证 Prover 的提供的证明是否正确。 QAP 有个特点：验证 QAP 的解是否正确很容易，而求解 QAP 的解则很难。 zk-SNARK 中，把原问题的解转化为一个 QAP 问题的解后，Verifier 就可以“快速地”验证解是否正确了。

2. 转换为 QAP

下面还是以前面提到的程序以例：

function calc(w, arg1, arg2)
    if w then
        return arg1 * arg2
    else
        return arg1 + arg2
    end if
end function

介绍如何把它转换为 QAP。

2.1. 从程序到多项式

首先，我们将程序改写为下面等价的多项式，其中 \(v\) 是程序 \(calc(w,arg1,arg2)\) 返回值：

\begin{align*} w \times (arg1 \times arg2) + (1 - w) \times (arg1 + arg2) &= v \\ w \times (w - 1) &= 0 \end{align*}

我们需要将这上面两个多项式转换成一系列多项式 \(\{l_i(x), r_i(x), o_i(x)\}_{i=0}^n\) ，并且给出目标多项式 \(t(x)\) ，最后代入解 \(\{a_i\}_{i=0}^n\) 应该满足 \(\sum_{i=0}^n a_i \times (l_i(x) \times r_i(x) - o_i(x)) = t(x)h(x)\) 形式。

2.2. 直观的做法

在文章 Zero-Knowledge Proof (zk-SNARK, Pinocchio) 7.8 计算示例中介绍了把上面例子中程序转换为对应 QAP 的直观做法。

这种直观的做法不适合编程实现，我们需要一种标准的做法，请看下节。

2.3. 标准的做法

利用 R1CS（rank 1 constraint system）实现到 QAP 的转换。

2.3.1. R1CS 的定义

R1CS 是由一系列三向量组 \((\vec{a},\vec{b},\vec{c})\) 组成的序列，R1CS 有一个解向量 \(\vec{s}\) ，需要满足：
\[\langle \vec{a_i}, \vec{s} \rangle \times \langle \vec{b_i}, \vec{s} \rangle - \langle \vec{c_i}, \vec{s} \rangle = 0 \quad \textrm{for} \quad \forall i\]
其中 \(\langle \cdot, \cdot \rangle\) 表示两个向量的点积。

2.3.2. 从多项式到算术电路（Flatten）

把多项式写为 \(z = x \;(op) \; y\) 的形式，其中 \(op\) 是加减乘除等运算：

1: sym_1 = arg1 * arg2
2: sym_2 = w * sym_1
3: sym_3 = 1 - w
4: sym_4 = arg1 + arg2
5: sym_5 = sym_3 * sym_4
6: v = sym_2 + sym_5
7: w = w * w

上面的形式即算数电路（Arithmetic circuit），这个过程称为拍平（Flatten）。

为了将所有操作都转成 \(z = x \;(op) \; y\) 的形式，我们加入了 \(sym\_1, sym\_2, \cdots sym\_5\) 中间变量。

2.3.3. 从算术电路到 R1CS

有一种简单的方法可以把拍平后的算术电路转移为 R1CS。把电路中涉及的所有变量（包含中间变量）组成的下面向量作为解向量：
\[ \vec{s} = \begin{pmatrix} 1 \\ w \\ arg1 \\ arg2 \\ sym\_1 \\ sym\_2 \\ sym\_3 \\ sym\_4 \\ sym\_5 \\ v \end{pmatrix}\]
注意，上面 解向量 \(\vec{s}\) 的首个分量固定为数字 1， 这是为了编码常量而故意留的，下面会看到这点。

有了这个解向量后，我们可以按下面方法寻找出满足条件的一系列三向量组 \((\vec{a},\vec{b},\vec{c})\) 。

针对第 1 行的约束 \(arg1 \times arg2 = sym\_1\) ，可以设置：

\begin{align*} \vec{a_0} &= [0, 0, 1, 0, 0, 0, 0, 0, 0, 0] \\ \vec{b_0} &= [0, 0, 0, 1, 0, 0, 0, 0, 0, 0] \\ \vec{c_0} &= [0, 0, 0, 0, 1, 0, 0, 0, 0, 0] \end{align*}

它满足 R1CS 的要求： \(\langle \vec{a_0}, \vec{s} \rangle \times \langle \vec{b_0}, \vec{s} \rangle - \langle \vec{c_0}, \vec{s} \rangle = 0\) ，这容易验证，把 \(\vec{a_0},\vec{b_0},\vec{c_0}, \vec{s}\) 代入进来就是 \(arg1 \times arg2 - sym\_1 = 0\) ，这显然成立。

针对第 2 行的约束 \(w \times sym\_1 = sym\_2\) ，可以设置：

\begin{align*} \vec{a_1} &= [0, 1, 0, 0, 0, 0, 0, 0, 0, 0] \\ \vec{b_1} &= [0, 0, 0, 0, 1, 0, 0, 0, 0, 0] \\ \vec{c_1} &= [0, 0, 0, 0, 0, 1, 0, 0, 0, 0] \end{align*}

它也满足 R1CS 的要求： \(\langle \vec{a_1}, \vec{s} \rangle \times \langle \vec{b_1}, \vec{s} \rangle - \langle \vec{c_1}, \vec{s} \rangle = 0\) ，这容易验证，把 \(\vec{a_1},\vec{b_1},\vec{c_1}, \vec{s}\) 代入进来就是 \(w \times sym\_1 - sym\_2 = 0\) ，这显然成立。

针对第 3 行的约束 \((1 - w) \times 1 = sym\_3\) ，可以设置：

\begin{align*} \vec{a_2} &= [1, -1, 0, 0, 0, 0, 0, 0, 0, 0] \\ \vec{b_2} &= [1, 0, 0, 0, 0, 0, 0, 0, 0, 0] \\ \vec{c_2} &= [0, 0, 0, 0, 0, 0, 1, 0, 0, 0] \end{align*}

它也满足 R1CS 的要求： \(\langle \vec{a_2}, \vec{s} \rangle \times \langle \vec{b_2}, \vec{s} \rangle - \langle \vec{c_2}, \vec{s} \rangle = 0\) ，这容易验证，把 \(\vec{a_2},\vec{b_2},\vec{c_2}, \vec{s}\) 代入进来就是 \((1 - w) \times 1 - sym\_3 = 0\) ，这显然成立。

针对第 4 行的约束 \((arg1 + arg2) \times 1 = sym\_4\) ，可以设置：

\begin{align*} \vec{a_3} &= [0, 0, 1, 1, 0, 0, 0, 0, 0, 0] \\ \vec{b_3} &= [1, 0, 0, 0, 0, 0, 0, 0, 0, 0] \\ \vec{c_3} &= [0, 0, 0, 0, 0, 0, 0, 1, 0, 0] \end{align*}

它也满足 R1CS 的要求： \(\langle \vec{a_3}, \vec{s} \rangle \times \langle \vec{b_3}, \vec{s} \rangle - \langle \vec{c_3}, \vec{s} \rangle = 0\) ，这容易验证，把 \(\vec{a_3},\vec{b_3},\vec{c_3}, \vec{s}\) 代入进来就是 \((arg1 + arg2) \times 1 - sym\_4 = 0\) ，这显然成立。

针对第 5 行的约束 \(sym\_3 \times sym\_4 = sym\_5\) ，可以设置：

\begin{align*} \vec{a_4} &= [0, 0, 0, 0, 0, 0, 1, 0, 0, 0] \\ \vec{b_4} &= [0, 0, 0, 0, 0, 0, 0, 1, 0, 0] \\ \vec{c_4} &= [0, 0, 0, 0, 0, 0, 0, 0, 1, 0] \end{align*}

它也满足 R1CS 的要求： \(\langle \vec{a_4}, \vec{s} \rangle \times \langle \vec{b_4}, \vec{s} \rangle - \langle \vec{c_4}, \vec{s} \rangle = 0\) ，这容易验证，把 \(\vec{a_4},\vec{b_4},\vec{c_4}, \vec{s}\) 代入进来就是 \(sym\_3 \times sym\_4 - sym\_5 = 0\) ，这显然成立。

针对第 6 行的约束 \((sym\_2 + sym\_5) \times 1 = v\) ，可以设置：

\begin{align*} \vec{a_5} &= [0, 0, 0, 0, 0, 1, 0, 0, 1, 0] \\ \vec{b_5} &= [1, 0, 0, 0, 0, 0, 0, 0, 0, 0] \\ \vec{c_5} &= [0, 0, 0, 0, 0, 0, 0, 0, 0, 1] \end{align*}

它也满足 R1CS 的要求： \(\langle \vec{a_5}, \vec{s} \rangle \times \langle \vec{b_5}, \vec{s} \rangle - \langle \vec{c_5}, \vec{s} \rangle = 0\) ，这容易验证，把 \(\vec{a_5},\vec{b_5},\vec{c_5}, \vec{s}\) 代入进来就是 \((sym\_2 + sym\_5) \times 1 - v = 0\) ，这显然成立。

针对第 7 行的约束 \(w \times w = w\) ，可以设置：

\begin{align*} \vec{a_6} &= [0, 1, 0, 0, 0, 0, 0, 0, 0, 0] \\ \vec{b_6} &= [0, 1, 0, 0, 0, 0, 0, 0, 0, 0] \\ \vec{c_6} &= [0, 1, 0, 0, 0, 0, 0, 0, 0, 0] \end{align*}

它也满足 R1CS 的要求： \(\langle \vec{a_6}, \vec{s} \rangle \times \langle \vec{b_6}, \vec{s} \rangle - \langle \vec{c_6}, \vec{s} \rangle = 0\) ，这容易验证，把 \(\vec{a_6},\vec{b_6},\vec{c_6}, \vec{s}\) 代入进来就是 \(w \times w - w = 0\) ，这显然成立。

上面这一系列三向量组 \((\vec{a},\vec{b},\vec{c})\) 组成的序列就是 R1CS：

A
[0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
[0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
[1, -1, 0, 0, 0, 0, 0, 0, 0, 0]
[0, 0, 1, 1, 0, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
[0, 0, 0, 0, 0, 1, 0, 0, 1, 0]
[0, 1, 0, 0, 0, 0, 0, 0, 0, 0]

B
[0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
[1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 0, 0, 0, 1, 0, 0]
[1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[0, 1, 0, 0, 0, 0, 0, 0, 0, 0]

C
[0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
[0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
[0, 0, 0, 0, 0, 0, 0, 1, 0, 0]
[0, 0, 0, 0, 0, 0, 0, 0, 1, 0]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
[0, 1, 0, 0, 0, 0, 0, 0, 0, 0]

2.3.4. 从 R1CS 到 QAP

R1CS 是使用向量点积来实现约束，下面介绍如何把这些约束转换为 QAP。

把前面得到的 \(\vec{a}\) 序列组成矩阵 \(A\) ：
\[ A = \begin{pmatrix} 0& 0& 1& 0& 0& 0& 0& 0& 0& 0 \\ 0& 1& 0& 0& 0& 0& 0& 0& 0& 0 \\ 1& -1& 0& 0& 0& 0& 0& 0& 0& 0 \\ 0& 0& 1& 1& 0& 0& 0& 0& 0& 0 \\ 0& 0& 0& 0& 0& 0& 1& 0& 0& 0 \\ 0& 0& 0& 0& 0& 1& 0& 0& 1& 0 \\ 0& 1& 0& 0& 0& 0& 0& 0& 0& 0 \end{pmatrix}\]

设 \(l_0(x)\) 是经过点 \((1, 0), (2, 0), (3, 1), (4, 0), (5, 0), (6, 0), (7, 0)\) （注： \(x\) 为 \(1,2,\cdots,7\) ，而 \(y\) 值依次是 \(A\) 的第 1 列的每个元素）的多项式，由拉格朗日插值公式（利用这个公式，可以在已经某未知函数经过某些点的情况下，还原出函数。当然不一定要用拉格朗日插值公式，其它多项式插值方法也可以）可得：
\[l_0(x) = \frac{x^6}{48} - \frac{25x^5}{48} + \frac{247x^4}{48} - \frac{1219x^3}{48} + \frac{389x^2}{6} - \frac{949x}{12} + 35\]

设 \(l_1(x)\) 是经过点 \((1, 0), (2, 1), (3, -1), (4, 0), (5, 0), (6, 0), (7, 1)\) （注： \(x\) 为 \(1,2,\cdots,7\) ，而 \(y\) 值依次是 \(A\) 的第 2 列的每个元素）的多项式，由拉格朗日插值公式可得：
\[l_1(x) = - \frac{x^6}{36} + \frac{17x^5}{24} - \frac{515x^4}{72} + \frac{869x^3}{24} - \frac{6863x^2}{72} + \frac{1447x}{12} - 55\]

使用类似的方法，可以求得：

\begin{align*} l_2(x) &= - \frac{19x^6}{720} + \frac{151x^5}{240} - \frac{845x^4}{144} + \frac{1297x^3}{48} - \frac{11449x^2}{180} + \frac{1447x}{20} - 28 \\ l_3(x) &= - \frac{x^6}{36} + \frac{2x^5}{3} - \frac{113x^4}{18} + \frac{88x^3}{3} - \frac{2545x^2}{36} + 82x - 35 \\ l_4(x) &= 0 \\ l_5(x) &= - \frac{x^6}{120} + \frac{11x^5}{60} - \frac{19x^4}{12} + \frac{41x^3}{6} - \frac{1849x^2}{120} + \frac{1019x}{60} - 7 \\ l_6(x) &= \frac{x^6}{48} - \frac{23x^5}{48} + \frac{69x^4}{16} - \frac{925x^3}{48} + \frac{134x^2}{3} - \frac{201x}{4} + 21 \\ l_7(x) &= 0 \\ l_8(x) &= - \frac{x^6}{120} + \frac{11x^5}{60} - \frac{19x^4}{12} + \frac{41x^3}{6} - \frac{1849x^2}{120} + \frac{1019x}{60} - 7 \\ l_9(x) &= 0 \\ \end{align*}

同样地，我们根据 R1CS 中的矩阵 \(B\) 可得 \(r_0(x), r_1(x), \cdots, r_9(x)\) ；根据 R1CS 中的矩阵 \(C\) 可得 \(o_0(x), o_1(x), \cdots, o_9(x)\) 。此外， \(t(x) = (x-1)(x-2)(x-3)(x-4)(x-5)(x-6)(x-7)\) 。

上面例子中，假设 \(v = 6\) ，Prover 想证明他知道的那个解为 \(w=1, arg1=2, arg2=3\) ，则可以计算出中间变量 \(sym\_1 = 6, sym\_2 = 6, sym\_3 = 0, sym\_4 = 5, sym\_5 = 0\) ，从而 R1CS 的解为：
\[\vec{s} = \begin{pmatrix} 1 \\ 1 \\ 2 \\ 3 \\ 6 \\ 6 \\ 0 \\ 5 \\ 0 \\ 6 \end{pmatrix}\]

这个解就是 QAP 的一个解（witness），即 \(\{u_0, u_1, u_2, u_3, u_4, u_5, u_6, u_7, u_8, u_9\} = \{1, 1, 2, 3, 6, 6, 0, 5, 0, 6\}\)

注：Vitalik Buterin 在文章 Quadratic Arithmetic Programs: from Zero to Hero 中介绍了从 R1CS 到 QAP 的转换，且在 https://github.com/ethereum/research/blob/master/zksnark/qap_creator.py 中使用 Python 编写了相应的转换程序。

2.4. 有限域上计算

前面的计算，我们发现会出现浮点数，这样会有舍入误差。在实际中我们会在“有限域”上做算术运算，这样结果都是整数，不存在舍入误差了。

在有限素域 \(\text{GF}(p)\) 中，元素范围是整数 \(0\) 到 \(p-1\) ，其中 \(p\) 为大素数，域上加减乘除（注：域上只定义加法乘法两种运算，减法和除法看作它们的逆运算）运算都要对素数 \(p\) 取模。比如 \(p=13\) ，有 \(10 + 6 = 3, 3 \times 5 = 2, 1 / 2 = 7 \;(\text{as} \; 7 \times 2 = 1)\)

文章 R1CS to QAP (quadratic arithmetic program) with zkSNARKs in Go 中有“在有限域上求 QAP”的例子。

3. Summary

In summary, zk-SNARK proofs are an specific type of zero-knowledge proofs that allow you to prove that you know a set of signals (witness) that match all the constraints of a circuit without revealing any of the signals except the public inputs and the outputs.

Ref: https://docs.circom.io/background/background/#summary