perf: replace BW6-761 final exp by a class equivalence check #1155

Merged
merged 6 commits into from
Jul 3, 2024

yelhousni
Contributor

@yelhousni yelhousni commented Jun 3, 2024

Description

Similarly to #1143 we adapt https://eprint.iacr.org/2024/640.pdf to the BW6-761 case with the following parameters:

  • $h = \frac{p^6-1}{r}=3^2\cdot l$ with $\gcd(l,3)=1$
  • $\lambda=x^3-x^2+1-(x+1)q$, read off the second row of the LLL-reduced lattice basis computed below:
// magma code ran online: http://magma.maths.usyd.edu.au/calc
QQ := Rationals();
QQx<x> := PolynomialRing(QQ);
rx := (x^6 - 2*x^5 + 2*x^3 + x + 1)/3;
qx := (103*x^12 - 379*x^11 + 250*x^10 + 691*x^9 - 911*x^8 - 79*x^7 + 623*x^6 - 640*x^5 + 274*x^4 + 763*x^3 + 73*x^2 + 254*x + 229)/9;
M := Matrix(QQx, 2, 2, [rx, 0, -qx mod rx, 1]);
R := LLL(M);
print R;

// output:
// [        x + 1 x^3 - x^2 - x]
// [x^3 - x^2 + 1        -x - 1]

assert ((R[2][1] + qx*R[2][2]) mod rx) eq 0; // better Hamming weight
  • $m=\lambda/r$
  • $d=\text{gcd}(m,h)=1$
  • $m'=m/d = m$
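
The parameters above can be checked on integers rather than on polynomials. The following is a minimal sketch, assuming the BW6-761 seed is the BLS12-377 seed `0x8508C00000000001` (this value is not given in the PR text); the helper names `r_of`, `q_of` and `lambda_of` are hypothetical. It evaluates $r$, $q$ and $\lambda$ at the seed, asserts that $r$ divides $\lambda$, and computes $m$, $h$, $d$ and the 3-adic valuation of $h$ so that the claims $d=1$ and $h=3^2\cdot l$ can be inspected.

```python
import math

# Assumption: BW6-761 uses the BLS12-377 seed (not stated in the PR text).
X0 = 0x8508C00000000001


def r_of(x):
    """r(x) = (x^6 - 2x^5 + 2x^3 + x + 1) / 3, as in the Magma snippet."""
    num = x**6 - 2 * x**5 + 2 * x**3 + x + 1
    assert num % 3 == 0
    return num // 3


def q_of(x):
    """q(x) as in the Magma snippet (numerator divided by 9)."""
    num = (103 * x**12 - 379 * x**11 + 250 * x**10 + 691 * x**9
           - 911 * x**8 - 79 * x**7 + 623 * x**6 - 640 * x**5
           + 274 * x**4 + 763 * x**3 + 73 * x**2 + 254 * x + 229)
    assert num % 9 == 0
    return num // 9


def lambda_of(x):
    """lambda = x^3 - x^2 + 1 - (x + 1) * q (second row of the LLL basis)."""
    return x**3 - x**2 + 1 - (x + 1) * q_of(x)


r = r_of(X0)
q = q_of(X0)
lam = lambda_of(X0)

# lambda is a multiple of r, so m = lambda / r is an integer (negative here).
assert lam % r == 0
m = lam // r

# Embedding degree 6: r divides q^6 - 1, so the cofactor h is an integer.
assert (q**6 - 1) % r == 0
h = (q**6 - 1) // r

d = math.gcd(abs(m), h)

# 3-adic valuation of h, to compare with h = 3^2 * l.
v3, t = 0, h
while t % 3 == 0:
    t //= 3
    v3 += 1

print("d = gcd(m, h):", d)
print("3-adic valuation of h:", v3)
```

Since $\lambda$ is negative for this seed, `m` is negative too; only its absolute value matters for the gcd.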

First, we find the residue witness in a hint:

  1. Compute the r-th root: raise the Miller loop output to the power $1/r \pmod h$.
  2. Compute the m′-th root: raise the result to the power $1/m' \pmod h$.
    (Since $d=1$, neither the modified Tonelli-Shanks algorithm for cube roots nor any scaling is needed here.)
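
The two root computations can be illustrated in a toy multiplicative group. This is only a sketch under stated assumptions: it works in $\mathbb{F}_{211}^*$ (order $210 = 7\cdot 30$) with toy values $r=7$, $h=30$, $m=11$ instead of the real BW6-761 parameters, and the function name `residue_witness` is hypothetical, not the name of the gnark hint.

```python
def residue_witness(f, r, m, h, p):
    """Return c with c^(r*m) == f, assuming f^h == 1 and gcd(r, h) == gcd(m, h) == 1."""
    r_inv = pow(r, -1, h)          # 1/r mod h
    m_inv = pow(m, -1, h)          # 1/m' mod h (here m' = m because d = 1)
    root_r = pow(f, r_inv, p)      # step 1: r-th root of f
    return pow(root_r, m_inv, p)   # step 2: m'-th root of the result


# Toy parameters (assumptions for illustration only).
p, r, h, m = 211, 7, 30, 11
lam = r * m

# f is an r-th power, so f^h == 1: this mimics a Miller loop output
# for which the pairing check holds.
f = pow(5, r, p)
c = residue_witness(f, r, m, h, p)
assert pow(c, lam, p) == f

# 2 is not a 7th power mod 211 (2^30 != 1), so no witness can exist for it.
bad = 2
assert pow(bad, h, p) != 1
assert pow(residue_witness(bad, r, m, h, p), lam, p) != bad
```

The key point is that both inverses are taken modulo $h$: because $f^h = 1$, exponents acting on $f$ only matter modulo $h$.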

Then, we check in-circuit that:

MillerLoop == Witness ^ (u^3-u^2+1-(u+1)q)

with two optimized addition chains, a Frobenius power and a hinted division in Fp6.
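
The shape of this check can be sketched in a toy extension field. The code below is an illustration under assumptions, not the gnark circuit: it uses $\mathbb{F}_{103^2} = \mathbb{F}_{103}[i]/(i^2+1)$ in place of Fp6, a toy seed `u = 3`, and hypothetical helper names. It splits the exponent $u^3-u^2+1-(u+1)q$ into two plain exponentiations and one Frobenius map, and verifies the division by multiplying back, which is one common way to check a hinted quotient.

```python
P = 103  # toy prime with P % 4 == 3, so x^2 + 1 is irreducible over F_P


def mul(a, b):
    """Multiply a = a0 + a1*i and b = b0 + b1*i in F_P[i]/(i^2 + 1)."""
    return ((a[0] * b[0] - a[1] * b[1]) % P, (a[0] * b[1] + a[1] * b[0]) % P)


def power(a, e):
    """Square-and-multiply exponentiation for e >= 0."""
    result = (1, 0)
    while e > 0:
        if e & 1:
            result = mul(result, a)
        a = mul(a, a)
        e >>= 1
    return result


def frobenius(a):
    """a -> a^P, which is conjugation in this quadratic extension."""
    return (a[0], (-a[1]) % P)


def class_equivalence_check(f, c, u):
    """Check f == c^(u^3 - u^2 + 1 - (u + 1)*P) as f * Frob(c^(u+1)) == c^(u^3 - u^2 + 1)."""
    lhs = mul(f, frobenius(power(c, u + 1)))
    rhs = power(c, u**3 - u**2 + 1)
    return lhs == rhs


u = 3
c = (5, 7)  # arbitrary nonzero toy witness
# Build a matching f by reducing the (negative) exponent modulo the group order P^2 - 1.
exponent = (u**3 - u**2 + 1 - (u + 1) * P) % (P * P - 1)
f = power(c, exponent)
assert class_equivalence_check(f, c, u)
```

Moving the Frobenius term to the left-hand side avoids computing an inverse inside the check; only multiplications and fixed-exponent powers remain.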

Type of change

  • New feature (non-breaking change which adds functionality)

How has this been tested?

TestPairingCheckTestSolve test passes.

How has this been benchmarked?

This PR saves 2,679,259 scs in the emulated PLONK verifier of BW6-761 in a BN254-PLONK.

Checklist:

  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • I did not modify files generated from templates
  • golangci-lint does not output errors locally
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules

@yelhousni
Contributor Author

yelhousni commented Jun 5, 2024

It is, however, difficult to fold the computation of c^{u^3-u^2+1-(u+1)q} into the Miller loop computation (see #1143 (comment)), for two reasons:
1- We use a different Miller loop for BW6, with loop size 3*l2+l1 where l2=u³-u²-u and l1=u+1 (see Alg.2 in https://eprint.iacr.org/2021/1359.pdf). We could instead run two separate ate Miller loops of sizes u^3-u^2+1 and u+1, but the current Miller loop is far more efficient constraint-wise.
2- The in-circuit Miller loop is implemented using Fp6 as a direct extension of Fp, while out-of-circuit it is a quadratic-over-cubic extension. This makes the residue witness different. We could implement the direct extension and Toom-6 arithmetic in gnark-crypto, but because of 1- that would be effort spent in vain.

Adapting the in-circuit and out-of-circuit algorithms to match each other would hurt performance and make the trick not worth it.

@yelhousni
Contributor Author

yelhousni commented Jun 5, 2024

> It is, however, difficult to fold the computation of c^{u^3-u^2+1-(u+1)q} into the Miller loop computation (see #1143 (comment)), for two reasons: 1- We use a different Miller loop for BW6, with loop size 3*l2+l1 where l2=u³-u²-u and l1=u+1 (see Alg.2 in https://eprint.iacr.org/2021/1359.pdf). 2- The in-circuit Miller loop is implemented using Fp6 as a direct extension of Fp, while out-of-circuit it is a quadratic-over-cubic extension. This makes the residue witness different.
>
> Adapting the in-circuit and out-of-circuit algorithms to match each other would hurt performance and make the trick not worth it.

Actually, since this additional trick is probably not worth it, it becomes more relevant to push the Miller function output into the cyclotomic subgroup by performing only the easy part of the final exponentiation before the class equivalence check. This saves an additional 1,390,037 scs, bringing the total savings to 2,679,259 scs.

Collaborator

@ivokub ivokub left a comment

LGTM

@yelhousni yelhousni merged commit 55c05b6 into master Jul 3, 2024
7 checks passed
@yelhousni yelhousni deleted the perf/eliminate-finalExp-bw6761 branch July 3, 2024 12:25
This was referenced Aug 4, 2024