Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update s2n-bignum subtree #1861

Merged

Conversation

torben-hansen
Copy link
Contributor

@torben-hansen torben-hansen commented Sep 18, 2024

No squash merge

Description of changes:

Updates to latest version of s2n-bignum files. Config used:

PATHS_TO_KEEP="\
./arm/p384 ./x86_att/p384 ./arm/p521 ./x86_att/p521 \
./arm/fastmul/bignum_emontredc_8n.S \
./arm/fastmul/bignum_kmul_16_32.S \
./arm/fastmul/bignum_kmul_32_64.S \
./arm/fastmul/bignum_ksqr_16_32.S \
./arm/fastmul/bignum_ksqr_32_64.S \
./arm/generic/bignum_ge.S \
./arm/generic/bignum_mul.S \
./arm/generic/bignum_optsub.S \
./arm/generic/bignum_sqr.S \
./arm/generic/bignum_copy_row_from_table.S \
./arm/generic/bignum_copy_row_from_table_8n_neon.S \
./arm/generic/bignum_copy_row_from_table_16_neon.S \
./arm/generic/bignum_copy_row_from_table_32_neon.S \
./x86_att/curve25519/curve25519_x25519.S \
./x86_att/curve25519/curve25519_x25519base.S \
./x86_att/curve25519/curve25519_x25519_alt.S \
./x86_att/curve25519/curve25519_x25519base_alt.S  \
./x86_att/curve25519/bignum_neg_p25519.S \
./x86_att/curve25519/bignum_mod_n25519.S  \
./x86_att/curve25519/edwards25519_decode.S  \
./x86_att/curve25519/edwards25519_decode_alt.S  \
./x86_att/curve25519/edwards25519_encode.S  \
./x86_att/curve25519/edwards25519_scalarmulbase.S  \
./x86_att/curve25519/edwards25519_scalarmulbase_alt.S  \
./x86_att/curve25519/edwards25519_scalarmuldouble.S  \
./x86_att/curve25519/edwards25519_scalarmuldouble_alt.S  \
./arm/curve25519/curve25519_x25519.S \
./arm/curve25519/curve25519_x25519base.S \
./arm/curve25519/curve25519_x25519_alt.S \
./arm/curve25519/curve25519_x25519base_alt.S \
./arm/curve25519/curve25519_x25519_byte.S \
./arm/curve25519/curve25519_x25519base_byte.S \
./arm/curve25519/curve25519_x25519_byte_alt.S \
./arm/curve25519/curve25519_x25519base_byte_alt.S \
./arm/curve25519/bignum_neg_p25519.S \
./arm/curve25519/bignum_mod_n25519.S  \
./arm/curve25519/bignum_madd_n25519.S \
./arm/curve25519/bignum_madd_n25519_alt.S \
./arm/curve25519/edwards25519_decode.S  \
./arm/curve25519/edwards25519_decode_alt.S  \
./arm/curve25519/edwards25519_encode.S  \
./arm/curve25519/edwards25519_scalarmulbase.S  \
./arm/curve25519/edwards25519_scalarmulbase_alt.S  \
./arm/curve25519/edwards25519_scalarmuldouble.S  \
./arm/curve25519/edwards25519_scalarmuldouble_alt.S \
./arm/p256/bignum_montinv_p256.S \
./arm/p256/p256_montjscalarmul.S \
./arm/p256/p256_montjscalarmul_alt.S \
./x86_att/p256/bignum_montinv_p256.S \
./x86_att/p256/p256_montjscalarmul.S \
./x86_att/p256/p256_montjscalarmul_alt.S \
./include/_internal_s2n_bignum.h"

Includes the addition of 6 new files:

./arm/p256/bignum_montinv_p256.S \
./arm/p256/p256_montjscalarmul.S \
./arm/p256/p256_montjscalarmul_alt.S \
./x86_att/p256/bignum_montinv_p256.S \
./x86_att/p256/p256_montjscalarmul.S \
./x86_att/p256/p256_montjscalarmul_alt.S \

Testing:

I tested build with

diff --git a/crypto/fipsmodule/CMakeLists.txt b/crypto/fipsmodule/CMakeLists.txt
index aaa370f93..767cc6254 100644
--- a/crypto/fipsmodule/CMakeLists.txt
+++ b/crypto/fipsmodule/CMakeLists.txt
@@ -205,6 +205,10 @@ if((((ARCH STREQUAL "x86_64") AND NOT MY_ASSEMBLER_IS_TOO_OLD_FOR_512AVX) OR
   set(
     S2N_BIGNUM_ASM_SOURCES
 
+    p256/bignum_montinv_p256.S
+    p256/p256_montjscalarmul.S
+    p256/p256_montjscalarmul_alt.S
+
     p384/bignum_add_p384.S
     p384/bignum_sub_p384.S
     p384/bignum_neg_p384.S

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and the ISC license.

aqjune-aws and others added 30 commits September 15, 2023 15:32
This patch adds constant-time table-lookup functions
(`bignum_copy_row_from_table*`) and their proofs.
This patch only contains its AArch64 version, and the x86 version will
follow later.
The failure of proving its x86 version seems to be related to handling
negative offsets, and (if this is right) this can be avoided by simply
proving positive offsets.
I will record this as a Github issue with a promise that the x86 scalar
version will be provided after the RSA related things are finished.

This patch contains four table-lookup functions:
1. `bignum_copy_row_from_table`: a lookup for a generic table size
2. `bignum_copy_row_from_table_8n_neon`: a Neon version for a table
   whose width is a multiple of 8
3. `bignum_copy_row_from_table_16_neon`: Neon implementation of a table
   whose width is 16*64=1024 bits
4. `bignum_copy_row_from_table_32_neon`: Neon implementation of a table
   whose width is 32*64=2048 bits

The last two versions are initially written by Hanno Becker.

To successfully compile and run `test` and `benchmark` in x86, the
scalar `bignum_copy_row_from_table` function is processed as a way
similar to Neon functions.

s2n-bignum original commit: awslabs/s2n-bignum@f1ad23c
Add bignum_copy_row_from_table and its Neon-variants for AArch64
s2n-bignum original commit: awslabs/s2n-bignum@50aa85b
This implements the point compression encoding to a byte array from
https://datatracker.ietf.org/doc/html/rfc8032#section-5.1.2
as function "edwards25519_encode". It assumes the input is a point
(x,y) on the edwards25519 curve, with coordinates reduced mod
p_25519 = 2^255 - 19, and does not check any of that.

s2n-bignum original commit: awslabs/s2n-bignum@67430be
This implements point decoding from a 256-bit little-endian byte
sequence to a point (x,y) on the edwards25519 curve as specified in
https://datatracker.ietf.org/doc/html/rfc8032#section-5.1.3
The function returns 0 for success and 1 for failure, the latter
meaning that the input is not the encoding of any edwards25519 point.

s2n-bignum original commit: awslabs/s2n-bignum@97f7493
The function bignum_mod_n25519 performs reduction of an input of any
size (k digits) modulo the order of the curve25519/edwards25519
basepoint, n_25519 = 2^252 + 27742317777372353535851937790883648493.
It generalizes bignum_mod_n25519_4, which is the special case of
4-digit (256-bit) inputs.

s2n-bignum original commit: awslabs/s2n-bignum@e23fd30
The functions bignum_madd_n25519 and bignum_madd_n25519_alt perform
multiply-add modulo the order of the curve25519/edwards25519
basepoint, n_25519 = 2^252 + 27742317777372353535851937790883648493,
i.e. z := (x * y + c) mod n_25519 where all variables are 256 bits.

s2n-bignum original commit: awslabs/s2n-bignum@7fc5883
This replaces the inlined variant of "bignum_modinv" with code from
"bignum_inv_p25519" in all "curve25519_" functions returning an affine
point and hence using modular inverse. There are also a few
consequential changes related to the slightly different amount of
temporary storage needed by this function.

s2n-bignum original commit: awslabs/s2n-bignum@777d574
…ck_no

Document that x25519 function does not implement zero-check
s2n-bignum original commit: awslabs/s2n-bignum@5c4b15a
This replaces the inlined variant of "bignum_modinv" with code from
"bignum_inv_p25519" in all "edwards25519_scalarmul*" functions.
Again, there are consequential changes related to the slightly
different amount of temporary storage needed by bignum_inv_p25519.

s2n-bignum original commit: awslabs/s2n-bignum@7e7b18e
Ed25519 support and related updates
s2n-bignum original commit: awslabs/s2n-bignum@db8409d
Add BFM, BIC, FCSEL, INS, SUB, TRN1, TRN2, USHR, ZIP2 to ARM model
s2n-bignum original commit: awslabs/s2n-bignum@f1caaf1
In general, BOUNDER_RULE now directly handles operations over Z and N,
assuming an outer real_of_int / real_of_num cast into R (this is also
automated in the tactic form BOUNDER_TAC). In particular, this change
can greatly improve bounds for terms involving integer or natural
number division and remainder (DIV, div, MOD and rem) as well as
cutoff subtraction over N. There is also now support for conditionals,
though the condition is not used as extra context, simply being the
basis for a case split.

This update rolls in various trivial typographic fixes in comments.

s2n-bignum original commit: awslabs/s2n-bignum@ccefa2a
…5519

Avoid duplicate labels in ed25519 x86 implementation
s2n-bignum original commit: awslabs/s2n-bignum@f629458
64-bit SIMD regs in ARM model, better BOUNDER_RULE, slow-ARM field optimizations
s2n-bignum original commit: awslabs/s2n-bignum@06781d2
s2n-bignum original commit: awslabs/s2n-bignum@286d596
…er_mscv

Replace static array qualifier with macro that maps to nothing if it detects an msvc compiler
s2n-bignum original commit: awslabs/s2n-bignum@9e60dfc
…_input_const

Make _input_ parameter to ed25519 decode function const
s2n-bignum original commit: awslabs/s2n-bignum@4097178
* Allow MIT-0 license as well as Apache-2.0 and ISC

* Add appropriate year range to MIT-0 license
s2n-bignum original commit: awslabs/s2n-bignum@48fb153
This completely changes the implementation of ARM curve25519_x25519
and curve25519_x25519_byte (not the _alt forms, which remain faster
on their target microarchitectures) to a base-25.5 unsaturated version
with interleaved integer and SIMD operations, the inner loop closely
following Emil Lenngren's implementation described in the paper

  https://github.com/Emill/X25519-AArch64/blob/master/X25519_AArch64.pdf

and available here:

  https://github.com/Emill/X25519-AArch64

A version of this code was generated by SLOTHY from the reorganized
implementation by Abdulrahman, Becker, Kannwischer and Klein here:

 https://github.com/slothy-optimizer/slothy/blob/main/paper/clean/neon/X25519-AArch64-simple.s

as described in the associated paper

  https://eprint.iacr.org/2022/1303.pdf

with some additional annotations for use in the formal proof. The
final modular inverse computation reverts to the usual saturated
representation and s2n-bignum's divstep-based inverse function.

s2n-bignum original commit: awslabs/s2n-bignum@fc0b9bf
Lenngren-based X25519 for non-alt ARM code
s2n-bignum original commit: awslabs/s2n-bignum@57eb68a
jargh and others added 20 commits August 6, 2024 10:55
The new function bignum_montinv_p256 is a "Montgomery domain" variant
of the existing modular inverse function bignum_inv_p256, which seems
often to be better suited to the generally Montgomery-centric field
operations supplied by s2n-bignum for P-256. Given an input x it
returns a z such that x * z == 2^512 (mod p_256), or zero in cases
when x is in {0,p_256}. This does indeed correspond to inversion in
the Montgomery domain because if x == 2^256 * X and z == 2^256 * Z
(both mod p_256) then the congruence x * z == 2^512 (mod p_256) means
that X * Z == 1 (mod p_256). The code is in fact almost identical to
bignum_inv_p256, changing only the starting value for the iterations,
and the proof is very similar.

s2n-bignum original commit: awslabs/s2n-bignum@c07aee5
The new function p256_montjscalarmul[_alt] is analogous to the
existing p256_scalarmul[_alt], doing scalar multiplication n * P for a
point P on the NIST P-256 curve and a scalar n. This variant, however,
uses the Jacobian representation for both input and output points,
with the coordinates in Montgomery form. As such, it is approximately
the same as the "middle" of p256_scalarmul, excluding the mappings
from and back to affine form; it may make a more convenient
building-block for other operations.

s2n-bignum original commit: awslabs/s2n-bignum@0a8a754
Add global assumptions paragraph
s2n-bignum original commit: awslabs/s2n-bignum@d61796f
Update Arm cosimulator to check that opcodes appear, add x86_att test to CI
s2n-bignum original commit: awslabs/s2n-bignum@3eb104f
Add hybrid `p256_montjadd` and `p256_montjdouble` for Arm, slow multipliers
s2n-bignum original commit: awslabs/s2n-bignum@7ff619c
This inherits the performance improvements to p256_montjadd and
p256_montjdouble by analogy with the updates to p256_scalarmul.
The arm/Makefile is also tweaked to attempt to reflect more
accurately the dependencies of individual ".correct" files.

s2n-bignum original commit: awslabs/s2n-bignum@4451534
The new function p384_montjscalarmul[_alt] is the NIST P-384 analog
of the corresponding P-256 function. It does scalar multiplication of
a point on the P-384 curve, where both input and output points are in
the Jacobian representation with coordinates in Montgomery form.

s2n-bignum original commit: awslabs/s2n-bignum@2bbaf04
As pointed out by June Lee in the code review, the various forms of
the new function p384_montjscalarmul[_alt] all had the same typo in
the comment banner of the code, where the Jacobian points were shown
as size 12 instead of 3*6 = 18. The actual C header files in
include/s2n-bignum*.h were already correct.

s2n-bignum original commit: awslabs/s2n-bignum@eacef33
Extra P-256 functions for AWS-LC integration, popcount, basic P-384 scalar mul
s2n-bignum original commit: awslabs/s2n-bignum@08bf556
The new function bignum_inv_p384 is similar to the existing function
bignum_inv_p256, with intermediate results longer (6-7 words in place
of 4-5) and different Montgomery reduction, this time to work for the
NIST P-384 prime p_384 = 2^384 - 2^128 - 2^96 + 2^32 - 1.

s2n-bignum original commit: awslabs/s2n-bignum@c97c521
The new function bignum_montinv_p384 is a "Montgomery domain" variant
of the existing modular inverse function bignum_inv_p384, which seems
often to be better suited to the generally Montgomery-centric field
operations supplied by s2n-bignum for P-384. Given an input x it
returns a z such that x * z == 2^768 (mod p_384), or zero in cases
when x is in {0,p_384}. This does indeed correspond to inversion in
the Montgomery domain because if x == 2^384 * X and z == 2^384 * Z
(both mod p_384) then the congruence x * z == 2^768 (mod p_384) means
that X * Z == 1 (mod p_384). The code is in fact almost identical to
bignum_inv_p384, changing only the starting value for the iterations,
and the proof is very similar.

s2n-bignum original commit: awslabs/s2n-bignum@95b4d64
For some new functions that we want to integrate into AWS-LC,
this satisfies the BoringSSL / AWS-LC delocator by (1) making
the labels unique and (2) avoiding .rep / .endr (the assembler
repetition construct), which is replaced by C macro blocks.

s2n-bignum original commit: awslabs/s2n-bignum@9aa8155
Changing #(I) to #(1*I) in the new macro blocks makes it happy,
and is semantically trivial.

s2n-bignum original commit: awslabs/s2n-bignum@4207da6
P-384 field inverses and delocator-pacifying tweaks
s2n-bignum original commit: awslabs/s2n-bignum@d85c6b5
This patch updates the p384 point operations to use the field operations optimized using NEON and the SLOTHY optimizer.

The performance improvement is around 9%.

```
p384_montjdouble                :   591.7 ns each (var  0.1%, corr -0.02) =    1690105 ops/sec
p384_montjadd                   :  1143.6 ns each (var  0.1%, corr  0.04) =     874447 ops/sec
p384_montjscalarmul             : 329982.3 ns each (var  0.0%, corr  0.09) =       3030 ops/sec
=>
p384_montjdouble                :   543.2 ns each (var  0.1%, corr  0.02) =    1840798 ops/sec
p384_montjadd                   :  1044.5 ns each (var  0.1%, corr -0.09) =     957396 ops/sec
p384_montjscalarmul             : 303017.8 ns each (var  0.0%, corr  0.04) =       3300 ops/sec
```

This patch also includes the following updates:
- Add `arm/proofs/utils` and factor out the OCaml codes that are parameters to the equivalence checking tactics as files in the directory.
- Update EQUIV_STEP_TAC to take additional arguments that describe dead value information of the registers in the programs.
  The information describes which registers contain dead values after each program location.
  EQUIV_STEP_TAC uses this information to clean assumptions that will not be used later during simulation.
- A few tactics for equivalence checking to shorten repeatedly appearing proof patterns and a few updates
  in arm.ml for better error messages
- Bug fixes in actions_merger.ml
- Add a user-defined custom cache to ORTHOGONAL_COMPONENTS_CONV. This is useful for equiv checking because the memory invariants that appear in assumptions have a typical form (the byte64).
- And many other improvements for speed

s2n-bignum original commit: awslabs/s2n-bignum@0af76bc
Adopt the Arm SIMD-optimized p384 fields to point operations
s2n-bignum original commit: awslabs/s2n-bignum@6248d16
@codecov-commenter
Copy link

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 78.42%. Comparing base (efa9d6c) to head (e2401bc).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1861      +/-   ##
==========================================
+ Coverage   78.41%   78.42%   +0.01%     
==========================================
  Files         585      585              
  Lines       99411    99410       -1     
  Branches    14232    14234       +2     
==========================================
+ Hits        77951    77962      +11     
+ Misses      20825    20813      -12     
  Partials      635      635              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@torben-hansen torben-hansen merged commit 7090b90 into aws:main Sep 18, 2024
110 checks passed
@torben-hansen torben-hansen deleted the aws-lc-s2n-bignum-update-2024-09-18 branch September 18, 2024 18:12
smittals2 added a commit that referenced this pull request Sep 19, 2024
## What's Changed
* More tweaks for Ruby integration by @samuel40791765 in
#1852
* Implementation of EVP_PKEY_CTX_ctrl_str for various key types by
@justsmth in #1850
* Add MLKEM768 Hybrid Groups to libssl by @alexw91 in
#1849
* add support for PEM_write_bio_PrivateKey_traditional by
@samuel40791765 in #1845
* Update s2n-bignum subtree by @torben-hansen in
#1861
* Add asserts in testing to fix Coverity alert by @smittals2 in
#1864
* Disable CRYPTO_is_AVX512IFMA_capable by @justsmth in
#1858


**Full Changelog**:
v1.35.0...v1.35.1

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license and the ISC license.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants