Gate s2n-bignum _alt variants under OPENSSL_SMALL #3320
Open
justsmth wants to merge 3 commits into
Every s2n-bignum EC and curve25519 operation ships two assembly
implementations and dispatches between them at runtime:
- x86_64: the primary uses BMI2+ADX; the "_alt" form is the generic
fallback that runs on any x86_64 CPU.
- aarch64: the primary is generic; the "_alt" form targets cores with
higher multiplier throughput (Graviton 3, Apple M1+).
Under OPENSSL_SMALL, pin each operation to the single, universally
compatible variant at compile time and drop the other from the build:
x86_64 keeps "_alt", aarch64 keeps the non-"_alt" form. This removes
~18 object files (~363 KB on x86_64; aarch64 TBD) at the cost of the
microarchitecture-specific fast paths, which is acceptable under the
OPENSSL_SMALL contract.
The selector header pins the symbol (so only the kept variant is
referenced) and crypto/fipsmodule/CMakeLists.txt drops the unused .S
from BCM_ASM_SOURCES. The two are exact mirrors, so no build or link
references a dropped object on any platform.
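The pinning described above can be pictured with a small sketch. Every name here (`SKETCH_SMALL`, `sketch_op`, and so on) is hypothetical and stands in for the real selector header and assembly entry points, which are not reproduced in this message. The sketch models the aarch64 case, where the generic primary is the variant kept:

```c
/* Sketch only: all names are hypothetical, not AWS-LC or s2n-bignum symbols. */
enum sketch_variant { SKETCH_PRIMARY, SKETCH_ALT };

/* Stand-in for the universally compatible implementation; always built. */
static enum sketch_variant sketch_op_primary(void) { return SKETCH_PRIMARY; }

#if !defined(SKETCH_SMALL)
/* Stand-in for the microarchitecture-specific variant. It is compiled
 * only when the small-build macro is absent, mirroring the source file
 * being dropped from the build. */
static enum sketch_variant sketch_op_alt(void) { return SKETCH_ALT; }

/* Placeholder probe; a real one would inspect CPU capabilities. */
static int sketch_cpu_prefers_alt(void) { return 0; }
#endif

/* The selector: pinned at compile time in small builds, dispatched at
 * run time otherwise. */
static enum sketch_variant sketch_op(void) {
#if defined(SKETCH_SMALL)
  return sketch_op_primary();
#else
  return sketch_cpu_prefers_alt() ? sketch_op_alt() : sketch_op_primary();
#endif
}
```

Because the selector and the list of compiled sources are guarded by the same condition, the small build never references a symbol it did not compile.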
Non-OPENSSL_SMALL builds are unchanged: the runtime dispatch path is
byte-for-byte identical to before.
This is independent of the OPENSSL_SMALL AVX-512 size reduction. When
that work is also present, OPENSSL_SMALL disables s2n-bignum on x86_64
entirely (Fiat fallback), so the x86_64 removal here becomes a no-op
and aarch64 is the sole beneficiary; when it is not present, the
x86_64 path above is operative.
Testing: built crypto_test with -DOPENSSL_SMALL=ON on x86_64 and ran
the EC (P-224/256/384/521), ECDH, ECDSA, X25519, and Ed25519 suites
(51 tests, all passing). Verified at the object level that the dropped
primaries are absent from libcrypto.a and all _alt references resolve.
aarch64 runtime testing still pending.
Codecov Report
✅ All modified and coverable lines are covered by tests.
Coverage Diff (main vs. #3320):
Metric     main     #3320    +/-
Coverage   78.17%   78.17%   -0.01%
Files      693      693
Lines      123874   123900   +26
Branches   17200    17208    +8
Hits       96840    96856    +16
Misses     26116    26125    +9
Partials   918      919      +1
Adds .github/workflows/openssl-small.yml with two jobs:
- size-reduction (x86_64, advisory): links a minimal static consumer (tests/binary-size/main.c) against libcrypto built with and without OPENSSL_SMALL and asserts a size reduction (~25% observed; threshold 15%). The job runs with continue-on-error until the threshold is calibrated.
- small-tests (blocking): builds with OPENSSL_SMALL and runs the full suite (run_tests) across a matrix of x86_64 + aarch64 and gcc + clang. This matrix is the correctness gate.
The aarch64 legs use GitHub-hosted Arm64 runners (swap for CodeBuild if preferred).
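The size assertion reduces to a percentage comparison. The following is a minimal sketch of that arithmetic, assuming the two binary sizes are already known in bytes; the function names are hypothetical and the workflow's actual script is not shown here:

```c
#include <stdbool.h>
#include <stdint.h>

/* Percentage by which small_bytes is smaller than baseline_bytes.
 * Returns 0.0 when the baseline is empty or the "small" binary is not
 * actually smaller, so a size regression can never pass the check. */
static double sketch_reduction_percent(uint64_t baseline_bytes,
                                       uint64_t small_bytes) {
  if (baseline_bytes == 0 || small_bytes >= baseline_bytes) {
    return 0.0;
  }
  return 100.0 * (double)(baseline_bytes - small_bytes) /
         (double)baseline_bytes;
}

/* True when the observed reduction reaches the required threshold. */
static bool sketch_meets_threshold(uint64_t baseline_bytes,
                                   uint64_t small_bytes,
                                   double threshold_percent) {
  return sketch_reduction_percent(baseline_bytes, small_bytes) >=
         threshold_percent;
}
```

With the figures quoted above, a 25% reduction clears the 15% threshold with margin, which is why the threshold can stay conservative while the job is still advisory.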
OPENSSL_SMALL implies MY_ASSEMBLER_IS_TOO_OLD_FOR_512AVX on x86_64 (aws#3319), which disables s2n-bignum entirely. Remove the now-unnecessary x86_64 branches from the OPENSSL_SMALL selector header and CMakeLists, leaving only the aarch64 variant pinning.
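The chain of implications can be sketched with the preprocessor. The `SKETCH_` macros and the enum are hypothetical stand-ins; only the shape of the logic (small build on x86_64 implies the "assembler too old" setting, which in turn disables s2n-bignum) is taken from this message:

```c
/* Sketch only: the SKETCH_ names are hypothetical stand-ins. */
enum sketch_backend {
  SKETCH_BACKEND_S2N_RUNTIME_DISPATCH, /* both variants built */
  SKETCH_BACKEND_S2N_PINNED,           /* one variant built, no dispatch */
  SKETCH_BACKEND_PORTABLE_FALLBACK     /* s2n-bignum not built at all */
};

/* Step 1: a small build on x86_64 implies the "too old" setting. */
#if defined(SKETCH_OPENSSL_SMALL) && \
    (defined(__x86_64__) || defined(_M_X64))
#define SKETCH_ASSEMBLER_TOO_OLD_FOR_AVX512 1
#endif

/* Step 2: that setting disables the s2n-bignum code paths. */
#if defined(SKETCH_ASSEMBLER_TOO_OLD_FOR_AVX512)
#define SKETCH_NO_S2N_BIGNUM 1
#endif

/* Step 3: only aarch64 small builds still need variant pinning. */
static enum sketch_backend sketch_ec_backend(void) {
#if defined(SKETCH_NO_S2N_BIGNUM)
  return SKETCH_BACKEND_PORTABLE_FALLBACK;
#elif defined(SKETCH_OPENSSL_SMALL) && defined(__aarch64__)
  return SKETCH_BACKEND_S2N_PINNED;
#else
  return SKETCH_BACKEND_S2N_RUNTIME_DISPATCH;
#endif
}
```

Compiled without `SKETCH_OPENSSL_SMALL`, the function reports runtime dispatch on every architecture, matching the statement that non-small builds are unchanged.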
Issues
Addresses: aws/aws-lc-rs#745, P353434599
Description of changes:
Under OPENSSL_SMALL on aarch64, AWS-LC currently still compiles both the primary and _alt s2n-bignum assembly for every EC and curve25519 operation and chooses between them at runtime. This change pins each operation to the single universally compatible (non-_alt) variant at compile time and drops the _alt form from the build. That removes ~254 KB of object code on aarch64 in exchange for the wide-multiplier fast paths (Graviton 3, Apple M-series), which is an acceptable trade under the OPENSSL_SMALL contract.
On x86_64, OPENSSL_SMALL implies MY_ASSEMBLER_IS_TOO_OLD_FOR_512AVX (#3319), which disables s2n-bignum entirely, so no variant pinning is needed there.
The non-OPENSSL_SMALL runtime-dispatch path is left byte-for-byte identical, so only OPENSSL_SMALL builds change behavior. The only files touched are the AWS-LC-maintained selector header third_party/s2n-bignum/s2n-bignum_aws-lc.h and crypto/fipsmodule/CMakeLists.txt. No upstream-imported s2n-bignum assembly is modified, so this survives the next s2n-bignum import.
Call-outs:
- Without the OPENSSL_SMALL → MY_ASSEMBLER_IS_TOO_OLD_FOR_512AVX implication, s2n-bignum would still be compiled on x86_64 and the selectors here would call the non-_alt (BMI2/ADX-requiring) variants unconditionally, which is incorrect on older CPUs.
- Performance under OPENSSL_SMALL regresses on aarch64 hardware that would have used the dropped _alt fast path (wide-multiplier cores like Graviton 3 / Apple M-series).
Testing:
Built crypto_test with -DOPENSSL_SMALL=ON on x86_64 and ran the EC (P-224/256/384/521), ECDH, ECDSA, X25519, and Ed25519 suites; all passing. The new CI workflow (.github/workflows/openssl-small.yml) runs the full test suite under OPENSSL_SMALL on both x86_64 and aarch64 with gcc and clang, and asserts a minimum binary-size reduction threshold.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and the ISC license.