feat(crypto/experimental/poly): add prime-field & polynomial utilities#646
jjllzhang wants to merge 16 commits into
- Introduced FpPolynomial class for polynomial operations over finite fields.
- Added FpContext and Fp structures for prime field arithmetic.
- Implemented basic polynomial operations: addition, subtraction, multiplication, evaluation, and division.
- Included multi-point evaluation and interpolation methods.
- Created unit tests for polynomial operations and prime field functionalities.
- Established a porting plan for integrating poly-interp into yacl::crypto::experimental::poly.
…ethod and improving coefficient handling
…anagement in FpPolynomial and SubproductTree
…oefficient bound handling
…ction for improved prime handling
…election for improved clarity
… test data generation
… generation and update MakeDesc to use it
…ental-poly # Conflicts: # MODULE.bazel.lock
Code Review
This pull request introduces an experimental prime-field and polynomial library under yacl/crypto/experimental/poly/ along with comprehensive unit tests, and refactors several link and transport tests to improve determinism and synchronization. The code review identified a critical correctness bug in pack_poly_bits when packing multiple limbs per coefficient on 32-bit platforms. Additionally, several redundant modulo operations on already canonical Fp elements were highlighted across the codebase, and a performance optimization was suggested for the polynomial derivative calculation to avoid repeated conversions.
if (limbs_per_coeff > 0) {
  const std::size_t limb_count = coeffs.size() * (std::size_t)limbs_per_coeff;
  out.assign(limb_count, 0);
  for (std::size_t i = 0; i < coeffs.size(); ++i) {
    out[i * (std::size_t)limbs_per_coeff] = (mp_limb_t)coeffs[i].v;
  }
There is a correctness bug in pack_poly_bits when limbs_per_coeff > 1 on 32-bit platforms or configurations where coeffs[i].v requires more than 1 limb. It only writes the first limb and leaves the rest as 0, leading to silent data corruption. We should write all limbs of coeffs[i].v into out.
if (limbs_per_coeff > 0) {
  const std::size_t limb_count = coeffs.size() * (std::size_t)limbs_per_coeff;
  out.assign(limb_count, 0);
  for (std::size_t i = 0; i < coeffs.size(); ++i) {
    u64 val = coeffs[i].v;
    for (unsigned t = 0; t < limbs_per_coeff; ++t) {
      const unsigned bit = t * limb_bits;
      if (bit >= 64) break;
      out[i * (std::size_t)limbs_per_coeff + t] = (mp_limb_t)(val >> bit);
    }
  }
}
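The limb-splitting idea in this suggestion can be sketched without GMP. This is a minimal illustration, not the PR's code: `Limb`, `kLimbBits`, and `PackCoeffs` are hypothetical names, and a 32-bit `Limb` stands in for `mp_limb_t` on a platform with 32-bit limbs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for mp_limb_t on a platform with 32-bit limbs.
using Limb = std::uint32_t;
constexpr unsigned kLimbBits = 32;

// Packs each 64-bit coefficient into `limbs_per_coeff` little-endian limbs.
// Limbs that lie beyond the 64 bits of the source value stay zero.
std::vector<Limb> PackCoeffs(const std::vector<std::uint64_t>& coeffs,
                             unsigned limbs_per_coeff) {
  std::vector<Limb> out(coeffs.size() * limbs_per_coeff, 0);
  for (std::size_t i = 0; i < coeffs.size(); ++i) {
    const std::uint64_t val = coeffs[i];
    for (unsigned t = 0; t < limbs_per_coeff; ++t) {
      const unsigned bit = t * kLimbBits;
      if (bit >= 64) break;  // shifting a 64-bit value by >= 64 is undefined
      out[i * limbs_per_coeff + t] = static_cast<Limb>(val >> bit);
    }
  }
  return out;
}
```

With three limbs per coefficient, the value `0x1122334455667788` occupies the first two limbs (low half first) and the third limb stays zero, which is the behavior the single-limb write loses.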
for (auto& x : vec) {
  x.v %= p;
  YACL_ENFORCE(x.v != 0, "FpContext::BatchInv: zero element in batch");
}
Redundant and expensive modulo operation x.v %= p inside the BatchInv loop. Since Fp elements are guaranteed to be canonical by design, we can replace this with a fast assertion YACL_ENFORCE(x.v != 0 && x.v < p) to avoid costly division instructions.
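For context, here is a rough sketch of how a validate-only check fits into a batch inversion. It uses the standard prefix-product trick (one inversion for the whole batch); whether the PR's BatchInv is built this way is not shown in the diff. The names below are hypothetical, `std::invalid_argument` stands in for `YACL_ENFORCE`, and the modulus is assumed to be a prime below 2^32 so that products fit in 64 bits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

using u64 = std::uint64_t;

// Assumes a, b < p < 2^32, so a * b cannot overflow 64 bits.
u64 MulMod(u64 a, u64 b, u64 p) { return (a * b) % p; }

// Square-and-multiply exponentiation modulo p.
u64 PowMod(u64 base, u64 exp, u64 p) {
  u64 result = 1 % p;
  while (exp > 0) {
    if (exp & 1) result = MulMod(result, base, p);
    base = MulMod(base, base, p);
    exp >>= 1;
  }
  return result;
}

// Inverts every element in place with a single modular inversion.
// Inputs must already be canonical (0 < x < p); they are checked, not reduced.
void BatchInv(std::vector<u64>& vec, u64 p) {
  for (u64 x : vec) {
    if (x == 0 || x >= p) {
      throw std::invalid_argument("BatchInv: invalid or zero element in batch");
    }
  }
  if (vec.empty()) return;
  // prefix[i] = vec[0] * ... * vec[i] (mod p)
  std::vector<u64> prefix(vec.size());
  u64 acc = 1;
  for (std::size_t i = 0; i < vec.size(); ++i) {
    acc = MulMod(acc, vec[i], p);
    prefix[i] = acc;
  }
  // Fermat's little theorem: a^(p-2) is the inverse of a for prime p.
  u64 inv = PowMod(acc, p - 2, p);
  for (std::size_t i = vec.size(); i-- > 1;) {
    const u64 xi = vec[i];
    vec[i] = MulMod(inv, prefix[i - 1], p);  // inverse of the original vec[i]
    inv = MulMod(inv, xi, p);                // now the inverse of prefix[i - 1]
  }
  vec[0] = inv;
}
```

The validation loop costs two comparisons per element, whereas `x.v %= p` costs a division per element and would also hide a non-canonical input instead of reporting it.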
for (auto& x : vec) {
  YACL_ENFORCE(x.v != 0 && x.v < p, "FpContext::BatchInv: invalid or zero element in batch");
}

void FpPolynomial::SetCoeffs(std::vector<Fp> coeffs) {
  RequireContext();
  c_ = std::move(coeffs);
  for (auto& x : c_) {
    x.v %= ctx_->GetModulus();
  }
  Trim();
}
void FpPolynomial::SetCoeff(size_type i, Fp value) {
  RequireContext();
  value.v %= ctx_->GetModulus();
  if (i >= c_.size()) {
    c_.resize(i + 1, ctx_->Zero());
  }
  c_[i] = value;
  Trim();
}

FpPolynomial FpPolynomial::ScalarMul(Fp k) const {
  RequireContext();
  const FpContext& F = *ctx_;
  k.v %= F.GetModulus();

  if (k.v == 0 || IsZero()) {
    return FpPolynomial(F);
  }

for (size_type i = 1; i < c_.size(); ++i) {
  // (a_i * i) x^{i-1}
  Fp ii = F.FromUint64(
      static_cast<u64>(i));  // reduces i mod p automatically (also correct in characteristic p)
  r.c_[i - 1] = F.Mul(c_[i], ii);
}
Performance optimization in Derivative. Instead of calling F.FromUint64(static_cast<u64>(i)) which performs modulo/reduction inside the loop, we can incrementally compute ii using F.Add(ii, F.One()).
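The incremental counter can be illustrated on raw residues. This is a sketch under stated assumptions, not the PR's implementation: coefficients are plain `u64` values stored low degree first, `AddMod`, `MulMod`, and `Derivative` are hypothetical free functions, and the modulus is assumed to be below 2^32.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using u64 = std::uint64_t;

// Adds two canonical residues with a compare-and-subtract instead of a division.
// Assumes a, b < p.
u64 AddMod(u64 a, u64 b, u64 p) {
  const u64 s = a + b;
  return s >= p ? s - p : s;
}

// Assumes a, b < p < 2^32, so a * b cannot overflow 64 bits.
u64 MulMod(u64 a, u64 b, u64 p) { return (a * b) % p; }

// Formal derivative of c[0] + c[1] x + c[2] x^2 + ... over F_p.
std::vector<u64> Derivative(const std::vector<u64>& c, u64 p) {
  if (c.size() <= 1) return {};
  std::vector<u64> r(c.size() - 1);
  u64 ii = 0;  // holds i mod p, advanced by one per iteration
  for (std::size_t i = 1; i < c.size(); ++i) {
    ii = AddMod(ii, 1, p);
    r[i - 1] = MulMod(c[i], ii, p);  // (a_i * i) x^{i-1}
  }
  // Drop leading zero coefficients, e.g. the x^p term, whose derivative vanishes.
  while (!r.empty() && r.back() == 0) r.pop_back();
  return r;
}
```

Because `ii` wraps to zero whenever `i` is a multiple of `p`, the characteristic-p case is handled the same way as with an explicit reduction of `i`.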
Fp ii = F.Zero();
for (size_type i = 1; i < c_.size(); ++i) {
  ii = F.Add(ii, F.One());
  r.c_[i - 1] = F.Mul(c_[i], ii);
}

Fp FpPolynomial::Eval(Fp x) const {
  RequireContext();
  const FpContext& F = *ctx_;
  x.v %= F.GetModulus();

  Fp acc = F.Zero();
FpPolynomial::SubproductTree FpPolynomial::SubproductTree::Build(
    const FpContext& ctx, const std::vector<Fp>& xs) {
  SubproductTree T(ctx);
  T.points = xs;
  for (auto& x : T.points) {
    x.v %= ctx.GetModulus();
  }
Redundant modulo operation x.v %= ctx.GetModulus() in SubproductTree::Build.
FpPolynomial::SubproductTree FpPolynomial::SubproductTree::Build(
    const FpContext& ctx, const std::vector<Fp>& xs) {
  SubproductTree T(ctx);
  T.points = xs;
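For readers unfamiliar with the structure being built here, a subproduct tree can be sketched as follows. This is only an illustration of the general technique: the layout (a vector of levels), the naive quadratic multiplication, and the names `Poly`, `MulPoly`, and `BuildSubproductTree` are assumptions, and the PR's SubproductTree may be organized differently. Points are taken as already canonical, in line with the comment above, and the modulus is assumed to be below 2^32.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using u64 = std::uint64_t;
using Poly = std::vector<u64>;  // coefficients, low degree first

// Schoolbook product modulo p; assumes all coefficients < p < 2^32.
Poly MulPoly(const Poly& a, const Poly& b, u64 p) {
  if (a.empty() || b.empty()) return {};
  Poly r(a.size() + b.size() - 1, 0);
  for (std::size_t i = 0; i < a.size(); ++i) {
    for (std::size_t j = 0; j < b.size(); ++j) {
      r[i + j] = (r[i + j] + a[i] * b[j] % p) % p;
    }
  }
  return r;
}

// levels[0] holds the leaves (x - x_i); each higher level multiplies adjacent
// pairs, so the last level holds the single product of all (x - x_i).
std::vector<std::vector<Poly>> BuildSubproductTree(const std::vector<u64>& xs,
                                                   u64 p) {
  std::vector<std::vector<Poly>> levels;
  if (xs.empty()) return levels;
  std::vector<Poly> leaves;
  for (u64 x : xs) {
    assert(x < p);  // points are expected to be canonical already
    leaves.push_back(Poly{(p - x) % p, 1});
  }
  levels.push_back(std::move(leaves));
  while (levels.back().size() > 1) {
    const std::vector<Poly>& prev = levels.back();
    std::vector<Poly> next;
    for (std::size_t i = 0; i < prev.size(); i += 2) {
      if (i + 1 < prev.size()) {
        next.push_back(MulPoly(prev[i], prev[i + 1], p));
      } else {
        next.push_back(prev[i]);  // odd node is carried up unchanged
      }
    }
    levels.push_back(std::move(next));
  }
  return levels;
}
```

For the points 1, 2, 3 modulo 7 the root is (x - 1)(x - 2)(x - 3) = x^3 - 6x^2 + 11x - 6, which reduces to x^3 + x^2 + 4x + 1.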
for (std::size_t i = 0; i < n; ++i) {
  Fp yi = ys[i];
  yi.v %= F.GetModulus();
  a[i] = F.Mul(yi, inv_dvals[i]);
}
Resubmits #632, which was closed due to inactivity.
This PR adds prime-field and polynomial utilities, including NTT/CRT support, polynomial div/mod, multipoint evaluation, and interpolation.
The branch has been updated against the latest upstream main before resubmission.
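As a small reference point for the evaluation and interpolation features listed above, the following sketch shows Horner evaluation and naive Lagrange interpolation over a prime field. It is a quadratic-time illustration only; the PR presumably relies on faster subproduct-tree and NTT based routines. All names here (`PowMod`, `Eval`, `Interpolate`) are hypothetical, and the modulus is assumed to be a prime below 2^32.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using u64 = std::uint64_t;
using Poly = std::vector<u64>;  // coefficients, low degree first

// Square-and-multiply exponentiation; assumes p < 2^32 so products fit in 64 bits.
u64 PowMod(u64 base, u64 exp, u64 p) {
  u64 result = 1 % p;
  base %= p;
  while (exp > 0) {
    if (exp & 1) result = result * base % p;
    base = base * base % p;
    exp >>= 1;
  }
  return result;
}

// Horner's rule: evaluates c at x with one multiply-add per coefficient.
u64 Eval(const Poly& c, u64 x, u64 p) {
  u64 acc = 0;
  for (std::size_t i = c.size(); i-- > 0;) acc = (acc * x + c[i]) % p;
  return acc;
}

// Naive Lagrange interpolation through (xs[i], ys[i]); the xs must be distinct
// and canonical. Returns n coefficients without trimming leading zeros.
Poly Interpolate(const std::vector<u64>& xs, const std::vector<u64>& ys, u64 p) {
  const std::size_t n = xs.size();
  Poly result(n, 0);
  for (std::size_t i = 0; i < n; ++i) {
    Poly num = {1};  // product of (x - xs[j]) over j != i
    u64 denom = 1;   // product of (xs[i] - xs[j]) over j != i
    for (std::size_t j = 0; j < n; ++j) {
      if (j == i) continue;
      Poly next(num.size() + 1, 0);
      const u64 neg_xj = (p - xs[j]) % p;
      for (std::size_t k = 0; k < num.size(); ++k) {
        next[k] = (next[k] + num[k] * neg_xj) % p;
        next[k + 1] = (next[k + 1] + num[k]) % p;
      }
      num = std::move(next);
      denom = denom * ((xs[i] + p - xs[j]) % p) % p;
    }
    // Inverse of denom by Fermat's little theorem (p prime).
    const u64 scale = ys[i] % p * PowMod(denom, p - 2, p) % p;
    for (std::size_t k = 0; k < n; ++k) {
      result[k] = (result[k] + num[k] * scale) % p;
    }
  }
  return result;
}
```

Interpolating the values of 3 + 2x + x^2 at the points 0, 1, 2 modulo 17 recovers the coefficients 3, 2, 1, and evaluating the result at those points reproduces the inputs.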