Skip to content

feat(crypto/experimental/poly): add prime-field & polynomial utilities#646

Open
jjllzhang wants to merge 16 commits into
secretflow:mainfrom
jjllzhang:feat/crypto-experimental-poly-resubmit
Open

feat(crypto/experimental/poly): add prime-field & polynomial utilities#646
jjllzhang wants to merge 16 commits into
secretflow:mainfrom
jjllzhang:feat/crypto-experimental-poly-resubmit

Conversation

@jjllzhang

Copy link
Copy Markdown

Resubmits #632, which was closed due to inactivity.

This PR adds prime-field and polynomial utilities, including NTT/CRT support, polynomial div/mod, multipoint evaluation, and interpolation.

The branch has been updated against the latest upstream main before resubmission.

jjllzhang and others added 15 commits February 2, 2026 23:06
- Introduced FpPolynomial class for polynomial operations over finite fields.
- Added FpContext and Fp structures for prime field arithmetic.
- Implemented basic polynomial operations: addition, subtraction, multiplication, evaluation, and division.
- Included multi-point evaluation and interpolation methods.
- Created unit tests for polynomial operations and prime field functionalities.
- Established a porting plan for integrating poly-interp into yacl::crypto::experimental::poly.
…anagement in FpPolynomial and SubproductTree

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces an experimental prime-field and polynomial library under yacl/crypto/experimental/poly/ along with comprehensive unit tests, and refactors several link and transport tests to improve determinism and synchronization. The code review identified a critical correctness bug in pack_poly_bits when packing multiple limbs per coefficient on 32-bit platforms. Additionally, several redundant modulo operations on already canonical Fp elements were highlighted across the codebase, and a performance optimization was suggested for the polynomial derivative calculation to avoid repeated conversions.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +1101 to +1106
if (limbs_per_coeff > 0) {
const std::size_t limb_count = coeffs.size() * (std::size_t)limbs_per_coeff;
out.assign(limb_count, 0);
for (std::size_t i = 0; i < coeffs.size(); ++i) {
out[i * (std::size_t)limbs_per_coeff] = (mp_limb_t)coeffs[i].v;
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

There is a correctness bug in pack_poly_bits when limbs_per_coeff > 1 on 32-bit platforms or configurations where coeffs[i].v requires more than 1 limb. It only writes the first limb and leaves the rest as 0, leading to silent data corruption. We should write all limbs of coeffs[i].v into out.

Suggested change
if (limbs_per_coeff > 0) {
const std::size_t limb_count = coeffs.size() * (std::size_t)limbs_per_coeff;
out.assign(limb_count, 0);
for (std::size_t i = 0; i < coeffs.size(); ++i) {
out[i * (std::size_t)limbs_per_coeff] = (mp_limb_t)coeffs[i].v;
}
if (limbs_per_coeff > 0) {
const std::size_t limb_count = coeffs.size() * (std::size_t)limbs_per_coeff;
out.assign(limb_count, 0);
for (std::size_t i = 0; i < coeffs.size(); ++i) {
u64 val = coeffs[i].v;
for (unsigned t = 0; t < limbs_per_coeff; ++t) {
const unsigned bit = t * limb_bits;
if (bit >= 64) break;
out[i * (std::size_t)limbs_per_coeff + t] = (mp_limb_t)(val >> bit);
}
}
}

Comment on lines +189 to +192
for (auto& x : vec) {
x.v %= p;
YACL_ENFORCE(x.v != 0, "FpContext::BatchInv: zero element in batch");
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Redundant and expensive modulo operation x.v %= p inside the BatchInv loop. Since Fp elements are guaranteed to be canonical by design, we can replace this with a fast assertion YACL_ENFORCE(x.v != 0 && x.v < p) to avoid costly division instructions.

    for (auto& x : vec) {
      YACL_ENFORCE(x.v != 0 && x.v < p, "FpContext::BatchInv: invalid or zero element in batch");
    }

Comment on lines +63 to +70
void FpPolynomial::SetCoeffs(std::vector<Fp> coeffs) {
RequireContext();
c_ = std::move(coeffs);
for (auto& x : c_) {
x.v %= ctx_->GetModulus();
}
Trim();
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Redundant modulo operation x.v %= ctx_->GetModulus() in SetCoeffs. Since Fp elements are guaranteed to be canonical, this loop is unnecessary and introduces significant overhead.

void FpPolynomial::SetCoeffs(std::vector<Fp> coeffs) {
  RequireContext();
  c_ = std::move(coeffs);
  Trim();
}

Comment on lines +129 to +137
void FpPolynomial::SetCoeff(size_type i, Fp value) {
RequireContext();
value.v %= ctx_->GetModulus();
if (i >= c_.size()) {
c_.resize(i + 1, ctx_->Zero());
}
c_[i] = value;
Trim();
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Redundant modulo operation value.v %= ctx_->GetModulus() in SetCoeff.

void FpPolynomial::SetCoeff(size_type i, Fp value) {
  RequireContext();
  if (i >= c_.size()) {
    c_.resize(i + 1, ctx_->Zero());
  }
  c_[i] = value;
  Trim();
}

Comment on lines +173 to +180
FpPolynomial FpPolynomial::ScalarMul(Fp k) const {
RequireContext();
const FpContext& F = *ctx_;
k.v %= F.GetModulus();

if (k.v == 0 || IsZero()) {
return FpPolynomial(F);
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Redundant modulo operation k.v %= F.GetModulus() in ScalarMul.

FpPolynomial FpPolynomial::ScalarMul(Fp k) const {
  RequireContext();
  const FpContext& F = *ctx_;

  if (k.v == 0 || IsZero()) {
    return FpPolynomial(F);
  }

Comment on lines +202 to +207
for (size_type i = 1; i < c_.size(); ++i) {
// (a_i * i) x^{i-1}
Fp ii = F.FromUint64(
static_cast<u64>(i)); // 自动 i mod p(特征 p 的情况也正确)
r.c_[i - 1] = F.Mul(c_[i], ii);
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Performance optimization in Derivative. Instead of calling F.FromUint64(static_cast<u64>(i)) which performs modulo/reduction inside the loop, we can incrementally compute ii using F.Add(ii, F.One()).

  Fp ii = F.Zero();
  for (size_type i = 1; i < c_.size(); ++i) {
    ii = F.Add(ii, F.One());
    r.c_[i - 1] = F.Mul(c_[i], ii);
  }

Comment on lines +212 to +217
Fp FpPolynomial::Eval(Fp x) const {
RequireContext();
const FpContext& F = *ctx_;
x.v %= F.GetModulus();

Fp acc = F.Zero();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Redundant modulo operation x.v %= F.GetModulus() in Eval.

Suggested change
Fp FpPolynomial::Eval(Fp x) const {
RequireContext();
const FpContext& F = *ctx_;
x.v %= F.GetModulus();
Fp acc = F.Zero();
Fp FpPolynomial::Eval(Fp x) const {
RequireContext();
const FpContext& F = *ctx_;
Fp acc = F.Zero();

Comment on lines +90 to +96
FpPolynomial::SubproductTree FpPolynomial::SubproductTree::Build(
const FpContext& ctx, const std::vector<Fp>& xs) {
SubproductTree T(ctx);
T.points = xs;
for (auto& x : T.points) {
x.v %= ctx.GetModulus();
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Redundant modulo operation x.v %= ctx.GetModulus() in SubproductTree::Build.

Suggested change
FpPolynomial::SubproductTree FpPolynomial::SubproductTree::Build(
const FpContext& ctx, const std::vector<Fp>& xs) {
SubproductTree T(ctx);
T.points = xs;
for (auto& x : T.points) {
x.v %= ctx.GetModulus();
}
FpPolynomial::SubproductTree FpPolynomial::SubproductTree::Build(
const FpContext& ctx, const std::vector<Fp>& xs) {
SubproductTree T(ctx);
T.points = xs;

Comment on lines +348 to +352
for (std::size_t i = 0; i < n; ++i) {
Fp yi = ys[i];
yi.v %= F.GetModulus();
a[i] = F.Mul(yi, inv_dvals[i]);
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Redundant modulo operation yi.v %= F.GetModulus() in InterpolateSubproductTree.

  for (std::size_t i = 0; i < n; ++i) {
    a[i] = F.Mul(ys[i], inv_dvals[i]);
  }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant