fix(engine): deduplicate string-to-number across runtime and compiler#792
Conversation
Runtime coercion (TGocciaStringLiteralValue.ToNumberLiteral) and bytecode constant folding (StringToCompileTimeNumber) each kept a partial copy of the ECMA-262 StringToNumber parser, so the interpreter and the constant optimizer could fold the same string source to different values. Centralize the logic in a new side-effect-free source/shared/NumericText.pas StringToNumber(): Double, used by both paths (the compiler reuses it without allocating a value object). Implementing the StringNumericLiteral grammar once also closes the shared conformance gaps the two copies had: binary/octal literals (0b/0o), full-magnitude hex (no 32-bit wrap), rejection of "Inf" and bare punctuation, and sign rejection on non-decimal literals; whitespace trimming and signed zero are preserved. No bytecode format or opcode change (ADR 0040 optimizer design unchanged). Closes #787 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
The latest updates on your projects. Learn more about Vercel for GitHub. 1 Skipped Deployment
|
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (7)
📝 WalkthroughWalkthroughA new ChangesShared StringToNumber consolidation
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested labels
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
Suite TimingTest Runner (interpreted: 10,821 passed; bytecode: 10,821 passed)
MemoryGC rows aggregate the main thread plus all worker thread-local GCs. Test runner worker shutdown frees thread-local heaps in bulk; that shutdown reclamation is not counted as GC collections or collected objects.
Benchmarks (interpreted: 430; bytecode: 430)
MemoryGC rows aggregate the main thread plus all worker thread-local GCs. Benchmark runner performs explicit between-file collections, so collection and collected-object counts can be much higher than the test runner.
Measured on ubuntu-latest x64. |
Benchmark Results430 benchmarks Interpreted: 🟢 13 improved · 🔴 214 regressed · 203 unchanged · avg -7.1% arraybuffer.js — Interp: 🔴 9, 5 unch. · avg -8.6% · Bytecode: 🔴 2, 12 unch. · avg -0.3%
arrays.js — Interp: 🔴 10, 9 unch. · avg -9.5% · Bytecode: 19 unch. · avg -0.2%
async-await.js — Interp: 🔴 1, 5 unch. · avg -11.2% · Bytecode: 6 unch. · avg +0.9%
async-generators.js — Interp: 🔴 1, 1 unch. · avg -11.5% · Bytecode: 2 unch. · avg -2.6%
atomics.js — Interp: 🔴 1, 5 unch. · avg -7.3% · Bytecode: 6 unch. · avg +0.3%
base64.js — Interp: 🔴 6, 4 unch. · avg -11.3% · Bytecode: 10 unch. · avg +1.4%
classes.js — Interp: 🔴 16, 15 unch. · avg -8.3% · Bytecode: 🟢 2, 🔴 3, 26 unch. · avg -1.2%
closures.js — Interp: 🔴 7, 4 unch. · avg -8.7% · Bytecode: 🟢 1, 10 unch. · avg -1.8%
collections.js — Interp: 🔴 8, 4 unch. · avg -11.0% · Bytecode: 🟢 1, 11 unch. · avg +1.4%
csv.js — Interp: 🔴 9, 4 unch. · avg -11.7% · Bytecode: 🟢 2, 11 unch. · avg +2.6%
destructuring.js — Interp: 🔴 21, 1 unch. · avg -10.4% · Bytecode: 🟢 4, 🔴 1, 17 unch. · avg +0.5%
fibonacci.js — Interp: 🔴 5, 3 unch. · avg -10.8% · Bytecode: 8 unch. · avg -2.9%
float16array.js — Interp: 🔴 15, 17 unch. · avg -6.6% · Bytecode: 🟢 4, 🔴 4, 24 unch. · avg +0.8%
for-of.js — Interp: 🔴 5, 2 unch. · avg -9.2% · Bytecode: 7 unch. · avg -0.0%
generators.js — Interp: 🔴 4 · avg -8.2% · Bytecode: 4 unch. · avg -0.1%
intl.js — Interp: 🔴 5, 1 unch. · avg -15.4% · Bytecode: 🟢 1, 5 unch. · avg +5.7%
iterators.js — Interp: 🔴 13, 29 unch. · avg -5.7% · Bytecode: 🔴 6, 36 unch. · avg -1.3%
json.js — Interp: 🔴 8, 12 unch. · avg -10.2% · Bytecode: 🟢 4, 16 unch. · avg +0.5%
jsx.jsx — Interp: 🔴 10, 11 unch. · avg -10.6% · Bytecode: 🟢 2, 🔴 3, 16 unch. · avg -3.3%
modules.js — Interp: 🔴 5, 4 unch. · avg -10.1% · Bytecode: 9 unch. · avg +1.0%
numbers.js — Interp: 🔴 3, 8 unch. · avg -8.4% · Bytecode: 🔴 1, 10 unch. · avg -1.6%
objects.js — Interp: 🔴 3, 4 unch. · avg -10.9% · Bytecode: 7 unch. · avg -3.2%
promises.js — Interp: 🔴 3, 9 unch. · avg -2.7% · Bytecode: 🔴 4, 8 unch. · avg -3.4%
property-access.js — Interp: 🔴 3, 2 unch. · avg -6.9% · Bytecode: 5 unch. · avg +0.0%
regexp.js — Interp: 🔴 4, 7 unch. · avg -12.2% · Bytecode: 🟢 1, 🔴 2, 8 unch. · avg +2.3%
strings.js — Interp: 🔴 11, 8 unch. · avg -10.4% · Bytecode: 🟢 1, 🔴 1, 17 unch. · avg +0.9%
temporal.js — Interp: 🔴 4, 2 unch. · avg -12.8% · Bytecode: 🟢 1, 5 unch. · avg +1.0%
tsv.js — Interp: 🔴 4, 5 unch. · avg -9.6% · Bytecode: 🔴 3, 6 unch. · avg -2.4%
typed-arrays.js — Interp: 🟢 9, 🔴 5, 8 unch. · avg +25.8% · Bytecode: 🟢 4, 🔴 4, 14 unch. · avg -0.2%
uint8array-encoding.js — Interp: 🟢 1, 🔴 6, 11 unch. · avg -7.4% · Bytecode: 🟢 9, 🔴 3, 6 unch. · avg +18.7%
weak-collections.js — Interp: 🟢 3, 🔴 9, 3 unch. · avg -6.4% · Bytecode: 🟢 3, 🔴 4, 8 unch. · avg +23.0%
Deterministic profile diffDeterministic profile diff: no significant changes. Measured on ubuntu-latest x64. Benchmark ranges compare cached main-branch min/max ops/sec with the PR run; overlapping ranges are treated as unchanged noise. Percentage deltas are secondary context. |
test262 Conformance
Areas closest to 100%
Per-test deltas (+38 / -2)Newly failing (2):
Newly passing (38):
Steady-state failures are non-blocking; regressions vs the cached main baseline (lower total pass count, or any PASS → non-PASS transition) fail the conformance gate. Measured on ubuntu-latest x64, bytecode mode. Areas grouped by the first two test262 path components; minimum 25 attempted tests, areas already at 100% excluded. Δ vs main compares against the most recent cached |
Summary
StringToNumber(§7.1.4.1.1) was implemented twice — runtime coercion inTGocciaStringLiteralValue.ToNumberLiteral(Goccia.Values.Primitives) and compile-time folding inStringToCompileTimeNumber(Goccia.Compiler.ConstantValue) — so the interpreter and the bytecode constant optimizer could fold the same string source to different values (the+Infinityregression class).source/shared/NumericText.pas→StringToNumber(const AText: string): Double, used by both paths. It returns a rawDouble, so the compiler reuses it without instantiating a JS value (per the issue's allocation requirement). Net −171/+10 across the two units.Infinity, invalid→NaN, and signed-zero as cases that must agree. Those were latently broken identically in both paths, so (confirmed during planning) implementing theStringNumericLiteralgrammar once makes them correct rather than consistently wrong. Now fixed and verified identical in both execution modes:+"0b101"→5,+"0o17"→15(binary/octal wereNaN)+"0xABCDEF12345"→11806310474565(was a 32-bit-wrapped-554622139viaStrToInt)+"Inf"→NaN,+"."→NaN(wereInfinity/0)+"+0x10"/+"-0x10"→NaN(no sign on non-decimal); whitespace trimming and-0preservedCloses #787
Testing
--mode=bytecode; newtests/language/expressions/string-to-number/string-numeric-literal.jsexercises the full matrix and asserts both the constant-folded (+"literal") and runtime (Number(s)) paths.language.md/built-ins.mdchange (internal helper + conformance fix, no new public API surface).Primitives) changed; added co-locatedsource/shared/NumericText.Test.pas(7/7, incl. bit-level-0/NaN/large-hex); full./build.pas testsclean-builds green.