Description
In an x64 Release build, calls to BitConverter.SingleToInt32Bits inside a tight loop over a Span<float> compile to a store to the stack followed by a reload, instead of a direct movd r32, xmm. The unnecessary store/reload shows up as a throughput regression in a small BenchmarkDotNet microbenchmark compared with the previous release.
Repro
[Benchmark]
public int SumBits(Span<float> input)
{
    int acc = 0;
    foreach (var f in input)
        acc += BitConverter.SingleToInt32Bits(f);
    return acc;
}
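For inspecting the codegen outside BenchmarkDotNet, the same loop can be wrapped in a small console program. This is only a sketch: the class name `Repro`, the `NoInlining` attribute, and the three-element input are assumptions added for illustration and are not part of the original benchmark.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class Repro
{
    // NoInlining (an addition for this sketch) keeps SumBits as a separate
    // method so its disassembly is easy to locate in the JIT output.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int SumBits(Span<float> input)
    {
        int acc = 0;
        foreach (var f in input)
            acc += BitConverter.SingleToInt32Bits(f); // reinterpret float bits as int
        return acc;
    }

    public static void Main()
    {
        // Hypothetical input; any float data exercises the same loop body.
        float[] data = { 1.0f, 2.0f, -0.5f };
        Console.WriteLine(SumBits(data));
    }
}
```

On .NET 7 or later, running this with the environment variables `DOTNET_JitDisasm=SumBits` and `DOTNET_TieredCompilation=0` should print the fully optimized code for the loop, which can then be compared against the per-element sequences listed in this issue.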
Expected disasm (per element)
movd eax, xmm0
add ...
Actual disasm (per element)
movss dword ptr [rsp+0x..], xmm0
mov eax, dword ptr [rsp+0x..]
add ...
Notes
The fix is likely on the JIT side, in the intrinsic recognition / lowering path. A disasm-check test under src/tests/JIT/ would be a good way to lock it in.
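A disasm-check test for this could look roughly like the sketch below. It assumes the FileCheck-style comment convention used by existing disasm checks in the runtime repo (`// X64:` and `// X64-NOT:` lines inside the method under test) and the usual exit code of 100 for a passing test. The class name, the exact check patterns, and the input data are hypothetical, and the accompanying project file (which would need to opt in to disasm checking) is not shown.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class SingleToInt32BitsDisasm
{
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int SumBits(Span<float> input)
    {
        // Hypothetical check lines: no stack access before the first movd,
        // and a direct xmm -> gpr move must be present.
        // X64-NOT: dword ptr [rsp
        // X64: movd
        int acc = 0;
        foreach (var f in input)
            acc += BitConverter.SingleToInt32Bits(f);
        return acc;
    }

    public static int Main()
    {
        // 1.0f, 2.0f and -0.5f have bit patterns 0x3F800000, 0x40000000 and
        // 0xBF000000; their wrapped 32-bit sum is 0x3E800000 (1048576000).
        float[] data = { 1.0f, 2.0f, -0.5f };
        return SumBits(data) == 1048576000 ? 100 : 101;
    }
}
```

The functional check in `Main` is there so the test still validates the result on targets where the disasm patterns are not evaluated. The patterns themselves would need tuning against real JIT output, for example to account for VEX-encoded forms such as `vmovd`.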