[treatment] Suboptimal codegen for BitConverter.SingleToInt32Bits in a hot Span<float> loop on x64 #8

@steveisok

Description

In an x64 release build, calls to BitConverter.SingleToInt32Bits inside a tight loop over a Span<float> spill the value to the stack and reload it instead of emitting a direct movd r32, xmm. The unnecessary store/reload shows up as a throughput regression in a small BenchmarkDotNet microbenchmark compared with the previous release.

Repro

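using System;
using BenchmarkDotNet.Attributes;

public class SumBitsBench // hypothetical wrapper class so the snippet compiles
{
    private readonly float[] _data = new float[1024]; // arbitrary length
    [Benchmark]
    public int SumBits() // parameterless so it runs without an arguments source
    {
        Span<float> input = _data;
        int acc = 0;
        foreach (var f in input)
            acc += BitConverter.SingleToInt32Bits(f);
        return acc;
    }
}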

Expected disasm (per element)

movd  eax, xmm0
add   ...

Actual disasm (per element)

movss dword ptr [rsp+0x..], xmm0
mov   eax, dword ptr [rsp+0x..]
add   ...
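
For anyone who wants to look at the per-element codegen without BenchmarkDotNet, the loop can be lifted into a small console program. This is only a sketch under assumptions: the Program class, the Main driver, and the array length are hypothetical and not part of the original report, and it assumes a runtime (.NET 7 or later) where the DOTNET_JitDisasm environment variable is honored in release builds.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class Program
{
    // NoInlining keeps SumBits as a separate method so its disassembly
    // is emitted under its own name instead of being folded into Main.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int SumBits(Span<float> input)
    {
        int acc = 0;
        foreach (var f in input)
            acc += BitConverter.SingleToInt32Bits(f);
        return acc;
    }

    public static void Main()
    {
        // Hypothetical input: the length and contents are arbitrary.
        float[] data = new float[1024];
        for (int i = 0; i < data.Length; i++)
            data[i] = i;

        // float[] converts implicitly to Span<float>.
        Console.WriteLine(SumBits(data));
    }
}
```

Running a release build with DOTNET_JitDisasm=SumBits set should print the JIT's assembly for the method. Setting DOTNET_TieredCompilation=0 as well makes the first compilation the fully optimized one, so the listing reflects the code that would run in the hot loop.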

Notes

Likely a JIT-side fix in the intrinsic recognition / lowering path. A disasm-check test under src/tests/JIT/ would be a good way to lock it in.
