Skip to content

mul_add on *-pc-windows-gnu returns incorrect results #140515

New issue

Have a question about this project? No Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “No Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? No Sign in to your account

Open
beetrees opened this issue Apr 30, 2025 · 20 comments
Open

mul_add on *-pc-windows-gnu returns incorrect results #140515

beetrees opened this issue Apr 30, 2025 · 20 comments
Labels
A-floating-point Area: Floating point numbers and arithmetic C-bug Category: This is a bug. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness O-windows-gnu Toolchain: GNU, Operating system: Windows P-medium Medium priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Comments

@beetrees
Copy link
Contributor

I tried this code (test case taken from this comment):

fn main() {
	let (a, b, c) = std::hint::black_box((-1.9369631e13f32, 2.1513551e-7, -1.7354427e-24));
	dbg!(a.mul_add(b, c));
}

I expected to see this happen: The correct result should be printed as mul_add is guaranteed to return the correctly rounded result, which is -4167095.8.

Instead, this happened: On both i686-pc-windows-gnu and x86_64-pc-windows-gnu, -4167095.5 is printed instead. This is due to the FMA implementation in MinGW being incorrect (upstream bug report). This is unsound as LLVM is allowed to rely on FMA returning correct results when optimising.

Meta

rustc --version --verbose:

rustc 1.88.0-nightly (74509131e 2025-04-29)
binary: rustc
commit-hash: 74509131e85a97353c67c503ea32e148a56cf4bd
commit-date: 2025-04-29
host: x86_64-unknown-linux-gnu
release: 1.88.0-nightly
LLVM version: 20.1.2
@beetrees beetrees added the C-bug Category: This is a bug. label Apr 30, 2025
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Apr 30, 2025
@beetrees
Copy link
Contributor Author

@rustbot label +I-unsound +O-windows-gnu +A-floating-point

@rustbot rustbot added A-floating-point Area: Floating point numbers and arithmetic I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness O-windows-gnu Toolchain: GNU, Operating system: Windows I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Apr 30, 2025
@ChrisDenton
Copy link
Member

Can it lead to actually unsound optimisations in practice or is this more a theoretical?

@saethlin saethlin removed the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Apr 30, 2025
@beetrees
Copy link
Contributor Author

The fix for scalar evolution of functions which aren't guaranteed correctly rounded (llvm/llvm-project#90942) fixed the issue by just not doing scalar evolution on any libcalls which return floating-point numbers (relevant code), so scalar evolution won't be an issue in practice at the moment. I don't know if there are any other optimisations that depend on FMA having correct results.

@jieyouxu jieyouxu added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. and removed T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Apr 30, 2025
@beetrees
Copy link
Contributor Author

This will also affect Miri when run on a *-pc-windows-gnu host as Miri currently uses host mul_add to work around rust-lang/rustc_apfloat#11. cc @RalfJung

@Amanieu Amanieu added P-medium Medium priority and removed I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Apr 30, 2025
@Amanieu
Copy link
Member

Amanieu commented Apr 30, 2025

For that target could we just override the provided fma with the one from compiler-builtins?

cc @tgross35

@tgross35
Copy link
Contributor

As beetrees noted on Zulip we'll lose any LLVM optimizations, but that seems tolerable. This could be a __rust_fma.

(It would be convenient if LLVM had a mechanism to change fallback libcall names without losing optimizations)

@RalfJung
Copy link
Member

Didn't betrees say that we already don't get the LLVM optimizations, since they are inhibited any time a libcall is used? Also, they referred only to scalar evolution -- I assume we'd still get all the basic operations, in particular const-folding.

@Amanieu
Copy link
Member

Amanieu commented Apr 30, 2025

If compiler-builtins provides the fma and fmaf symbols then we can have a correct implementation that still uses LLVM optimizations.

@tgross35
Copy link
Contributor

If we call __rust_fma (or depend on libm directly) then we lose const folding right? Since it doesn't go through intrinsics and LLVM won't recognize the name.

I don't think we have a way to provide the actual symbol fma since AIUI the Unix-y Windows targets don't support weak linkage, so we would conflict with host libraries.

@beetrees
Copy link
Contributor Author

Didn't betrees say that we already don't get the LLVM optimizations, since they are inhibited any time a libcall is used? Also, they referred only to scalar evolution -- I assume we'd still get all the basic operations, in particular const-folding.

The only optimisation that's inhibited for FMA (and other LLVM intrinsics) by LLVM at the moment is scalar evolution. FMA is still const-folded etc., and LLVM can re-arrange code on the assumption it doesn't have any side-effects.

@ChrisDenton
Copy link
Member

I don't think we have a way to provide the actual symbol fma since AIUI the Unix-y Windows targets don't support weak linkage, so we would conflict with host libraries.

I don't think we need weak linkage we just need to provide the symbol, no? Usually the linker can't find the symbol so it goes looking in native libraries. If we provide it instead then it won't go looking elsewhere for it.

@tgross35
Copy link
Contributor

I was thinking the mingw toolchain didn’t pick the right one for some reason, but must be misremembering. You’re right, it should just work.

Any reason not to remove windows-gnu from the list of exceptions at https://github.com/rust-lang/compiler-builtins/blob/f456aa8baf0b108208332dc4bed63b6e70639b67/compiler-builtins/src/math/mod.rs#L41-L77 and use our versions for that whole list?

@ChrisDenton
Copy link
Member

I mean I have no particular objection to that but I also don't know how our implementations compare to MinGW's.

cc @mati865

tgross35 added a commit to tgross35/rust that referenced this issue Apr 30, 2025
Per [1], MinGW has an incorrect fma implementation. This showed up in
tests run with cranelift after adding float math operations to `core`.
Presumably we hadn't noticed this when running tests with LLVM because
LLVM was constant folding the result away.

Rust issue: rust-lang#140515

[1]: https://sourceforge.net/p/mingw-w64/bugs/848/
tgross35 added a commit to tgross35/rust that referenced this issue May 1, 2025
Per [1], MinGW has an incorrect fma implementation. This showed up in
tests run with cranelift after adding float math operations to `core`.
Presumably we hadn't noticed this when running tests with LLVM because
LLVM was constant folding the result away.

Rust issue: rust-lang#140515

[1]: https://sourceforge.net/p/mingw-w64/bugs/848/
@mati865
Copy link
Contributor

mati865 commented May 1, 2025

Faulty mingw-w64 implementation is used only with MSVCR* system C libraries which lack that function. UCRT provides that function, so it should work fine:
https://github.com/mingw-w64/mingw-w64/blob/dcbe86832f5c161ab70098ffb11b1e4955577140/mingw-w64-crt/Makefile.am#L195

The only sensible reason for sticking with MSVCRT is backward compatibility: Rust distributes the rust-mingw component (only on windows-gnu, not even on windows-gnullvm). It contains MSVCRT import lib among other things, and you cannot mix different CRTs (unless you are fine with hard to debug issues).
So as long as the rust-mingw component is used (if GCC is found in the PATH, it will take precedence), you can link prebuilt DLLs or create DLLs from Rust, and they will work fine with toolchains targeting MSVCRT.
If Rust switches to UCRT C toolchain, binaries will be compatible only with UCRT toolchains.

Enough talk, let's verify that.
First, without GCC in PATH, so that ugly MSVCTR is used:

$ rustup +stable component list | rg installed
cargo-x86_64-pc-windows-gnu (installed)
clippy-x86_64-pc-windows-gnu (installed)
rust-docs-x86_64-pc-windows-gnu (installed)
rust-mingw-x86_64-pc-windows-gnu (installed)
rust-src (installed)
rust-std-thumbv7m-none-eabi (installed)
rust-std-x86_64-pc-windows-gnu (installed)
rustc-x86_64-pc-windows-gnu (installed)
rustfmt-x86_64-pc-windows-gnu (installed)

$ rustc -vV
rustc 1.86.0 (05f9846f8 2025-03-31) (Rev2, Built by MSYS2 project)
binary: rustc
commit-hash: 05f9846f893b09a1be1fc8560e33fc3c815cfecb
commit-date: 2025-03-31
host: x86_64-pc-windows-gnullvm
release: 1.86.0
LLVM version: 20.1.1

$ PATH=~/.cargo/bin gcc -v
bash: gcc: command not found

$ PATH=~/.cargo/bin rustc +stable test.rs && ./test
[test.rs:3:5] a.mul_add(b, c) = -4167095.5

$ PATH=~/.cargo/bin rustc +stable test.rs -O && ./test
[test.rs:3:5] a.mul_add(b, c) = -4167095.5

Then with GCC + mingw-w64 targeting UCRT available in PATH:

$ rustup +stable component list | rg installed
cargo-x86_64-pc-windows-gnu (installed)
clippy-x86_64-pc-windows-gnu (installed)
rust-docs-x86_64-pc-windows-gnu (installed)
rust-mingw-x86_64-pc-windows-gnu (installed)
rust-src (installed)
rust-std-thumbv7m-none-eabi (installed)
rust-std-x86_64-pc-windows-gnu (installed)
rustc-x86_64-pc-windows-gnu (installed)
rustfmt-x86_64-pc-windows-gnu (installed)

$ rustc -vV
rustc 1.86.0 (05f9846f8 2025-03-31) (Rev2, Built by MSYS2 project)
binary: rustc
commit-hash: 05f9846f893b09a1be1fc8560e33fc3c815cfecb
commit-date: 2025-03-31
host: x86_64-pc-windows-gnullvm
release: 1.86.0

$ gcc -v
Using built-in specs.
COLLECT_GCC=H:\msys64\ucrt64\bin\gcc.exe
COLLECT_LTO_WRAPPER=H:/msys64/ucrt64/bin/../lib/gcc/x86_64-w64-mingw32/15.1.0/lto-wrapper.exe
Target: x86_64-w64-mingw32
Configured with: ../gcc-15.1.0/configure --prefix=/ucrt64 --with-local-prefix=/ucrt64/local --with-native-system-header-dir=/ucrt64/include --libexecdir=/ucrt64/lib --enable-bootstrap --enable-checking=release --with-arch=nocona --with-tune=generic --enable-mingw-wildcard --enable-languages=c,lto,c++,fortran,ada,objc,obj-c++,jit --enable-shared --enable-static --enable-libatomic --enable-threads=posix --enable-graphite --enable-fully-dynamic-string --enable-libstdcxx-backtrace=yes --enable-libstdcxx-filesystem-ts --enable-libstdcxx-time --disable-libstdcxx-pch --enable-lto --enable-libgomp --disable-libssp --disable-multilib --disable-rpath --disable-win32-registry --disable-nls --disable-werror --disable-symvers --with-libiconv --with-system-zlib --with-gmp=/ucrt64 --with-mpfr=/ucrt64 --with-mpc=/ucrt64 --with-isl=/ucrt64 --with-pkgversion='Rev1, Built by MSYS2 project' --with-bugurl=https://github.com/msys2/MINGW-packages/issues --with-gnu-as --with-gnu-ld --disable-libstdcxx-debug --enable-plugin --with-boot-ldflags=-static-libstdc++ --with-stage1-ldflags=-static-libstdc++
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 15.1.0 (Rev1, Built by MSYS2 project)

$ rustc +stable test.rs && ./test
[test.rs:3:5] a.mul_add(b, c) = -4167095.8

$ rustc +stable -O test.rs && ./test
[test.rs:3:5] a.mul_add(b, c) = -4167095.8

I believe no changes to Rust code are necessary; users should really use UCRT C toolchains, and that's it (there are many other issues with MSVCRT, other math functions, locales, and so on). Someday Rust should also migrate rust-mingw to UCRT.

@tgross35
Copy link
Contributor

tgross35 commented May 1, 2025

Is running CI with UCRT something we can/should do, or did you mean that by migrating rust-mingw? This problem was noticed because of the CI failure at #138087 (comment). It's probably okay to skip the tests for now, but obviously that's not ideal in the longer term.

@mati865
Copy link
Contributor

mati865 commented May 1, 2025

Migrating CI from the MSVCRT toolchain to the UCRT one will also affect rust-mingw, although a workaround is possible.

MinGW-w64 provides libmsvcrt.a, which is the default CRT, so either MSVCRT or UCRT (the naming is unfortunate, but that's what you get for retaining compatibility; it used to mean MSCVRT). But also libmsvcrt-os.a that is always MSCVRT and libucrt.a that is always UCRT.
Dist step could copy libmsvcrt-os.a as libmsvcrt.a, and rust-mingw wouldn't change.

But all DLLs shipped by Rust (like std-xxx.dll) would then use UCRT.

@tgross35
Copy link
Contributor

tgross35 commented May 1, 2025

To clarify - are you suggesting that workaround is something we should actually eventually do? Unless there is a better workaround, I'm tempted to say we should switch to our libm for now and plan to switch back after we start linking to a known good version, either with the CRT change or with an upstream fix.

@mati865
Copy link
Contributor

mati865 commented May 1, 2025

Kind of, yes. I'd recommend making it official: state that MSVCRT will still work, but is not fully supported—results may vary (quite literally as you can see).

The libm workaround may be viable for the time being. What are its downsides?

Perhaps I should have mentioned this before. MSYS2 made the UCRT toolchain its default on 2022-10-29 (you can find the news entry at https://www.msys2.org/news/). Mingw-w64 did the same eight months later: mingw-w64/mingw-w64@82b8edc. That change was released with version v12 on 2024-05-29.

@ChrisDenton
Copy link
Member

The libm workaround may be viable for the time being. What are its downsides?

There shouldn't be any downsides except, possibly, perf. I don't know how well optimised our implementations are. I mean, assuming it's not breaking the calling convention anywhere (which is sometimes a hazard if hand-rolling assembly).

tgross35 added a commit to tgross35/compiler-builtins that referenced this issue May 1, 2025
Add Windows-GNU to the group of targets that gets our version of basic
libm symbols available through compiler-builtins. This is done to avoid
bugs in the platform `fma`, see [1].

[1]: rust-lang/rust#140515
@tgross35
Copy link
Contributor

tgross35 commented May 1, 2025

There shouldn't be any downsides except, possibly, perf. I don't know how well optimised our implementations are. I mean, assuming it's not breaking the calling convention anywhere (which is sometimes a hazard if hand-rolling assembly).

Perf is pretty close to musl in most cases and I assume others are roughly comparable, though we do lose dynamic linking. rust-lang/compiler-builtins#896 will help significantly for fma in particular, at least on modern CPUs. I would be interested to see some benchmarks in any case.

Anyway, a patch to do this is at rust-lang/compiler-builtins#900.

tgross35 added a commit to tgross35/compiler-builtins that referenced this issue May 2, 2025
This is a known problem in the MinGW fmaf implementation, identified at
[1].  Make sure our implementation passes this edge case.

[1]: rust-lang/rust#140515
tgross35 added a commit to rust-lang/compiler-builtins that referenced this issue May 2, 2025
This is a known problem in the MinGW fmaf implementation, identified at
[1].  Make sure our implementation passes this edge case.

[1]: rust-lang/rust#140515
No Sign up for free to join this conversation on GitHub. Already have an account? No Sign in to comment
Labels
A-floating-point Area: Floating point numbers and arithmetic C-bug Category: This is a bug. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness O-windows-gnu Toolchain: GNU, Operating system: Windows P-medium Medium priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

9 participants