use correct LLVM intrinsic for min/max on floats #153343

Merged
rust-bors[bot] merged 1 commit into rust-lang:main from RalfJung:min-max-fix
Mar 15, 2026

Conversation

@RalfJung
Member

RalfJung commented Mar 3, 2026

The Rust minnum/maxnum intrinsics are documented to return the other argument when one input is an SNaN. However, the LLVM lowering we currently choose for them does not match those semantics: we lower them to minnum/maxnum, which (since llvm/llvm-project#172012) is documented to non-deterministically return the other argument or NaN when one input is an SNaN.

LLVM does have an intrinsic with the intended semantics: minimumnum/maximumnum. Let's use that instead. We can set the nsz flag since we treat signed zero ordering as non-deterministic.

Also rename the intrinsics to follow the IEEE 754-2019 naming, since that is mostly (and in particular, as far as NaNs are concerned) what we do now. Also, minimum_number and minimum are harder to mix up than minnum and minimum.

r? @nikic
Cc @tgross35
Fixes #149537
Fixes #151286
(The issues are only fixed when using the latest supported LLVM, but I don't think we usually track problems specific to people compiling rustc with old versions of LLVM.)
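
For readers who want the intended NaN behaviour spelled out, here is a minimal sketch in plain Rust. It is a software model written for this illustration, not the code from this PR: the function name `minimum_number` is borrowed from the renamed intrinsic, the SNaN bit pattern is the one quoted later in this thread, and NaN payloads, NaN quieting, and the sign of a zero result are deliberately not modelled.

```rust
/// Software model (assumption, for illustration only) of the semantics described
/// above: if exactly one input is NaN, quiet or signaling, return the other input.
fn minimum_number(x: f32, y: f32) -> f32 {
    if x.is_nan() {
        // Also covers the case where both inputs are NaN: the result is then NaN.
        y
    } else if y.is_nan() {
        x
    } else if x < y {
        x
    } else {
        // x >= y. For +0.0 vs -0.0 this picks y, but callers should not rely on
        // which zero they get, since signed-zero ordering is non-deterministic.
        y
    }
}

fn main() {
    // A signaling NaN: exponent all ones, quiet bit clear, non-zero mantissa.
    let snan = f32::from_bits(0x7fbf_ffff);
    println!("{:?}", minimum_number(-9.0, snan));
    println!("{:?}", minimum_number(snan, -9.0));
    println!("{:?}", minimum_number(2.0, 1.0));
}
```

Under this model an SNaN input behaves exactly like a quiet NaN input, which is the property the old minnum/maxnum lowering no longer guarantees.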

@rustbot
Collaborator

rustbot commented Mar 3, 2026

Some changes occurred in compiler/rustc_codegen_cranelift

cc @bjorn3

Some changes occurred to the CTFE / Miri interpreter

cc @rust-lang/miri

Some changes occurred in compiler/rustc_codegen_gcc

cc @antoyo, @GuillaumeGomez

Some changes occurred to the intrinsics. Make sure the CTFE / Miri interpreter
gets adapted for the changes, if necessary.

cc @rust-lang/miri, @oli-obk, @lcnr

Some changes occurred to the CTFE machinery

cc @oli-obk, @lcnr

@rustbot added the A-LLVM, S-waiting-on-review, T-compiler, and T-libs labels Mar 3, 2026
);
// `nsz` in minimumnum/maximumnum is special: its only effect is to make signed-zero
// ordering non-deterministic.
unsafe { llvm::LLVMRustSetNoSignedZeros(call) };
Member Author

I have no idea if the way I wired up nsz is correct.^^
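
As a small illustration of what "signed-zero ordering is non-deterministic" means for callers, the sketch below checks the result of `f32::min` on zeros of opposite sign without depending on which zero comes back. The helper name `is_zero_any_sign` is made up for this example.

```rust
/// Hypothetical helper: true for +0.0 and for -0.0, since the two compare equal.
fn is_zero_any_sign(v: f32) -> bool {
    v == 0.0
}

fn main() {
    let m = 0.0_f32.min(-0.0);
    // Robust check: the result is *some* zero.
    assert!(is_zero_any_sign(m));
    // The sign bit may be either value, so code should only observe it for
    // diagnostics and never branch on it.
    println!("sign bit set: {}", m.is_sign_negative());
}
```

Comparing with `==` rather than comparing `to_bits()` values is what keeps such a check independent of the zero's sign.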

Contributor

nikic left a comment

r=me once the questions for other backends are answered.

if (auto I = dyn_cast<Instruction>(unwrap<Value>(V))) {
I->setHasNoSignedZeros(true);
}
}
Contributor

The C bindings have a native LLVMSetFastMathFlags(), we should probably switch to that. But I guess we should do that consistently for the existing LLVMRustSetAlgebraicMath/LLVMRustSetAllowReassoc/LLVMRustSetFastMath as well, so I don't particularly mind this in the meantime.

Member Author

Yeah I don't know why we have a separate wrapper for each flag configuration here, but I figured I'd follow the existing pattern.

@RalfJung
Member Author

RalfJung commented Mar 3, 2026

There are some odd things happening in CI:

2026-03-03T12:31:32.4740066Z rustc-LLVM ERROR: Cannot select: 0xff67d41c22a0: f128 = fcanonicalize nsz 0xff67d41c2a10
2026-03-03T12:31:32.4740629Z   0xff67d41c2a10: f128 = AArch64ISD::CSEL 0xff67d41c5660, 0xff67d41c52e0, Constant:i32<11>, 0xff67d41c2930:1

Why did fcanonicalize end up with nsz? That was meant just for minimumnum.
And also it seems like f128 minimumnum is just broken on aarch64?

@RalfJung
Member Author

RalfJung commented Mar 3, 2026

That was on the aarch64-gnu-llvm-20-1 runner. Maybe we have to fall back to minnum/maxnum for old LLVM versions?

@nikic
Contributor

nikic commented Mar 3, 2026

(Quoting @RalfJung:) That was on the aarch64-gnu-llvm-20-1 runner. Maybe we have to fall back to minnum/maxnum for old LLVM versions?

Ah yes, that's a good point. I believe minimumnum used to have some selection failures that were only fixed in LLVM 22.

@RalfJung
Member Author

RalfJung commented Mar 3, 2026

I guess that makes sense, the test fails on old LLVM where we (have to) use the wrong intrinsic.

@RalfJung
Member Author

RalfJung commented Mar 3, 2026

@bors try jobs=x86_64-gnu,aarch64-gnu

rust-bors bot pushed a commit that referenced this pull request Mar 3, 2026
rename min/maxnum intrinsics to min/maximum_number and fix their LLVM lowering


try-job: x86_64-gnu
try-job: aarch64-gnu

@RalfJung
Member Author

RalfJung commented Mar 3, 2026

This is very strange: I tested the fallback implementation locally and it passes. Why does it fail on the aarch64 runner?

And it's also a very strange return value. The inputs are from_bits(0x7fbfffff) and -9.0 and the output is from_bits(0x7fffffff)?!?
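
The two bit patterns in this report can be decoded by hand. The following is a minimal sketch, assuming the common IEEE 754 binary32 encoding in which mantissa bit 22 is the quiet bit (set for quiet NaNs, clear for signaling NaNs); the helper names are invented for this illustration. It suggests that 0x7fffffff is the input SNaN 0x7fbfffff with its quiet bit set, i.e. the quieted NaN input rather than the non-NaN operand -9.0.

```rust
/// Mask of the f32 exponent field (all ones for NaN and infinity).
const EXP_MASK: u32 = 0x7f80_0000;
/// Mask of the f32 mantissa field.
const MANT_MASK: u32 = 0x007f_ffff;
/// Most significant mantissa bit: assumed to be the quiet bit.
const QUIET_BIT: u32 = 0x0040_0000;

/// Hypothetical helper: does this bit pattern encode any NaN?
fn is_nan_bits(bits: u32) -> bool {
    (bits & EXP_MASK) == EXP_MASK && (bits & MANT_MASK) != 0
}

/// Hypothetical helper: does this bit pattern encode a signaling NaN?
fn is_signaling_nan_bits(bits: u32) -> bool {
    is_nan_bits(bits) && (bits & QUIET_BIT) == 0
}

/// Hypothetical helper: turn a NaN into a quiet NaN by setting the quiet bit.
fn quiet(bits: u32) -> u32 {
    bits | QUIET_BIT
}

fn main() {
    let input: u32 = 0x7fbf_ffff; // the NaN passed to the test
    let output: u32 = 0x7fff_ffff; // the value the test observed
    println!("input is signaling:  {}", is_signaling_nan_bits(input));
    println!("output is signaling: {}", is_signaling_nan_bits(output));
    println!("quiet(input) == output: {}", quiet(input) == output);
}
```

Working on raw `u32` bit patterns instead of `f32` values avoids any risk of the NaN being quieted while it is passed around.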

@rust-bors
Contributor

rust-bors bot commented Mar 3, 2026

☀️ Try build successful (CI)
Build commit: 5fc4d3f (5fc4d3f9142818f2f2b292605ba61c2b9b55f112, parent: 1b7d722f429f09c87b08b757d89c689c6cf7f6e7)

@RalfJung
Member Author

RalfJung commented Mar 3, 2026

Now with the commit that always forces the fallback impl to be used:
@bors try jobs=x86_64-gnu,aarch64-gnu,x86_64-gnu-gcc

rust-bors bot pushed a commit that referenced this pull request Mar 3, 2026
rename min/maxnum intrinsics to min/maximum_number and fix their LLVM lowering


try-job: x86_64-gnu
try-job: aarch64-gnu
try-job: x86_64-gnu-gcc
@RalfJung
Member Author

RalfJung commented Mar 3, 2026

Seems like LLVM 20 straight-up miscompiles code like this:

fn minimum_num(x: f32, y: f32) -> f32 {
    if x.is_nan() || y >= x {
        y
    } else {
        // Either y < x or y is a NaN.
        x
    }
}

const SNAN: f32 = f32::from_bits(f32::NAN.to_bits() - 1);

fn main() {
    dbg!(minimum_num(-9.0, std::hint::black_box(SNAN)));
}

I tried this on an aarch64 dev desktop: with Rust 1.87, an optimized build prints NaN, with latest stable Rust, it prints -9.0.

How do we handle library tests that trigger miscompilations on old LLVM versions...? We could just remove the black_box, but it'd be a shame to reduce test coverage on newer LLVM just because we also still test old LLVM.

Are we anywhere close to dropping LLVM 20? :D

@RalfJung
Member Author

@nikic Based on the CI results here, it looks like minimumnum/maximumnum already work well enough with LLVM 21. Is it fine to use them or should we still gate this on LLVM 22?

@RalfJung
Member Author

There's a confusingly large number of similarly named runners that don't all get run in PR CI, so who knows which tests are actually covered... let's do a try run.
@bors try jobs=x86_64-gnu-llvm-21-*

rust-bors bot pushed a commit that referenced this pull request Mar 15, 2026
use correct LLVM intrinsic for min/max on floats


try-job: x86_64-gnu-llvm-21-*

@RalfJung
Member Author

Uh, what? Now some libtest tests are failing...? Those do involve min/max. Maybe there are miscompilations on x86 as well here (that were hopefully fixed in LLVM 22)?

@rust-bors bot added the S-waiting-on-author label Mar 15, 2026
@rust-bors
Contributor

rust-bors bot commented Mar 15, 2026

💔 Test for ac14465 failed: CI. Failed jobs:

@rustbot
Collaborator

rustbot commented Mar 15, 2026

This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

@RalfJung
Member Author

Well, let's just land it with the fallback impl for LLVM 21 then, since minimumnum apparently still has issues. #153866 landed so this should be good to go.

@bors r=nikic rollup=never
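
The version gate described in this comment can be pictured with the sketch below. It is a standalone model for illustration, not the actual rustc codegen code: the enum `MinLowering` and the function `lower_minimum_number` are hypothetical names, and the threshold of LLVM 22 is taken from this thread (fallback implementation for LLVM 21 and older).

```rust
/// Hypothetical description of how a backend could handle the intrinsic.
#[derive(Debug, PartialEq)]
enum MinLowering {
    /// Emit a call to the named LLVM intrinsic (which would then be tagged `nsz`).
    Intrinsic(String),
    /// Do not emit an intrinsic call; compile the Rust-level fallback body instead.
    FallbackBody,
}

/// Sketch of the gate: only trust `llvm.minimumnum.*` on LLVM 22 or newer.
fn lower_minimum_number(llvm_major: u32, float_bits: u32) -> MinLowering {
    if llvm_major >= 22 {
        MinLowering::Intrinsic(format!("llvm.minimumnum.f{float_bits}"))
    } else {
        MinLowering::FallbackBody
    }
}

fn main() {
    for major in [20, 21, 22] {
        println!("LLVM {major}: {:?}", lower_minimum_number(major, 32));
    }
}
```

Keeping the decision in one place like this would make it easy to drop the fallback branch once the minimum supported LLVM version reaches 22.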

@rust-bors
Contributor

rust-bors bot commented Mar 15, 2026

📋 This PR cannot be approved because it currently has the following label: S-blocked.

@RalfJung removed the S-waiting-on-author and S-blocked labels Mar 15, 2026
@RalfJung
Member Author

@bors r=nikic rollup=never

@rust-bors
Contributor

rust-bors bot commented Mar 15, 2026

📌 Commit c7220f4 has been approved by nikic

It is now in the queue for this repository.

@rust-bors bot added the S-waiting-on-bors label Mar 15, 2026
@rust-bors bot added the merged-by-bors label and removed the S-waiting-on-bors label Mar 15, 2026
@rust-bors
Contributor

rust-bors bot commented Mar 15, 2026

☀️ Test successful - CI
Approved by: nikic
Duration: 3h 9m 54s
Pushing f125037 to main...

@rust-bors rust-bors bot merged commit f125037 into rust-lang:main Mar 15, 2026
12 checks passed
@rustbot rustbot added this to the 1.96.0 milestone Mar 15, 2026
@github-actions
Contributor

What is this? This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing 9e973d8 (parent) -> f125037 (this PR)

Test differences

Show 17 test diffs

Stage 1

  • [ui] tests/ui/float/minmax.rs: [missing] -> ignore (ignored when the LLVM version 21.1.2 is older than 22.0.0) (J2)

Stage 2

  • [ui] tests/ui/float/minmax.rs: [missing] -> ignore (ignored when the LLVM version 21.1.2 is older than 22.0.0) (J0)
  • [ui] tests/ui/float/minmax.rs: [missing] -> pass (J1)

Additionally, 14 doctest diffs were found. These are ignored, as they are noisy.

Job group index

Test dashboard

Run

cargo run --manifest-path src/ci/citool/Cargo.toml -- \
    test-dashboard f125037ccddbeb162bce09213548314988da97a6 --output-dir test-dashboard

And then open test-dashboard/index.html in your browser to see an overview of all executed tests.

Job duration changes

  1. dist-various-2: 39m 12s -> 45m 8s (+15.1%)
  2. test-various: 1h 51m -> 2h 8m (+14.6%)
  3. pr-check-2: 40m 19s -> 45m 19s (+12.4%)
  4. x86_64-gnu-llvm-21-2: 1h 31m -> 1h 42m (+12.0%)
  5. aarch64-gnu-llvm-21-2: 47m 15s -> 52m 44s (+11.6%)
  6. tidy: 2m 30s -> 2m 45s (+10.1%)
  7. optional-x86_64-gnu-parallel-frontend: 2h 32m -> 2h 47m (+9.6%)
  8. aarch64-gnu: 2h 5m -> 2h 17m (+9.2%)
  9. aarch64-gnu-debug: 1h 9m -> 1h 15m (+8.0%)
  10. dist-s390x-linux: 1h 34m -> 1h 27m (-7.5%)
How to interpret the job duration changes?

Job durations can vary a lot, based on the actual runner instance
that executed the job, system noise, invalidated caches, etc. The table above is provided
mostly for t-infra members, for simpler debugging of potential CI slow-downs.

@rust-timer
Collaborator

Finished benchmarking commit (f125037): comparison URL.

Overall result: ❌ regressions - no action needed

@rustbot label: -perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

Category | mean | range | count
Regressions ❌ (primary) | - | - | 0
Regressions ❌ (secondary) | 0.0% | [0.0%, 0.0%] | 1
Improvements ✅ (primary) | - | - | 0
Improvements ✅ (secondary) | - | - | 0
All ❌✅ (primary) | - | - | 0

Max RSS (memory usage)

Results (secondary 1.4%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

Category | mean | range | count
Regressions ❌ (primary) | - | - | 0
Regressions ❌ (secondary) | 1.4% | [0.8%, 1.9%] | 4
Improvements ✅ (primary) | - | - | 0
Improvements ✅ (secondary) | - | - | 0
All ❌✅ (primary) | - | - | 0

Cycles

Results (primary -2.8%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

Category | mean | range | count
Regressions ❌ (primary) | - | - | 0
Regressions ❌ (secondary) | - | - | 0
Improvements ✅ (primary) | -2.8% | [-2.8%, -2.8%] | 1
Improvements ✅ (secondary) | - | - | 0
All ❌✅ (primary) | -2.8% | [-2.8%, -2.8%] | 1

Binary size

Results (primary 0.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

Category | mean | range | count
Regressions ❌ (primary) | 0.2% | [0.2%, 0.2%] | 3
Regressions ❌ (secondary) | - | - | 0
Improvements ✅ (primary) | -0.1% | [-0.1%, -0.1%] | 1
Improvements ✅ (secondary) | - | - | 0
All ❌✅ (primary) | 0.1% | [-0.1%, 0.2%] | 4

Bootstrap: 484.216s -> 479.863s (-0.90%)
Artifact size: 394.84 MiB -> 394.80 MiB (-0.01%)

Labels

A-LLVM, merged-by-bors, T-compiler, T-libs

8 participants