Use niche-filling optimization even when multiple variants have data. by mikebenfield · Pull Request #94075 · rust-lang/rust

mikebenfield · 2022-02-17T07:40:53Z

Fixes #46213

rust-highfive · 2022-02-17T07:40:55Z

Some changes occurred to the CTFE / Miri engine

cc @rust-lang/miri

rust-highfive · 2022-02-17T07:40:56Z

r? @michaelwoerister

(rust-highfive has picked a reviewer for you, use r? to override)

mikebenfield · 2022-02-17T07:42:46Z

Several static size checks needed the size lowered. I'm unsure how to handle this since they'll fail with a bootstrapped compiler, so I've just commented them out for now.

oli-obk

This is amazing! Will review in detail later, but starting a perf run now

compiler/rustc_ast/src/ast.rs

oli-obk · 2022-02-17T09:45:55Z

@bors try @rust-timer queue

rust-timer · 2022-02-17T09:45:56Z

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

bors · 2022-02-17T09:46:04Z

⌛ Trying commit 31b86437df15d7b335697d3f971721104335d04e with merge dbc0f1b7dd1014d9c7e56714953a0daf1b4f4c35...

michaelwoerister · 2022-02-17T10:09:30Z

r? @eddyb -- reviewing this is out of my league ;)

Before merging something like this, we need to make sure that debuginfo generation handles this correctly (especially the cpp-like encoding). cc @wesleywiser

RalfJung · 2022-02-17T14:15:51Z

Wow that's cool. :D

It would probably be good to also have a ui test ensuring specific cases are handled (like some of the examples from the issue). I imagine there would be an existing ui test that can be extended, but I am not sure.

mikebenfield · 2022-02-17T16:59:09Z

I added a simple test case.

not(bootstrap) doesn't seem to do the trick; I still get assertion failures with that.

mikebenfield · 2022-02-17T17:20:52Z

Never mind; not(bootstrap) works fine.
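For readers unfamiliar with the static size checks discussed above, here is a minimal sketch of the idea. The macro name `assert_size!` is made up for this illustration and is not the compiler's own helper. Inside the compiler tree such a check would additionally be gated on `#[cfg(not(bootstrap))]`, so that it is only evaluated by the newly built compiler and not by the older bootstrap compiler, whose layouts may differ. The sketch uses `target_pointer_width` as a stand-in gate so that it builds outside the compiler tree.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

// Hypothetical helper (not rustc's own macro): fail compilation if a type's
// size differs from the expected number of bytes.
macro_rules! assert_size {
    ($ty:ty, $size:expr) => {
        const _: () = assert!(size_of::<$ty>() == $size);
    };
}

// In the compiler tree the gate would be `#[cfg(not(bootstrap))]`; a
// pointer-width gate is used here only so the sketch builds anywhere.
#[cfg(target_pointer_width = "64")]
assert_size!(Option<Box<u64>>, 8);

// These sizes are documented guarantees, so the check needs no gate.
assert_size!(Option<NonZeroU32>, 4);

fn main() {
    println!("{}", size_of::<Option<NonZeroU32>>());
}
```

Because the assertion lives in a `const` item, a mismatch is reported at compile time instead of at run time.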

RalfJung · 2022-02-17T18:10:25Z

src/test/ui/feature-gates/feature-gate-cfg-target-has-atomic.rs

This test case seems to have crept in from another PR...?

compiler/rustc_feature/src/accepted.rs

compiler/rustc_feature/src/builtin_attrs.rs

joshtriplett · 2022-02-17T20:18:03Z

This is incredible work! I've wanted to see this level of niche optimization for a long time. I think this makes niche optimization work more like people often expect it'll work.

mikebenfield · 2022-09-12T20:31:00Z

@nnethercote What do you find unsatisfactory about the output now?

nnethercote · 2022-09-13T00:24:55Z

The main problem is that there are things marked as padding that aren't padding. E.g. for Call and Path, bytes 8..16 contain the tag.

The smaller problem is that it's not shown how the tag is stored for MethodCall.

Something like this would be much clearer:

print-type-size     variant `MethodCall`: 64 bytes
print-type-size         field `.0`: 24 bytes
print-type-size         field `.1`: 8 bytes (includes niche discriminant)
print-type-size         field `.2`: 24 bytes
print-type-size         field `.3`: 8 bytes
print-type-size     variant `Path`: 64 bytes 
print-type-size         discriminant: 8 bytes
print-type-size         padding: 8 bytes
print-type-size         field `.0`: 8 bytes, alignment: 8 bytes
print-type-size         field `.1`: 40 bytes
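The phrase "includes niche discriminant" in the proposed output means that the tag lives in bit patterns that are invalid for the field itself. A minimal sketch of that idea, using a standard-library type instead of ast::Expr (the helper name `raw_byte` is an assumption made for this illustration):

```rust
use std::mem::{size_of, transmute};
use std::num::NonZeroU8;

/// Hypothetical helper: view an `Option<NonZeroU8>` as its single raw byte.
fn raw_byte(v: Option<NonZeroU8>) -> u8 {
    // `Option<NonZeroU8>` is documented to have the same size as `u8`.
    // All 255 non-zero bytes are taken by `Some`, so `None` must be 0.
    unsafe { transmute::<Option<NonZeroU8>, u8>(v) }
}

fn main() {
    // No separate tag byte: the enum is exactly as large as its payload.
    assert_eq!(size_of::<Option<NonZeroU8>>(), 1);
    // `Some(7)` is stored as the payload itself.
    assert_eq!(raw_byte(NonZeroU8::new(7)), 7);
    // `None` is encoded as the one value the payload can never hold.
    assert_eq!(raw_byte(None), 0);
}
```

A size report that lists such a field as plain data, or lists the tag bytes as padding, hides where the discriminant actually lives, which is the complaint raised above.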

pnkfelix · 2022-09-14T04:15:59Z

On the subject of the performance regression, it seems like the impact to syn here might be severe: 9.44% regression for incr-patched:println sounds bad to me.

From the detailed info, it seems like a big portion of that is blamed on time in LLVM_lto_optimize. Maybe there's nothing that can be done to address that, but it seemed worth noting.

(Is it conceivable that the change here might actually be causing LLVM_lto_optimize to be doing new useful work when compiling syn? )

mikebenfield · 2022-09-14T16:03:28Z

> Is it conceivable that the change here might actually be causing LLVM_lto_optimize to be doing new useful work when compiling syn?

Not in a way that is obvious to me.

mikebenfield · 2022-09-14T21:44:25Z

I don't understand the linked performance results. The command says the cachegrind profiler is used, but the results don't look like those produced by cachegrind. Here the results are measured in seconds and organized by query. When I run the cachegrind profiler the results are measured in CPU cycles and organized by individual function.

bjorn3 · 2022-09-14T21:56:35Z

The cachegrind commands are there in case you want to investigate what changed. They are not the actual commands that ran. What you see in the table is the result of rustc -Zself-profile being formatted by the summarize command of the rust-lang/measureme repo. This is noisier than cachegrind, but a lot faster. Cachegrind is way too slow to run on the perf bot and doesn't account for IPC changes.

RalfJung · 2022-09-15T05:51:07Z

It's probably worth opening an issue for the perf investigation; discussions in merged PRs tend to get lost.

jdahlstrom · 2022-11-01T21:08:41Z

Hmm, should the following enum be eligible for niche optimization now (courtesy u/gitpy on Reddit)?

enum Data {
    A(f64),
    B(Buf),
}

struct Buf(*const u8, i16);

fn main() {
    println!("{}", std::mem::size_of::<Buf>());  // 16
    println!("{}", std::mem::size_of::<Data>()); // 24 even on nightly
}

cuviper · 2022-11-01T21:20:23Z

Padding is not used as a niche -- it could contain anything, including the value that you would want to indicate the other variant.

jdahlstrom · 2022-11-01T22:01:03Z

@cuviper Ah, right. Amusingly you can now offer the compiler a place that works as a niche, though =)

enum Data {
    A(f64),
    B(Buf),
}
struct Buf(*const u8, i16, bool);
//                       ++++++

println!("{}", std::mem::size_of::<Buf>());  // 16
println!("{}", std::mem::size_of::<Data>()); // 16 on nightly!

cuviper · 2022-11-01T22:10:33Z

Yep! And if you want even more niche, you can use a univariant enum, with a repr to keep it from being a ZST:

#[repr(u8)] // or bigger!
enum Pad { Empty }
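The three variations discussed in the last few comments can be compared side by side. This is only a sketch: the struct names `Padded`, `WithBool` and `WithPad` are made up here, `Data` is made generic purely to avoid repeating the enum, and the printed sizes depend on target and compiler version since the default enum layout is unspecified. The thread reports 24 bytes for the padding-only case and 16 bytes for the `bool` case on nightly; the `Pad` case is expected to behave like the `bool` one but was not reported in the thread.

```rust
use std::mem::size_of;

// Single-variant enum; `repr(u8)` keeps it one byte wide instead of
// zero-sized, leaving 255 invalid bit patterns as niche.
#[repr(u8)]
#[allow(dead_code)]
enum Pad {
    Empty,
}

// Trailing bytes are padding only: no niche the compiler may use.
#[allow(dead_code)]
struct Padded(*const u8, i16);

// A `bool` offers 254 invalid bit patterns.
#[allow(dead_code)]
struct WithBool(*const u8, i16, bool);

// `Pad` offers 255 invalid bit patterns.
#[allow(dead_code)]
struct WithPad(*const u8, i16, Pad);

// Generic stand-in for the `Data` enum from the comments above.
#[allow(dead_code)]
enum Data<T> {
    A(f64),
    B(T),
}

fn main() {
    println!("Padded:   {} / {}", size_of::<Padded>(), size_of::<Data<Padded>>());
    println!("WithBool: {} / {}", size_of::<WithBool>(), size_of::<Data<WithBool>>());
    println!("WithPad:  {} / {}", size_of::<WithPad>(), size_of::<Data<WithPad>>());
}
```

Each line prints the size of the struct followed by the size of the enum that wraps it, so a pair of equal numbers indicates that the enum found room for its tag inside the struct.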

rust-highfive assigned michaelwoerister Feb 17, 2022

rustbot added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Feb 17, 2022

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Feb 17, 2022

mikebenfield force-pushed the wip-enum branch from cb54662 to 31b8643 Compare February 17, 2022 07:56

oli-obk reviewed Feb 17, 2022

compiler/rustc_ast/src/ast.rs (outdated, resolved)

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 17, 2022

mikebenfield force-pushed the wip-enum branch from 31b8643 to ffc62f6 Compare February 17, 2022 10:08

rust-highfive assigned eddyb and unassigned michaelwoerister Feb 17, 2022

mikebenfield force-pushed the wip-enum branch from ffc62f6 to 690139c Compare February 17, 2022 16:55

mikebenfield force-pushed the wip-enum branch from 690139c to 73d9948 Compare February 17, 2022 17:20

RalfJung reviewed Feb 17, 2022

src/test/ui/feature-gates/feature-gate-cfg-target-has-atomic.rs (outdated)

joshtriplett reviewed Feb 17, 2022

compiler/rustc_feature/src/accepted.rs (outdated, resolved)

joshtriplett reviewed Feb 17, 2022

compiler/rustc_feature/src/builtin_attrs.rs (outdated, resolved)

RalfJung mentioned this pull request Sep 8, 2022

Missed enum layout optimization with a NonZeroU64 + more space in an enum #101567

Open

nnethercote mentioned this pull request Sep 14, 2022

Shrink ast::Expr harder #101562

Merged

mikebenfield mentioned this pull request Sep 15, 2022

Performance regression with niche optimization #101872

Open

nnethercote mentioned this pull request Sep 20, 2022

Tell rustc about unused bits in Span. #102035

Closed

Kijewski mentioned this pull request Sep 23, 2022

Add niche optimization rust-ux/uX#49

Open

nnethercote mentioned this pull request Sep 30, 2022

Reinstate hir-stats.rs test for stage 1. #102495

Merged

celinval mentioned this pull request Sep 30, 2022

Upgrade toolchain to nightly-2022-09-13 model-checking/kani#1737

Merged

4 tasks

lqd mentioned this pull request Oct 1, 2022

cranelift: update x64 inst_size_test bytecodealliance/wasmtime#4991

Closed

ikrivosheev mentioned this pull request Oct 28, 2022

Add clippy into CI and fix clippy warnings rust-bakery/nom#1569

Closed

the8472 mentioned this pull request Nov 18, 2022

optimize field ordering by grouping m*2^n-sized fields with equivalently aligned ones #102750

Merged

luqmana mentioned this pull request Nov 20, 2022

DW_AT_discr_value elided in DWARF for some enum variants #104625

Open

the8472 mentioned this pull request Nov 24, 2022

Niche placement heuristic: place at beginning or end of type #104807

Closed

decathorpe mentioned this pull request Dec 2, 2022

doctests fail with Rust 1.65+: std::borrow::Cow size assumption no longer valid maciejhirsz/beef#52

Open

tjni mentioned this pull request Dec 6, 2022

zee: fix for rust 1.65 NixOS/nixpkgs#204862

Merged

13 tasks

Geal mentioned this pull request Dec 28, 2022

fix build issues on different rust versions rust-bakery/nom#1596

Merged

lqd mentioned this pull request Feb 12, 2023

Crash, invalid free in monterey #107929

Closed

relrelb mentioned this pull request Mar 24, 2023

chore: Remove static_assertions dependency ruffle-rs/ruffle#10348

Merged

adwinwhite mentioned this pull request Sep 19, 2024

missed optimization: fat pointers in two-variant enums with small second variants #48654

Open

ginnyTheCat mentioned this pull request Mar 17, 2025

Outdated claims maciejhirsz/beef#57

Open

20 participants