Skip to content

Do not deduplicate captured args while expanding format_args!#149926

Open
ShoyuVanilla wants to merge 1 commit intorust-lang:mainfrom
ShoyuVanilla:no-dedup-fmt
Open

Do not deduplicate captured args while expanding format_args!#149926
ShoyuVanilla wants to merge 1 commit intorust-lang:mainfrom
ShoyuVanilla:no-dedup-fmt

Conversation

@ShoyuVanilla
Copy link
Member

@ShoyuVanilla ShoyuVanilla commented Dec 12, 2025

View all comments

Resolves #145739

I ran crater with #149291.
While there are still a few seemingly flaky, spurious results, no crates appear to be affected by this breaking change.

The only hit from the lint was
https://github.com/multiversx/mx-sdk-rs/blob/813927c03a7b512a3c6ef9a15690eaf87872cc5c/framework/meta-lib/src/tools/rustc_version_warning.rs#L19-L30,
which performs formatting on consts of type ::semver::Version. These constants contain a nested ::semver::Identifier (Version.pre.identifier) that has a custom destructor. However, this case is not impacted by the change, so no breakage is expected.

@rustbot
Copy link
Collaborator

rustbot commented Dec 12, 2025

Some changes occurred in src/tools/clippy

cc @rust-lang/clippy

Some changes occurred in compiler/rustc_ast_lowering/src/format.rs

cc @m-ou-se

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-clippy Relevant to the Clippy team. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Dec 12, 2025
@rustbot
Copy link
Collaborator

rustbot commented Dec 12, 2025

r? @spastorino

rustbot has assigned @spastorino.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rust-log-analyzer

This comment has been minimized.

@ShoyuVanilla
Copy link
Member Author

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 13, 2025
@ShoyuVanilla
Copy link
Member Author

@rustbot ready

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Dec 13, 2025
@theemathas theemathas added the I-lang-nominated Nominated for discussion during a lang team meeting. label Dec 14, 2025
@theemathas
Copy link
Contributor

Nominating as per #145739 (comment)

@traviscross traviscross added P-lang-drag-1 Lang team prioritization drag level 1. https://rust-lang.zulipchat.com/#narrow/channel/410516-t-lang T-lang Relevant to the language team needs-fcp This change is insta-stable, or significant enough to need a team FCP to proceed. labels Dec 14, 2025
@traviscross
Copy link
Contributor

traviscross commented Dec 14, 2025

It'd be worth adding a test for the drop behavior.

@traviscross
Copy link
Contributor

traviscross commented Dec 14, 2025

Given that this makes more sense for the language, along with the clean crater results and the intuition that it'd be surprising if anything actually leaned on this, I propose:

@rfcbot fcp merge lang

@rust-rfcbot
Copy link
Collaborator

rust-rfcbot commented Dec 14, 2025

Team member @traviscross has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

cc @rust-lang/lang-advisors: FCP proposed for lang, please feel free to register concerns.
See this document for info about what commands tagged team members can give me.

@rust-rfcbot rust-rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Dec 14, 2025
@m-ou-se m-ou-se assigned m-ou-se and unassigned spastorino Dec 17, 2025
@m-ou-se
Copy link
Member

m-ou-se commented Dec 24, 2025

I don't think we should do this. It will make the generated code for println!("{x} {x}"); less efficient, as it will get two separate arguments instead of one.

I don't want to end up in a situation where it would make sense for Clippy to suggest something like:

warning: using the same placeholder multiple times is inefficient as of Rust 1.94.0
 --> src/main.rs:3:5
  |
3 |     println!("{x} {x}");
  |     ^^^^^^^^^^^^^^^^^^^
  |
help: change this to
  |
3 -     println!("{x} {x}");
3 +     println!("{x} {x}", x = x);
  |

Adding , x = x shouldn't make a difference. If adding that makes the resulting code more efficient, I strongly feel like we've done something wrong.

@rust-rfcbot concern equivalence

@nia-e
Copy link
Member

nia-e commented Feb 10, 2026

@theemathas In that case RIP that idea, but the performance argument applies regardless. Deduplicating the pointers inside format_args!'s buffer still saves us a good bit of memory; there's a line in that display impl with a format_args! call which has 6 different locals 5 times each in different places for pretty-printing some tables. This change would make the 6 pointers expand to 30 in that buffer.

@m-ou-se
Copy link
Member

m-ou-se commented Feb 10, 2026

I think we're making a mistake if we accept this.

Today, Clippy's uninlined_format_args lint suggests:

warning: variables can be used directly in the `format!` string
 --> src/main.rs:5:5
  |
5 |     println!("Hello, {0}! How are you, {0}?", name);
  |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |
help: change this to
  |
5 -     println!("Hello, {0}! How are you, {0}?", name);
5 +     println!("Hello, {name}! How are you, {name}?");
  |

Because these two lines are considered equivalent.

The change proposed in this PR makes these two lines no longer equivalent. After this change, the user is either expected to be okay with a subtle performance hit, or to apply this seemingly useless change to keep things the same:

  |
5 -     println!("Hello, {name}! How are you, {name}?");
5 +     println!("Hello, {name}! How are you, {name}?", name=name);
  |

The only suggestion to prevent that in the common case is by making the format_args!() builtin macro way more magical, by changing how it expands depending on whether arguments refer to a local variable.

While this is technically implementable, this is simply not a reasonable/acceptable change. As I mentioned above, this would make the macro beyond reasonably magical and above all hard to maintain. To be extra clear: as the implementor and maintainer of the format_args builtin macro: I will not accept such a change.

Got it. I personally would not want to see such a lint, and would rather see us avoid pushing people to do that. But it sounds like we're aligned on @dianne's current approach, assuming the implementation is as feasible as expected.

@joshtriplett Such an implementation is not feasible/reasonable. We will end up in a situation where users who care about performance will need to add things like arg=arg.

In particular, even if we end up with a Clippy lint that fires for println!("{x} {x}"), I think we can live with that.

I don't think we should accept that. Using the same captured argument multiple times is very normal and even suggested by Clippy today.

It'd be a bad developer experience if in Rust 1.58 we suggest that users change print!("{0} {0}", x) to print!("{x} {x}"), but in 1.94 we then go "ha never mind, change that back or make it print!("{x} {x}", x=x)". How is that acceptable?

I would not feel great sending out a bunch of PRs to projects on GitHub adding arg1=arg1, arg2=arg2 to a bunch of formatting macro invocations.

It is true that for many projects a slight performance hit for formatting doesn't matter much, but using that to argue for changes like this is exactly what makes Rust's string formatting something that's currently being avoided on embedded platforms.

I've been working for years to try to improve the situation. But changes like the one in FCP here are pushing in the opposite direction, making the situation worse.

Consistency for funny weird edge cases like #145739 is great, but please let's not use that as an argument for making the real world experience of and performance of Rust worse.

@m-ou-se m-ou-se added I-lang-nominated Nominated for discussion during a lang team meeting. I-libs-api-nominated Nominated for discussion during a libs-api team meeting. labels Feb 10, 2026
@joshtriplett
Copy link
Member

We discussed this in today's @rust-lang/libs-api meeting. This issue has been going back and forth for a while. At this point, rather than arguing over the feasibility or non-feasibility of an optimization, we felt like we'd like to see the code of the optimization and evaluate that. Knowing that it's possible, seeing whether it looks maintainable, and knowing how much or how little code it takes, would affect our decision on this.

@rfcbot concern lets-see-the-optimization

@dianne, would you be able to show us a PR for the optimization?

@rust-rfcbot rust-rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. and removed final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. labels Feb 10, 2026
@iago-lito
Copy link
Contributor

Deduplication feels wrong to me in the context of #145739, but m-ou-se arguments seem right to me here.

Do I understand correctly that this introduces a tension with potential future extensions like:

println!("{a.b} {a.b}");

because deduplicating these would require deduplicating expressions, not just identifiers?

My guess is that stopping deduplication opens the door to such extensions, and sticking to deduplication instead makes them less realistic.

(although I am having a hard time reasoning about my sentence here with all these stacked negations)

I am okay with a future without format!("{field.access.paths}") and only format!("{strict_formatting_identifiers}"), if this makes sense to Rust. The former is much comfortable but it actually does feel a bit "too magic" to me.

@joshtriplett
Copy link
Member

joshtriplett commented Feb 10, 2026

because deduplicating these would require deduplicating expressions, not just identifiers?

Or alternatively, defining increasingly complex rules like "we deduplicate the a but not the a.b", factoring out common prefixes...

We would have to define those because they'd affect the semantics of the program. If we ensure that we only deduplicate as an as-if optimization that isn't semantically visible, the user doesn't have to care about deduplication as a complicating factor, they can just treat it as an optimization.

I am okay with a future without format!("{field.access.paths}") and only format!("{strict_formatting_identifiers}"), if this makes sense to Rust. The former is much comfortable but it actually does feel a bit "too magic" to me.

The ability to format {self.field} is a very common request.

@ShoyuVanilla
Copy link
Member Author

As the author of this PR, I approached this work under the assumption that it was a fairly straightforward and uncontroversial change, based on the discussions in #145739
. The main challenge I anticipated was identifying the affected crates.

I didn't expect the topic to be this debatable, and had I known that upfront, I might have reconsidered taking this on, since I'm not deeply familiar with the formatting. For that reason, I've mostly stayed quiet so far. I don't feel I have particularly strong insights to add beyond what's already been discussed.

Apologies if this PR caused extra discussion or confusion; that wasn't my intention 😢

@RalfJung
Copy link
Member

@ShoyuVanilla you don't have to be sorry! You just had the bad luck of stumbling into a superficially simple problem that has some deep subtle issues. This happens occasionally when one builds a complicated system such as a programming language, but we can't predict which issues it affects. It's nor your fault that these issues exist. :) I'm sorry this turned out to be heavier than you thought.

@jackh726
Copy link
Member

because deduplicating these would require deduplicating expressions, not just identifiers?

Or alternatively, defining increasingly complex rules like "we deduplicate the a but not the a.b", factoring out common prefixes...

There's also the only-slightly more complex rule of format_args!("{a} {a}") is deduplicated, but format_args!("{a.b} {a.b}") is not, under the logic that you can factor out the former into format_args!("{a} {a}", a=a) but not the latter into format_args!("{a.b} {a.b}", a.b=a.b).

Also, to be clear: the above rule would allow us to move forward with implementing (and even stabilizing) format_args!("{a.b} {a.b}") without even needing to worry about whether or not we want to change the deduplication of format_args!("{a} {a}"). I don't think anybody is arguing that doing any set of deduplication on format_args!("{a.b} {a.b}") makes much sense, but even less so for format_args!("{a.b.c} {a.b.c}").

@m-ou-se
Copy link
Member

m-ou-se commented Feb 10, 2026

Do we expect format_args!("{p.name}", p=get_person()) to work? If so, format_args!("{a.b}") could simply desugar to format_args!("{a.b}", a=a).

@traviscross traviscross added the P-lang-drag-1 Lang team prioritization drag level 1. https://rust-lang.zulipchat.com/#narrow/channel/410516-t-lang label Feb 11, 2026
@dianne
Copy link
Contributor

dianne commented Feb 11, 2026

@dianne, would you be able to show us a PR for the optimization?

I've got a half-baked PR that does the optimization for captured formatted arguments: #152480. It's missing the equivalent optimization for captured width and precision arguments for now, since I think that would be more invasive. Plus, when attempting to preserve the behavior of writing the args out separately, {:w$} only seems about as de-duplicable as {a.b}; both reasonably should be pure, but can execute user code via Deref impls (though that's probably much less likely from coercing a width to &usize than from a field access).

I'm not sure this is the right thing to do, especially given that width/precision messiness. Hopefully the draft PR makes things a bit clearer at least?

@joshtriplett
Copy link
Member

As the author of this PR, I approached this work under the assumption that it was a fairly straightforward and uncontroversial change, based on the discussions in #145739 . The main challenge I anticipated was identifying the affected crates.

I didn't expect the topic to be this debatable, and had I known that upfront, I might have reconsidered taking this on, since I'm not deeply familiar with the formatting. For that reason, I've mostly stayed quiet so far. I don't feel I have particularly strong insights to add beyond what's already been discussed.

Apologies if this PR caused extra discussion or confusion; that wasn't my intention 😢

You didn't do anything wrong here. We're having a vigorous discussion about whether we should do this, but none of that was caused by your PR; we appreciate you doing the work on this.

@joshtriplett
Copy link
Member

joshtriplett commented Feb 16, 2026

Giving a bit of additional context here, which may not be clear because while this has been discussed in several different Rust team meetings, the discussion and motivations haven't all made it onto GitHub threads, or has been spread across several threads. I've heard from a couple of folks that that context wasn't at all clear, and that can be frustrating and make it feel like things are happening for unclear or insufficiently motivated reasons.

This issue was originally discovered by @theemathas in #145739 (comment) .

This came up further in discussion of RFC 3626, but this is not a dependency of that RFC and that RFC is not the motivation for making the change; it is reasonably straightforward to make that RFC handle either approach.

The primary motivation for many has been the consistency of the language, and avoiding what appears to be a user trap or potential source of confusion, caused by the interaction of several features, some of which users don't commonly think about. (For instance, several people reacted to the use of a nullary type constructor as a captured argument with "you can do that?".)

When looking at it, folks on both lang and opsem felt it didn't make sense that f(something, something) would evaluate something twice (just as something + something does) but format!("{something} {something}") would not; that difference felt like surprising behavior for people to deal with. There were a couple of different ways to make that evaluation difference apparent, but the reactions generally seemed more like "are those obscure enough that it's not too much of a hazard to leave it this way", rather than any feeling that the current behavior was the right behavior on principle.

That, alone, provided strong enough motivation to look into changing this behavior. The initial reaction was to consider if there'd be substantive performance impact for just removing the behavior entirely. However, feedback on this and other threads surfaced that it's reasonably common to write multiple instances of the same capture in the same format string, and that this was thus an important optimization; that led us to hesitate a bit. We wondered, at that point, if it'd be possible to do an as-if optimization rather than a user-visible optimization. @dianne came to the rescue, suggesting that an as-if optimization for many common cases should be possible. (A draft of this optimization has now been posted at #152480 .)

Several folks have weighed in on this, including @RalfJung, regarding the value and desirability of the semantic change. That feedback applies to the consistency of the current Rust language, leaving aside any potential future changes we might make that would benefit from it.

I'm hoping that that makes the motivations and some of the thinking behind this proposed change more explicitly clear. In particular, I hope that makes it sufficiently clear that this change is not driven primarily by the desire to make further language changes, and multiple people across multiple teams have expressed a desire to make this change even if we don't add any further extensions to the formatting machinery.

@RalfJung
Copy link
Member

Thanks for the summary!

FWIW, I think the current behavior, while surprising, does at least have a "low-entropy explanation" via the desugaring to a single named argument. But with struct fields entering the picture, the behavior that includes user-visible deduplication becomes quite messy.

@joshtriplett
Copy link
Member

joshtriplett commented Feb 17, 2026

@RalfJung Sure, it's definitely explainable. But a "low-entropy explanation" for different behavior is never going to beat a "no-entropy explanation" (namely "it works the same as it does everywhere else").

@joshtriplett
Copy link
Member

So, I filed the concern lets-see-the-optimization for exactly the reason it says on the tin: libs-api wanted to see and consider the optimization before deciding this, to evaluate the maintainability concerns.

We've now seen the optimization. I no longer have this concern myself, and don't want to maintain this concern by proxy. If someone else wishes to file a concern, they should do so.

@rfcbot resolve lets-see-the-optimization

@rust-rfcbot rust-rfcbot added the final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. label Feb 25, 2026
@rust-rfcbot
Copy link
Collaborator

🔔 This is now entering its final comment period, as per the review above. 🔔

@rust-rfcbot rust-rfcbot removed the proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. label Feb 25, 2026
@nikomatsakis
Copy link
Contributor

nikomatsakis commented Feb 25, 2026

@rfcbot concern just-does-not-seem-worth-it

So I've been kind of sitting this out, and I hate to do this, but I am going to raise a concern for a bit more discussion. The bottom line is that I don't really buy the "consistency" motivation for this change. I see no reason to expect that format!("{X}, {X}") could be consistent with foo(X, X). In fact, I would much more expect consistency between format!("{X}, {X}") and format!("{X}, {X}", X=X).

It seems like the consistency concern comes up when you look at something like format!("{expr()}, {expr()}")?

But I see a rather consistent mental model which is that embedded {} blocks introduce an argument which can be named (in the case of X or a field chain like x.y.z) or anonymous (all other kinds of expressions).

I gather that the main argument in FAVOR of this chang is that it'd be simpler if the embedded {} blocks desugared to always a fresh argument. And this is rather stronger with f strings, i.e., f"foo({X},{X})" doesn't have that "desugars to X=X" to lean on, rather you must explain in terms of format!.

I have a kind of internal heuristic which is like: don't break compatibility if you don't have to. Right now, if this were a fresh design, I suppose I'd be on the fence because of f strings, but given that (a) there is code in the wild; (b) the desugaring is still easy to understand either way; and (c) it has real-world impact on performance and code-size, I don't understand why we would make this change.

Am I missing something? Is there a more complex argument?

If I do have this right, can somebody summarize the perf/cost hit and point me at the comments about the complexity of the proposed optimization?

(The optimization smacks to me "sufficiently smart compiler" when a rather "dumb compiler hack" would do the trick...)

EDIT: I read over more the comments. I can see that the f"" strings and the case with more fields does make the desugaring more complex to think about. I'll ponder a bit.

@rust-rfcbot rust-rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. and removed final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. labels Feb 25, 2026
@nikomatsakis
Copy link
Contributor

nikomatsakis commented Feb 25, 2026

Actually, what would help me is to write-out the precise desugaring expected for format!("{X.f}, {X.f}"). @m-ou-se you had a comment but skimming it I didn't quite see what desugaring you expected there?

EDIT: I guess that based on this comment...

Do we expect format_args!("{p.name}", p=get_person()) to work? If so, format_args!("{a.b}") could simply desugar to format_args!("{a.b}", a=a).

...it's clear enough.

@traviscross
Copy link
Contributor

The bottom line is that I don't really buy the "consistency" motivation for this change. I see no reason to expect that format!("{X}, {X}") could be consistent with foo(X, X). In fact, I would much more expect consistency between format!("{X}, {X}") and format!("{X}, {X}", X=X).

What about between format!("{X} {X}") and format!("{} {}", X, X)?

Am I missing something? Is there a more complex argument?

I think the best case for the simple argument is demonstrated by the earnest surprise that @theemathas (in #145739) and @Jules-Bertholet (in rust-lang/rfcs#3626 (comment)) had about the behavior. Obviously both are Rust experts. As @Jules-Bertholet said:

const Z: SomeStruct = SomeStruct::new();
println!("{Z.field1} {Z.field2} {Z.field3}");

If Z has interior mutability, then it has to be evaluated several times, no? That’s how consts work everywhere else in Rust. If you want only one instance, use a static

The simple argument really is that simple. From lang, we've pushed a consistent evaluation opsem for consts, the interpolation behavior is inconsistent with that, and that surprises people.


The slightly-more-complex argument, in an RFC 3626 context, goes as follows.

Let's say we want to desugar this:

fn main() {
    println!("{X.f} {X.f}"); //~ Format string to desugar.
}

// Feel free to ignore this scaffolding...
use core::{fmt, sync::atomic::{AtomicU8, Ordering::Relaxed}};

const X: W = W(Y);
const Y: U = U { f: V(AtomicU8::new(0)) };

struct W(U);
struct U { f: V }
struct V(AtomicU8);

impl fmt::Display for V {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let x = self.0.fetch_add(1, Relaxed);
        write!(f, "displayV({x})")
    }
}

impl Drop for W {
    fn drop(&mut self) {
        let x = self.0.f.0.fetch_add(1, Relaxed);
        println!("dropW({x})");
    }
}

impl core::ops::Deref for W {
    type Target = U;
    fn deref(&self) -> &Self::Target {
        let x = self.0.f.0.fetch_add(1, Relaxed);
        println!("derefW({x})");
        &self.0
    }
}

There are three ways that come to mind:

fn main() {
    //println!("{X.f} {X.f}"); //~ Format string to desugar.
    println!("-- 1:");
    println!("{} {}", X.f, X.f); //~ Desugaring 1.
    println!("-- 2:");
    { let x = X; println!("{} {}", x.f, x.f) }; //~ Desugaring 2.
    println!("-- 3:");
    println!("{} {0:}", X.f); //~ Desugaring 3.
}

Playground link

These produce outputs:

-- 1:
derefW(0)
derefW(0)
displayV(1) displayV(1)
dropW(2)
dropW(2)
-- 2:
derefW(0)
derefW(1)
displayV(2) displayV(3)
dropW(4)
-- 3:
derefW(0)
displayV(1) displayV(2)
dropW(3)

I find the output of desugaring 1 satisfying, given our intended opsem expectations for consts. I find the output of desugarings 2 and 3 various degrees of unsatisfying. In particular, the idea that with println!("{X.f} {X.f}") we would get (with desugaring 2) two derefs but only one drop would just seem to be a real wart and a likely-continual source of surprise. But desugaring 3 is no picnic either, especially when we start thinking about how we'd want to handle {x.y} {x.y.z}.

For my part, if we were not to clean this up, then I'd become more skeptical of whether we'd want to expand our interpolation syntax at all.

Is it worth it? I really can't imagine there being meaningful breakage here. Regarding the performance, I'm hopeful that as-if optimizations can cover the common cases, but maybe there would be some cost. Maybe in some cases people should indeed prefer to write format!("{x} {x}", x=X.f) rather than format!("{X.f} {X.f}") -- unless we went with desugaring 3, this would have, if nothing else, fewer derefs. But I can live with that. In the rest of the language, there are plenty of reasons that people should similarly write let x = EXPR; /* Uses x twice */ rather than /* Uses EXPR twice */. That's the cost of our language not being pure.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. I-lang-nominated Nominated for discussion during a lang team meeting. I-lang-radar Items that are on lang's radar and will need eventual work or consideration. I-libs-api-nominated Nominated for discussion during a libs-api team meeting. needs-fcp This change is insta-stable, or significant enough to need a team FCP to proceed. P-lang-drag-1 Lang team prioritization drag level 1. https://rust-lang.zulipchat.com/#narrow/channel/410516-t-lang proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-clippy Relevant to the Clippy team. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-lang Relevant to the language team T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

format_args deduplicates consts with interior mutability or destructor