Conversation
|
@petrochenkov: you suggested to me the idea of using jointness during pretty-printing, this is a draft attempt.
|
de764f4 to
bbdb7cc
Compare
|
I am working on a fix for the #76399 issue. It's going well and I will put it up soon. In the meantime, don't look at the old code. |
bbdb7cc to
fb04eb0
Compare
|
New code is up and ready for review.
These questions are still relevant.
I have fixed this.
I have done this now, in the final commit.
I have done this too. The new type is called |
fb04eb0 to
0a92053
Compare
|
I posted a doc update for stable public-facing |
|
I don't think The "jointness" is an inherent property attached to a single token (tree) rather than to a sequence of token trees. I think matklad's scheme in #76285 was fine: when parsing a token stream we just need to track whether a token is immediately followed by something or not, and then "censor" that information at the proc macro boundary to match the public-facing behavior documented in #114617. Ideally I'd first land a version of #76285 without any pretty-printing changes or renamings, but with the regressions fixed.
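To make the "censor at the proc macro boundary" idea concrete, here is a minimal sketch. It assumes a three-valued internal spacing (using the variant names from the commit message in this PR) and a two-valued public one. The types `InternalSpacing` and `PublicSpacing` and the function `censor` are hypothetical illustrations, not rustc's actual API.

```rust
// Hypothetical internal spacing: records whether a token is immediately
// followed by another token, and whether that next token is a punct.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum InternalSpacing {
    Joint,       // immediately followed by a punct
    JointHidden, // immediately followed by a non-punct
    Alone,       // not immediately followed by anything
}

// Hypothetical stand-in for the two-valued spacing exposed to proc macros.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum PublicSpacing {
    Joint,
    Alone,
}

// "Censor" the extra information: anything that is not punct-followed-by-punct
// is reported as `Alone`, so proc macros never observe the hidden variant.
fn censor(spacing: InternalSpacing) -> PublicSpacing {
    match spacing {
        InternalSpacing::Joint => PublicSpacing::Joint,
        InternalSpacing::JointHidden | InternalSpacing::Alone => PublicSpacing::Alone,
    }
}
```

The conversion is lossy by design: a stream that round-trips through the public interface cannot recover the hidden variant, which is why streams built by proc macros carry less spacing information than parsed code.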
|
As for pretty-printing streams from proc macros, I think we can extend the internal spacing to |
|
You haven't answered the following questions yet.
It would be helpful if you did.
What is the right model? You've made two suggestions:
This is a PR mostly about pretty-printing of parsed code, but your suggestions don't relate to that.
I don't understand how
Arbitrarily modified how? Token streams can be modified in proc macros but
It caused a regression.
That's exactly what this PR does. The three-value
The problem in #76285 was the addition of code in My original code had a similar check, causing the same problem. I fixed it by moving the
What about pretty-printing streams from parsed code? The whole point of this PR is to improve that case. #76285 didn't affect that case. Your original suggestion on Zulip was "it would be preferable to preserve jointness for all tokens and use it instead during pretty-printing." That is what this PR does. You said that
|
|
I'll return to this in a few days. In the meantime a question - how does this PR deal with the |
The same way the current code does.
No. The |
No :D |
|
There's a possibility, at least. It's better to run the changes through crater in any case, because even the |
There are a few logical changes here that I would prefer to review and land separately.
|
The current |
|
The issue in #76399 is that the token stream constructed purely through proc macro interface (
UPD: Ah, I see, #114571 (comment) talks about the regression reason, but I don't like the way in which it's fixed in this PR. |
|
🚧 Experiment ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more |
|
🎉 Experiment
|
|
I looked at the most recent full run. There are two new errors, each repeated a few times. I think neither is significant.
vgtk
Various errors like this, coming from the
openbook-v2
Various errors like this, coming from
This is from
Conclusion
Based on this and the previous analysis, I think this is ok to be merged. The only follow-up would be to modify
@petrochenkov: what do you think?
let this_spacing = if next_tok.is_punct() {
    Spacing::Joint
} else if next_tok.kind == token::Eof {
    Spacing::Alone
There was a problem hiding this comment.
Does this exception fix cases like this?
// EOF immediately after string, so spacing is joint-hidden
"string"
The string token is then put into a token stream before an identifier.
"string"suffix
After pretty-printing and re-lexing the lexical structure is broken, "string"suffix is interpreted as 1 token instead of 2.
Spacing::Alone for EOF doesn't address the root issue here.
"string" can still get joint spacing in cases like "string"+ and be printed as "string"suffix later.
This case needs a special case in pretty-printer.
Then special treatment for EOF in this file won't be necessary.
That's not an urgent issue though, it can be addressed separately later.
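The pretty-printer special case suggested above could look roughly like the following sketch. All names here (`Kind`, `Tok`, `needs_space`, `print_toks`) are hypothetical and the token model is a toy one. The only point is that a string literal directly followed by an identifier always gets a space, whatever spacing was recorded, so the output cannot re-lex as one suffixed literal.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Kind {
    StrLit,
    Ident,
    Punct,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Spacing {
    Joint,
    JointHidden,
    Alone,
}

struct Tok {
    kind: Kind,
    text: &'static str,
    spacing: Spacing,
}

// Decide whether a space must be printed between `prev` and `next`.
fn needs_space(prev: &Tok, next: &Tok) -> bool {
    // Hypothetical special case: `"string"` followed by `suffix` would be
    // re-lexed as the single token `"string"suffix`, so force a space.
    if prev.kind == Kind::StrLit && next.kind == Kind::Ident {
        return true;
    }
    // Otherwise trust the recorded spacing of the previous token.
    prev.spacing == Spacing::Alone
}

fn print_toks(toks: &[Tok]) -> String {
    let mut out = String::new();
    for (i, tok) in toks.iter().enumerate() {
        if i > 0 && needs_space(&toks[i - 1], tok) {
            out.push(' ');
        }
        out.push_str(tok.text);
    }
    out
}
```

With a rule like this, a literal that picked up joint spacing in one stream (for example in `"string"+`) and is later placed before an identifier is still printed with a separating space.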
There was a problem hiding this comment.
The one crate for which this was a problem is described in this comment above:
DarumaDocker.reactor-wasm
Uses wasmedge_bindgen macro from wasmedge-bindgen-macro-0.1.13.
Weird one... this:
let gen = quote! {
#[no_mangle]
pub unsafe extern "C" fn #func_ident(params_pointer: *mut u32, params_count: i32) {
...
}
somehow becomes this:
#[no_mangle] pub unsafe extern "C"fn run(params_pointer : * mut u32, params_count : i32) {
The lack of space after "C" is a legitimate problem. Hmm.
Somehow, Eof is involved. If we change things so that tokens followed by
Eof have Alone spacing instead of JointHidden spacing, that fixes the
problem. I don't understand why.
So it is a string suffix issue like you suggested. A follow-up sounds good, it only affected one crate in all the crater runs.
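For reference, the spacing decision under discussion can be sketched in isolation as below. The `Punct` and `Eof` branches mirror the snippet under review. The whitespace branch and the final `JointHidden` branch are assumptions based on the variant descriptions in this PR's commit message, and `NextTok` and `spacing_before` are hypothetical names used only for this illustration.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Spacing {
    Joint,
    JointHidden,
    Alone,
}

// Toy classification of the token that follows the current one.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum NextTok {
    Punct,
    Eof,
    Other,
}

// Compute the spacing of the current token from what follows it.
fn spacing_before(next: NextTok, next_preceded_by_whitespace: bool) -> Spacing {
    if next_preceded_by_whitespace {
        // Assumption: whitespace before the next token means "cannot join".
        return Spacing::Alone;
    }
    match next {
        NextTok::Punct => Spacing::Joint,
        // The exception discussed in this thread: a token directly before
        // end-of-input is treated as `Alone` rather than `JointHidden`.
        NextTok::Eof => Spacing::Alone,
        NextTok::Other => Spacing::JointHidden,
    }
}
```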
|
r=me after the style fixes. |
`tokenstream::Spacing` appears on all `TokenTree::Token` instances, both punct and non-punct. Its current usage:
- `Joint` means "can join with the next token *and* that token is a punct".
- `Alone` means "cannot join with the next token *or* can join with the next token but that token is not a punct".

The fact that `Alone` is used for two different cases is awkward. This commit augments `tokenstream::Spacing` with a new variant `JointHidden`, resulting in:
- `Joint` means "can join with the next token *and* that token is a punct".
- `JointHidden` means "can join with the next token *and* that token is not a punct".
- `Alone` means "cannot join with the next token".

This *drastically* improves the output of `print_tts`. For example, this:
```
stringify!(let a: Vec<u32> = vec![];)
```
currently produces this string:
```
let a : Vec < u32 > = vec! [] ;
```
With this PR, it now produces this string:
```
let a: Vec<u32> = vec![] ;
```
(The space after the `]` is because `TokenTree::Delimited` currently doesn't have spacing information. The subsequent commit fixes this.)

The new `print_tts` doesn't replicate original code perfectly. E.g. multiple space characters will be condensed into a single space character. But it's much improved.

`print_tts` still produces the old, uglier output for code produced by proc macros, because we have to translate the generated code from `proc_macro::Spacing` to the more expressive `tokenstream::Spacing`, which results in too much `Alone` usage and no `JointHidden` usage. So `space_between` still exists and is used by `print_tts` in conjunction with the `Spacing` field.

This change will also help with the removal of `Token::Interpolated`. Currently interpolated tokens are pretty-printed nicely via AST pretty printing. `Token::Interpolated` removal will mean they get printed with `print_tts`. Without this change, that would result in much uglier output for code produced by decl macro expansions. With this change, AST pretty printing and `print_tts` produce similar results.

The commit also tweaks the comments on `proc_macro::Spacing`. In particular, it refers to "compound tokens" rather than "multi-char operators" because lifetimes aren't operators.
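As a rough illustration of why the extra variant helps printing, here is a toy sketch, not the real `print_tts`. It assumes a flat token list in which every token (including brackets, unlike `TokenTree::Delimited` in this commit) carries a spacing, and it prints a space only after `Alone` tokens. `Tok`, `tok`, `print_toks`, and `sample_tokens` are hypothetical names.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Spacing {
    Joint,       // can join with the next token, which is a punct
    JointHidden, // can join with the next token, which is not a punct
    Alone,       // cannot join with the next token
}

struct Tok {
    text: &'static str,
    spacing: Spacing,
}

fn tok(text: &'static str, spacing: Spacing) -> Tok {
    Tok { text, spacing }
}

// Print tokens, emitting a space only after tokens marked `Alone`.
fn print_toks(toks: &[Tok]) -> String {
    let mut out = String::new();
    for (i, t) in toks.iter().enumerate() {
        out.push_str(t.text);
        let is_last = i + 1 == toks.len();
        if !is_last && t.spacing == Spacing::Alone {
            out.push(' ');
        }
    }
    out
}

// Hand-assigned spacings for `let a: Vec<u32> = vec![];` in this toy model.
fn sample_tokens() -> Vec<Tok> {
    use Spacing::*;
    vec![
        tok("let", Alone),
        tok("a", Joint),
        tok(":", Alone),
        tok("Vec", Joint),
        tok("<", JointHidden),
        tok("u32", Joint),
        tok(">", Alone),
        tok("=", Alone),
        tok("vec", Joint),
        tok("!", JointHidden),
        tok("[", JointHidden),
        tok("]", Joint),
        tok(";", Alone),
    ]
}
```

In this model `print_toks(&sample_tokens())` reproduces the source spacing. If `JointHidden` were folded back into `Alone`, tokens such as `<` and `!` would each be followed by a space, which is the kind of output the old two-valued scheme had to patch up with heuristics.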
This is an extension of the previous commit. It means the output of something like this:
```
stringify!(let a: Vec<u32> = vec![];)
```
goes from this:
```
let a: Vec<u32> = vec![] ;
```
to this:
```
let a: Vec<u32> = vec![];
```
Because the spacing-based pretty-printing partially preserves that.
6826e47 to
940c885
Compare
|
I fixed the style nits. @bors r=petrochenkov |
|
☀️ Test successful - checks-actions |
|
Finished benchmarking commit (a9cb8ee): comparison URL.
Overall result: ❌✅ regressions and improvements - ACTION NEEDED
Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression
Instruction count
This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)
This benchmark run did not return any relevant results for this metric.
Cycles
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary size
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Bootstrap: 668.008s -> 669.249s (0.19%) |
By slightly changing the meaning of `tokenstream::Spacing` we can greatly improve the output of `print_tts`.
r? @ghost