Improve `print_tts` by making `space_between` smarter #117433
This PR changes punctuation jointness in many cases, something I advised against in the previous similar PR - #97340 (comment). After #114571 lands and after So, right now, I think, that would be equivalent to never using |
Update: when both |
bae356b
to
6e849e8
Compare
Ok, I have updated the code to never remove the space between adjacent punctuation tokens. The following cases are worse:
but overall it's not too bad, and still a lot better than the current output (though The existing I ended up merging all the previous commits that change the behaviour of |
/// Should two consecutive token trees be printed with a space between them?
///
/// NOTE: should always be false if both token trees are punctuation, so that
I don't think this function should return anything in this case; it should rather have `assert!(!is_punct(tt1) || !is_punct(tt2))` at the start instead.
The decision should be made based on `Spacing` in that case, and we should never reach this function.
Or the `Spacing`-based decision can be made inside this function; then it will be `if is_punct(tt1) && is_punct(tt2) { ... }` at the start instead of the assert.
I would prefer to avoid making any pretty-printing decisions based on `Spacing` in this PR. We can leave those to #114571, which will change how `space_between` is called. I plan to add the `Spacing::Unknown` in that PR, for tokens coming from proc macros. Those will be the cases where `space_between` is used.
With that decided, the current position of the assertion has the advantage that it's only checked in the case where `space_between` returns false.
So I think this is good enough to merge, or do a crater run if you think that is necessary.
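The three positions taken in this thread can be compared in a small sketch. Everything here is a toy stand-in, not rustc's real code: `Tok`, `Spacing`, `is_punct` and `heuristic` are hypothetical names, and the rule "space after a punct exactly when it is `Alone`" is an assumption about how a `Spacing`-based decision might look.

```rust
// Toy stand-ins for illustration only; these are NOT rustc's real types.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Spacing {
    Alone,
    Joint,
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum Tok {
    Ident(&'static str),
    Punct(char, Spacing),
}

fn is_punct(t: Tok) -> bool {
    matches!(t, Tok::Punct(..))
}

// Hypothetical stand-in for the spacing heuristics themselves.
fn heuristic(tt1: Tok, tt2: Tok) -> bool {
    match (tt1, tt2) {
        // NON-PUNCT followed by `;`, `.` or `,`: no space.
        (Tok::Ident(_), Tok::Punct(';' | '.' | ',', _)) => false,
        _ => true,
    }
}

// First suggestion: punct + punct must never reach this function.
fn space_between_asserting(tt1: Tok, tt2: Tok) -> bool {
    assert!(!is_punct(tt1) || !is_punct(tt2));
    heuristic(tt1, tt2)
}

// Second suggestion: handle punct + punct up front, based on `Spacing`.
// Assumption: a space is wanted exactly when the first punct is `Alone`.
fn space_between_with_spacing(tt1: Tok, tt2: Tok) -> bool {
    if let (Tok::Punct(_, spacing), Tok::Punct(..)) = (tt1, tt2) {
        return spacing == Spacing::Alone;
    }
    heuristic(tt1, tt2)
}

// Position kept in this PR: the assertion runs only when the answer is `false`,
// so a space between two punctuation tokens is never removed.
fn space_between_checked_on_false(tt1: Tok, tt2: Tok) -> bool {
    let space = heuristic(tt1, tt2);
    if !space {
        assert!(!is_punct(tt1) || !is_punct(tt2));
    }
    space
}
```

In the third variant a punct + punct pair is still accepted as input; the assertion only guards against a heuristic arm that would glue two punctuation tokens together.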
Ok.
Crater run is needed in any case.
To avoid `!matches!(...)`, which is hard to think about. Instead every case now uses direct pattern matching and returns true or false. Also add a couple of cases to the `stringify.rs` test that currently print badly.
We currently do the wrong thing on a lot of these. The next commit will fix things.
As well as nicer output, this fixes several FIXMEs in `tests/ui/macros/stringify.rs`, where the current output is sub-optimal.
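The first commit message above prefers direct pattern matching over `!matches!(...)`. A minimal sketch of the difference, using a hypothetical `Kind` enum rather than rustc's token types:

```rust
// Hypothetical token kinds, for illustration only.
#[derive(Clone, Copy)]
enum Kind {
    Ident,
    Comma,
    Semi,
    Dot,
    OpenParen,
}

// Negated style: the reader has to invert the pattern mentally.
fn space_before_negated(next: Kind) -> bool {
    !matches!(next, Kind::Comma | Kind::Semi | Kind::Dot)
}

// Direct style: every arm states its own answer and can carry an example.
fn space_before_direct(next: Kind) -> bool {
    match next {
        Kind::Comma => false, // `a, b`
        Kind::Semi => false,  // `x = 3;`
        Kind::Dot => false,   // `x.y`
        _ => true,
    }
}
```

Both functions compute the same result; the direct form just makes it easier to add or audit a single case.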
6e849e8
to
9b9f8f0
Compare
// NON-PUNCT + `;`: `x = 3;`, `[T; 3]`
// NON-PUNCT + `.`: `x.y`, `tup.0`
// NON-PUNCT + `:`: `'a: loop { ... }`, `x: u8`, `where T: U`,
// `<Self as T>::x`, `Trait<'a>: Sized`, `X<Y<Z>>: Send`,
These examples still involve punctuation and need an update.
@bors try
Improve `print_tts` by making `space_between` smarter

`space_between` currently handles a few cases that make the output nicer. It also gets some cases wrong. This PR fixes the wrong cases, and adds a bunch of extra cases, resulting in prettier output. E.g. these lines:
```
use smallvec :: SmallVec ;
assert! (mem :: size_of :: < T > () != 0) ;
```
become these lines:
```
use smallvec::SmallVec;
assert!(mem::size_of:: < T >() != 0);
```
This overlaps with rust-lang#114571, but this PR has the crucial characteristic of giving the same results for all token streams, including those generated by proc macros. For that reason I think it's worth having even if/when rust-lang#114571 is merged. It's also nice that this PR's improvements can be obtained by modifying only `space_between`.

r? `@petrochenkov`
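One way to observe the printer's spacing on a given toolchain is `stringify!`, which returns the pretty-printed tokens without expanding them. The sketch below is illustrative only: the helper names are hypothetical, and the exact spacing of the returned strings depends on the compiler version, so no output is claimed here.

```rust
// Pretty-printed form of the two example lines from the description.
// The tokens are never compiled as real items; `stringify!` only prints them.
fn printed_use() -> &'static str {
    stringify!(use smallvec::SmallVec;)
}

fn printed_assert() -> &'static str {
    stringify!(assert!(mem::size_of::<T>() != 0);)
}

// Hypothetical helper: compare two printed forms while ignoring whitespace,
// which is the only part this PR changes.
fn same_ignoring_whitespace(a: &str, b: &str) -> bool {
    a.chars()
        .filter(|c| !c.is_whitespace())
        .eq(b.chars().filter(|c| !c.is_whitespace()))
}
```

Printing `printed_use()` and `printed_assert()` before and after this PR should show the two forms quoted above, while `same_ignoring_whitespace` treats them as equal.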
☀️ Try build successful - checks-actions
@craterbot check
👌 Experiment ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more
🚧 Experiment ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more
👌 Experiment ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more
🚧 Experiment ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more
🎉 Experiment
There are some legitimate regressions here, waiting on author to triage.
I still don't understand how to read crater reports properly. But looking at the "regressed: dependencies" section, there are five problems, all involving conversions of a token stream to a string and then "parsing" the resulting string. Annoying stuff, and I'm not sure how to proceed.

atspi-proxies-0.1.0

It uses

// TODO: this is sketchy as all hell
// it replaces all mentions of zbus::Result with the Generic std::result::Result, then, adds the Self::Error error type to the second part of the generic
// finally, it replaces all mentions of (String, zbus :: zvairnat :: OwnedObjectPath) with &Self.
// this menas that implementors will need to return a borrowed value of the same type to comply with the type system.
// unsure if this will hold up over time.
fn genericize_method_return_type(rt: &ReturnType) -> TokenStream {
let original = format!("{}", rt.to_token_stream());
let mut generic_result = original.replace("zbus :: Result", "std :: result :: Result");
let end_of_str = generic_result.len();
generic_result.insert_str(end_of_str - 2, ", Self :: Error");
let mut generic_impl = generic_result.replace(OBJECT_PAIR_NAME, "Self");
generic_impl.push_str(" where Self: Sized");
TokenStream::from_str(&generic_impl).expect("Could not genericize zbus method/property/signal. Attempted to turn \"{generic_result}\" into a TokenStream.")
}

The spacing of

awto-0.1.2

It uses

let db_type_is_text = ty.to_string().ends_with(":: Text");
if let Some(max_len) = &field.attrs.max_len {
if !db_type_is_text {
return Err(syn::Error::new(
max_len.span(),
"max_len can only be used on varchar & char types",
));
}
ty = quote!(#ty(Some(#max_len)));
} else if db_type_is_text {
ty = quote!(#ty(None));
}

The change in spacing of

ink-analyzer-ir-0.7.0

pub fn impl_from_ast(ast: &syn::DeriveInput) -> syn::Result<TokenStream> {
let name = &ast.ident;
if let Some(fields) = utils::parse_struct_fields(ast) {
if let Some(ast_field) = utils::find_field(fields, "ast") {
let ir_crate_path = utils::get_normalized_ir_crate_path();
let ast_field_type = &ast_field.ty;
let ast_type = if ast_field_type
.to_token_stream()
.to_string()
.starts_with("ast ::")
{
quote! { #ast_field_type }
} else {
quote! { ast::#ast_field_type }
};

The change of spacing from

kcl-lib-0.1.35

It uses

let ret_ty = ast.sig.output.clone();
let ret_ty_string = ret_ty
.into_token_stream()
.to_string()
.replace("-> ", "")
.replace("Result < ", "")
.replace(", KclError >", "");
let return_type = if !ret_ty_string.is_empty() {
let ret_ty_string = if ret_ty_string.starts_with("Box <") {
ret_ty_string
.trim_start_matches("Box <")
.trim_end_matches('>')
.trim()
.to_string()
} else {
ret_ty_string.trim().to_string()
};

An example return type changes from this:

-> Result < Box < ExtrudeGroup >, KclError > {}

to this:

-> Result < Box < ExtrudeGroup > , KclError > {}

And the space inserted between the

pagetop-0.0.46

This uses

let args: Vec<String> = fn_item
.sig
.inputs
.iter()
.skip(1)
.map(|arg| arg.to_token_stream().to_string())
.collect();
let param: Vec<String> = args
.iter()
.map(|arg| arg.split_whitespace().next().unwrap().to_string())
.collect();
#[rustfmt::skip]
let fn_with = parse_str::<ItemFn>(concat_string!("
pub fn ", fn_name.replace("alter_", "with_"), "(mut self, ", args.join(", "), ") -> Self {
self.", fn_name, "(", param.join(", "), ");
self
}
").as_str()).unwrap();

On a signature like this:

the old pretty printer printed
I think we can follow the usual breaking change procedure here. If the regressed crate is alive, send a fix. If the regressed crate is also popular (e.g. responsible for the largest number of regressions in the report), then special case it in the compiler.

@rustbot author
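As an illustration of what such a fix might look like, the sketch below reduces the kcl-lib snippet quoted above to one function and runs it on the two return types from the triage comment (with the trailing `{}` dropped). `fragile_inner_type` mirrors the quoted string operations; `robust_inner_type` is a hypothetical alternative that removes whitespace before matching, not a patch that was actually sent.

```rust
// The string-based extraction quoted from kcl-lib, reduced to one function.
fn fragile_inner_type(printed: &str) -> String {
    let s = printed
        .replace("-> ", "")
        .replace("Result < ", "")
        .replace(", KclError >", "");
    if s.starts_with("Box <") {
        s.trim_start_matches("Box <")
            .trim_end_matches('>')
            .trim()
            .to_string()
    } else {
        s.trim().to_string()
    }
}

// Hypothetical fix: drop all whitespace first, so the printer's spacing
// cannot affect the match. Caveat: this would also glue together tokens
// that need a space (e.g. `dyn Trait`), so it only suits simple paths.
fn robust_inner_type(printed: &str) -> String {
    let no_ws: String = printed.chars().filter(|c| !c.is_whitespace()).collect();
    let s = no_ws.strip_prefix("->").unwrap_or(&no_ws);
    let s = s
        .strip_prefix("Result<")
        .and_then(|rest| rest.strip_suffix(",KclError>"))
        .unwrap_or(s);
    let s = s
        .strip_prefix("Box<")
        .and_then(|rest| rest.strip_suffix('>'))
        .unwrap_or(s);
    s.to_string()
}
```

On the old form `-> Result < Box < ExtrudeGroup >, KclError >` both functions yield `ExtrudeGroup`. On the new form `-> Result < Box < ExtrudeGroup > , KclError >` the fragile version leaves a stray `>` behind, while the whitespace-insensitive one is unaffected. Inspecting the parsed type instead of its string form would avoid the problem entirely, but that is outside the scope of this sketch.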
Interestingly, the joint-based pretty printer in #114571 doesn't break the five examples above. Presumably because the affected token streams come from proc macros, and #114571 doesn't make changes to how those are printed. Now that I better understand the effects of these kinds of changes on real world code, I have the following plan:
These have all been done.