-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AST Improvement: Store dotted name parts #7760
Comments
I think the counterargument is that CPython uses this representation: >>> print(ast.dump(ast.parse(s)))
Module(body=[Import(names=[alias(name='foo.bar')])], type_ignores=[]) And the ASDL uses the same But I care more about the ergonomics and performance than I do exact compatibility with CPython on these decisions. Would need to see what this makes easier or harder. |
**Summary** Remove spaces from import statements such as ```python import tqdm . tqdm from tqdm . auto import tqdm ``` See also #7760 for a better solution. **Test Plan** New fixtures
**Summary** Remove spaces from import statements such as ```python import tqdm . tqdm from tqdm . auto import tqdm ``` See also #7760 for a better solution. **Test Plan** New fixtures
**Summary** Remove spaces from import statements such as ```python import tqdm . tqdm from tqdm . auto import tqdm ``` See also #7760 for a better solution. **Test Plan** New fixtures
@charliermarsh Is my understanding correct that CPython normalizes the identifier name (removes the whitespace)? I was a bit surprised when I saw this representation first because it's somewhat uncommon (at least in the languages that I have used thus far)—especially considering that it re-joins the identifier tokens that the lexer identified. Are there any upsides in the semantic model to have a single string? I would expect it to be easier to have the individual parts when e.g. resolving imports. The alternative is that we implement a |
Yeah it's probably an improvement to store a list of dot-separated segments. There is likely no upside in the semantic model since we always decompose into segments. |
Let us fix this, regardless of how it gets implemented. Splitting the names in the formatter would be silly, but something we could do. I think we're at least lucky that the following is not valid import (a # comment
.b) |
I think this actually was fixed, we format it correctly: https://play.ruff.rs/86d1a181-4d27-4bbe-ad13-0c425e1976c0. I think this issue was about changing the AST to better reflect the real structure. |
Currently, the path of imports is not formatted, e.g.
remains as-is. This is due to a bug in our AST:
ruff/crates/ruff_python_ast/src/imports.rs
Lines 6 to 31 in 6824b67
The entire path is represented as a single string, even though it should be dot-separated identifier (the parser calls it
DottedName
, but then emits an identifier), especially since identifier can not contain dots.Vec<Identifier>
or something similarIdentifier
is only used for strings matching the rulesThe text was updated successfully, but these errors were encountered: