AST Improvement: Store dotted name parts #7760

konstin · 2023-10-02T09:50:01Z

Currently, the path of imports is not formatted, e.g.

import a . b

remains as-is. This is due to a bug in our AST:

ruff/crates/ruff_python_ast/src/imports.rs

Lines 6 to 31 in 6824b67

    
    /// A representation of an individual name imported via any import statement.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub enum AnyImport<'a> {
        Import(Import<'a>),
        ImportFrom(ImportFrom<'a>),
    }

    /// A representation of an individual name imported via an `import` statement.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct Import<'a> {
        pub name: Alias<'a>,
    }

    /// A representation of an individual name imported via a `from ... import` statement.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct ImportFrom<'a> {
        pub module: Option<&'a str>,
        pub name: Alias<'a>,
        pub level: Option<u32>,
    }

    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct Alias<'a> {
        pub name: &'a str,
        pub as_name: Option<&'a str>,
    }

The entire path is represented as a single string, even though it should be a sequence of dot-separated identifiers (the parser calls it DottedName, but then emits a single identifier), especially since an identifier cannot contain dots.

- Fix the AST so the import path is a Vec<Identifier> or something similar
- Format import paths by removing whitespace (we can't insert parentheses like we do for dotted expressions)
- Check that Identifier is only used for strings that match the identifier rules
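
As a rough illustration of the first and third items, here is a minimal sketch of an import path stored as a list of validated parts. The `Identifier` and `DottedName` types below are hypothetical stand-ins written for this example and are not taken from ruff's AST. The identifier check is deliberately simplified to ASCII, whereas real Python identifiers also allow many Unicode characters and exclude keywords. Splitting on `.` and trimming whitespace is likewise an assumption that only covers simple inputs such as `a . b`.

```rust
/// Hypothetical identifier: a single name without dots or whitespace.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Identifier<'a>(&'a str);

impl<'a> Identifier<'a> {
    /// Returns `None` unless `s` looks like an ASCII identifier.
    /// Simplified on purpose: no Unicode and no keyword check.
    pub fn new(s: &'a str) -> Option<Self> {
        let mut chars = s.chars();
        let first = chars.next()?;
        let valid = (first.is_ascii_alphabetic() || first == '_')
            && chars.all(|c| c.is_ascii_alphanumeric() || c == '_');
        valid.then_some(Identifier(s))
    }

    pub fn as_str(&self) -> &'a str {
        self.0
    }
}

/// Hypothetical import path stored as individual parts instead of one string.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DottedName<'a> {
    pub parts: Vec<Identifier<'a>>,
}

impl<'a> DottedName<'a> {
    /// Splits source text such as `a . b` on dots, trimming whitespace
    /// around each part. Returns `None` if any part is not an identifier.
    pub fn parse(source: &'a str) -> Option<Self> {
        let parts = source
            .split('.')
            .map(|part| Identifier::new(part.trim()))
            .collect::<Option<Vec<_>>>()?;
        Some(DottedName { parts })
    }

    /// Joins the parts with `.` and no surrounding whitespace.
    pub fn to_normalized(&self) -> String {
        self.parts
            .iter()
            .map(|part| part.as_str())
            .collect::<Vec<_>>()
            .join(".")
    }
}
```

With this shape, formatting the path reduces to joining the parts, and a dotted string can never be stored in an `Identifier` by construction.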


charliermarsh · 2023-10-02T13:52:32Z

I think the counterargument is that CPython uses this representation:

>>> import ast
>>> s = "import foo.bar"
>>> print(ast.dump(ast.parse(s)))
Module(body=[Import(names=[alias(name='foo.bar')])], type_ignores=[])

And the ASDL uses the same identifier symbol as in other nodes: https://docs.python.org/3/library/ast.html.

But I care more about the ergonomics and performance than I do exact compatibility with CPython on these decisions. Would need to see what this makes easier or harder.


MichaReiser · 2023-10-16T07:44:14Z

@charliermarsh Is my understanding correct that CPython normalizes the identifier name (removes the whitespace)?

I was a bit surprised when I first saw this representation because it's somewhat uncommon (at least in the languages that I have used thus far), especially considering that it re-joins the identifier tokens that the lexer identified.

Are there any upsides in the semantic model to having a single string? I would expect it to be easier to have the individual parts when e.g. resolving imports. The alternative is that we implement a components method, similar to Rust's Path::components, that returns the individual parts by splitting the string on dots.
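
For illustration, the `components` alternative could look roughly like the sketch below. It reuses the `Alias` struct quoted earlier in this issue; the `components` method itself is hypothetical and does not exist in ruff. It also assumes the stored name contains no whitespace around the dots, which is an assumption rather than something this thread establishes.

```rust
/// Same shape as the `Alias` struct quoted earlier in this issue.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Alias<'a> {
    pub name: &'a str,
    pub as_name: Option<&'a str>,
}

impl<'a> Alias<'a> {
    /// Hypothetical helper in the spirit of `Path::components`:
    /// lazily yields each dot-separated part of the stored name.
    pub fn components(&self) -> std::str::Split<'a, char> {
        self.name.split('.')
    }
}
```

A caller would collect or iterate the parts on demand, for example `alias.components().collect::<Vec<_>>()`. The trade-off compared to storing the parts in the AST is that every consumer re-splits the string, and nothing prevents a name with stray whitespace or empty segments from being stored.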

charliermarsh · 2023-10-16T13:35:24Z

Yeah it's probably an improvement to store a list of dot-separated segments. There is likely no upside in the semantic model since we always decompose into segments.

MichaReiser · 2023-10-27T02:12:37Z

Let us fix this, regardless of how it gets implemented. Splitting the names in the formatter would be silly, but something we could do.

I think we're at least lucky that the following is not valid

import (a # comment
   .b)

charliermarsh · 2023-10-27T02:17:45Z

I think this was actually fixed; we format it correctly: https://play.ruff.rs/86d1a181-4d27-4bbe-ad13-0c425e1976c0. I think this issue was about changing the AST to better reflect the real structure.

konstin added the formatter label Oct 2, 2023

konstin mentioned this issue Oct 2, 2023

Avoid printing continuations within import identifiers #7744

Merged

charliermarsh added the parser label and removed the formatter label Oct 2, 2023

konstin added a commit that referenced this issue Oct 9, 2023

Remove spaces from import statements

350ea0d

**Summary** Remove spaces from import statements such as

```python
import tqdm . tqdm
from tqdm . auto import tqdm
```

See also #7760 for a better solution.

**Test Plan** New fixtures

konstin mentioned this issue Oct 9, 2023

Remove spaces from import statements #7859

Merged

konstin added a commit that referenced this issue Oct 11, 2023

Remove spaces from import statements

f14dcb0


MichaReiser added this to the Formatter: Stable milestone Oct 16, 2023

MichaReiser changed the title ~~Fix the AST representation of import paths and format imports~~ Format import paths Oct 27, 2023

MichaReiser added the formatter label and removed the parser label Oct 27, 2023

MichaReiser changed the title ~~Format import paths~~ AST Improvement: Store dotted name parts Oct 27, 2023

MichaReiser added the formatter and parser labels and removed the formatter label Oct 27, 2023

MichaReiser removed this from the Formatter: Stable milestone Oct 27, 2023

MichaReiser removed the formatter label Oct 27, 2023
