Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include the expression ID in the Lint object #710

Closed
MichaelChirico opened this issue Jan 22, 2021 · 9 comments
Closed

Include the expression ID in the Lint object #710

MichaelChirico opened this issue Jan 22, 2021 · 9 comments
Labels
feature a feature request or enhancement lint-metadata

Comments

@MichaelChirico
Copy link
Collaborator

I'm trying to use the lint output to go back to the file & make some edit around the lint.

Currently the Lint object contains the line (l1) & column (c1) number, but this is not sufficient:

  • I want to work on the parse tree, not the raw text -- using the raw text, I would have to start from (l1, c1) and, in general, "parse locally". For some purposes, a regex or simple stack-based approach (for matching delimiters) would be enough. Therefore it's better to work back to some form of parsed data (e.g. the xml_parse_data tree, or the getParseData data.frame).
  • However, in the parse tree, l1 and c1 might be shared by multiple nodes:
writeLines("if (check(x)) {
 TRUE
}", tmp <- tempfile())
utils::getParseData(parse(tmp))
#    line1 col1 line2 col2 id parent                token terminal  text
# 26     1    1     3    1 26      0                 expr    FALSE      
# 1      1    1     1    2  1     26                   IF     TRUE    if
# 2      1    4     1    4  2     26                  '('     TRUE     (
# 11     1    5     1   12 11     26                 expr    FALSE      
# 3      1    5     1    9  3      5 SYMBOL_FUNCTION_CALL     TRUE check
# 5      1    5     1    9  5     11                 expr    FALSE      
# 4      1   10     1   10  4     11                  '('     TRUE     (
# 6      1   11     1   11  6      8               SYMBOL     TRUE     x
# 8      1   11     1   11  8     11                 expr    FALSE      
# 7      1   12     1   12  7     11                  ')'     TRUE     )
# 12     1   13     1   13 12     26                  ')'     TRUE     )
# 23     1   15     3    1 23     26                 expr    FALSE      
# 14     1   15     1   15 14     23                  '{'     TRUE     {
# 16     2    2     2    5 16     17            NUM_CONST     TRUE  TRUE
# 17     2    2     2    5 17     23                 expr    FALSE      
# 21     3    1     3    1 21     23                  '}'     TRUE     }

Notice multiple expr nodes with (l1, c1) = (1, 5).

  • ranges isn't enough either because that's used for pretty printing. Usually (not always) there is different logic for how ranges is determined if the target expr spans multiple lines, which makes it way more difficult (I don't think impossible? but at least requires way more knowledge about the linter's internals) to back out which row we are looking for exactly.

A very simple (IINM) workaround to all of this is to include the row ID of the target node (e.g. 11 or 5 at (1, 5) above) in the Lint object.

The fix is simple but I added a lot of context since I think this would require changes for downstreams writing custom linters. That said, IINM the older style of object would probably still work without issue for existing use cases.

Any other things to consider here?

@AshesITR
Copy link
Collaborator

How would you define the expression ID for linters that don't operate on the expression level?
e.g. line_length_linter or linters looking at indentation?

@MichaelChirico
Copy link
Collaborator Author

MichaelChirico commented Jan 22, 2021 via email

@AshesITR
Copy link
Collaborator

WDYT about making the potential new data an optional argument to Lint?

And maybe default it to NULL which I think is a more explicit way of missingness than NA...
At least making the argument optional should prevent downstream breaking changes.

Code that wants to use the information should check for its presence on every Lint object it touches for safety. Linters that want to support the new info (expression_id?) would have to be changed, though.

What exactly is your use-case? Do you want to mass-transform code with a certain type of lint to an equivalent but lint-free alternative?

@MichaelChirico
Copy link
Collaborator Author

WDYT about making the potential new data an optional argument to Lint?

I think it would have to be an argument to Lint, agree it should be optional given that it only applies to expression-level lints.

Linters that want to support the new info (expression_id?) would have to be changed, though.

Yep, I would handle that in the PR (or in a follow-up perhaps)

Do you want to mass-transform code with a certain type of lint to an equivalent but lint-free alternative?

pretty much yes.

@AshesITR AshesITR added feature a feature request or enhancement internals Issues related to inner workings of lintr, i.e., not user-visible and removed internals Issues related to inner workings of lintr, i.e., not user-visible labels Jan 30, 2021
@MichaelChirico MichaelChirico added this to the 3.0.0 milestone Jan 31, 2021
@AshesITR
Copy link
Collaborator

I just noticed that the XPath based linters (our preferred way of doing stuff) seem to not contain expression IDs either, restricting the possible linters to only those using the parse_data df.
Given that's the case, do you really think we should add this feature?

@MichaelChirico
Copy link
Collaborator Author

MichaelChirico commented Jan 31, 2021 via email

@AshesITR
Copy link
Collaborator

AshesITR commented Feb 1, 2021

Here's a random nodeset from testing the #583 PR.

{xml_nodeset (1)}
[1] <OP-LEFT-BRACE line1="1" col1="19" line2="1" col2="19" start="40" end="40">{</OP-LEFT-BRACE>

Maybe we should remove the issue from the 3.0.0 milestone then?

@AshesITR
Copy link
Collaborator

Is this feature still considered useful, despite most of our linters using XPath?
Or should we close this in favor of #784?

@MichaelChirico
Copy link
Collaborator Author

Agree there's not much use for this at this point, especially given the technical blockers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature a feature request or enhancement lint-metadata
Projects
None yet
Development

No branches or pull requests

2 participants