-
Notifications
You must be signed in to change notification settings - Fork 367
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Exclude parenthese determination from comments and string literals #4964
Exclude parenthese determination from comments and string literals #4964
Conversation
rewrite-groovy/src/main/java/org/openrewrite/groovy/GroovyParserVisitor.java
Outdated
Show resolved
Hide resolved
rewrite-groovy/src/main/java/org/openrewrite/groovy/GroovyParserVisitor.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a shame that we need to handcode a lexer here. Maybe we should at least explore the possibility of using org.codehaus.groovy.antlr.parser.GroovyLexer
instead?
rewrite-groovy/src/main/java/org/openrewrite/groovy/GroovyParserVisitor.java
Outdated
Show resolved
Hide resolved
rewrite-groovy/src/main/java/org/openrewrite/groovy/GroovyParserVisitor.java
Outdated
Show resolved
Hide resolved
rewrite-groovy/src/main/java/org/openrewrite/groovy/GroovyParserVisitor.java
Show resolved
Hide resolved
I fear figuring it all out and using it is neither worth the time nor the effort. The Groovy Compiler uses ANTLR to do its lexing, so for example the dollar slashy string The GroovyLexer java class does not contain the delimiter fields I defined. Internally the Groovy compiler does indeed have that knowledge somewhere (see for example the internal StringUtil class, where they use an int to define whether a string is of a slashy type). Edit: The GroovyLexer does of course contain: public static final int DQ_GSTRING_MODE = 1;
public static final int TDQ_GSTRING_MODE = 2;
public static final int SLASHY_GSTRING_MODE = 3;
public static final int DOLLAR_SLASHY_GSTRING_MODE = 4;
public static final int GSTRING_TYPE_SELECTOR_MODE = 5; But that doesn't really do any good. You still need to translate those integers to Edit 2: Ah, now I understand it better, after reading a little bit more about lexers (never before have I read those kind of files). All the "lexer rules" are defined in this ANTLR lexer file. This file uses grammar rules to define a language. The java code is generated, most keywords are listed as integer constants in GroovyLexer java class. Other useful stuff is generated as private members, for instance: public class AstBuilder extends GroovyParserBaseVisitor<Object> {
//..
private static final String SLASH_STR = "/";
private static final String SLASH_DOLLAR_STR = "/$";
//..
} As we can't use al the private stuff, unless you want to use reflection, I wouldn't go down this road. It's just not worth to invest anymore time. |
The idea with the lexer would have been to let the ANTLR lexer tokenize some given part of the source code. It would then return a stream of tokens, where each token is stuff like keywords, literals, identifiers, punctuation characters, comments, and of course also tokens representing open and close parentheses. So the idea would have been to just zip through this token stream and count the parentheses. The lexer would worry about literals, comments, etc. Whether it is any easier or better in the end is hard to tell without trying it out. But an ANTLR lexer is definitely very well optimized for tokenization. |
Ok, now I understand what you were getting at. My knowledge of lexers is zero to none, as you could see in the above post. I guess if we want to try that, you should take over. Still, I think it's an unnecessary action now that we have everything in place. Let's wrap up this PR; if we ever face the parenthesis question again, let's reconsider whether using the ANTLR lexer is a good idea. |
…elimiter.java Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some suggestions could not be made:
- rewrite-maven/src/test/java/org/openrewrite/maven/internal/MavenPomDownloaderTest.java
- lines 1278-1278
What's changed?
Exclude parentheses handling for comments and string literals.
What's your motivation?
Improvement to
Checklist