-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement the legalizer #438
base: main
Are you sure you want to change the base?
Conversation
this contains all of the functions used in the legalizer as well as the recursive tree walk, but it does not contain the diagnostics, which will come in commits that follow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've made it through 16 out of 26 commits. I know I've got some big commits still ahead of me (legalize message and friends; legalize fields), which I am still working through.
But I figured I should go ahead and publish the remarks I have so far since there has been no other activity on the PR since we talked a couple of days ago. I promise I've been working on it and not ignoring it! :)
19 | package 42; | ||
| ^^^^^^^ | ||
= help: to place a file in the empty package, remove the `package` declaration | ||
= help: however, using the empty package is discouraged |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem like a good diagnostic. Perhaps the AST node should allow an expression instead of just a path here, so that you diagnose 42
is an invalid type for the package name?
(Totally fine to be a TODO: I know changing the AST at this point might be enough churn to warrant a separate PR.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I explained elsewhere why this doesn't work properly. There's already a TODO in the .proto file.
... | | ||
22 | | import weak "foo.proto"; | ||
| | ^^^^^^^^^^^^^^^^^^^^^^^^ this weak import... | ||
23 | | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this diagnostic have an extra line of context after the snippet? The two above do not, so perhaps it's a bug?
If it's a quirk/bug in the reporting, totally fine to fix that in a separate PR. But, if it's an issue in the way the diagnostic is constructed in this commit, then let's try to fix in this branch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's a quirk with the renderer, but it was easy enough to find and fix.
| | ||
19 | / message M { | ||
... | | ||
23 | | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above: why is there an extra line of context before the snippet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also fixed :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finished reviewing the rest of the commits from the original push.
I saw you already replied to some of my first round of comments and pushed changes, so I'll take a look at those next.
experimental/parser/legalize_decl.go
Outdated
if in != taxa.Reserved { | ||
return | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We still need to legalize the values for extension ranges -- like make sure they are all numeric ranges. The only non-numeric value allowed in extension ranges is builtin max
.
Despite this early return, I see several (currently unnecessary) if in == taxa.Reserved
checks below, that perhaps should be expanded to perform other checks for extension ranges. For example, both cases below are checking usage of identifiers and names in reserved ranges, which should be outright disallowed in extension ranges.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a comment explaining why I haven't done that (although it appears in a later commit maybe?).
} | ||
|
||
seq.Values(decl.Ranges())(func(expr ast.ExprAny) bool { | ||
switch expr.Kind() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is missing a case for ast.ExprKindRange
(for both reserved and extension ranges): I imagine we'd want to legalize the bounds -- start must be an integer and end must either be integer or max
builtin.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, both of the diagnostics below warrant suggestions: where an identifier is found but unexpected, it should be quoted. Where a string literal is found but unexpected, it should be unquoted. Though if the string literal does not contain an identifier, we can't make that suggestion, but might also pair it with a warning that string literals should contain valid identifiers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added suggestions.
51 | | / oneof O { | ||
... | | | ||
57 | | | } | ||
| \___- ...cannot be declared within this file scope | ||
| \_^ this oneof definition... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The interleaving of oneof and file appear in correct. Shouldn't the file always be rendered as "wider range" than anything else inside it?
Also, I don't find the file scope annotation to be helpful. In reality, the oneof is absoluately allowed inside this file scope -- it just must be nested inside a message. So I think it would be better to just leave out the file-scope annotation and to revise the wording about it not being allowed at the top-level. Maybe even use the map of allowed parents to construct a message about where it is allowed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Fixed, and added the help text.
def.MarkCorrupt() | ||
kw := taxa.Keyword(def.Keyword().Text()) | ||
p.Error(errUnexpected{ | ||
what: def.Name(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May be nice to just highlight the first separator instead of the entire path?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed it but idk if I like the way it looks.
error: unexpected `.` in identifier
--> testdata/parser/def/bad_path.proto:20:14
|
20 | oneof foo.Bar {}
| ^ expected identifier
= note: the name of a oneof definition must be a single identifier
experimental/parser/legalize_file.go
Outdated
@@ -222,6 +220,8 @@ func legalizePackage(p *parser, parent classified, idx int, first *ast.DeclPacka | |||
legalizePath(p, taxa.Package.In(), decl.Path(), pathOptions{Relative: true}) | |||
} | |||
|
|||
var isOrdinaryFilePath = regexp.MustCompile("[0-9a-zA-Z./_-]*") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why move this down? The convention in most of Buf's codebases is to keep consts and vars at the top of the file (consistency makes them easier to find when you're not in an IDE).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Closer to where it's used. I know that's the style, but for some reason I thought you'd said something about how you preferred to have stuff closer together? I can move it back.
(I wish go just had function-local globals lmao, this is such a caveman misfeature)
This PR adds the legalizer: a collection of diagnostics that follow parsing to eliminate and diagnose invalid constructs that the parser accepted in order the make progress.
This PR also adds a new
syntax
package for e.g.syntax.Proto2
and other editions knowledge. It adds functions for classifyingpredeclared.Name
s and some newtaxa.Noun
s for naming concepts in diagnostics relevant to the legalizer. Parsing is also tweaked in a few places to generate better diagnostics.