-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace regex with winnow
based parser
#255
base: main
Are you sure you want to change the base?
Conversation
This file is intended to define valid `Procfile` inputs and parser behavior separate from implementation. It may be updated in the future based on upstream requirements (CNB spec or kubernetes spec) or through future discussion.
@@ -0,0 +1,42 @@ | |||
# CNB Procfile format specification |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that this exists, but I wonder if this spec should be shared/enforced more widely?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shared
You mean you're wanting more feedback from a diverse group of people (such as other Heroku teams) or you're wanting to promote the format to a larger group outside of Heroku such as language communities (or both)?
I'm open to both. I'm also open to suggestions. Perhaps we link to the SPEC from the readme and make it more prominent.
enforced
We could share the logic by providing a stand-alone parser utility that converts it to some other format like json or toml. We could start by calling both parsers in cedar and instrumenting failures. While people are using #
like comments in today's procfile, technically any leading character would work //
.
I also had the idea that I could expose the parser or ship it via crates.io so that other buildpacks could use it. Some buildpacks have logic like "If app has a Procfile..." but it would be more robust to share the same parsing logic.
Getting a little ahead of myself. We could also consider upstreaming RFC 1123 to the CNB spec to make it more clear so then it's not "the procfile buildpack supports the intersection of kubernetes and CNB" and then becomes simply "the procfile buildpack supports the CNB spec".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I was just thinking share/enforce this across Heroku more broadly. This will be the third Procfile parser at Heroku, and I doubt they are consistent. This isn't a blocker for this PR though.
I'd support adjusting the CNB spec to have process names that match RFC 1123 if you wanted to do that. I expect many / most CNB users expect to use apps built with CNBs in kubernetes in some way.
* Use "separated" instead of terminated/preceded * Add a missing word * Specify that spaces are not part of they key * Document duplicate key behavior
- An empty line MUST be zero or more spaces followed by a line ending or end of file (EOF). | ||
- Key/Value pairs | ||
- A line MAY contain a key/value pair where the key represents the name of a process and the value represents a command | ||
- A key MUST be separated from its value by a colon (`:`) followed by zero or more spaces. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this spec covers a space before the :
? Like my-process : bin/blah
. I think that should be legal, but I could be wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that should be legal, but I could be wrong.
Looking at the prior regex
^[[:space:]]*([a-zA-Z0-9_-]+):?\\s+(.*)[[:space:]]*
- starts with one or more whitespace characters
- alphanumeric or
_
or-
- funny enough the
:
is optional, but it doesn't support spaces before:
- A space after the (optional)
:
is enforced (which is a requirement we've loosened up). It cannot be a tab, it must be a space (which we've loosened up). - Then it grabs the value
- Then the regex makes it look like it strips off any trailing spaces but it doesn't. Rust won't strip it out:
Nor will ruby:
irb(main):005:0> match = "web: rails s ".match(/^[[:space:]]*([a-zA-Z0-9_-]+):?\s+(.*)[[:space:]]*/);
=> #<MatchData "web: rails s " 1:"web" 2:"rails s ">
irb(main):006:0> match[2]
=> "rails s "
So a space before the :
wouldn't work today, but instead of an error or warning you'll quietly get an ignored line. I don't think we want to support optional :
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also in favour of making the spec (and the Procfile CNB implementation) more strict than the current implementation in cases where we think it makes sense. So long as the Procfile CNB error message is clear, users can tweak their Procfile if needed as part of migrating to CNB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 on @edmorley's comment. I imagine most of the allowed weird Procfile
s on cedar are accidentally valid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This probably warrants a major version bump. Looks good to me!
Replace the existing
Regex
based "parser" with an actual honest-to-goodness parser/combinator based parser.Procfile
validation more strict #73How to review
Some suggestions on how to review this:
procfile.rs
. Some of them were useful for development, the majority are asserting overall behavior and edge cases