[feature] Generating constant str rules #718

Open
shilangyu opened this issue Oct 8, 2022 · 3 comments

@shilangyu

Problem

Grammars often need constant strings (keywords, operators, etc.). These can be introduced as separate rules, for example:

return_type_operator = { "->" }
function_keyword = { "fn" }

and then reused in other rules. When parsing and then generating the AST, these constant tokens are lost (which is obviously a good thing). But when we want to serialize the AST back to its code form, we need these constant tokens again. Since pest does not expose these constant strings publicly (they are inlined), they cannot be reused. The only remaining option is to redefine them, for example by attaching them to the AST nodes:

impl FunctionNode {
	const KEYWORD: &str = "fn";
}

This leads to a problem when we want to change a constant string; we need to remember to change it in both places.

Proposal

Pest could detect const rules and generate constants for them, for example:

// After #[derive(Parser)]
impl Constants {
	const return_type_operator: &str = "->";
	const function_keyword: &str = "fn";
}
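To illustrate how such generated constants might be consumed when serializing, here is a minimal sketch. Everything in it is an assumption for illustration: the `Constants` struct stands in for what pest could generate (it is not an existing pest API), the upper-case constant names are one possible naming choice, and `FunctionNode` with its fields and `to_source` method is a hypothetical AST node.

```rust
// Hypothetical: stands in for code that pest might generate from the grammar.
// This is not part of pest today.
pub struct Constants;

impl Constants {
    pub const RETURN_TYPE_OPERATOR: &'static str = "->";
    pub const FUNCTION_KEYWORD: &'static str = "fn";
}

// Hypothetical AST node; the fields are made up for this example.
pub struct FunctionNode {
    pub name: String,
    pub return_type: String,
}

impl FunctionNode {
    // Serializes the node back to source text, taking the keyword and the
    // operator from the generated constants instead of redefining them.
    pub fn to_source(&self) -> String {
        format!(
            "{} {}() {} {}",
            Constants::FUNCTION_KEYWORD,
            self.name,
            Constants::RETURN_TYPE_OPERATOR,
            self.return_type
        )
    }
}
```

With this shape, changing `"fn"` in the grammar would be the only edit needed, since the serializer never spells the keyword out itself.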

I see two paths that can be taken here:

  1. If a rule is composed of a single literal string, we generate a constant for it
  2. If the rule can be evaluated as const, we generate a constant for it

Path 2 is obviously preferred, but harder to implement, since it would need some kind of const eval engine. For instance, a rule is constant if all rules used inside it are constant, if the repetition count is constant, etc. The rule would then have to be evaluated to a single final constant string.
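A rough sketch of what such a const evaluation could look like follows. The `Expr` enum is a simplified, hypothetical stand-in and not pest's actual grammar AST, and `const_eval` with its `depth` guard is invented for illustration. It also assumes sequences match their parts back to back, which in pest would only hold for atomic rules, since `~` in normal rules may allow implicit whitespace between the parts.

```rust
use std::collections::HashMap;

/// Simplified, hypothetical grammar expression (not pest's real AST).
#[derive(Debug, Clone)]
pub enum Expr {
    /// A string literal such as "fn".
    Str(String),
    /// A reference to another rule by name.
    Ident(String),
    /// `a ~ b`, assumed here to match with nothing in between.
    Seq(Box<Expr>, Box<Expr>),
    /// An expression repeated exactly `n` times.
    RepExact(Box<Expr>, u32),
    /// `a | b`; treated as non-constant in this sketch.
    Choice(Box<Expr>, Box<Expr>),
}

/// Returns `Some(string)` if `expr` can only ever match one fixed string,
/// and `None` otherwise. `depth` bounds recursion so that cyclic rule
/// references end in `None` instead of overflowing the stack.
pub fn const_eval(expr: &Expr, rules: &HashMap<String, Expr>, depth: usize) -> Option<String> {
    if depth == 0 {
        return None;
    }
    match expr {
        Expr::Str(s) => Some(s.clone()),
        // A rule reference is constant if the referenced rule is constant.
        Expr::Ident(name) => const_eval(rules.get(name)?, rules, depth - 1),
        // A sequence is constant if both halves are constant.
        Expr::Seq(a, b) => {
            let mut out = const_eval(a, rules, depth - 1)?;
            out.push_str(&const_eval(b, rules, depth - 1)?);
            Some(out)
        }
        // A fixed repetition count of a constant expression is constant.
        Expr::RepExact(inner, n) => {
            Some(const_eval(inner, rules, depth - 1)?.repeat(*n as usize))
        }
        Expr::Choice(_, _) => None,
    }
}
```

Treating every choice as non-constant is a deliberate simplification; a fuller version could accept a choice whose alternatives all evaluate to the same string.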

I don't know the codebase at all, but I'm pretty sure I would be able to implement path 1 and send a PR for it if it is wanted. I would only worry about whether such a primitive feature would feel incomplete compared to path 2.
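For a sense of how small path 1 could be, here is a text-based sketch. It is not based on pest's generator (which I have not read): `RuleInfo` and `generate_constants` are hypothetical names, the input is assumed to already say whether a rule body is exactly one string literal, and a real implementation would presumably emit tokens from the derive macro rather than build a string.

```rust
/// Hypothetical, simplified description of a grammar rule: its name, plus
/// `Some(literal)` when the rule body is exactly one string literal.
pub struct RuleInfo<'a> {
    pub name: &'a str,
    pub single_literal: Option<&'a str>,
}

/// Emits Rust source text declaring one constant per single-literal rule.
/// Rules that are not a single literal are skipped (path 1 only).
pub fn generate_constants(rules: &[RuleInfo<'_>]) -> String {
    let mut out = String::from("impl Constants {\n");
    for rule in rules {
        if let Some(literal) = rule.single_literal {
            // `{:?}` prints the string quoted and escaped, so the output is
            // usable as a Rust string literal.
            out.push_str(&format!(
                "    pub const {}: &'static str = {:?};\n",
                rule.name.to_uppercase(),
                literal
            ));
        }
    }
    out.push_str("}\n");
    out
}
```

Upper-casing the rule name is only one option; keeping the rule name as written, as in the proposal above, would work too but triggers Rust's naming-convention warning for constants.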

@tomtau
Contributor

tomtau commented Oct 8, 2022

I'm wondering whether another feature (e.g. node/branch tagging) could be hijacked/reused for this use case in some way.

For path 2, you can check out https://github.com/pest-parser/pest/tree/master/meta/src/optimizer -- e.g. https://github.com/pest-parser/pest/blob/master/meta/src/optimizer/concatenator.rs will turn function_keyword = @{ "f" ~ "n" } into function_keyword = @{ "fn" }

@shilangyu
Author

Is the node/branch tagging you are referring to discussed somewhere?

@tomtau
Contributor

tomtau commented Oct 8, 2022

#550 -- it's primarily for the parser state and results, but it could perhaps be exposed in the generated rules code
