Suppress output of unnecessary nodes #8

oovm · 2021-10-20T09:39:26Z

This is my grammar file,

https://github.com/oovm/lrpeg-test/blob/master/projects/lrpeg/src/ygg.peg

This is my parsing output of x=0

https://github.com/oovm/lrpeg-test/blob/95b032c9ca0fd46c39ea31edf480d41eaecdec1b/projects/lrpeg/tests/assign.yaml#L23-L34

I don't understand why there are two Terminals here, and the statement nodes I need are wrapped in children.

In my understanding, Terminal is ε or string or regex, it should have no children.

The text was updated successfully, but these errors were encountered:

seanyoung · 2021-10-20T11:20:53Z

You are right, this looks broken.

seanyoung · 2021-10-20T16:03:30Z

(statement IGNORE)* becomes a node in tree, which is incorrectly labelled Terminal. I've just pushed a change which labels them List (amongst others). Let me know how this works for you.

oovm · 2021-10-20T16:22:03Z

I'm just translating pest

program  = {SOI ~ statement* ~ EOI}
vs
program  <- IGNORE (statement IGNORE)* EOI;

// IGNORE means anything that can be skipped
IGNORE <- space* / newline* / comment?

a ~ b   =>  a IGNORE b
a ~ b?  =>  a IGNORE b?
a ~ b*  =>  a IGNORE (b IGNORE)*
a ~ b+  => a ~ b ~ b*

From the results, pest did not generate additional nodes

oovm · 2021-10-20T16:58:54Z

Okay, it makes sense, after I traverse it once and flatten it, the result is correct

seanyoung · 2021-10-20T17:08:58Z

lrpeg does generate too many nodes. Lots of them do not have useful information.

Does pest have a way of marking a rule/line as "do not generate nodes for this" or it is clever in some other way?

oovm · 2021-10-20T17:51:50Z

In my opinion, whether it is useful or not needs to be determined according to the purpose. My classification is like this

Useless: Hard to think of usage
- ε, EOI
Ignored: Formatting needs to use these semantics
- comment, space, newline
Unnamed(Weak semantics): macros need to use these semantics
- keywords, brackets, operators
Effective semantics:
- others

According to this classification, my filter looks like this

pub fn flatten(node: Node) -> Node {
    let mut buffer = vec![];
    for node in node.children {
        flatten_rec(node, &mut buffer)
    }
    Node {
        rule: node.rule,
        start: node.start,
        end: node.end,
        children: buffer,
        alternative: node.alternative,
    }
}

pub fn flatten_rec(node: Node, buffer: &mut Vec<Node>) {
    match node.rule {
        // flatten these nodes
        Rule::Any | Rule::List => {
            for node in node.children {
                flatten_rec(node, buffer)
            }
        }
        // not important
        Rule::EOI => {}
        #[cfg(feature = "no-ignored")]
        Rule::IGNORE => {}
        #[cfg(not(feature = "no-ignored"))]
        Rule::IGNORE if node.start == node.end => {}
        #[cfg(feature = "no-unnamed")]
        Rule::Terminal => {}
        #[cfg(not(feature = "no-unnamed"))]
        Rule::Terminal if node.start == node.end => {}
        _ => buffer.push(flatten(node)),
    }
}

seanyoung · 2021-10-23T07:16:35Z

How can the parser generator decided which nodes to create and which not to create nodes for?

seanyoung · 2021-10-23T08:58:07Z

We could take inspiration from pest and do not create nodes for rules which start with an underscore.

oovm · 2021-10-27T01:12:43Z

It sounds like a feasible design, but

If the node is hidden, who will hold the label and alternative attached to the node.

eg: what's the result of 1+2 under rule:

expr <- _expr0;
_expr0 <- 
    add:/ <lhs:_expr0> ("+"/"-") <rhs:_expr1>
        / _expr1;
expr1 <- 
    mul:/ <lhs:_expr1> ("*"/"/") <rhs:_expr2>
        / _expr2;
_expr2 <- 
    pow:/ <lhs:_expr3> "^" <rhs:_expr2>
        / _expr3;
_expr3 <- num:/num;

seanyoung · 2021-10-27T07:50:10Z

So the idea is that if a node is hidden, then it will inherit its (non-hidden) children. So lhs and rhs are in the parse tree, even though _expr0 is not.

Now that does leave the question about the alternative though..

seanyoung added the enhancement New feature or request label Oct 23, 2021

seanyoung changed the title ~~Qustion about children of Terminal~~ Suppress output of unnecessary nodes Oct 23, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Suppress output of unnecessary nodes #8

Suppress output of unnecessary nodes #8

oovm commented Oct 20, 2021

seanyoung commented Oct 20, 2021

seanyoung commented Oct 20, 2021

oovm commented Oct 20, 2021

oovm commented Oct 20, 2021

seanyoung commented Oct 20, 2021

oovm commented Oct 20, 2021 •

edited

Loading

seanyoung commented Oct 23, 2021

seanyoung commented Oct 23, 2021

oovm commented Oct 27, 2021

seanyoung commented Oct 27, 2021

Suppress output of unnecessary nodes #8

Suppress output of unnecessary nodes #8

Comments

oovm commented Oct 20, 2021

seanyoung commented Oct 20, 2021

seanyoung commented Oct 20, 2021

oovm commented Oct 20, 2021

oovm commented Oct 20, 2021

seanyoung commented Oct 20, 2021

oovm commented Oct 20, 2021 • edited Loading

seanyoung commented Oct 23, 2021

seanyoung commented Oct 23, 2021

oovm commented Oct 27, 2021

seanyoung commented Oct 27, 2021

oovm commented Oct 20, 2021 •

edited

Loading