Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Strange results with stack and non-matching alternative #394

Closed
golddranks opened this issue Jun 8, 2019 · 3 comments · Fixed by #395
Closed

Strange results with stack and non-matching alternative #394

golddranks opened this issue Jun 8, 2019 · 3 comments · Fixed by #395

Comments

@golddranks
Copy link
Contributor

golddranks commented Jun 8, 2019

I'm getting strange results with the following grammar:

raw_string_inner = { (!PEEK_ALL ~ ANY)* }
raw_string = ${ PUSH("'")+ ~ raw_string_inner ~ POP_ALL }

named_exp = { raw_string ~ "->" ~ ASCII_ALPHA_LOWER }

exp = { named_exp | raw_string }

With test code '' it's a test '' and matching against exp, I'm expecting it to find a match, but it doesn't. Instead, with '' it's a test ''' (unbalanced single quotes), it does find a match.

When I test directly against raw_string, it finds a match. It seems, from the behaviour, that exp, failing to match the first alternative, messes up the state of the stack (one-off error?) when it starts to match the next one.

It might, of course, also be that I'm misunderstanding something about how the stack works, but my general understanding is that once one alternative fails to match, it shouldn't have an effect to matching the next one...?

I haven't looked in the generated code or runtime yet; just gonna start investigating now, but figured that I'd make an issue first in case there's a quick gotcha I haven't taken into account.

@golddranks
Copy link
Contributor Author

Btw. I got it working with a better grammar that uses the stack in a saner way:

raw_string_inner = { (!PEEK ~ ANY)* }
raw_string = ${ PUSH("'"+) ~ raw_string_inner ~ POP }

named_exp = { raw_string ~ "->" ~ ASCII_ALPHA_LOWER }

exp = { named_exp | raw_string }

However, it still bothers me whether it's a bug; seems like the state of the stack is not the same when trying the next alternative after a failed one.

@golddranks
Copy link
Contributor Author

Okay, tried to instrument the generated parsing code a bit. (Btw. if there is a better way to debug than doing cargo expand and instrumenting by hand, it would be very helpful to mention it in the documentation.) Here is the exp rule parsing function:

#[inline]
#[allow(non_snake_case, unused_variables)]
pub fn exp(
    state: Box<::pest::ParserState<Rule>>,
) -> ::pest::ParseResult<Box<::pest::ParserState<Rule>>> {
    dbg!(&state);
	dbg!(
    state.rule(Rule::exp, |state| {
        dbg!(&state);
        dbg!(state
            .restore_on_err(|state| {
                dbg!(&state);
                dbg!(self::named_exp(state))
            })
            .or_else(|state| {
                dbg!(&state);
                dbg!(state.restore_on_err(|state|
                    self::raw_string(state)
                ))
            }))
    }))
}

I'm getting a trace like this. Indeed, the return value of named_exp shows two pushes and only one pop. Gonna investigate further.

[src/main.rs:121] &state = ParserState {
    position: Position {
        pos: 0,
    },
    queue: [],
    lookahead: None,
    pos_attempts: [],
    neg_attempts: [],
    attempt_pos: 0,
    atomicity: NonAtomic,
    stack: Stack {
        ops: [],
        cache: [],
        snapshots: [],
    },
}
[src/main.rs:124] &state = ParserState {
    position: Position {
        pos: 0,
    },
    queue: [
        Start {
            end_token_index: 0,
            input_pos: 0,
        },
    ],
    lookahead: None,
    pos_attempts: [],
    neg_attempts: [],
    attempt_pos: 0,
    atomicity: NonAtomic,
    stack: Stack {
        ops: [],
        cache: [],
        snapshots: [],
    },
}
[src/main.rs:127] &state = ParserState {
    position: Position {
        pos: 0,
    },
    queue: [
        Start {
            end_token_index: 0,
            input_pos: 0,
        },
    ],
    lookahead: None,
    pos_attempts: [],
    neg_attempts: [],
    attempt_pos: 0,
    atomicity: NonAtomic,
    stack: Stack {
        ops: [],
        cache: [],
        snapshots: [
            0,
        ],
    },
}
[src/main.rs:128] self::named_exp(state) = Err(
    ParserState {
        position: Position {
            pos: 0,
        },
        queue: [
            Start {
                end_token_index: 0,
                input_pos: 0,
            },
        ],
        lookahead: None,
        pos_attempts: [
            raw_string,
        ],
        neg_attempts: [],
        attempt_pos: 0,
        atomicity: NonAtomic,
        stack: Stack {
            ops: [
                Push(
                    Span {
                        str: "\'",
                        start: 0,
                        end: 1,
                    },
                ),
                Push(
                    Span {
                        str: "\'",
                        start: 1,
                        end: 2,
                    },
                ),
                Pop(
                    Span {
                        str: "\'",
                        start: 1,
                        end: 2,
                    },
                ),
            ],
            cache: [
                Span {
                    str: "\'",
                    start: 0,
                    end: 1,
                },
            ],
            snapshots: [
                0,
                1,
            ],
        },
    },
)
[src/main.rs:131] &state = ParserState {
    position: Position {
        pos: 0,
    },
    queue: [
        Start {
            end_token_index: 0,
            input_pos: 0,
        },
    ],
    lookahead: None,
    pos_attempts: [
        raw_string,
    ],
    neg_attempts: [],
    attempt_pos: 0,
    atomicity: NonAtomic,
    stack: Stack {
        ops: [
            Push(
                Span {
                    str: "\'",
                    start: 0,
                    end: 1,
                },
            ),
        ],
        cache: [
            Span {
                str: "\'",
                start: 0,
                end: 1,
            },
        ],
        snapshots: [
            0,
        ],
    },
}
[src/main.rs:132] state.restore_on_err(|state| self::raw_string(state)) = Err(
    ParserState {
        position: Position {
            pos: 0,
        },
        queue: [
            Start {
                end_token_index: 0,
                input_pos: 0,
            },
        ],
        lookahead: None,
        pos_attempts: [
            raw_string,
            raw_string,
        ],
        neg_attempts: [],
        attempt_pos: 0,
        atomicity: NonAtomic,
        stack: Stack {
            ops: [
                Push(
                    Span {
                        str: "\'",
                        start: 0,
                        end: 1,
                    },
                ),
                Push(
                    Span {
                        str: "\'",
                        start: 0,
                        end: 1,
                    },
                ),
            ],
            cache: [
                Span {
                    str: "\'",
                    start: 0,
                    end: 1,
                },
                Span {
                    str: "\'",
                    start: 0,
                    end: 1,
                },
            ],
            snapshots: [
                0,
                1,
            ],
        },
    },
)
[src/main.rs:125] state.restore_on_err(|state|
                         {
                             dbg!(& state);
                             dbg!(self :: named_exp ( state ))
                         }).or_else(|state|
                                        {
                                            dbg!(& state);
                                            dbg!(state . restore_on_err (
                                                 | state | self :: raw_string
                                                 ( state ) ))
                                        }) = Err(
    ParserState {
        position: Position {
            pos: 0,
        },
        queue: [
            Start {
                end_token_index: 0,
                input_pos: 0,
            },
        ],
        lookahead: None,
        pos_attempts: [
            raw_string,
            raw_string,
        ],
        neg_attempts: [],
        attempt_pos: 0,
        atomicity: NonAtomic,
        stack: Stack {
            ops: [
                Push(
                    Span {
                        str: "\'",
                        start: 0,
                        end: 1,
                    },
                ),
                Push(
                    Span {
                        str: "\'",
                        start: 0,
                        end: 1,
                    },
                ),
            ],
            cache: [
                Span {
                    str: "\'",
                    start: 0,
                    end: 1,
                },
                Span {
                    str: "\'",
                    start: 0,
                    end: 1,
                },
            ],
            snapshots: [
                0,
                1,
            ],
        },
    },
)
[src/main.rs:122] state.rule(Rule::exp,
           |state|
               {
                   dbg!(& state);
                   dbg!(state . restore_on_err (
                        | state | {
                        dbg ! ( & state ) ; dbg ! (
                        self :: named_exp ( state ) ) } ) . or_else (
                        | state | {
                        dbg ! ( & state ) ; dbg ! (
                        state . restore_on_err (
                        | state | self :: raw_string ( state ) ) ) } ))
               }) = Err(
    ParserState {
        position: Position {
            pos: 0,
        },
        queue: [],
        lookahead: None,
        pos_attempts: [
            exp,
        ],
        neg_attempts: [],
        attempt_pos: 0,
        atomicity: NonAtomic,
        stack: Stack {
            ops: [
                Push(
                    Span {
                        str: "\'",
                        start: 0,
                        end: 1,
                    },
                ),
                Push(
                    Span {
                        str: "\'",
                        start: 0,
                        end: 1,
                    },
                ),
            ],
            cache: [
                Span {
                    str: "\'",
                    start: 0,
                    end: 1,
                },
                Span {
                    str: "\'",
                    start: 0,
                    end: 1,
                },
            ],
            snapshots: [
                0,
                1,
            ],
        },
    },
)

@golddranks
Copy link
Contributor Author

Allright, got a bit further. It seems that the problem is with snapshots: It saves 0 and 1, but restores only 1 on error, resulting one Push operation staying on stack.

bors bot added a commit that referenced this issue Jun 13, 2019
395: Clearing checkpoints in error handler on successful parse r=dragostis a=golddranks

* When `restore_on_err` is called, a checkpoint is added.
* When parsing fails inside the call, the checkpoint is resumed. Specifically, stack is resumed to the state it was before entering `restore_on_err`.
* However, when parsing inside `restore_on_err` succeeds, at the moment, the checkpoint is not cleared on the exit.
* This leads to bugs where after returning from `restore_on_err`, another error is encountered, but the checkpoint that was set in `restore_on_err` is incorrectly resumed; this causes the stack to become in inconsistent state.
* Fixed the bug by adding a function for clearing checkpoints and calling it on the successful path of `restore_on_err`.
* Resolves #394
* Added test for this case. However, the testing infrastructure isn't quite clear for me, so it might be that the test would be better expressed somewhere else, in some other way. Please advise on this.

Co-authored-by: Pyry Kontio <[email protected]>
@bors bors bot closed this as completed in #395 Jun 13, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant