Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unable to parse inline JSX element with MDX expression child #136

Closed
begleynk opened this issue Sep 17, 2024 · 5 comments · Fixed by #138
Closed

Unable to parse inline JSX element with MDX expression child #136

begleynk opened this issue Sep 17, 2024 · 5 comments · Fixed by #138

Comments

@begleynk
Copy link
Contributor

It seems that since 1.0.0-alpha.17 this syntax doesn't work any more:

<span>{1}</span>

You get the following error:

Err(
    Message {
        place: Some(
            Point(
                2:1 (17),
            ),
        ),
        reason: "Expected a closing tag for `<span>` (1:1)",
        rule_id: "end-tag-mismatch",
        source: "markdown-rs",
    },
)

1.0.0-alpha.16 parses it correctly:

Ok(
    Root {
        children: [
            Paragraph {
                children: [
                    MdxJsxTextElement {
                        children: [
                            MdxTextExpression {
                                value: "1",
                                position: Some(
                                    1:7-1:10 (6-9),
                                ),
                                stops: [
                                    (
                                        0,
                                        7,
                                    ),
                                ],
                            },
                        ],
                        position: Some(
                            1:1-1:17 (0-16),
                        ),
                        name: Some(
                            "span",
                        ),
                        attributes: [],
                    },
                ],
                position: Some(
                    1:1-1:17 (0-16),
                ),
            },
        ],
        position: Some(
            1:1-2:1 (0-17),
        ),
    },
)

I used the following config:

    let mut opts = markdown::Options::default();
    let mut parse_opts = markdown::ParseOptions::mdx();
    parse_opts.constructs.gfm_table = true;

    opts.parse = parse_opts;

    // Parse the markdown file into an AST
    let mdast = markdown::to_mdast(&content, &opts.parse);

Perhaps this is the commit that caused it? 71361e4

Happy to take a stab at fixing this. Any pointers about what might be going on and where to start would be appreciated!

@wooorm
Copy link
Owner

wooorm commented Sep 17, 2024

Would love to have you help!

The commit you link to could definitely be it.
Weird that your input code doesn’t match? It seems that the comments talk about it, and the tests cover it?

In the commit message, I linked to the commits I tried to port. Perhaps there‘s a difference?

@begleynk
Copy link
Contributor Author

Sounds good! I'll take closer look and familiarize myself with the internals a bit.

I've started by adding a test case to tests/mdx_jsx_text.rs that repro's the issue (ignore incorrect positions atm):

    assert_eq!(
        to_mdast("<a>{1}</a>", &mdx.parse)?,
        Node::Root(Root {
            children: vec![Node::Paragraph(Paragraph {
                children: vec![Node::MdxJsxTextElement(MdxJsxTextElement {
                    name: Some("a".into()),
                    attributes: vec![],
                    children: vec![Node::MdxTextExpression(MdxTextExpression {
                        value: "1".into(),
                        position: Some(Position::new(1, 1, 0, 1, 3, 2)),
                        stops: vec![(0, 2)]
                    }),],
                    position: Some(Position::new(1, 1, 0, 1, 3, 2))
                }),],
                position: Some(Position::new(1, 1, 0, 1, 11, 10))
            })],
            position: Some(Position::new(1, 1, 0, 1, 11, 10))
        }),
        "should support mdx jsx (text) with expression child"
    );

I'll see how far I get!

@begleynk
Copy link
Contributor Author

begleynk commented Sep 18, 2024

I think I can see the issue, but I'm struggling to figure out the right fix.

The issue is indeed in src/construct/mdx_expression_flow.rs:

pub fn end(tokenizer: &mut Tokenizer) -> State {
    // ...comments removed for brevity...
    match tokenizer.current {
        None | Some(b'\n') => {
            reset(tokenizer);
            State::Ok
        }
        // -------- Attempting to parse a JSX tag after the expr ---------
        Some(b'<') if tokenizer.parse_state.options.constructs.mdx_jsx_flow => {
            tokenizer.attempt(
                State::Next(StateName::MdxExpressionFlowAfter),
                State::Next(StateName::MdxExpressionFlowNok),
            );
            State::Retry(StateName::MdxJsxStart)
        }
        // ----------------------------------------------------------------
        _ => {
            reset(tokenizer);
            State::Nok
        }
    }
}

If you take out those lines handling the JSX tag case, what seems to happen is that we try to parse the closing tag </a>, and the parser realizes "aha, we're no longer in a block-level JsxFlow", and backtracks to the beginning of the line, parsing everything inside a paragraph.

With this line, what instead happens is we never backtrack, and we get the mismatch error because it seems there's a leftover event after we run compile (to_mdast.rs).

So unless I'm way off here, it seems like we'd like to be able to do the same thing and backtrack once we've parsed the full line?

e.g. if you change the highlighted code above to this:

        Some(b'<') if tokenizer.parse_state.options.constructs.mdx_jsx_flow => {
            tokenizer.tokenize_state.token_1 = Name::MdxJsxFlowTag;
            tokenizer.attempt(
                State::Next(StateName::MdxJsxFlowAfter),
                State::Next(StateName::MdxJsxFlowNok),
            );
            State::Retry(StateName::MdxJsxStart)
        }

This parses, but we don't get an inline JSX flow inside a Paragraph, but a block-level JSX tag instead.

Any pointers here would be really appreciated!

@begleynk
Copy link
Contributor Author

begleynk commented Sep 18, 2024

I actually just noticed that calling to_mdast with this input: {1}<x/> causes a panic:

thread 'main' panicked at /Users/nik/.asdf/installs/rust/1.75.0/registry/src/index.crates.io-6f17d22bba15001f/markdown-1.0.0-alpha.20/src/to_mdast.rs:1557:30:
expected tag
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

to_html_with_options returns a blank string, and there's a test for this case.

This was the config I used:

    let mut opts = markdown::Options::default();
    let parse_opts = markdown::ParseOptions::mdx();

    opts.parse = parse_opts;

    // Parse the markdown file into an AST
    let mdast = markdown::to_mdast(&content, &opts.parse);

Update
Looking deeper, it looks like the code that causes the panic is assuming that there was an opening tag previously.

Assuming that should be valid syntax, what would the AST be? The test claims it should be a block-level construct, but what is the container?

    // test in question:
    assert_eq!(
        to_html_with_options("{1}<x/>", &swc)?,
        "",
        "should support an expression and then a tag (flow)"
    );

(Sorry this is getting a bit rambly... I'm going to take another look later with some fresh eyes!)

@wooorm
Copy link
Owner

wooorm commented Sep 19, 2024

thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants