bug: JSX captures whitespaces in nested, multiline tags #306

Open
2 tasks done
SomeoneToIgnore opened this issue Jul 23, 2024 · 2 comments
@SomeoneToIgnore

Did you check existing issues?

  • I have read all the tree-sitter docs if it relates to using the parser
  • I have searched the existing issues of tree-sitter-typescript

Tree-Sitter CLI Version, if relevant (output of tree-sitter --version)

No response

Describe the bug

For a given TSX template,

a["b"] = <C d="e">
    <F></F>
    { g() }
</C>;

the nested jsx_opening_element on the next line is captured together with the preceding newline and indentation, as "\n    <F>" instead of just "<F>".

Steps To Reproduce/Bad Parse Tree

The parse tree is correct in both cases, but the ranges of its nodes are not.
I have not found a way to include ranges inside the node-based tests with *.txt files, so I've created a Rust test draft:

#[cfg(test)]
mod tests_f_node {
    use tree_sitter::Node;

    use super::*;

    #[test]
    fn tsx_tag_parse_ranges() {
        let code = r#"
                a["b"] = <C d="e">
                    <F></F>
                    { g() }
                </C>;
            "#;

        let mut parser = tree_sitter::Parser::new();
        parser
            .set_language(&super::language_tsx())
            .expect("Error loading TypeScript TSX grammar");

        let tree = parser.parse(code, None).unwrap();
        let root_node = tree.root_node();

        let f_node = get_f_node(root_node, code).expect("<F> node not found");

        // Assert the ranges. Modify these values according to the actual positions in your code.
        let start_byte = f_node.start_byte();
        let end_byte = f_node.end_byte();

        assert_eq!(start_byte, 36); // Replace with the correct start byte
        assert_eq!(end_byte, 39); // Replace with the correct end byte

        let start_position = f_node.start_position();
        let end_position = f_node.end_position();

        assert_eq!(start_position.row, 2); // Line number containing <F>
        assert_eq!(start_position.column, 16); // Column where <F> starts
        assert_eq!(end_position.row, 2);
        assert_eq!(end_position.column, 19); // Column where <F> ends
    }

    fn get_f_node<'a>(node: Node<'a>, code: &'a str) -> Option<Node<'a>> {
        for child in node.children(&mut node.walk()) {
            if child.kind() == "jsx_opening_element"
                && dbg!(child.utf8_text(code.as_bytes()).unwrap()) == "<F>"
            {
                return Some(child);
            }
            if let Some(found) = get_f_node(child, code) {
                return Some(found);
            }
        }
        None
    }
}

which outputs

---- tests_f_node::tsx_tag_parse_ranges stdout ----
[bindings/rust/lib.rs:118:20] child.utf8_text(code.as_bytes()).unwrap() = "<C d=\"e\">"
[bindings/rust/lib.rs:118:20] child.utf8_text(code.as_bytes()).unwrap() = "\n                    <F>"
thread 'tests_f_node::tsx_tag_parse_ranges' panicked at bindings/rust/lib.rs:97:50:
<F> node not found
stack backtrace:

on current master.

Expected Behavior/Parse Tree

I've bisected the regression to

37ced086ad8bb4fa67e8c53711e9f30e869bb78f is the first bad commit
commit 37ced086ad8bb4fa67e8c53711e9f30e869bb78f (HEAD)
Author: Amaan Qureshi <[email protected]>
Date:   Fri Jul 5 23:13:15 2024 -0400

    chore: generate

 tsx/src/grammar.json           |    370 +-
 tsx/src/node-types.json        |    843 +-
 tsx/src/parser.c               | 552504 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------------------------------------------------------------------------------------------
 typescript/src/grammar.json    |    366 +-
 typescript/src/node-types.json |    847 +-
 typescript/src/parser.c        | 530546 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----------------------------------------------------------------------------------------
 6 files changed, 440659 insertions(+), 644817 deletions(-)

and before this commit everything works fine:

[bindings/rust/lib.rs:118:20] child.utf8_text(code.as_bytes()).unwrap() = "<C d=\"e\">"
[bindings/rust/lib.rs:118:20] child.utf8_text(code.as_bytes()).unwrap() = "<F>"
thread 'tests_f_node::tsx_tag_parse_ranges' panicked at bindings/rust/lib.rs:103:9:
assertion `left == right` failed
// this failure is due to my test being a draft (the asserted positions are placeholders), but the test already exposes the issue, so it is useful in its current state
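
The placeholder values in the draft assertions could be derived from the source string instead of being hard-coded. The following is only a sketch using the standard library; `Span` and `expected_span` are hypothetical helper names, not part of tree-sitter. It assumes the searched text contains no newline and that columns are counted in bytes from the start of the line, which is how tree-sitter reports them.

```rust
/// Expected location of a snippet, using 0-based bytes, rows and byte columns.
#[derive(Debug, PartialEq)]
struct Span {
    start_byte: usize,
    end_byte: usize,
    row: usize,
    start_column: usize,
    end_column: usize,
}

/// Locate the first occurrence of `needle` in `code`.
/// Assumption: `needle` does not span multiple lines.
fn expected_span(code: &str, needle: &str) -> Option<Span> {
    let start_byte = code.find(needle)?;
    let end_byte = start_byte + needle.len();

    // Everything before the match determines the row and the line start.
    let before = &code[..start_byte];
    let row = before.matches('\n').count();
    let line_start = before.rfind('\n').map_or(0, |i| i + 1);

    let start_column = start_byte - line_start;
    Some(Span {
        start_byte,
        end_byte,
        row,
        start_column,
        end_column: start_column + needle.len(),
    })
}
```

In the test, `expected_span(code, "<F>")` could then be compared against `f_node.start_byte()`, `f_node.end_byte()`, `f_node.start_position()` and `f_node.end_position()`. With the indentation shown in the test above (16 spaces before `a["b"]`, 20 before `<F>`), this should give bytes 56..59 on row 2, columns 20..23.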

Repro

See the test above.

@SomeoneToIgnore
Author

Hello, I'm interested in fixing this and would appreciate any pointers.

@ediezindell

I was able to resolve the issue by rerunning npm run build on my PC.
