tree-sitter / tree-sitter

An incremental parsing system for programming tools

Home Page: https://tree-sitter.github.io

`ts_node_is_missing()` fails if missing node wraps hidden rules

DavisVaughan opened this issue

Problem

This is my attempt to bring #1043 back to the forefront with a more convincing example, showing that this is a bug in tree-sitter core and not just the CLI (as it was tagged there), so hopefully a fix can make it into the next release 🤞.

Consider tree-sitter-julia's `identifier` rule, `identifier: $ => $._word_identifier`:
https://github.com/tree-sitter/tree-sitter-julia/blob/acd5ca12cc278df7960629c2429a096c7ac4bbea/grammar.js#L975

For some reason unknown to me, identifier rules seem to be used as the nodes that get inserted as MISSING for pretty much all grammars. All good there.

However, in this particular grammar the `identifier` rule points to a hidden rule, `_word_identifier`. That somehow prevents `ts_node_is_missing()` from detecting that the inserted MISSING `identifier` node is actually missing, which makes that function pretty much useless for the Julia grammar.

I have come up with a failing test for this to prove it, and I have pushed this to GitHub if that is easier for you to pull from:
DavisVaughan@34c5ffc

Note that I had to add Julia to the list of fixtures to fetch, as I could not find an existing fixture grammar where the `identifier` node points to a hidden rule like `_word_identifier` (which is probably why this issue hasn't really been noticed before).

#[test]
fn test_node_is_missing() {
    let mut parser = Parser::new();
    parser.set_language(&get_language("julia")).unwrap();
    let source = "x =";

    let tree = parser.parse(source, None).unwrap();

    let node = tree.root_node().child(0).unwrap();
    assert_eq!(node.kind(), "assignment");

    // The `x`
    let child = node.child(0).unwrap();
    assert_eq!(child.kind(), "identifier");

    // The `=`
    let child = node.child(1).unwrap();
    assert_eq!(child.kind(), "operator");

    // This is a MISSING node!
    let child = node.child(2).unwrap();
    assert_eq!(child.kind(), "identifier");

    // See, it is 0 width and everything (this passes)
    assert_eq!(
        child.range(),
        Range {
            start_byte: 3,
            end_byte: 3,
            start_point: Point { row: 0, column: 3 },
            end_point: Point { row: 0, column: 3 },
        }
    );

    // But it isn't declared as missing here.
    // This should pass, but does not!
    assert!(child.is_missing());
}
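
Until `is_missing()` reports these nodes correctly, the zero-width check from the test above could serve as a rough stopgap. The sketch below is only an illustration under assumptions: `ToyNode`, `looks_missing`, `collect_missing`, and `sample_tree` are hypothetical names on a toy model, not part of the tree-sitter API, and the "zero-width leaf means MISSING" rule is a heuristic that may also match legitimately empty tokens in some grammars.

```rust
// Hypothetical stand-in for a syntax node; NOT the real `tree_sitter::Node`.
#[derive(Debug, Clone)]
struct ToyNode {
    kind: &'static str,
    start_byte: usize,
    end_byte: usize,
    // What `is_missing()` reports for this node (possibly wrong, as in this issue).
    is_missing: bool,
    children: Vec<ToyNode>,
}

impl ToyNode {
    fn leaf(kind: &'static str, start_byte: usize, end_byte: usize, is_missing: bool) -> Self {
        ToyNode { kind, start_byte, end_byte, is_missing, children: Vec::new() }
    }
}

/// Heuristic (an assumption, not tree-sitter behavior): a leaf that spans
/// zero bytes is treated as a candidate MISSING node, whatever its flag says.
fn looks_missing(node: &ToyNode) -> bool {
    node.is_missing || (node.children.is_empty() && node.start_byte == node.end_byte)
}

/// Depth-first walk collecting every node that looks missing.
fn collect_missing<'a>(node: &'a ToyNode, out: &mut Vec<&'a ToyNode>) {
    if looks_missing(node) {
        out.push(node);
    }
    for child in &node.children {
        collect_missing(child, out);
    }
}

/// Models the tree for `x =`, with the flag wrongly unset as in this report.
fn sample_tree() -> ToyNode {
    ToyNode {
        kind: "assignment",
        start_byte: 0,
        end_byte: 3,
        is_missing: false,
        children: vec![
            ToyNode::leaf("identifier", 0, 1, false),
            ToyNode::leaf("operator", 2, 3, false),
            // The inserted node: zero width, but not flagged as missing.
            ToyNode::leaf("identifier", 3, 3, false),
        ],
    }
}

fn main() {
    let tree = sample_tree();
    let mut found = Vec::new();
    collect_missing(&tree, &mut found);
    for node in &found {
        println!("{} at {}..{}", node.kind, node.start_byte, node.end_byte);
    }
}
```

With the real bindings, the same walk would compare `start_byte()` and `end_byte()` on each `Node`, but that is a workaround for callers and not a substitute for fixing `ts_node_is_missing()` itself.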

Steps to reproduce

Check out the commit linked above and run the tests.

Expected behavior

`ts_node_is_missing()` should detect the MISSING node, even when that node's rule wraps a hidden rule like `_word_identifier`.

Tree-sitter version (tree-sitter --version)

tree-sitter 0.22.5

Operating system/version

macOS 13.6.5