Parsing-After-Editing Test Case is incorrect (and passing)
rooney opened this issue
Problem
The test test_parsing_after_editing_tree_that_depends_on_column_values is incorrect.
It starts with parsing the following source code:
a = b
c = do d
      e + f
      g
h + i
and asserting that the parse tree should be:
"(block ",
"(binary_expression (identifier) (identifier)) ",
"(binary_expression (identifier) (do_expression (block (identifier) (binary_expression (identifier) (identifier)) (identifier)))) ",
"(binary_expression (identifier) (identifier)))",
Then it calls perform_edit on the source code, which becomes:
a = b
c1234 = do d
      e + f
      g
h + i
(so far so good)
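For concreteness, the edit amounts to inserting the four bytes 1234 right after c. Below is a minimal, std-only sketch of that splice. The helper name insert_bytes is hypothetical and is not the test suite's perform_edit, and the six-space indentation of the continuation lines is an assumption.

```rust
/// Hypothetical helper (not tree-sitter's `perform_edit`):
/// insert `text` into `code` at byte offset `at`.
fn insert_bytes(code: &mut Vec<u8>, at: usize, text: &[u8]) {
    code.splice(at..at, text.iter().copied());
}

fn main() {
    // Assumed layout of the test input; the exact indentation is a guess.
    let mut code = b"a = b\nc = do d\n      e + f\n      g\nh + i\n".to_vec();

    // Find the first `c` (start of the second line) and insert right after it.
    let at = code.iter().position(|&b| b == b'c').unwrap() + 1;
    insert_bytes(&mut code, at, b"1234");

    println!("{}", String::from_utf8_lossy(&code));
}
```

Note that only the second line changes. Every token after the insertion point on that line moves four columns to the right, while the following lines stay where they were.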
The problem is, it then asserts that the parse tree of the edited source code should become:
"(block ",
"(binary_expression (identifier) (identifier)) ",
"(binary_expression (identifier) (do_expression (block (identifier)))) ",
"(binary_expression (identifier) (identifier)) ",
"(identifier) ",
"(binary_expression (identifier) (identifier)))",
This doesn't seem correct, because all perform_edit did was rename the identifier c to c1234. That should not change the tree structure at all.
But yes, the test passes, which means there is some bug in the tree-editing implementation.
Steps to reproduce
- Open cli/src/tests/parser_test.rs
- Run the test suite
- All tests pass
Expected behavior
test_parsing_after_editing_tree_that_depends_on_column_values should not pass.
Or, if it does pass, the edited parse tree should be equal to the original.
Tree-sitter version (tree-sitter --version)
tree-sitter 0.22.2 (b7fcf98)
Operating system/version
macOS 13.6.1
No, the test is correct as is.
That test is simulating a Haskell-like language, where indentation blocks are based on the column where the block begins (in this case, the word do). The exact grammar is in the test_grammars folder. You can see its indentation logic here.
tree-sitter/test/fixtures/test_grammars/uses_current_column/scanner.c
Lines 79 to 91 in 78b6067
Prior to the edit, the do is at column 4. So the following two lines, which begin at column 6, are considered indented (relative to the first line).
After the edit, do is now at column 8. So now, the second and third lines are not indented. This changes the entire syntactic structure of the block. That's the whole point of this test: the grammar depends on the column positions of tokens.
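The column arithmetic above can be checked with a small, std-only sketch. This is a simplified model and not the logic in scanner.c: the function names are hypothetical, the rule "a line belongs to the block if it starts strictly to the right of do" is an assumption, and so is the six-space indentation of the input.

```rust
/// Column (0-based) of the first non-whitespace character on a line.
fn first_column(line: &str) -> usize {
    line.len() - line.trim_start().len()
}

/// Hypothetical model: count the lines after `lines[do_line]` that start
/// strictly to the right of the `do` keyword on that line.
fn lines_in_do_block(lines: &[&str], do_line: usize) -> usize {
    let do_column = match lines[do_line].find("do ") {
        Some(column) => column,
        None => return 0,
    };
    lines[do_line + 1..]
        .iter()
        .take_while(|line| first_column(line) > do_column)
        .count()
}

fn main() {
    let before = ["a = b", "c = do d", "      e + f", "      g", "h + i"];
    let after = ["a = b", "c1234 = do d", "      e + f", "      g", "h + i"];

    // Before the edit `do` is at column 4, so both column-6 lines are inside.
    println!("before: {}", lines_in_do_block(&before, 1));
    // After the edit `do` is at column 8, so neither column-6 line is inside.
    println!("after: {}", lines_in_do_block(&after, 1));
}
```

Under this model the block holds two continuation lines before the edit and none after it, which matches the two trees the test asserts: e + f and g move out of the do_expression and become top-level children of the outer block.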