tree-sitter / tree-sitter-c

C grammar for tree-sitter

bug: Fails to parse concatenated_string

mingodad opened this issue

Did you check existing issues?

  • I have read all the tree-sitter docs if it relates to using the parser
  • I have searched the existing issues of tree-sitter-c

Tree-Sitter CLI Version, if relevant (output of tree-sitter --version)

tree-sitter 0.20.9 (8759352542e298a537ff7d96d74b362d9011684b)

Describe the bug

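The parser fails to recognize a concatenated_string when an identifier (a macro that expands to a string) appears between two string literals. The identifier is wrapped in an ERROR node: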

translation_unit [0, 0] - [5, 0]
  preproc_def [0, 0] - [1, 0]
    name: identifier [0, 8] - [0, 11]
    value: preproc_arg [0, 11] - [0, 20]
  function_definition [1, 0] - [4, 1]
    type: primitive_type [1, 0] - [1, 3]
    declarator: function_declarator [1, 4] - [1, 14]
      declarator: identifier [1, 4] - [1, 8]
      parameters: parameter_list [1, 8] - [1, 14]
        parameter_declaration [1, 9] - [1, 13]
          type: primitive_type [1, 9] - [1, 13]
    body: compound_statement [2, 0] - [4, 1]
      declaration [3, 4] - [3, 38]
        type_qualifier [3, 4] - [3, 9]
        type: primitive_type [3, 10] - [3, 14]
        declarator: init_declarator [3, 15] - [3, 37]
          declarator: pointer_declarator [3, 15] - [3, 19]
            declarator: identifier [3, 16] - [3, 19]
          value: concatenated_string [3, 22] - [3, 37]
            string_literal [3, 22] - [3, 28]
            ERROR [3, 29] - [3, 32]
              identifier [3, 29] - [3, 32]
            string_literal [3, 33] - [3, 37]
              escape_sequence [3, 34] - [3, 36]

Steps To Reproduce/Bad Parse Tree

tree-sitter parse test.c

Expected Behavior/Parse Tree

translation_unit [0, 0] - [5, 0]
  preproc_def [0, 0] - [1, 0]
    name: identifier [0, 8] - [0, 11]
    value: preproc_arg [0, 11] - [0, 20]
  function_definition [1, 0] - [4, 1]
    type: primitive_type [1, 0] - [1, 3]
    declarator: function_declarator [1, 4] - [1, 14]
      declarator: identifier [1, 4] - [1, 8]
      parameters: parameter_list [1, 8] - [1, 14]
        parameter_declaration [1, 9] - [1, 13]
          type: primitive_type [1, 9] - [1, 13]
    body: compound_statement [2, 0] - [4, 1]
      declaration [3, 4] - [3, 38]
        type_qualifier [3, 4] - [3, 9]
        type: primitive_type [3, 10] - [3, 14]
        declarator: init_declarator [3, 15] - [3, 37]
          declarator: pointer_declarator [3, 15] - [3, 19]
            declarator: identifier [3, 16] - [3, 19]
          value: concatenated_string [3, 22] - [3, 37]
            string_literal [3, 22] - [3, 28]
            identifier [3, 29] - [3, 32]
            string_literal [3, 33] - [3, 37]
              escape_sequence [3, 34] - [3, 36]

Repro

#define STR "string"
int main(void)
{
    const char *str = "The " STR "\n";
}

With the changes shown below, the example above parses correctly, but there may be a better solution.

...
   inline: $ => [
@@ -67,6 +68,7 @@ module.exports = grammar({
     [$.enum_specifier],
     [$._type_specifier, $._old_style_parameter_list],
     [$.parameter_list, $._old_style_parameter_list],
+    [$._expression_not_binary, $.concatenated_string],
   ],
...
-    concatenated_string: $ => prec.right(seq(
-      choice($.identifier, $.string_literal),
-      $.string_literal,
-      repeat(choice($.string_literal, $.identifier)), // Identifier is added to parse macros that are strings, like PRIu64
-    )),
+    concatenated_string: $ => prec.right(choice(
+      seq(
+        $.identifier,
+        $.string_literal,
+        repeat(choice($.string_literal, $.identifier)), // Identifier is added to parse macros that are strings, like PRIu64
+      ),
+      seq(
+        $.string_literal,
+        repeat1(choice($.string_literal, $.identifier)), // Identifier is added to parse macros that are strings, like PRIu64
+      )),
+    ),

#189 fixed this