z7zmey / php-parser

PHP parser written in Go

Home Page:https://php-parser.com

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

No reliable way to work with parenthesis

quasilyte opened this issue · comments

Proposal: add WithParens in analogy with func (l *Parser) WithFreeFloating().
That function asks parser to preserve info about parenthesis.

It can either introduce AST nodes that represent parenthesis or make them implicit, but positions info should respect them.

It's almost impossible to handle parenthesis in the source code right now. We can't reliably count a number of expression surrounding ( and ).

Imagine this binary expression:

(($x) + ((($y))))

If we use GetPosition and take a slice of src[pos.StartPos-1 : pos.EndPos], result would be:

$x) + ((($y

If we go to the left and right, we'll count 2 parens, but the binary expression itself has only 1 surrounding, while $y is surrounded by 3 pairs of ().

With proposed feature, I would expect binary expression position to capture the entire (($x) + ((($y)))). Sub-expression 1 should be ($x) and the 2 is ((($y))).

For clone expr like clone ($x) we get positions that encloses clone ($x.

Note for #81: you can encode parenthesis as a int field that represents how many ( ) around expr there are. It can even be a few bits from a flags field, I would say encoding up to 3 pairs (2 bits) is enough (0 - no, 1 - (), 2 - (()), 3 - ((()))).

One possible fix is to update '(' expr ')' definitions to set positions like this:

'(' expr ')'
  $$ = $2;
  $$.SetPosition(yylex.(*Parser).positionBuilder.NewTokensPosition($1, $3))

So we don't lose parentheses positions info.

diff --git a/php7/php7.y b/php7/php7.y
index 3ff11c0..89ef7b4 100644
--- a/php7/php7.y
+++ b/php7/php7.y
@@ -3856,6 +3856,9 @@ expr_without_variable:
             {
                 $$ = $2;
 
+                // save position
+                $$.SetPosition(yylex.(*Parser).positionBuilder.NewTokensPosition($1, $3))
+
                 // save comments
                 yylex.(*Parser).setFreeFloating($$, freefloating.Start, append($1.FreeFloating, append(yylex.(*Parser).GetFreeFloatingToken($1), (*$$.GetFreeFloating())[freefloating.Start]...)...))
                 yylex.(*Parser).setFreeFloating($$, freefloating.End, append((*$$.GetFreeFloating())[freefloating.End], append($3.FreeFloating, yylex.(*Parser).GetFreeFloatingToken($3)...)...))
@@ -4764,6 +4767,9 @@ dereferencable:
             {
                 $$ = $2;
 
+                // save position
+                $$.SetPosition(yylex.(*Parser).positionBuilder.NewTokensPosition($1, $3))
+
                 // save comments
                 yylex.(*Parser).setFreeFloating($$, freefloating.Start, append($1.FreeFloating, append(yylex.(*Parser).GetFreeFloatingToken($1), (*$$.GetFreeFloating())[freefloating.Start]...)...))
                 yylex.(*Parser).setFreeFloating($$, freefloating.End, append((*$$.GetFreeFloating())[freefloating.End], append($3.FreeFloating, yylex.(*Parser).GetFreeFloatingToken($3)...)...))
@@ -4789,6 +4795,9 @@ callable_expr:
             {
                 $$ = $2;
 
+                // save position
+                $$.SetPosition(yylex.(*Parser).positionBuilder.NewTokensPosition($1, $3))
+
                 // save comments
                 yylex.(*Parser).setFreeFloating($$, freefloating.Start, append($1.FreeFloating, append(yylex.(*Parser).GetFreeFloatingToken($1), (*$$.GetFreeFloating())[freefloating.Start]...)...))
                 yylex.(*Parser).setFreeFloating($$, freefloating.End, append((*$$.GetFreeFloating())[freefloating.End], append($3.FreeFloating, yylex.(*Parser).GetFreeFloatingToken($3)...)...))

But I think having explicit paren expr nodes is better even though it would hurt the performance.

Now I see that "free floating" might be the thing I'm looking for. :)

Yes, you can count ( free-floating tokens at freefloating.Start position. But be careful, I move all freefloating.Start whitespaces, comments, and tokens to the most parent node, but freefloating.End saves as is so ( and ) tokens can be coupled to different nodes.