nikic / PHP-Parser

A PHP parser written in PHP

Any requests for PHP-Parser 5.0?

nikic opened this issue · comments

I plan to release the final version of PHP-Parser 5.0 soon. Are there any changes I should consider before doing so?

Quick link to the changelog and upgrade guide for those who are also interested.

Hey @nikic,
how about a ReverseNodeTraverser? This would be useful in SAST applications when looking for sources for a specific dangerous sink (Node e.g. eval()). After finding a dangerous sink, the ReverseNodeTraverser could be used to traverse in reverse direction and find any sources by following the data flow.
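To make the idea of searching backward from a sink concrete, here is a minimal sketch. It is not an AST traversal and not part of PHP-Parser: it uses PHP's built-in `PhpToken` (PHP 8.0+) as a stand-in, and `findSinkSources()` is a hypothetical helper name. It only follows a single `$var = ...;` step, so a real data-flow analysis would need far more than this.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, token-based stand-in for a reverse AST traversal:
 * for each eval($var) sink, scan backward for the nearest preceding
 * `$var = ...;` and return the right-hand side as the candidate source.
 *
 * @return list<array{sink: string, source: ?string}>
 */
function findSinkSources(string $code): array
{
    // Drop whitespace, comments and the open tag so neighbours are adjacent.
    $tokens = array_values(array_filter(
        PhpToken::tokenize($code),
        static fn (PhpToken $t): bool => !$t->isIgnorable()
    ));

    $results = [];
    foreach ($tokens as $i => $token) {
        $isSink = $token->is(T_EVAL)
            && isset($tokens[$i + 2])
            && $tokens[$i + 1]->is('(')
            && $tokens[$i + 2]->is(T_VARIABLE);
        if (!$isSink) {
            continue;
        }

        $name   = $tokens[$i + 2]->text;
        $source = null;

        // Walk backward from the sink toward the start of the file.
        for ($j = $i - 1; $j >= 1; $j--) {
            if ($tokens[$j]->is('=')
                && $tokens[$j - 1]->is(T_VARIABLE)
                && $tokens[$j - 1]->text === $name) {
                // Collect the right-hand side up to the terminating semicolon.
                $source = '';
                for ($k = $j + 1; isset($tokens[$k]) && !$tokens[$k]->is(';'); $k++) {
                    $source .= $tokens[$k]->text;
                }
                break;
            }
        }

        $results[] = ['sink' => $name, 'source' => $source];
    }

    return $results;
}

$found = findSinkSources("<?php\n\$cmd = \$_GET['c'];\n\$x = 1;\neval(\$cmd);\n");
```

Here the unrelated assignment to `$x` is skipped and the scan stops at the assignment to `$cmd`, whose right-hand side is reported as the candidate source.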

I've only used it a little so I'm speaking from limited context, but you've said in the past that it's a good parser, but only so-so as a code generator. Maybe improving the code-generator capabilities would be an area to explore? It's already the de facto standard parser, so having it also be the de facto standard generator would be efficient.

Is there an expected release date for the final version? Or are new requests still being developed?

What I had in mind here is more along the lines of backwards-incompatible changes that need to be part of the major version and can't be done afterwards.

I've just released beta 1 (https://github.com/nikic/PHP-Parser/releases/tag/v5.0.0beta1) and will probably leave it at that in terms of significant changes.

There are a bunch of breaking changes I originally had in mind for this major version:

  • Replace position attributes with just token start/end, from which the others can be derived if tokens are available.
  • Compute comments with a separate visitor to avoid duplicate comment assignment.
  • Explicitly represent {} blocks in the AST.

But all of these come with some major complications, so I don't plan to pursue them for this version anymore.
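As an illustration of the first bullet, the following sketch shows how line and file-offset attributes could be derived from a start/end token index pair alone. It is an assumption-laden example, not PHP-Parser's implementation: it uses PHP's built-in `PhpToken` (PHP 8.0+) as a stand-in for the parser's token array, `derivePositions()` is a hypothetical helper, and the returned keys simply mirror the usual position attribute names.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: derive line and byte-offset attributes from a
 * start/end token index pair. Uses PHP's built-in PhpToken (PHP >= 8.0)
 * as a stand-in for the parser's token array.
 *
 * @param PhpToken[] $tokens
 * @return array{startLine: int, endLine: int, startFilePos: int, endFilePos: int}
 */
function derivePositions(array $tokens, int $startTokenPos, int $endTokenPos): array
{
    $first = $tokens[$startTokenPos];
    $last  = $tokens[$endTokenPos];

    return [
        'startLine'    => $first->line,
        // A token can span several lines (for example a multi-line string).
        'endLine'      => $last->line + substr_count($last->text, "\n"),
        'startFilePos' => $first->pos,
        // Inclusive byte offset of the last character of the last token.
        'endFilePos'   => $last->pos + strlen($last->text) - 1,
    ];
}

// The assignment statement spans token 1 ($foo) through token 6 (;).
$code      = "<?php\n\$foo = \"bar\nbaz\";\n";
$tokens    = PhpToken::tokenize($code);
$positions = derivePositions($tokens, 1, 6);
```

The trade-off the bullet hints at is visible here: the derivation only works while the token array is still available, which is not the case for every consumer of the AST.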

What about this: #762

I think it's solvable in userland with a custom visitor with access to tokens, so for sure it must be solvable inside PHP-Parser too?

The technical capability for that exists now that the parser has access to tokens (previously it only had access to one lookahead token and comment assignment happened in the lexer).

However, I think that just adding comments between attributes and class declaration to the comments is going to break the formatting-preserving pretty printer, as the nodes will be out of order. I'm not sure how to solve this.

One feature that would be nice is the ability to see the original rawValue in encapsed strings.

The rawValue is available in double-quoted scalar strings, and I make use of it in PHP-Styler here to only escape those escape-characters in the rawValue; doing so leaves newlines-as-entered in place, but escapes \n characters if they are present. This reduces surprises when the rawValue is something like ...

$foo = "bar
baz
dib";

... so that it does not print as $foo = "bar\nbaz\ndib";.

However, for encapsed lists, the rawValue is not available, so PHP-Styler can't look for those strings. Having the rawValue available would help with that.
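The difference between the raw and the parsed form can be sketched without PHP-Parser. The example below is only an illustration under stated assumptions: the text of a built-in `PhpToken` (PHP 8.0+) stands in for the `rawValue` described above, and the variable names are made up for the demonstration.

```php
<?php
declare(strict_types=1);

// The source contains a double-quoted string with real line breaks.
$code = "<?php\n\$foo = \"bar\nbaz\ndib\";\n";

// Raw form: the token text is exactly what the author typed, quotes included.
$raw = null;
foreach (PhpToken::tokenize($code) as $token) {
    if ($token->is(T_CONSTANT_ENCAPSED_STRING)) {
        $raw = $token->text;
    }
}

// Parsed value: the three lines joined by newline characters.
$value = "bar\nbaz\ndib";

// Printing from the parsed value forces a choice; escaping every newline
// collapses the literal onto one line.
$fromValue = '"' . str_replace("\n", '\n', $value) . '"';
```

With only the parsed value, a printer cannot tell whether the author typed a line break or a `\n` escape; the raw text keeps that distinction.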

Hope that made sense, and I can open a separate issue for it if you like.

Thank you for PHP-Parser, it has been a great aid.

p.s. It looks like #837 may address this.

Do you have an anticipated release date?

I've updated PsySH to support the latest 5.x beta, but if the release is still a ways out or there may be additional breaking changes, I'll leave it out of composer.json until 5.0 is out.

I am a little bit worried about the implications of a new major version for the entire ecosystem. This is a very widely used library, and most of the ecosystem could get stuck in the old PHP-Parser v4 world for a long time, similarly to Python 2/3 and IPv4/IPv6.

Here's an example: sebastianbergmann/php-code-coverage#1004 (comment)

@ondrejmirtes Yeah, I do hope that a handful of core ecosystem dependencies like PHPUnit can support version 4.x and 5.x at the same time. I think that for "simple" users of PHP-Parser this should be possible with little effort. In the last 4.x version I pushed some forward-compatibility APIs to make it easier to support both.

@bobthecow I don't plan any more code changes, just need to rewrite the UPGRADE file. So hopefully soon(TM).

Thanks @nikic!

FWIW I was able to make PsySH support 4.x and 5.x just fine 🙂

I've tagged an rc1 release: https://github.com/nikic/PHP-Parser/releases/tag/v5.0.0rc1

The final release will hopefully be that plus or minus some docs changes.

@ondrejmirtes Yeah, I do hope that a handful of core ecosystem dependencies like PHPUnit can support version 4.x and 5.x at the same time.

This is already possible.