phpv8 / v8js

V8 Javascript Engine for PHP — This PHP extension embeds the Google V8 Javascript Engine

Home Page: http://pecl.php.net/package/v8js

Compromise between FORCE_ARRAY and standard setup

redbullmarky opened this issue · comments

commented

Hey!
I wonder if there is a middle ground that could be added between the standard setup and the setup with FORCE_ARRAY?
We use FORCE_ARRAY mostly for legacy reasons (many years ago, it was easy to segfault standard mode from user code, whereas FORCE_ARRAY was mostly protected), but also because 99% of the time when passing back objects, associative arrays are what we want in PHP.

However, the benefit of standard mode is the use of 'this'. A basic example of how we sort of implement things:

// the main runner...
$v8js = new V8Js('cs');
$userCode = ...get user code...;
$result = $v8js->executeString('
   function userCode() {
      ' . $userCode . '
   }
   result = userCode();
', null, V8Js::FLAG_FORCE_ARRAY);

and some example 'userCode':

return {
   main: function (param1, param2) {
   },

   somefunc: function (param1) {
   }
}

This means the user can write 'hooks' that are called on the $result, e.g. $result['main']($someval, $otherVal); at various points. This works well, but as userCode gets more complex, you often want to use this to organise things better:

return {
   name: "foo",

   main: function (param1, param2) {
      return this.somefunc(param1) + param2;
   },

   somefunc: function (param1) {
      return this.name;
   },

   another: function () {
      return {
         "foo": "alice",
         "bar": "bob"
      };
   }
}

or something like that. The problem though, is FORCE_ARRAY returns, well, an array and not a V8Object, so the concept of this doesn't exist.
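
For illustration only, the plain-JavaScript analogue of that problem is calling a method without its receiver. The sketch below assumes nothing about how V8Js actually invokes functions held in a FORCE_ARRAY result; the hook object is a cut-down copy of the example above, and the variable names are made up for the demonstration.

```javascript
// Hypothetical hook object, modelled on the user code above.
function userCode() {
  return {
    name: 'foo',
    main: function (param1, param2) {
      return this.somefunc(param1) + param2;
    },
    somefunc: function (param1) {
      return this.name;
    },
  };
}

const hooks = userCode();

// Called as a method: `this` is the hook object, so the lookup works.
const asMethod = hooks.main('x', '!');

// Called detached from the object (roughly what a flat list of
// functions gives you): `this` no longer points at the hooks.
let detachedError = null;
try {
  const main = hooks.main;
  main('x', '!');
} catch (e) {
  detachedError = e;
}

// Supplying the receiver explicitly restores the behaviour.
const rebound = hooks.main.call(hooks, 'x', '!');

console.log(asMethod, detachedError instanceof TypeError, rebound);
```

The detached call fails because `this.somefunc` cannot be resolved once the function is separated from the object it was defined on, which is the same reason an array of hooks has no usable `this`.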

On the flipside, if we use it as standard without FORCE_ARRAY, then (using the last example):

function someThingInPHP(array $array) {
}

someThingInPHP($result->another());

This will result in a fatal error because $result->another() returns a V8Object, not an array, so the array typehint fails.

So I guess I'm wondering if there is a halfway house, where BOTH could be used somewhat. MAYBE something like:

  1. Add an extra option, such as AUTO_CONVERT or whatever (doesn't matter what it's called at this point)
  2. This will essentially work 99% like FORCE_ARRAY entirely, with some exceptions:
  3. Date objects and some of the other clever things, those will remain - so anything not a V8Function or V8Object will remain as they are. PHP will get the object.
  4. Then - either a class can be instantiated in JS (and as its constructor.name is not V8Object, it will be left alone) - or an extra feature/special wrapper object can be utilised to wrap a JS object to 'protect' it from conversion to an array.
  5. Otherwise, any other associative array would be converted to a PHP associative array.
  6. This would apply to both assoc arrays returned from JS, as well as those passed in to extra custom functions attached to the V8Js instance (i.e. $v8js->someFunc = function (array $settings) { };)

TLDR: Introduce an option almost entirely like FORCE_ARRAY, but with a handful of exceptions and the ability to skip conversion at runtime for certain objects.
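
As a rough sketch of the rule proposed above, written in JavaScript purely to make the cases concrete: `ProtectedObject` and `classify` are hypothetical names, and a real implementation would live in the extension's conversion code rather than in user-land JS.

```javascript
// Hypothetical wrapper that marks an object as "do not convert".
class ProtectedObject {
  constructor(target) {
    this.target = target;
  }
}

// Returns 'array' where the proposed mode would hand PHP an array,
// 'object' where the value would stay an object or function,
// and 'scalar' for primitives, which are unaffected either way.
function classify(value) {
  const type = typeof value;
  if (value === null || (type !== 'object' && type !== 'function')) {
    return 'scalar';
  }
  if (type === 'function') {
    return 'object'; // functions stay callable objects
  }
  if (value instanceof ProtectedObject) {
    return 'object'; // explicitly protected at runtime
  }
  if (Array.isArray(value)) {
    return 'array';
  }
  const proto = Object.getPrototypeOf(value);
  if (proto === Object.prototype || proto === null) {
    return 'array'; // plain "associative array" style object
  }
  return 'object'; // Date, class instances, and similar
}

console.log(classify({ foo: 'bar' }), classify(new Date()));
```

Checking the prototype rather than `constructor.name` is a design choice of this sketch: a name comparison can be fooled by a user class that happens to be called `Object`.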

Any thoughts? @stesie

Hej Mark,
maybe I haven't understood it fully yet, hence some further enquiry first ... I get that you prefer array access over object, ... but, what's wrong with explicitly casting to array in that case?

I.e. in your example someThingInPHP( (array) $result->another() );

I like that it makes explicit what's going on and doesn't involve any magic. But YMMV.
Also it should be pretty easy for static type analysis to find code locations where you've type-hinted functions (like someThingInPHP to take an array) but have forgotten the explicit cast.

So to put it differently, is it just "preference" that you'd like to get rid of the explicit cast (by means of convention) or do I miss something?

commented

Perhaps there is a preference of sorts, but there are thousands of typehinted functions that would need to be un-typehinted and casting added. I always prefer (where possible) PHP typehints, though it's heavily documented too so static analysis isn't an issue.

Partly also, I've simplified things greatly :-p The use cases and requirements from the 'userCode' I speak of above are massively varied. I'd like to migrate to using the standard approach without FORCE_ARRAY.

But mostly, it feels like that given PHP is the 'processor' of logic for the most part, the data is likely wanted in regular PHP forms, rather than coming back in a different form to how it was passed. The rest is just kind of an extension on that, really :)

I guess it depends on the complexity to implement. FORCE_ARRAY is used twice, from what I can see - once to declare it as a flag, the other to check if enabled/disabled.

I think - like #454 - it's sort of just trying to deal with the preservation of things as they go back and forth between PHP and JS.


Having said that, it almost feels like a PHP thing. With an object having a __toString(), you can:

function testString(string $string) { return $string; }

$cast = (string)$myObject; // works
$cast = testString($myObject); // works
// the latter doesn't fatal because of the typehint, either, because __toString() kicks in!

but with an array object:

function testArray(array $array) { return $array; }

$myObject = new ArrayObject(['foo' => 'bar']);
$cast = (array)$myObject; // works
$cast = testArray($myObject); // fatal because of typehint

> but there are thousands of typehinted functions that would need to be un-typehinted and casting added

I don't quite get that. You need not do both. More like just one or the other. If you have the explicit type cast you can perfectly pass the (then) array to the function with an array type parameter.

And regarding the __toString comparison, well, PHP just has no implicit to-array conversion ¯\_(ツ)_/¯

... and php-v8js cannot do much about that. Regarding your example

function someThingInPHP(array $array) {
}

someThingInPHP($result->another());

... this extension's involvement ends when the another() call returns, passing the resulting zval to PHP, which will then invoke someThingInPHP. The extension cannot infer that the resulting value needs to be an array. It can only be configured (via flags) to either convert to an array or pass the object.


So to summarize, you initially invoke executeString to evaluate a user script, which returns an object with hooks that your application is going to call afterwards. And you'd want to not set the FORCE_ARRAY flag in order to retain the JS object (for "this" access to the object itself) + you'd like all hook invocations to "silently" cast to array type (at least if \V8Object instances are returned, and possibly with one or two exceptions to the rule).

What about wrapping the hook object on PHP side? Hence leaving it to the facade to "guess" if it should be cast to an array?

class FancyHeuristicsCastingWrapper {
	private $wrapped;

	function __construct($wrapped) {
		$this->wrapped = $wrapped;
	}

	function __call(string $name, array $args) {
		$result = $this->wrapped->$name(...$args);

		if ($result instanceof \V8Object) {
			return (array) $result; // plain V8Object, convert to array
		}

		return $result; // retain date object
	}
}

// the main runner...
$v8js = new V8Js('cs');
$userCode = getUserCode();
$result = new FancyHeuristicsCastingWrapper($v8js->executeString('
   function userCode() {
      ' . $userCode . '
   }
   result = userCode();
'));


// hook result processor
function someThingInPHP(array $array) {
}

// invoke hook
$tmp = $result->another();  // variable could be inlined, just for verbosity

// if another returned a plain JS object, the wrapper will have cast this to array
var_dump( is_array($tmp) );  // true

someThingInPHP( $tmp ); // passes, since matches array typehint
commented

Ok, so consider this:

class V8JsWrapper extends V8Js
{
   public function __construct()
   {
      parent::__construct('cs');
   }

   public function getArray(): array
   {
      return ['foo', 'bar'];
   }

   public function explicitFunction(array $params)
   {
   }

   public function _wrappedFunction(array $params)
   {
   }

   public function __call(string $method, array $args)
   {
      if (method_exists($this, '_' . $method)) {
         // convert inner $args to array and call the method
      }
   }
}

$v = new V8JsWrapper();
$result = $v->executeString('
   const myArray = cs.getArray();

   cs.wrappedFunction(myArray); // this is ok
   cs.explicitFunction(myArray); // this is not ok - Fatal error

   ret = myArray;
');

// I can also work with and convert $result if I needed to, before passing it elsewhere. Not one of our use-cases, but just pointing it out for completeness.

All of our functions are currently set up like the explicitFunction() above - there is nowhere to apply any sort of conversion code, as the call from JS code directly maps to the PHP function.

So my options, if I'm to move to removing FORCE_ARRAY, are below. I'll call the option FORCE_ARRAY_AUTO just so there's no confusion that I want to entirely change how FORCE_ARRAY works:

  1. Rename/prefix all of the explicit functions and proxy them via a __call()
  2. Remove the explicit typehints from all of the explicit functions (so PHP won't fatal), and convert them on the first line of the function so the remaining code will work as normal.
  3. Implement a facility that can be used when needed whilst FORCE_ARRAY_AUTO is in operation so I can tell V8Js NOT to convert it to an array. And then just apply an extra condition to this line: if ((flags & V8JS_FLAG_FORCE_ARRAY && !jsValue->IsFunction()) || jsValue->IsArray()) {.
// just an example
var thing = { "foo": "bar" }

cs.doSomething(thing); // would send an array to PHP land. currently V8Object

thing = new V8ProtectThing(thing);
cs.doSomethingElse(thing); // would send a V8Object or whatever.

Apologies if it's not making sense, maybe I just over-simplified things :/

> there is nowhere to apply any sort of conversion code, as the call from JS code directly maps to the PHP function

well, after all this is due to tight coupling introduced by extending directly from V8Js class.

Generally I'd prefer to use composition over inheritance. If your code would allow for composition you could easily add and remove conversion layers between your class and V8Js class.

That is, instead of extending the V8Js class, pass a V8Js instance to the constructor and let it register, ad hoc, the functions that should be available to JS userscript code. For customized auto conversion you could then just add a converter instance between the two.


However given your codebase, I'd propose to change inheritance from V8Js to a customized kind of shim/proxy class you provide. This class can instantiate V8Js sandbox + export all the public functions from the derived classes + take care of customized on-the-fly conversions as needed.

That way you only have to change inheritance once, everything else can remain the same.

This is what I think of:

class V8JsShim {
        private $v8;

        public function __construct() {
                $this->v8 = new V8Js();

                $ref = new ReflectionClass( $this );
                foreach( $ref->getMethods( ReflectionMethod::IS_PUBLIC ) as $method ) {
                        if ($method->name === '__construct') {
                                continue;
                        }

                        $this->v8->{$method->name} = function( ...$args ) use ($method) {
                                // you could go fully fancy here and even inspect the typehint of
                                // the called method (and convert only if it has 'array' typehint)
                                if ( count($args) === 1 && $args[0] instanceof \V8Object ) {
                                        $args[0] = (array) $args[0];
                                }

                                return $this->{$method->name}( ...$args );
                        };
                }
        }

        // forward executeString etc.
        public function __call(string $method, $args) {
                return $this->v8->$method( ...$args );
        }
}

class Blarg extends V8JsShim {
        public function moep( array $data ) { // array typehint here
                var_dump($data);
        }
}

$blarg = new Blarg();
$blarg->executeString('
                print("hello there :)\\n");
                PHP.moep( { blarg: 42 } ); // without auto conversion it would fail due to the typehint
        ');

V8JsShim::__construct first creates a V8Js instance (like you used to do via inheritance). Then it uses ReflectionClass to fetch a list of public methods of its $this instance. For each method it creates a closure that invokes the wrapped method, possibly casting a single \V8Object argument to array before invocation. The closure is exported to V8Js userscript code.

commented

I'd accept that as workable.

I actually experimented a bit by seeing if I could return a PHP Callable instance from V8Js in place of a V8Function (so at least it can be bound via the whole Closure::bind() ) but then realised I was a little out of my depth and that it might not work anyway :)

I still think having some kind of control (within the extension itself) over the conversion (rather than just FORCE_ARRAY or not for the entire execution) would be a good option to have at one's disposal, but will close this one out for now :)

Thanks for the suggestions, as always!

IMHO everything that can be done in PHP itself should be done in PHP and not be pushed into an extension. The latter is way more inflexible, and opaque to people not knowing the C++ side.

That said I don't even think that FORCE_ARRAY is a great feature to have. After all it's pretty straightforward to do the same completely in PHP code. But no worries, I'm not going to remove it 🙂
Compare it to e.g. simplexml_load_file, which also returns an object ... and also there many people immediately cast the result to array.

commented

> Compare it to e.g. simplexml_load_file, which also returns an object ... and also there many people immediately cast the result to array.

Quite true, as does json_decode() - yet most cases I see of that function pop a true in as its second argument to get an array back :) I guess it’s all based on preference and habits.

For me, the only time I want to use objects in PHP is if there are methods, getters, setters, etc. If it’s just representing structures of non-contextual data, an array is always my preference.