tc39 / ecma262

Status, process, and documents for ECMA-262

Home Page:https://tc39.es/ecma262/

Proxy [[Enumerate]] overconstrains implementations

rossberg opened this issue · comments

When for-in is invoked directly on a proxy, then it simply executes the iterator returned by the proxy handler. That prescribes a much more rigid iteration sequence than for regular objects that breaks existing implementations. In particular, implementations often want to compute the list of keys before the iteration starts, which would currently be forbidden due to observable calls to iterator.next.

This is inconsistent not just with regular objects, but also with enumeration over regular objects that have a proxy as their prototype -- because in that case the object's [[Enumerate]] still has the liberty to invoke the prototype's [[Enumerate]] whenever.

To fix this, we probably need to loosen the spec for proxy [[Enumerate]], such that it does not prescribe that the original iterator is invoked in lock-step.
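The difference between lock-step iteration and precomputation can be sketched in plain JavaScript. This is only an illustration: `makeLoggingKeyIterator` is a hypothetical stand-in for the iterator a proxy's enumerate trap would return, and ordinary `for-of` loops play the role of the for-in machinery.

```javascript
// Hypothetical stand-in for the iterator a proxy's enumerate trap would return.
// Every next() call is recorded so that its timing is observable.
function makeLoggingKeyIterator(keys, log) {
  let index = 0;
  return {
    [Symbol.iterator]() { return this; },
    next() {
      log.push(`next#${index}`);
      return index < keys.length
        ? { value: keys[index++], done: false }
        : { value: undefined, done: true };
    },
  };
}

// Lock-step: next() is called once per loop iteration, interleaved with the body.
const lockStepLog = [];
for (const key of makeLoggingKeyIterator(["a", "b"], lockStepLog)) {
  lockStepLog.push(`body:${key}`);
}

// Precomputed: the iterator is drained before the first body execution.
const precomputedLog = [];
const precomputedKeys = [...makeLoggingKeyIterator(["a", "b"], precomputedLog)];
for (const key of precomputedKeys) {
  precomputedLog.push(`body:${key}`);
}

console.log(lockStepLog.join(" "));    // next#0 body:a next#1 body:b next#2
console.log(precomputedLog.join(" ")); // next#0 next#1 next#2 body:a body:b
```

The two logs contain the same entries in a different order, which is exactly the difference user code could observe through a proxy.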

See also #160.

@rossberg-chromium
I think this is already allowed by the spec. Note that Ordinary Object [[Enumerate]] is not specified via an actual algorithm. It just lists some requirements. The timing and ordering of accesses along the prototype chain are not included in those requirements. Essentially, Ordinary Object [[Enumerate]] makes no guarantees related to the ordering of next calls to the Iterators returned from objects on the [[Prototype]] chain.

So, an implementation of Ordinary Object [[Enumerate]] is still free to precompute the property enumeration list (either within its implementation of Ordinary Object [[Enumerate]] or its implementation of for-in).

However, there is still one case where an observable ordering is currently specified. That would be for a Proxy that has an explicit enumerate trap handler which does not invoke [[Enumerate]] on an ordinary object. I think we could tweak ForIn/OfHeadEvaluation to also allow for an implementation defined ordering in this case.

Here is my suggestion for that fix:
replace step 7.b, which currently reads:
c. Return obj.[[Enumerate]]().
with:
c. Return either obj.[[Enumerate]]() or an implementation defined Iterator that is derived from the value of obj.[[Enumerate]]().

Of course, if we ever agreed to specify a normative algorithm for Ordinary Object [[Enumerate]] we would face this problem for real. But, that prospect still seems unlikely for the same legacy reasons that motivated raising this issue.

@allenwb, yes, I was referring to the Proxy [[Enumerate]] being invoked directly. I agree that there is no problem for the other cases (including the one where a proxy just occurs as a prototype).

As for resolving this, wouldn't it make more sense to loosen the specification of Proxy [[Enumerate]] itself, roughly in the way you describe? That would seem more in line with the underspecification of Ordinary Object [[Enumerate]], and would equally keep the hack out of for-in as such.

@rossberg-chromium
Beyond the issue of simply trying to minimize the transit overhead of Proxy traps, there is an observability issue with doing this within Proxy [[Enumerate]]. If somebody directly invokes Reflect.enumerate on a proxy, the resulting Iterator is observable. If Proxy [[Enumerate]] added an extra layer of Iterator wrapping, then it would be observable that Proxy [[Enumerate]] returns a different object (whose characteristics would have to be specified) than the value produced by the user-provided enumerate trap handler.
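A minimal sketch of that observability concern, using only plain iterators. `trapResult` and `wrapEagerly` are hypothetical names: the first stands for the object a user's enumerate trap returns, the second for a wrapper an implementation might add.

```javascript
// Hypothetical iterator object that a user-provided enumerate trap might return.
const trapResult = ["a", "b"][Symbol.iterator]();

// Hypothetical wrapper an implementation could add in order to precompute keys.
function wrapEagerly(iterator) {
  const keys = [...iterator]; // drains the original iterator immediately
  return keys[Symbol.iterator]();
}

const wrapped = wrapEagerly(trapResult);

// The wrapper yields the same keys, but it is a different object, and the
// original iterator is already exhausted. Both facts are visible to user code.
const wrappedKeys = [...wrapped];
const isSameObject = wrapped === trapResult;
const originalExhausted = trapResult.next().done;

console.log(wrappedKeys, isSameObject, originalExhausted); // [ 'a', 'b' ] false true
```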

@allenwb, indeed, but my thinking is that [[Enumerate]] is (as opposed to [[OwnPropertyKeys]]) mostly a legacy mechanism to support for-in, and there probably isn't much point in making it more specified in special cases when we can't help the common case anyway. So why not leave it underspecified for direct invocations on proxies as well?

So why not leave it underspecified for direct invocations on proxies as well?

But why? It seems like limiting unspecified behavior as much as possible is ideal. Hard to say what uses [[Enumerate]] might find in the future. Seems like unspecified behavior should have a really strong justification. I buy it for for-of enumeration (and am OK with Allen's proposed fix) but is there a strong reason for making direct invocations of the proxy implementation defined?

By that argument, why was the undefinedness of for-in for ordinary objects moved from for-in into the [[Enumerate]] internal method in the first place? [[Enumerate]] already is completely poisoned, and I have a hard time imagining a use case where deterministic behaviour on proxies only would help.

As for why:

  1. Consistency wrt spec factoring.
  2. Maintain consistency between the observable behaviour of Reflect.enumerate and for-in.
  3. The ability to self-host Reflect.enumerate in JavaScript, using a simple for-in loop.

Another way to put it is that for (x in o) should (continue to) give the same as for (x of Reflect.enumerate(o)), which probably is more useful as an invariant than one defined edge case for the latter.
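The self-hosting point can be sketched as follows. `enumerateViaForIn` is a hypothetical helper, not a built-in; the sketch deliberately does not call `Reflect.enumerate`, which current engines no longer provide.

```javascript
// Hypothetical self-hosted stand-in for Reflect.enumerate: a generator that
// yields whatever a plain for-in loop over the object visits.
function* enumerateViaForIn(obj) {
  for (const key in obj) {
    yield key;
  }
}

const proto = { inherited: 1 };
const obj = Object.create(proto);
obj.own = 2;

const viaForIn = [];
for (const key in obj) viaForIn.push(key);

const viaForOf = [];
for (const key of enumerateViaForIn(obj)) viaForOf.push(key);

// Both loops visit the same keys in the same order.
console.log(viaForIn, viaForOf);
```

With such a definition the invariant holds by construction, which is only possible if the enumeration helper is no more tightly specified than for-in itself.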

I buy those points @rossberg-chromium. I don't know why [[Enumerate]] got the nondeterministic behavior (though I would be surprised if it was strongly motivated).

Should probably run this by folks next week. Want to get your delegation on it or shall I bring this point up?

Thanks. Yes, please feel free to run it, Adam & Dan will be there as back-up if needed.

Committee was unwilling to have an underspecified semantics here. However, it makes sense that the properties would be enumerated before entering the loop body. I propose that step 6.c of the loop head evaluation be updated to effectively spread the iterator into an array. @rossberg-chromium thoughts?

Sounds okay. The string check/conversion (as of #160) should happen at this point as well.

Will this require additional property checks in ForIn/OfBodyEvaluation to conform to the "A property that is deleted before it is processed by the iterator’s next method is ignored." rule from 9.1.11 [[Enumerate]]?

Doesn't it make more sense to do the spreading in the Proxy [[Enumerate]] method (9.5.11)? Otherwise the semantics of for-in would be inconsistent with that of Reflect.enumerate.

Already spreading the iterator in Proxy [[Enumerate]] defeats the reason for using an iterator, namely to avoid enumerating all property keys right away.

But the suggested proposal already requires the a priori exhaustion of the iterator! The one exception (for which I don't see the justification) is the rare case where Reflect.enumerate is called manually (not as part of an enumerate trap triggered by a for-in).

Ok, let me rephrase: Spreading the iterator defeats the reason for using an iterator. [[Enumerate]] should either use an iterator + lazy property key computation or alternatively an array/list + eager property key computation, but not an iterator in conjunction with eager property key computation.

Hi, I'm wondering what the status of this proposal is.

Thanks.

I think this is now fixed. Thanks @GeorgNeis!

Committee was unwilling to have an underspecified semantics here. However, it makes sense that the properties would be enumerated before entering the loop body. I propose that step 6.c of the loop head evaluation be updated to effectively spread the iterator into an array.

Historically, not all browsers precomputed the list of properties before entering a for-in loop. Some (perhaps even most or all) at one time incrementally computed the next key on each loop iteration. That per-iteration computation is the moral equivalent of doing a next on an Iterator that lazily gets the next property key.

The reason for the loosey-goosey spec language regarding property enumeration in all versions of the spec up through ES6 was to impose some requirements (some implementations, e.g. IE, did not originally conform to all of the current requirements) while accommodating other actual implementation variations.

If all browsers now precompute the key list (but is that actually the case??) or if TC39 is now willing to mandate pre-computation of the for-in key list, then it seems like we have an easy way forward WRT for-in:

  1. Leave the [[Enumerate]] contract, ordinary object [[Enumerate]], and Proxy [[Enumerate]] as originally specified in ES6 (modulo any unrelated bug fixes that have been identified)
    1. Optionally, we could remove the loosey-goosey language in ordinary object [[Enumerate]] and make the pseudo-code equivalent of the informative definition normative. If we really want deterministic implementation consistent for-in behavior going forward we should do this.
  2. Replace step 6.c of 13.7.5.12 with steps that drain the result of obj.[[Enumerate]]() into a List, ensure that each List item is a string value, and finally return a ListIterator over that List.

This will:

  1. Preserve the MOP/language stratification feature by making for-in specific requirements part of the for-in specification rather than polluting the MOP specification/implementation with them.
    1. The (new) requirement that for-in precomputes its key set is part of the for-in specification
    2. Ensuring that for-in only produces string values becomes part of the for-in spec
  2. [[Enumerate]] continues to actually produce an iterator
    1. If somebody wants to enumerate over property keys in a lazy manner they can choose to code it using for-of rather than for-in:
   for (const key of Reflect.enumerate(obj)) {
       if (typeof key !== "string") continue;
       ...
   }
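The proposed replacement for step 6.c could look roughly like this when written as ordinary JavaScript. `drainToKeyListIterator` and `fakeEnumerate` are hypothetical names, and rejecting non-string keys with a TypeError is an assumption; the thread does not settle whether such keys should be rejected, converted, or skipped.

```javascript
// Hypothetical helper mirroring the proposed for-in head steps: drain the
// iterator into a list, check each item, and return an iterator over the list.
// Assumption: a non-string key is reported with a TypeError.
function drainToKeyListIterator(iterator) {
  const list = [];
  for (const key of iterator) {
    if (typeof key !== "string") {
      throw new TypeError("enumerate iterator produced a non-string key");
    }
    list.push(key);
  }
  return list[Symbol.iterator]();
}

// Stand-in for the result of obj.[[Enumerate]](): a generator that records
// when each key is produced.
const produced = [];
function* fakeEnumerate() {
  for (const key of ["x", "y"]) {
    produced.push(key);
    yield key;
  }
}

const keyIterator = drainToKeyListIterator(fakeEnumerate());
const producedBeforeLoop = produced.slice(); // all keys were already produced

const visited = [];
for (const key of keyIterator) visited.push(key);

console.log(producedBeforeLoop, visited); // [ 'x', 'y' ] [ 'x', 'y' ]
```

The key computation happens entirely inside the helper, so the loop body can no longer interleave with calls to the original iterator.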

Recall that browsers do not enumerate keys that are deleted after entering the loop.

So, if we precompute the for/in key list, we should probably add a HasProperty check at each iteration to see if the key hasn't been deleted.
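A small sketch of that behaviour and of the suggested check. The first loop is a native for-in; `precomputedKeysWithCheck` is a hypothetical helper that snapshots the keys and then re-checks each one with the `in` operator (which performs HasProperty); the last loop shows a snapshot without the check.

```javascript
// Native for-in: a key deleted before it is reached is not visited.
const target = { a: 1, b: 2, c: 3 };
const nativeVisited = [];
for (const key in target) {
  nativeVisited.push(key);
  delete target.c;
}

// Hypothetical precomputing loop head: snapshot the keys first, then re-check
// each key before handing it to the loop body.
function* precomputedKeysWithCheck(obj) {
  const snapshot = [];
  for (const key in obj) snapshot.push(key);
  for (const key of snapshot) {
    if (key in obj) yield key;
  }
}

const other = { a: 1, b: 2, c: 3 };
const checkedVisited = [];
for (const key of precomputedKeysWithCheck(other)) {
  checkedVisited.push(key);
  delete other.c;
}

// Without the check, the snapshot still hands out the deleted key.
const third = { a: 1, b: 2, c: 3 };
const uncheckedVisited = [];
for (const key of Object.keys(third)) {
  uncheckedVisited.push(key);
  delete third.c;
}

console.log(nativeVisited);    // [ 'a', 'b' ]
console.log(checkedVisited);   // [ 'a', 'b' ]
console.log(uncheckedVisited); // [ 'a', 'b', 'c' ]
```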

@claudepache It seems that this requirement derives from the fact that browsers historically did not precompute and hence would not see properties that were deleted before they were visited. If it is still a requirement for pre-computed for-in, then a HasProperty check would have to be included in the for-in specific list iterator.

It seems very unlikely that this requirement (not visiting deleted keys if they haven't been visited yet) could be relaxed. It is currently implemented uniformly across all implementations afaict.

@bterlson then it should be considered part of the for-in (rather than [[Enumerate]]) semantics and could be specified as part of the for-in specific list iterator.

@rossberg-chromium can you comment on what you think is the best fix for this? The original fix seems really bad because it doesn't allow the web-reality requirement that keys deleted during for-in enumeration are not visited. Further it is not possible to use a proxy to transparently log and forward operations as doing so will change the semantics of delete during enumeration since keys are pre-loaded. Do you like the additional normative requirement of a HasOwnProperty check in each iteration of for-in? I don't like this as it just adds more overhead to proxies and is a special case that would happen only with proxies which seems surprising.

I wanted to make a generator to iterate over paginated remote collections (http://example.com/things/1 up to wherever it, say, returns a 404, or alternatively non-predictable URLs taken from 'next' buttons), and expose this generator through a Proxy's enumerate trap for iteration.

In this type of use-case it does not seem elegant to pre-compute keys, as they might potentially only gradually become known as you go (one per fetch, which for the sake of politeness you wouldn't want to rush through just to tell JS the list of keys).
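A rough sketch of that use case with a plain generator. `fetchThing` is a hypothetical, synchronous stand-in for a remote request (a real one would be asynchronous), and the three-page limit is invented for the example.

```javascript
// Hypothetical stand-in for a remote fetch; pages 1..3 exist, and anything
// beyond that plays the role of a 404.
let fetchCount = 0;
function fetchThing(pageNumber) {
  fetchCount += 1;
  return pageNumber <= 3 ? { id: pageNumber } : null;
}

// Lazy generator: each page is requested only when the consumer asks for it.
function* things() {
  for (let page = 1; ; page += 1) {
    const thing = fetchThing(page);
    if (thing === null) return; // "404": end of the collection
    yield thing;
  }
}

const iterator = things();
const first = iterator.next().value;
const fetchesAfterFirst = fetchCount; // only one page requested so far

const rest = [...iterator];
console.log(first.id, fetchesAfterFirst, rest.map((t) => t.id), fetchCount);
// 1 1 [ 2, 3 ] 4
```

Precomputing the key list would force all four requests before the first loop iteration, which is the cost being described here.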

In the Reflector thread I found dismissal of the notion of iterating over infinite collections, but in my use-case the problem isn't so much about a collection being infinite, but rather whether its size and keys are even known before-hand. I didn't quite catch all the reasons keys would have required eager evaluation here, but I'm sad to find [[Enumerate]] deprecated at this point.

Disclaimer: I'm just a JS user, not a committee member. I might not belong here.

@tycho01 [[Enumerate]] is all about listing the keys an object has at a given time, as far as I can see. You may be interested in the Asynchronous Iterators proposal.

@UltCombo: thank you, let me check that out. I fear I misinterpreted handler.enumerate as trapping for-of statements (iteration) rather than for-in (key enumeration). It does raise questions for me whether an iteration trap might be a viable addition, but I suppose that would fall outside the scope of this thread.

@tycho01 Yep, obj.[[Enumerate]] is called by for-in, for-of calls obj[Symbol.iterator]().
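The distinction can be seen on a single object; `collection` is just an illustrative name.

```javascript
const collection = {
  first: 1,
  second: 2,
  // for-of asks for this method; for-in never calls it.
  *[Symbol.iterator]() {
    yield "item-1";
    yield "item-2";
  },
};

// for-in enumerates string-keyed enumerable properties.
const forInResult = [];
for (const key in collection) forInResult.push(key);

// for-of consumes the iterator returned by collection[Symbol.iterator]().
const forOfResult = [];
for (const item of collection) forOfResult.push(item);

console.log(forInResult); // [ 'first', 'second' ]
console.log(forOfResult); // [ 'item-1', 'item-2' ]
```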