meteor / proposal-referential-destructuring

ECMAScript proposal to allow destructured variables that refer to object properties

ECMAScript Proposal: referential destructuring

Stage: 0, looking to advance to 1

  • Stage 1 criteria:

    • Identified “champion” who will advance the addition
    • Prose outlining the problem or need and the general shape of a solution
    • Illustrative examples of usage
    • High-level syntax
    • Discussion of key algorithms, abstractions and semantics
    • Identification of potential “cross-cutting” concerns and implementation challenges/complexity

    While I have attempted to meet each of these criteria, I have certainly not identified/addressed all possible concerns. I'm mostly looking for acceptance of the goals of this proposal, and/or guidance about how to proceed.

Author: Ben Newman

Reviewers: TBD

Specification: TBD

AST: TBD

Transpiler: TBD

Overview

Currently, when an object is destructured by assignment to a left-hand-side ObjectBindingPattern, the bound identifiers capture "snapshots" of the state of the object at the moment of destructuring:

let obj = { a: 1, b: 2 };
let { a, b: c } = obj;
console.log(c); // 2
obj.b += 10; // no effect on c
console.log(c); // still 2, not 12
c += 1; // no effect on obj.b

This value-snapshotting behavior often matches the desires of the programmer, but not always. Sometimes, one would like for a bound identifier to remain a shorthand for the current value of a property in the destructured object, rather than a copy of some previous value.

In terms of the example above, this proposal introduces new syntax that would allow c to continue to refer to the .b property of the object that was originally destructured.

Among its other benefits, this syntax should improve the usability of dynamic import() in the way it handles live bindings, making alternate proposals like my nested import declarations proposal unnecessary.

At this stage, we are not committed to any specific syntax. For lack of a better color to paint the shed, I will adopt the ampersand (&) reference notation used by other languages (such as C++), because it seems not to collide with existing syntax.

Examples

In its most basic form, when the & token prefixes a key in an object pattern, it indicates that the key should be bound as a reference to that property in the parent object:

let obj = { a: 1, b: 2 };
let { &a, &b: c } = obj;
console.log(a); // 1, same as console.log(obj.a)
console.log(c); // 2, same as console.log(obj.b)
obj.b += 10;
console.log(c); // 12

Although the parent object in this case happens to be called obj, of course the object need not have a name:

let { &a, &b: c } = getObject();
console.log(a);
console.log(c);

Naturally, a and c should be interpreted as references to properties of the unnamed object returned by the getObject() call expression. In other words, a reasonable transpilation of this example might look like

const _obj$0 = getObject();
console.log(_obj$0.a);
console.log(_obj$0.b);

where _obj$0 is a temporary variable. As this desugaring suggests, if _obj$0.a or _obj$0.b is implemented as a getter property, that getter will be called every time the corresponding bound identifier (a or c, respectively) is evaluated.
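The getter behavior can be observed today by running the desugared form directly. What follows is only a sketch: the getObject implementation and the reads counter are illustrative assumptions, not part of the proposal.

```javascript
// Hypothetical object whose `a` property is a getter that counts its reads.
let reads = 0;
function getObject() {
  return {
    get a() {
      reads += 1;
      return 42;
    },
    b: 2,
  };
}

// Desugared equivalent of: let { &a, &b: c } = getObject();
const _obj$0 = getObject();

// Each evaluation of `a` becomes a property access, so the getter runs again.
const first = _obj$0.a;
const second = _obj$0.a;
console.log(first, second, reads); // 42 42 2
```

With ordinary destructuring, the getter would run exactly once, at the moment the pattern is evaluated.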

The & syntax also works well with computed properties:

let { &[getKey()]: value } = getObject();
console.log(value);

This example might be transpiled to

const _obj$0 = getObject(), _key$0 = getKey();
console.log(_obj$0[_key$0]);

Note that the getKey() expression is not reevaluated each time value is evaluated, but only once, at the time of destructuring.
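To make the evaluate-once behavior concrete, here is a runnable version of that desugaring. The getKey and getObject bodies, the source object, and the keyCalls counter are assumptions made up for this sketch.

```javascript
// Hypothetical key function that records how often it is called.
let keyCalls = 0;
function getKey() {
  keyCalls += 1;
  return "count";
}

// Hypothetical object provider.
const source = { count: 1 };
function getObject() {
  return source;
}

// Desugared equivalent of: let { &[getKey()]: value } = getObject();
const _obj$0 = getObject(), _key$0 = getKey();

// Each evaluation of `value` reuses the stored key.
const before = _obj$0[_key$0];
source.count = 5;
const after = _obj$0[_key$0];
console.log(before, after, keyCalls); // 1 5 1
```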

The & syntax even works with destructuring patterns in function parameter lists:

function f({ &x: y, setX }) {
  setX(y + 1);
  return y;
}

const obj = {
  x: 1,
  setX(newX) {
    obj.x = newX;
  }
};

console.log(f(obj)); // 2
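For comparison, the sketch below runs today's snapshot semantics next to one possible desugaring of the referential version. The names makeObj, fSnapshot, fReference, and _arg$0 are hypothetical and exist only for this illustration.

```javascript
// Hypothetical factory so that each version gets a fresh object.
function makeObj() {
  const obj = {
    x: 1,
    setX(newX) {
      obj.x = newX;
    },
  };
  return obj;
}

// Today's semantics: `y` is a snapshot taken when the parameter is destructured.
function fSnapshot({ x: y, setX }) {
  setX(y + 1);
  return y;
}

// Desugared equivalent of: function f({ &x: y, setX }) { ... }
function fReference(_arg$0) {
  const { setX } = _arg$0;
  setX(_arg$0.x + 1);
  return _arg$0.x;
}

const snapshotResult = fSnapshot(makeObj());
const referenceResult = fReference(makeObj());
console.log(snapshotResult, referenceResult); // 1 2
```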

With a let (or var) destructuring declaration, the bound identifiers can be reassigned, and the original parent object will be updated:

const obj = { x: 1234 };
let { &x: y } = obj;
y += 1111;
console.log(obj.x); // 2345

If this mutability is undesirable, simply use a const destructuring declaration instead:

const obj = { x: 1234 };
const { &x: y } = obj;
console.log(y); // 1234
obj.x += 1111; // ok
console.log(y); // 2345
y += 2222; // throws
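The difference between the two declarations can be approximated in current JavaScript with an accessor object. This is a rough emulation rather than a proposed implementation: bindReference, its readonly option, and the .value property are invented for this sketch, and the real proposal would use plain identifiers instead.

```javascript
// Hypothetical helper: returns a holder whose `value` accessor forwards to obj[key].
function bindReference(obj, key, { readonly = false } = {}) {
  return {
    get value() {
      return obj[key];
    },
    set value(next) {
      if (readonly) {
        throw new TypeError(`Assignment to constant reference "${key}"`);
      }
      obj[key] = next;
    },
  };
}

const obj = { x: 1234 };

// Stands in for: let { &x: y } = obj;
const y = bindReference(obj, "x");
y.value += 1111;
console.log(obj.x); // 2345

// Stands in for: const { &x: z } = obj;
const z = bindReference(obj, "x", { readonly: true });
obj.x += 1111; // writing through the object is still allowed
console.log(z.value); // 3456

let threw = false;
try {
  z.value += 2222; // writing through the const reference is rejected
} catch (error) {
  threw = error instanceof TypeError;
}
console.log(threw); // true
```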

As in other languages with similar syntax, I anticipate that const references will come to be regarded as a best practice, just as const declarations are preferred wherever possible.

Note also that y is essentially an immutable live binding, much like a symbol imported by an import declaration (though the original value resides in an object, rather than a module environment record). This insight leads us to the most compelling application of this syntax...

Cooperation with dynamic import()

As currently proposed and implemented, the dynamic import() syntax has one lingering drawback compared with top-level import declarations: the only way to achieve live bindings is to retain a reference to the module's namespace object, so that you can access its latest properties.

In other words, the closest dynamic equivalent to this import declaration

import { a, b as c } from "./module";

function getSum() {
  return a + c;
}

would be something like

const moduleNs = await import("./module");

function getSum() {
  return moduleNs.a + moduleNs.b;
}

Note that this example pretends await is allowed at the top level, which is a feature that has never been formally proposed, though this proposal should be compatible with top-level await.

If you are tempted to use a destructuring declaration to bind individual identifiers with the desired names (e.g., c instead of moduleNs.b), then you lose the live binding behavior:

const { a, b: c } = await import("./module");

// Every time this function is called, it returns the sum of the values
// of `a` and `b` exported by `./module` that were available at the time
// the `await import(...)` expression was evaluated, even if `./module`
// has exported new values since then.
function getSum() {
  return a + c;
}

If ./module exports new values for moduleNs.a and moduleNs.b, those new values won't be visible to the getSum function, since it only has access to local variables that contain the original values.
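The stale-value problem can be reproduced without a real module. In this sketch, a plain object with a getter stands in for the module namespace object; the counter export and the direct reassignment that simulates the module updating it are assumptions for illustration.

```javascript
// Stand-in for a module namespace object: `counter` behaves like a live export.
let counter = 1;
const moduleNs = {
  get counter() {
    return counter;
  },
};

// Ordinary destructuring copies the current value once.
const { counter: snapshot } = moduleNs;

// Simulates `./module` later assigning a new value to its exported binding.
counter = 2;

console.log(snapshot); // 1 (stale)
console.log(moduleNs.counter); // 2 (live)
```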

However, with referential destructuring, the addition of two &s makes a and c behave as you would hope:

const { &a, &b: c } = await import("./module");

// Every time this function is called, it returns the sum of the *latest* values
// of `a` and `b` exported by `./module`.
function getSum() {
  return a + c;
}

Not only does & preserve live bindings, but the destructuring syntax is also more statically analyzable than the moduleNs object, since it's more obvious which properties are being used and which are not, and a reference to the namespace object can never leak into hard-to-analyze code.

This static analysis is the basis of important optimizations like "tree shaking," which (without this proposal) become significantly more difficult when dynamic import() is used.

The static analysis argument applies even if you choose to use the Promise API, thanks to function parameter destructuring:

import("./module").then(({ &a, &b: c }) => {
  // Return a closure that always has access to the latest values.
  return () => a + c;
}).then(...);

In principle, since every ECMAScript module has a namespace object, it should be possible to desugar any top-level import declaration to a combination of dynamic import(), top-level await, and referential destructuring:

import def, { a, b as c } from "./module";
// ...is roughly equivalent to...
const { &default: def, &a, &b: c } = await import("./module");

This desugaring won't work until both this proposal and top-level await are implemented, but I think harmonizing language features in this way is a worthwhile long-term goal.

Relationship to nested import declarations

In the July 2016 TC39 meeting, I presented a proposal to allow nesting import declarations inside blocks and functions, with support for live bindings. This proposal technically predated the dynamic import() proposal, though there was already talk of a module-scoped replacement for System.import(id, parent) at the time of my proposal.

Two concerns were raised during that discussion that led to my withdrawing the nested import proposal until I could find satisfactory solutions:

  1. It was unclear when (or if) module source code should be obtained, since it's too late to start asynchronous network activity at the moment when the import declaration is first evaluated.

  2. Synchronous import declarations seemed at odds with a future in which module evaluation may be asynchronous, thanks to the possibility of top-level await.

Since I first presented that proposal, I've become convinced that the answer to the first objection is that the runtime should make no special effort to fetch the code for modules imported in nested contexts. It's simply the responsibility of the programmer to use a bundling tool that ensures the code is synchronously available.

I do not have a perfect solution for the second objection, other than to throw if a nested import declaration tries to import an asynchronous module, which should encourage the programmer to hoist the import declaration to the top level, or use dynamic import().

While I could argue that these solutions are good enough to justify reviving the nested import proposal, the truth is that dynamic import() solves both problems already, and has much more momentum as an ECMAScript proposal.

My one remaining regret is the loss of live bindings when using dynamic import() and destructuring together. In addition to its other benefits, referential destructuring solves exactly this problem, and so I would be happy to withdraw the nested import proposal permanently if referential destructuring gains traction with the committee.

Relationship to Babel

If referential destructuring had already been part of ECMAScript when Babel implemented their ES2015-to-CommonJS modules transform, then the & syntax would have made a very convenient compilation target.

Instead of transpiling

import { a, b as c } from "./module"
console.log(a, c);

to

var _module = require("./module");
console.log(_module.a, _module.b);

Babel could simply generate

const { &a, &b: c } = require("./module");
console.log(a, c);

and then transpile the referential destructuring syntax using subsequent compiler plugins.

Put another way, if the referential destructuring proposal makes progress, Babel should consider adding support for & syntax to their destructuring transform, which would allow the CommonJS modules transform to be significantly simplified.

Relationship to Reify

I maintain another compiler for ECMAScript module syntax, called Reify, which Meteor uses via this Babel plugin. Disclosure: I'm also the lead maintainer of Meteor.

One of Reify's claimed benefits is that it simulates live bindings better than the default Babel transform:

import a, { b, c as d } from "./module";

becomes

// Local symbols are declared as ordinary variables.
let a, b, d;
module.watch(require("./module"), {
  // The keys of this object literal are the names of exported symbols.
  // The values are setter functions that take new values and update the
  // local variables.
  default(value) { a = value; },
  b(value) { b = value; },
  c(value) { d = value; },
});

Whenever ./module exports a new value for default, b, or c, the callback function associated with that export will be called to update the local variable. No references need to be rewritten, since the simulated live bindings are just local variables, and those local variables are easier to inspect in a debugger.

However, if referential destructuring were available, the generated code could be vastly simpler:

const { &default: a, &b, &c: d } = module.watch(require("./module"));

This is beginning to look a lot like the code that Babel would generate. So much so that I'm not sure the Reify compiler should continue to exist as an alternative tool. And that's a good thing.

Non-goals of this proposal

No references to identifiers

In contrast to languages like C++, this proposal does not allow for references from one identifier to another:

const &ref = someOtherVariable; // not allowed

Instead, just refer to someOtherVariable.

No reference parameters

In languages like C++ and Scala, it's possible for a function to mutate the caller's copy of a variable passed as an argument to the function. This can be surprising if you aren't paying close attention to the signature and implementation of the function.

The following code is not supported by this proposal:

function illegal(a, &b) {
  b = a + 1;
}

let a = 1, b = 2;
illegal(a, b);

The & token is allowed in parameter lists only inside object destructuring patterns, which means it can only be used to modify the contents of caller-provided objects that were already mutable.

Not another with statement

Unlike the widely-deprecated with statement, this proposal requires explicitly mentioning each reference property that you want to declare, and does not insert any new records into the scope chain.

The presence of a with statement made it impossible to know, statically, whether a free variable inside the with block referred to a property of the with object or a variable from another enclosing scope.

That was a disaster for static analysis and engine performance, but this proposal is no more problematic than the desugared code we've seen above.

Open questions

Although I hope the foregoing examples justify exploring this proposal further, there are several details that need to be worked out in order to make the proposal fully concrete.

Is there a better term than "referential destructuring"?

JavaScript, like Java and many other languages, already has a concept of a "reference," meaning a variable that refers to a heap-allocated object. Heap-allocated objects are always passed by reference in JavaScript, rather than by value, and there are no "pointers" in the language that must be explicitly "dereferenced."

The syntax this proposal introduces doesn't really fit this traditional definition of a "reference," though it is inspired by similar syntax in languages (like C++) that distinguish between pointers and references.

Is there a better term for what this proposal introduces?

Interaction with default expressions?

If an object destructuring pattern specifies a default expression for an optional property, that default expression might or might not be used, since it applies only when the property's value is undefined (which includes the case where the property is not found in the object):

const {
  a = 1,
  b: c = 2,
} = obj;

The & syntax still works for these properties, but it's not obvious what the semantics should be when the default expression is used:

const {
  &a = 1,
  &b: c = 2,
} = obj;

Two possibilities:

  1. If there is no a key in obj, then the a identifier should still be bound to the default value (1), and that value simply will not change. Code that consumes a doesn't have to know whether a is a referential binding, or just a normal variable.

  2. As long as obj has no a key, the value of the a identifier will be 1. If obj eventually acquires an a key, the a identifier will begin evaluating to obj.a instead.
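The two possibilities can be compared side by side by modeling each read of a as a function call. This is only a sketch: readA1, readA2, and useDefault are hypothetical names, and the check against undefined is an assumption that mirrors how existing destructuring defaults are triggered.

```javascript
const obj = {};

// Possibility 1: the choice is made once, when the pattern is evaluated.
const useDefault = obj.a === undefined;
const readA1 = useDefault ? () => 1 : () => obj.a;

// Possibility 2: the choice is made again on every read.
const readA2 = () => (obj.a === undefined ? 1 : obj.a);

const initial = [readA1(), readA2()];
obj.a = 7;
const later = [readA1(), readA2()];

console.log(initial); // [ 1, 1 ]
console.log(later); // [ 1, 7 ]
```

Under the first possibility the identifier never observes the new property, while under the second it switches from the default to obj.a as soon as the property appears.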

Deeper nesting?

A single & character declares a reference whose evaluation involves only one object property lookup:

const { a: { b: { &c, d }}} = obj;
console.log(c); // same as console.log(p.c), where p === original value of obj.a.b
console.log(d); // same as any normal bound identifier (not a reference)

By extension, it's easy to imagine a chain of multiple & keys declaring a reference to a series of property lookups:

const { &a: { &b: { &c, d }}} = obj;
console.log(c); // same as console.log(obj.a.b.c);
console.log(d); // same as before
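The practical difference between the two forms shows up when an intermediate object is replaced. In this sketch the reads are modeled as functions; readShallowC and readDeepC are hypothetical names, and p plays the same role as in the comment above.

```javascript
const obj = { a: { b: { c: 1, d: 2 } } };

// Desugared: const { a: { b: { &c, d }}} = obj;
const p = obj.a.b; // the parent object is captured once
const readShallowC = () => p.c;

// Desugared: const { &a: { &b: { &c, d }}} = obj;
const readDeepC = () => obj.a.b.c; // the whole chain is looked up on each read

p.c = 10;
const afterMutation = [readShallowC(), readDeepC()];

obj.a = { b: { c: 99, d: 2 } };
const afterReplacement = [readShallowC(), readDeepC()];

console.log(afterMutation); // [ 10, 10 ]
console.log(afterReplacement); // [ 10, 99 ]
```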

The question to answer is whether all possible combinations of this syntax make sense:

const {  a: {  b: { &c }}} = obj; // ok
const {  a: { &b: {  c }}} = obj; // ??
const {  a: { &b: { &c }}} = obj; // ok
const { &a: {  b: {  c }}} = obj; // ??
const { &a: {  b: { &c }}} = obj; // ??
const { &a: { &b: {  c }}} = obj; // ??
const { &a: { &b: { &c }}} = obj; // ok

Since c is the only bound identifier, only those variations with &c make sense to me, and &a without &b seems pointless unless the &a object pattern contains other reference keys.

However, uselessness does not always imply illegality, so perhaps all these variations should be legal, and it should be the job of linters to point out questionable patterns.

Array patterns?

It is tempting to allow & in an ArrayBindingPattern, like so:

const [a, &b] = elements;
console.log(b); // same as console.log(elements[1])?

However, this use of & feels different, because the & isn't prefixing a key of an object, but an identifier bound by the destructuring pattern.

The following code should work, though it certainly feels awkward:

const { 0: a, &1: b } = elements;
console.log(b); // same as console.log(elements[1])
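For reference, the same object pattern without & is already valid on arrays today and takes a snapshot. The elements array below is an assumption for this sketch.

```javascript
const elements = ["first", "second"];

// Numeric keys in an object pattern read the array's indexed properties.
const { 0: a, 1: b } = elements;

elements[1] = "changed";

console.log(a); // "first"
console.log(b); // "second" (a snapshot, unlike the proposed &1: b)
console.log(elements[1]); // "changed"
```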

I would prefer to restrict this proposal to ObjectBindingPattern destructuring, unless there is a strong argument for supporting array patterns as well.

Assignment patterns?

A variable declared with a & is a fundamentally different sort of variable, so it's not clear what (if anything) the following code should mean:

let a, b;
({ a, &b } = obj);

Again, I suppose this syntax could be made to do something useful, but I would prefer to forbid it until a compelling argument appears.

Interaction with TypeScript and/or Flow syntax?

Type annotations tend to be a source of surprising syntactic conflicts, so I want to be mindful that & may pose problems for TypeScript and/or Flow.

With that said, object destructuring is a syntax that both TypeScript and Flow have to deal with, so I am optimistic that & should not be a source of (m)any additional problems.

In TypeScript, the type annotations for an object destructuring pattern are completely separate from the pattern itself, so the syntax within the pattern should be the same as usual:

let { a, &b: c }: { a: string, b: number } = obj;

Flow has an open issue about this topic.
