WICG / import-maps

How to control the behavior of JavaScript imports

Home Page:https://html.spec.whatwg.org/multipage/webappapis.html#import-maps

Cascading Import Map Resolution

michaelficarra opened this issue · comments

Problem

For the life of the web, JavaScript scripts have coordinated by mutating the global, either by adding properties that later-run scripts depend upon or by replacing properties that later scripts depend upon with ones that have the same API but augmented behaviour. We call the latter type of coordination "virtualisation".

As a concrete example of virtualisation, XMLHttpRequest is often virtualised by analytics tools, developer tools, performance monitors, etc. When early-run scripts replace the reference to XMLHttpRequest on the global with a wrapper, two important things happen: 1) the later scripts do not have access to the original XMLHttpRequest constructor, so the early-run script can ensure they observe all usage, and 2) the augmented functionality composes with other virtualisers, in a last-to-first order.

Or consider polyfills: polyfilling scripts may augment built-in APIs with enriched forms. If multiple polyfills use the import maps API as specified today to virtualise a built-in, only the last-run polyfill will be applied. A more useful behaviour for import maps (and likely the one that would match developer intuition) would allow for the polyfill functionality to compose.

Finally, consider the example of denying access to a built-in module. It's natural to expect that a single import map early on the page which maps "std:kv-storage" to null would mean that access to "std:kv-storage" is denied to the rest of the page (just as when an early-run script deletes window.localStorage and so denies it to the rest of the page). But with the current strategy, that is not the case: a script later on the page can undo that mapping (i.e., add an import map mapping "std:kv-storage" to "std:kv-storage") and then import("std:kv-storage") successfully.
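To make the denial problem concrete, here is a minimal sketch that models each import map as a flat specifier-to-target object and merges them last-wins, as the current strategy does. The helper name `mergeLastWins` and the two map variables are hypothetical, chosen only for illustration.

```javascript
// Hypothetical model: an import map is a flat { specifier: target } object.
// Later maps simply overwrite earlier entries (the current last-wins strategy).
function mergeLastWins(allImportMaps) {
  const merged = Object.create(null);
  for (const map of allImportMaps) {
    Object.assign(merged, map);
  }
  return merged;
}

// Early map installed by the application to deny the built-in.
const denyMap = { "std:kv-storage": null };
// Later map installed by some other script, undoing the denial.
const undoMap = { "std:kv-storage": "std:kv-storage" };

const lastWinsResult = mergeLastWins([denyMap, undoMap]);
// The null entry has been replaced, so the built-in is reachable again.
console.log(lastWinsResult["std:kv-storage"]);
```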

Proposal

Import maps, as specified today, let later maps entirely override earlier ones when both provide mappings for the same module specifier, and thus do not support composition of behaviour. How could we change import maps to support these use cases? I propose lifting the restriction that the right-hand side of an import map entry must be a URL, instead allowing any module specifier. Then, rather than destructively merging maps, the right-hand side of each entry would be resolved according to the previous map, ultimately falling back to the default resolution behaviour. Call this cascading resolution.

This would mean that, if you have two maps along the lines of

{
  "imports": {
    "foo": "bar"
  }
}

and

{
  "imports": {
    "baz": "foo"
  }
}

on the page in that order, then the module specifier "baz" would resolve to "bar", not "foo".

I'm aware that users are starting to use import maps today. Since cascading would not happen within a single map (only between maps), this proposal will not affect any usage with only a single map. Additionally, if all import maps specify a non-URL on the left-hand side and a URL on the right-hand side, they would also see no change in behaviour.

This should be a fairly straightforward change to the implementation. In the import map merging algorithm, when adding new keys to a map, including replacing existing ones, the value with which they are installed is given by looking up the provided value in the map as it existed before adding the new keys, rather than by using the provided value directly. In pseudocode, if today's merging logic for imports looks like:

let previousImportMap = Object.create(null);
for (let newImportMap of allImportMaps) {
  Object.assign(previousImportMap, newImportMap);
}

then the updated logic would look like

let previousImportMap = Object.create(null);
for (let newImportMap of allImportMaps) {
  let currentImportMap = Object.create(null);
  for (let [k, v] of Object.entries(newImportMap)) {
    currentImportMap[k] = v in previousImportMap ? previousImportMap[v] : v;
  }
  Object.assign(previousImportMap, currentImportMap);
}

This is slightly simplified in that it ignores fallbacks. Each fallback would need to be resolved individually in the previous map.
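As a worked example, the sketch below wraps the cascading logic in a function and runs it on the two maps from earlier ("foo" to "bar", then "baz" to "foo") and on the access-denial case. It operates on the contents of each map's "imports" object only, ignores scopes and fallbacks, and the name `mergeCascading` is hypothetical.

```javascript
// Cascading merge: each new value is first looked up in the map as it
// existed before the new map was applied.
function mergeCascading(allImportMaps) {
  const previous = Object.create(null);
  for (const newMap of allImportMaps) {
    const current = Object.create(null);
    for (const [key, value] of Object.entries(newMap)) {
      current[key] = value in previous ? previous[value] : value;
    }
    Object.assign(previous, current);
  }
  return previous;
}

// "baz" is mapped to "foo", which the earlier map already sends to "bar".
const cascaded = mergeCascading([{ foo: "bar" }, { baz: "foo" }]);
console.log(cascaded.foo, cascaded.baz);

// A later map cannot undo an earlier denial: its right-hand side is
// resolved through the earlier map, which still yields null.
const denied = mergeCascading([
  { "std:kv-storage": null },
  { "std:kv-storage": "std:kv-storage" },
]);
console.log(denied["std:kv-storage"]);
```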

Caveats

URL Normalisation

For the access denial use case, in order to reliably deny access to a web resource, any URL normalisation that the HTTP server would do (effectively serving the same resource for many URLs) would need to be accounted for by the import map. As such, the browser should normalise URLs before looking them up in any import map, including when looking up URLs which were the result of looking up the module specifier in a later map.
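A rough sketch of what normalising before lookup could look like, assuming only the normalisation that the standard URL parser performs (lower-casing the host, removing dot segments). Server-specific equivalences, such as case-insensitive paths, are not covered here. The helper names and the example.com URLs are hypothetical.

```javascript
// Normalise anything that parses as an absolute URL; leave bare
// specifiers such as "foo" untouched.
function normaliseSpecifier(specifier) {
  try {
    return new URL(specifier).href;
  } catch {
    return specifier;
  }
}

// Look up a specifier in a flat map after normalising both the map keys
// and the specifier itself.
function lookupNormalised(importMap, specifier) {
  const normalisedMap = Object.create(null);
  for (const [key, value] of Object.entries(importMap)) {
    normalisedMap[normaliseSpecifier(key)] = value;
  }
  const key = normaliseSpecifier(specifier);
  return key in normalisedMap ? normalisedMap[key] : key;
}

const denyList = { "https://example.com/lib/secret.mjs": null };

// A differently spelled URL for the same resource still hits the deny entry.
console.log(lookupNormalised(denyList, "https://EXAMPLE.com/lib/x/../secret.mjs"));
```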

Inferior Alternatives

First-wins

As an alternative, the merge strategy could be changed to first-wins instead of last-wins. This requires more effort on the part of the application developer than cascading, but at least allows them to retain control over the page in the usual way (i.e., by ensuring that the invariants they want to enforce and polyfills they want to load and so on are established early), in contrast to the existing proposal. In particular, it would mean that a later resource which tried to install its own import map would not be able to subvert the application developer's map.

While this would be better than the current strategy, I believe it would be strictly less useful and more surprising than cascading. In particular, it would not allow composition at all.

In pseudocode, as above:

let previousImportMap = Object.create(null);
for (let newImportMap of allImportMaps) {
  let currentImportMap = Object.create(null);
  for (let [k, v] of Object.entries(newImportMap)) {
    if (!(k in previousImportMap)) {
      currentImportMap[k] = v;
    }
  }
  Object.assign(previousImportMap, currentImportMap);
}

Equivalently, merely reverse the list of allImportMaps in the existing logic.
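The claimed equivalence can be checked with a small sketch. Maps are again modelled as flat objects, and the function names, map names and module paths are hypothetical. The two approaches may enumerate keys in a different order, but every key ends up with the same value.

```javascript
// First-wins: an entry is installed only if no earlier map defined the key.
function mergeFirstWins(allImportMaps) {
  const merged = Object.create(null);
  for (const map of allImportMaps) {
    for (const [key, value] of Object.entries(map)) {
      if (!(key in merged)) {
        merged[key] = value;
      }
    }
  }
  return merged;
}

// Last-wins applied to the reversed list of maps.
function mergeLastWinsReversed(allImportMaps) {
  const merged = Object.create(null);
  for (const map of [...allImportMaps].reverse()) {
    Object.assign(merged, map);
  }
  return merged;
}

const appMap = { "std:kv-storage": null, app: "/app.mjs" };
const thirdPartyMap = { "std:kv-storage": "std:kv-storage", widget: "/widget.mjs" };

const firstWins = mergeFirstWins([appMap, thirdPartyMap]);
const reversed = mergeLastWinsReversed([appMap, thirdPartyMap]);

// The application's denial survives in both results.
console.log(firstWins["std:kv-storage"], reversed["std:kv-storage"]);
```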

Virtualising import map creation APIs

Instead of having the browser perform cascading resolution, early-run scripts could in principle virtualise all dynamic import map creation APIs (.innerHTML = and so on) and perform this composition strategy themselves. But besides being totally impractical and a great deal of work, this also does not suffice to set up cascading or otherwise enforce invariants for import maps which are present in the HTML, and thus do not go through one of the dynamic APIs.

@domenic @hiroshige-g Have you reviewed this issue? If we are going to adopt this change, I'd really like to have it incorporated soon, before the experimental implementations gain too widespread adoption. Is there anything I need to clarify or research for this to proceed?

an excellently laid-out proposal!

i wonder what @guybedford thinks about these ideas?

👋 Chase

Some random questions and thoughts:

As for cascading virtualization of built-ins, could you show some concrete example import maps to do that?

Due to https://github.com/WICG/import-maps#packages-via-trailing-slashes (and HTTPS->HTTPS fallback), I expect cascaded resolution should be done each time a specifier is resolved (not in merging import maps).

Also, what should occur for cyclic cases, e.g. the following map combined with import 'foo'; and import 'bar';?
Probably we should limit the maximum number of cascaded resolution steps.

{
  "imports": {
    "foo": "bar",
    "bar": "foo"
  }
}

Thanks for opening this issue @michaelficarra.

Virtualization and polyfills are definitely a goal of import maps. However, as noted in the README, the model for doing so is meant to be in the application author's control---i.e., the person who is aware of and controls all import maps in the page. It isn't designed to allow a library author to virtualize other libraries, in the way you get via currently patching globals.

So the example you quote, about denying std:kv-storage to the rest of the page, assumes that the application author has sufficiently vetted the import maps included by their application to ensure that nothing redirects it elsewhere. (This is similar to the model used by service workers; application authors need to ensure that any service workers installed on their page are doing what they expect.)

If a page author wants to include untrusted script, and not audit it for import map creation, then the best they can do is control import map creation. As such, the method to ensure virtualization should be the one you mention as an "inferior alternative", namely "Virtualising import map creation APIs". The simplest and most efficient way to do this would be via CSP, although the method you mention, of overwriting all DOM APIs, is another avenue they can pursue.

Similarly, composition among uncoordinated parties isn't a goal for import maps. Allowing more than one import map in a page was a relatively recent decision, and it was made purely for authoring convenience. E.g., folks mentioned the idea of one manually-curated import map and one tool-generated one.

The proposal to switch the right-hand side from URLs to module specifiers isn't a direction we'd like to take. @hiroshige-g mentions some of the technical complexities this brings; indeed, we specifically moved away from this cascading resolution, which was present in an earlier version of the proposal, for related reasons.

Hope this helps!

@hiroshige-g

As for cascading virtualization of built-ins, could you show some concrete example import maps to do that?

Let's say there's a new built-in called, conveniently, "built-in". A web application author might include a polyfill which would add the following import map.

{ "imports": { "built-in": "https://built-in-polyfill" } }

If the web application author also wants to do some debugging of consumers of this polyfill, they can add in debugging with a script that includes the following import map:

{ "imports": { "built-in": "https://augment-built-in-with-debug-tracing" } }

And if the new built-in has performance or profitability effects, maybe we would want to measure various aspects of it with a script that would add the following import map:

{ "imports": { "built-in": "https://augment-built-in-with-analytics" } }

The application author can control the order of composition to get the desired effect. Critically, they do not need to change the source of each script to chain the module specifiers. It is all handled through composition order.

I expect cascaded resolution should be done each time a specifier is resolved (not in merging import maps)

I don't know what you mean by this.

Also, what should occur for cyclic cases, e.g. the following

That's not a problem. Cascading resolution only happens across maps, not within a single one. I mention this in my OP.
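A small sketch of why the merge-time pseudocode from the OP cannot loop, under the same simplifications as before (flat maps, no scopes or fallbacks; `mergeCascadingOnce` is a hypothetical name). Each right-hand side is looked up exactly once, in the map as it existed before the new map was applied.

```javascript
// Same single-pass cascading merge as in the OP's pseudocode.
function mergeCascadingOnce(allImportMaps) {
  const previous = Object.create(null);
  for (const newMap of allImportMaps) {
    const current = Object.create(null);
    for (const [key, value] of Object.entries(newMap)) {
      current[key] = value in previous ? previous[value] : value;
    }
    Object.assign(previous, current);
  }
  return previous;
}

// A cycle inside one map: nothing precedes it, so entries are kept as written.
const sameMap = mergeCascadingOnce([{ foo: "bar", bar: "foo" }]);
console.log(sameMap.foo, sameMap.bar);

// The same two entries split across two maps: "foo" is looked up once in
// the earlier map, and the merge stops there.
const acrossMaps = mergeCascadingOnce([{ foo: "bar" }, { bar: "foo" }]);
console.log(acrossMaps.foo, acrossMaps.bar);
```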

@domenic

the person who is aware of and controls all import maps in the page

We must be honest with ourselves that authors of modern day web applications will not "maintain" an import map. If the current behaviour stands, import maps and module specifiers will be generated by a compiler, effectively implementing this cascading behaviour at compile time.

This is similar to the model used by service workers; application authors need to ensure that any service workers installed on their page are doing what they expect.

And I think service workers have the same failing and could be fixed by allowing multiple service workers with a similar composition strategy. I will be following up on that soon.

Similarly, composition among uncoordinated parties isn't a goal for import maps.

The thesis of the OP is that I think it's important for library authors to be able to write code that can ship to the web without the library author being aware of sibling libraries and without whole-application compilers doing import rewiring. Can we make that a goal?

If the current behaviour stands, import maps and module specifiers will be generated by a compiler, effectively implementing this cascading behaviour at compile time.

That's a very reasonable outcome, in my opinion.

without whole-application compilers doing import rewiring. Can we make that a goal?

No, sorry. The application needs to be aware of its dependencies (either using compilers, or CSP, or code review, or any of the other mechanisms available).

I had a call yesterday with @michaelficarra and @bakkot where they walked me through this issue in more detail. There was a key use case I was missing. However, I now think it can still be accomplished with the current proposal.

In particular, consider additive, separately-authored polyfills. Such as:

// kvs-v2-polyfill.mjs
import storage, { StorageArea } from "std:kv-storage";

export class StorageAreaObserver { ... };
export default storage;
export { StorageArea };

// kvs-v3-polyfill.mjs
import storage, { StorageArea } from "std:kv-storage";

export class EphemeralStorageArea { ... };
export default storage;
export { StorageArea };

An author would like to use both of these polyfills on their page, such that import "std:kv-storage" retrieves a version of KV Storage with both StorageAreaObserver and EphemeralStorageArea exports.

With their proposal in the OP, this would be done via two separate import maps that cascade:

<script type="importmap">
{
  "imports": {
    "std:kv-storage": "/kvs-v2-polyfill.mjs"
  },
  "scopes": {
    "/kvs-v2-polyfill.mjs": {
      "std:kv-storage": "std:kv-storage"
    }
  }
}
</script>
<script type="importmap">
{
  "imports": {
    "std:kv-storage": "/kvs-v3-polyfill.mjs"
  },
  "scopes": {
    "/kvs-v3-polyfill.mjs": {
      "std:kv-storage": "std:kv-storage"
    }
  }
}
</script>

However, I think this can be accomplished with today's proposal more simply by using a single import map and scopes:

<script type="importmap">
{
  "imports": {
    "std:kv-storage": "/kvs-v3-polyfill.mjs"
  },
  "scopes": {
    "/kvs-v2-polyfill.mjs": {
      "std:kv-storage": "std:kv-storage"
    },
    "/kvs-v3-polyfill.mjs": {
      "std:kv-storage": "/kvs-v2-polyfill.mjs"
    }
  }
}
</script>
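To see how the scopes chain the polyfills, here is a simplified resolver sketch. It treats a scope as applying whenever the importing module's path starts with the scope key, which is a deliberate simplification of the real scope-matching rules. The function `resolveScoped` and the "/index.html" referrer are hypothetical.

```javascript
const singleImportMap = {
  imports: { "std:kv-storage": "/kvs-v3-polyfill.mjs" },
  scopes: {
    "/kvs-v2-polyfill.mjs": { "std:kv-storage": "std:kv-storage" },
    "/kvs-v3-polyfill.mjs": { "std:kv-storage": "/kvs-v2-polyfill.mjs" },
  },
};

// Simplified resolution: use a matching scope if one maps the specifier,
// otherwise fall back to the top-level imports, otherwise the specifier itself.
function resolveScoped(map, specifier, referrer) {
  for (const [scopePrefix, scopeImports] of Object.entries(map.scopes ?? {})) {
    if (referrer.startsWith(scopePrefix) && specifier in scopeImports) {
      return scopeImports[specifier];
    }
  }
  return specifier in map.imports ? map.imports[specifier] : specifier;
}

// Follow import "std:kv-storage" from the page down to the real built-in.
const chain = [];
let referrer = "/index.html";
let target = resolveScoped(singleImportMap, "std:kv-storage", referrer);
while (target !== "std:kv-storage") {
  chain.push(target);
  referrer = target;
  target = resolveScoped(singleImportMap, "std:kv-storage", referrer);
}
chain.push(target);

// The page gets v3, v3 wraps v2, and v2 wraps the built-in.
console.log(chain.join(" -> "));
```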

I think we should expand the example at https://github.com/WICG/import-maps#extending-a-built-in-module to explain this.

However, I think this can be accomplished with today's proposal more simply by using a single import map and scopes

Of course, all usage of multiple import maps can be replaced with usage of a single import map. Cascading is not more powerful than any other composition strategy. The point we're trying to make is that that requires centralised coordination where dynamic composition of maps in the browser does not. In reality, many web apps do not have a single human coordinating all of the scripts. Also, there is no means for scripts to automatically communicate which module specifiers they wish to virtualise (as there is with a module's exported bindings): a human has to figure it out manually, from either documentation or source inspection, and get it right.

Maybe another example use case will be helpful?

Because libraries can install their own import maps, we can assume they will do so. So imagine you have two libraries, A.js and B.js, which wrap the built-in std:foo. Adding either to a page in the absence of the other will work with no further effort on your part (in particular, without you adjusting your import map). But if you try to add B.js to a page where A.js was already present, A.js will mysteriously break.

Of course, all usage of multiple import maps can be replaced with usage of a single import map.

I'd gotten the impression from our discussion that, with the current spec, folks would have to rewrite the source of kvs-v3-polyfill.mjs to reference kvs-v2-polyfill.mjs. That would be bad. But since that's not the case, I no longer see the issue.

In the case given, the extra burden of chaining your polyfills by dropping them into the import map in order, each referencing the previous one, is not really any greater than the burden you have today with script tags.

Because libraries can install their own import maps, we can assume they will do so.

Maybe we should disallow this; the intent is for the page author to be authoring the import maps, not libraries. It falls out of the <script> infrastructure that we're reusing that they can be dynamically inserted, but perhaps that's causing too much of a false analogy, and we should add special cases to avoid libraries going down this path. Perhaps especially in v1?

Adding either to a page in the absence of the other will work with no further effort on your part (in particular, without you adjusting your import map)

The intention is for you to add libraries to your page by adding them to your import map. (Or, more realistically, by npm install/etc. doing so for you.) You haven't stated explicitly, but I think you're envisioning doing so via dropping in a <script> tag. The idea of modules in general, and import maps in particular, is to replace this installation flow.

Maybe we should disallow this; the intent is for the page author to be authoring the import maps, not libraries

Totally agree. If libraries start to define their own tiny import maps, it will become hard to figure out the final import map the entire app will use.

While that is true, regardless of decisions made here, many libraries will have to invent their own convention for defining their own tiny import map, and tooling will have to exist to combine them into one to ship with the final application - otherwise app authors will have to manually read readmes to patch together their final import map.

Hopefully something like exports in package.json would allow <insert package manager> to generate an application-level import-map. I agree that there needs to be some ecosystem consensus on how import maps will be generated but I think packages having complete import maps that they actively write to the page is an awkward direction.

We must not leave buildless people behind, and we must assume libraries will try to streamline usage of their libraries for end users, by writing importmaps for their end users.

People inherently don't want to write importmaps, library authors inherently will try to solve it.

Consider this user, who wants to import two different libraries from two authors who know nothing of each other:

<script type="importmap" src="https://some-project.dev/some-lib/importmap.json"></script>
<script type="importmap" src="https://other-project.org/other-lib/importmap.json"></script>

<script type="module">
  import {milk} from 'some-lib'
  import {shake} from 'other-lib'
  milk()
  shake()
</script>

This needs to work well without build tools in some way. I don't think we can really expect otherwise.

Especially not once HTTP/3 ships in major server software such as Node.js, and we begin to see open-source, self-hostable ES module servers built on HTTP/3 multiplexing become a reality. Where HTTP/2 failed, I think HTTP/3 can succeed.

Lib authors are gonna want to make consumption as easy as possible even for those without build tools. Guaranteed!

<script scope="https://some-project.dev/some-lib/" type="importmap" src="https://some-project.dev/some-lib/importmap.json"></script>
<script scope="https://other-project.org/other-lib/" type="importmap" src="https://other-project.org/other-lib/importmap.json"></script>