whatwg / html

HTML Standard

Home Page:https://html.spec.whatwg.org/multipage/

Introducing new HTML elements that are polyfillable

domenic opened this issue · comments

Recently, my team (within Chrome) has been working on initial explainers and explorations for new HTML elements: a virtual scroller, a toggle switch and a toast. There are a couple common issues we're attempting to tackle with these that need a central place to discuss, and the whatwg/html community is a good place to have those. These issues are polyfillability, and pay-for-what-you-use. They are somewhat connected, but I want to try to use two separate issues to discuss them; we'll see if that works. This issue is about polyfillability.

The problem: if we introduce new HTML elements the same way as we have in the past, they are not polyfillable, because custom elements are restricted to have a dash in their name, whereas built-in elements traditionally do not.

In the past this hasn't been too much of an issue; we'd introduce new HTML elements that aren't polyfillable anyway, such as <dialog> (magic top layer/inertness), or <details> (magic slot=""-less shadow DOM and marker rendering). Some new element proposals follow this trend: e.g. <portal> introduces a fundamentally new way of dealing with browsing contexts, that needs deep integration into the internals of browsers. But the ones I mention above do not; they explicitly attempt to follow the extensible web manifesto and not build any new magic into the element. I.e., if it weren't for the dash issue, they would be polyfillable.

There are several potential solutions we've thought of so far. I'll outline them here, but I'm sure the community can think of others. Note that the specific names ("virtual scroller", "switch", "toast") are used as examples, and discussions on those repos may point us toward better names (e.g. WICG/virtual-scroller#163).

  1. Introduce a new dash-containing prefix for such polyfillable built-in elements, e.g. <std-. Use it everywhere. Result: <std-virtualscroller>, <std-switch>, <std-toast>, ...

  2. Ensure that such elements have at least two words in their names, and separate them by dashes, instead of HTML's usual <textarea> or <frameset> convention. Result: <virtual-scroller>, <toggle-switch>, <toast-banner>, ...

  3. Minimize visual noise by just using a dash suffix. Result: <virtualscroller->, <switch->, <toast->, ...

  4. Lift the restriction that custom element names require a dash, thus allowing them to be used to polyfill built-ins. Result: <virtualscroller>, <switch>, <toast>. This is where things get interesting; let's dig in to the details.

    • Why did we divide up the namespace in the first place? To avoid future conflicts with new browser-provided elements. I.e., what happens if code does customElements.define('awesomeelement'), but browsers later introduce a native <awesomeelement>? It would either break web pages, if we made customElements.define() throw, or it would require extra work from browsers, to allow native <awesomeelement>s to get converted into custom <awesomeelement>s (probably using a variant of the custom element upgrade machinery). Note that one route out here is to implement that conversion.
    • But this issue is predicated on browsers introducing new elements to the global namespace. What if they didn't, or did so only rarely? For example, if we declared that all new elements will not be added to the global namespace, but instead only appear to pages that load a corresponding built-in module, then there would be no future collisions to worry about.
    • Even in today's world, the custom element dash restriction is not a guarantee of conflict avoidance. Authors are already using non-dashed element names, e.g. https://www.apple.com/shop/product/MRXJ2/airpods-with-wireless-charging-case uses <favorite>, <overlay>, <loader>, <content>, and <validator>. Or the famous picturefill incident. In practice, introducing a new element already requires some compat analysis on the existing web. So one possibility is that we could say that most new elements (maybe, all polyfillable new built-in elements?) use a built-in module or other opt-in form, but if we really need something baked in to the global namespace, we can do some compat analysis and figure out a name that works.
    • We could continue to encourage a cultural norm that dashes be used for most custom elements, reserving non-dashed custom elements mostly for polyfills. It's unclear whether this would work, but we have weak evidence, from the fact that relatively few sites are using undashed non-custom elements such as <favorite> or <loader>, that the norm might continue to hold.
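
To make the four naming options concrete, here is a minimal sketch of the dash rule they are working around. It is a simplified, ASCII-only approximation of the HTML Standard's valid-custom-element-name check as it stood at the time of this thread. The helper name `isValidCustomElementNameApprox` is hypothetical, and the real grammar also permits many non-ASCII characters.

```javascript
// Names the HTML Standard reserves even though they contain a dash
// (they are used by SVG and MathML).
const RESERVED_NAMES = new Set([
  "annotation-xml", "color-profile", "font-face", "font-face-src",
  "font-face-uri", "font-face-format", "font-face-name", "missing-glyph",
]);

// Hypothetical helper: ASCII-only approximation of the rule that a custom
// element name starts with a lowercase letter and contains at least one dash.
function isValidCustomElementNameApprox(name) {
  const asciiPattern = /^[a-z][a-z0-9._-]*-[a-z0-9._-]*$/;
  return asciiPattern.test(name) && !RESERVED_NAMES.has(name);
}

// One example name from each of the four options above.
const candidates = ["std-toast", "toast-banner", "toast-", "toast"];
const results = candidates.map((name) => [name, isValidCustomElementNameApprox(name)]);
console.log(results);
```

Under this approximation, the names from options 1-3 pass the check, while option 4's `toast` does not. In a browser, passing such a dashless name to customElements.define() throws a "SyntaxError" DOMException, which is the restriction option 4 proposes to lift.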

Looking forward to hearing folks' thoughts!

Would it be possible that <toast> is essentially an "alias" for <std-toast>? That way <std-toast> is developed with the constraints that it be polyfillable, but there is a nicer-named <toast> as well that is the same thing. Most developers would use <std-toast> until some future date when browser support meets their requirements, then they would switch to using <toast> instead.

I think 4 is much better than 1-3. But it should be coupled with strong recommendation to not use dashless names except for polyfills. (Edit: conformance checkers can check for this!)

In practice, we have the problem of web developers polluting the standard's namespace in JS, DOM, and even HTML attribute names (when they are reflected as IDL attributes, as most are). Maybe we consider HTML element names to be more holy, but if so, our standard HTML element names shouldn't be "ugly".

@matthewp I think that would just mean that all new HTML elements have two names forever, which seems bad. Also, if abbr vs acronym is any indication, it can lead to a lot of wasted time for web developers discussing which one to use.

Remember when Chrome prefixed all in-development CSS properties with -webkit-? Remember when they stopped doing that? Let's not go through that again. Option 4 is the only real option.

@zcorpan I don't think it would be confusing for long, as developers would quickly recognize the pattern.

I don't mind option 4 here either but it is more complicated. As you note, there is advice on attributes not to use non data- custom attributes but yet it is very common to do so anyways. Unless the language restricts it, people will do so. But it's also true that people already do use non-dash names, just without going through customElements.define().

I think for 2, there's bound to be new elements that this convention doesn't work well for. 3 just seems like a weird version of 1.

Re the point in 4...

We could continue to encourage a cultural norm that dashes be used for most custom elements, reserving non-dashed custom elements mostly for polyfills.

We build lots of custom elements, and we certainly would not want any conflicts so we would almost certainly continue our convention of prefixing our components (including hyphen). If built-in elements are to be explicitly imported (as described in your pay-for-what-you-use post) then such possible conflicts would be avoided. On the other hand, if they are lazily loaded by the browser without explicit import, then I think collision may be a concern.

Comments on the grand naming problem

I tend to agree that option 4 is the least bad of these.

But there is also option 5: don't aim for perfect polyfillability for new elements by client-side JavaScript.

The idea of perfectly polyfillable new elements, in cases where an element can be implemented on top of existing functionality, is cool. But it's not so cool that it's worth it at any cost. Weird naming artifacts are not a cost worth paying.

I think all the options that leave a visible artifact in the names of elements are not good if you project forward to the future. Imagine all new HTML elements follow this convention. Any HTML source that heavily uses new elements would be awkward to read. If only elements believed to be polyfillable at the time they are standardized followed the convention, that would be even more strange: a mysterious marker in the element name for what is, at the spec level, an implementation detail.

Imposing weird names on standard elements may be tactically useful in the short term, because it makes a sweet, transparent polyfill possible. But then a strange naming convention is stuck in HTML source forever, even for elements that are implemented directly in browser engines. And the future is bigger than the present.

So why didn't the custom elements spec choose to support the well-established XML namespaces, allowing non-standard elements to be namespaced to their authors' unique URLs?

We would have gotten rid of the name clash problem for ever and ever...
Instead, this incomplete poor man's naming scheme was chosen, and here we are.

What's cheap often ends up being expensive.

@othermaciej can you expand on what you mean by option 5 'don't aim for perfect polyfillability for new elements by client-side JavaScript.'?

I'm not sure how this differs from option 4, and it seems open to a wide range of interpretations. Since there are already a number of thoughts and options here, I would prefer not to confuse the conversation by replying to the wrong one. For example, if there were ever a new element in the regular namespace, could we use Shadow DOM to polyfill it (or is that what your comment about perfection refers to)? Currently, the compromises that @domenic explained would prevent this. That seemed unfortunate at the time: the 'first era' of polyfills helped us move forward, but it also informed ideas about how polyfilling could be better. A lifecycle and a shadow DOM make it better because your tree, the thing everything uses to reason about everything, doesn't disappear at some undefined point in time, get transformed into a completely different tree, and leave you with two sets of relationships to deal with. To me, though, it seemed like a conversation that could be had at another time (like now, perhaps).

@dcleao I suppose that could still be added, though I haven't looked into it at all. It wouldn't fix existing uses, but it could help with future uses.

@bkardell Aside from the naming that this thread opened with, and upgrade, we're discussing some other surface-level differences to how built-in modules might work in WebIDL: maybe they should have same-realm brand checks and unenumerable properties, both of which would match JavaScript classes more closely.

I think it would be great to be able to define a polyfill for a new element that could be "upgraded" automatically if there was an actual native implementation, but the browser would need to be able to recognize it as such. I know it's dead dead dead, but the is attribute would have served this purpose, but maybe this is another way to achieve the same thing without that controversy.

You may be interested in our experience in this area in the Maps for HTML Community Group. I wrote a blog post about it, relating to polyfilling and eventually upgrading new functions for the <map> and <area> elements.

Not specific to maps, but the key issue I think is: when 1/3 browser engines have implemented a standard, what are authors expected to do to get their elements to work in the other 2/3?

Finally, at the risk of being Mr. Anti-JavaScript, I don't think it's a good idea to say we can define new, actually standard HTML elements in JavaScript forever. That leaves the Web very much poorer, and forces JavaScript on everybody everywhere, forever. Maybe I'm misunderstanding the idea though.

For 4, have you considered a model where the built-in names are just the fallback for when no custom element of that name has been defined? One could imagine the following set of APIs:

  • window.elements.getType("name") - Get the value of a name

    • Returns the custom element constructor directly if a custom element.
    • Returns "native" if the name is not defined in userland but has a native fallback.
    • Returns null if no element is defined with that name.
  • tracker = new UpgradeTracker(callback) - Subscribe to upgrades within a particular root element, with optional data to pass to the callback.

    • The callback is called with a single ES Map where the keys are the previous elements and the values are their replacements.
      • This makes processing them very efficient.
    • The callback is called synchronously and immediately after element upgrade callbacks are called, so frameworks can avoid breaking when a custom element is updated to a new value the same tick it's defined.
      • This is a real problem when you render synchronously, but a custom element a user uses is defined in a <script type="module">.
    • tracker.add(elem) - Add a subscription for an element.
      • When an element is upgraded, upgrade trackers are removed from the old element and added to the new element while building the callback upgrade map, to simplify most use cases.
    • tracker.remove(elem) - Remove an existing subscription for an element.
    • There is no way to "disconnect all" - it stores the tracker info on the element rather than the tracker. (The tracker just encapsulates a callback and the tracker logic itself.)
      • This dramatically speeds up the element upgrade algorithm.
      • This avoids duplicating memory unnecessarily.
    • The API is intentionally similar to MutationObserver, so browsers can batch it and not slow down the element upgrade process.
      • This is especially useful for polyfilled built-ins, but it's also useful for userland.
      • It's lower-level and more direct to reduce overhead in the subscription process, as you're likely tracking multiple elements, not just one.
    • This is specifically with frameworks in mind, so they can much more easily keep their internal representation in sync with minimal effort and minimal performance penalty - it becomes a single concurrent walk of two trees to update, with zero DOM mutations aside from creating new subscriptions. (This is literally the reason why I'm currently not integrating with customElements.whenDefined - it only runs on first definition instead of all redefinitions, it resolves at the wrong time, and it requires a lot of subtle, complex tracking to properly line up custom element upgrades to roots. The problem would only worsen if any tag ever could be upgraded instead of just custom elements, and it'd result in a massive slowdown on every node creation.)
      • It also allows for tracking only the elements the framework knows and cares about, no more.
  • <script type="module" src="..." import="..."></script> where the default export of the module is the constructor used for that import. This module is only loaded if none of the imports are already defined.

    • Omit src if you want to use the built-in fallback specifically.
    • Add an override attribute to override the existing value and, if the module itself hasn't yet been executed, skip its execution. You can combine this with omitting src, such as <script type="module" import="..." override></script> to declaratively revert to the built-in fallback.
    • import="..." is a space-separated list of tokens where for each token:
      • binding=name - Define name as the value exported from binding
      • name - Sugar for name=name, since that's bound to be incredibly common.
    • This leads to a "polyfill by default" semantic which is useful for not only built-in polyfills, but also userland components. Imagine Bootstrap offering custom elements.
    • This plays well with bundling - HTTP/2 isn't the magic sauce people claimed it was!
  • window.elements.define("name", Comp), window.elements.define("name", promiseToComp) - Define a name programmatically and set it to a component.

    • One could imagine <script type="module" import="..."> as sugar for window.elements.define("name", promiseToComp).
    • Elements are upgraded synchronously, and errors in UpgradeTracker callbacks are propagated accordingly.
    • window.elements.define("name", Comp, {is: "name"}) - Define a customized built-in element instead of an autonomous custom element.
    • The non-is: form can overwrite existing elements.
    • This obviously should prevent reentrancy.
  • window.elements.define("name", null) - Undefine a custom element.

    • This of course fires applicable upgrades to the native element type if a native fallback exists.
    • If this is entered while upgrading, it should perform its usual behavior, upgrade previously upgraded elements to the native forms, and abort the update algorithm altogether.
      • This removes any existing upgrade map entries of those upgraded if they were previously native themselves prior to upgrading.
  • sub = window.elements.whenDefined("name", callback) - This registers callback to be called as callback() on each upgrade of any element with a given "name".

    • This is mostly equivalent to customElements.whenDefined("name").then(...), but it avoids the overhead of scheduling a microtask and it can be called more than once.
    • Invoke sub.unsubscribe() to unsubscribe the callback.
  • maybeUpgraded = window.elements.upgrade(root) - Explicitly upgrade an element and all children in its subtree, and return the upgraded element. If the root element itself needed to be upgraded, it returns the new element, not the old element.
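
The core of the API list above (built-in names acting as a fallback, with define, undefine and repeatable whenDefined) can be sketched as a toy model. This is only an illustration under stated assumptions: `window.elements` is a proposal in this comment, not a real browser API, and the `ToyElementRegistry` class, its fields and the `ToastPolyfill` class are hypothetical. The toy has no element instances, so it notifies subscribers once per definition change rather than once per upgraded element.

```javascript
// Toy model of the proposed `window.elements` registry. Nothing here is a
// real browser API; the class and its internals are hypothetical.
class ToyElementRegistry {
  constructor(nativeNames) {
    this.nativeNames = new Set(nativeNames); // names with a built-in fallback
    this.custom = new Map();                 // name -> userland constructor
    this.listeners = new Map();              // name -> Set of callbacks
  }

  // Constructor if defined in userland, "native" if only the built-in
  // fallback exists, null otherwise.
  getType(name) {
    if (this.custom.has(name)) return this.custom.get(name);
    return this.nativeNames.has(name) ? "native" : null;
  }

  // define(name, Comp) installs or overwrites; define(name, null) undefines,
  // which re-exposes the native fallback if there is one.
  define(name, Comp) {
    if (Comp === null) this.custom.delete(name);
    else this.custom.set(name, Comp);
    // Notify synchronously, on every (re)definition, not just the first.
    for (const callback of this.listeners.get(name) ?? []) callback();
  }

  // Unlike a promise, the callback can fire more than once.
  whenDefined(name, callback) {
    if (!this.listeners.has(name)) this.listeners.set(name, new Set());
    this.listeners.get(name).add(callback);
    return { unsubscribe: () => this.listeners.get(name).delete(callback) };
  }
}

const elements = new ToyElementRegistry(["toast"]);
class ToastPolyfill {}
let notifications = 0;
const sub = elements.whenDefined("toast", () => { notifications += 1; });

const before = elements.getType("toast"); // built-in fallback only
elements.define("toast", ToastPolyfill);
const during = elements.getType("toast"); // userland definition wins
elements.define("toast", null);
const after = elements.getType("toast");  // back to the fallback
sub.unsubscribe();
elements.define("toast", ToastPolyfill);  // no longer notifies
console.log(before, during === ToastPolyfill, after, notifications);
```

The point of the sketch is the lookup order in `getType`: a userland definition shadows the built-in one, and undefining it falls back to the built-in instead of leaving the name unknown.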

quoting the second bullet from option 4 above:

But this issue is predicated on browsers introducing new elements to the global namespace. What if they didn't, or did so only rarely? For example, if we declared that all new elements will not be added to the global namespace, but instead only appear to pages that load a corresponding built-in module, then there would be no future collisions to worry about.

It seems like this is #4697, right? (Or is there a distinction between this option and that proposal?) It seems like if we do this, then option 4 seems pretty good. That said, it's a pretty major change, and I'm a bit skeptical of the rationale currently provided for it.


From the first bullet point of option 4 above:

It would either break web pages, if we made customElements.define() throw, or it would require extra work from browsers, to allow native <awesomeelement>s to get converted into custom <awesomeelement>s (probably using a variant of the custom element upgrade machinery). Note that one route out here is to implement that conversion.

Is there more detail on what the costs of this are?


(I think the strongest opinion I have here overall is that I dislike option 3.)

@dbaron

It seems like this is #4697, right? (Or is there a distinction between this option and that proposal?) It seems like if we do this, then option 4 seems pretty good. That said, it's a pretty major change, and I'm a bit skeptical of the rationale currently provided for it.

They are very related; as I said in the OP it's unclear whether keeping them separate works. But yeah, as stated it's about #4697. You could also imagine, at least in the abstract, opt-ins that are not done via built-in modules, or indeed are not pay-for-what-you-use. I guess the main point here is that opt-in allows collision avoidance.

Is there more detail on what the costs of this are?

I haven't thought about it too hard. I guess it's similar to customized built-in elements in terms of upgrades, but different in that you want to stop treating the <awesomeelement> as a native <awesomeelement>. That seems pretty hard to imagine implementing.

If we take a toy version of the problem, and ask what it would take for a second custom element definition B to override and upgrade a previous custom element definition A, we can perhaps analyze in more detail. The biggest problem I see there is the need to "undo" things done to each element instance by A's constructor. Maybe some sort of un-constructing callback would work. It sounds pretty messy though. Hmm.
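
The toy version of the problem can be sketched in plain JavaScript, with no DOM involved. Everything here is hypothetical: "elements" are plain objects, a "definition" is a pair of functions, and `unconstruct` stands in for the un-constructing callback mentioned above, which does not exist in the custom elements API.

```javascript
// Toy model: definition A is later overridden by definition B.
// `unconstruct` is the hypothetical un-constructing callback.
const definitionA = {
  construct(el) { el.listeners = ["click"]; el.state = { open: false }; },
  unconstruct(el) { delete el.listeners; delete el.state; },
};
const definitionB = {
  construct(el) { el.state = { visible: true }; },
  unconstruct(el) { delete el.state; },
};

class ToyRedefinableRegistry {
  constructor() {
    this.definitions = new Map(); // name -> current definition
    this.instances = new Map();   // name -> array of live elements
  }

  // Create an element of an already-defined name and track it.
  create(name) {
    const el = { localName: name };
    this.definitions.get(name).construct(el);
    this.instances.get(name).push(el);
    return el;
  }

  // Defining a name that already has a definition tears down every live
  // instance with the old definition before running the new constructor.
  define(name, definition) {
    const previous = this.definitions.get(name);
    if (!this.instances.has(name)) this.instances.set(name, []);
    for (const el of this.instances.get(name)) {
      previous.unconstruct(el);
      definition.construct(el);
    }
    this.definitions.set(name, definition);
  }
}

const registry = new ToyRedefinableRegistry();
registry.define("awesomeelement", definitionA);
const el = registry.create("awesomeelement");
registry.define("awesomeelement", definitionB);
console.log(el); // only B's state remains on the element
```

The sketch works only because A's side effects are confined to properties that A knows how to delete. A real constructor can do things that are much harder to reverse, such as attaching a shadow root, which is one reason the approach sounds messy.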

What would be an example of a new HTML element (or even past elements in retrospect) that couldn't be introduced as a polyfillable element?

e.g. Instead of introducing new capabilities via HTML elements it seems to me that they could be introduced via new JS/CSS/DOM primitives and all new HTML elements could be defined in terms of those primitives (and hence be polyfillable).

I might be missing something but it seems to me that the only elements that can't be polyfilled under such a scheme are ones that have effects (or special behaviour) during parsing:

  • <meta>
  • <base>
  • <template>
  • <link>
  • <embed>
  • <object>
  • <script>
  • <noscript>

@Jamesernator You missed a few other elements (and that doesn't include all the void elements), but it's mainly about what has runtime functionality that can't be polyfilled. Most of the commonly used functionality of <embed> and <object> has JS implementations (Shumway for Flash, PDF.js for PDFs, etc.), and most of the rest could be implemented using <video>, <audio>, or <img> as appropriate. Chrome's NaCl and the equivalents for Safari and Firefox are the exception, but those are limited to extensions IIRC. There are some elements that can't be easily polyfilled that you also missed:

  • <img> and <video> lack the necessary APIs you'd need to render them efficiently to a canvas
  • <canvas>, <area>, and <map> are very important primitives when the usual tree model won't work (think: 3D games and interactive visualizations).
  • <iframe> is the primitive for nested browsing contexts.
  • <link> has a lot of built-in functionality you can't polyfill without native API access.
  • <slot> is a primitive for defining placeholders for templates.
  • <style> is a primitive for applying styles. <link rel="stylesheet"> can itself be implemented in terms of this + XHR/fetch.
  • You still need a basic "element" with nothing attached to it.

This of course doesn't include <noscript> support and it's by no means complete.

There are some elements that can't be easily polyfilled that you also missed:

I meant under my scheme of exposing an appropriate API (pretending this could be done in retrospect). If an appropriate API was exposed for all of those capabilities then they could be polyfilled.

My point is that the only ones that cannot be polyfilled given an appropriate API are those that affect the parser directly (and yes I forgot about void elements).

Hence all future (non-parser affecting) elements could in theory just be replaced with an API + a polyfillable element built on top of that API. (Or just forgo the element and implement the API and let userland settle on best usages of the API before deciding to standardize an element).