w3c / webextensions

Charter and administrivia for the WebExtensions Community Group (WECG)

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Pre-rendering in new tabs and window ID behaviour

oliverdunk opened this issue · comments

Chrome already supports pre-rendering (blog), but only for a navigation that is expected to happen in an existing tab. We are looking to expand this to situations where a pre-rendered page could open in a new tab or window, such as when clicking an anchor tag.

It is unclear if this new tab will be added to the current window or a new one while pre-rendering is happening since it can start before the click has happened, and we do not know if the user will middle click.

Should we fire events on the browser.tabs API, and if so should the windowId parameter be:

  • The ID of the window initiating the navigation (which will be the final window if a new one is not opened)
  • A new window ID (which will only be used if we place the tab in a new window)
  • An arbitrary value, or a pre-determined value that represents this specific case

Note that for the existing pre-rendering of documents, we do fire webNavigation events in the existing implementation. We set the frameId to an unspecified value and provide the documentLifecycle property on the event to indicate that we are in the pre-render lifecycle stage.

For the full discussion, see the meeting notes (reviewed at #526, available at https://github.com/w3c/webextensions/blob/main/_minutes/2024-01-18-wecg.md).

The general theme was the desire to integrate this in the existing tab/window event models. That implies no -1 or magic windowId, but an actual windowId (followed by tabs moving windowIds if desired).

I have labeled this as discussion Needs further discussion because the usual consensus labels are not really applicable here. Perhaps, if we agree on a direction in the future, we may be able to use the usual consensus labels to make it easier to track cross-browser support and implementation.

I chatted with Oliver about this, and I have some concerns with using the window ID of initiator window. In particular, I think this will very quickly lead to inconsistencies between APIs that will be very difficult for developers to understand, diagnose, and detangle -- and will definitely seem like (and possibly hide) real bugs.

Some examples. Assume a window (ID "W1") with, say, 5 tabs. Tab 3 is active, and begins a prerender for a new tab. The proposal is to indicate the new prerendered tab lives in Window W1.

  • What happens when the developer calls tabs.getAllInWindow() or any of the windows API methods (like windows.get())? Would we modify those methods to return the pre-rendered tab? That seems like a pain to track from an implementation standpoint, but omitting the prerendered tab would then look like a bug (a developer got a tab event and then the tab isn't present in the corresponding window, despite no other events around it).
  • What is the index of the tab in the window in this scenario? It can't be "4" (there's already a tab 4). We could put it as a new tab at the end, but then we also need to fire corresponding events for moving the tab when it gets into place.
  • Even if we do return the tab along with the window data, how does the extension know it's being prerendered? For extensions that present some information to the user about the state of their window -- for instance, vertical tab strips implemented in side panels -- their corresponding UI would look incorrect (they'd indicate a tab that isn't there).
  • While we could theoretically work around the above by also introducing a new property on the tab ("prerendered" or similar), that doesn't seem inline with the goal here of making it easier for existing extensions.

I feel like "lying" about the window the tab is in is going to lead to a lot more complexity than it solves. Any instance in which the developer tries to access that tab through the corresponding window is either going to fail or is going to impose a lot of complexity for both browsers and extensions to get right.

I'd much prefer we either:

  • Just use -1 -- this is "WINDOW_ID_NONE" and, in my opinion, that's correct. The tab is not in a window. There's some similar precedent for this already with contexts that don't have a corresponding tab (such as requests from service workers).
  • We could theoretically use a new constant for a prerendered window. I'd lean against this, though.

The main reason this came up is that currently Chrome doesn't allow tab objects with invalid window IDs -- but I would suspect that's a limitation that few or no extension developers are currently expecting. IMO, indicating the prerendered tab isn't in a window is the most predictable, explicable, and logical.

(I'm fully on board with firing the "moved" events when the tab exits prerender and becomes "real", at which point it would have a valid window ID).

What do y'all think?

I would like to know what tab-related events that the "prerender-in-new-tab/window" fires. For example,

  • Does browser fire the tabs.onCreated event when "prerender" or when "visible to user"? I prefer this event when "visible to user".
  • Does browser fire the tabs.onUpdated event when "prerender"? I think No.
  • Does browser fire the tabs.onMoved event when "visible to user"? rdcronin think Yes. But I don't think so.
  • Does browser fire the tabs.onReplaced event when "visible to user"? This is the only existing event that related to "prerender" behavior. But it doesn't "replace" any existing tab.
  • Does browser fire the tabs.onAttached event when "visible to user"? Again, I prefer the tabs.onCreated event.

Overall, I prefer that developers are not aware of a "prerendered" tab when it is not visible to user. For example, tabs.query() can't find it. So developers don't need to care about tab.windowId.

Unfortunately, it's not as simple as that.

Many extensions need to deal with sites during the loading process. If this loading process happens off-screen (in a pre-rendered context) those extensions will still need to handle that case. Many of these can be handled declaratively -- such as through content scripts or declarativeNetRequest, which are both able to behave based on URL patterns or other data that is largely independent of tab ID. However, this isn't always the case. While we could conceptually limit the tabs API to "visible tabs", there are other APIs in which this would clearly be undesirable IMO (such as webNavigation). Currently, those events are also heavily tied to tab IDs.

We could take the stance that the tabs (and windows) API is purely about "what the user sees", and then simply not include tab ID at all on events from webNavigation for pre-rendered or BFCached (or other non-visible) tabs. This is more possible now that we have document IDs on these events, which allows developers to continually associate a request with a given context. If we were to start with a clean slate, I think we would probably want to go in that direction.

However, we also have to worry about existing extensions. For better or worse, many extensions have a pattern that might look like:

chrome.tabs.onCreated.addListener(tab => {
  if (isRelevantUrl(tab.url)) {
    injectScriptAtDocumentStart(tab.id);
  }
});

(Yes, this has always been racy, and it's one of the reasons we introduced declarative content scripts. However, it's still a common pattern and it "mostly" works for injecting script before the page has a chance to do much.)

This type of pattern would be very hindered if we did not fire any tab-related events for pre-rendered tabs. Similarly affected would be extensions that currently use tabId instead of documentId to associate requests (e.g. from webNavigation or webRequest) with a given context.

When making changes to the network and rendering stack, we've typically strived to ensure as much backwards compatibility as possible to avoid breaking extensions. I suspect that simply not dispatching any tab-related events (or having any associated tab ID for other events like webNavigation and webRequest) would be a very breaking change for many extensions.

"pre-rendering tab" seems unable to achieve conceptual integrity while maintaining backwards compatibility. It looks like we're faced with these choices:

  1. Option 1: don't introduce "pre-rendering tab". This will keep everything simple and no compatibility issues.
  2. Option 2: introduce new api, event and property. Developers are clearly aware of this special state.
  3. Option 3: pretend it's a tab, but not exactly (no window id, no index, can't activate, and maybe cancel). Developers still need some defensive programming.

I think that's fair @hanguokai, though I would say there is also a 2b where we introduce some new properties without a full API, similar to the webNavigation changes.

I think focusing on a discussion between 2 and 3 makes the most sense since this is ultimately not a change being worked on by the extensions team and we want to support browsers in continuing to make performance improvements.

Based on the last meeting, it seems like there is general agreement to move forward with using WINDOW_ID_NONE for any events involving these windows. We're planning to start some early experiments using this in Chrome, likely behind an origin trial.