w3c / ServiceWorker

Service Workers

Home Page:https://w3c.github.io/ServiceWorker/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

_Update algorithm should unregister SW on 404 and 410 errors

jyasskin opened this issue · comments

Currently the _Update algorithm says

  • If fetching the script fails due to the server returns a 4xx or 5xx response or equivalent, or there is a DNS error, or the connection times out, then:
    1. Reject promise with a new NetworkError.
    2. Set serviceWorkerRegistration.updatePromise to null.
    3. Abort these steps.

Both @annevk and @jakearchibald have suggested that all 4xx responses should be interpreted as unregistering the SW instead of as an update failure. AppCache only treates a 404 or 410 as deleting the AppCache, and I think some other responses like 403 may be transient, so I'm only proposing to match AppCache. If y'all think all 4xx responses should unregister instead, I won't argue.

👍 Happy to unregister on 404/410 in line with appcache.

The main reason I suggested 4xx was that in that case the server can be reached and is active but apparently is no longer hosting an accessible service worker in that location.

With your suggested approach HTTP-only sites with a 403 space would be more vulnerable to attacks.

This is a proposal for HTTPS sites. Even unregistering on all 4xx errors wouldn't be enough for HTTP sites; see #199 for that.

I would unregister for all 4xx & 5xx errors. If a user agent receive them, both the client and the server are online so it's feasable to receive the updated ServiceWorker in the near future, and during this time the ServiceWorker served resources can be available directly by the server. I would only prevent to unregister the SW after the stated 24 hours life limit if the client is offline.

@jyasskin the same reasoning applies to HTTPS I think, apart from the attacks.

@piranna no, 5xx could be because the server is undergoing maintenance or is under heavy traffic or some such. Makes no sense to unregister there.

@annevk, I didn't take in account that situation. Could it make sense to automatically unregister on any 4xx error, and left th SW untouched until the 5xx error gets fixed so a decision can be done? I think it should fix the problems we are having here...

Yes, that follows from OP and subsequent comments.

Unregistration on 4xx is broken. It is hostile to intermittent data center downtime, etc. If you need this for security, use SSL.

Why would a data center return 4xx in that case and not 5xx?

IIUC, restricting to SSL obsoletes this issue.

What if a 404/410 triggered a SW unregistration, but only on localhost origins?

The decision to close this issue made assumptions that we were dealing with a production server, using HTTPS, but this has become a recurring pain point for developers who are using SWs on a common localhost:port combination, e.g. localhost:3000 or localhost:8000. As soon as the first project that uses a SW is registered, that SW will remain in control for subsequent navigations, even if the developer switches to a project that has no SW.

See facebook/create-react-app#2438, for one example of the attempts to work around this, but that type of one-off solution doesn't scale.

I think it's absolutely desirable to address this. I worry about the complexity and potential security fallout of putting the developer special-case in the heart of the spec, though. This seems like a use-case best handled by each browser's devtools. They can be more proactive with UI and heuristics than would ever make sense in the spec.

For example, an update check might take up to 24 hours to hit the network and trigger a 404/410-based uninstall if updateViaCache="all" (formerly useCache, #1107) and Cache-Control says the data is still good. But devtools can by default show a UI affordance in the URL bar that indicates the page was intercepted by a ServiceWorker against localhost, and for good measure throw up a door-hangar or panel resulting from a proactive update check that notices the SW is gone, etc.

This is also more likely to play well with devtools' affordances like Chrome's "Update on reload" checkbox option under the "Service Workers" page of the "Application" tab.

Having said that, there are header-based mitigations that tools that are running their own webserver on localhost could use to try and proactively mitigate the situation, such as is discussed in facebook/create-react-app#2438. For example, the link header could be used to install a "cleanup" service worker when the webserver gets a chance to serve the user a page because the refresh button was hit and bypassed the existing SW. I think there's also been talk of a header to let an origin request that its persistent storage be blown away?

Note, we have seen production servers orphaning service worker scripts on clients with a 404. We have many users continually trying to update:

https://plus.google.com/serviceworker.js

That's a 400 rather than a 404/410 which is a bit weird.

FWIW I'm still not convinced we made the right choice here, and would be happy to make 404/410 unregister.

I get that this may potentially trigger some false positives in event of a server glitch (#204 (comment)), but I feel these situations are rarer than issues with orphaning service worker scripts as @wanderview describes.

@slightlyoff is your view here unchanged?

For the orphaned production scripts would could unregister after N consecutive 404s or something. Or 404s over time period X.

That would not help with the localhost case, though.

Are intermittent 404s a thing that happens? Are they more common than server glitches we aren't defending against, such as serving a blank file.

Appcache unregistered on 404 and I don't recall any complaints about that, but they may have been drowned out by other complaints.

Re: #204 (comment), regarding whether specialized behavior for localhost might be best addressed via browsers' DevTools, that's another avenue that I'm exploring, with some FRs for Chrome:

(I haven't filed similar requests against Firefox.)

Are intermittent 404s a thing

I don't know. I defer to people running sites on this.

Note, if we do auto-unregister on 404, what do we do with data in the Cache API? Will this cause problems if a new service worker is registered on top of the old storage?

Note, if we do auto-unregister on 404, what do we do with data in the Cache API?

The Cache API is just part of origin storage, so I wouldn't expect the cache API to be removed.

The Cache API is just part of origin storage, so I wouldn't expect the cache API to be removed.

Sure, but does this solve the use case of "I tested this on localhost ages ago and I have a bunch of unexpected state on localhost messing up my localhost today"?

Still, it would be nice to fix the orphaned update case as that uses actually CPU, bandwidth, etc on real sites today.

Cache API could be used from main thread for many things. Will browsers browser be able to detect if cache is ServiceWorker only? Unless it will be stored as cache's metadata then probably not.

Sure, but does this solve the use case of "I tested this on localhost ages ago and I have a bunch of unexpected state on localhost messing up my localhost today"?

Only if the same (hard-coded?) cache name is used, but I see how this may happen too. SW tools should save from this.

Still, it would be nice to fix the orphaned update case as that uses actually CPU, bandwidth, etc on real sites today.

Can you share stats on how often that happens with how many websites? (assuming Google Plus SW isn't the only case)

Throwing in some other ideas:

  • Unregister on 404 only if page (on navigation)[1] doesn't call navigator.serviceWorker.register() anymore
  • Have an unregisterOnFailure option in navigator.serviceWorker.register() (I don't really like this)

[1] During a navigation, i.e. between current navigation and the next one in the same tab

Reopening as there's some interest.

I'm all for unregistering the service worker to avoid issues I've encountered in Create React App.

Beyond localhost issues described by @jeffposnick. What if I create an app that has a service-worker.js, deploy it on something.com. Then later on I released a new app without service workers on something.com

As it stands my new app, even though it has never touched a service worker would have to have something to deregister old service workers, or else every user would always see the old version of the app.

If the server responds with a 404 that is quite an obvious error of not finding a file and I have never encountered a server responding with 404 when it was actually a 500 error (but yes, it could happen under certain edge-case circumstances).

Even if it does, worst case someone loses access to the site temporarily (no more offline after 1st reload). On the other hand, worst case of ignoring 404 is someone is stuck forever seeing the incorrect webapp.

If the server responds with a 404 that is quite an obvious error of not finding a file and I have never encountered a server responding with 404 when it was actually a 500 error something..

I'm pretty sure there are such webservers, e.g. an entry is temporarily missing from database or just a glitch happened. But yes, that's probably not very common for static files such as SW file.

Even if it does, worst case someone loses access to the site temporarily (no more offline after 1st reload). On the other hand, worst case of ignoring 404 is someone is stuck forever seeing the incorrect webapp.

This sounds reasonable to me.

F2F: Facebook & Google servers will occasionally 404 due to bugs and do not want service workers to disappear because of this.

Service worker offers you enough tools & headers to see what's happening.

Jake promises never to bring this up again.

If we want to look at this in future, it should be in the form of an updatefailure event or some-such, that developers could use to monitor updates and unregister the service worker themselves if they get a series of 404s in a row.

This still seems somewhat sad. Even if 404 is triggered on accident, I have a hard time believing the same is true for 410.

Unregistering on 410 is fine, but is it useful? It doesn't fall with the "I removed the service worker and forgot about it" case. If you can put the effort into serving a 410, you can just create a service worker that unregisters itself.

A 410 seems easier to manage, but maybe Clear-Site-Data will also take care of this which is equally easy to manage.

I'm all for unregistering the service worker to avoid issues I've encountered in Create React App.

Beyond localhost issues described by @jeffposnick. What if I create an app that has a service-worker.js, deploy it on something.com. Then later on I released a new app without service workers on something.com

As it stands my new app, even though it has never touched a service worker would have to have something to deregister old service workers, or else every user would always see the old version of the app.

If the server responds with a 404 that is quite an obvious error of not finding a file and I have never encountered a server responding with 404 when it was actually a 500 error (but yes, it could happen under certain edge-case circumstances).

Even if it does, worst case someone loses access to the site temporarily (no more offline after 1st reload). On the other hand, worst case of ignoring 404 is someone is stuck forever seeing the incorrect webapp.

I can't agree more, this is likely to cause serious problems in the future. I don't think the fact that Google and Facebook don't want this is a solid argument.