Closed
Bug 1448141
Opened 7 years ago
Closed 7 years ago
fix service worker ScriptLoader issues
Categories
(Core :: DOM: Service Workers, defect, P2)
Core
DOM: Service Workers
Tracking
()
RESOLVED
FIXED
mozilla61
People
(Reporter: bkelly, Assigned: bkelly)
References
Details
Attachments
(5 files, 12 obsolete files)
(deleted),
patch
|
asuth
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
asuth
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
baku
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
asuth
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
bkelly
:
review+
|
Details | Diff | Splinter Review |
Splitting this off from bug 1406547 comment 19. Currently if ScriptLoader cannot find a service worker's scripts in CacheStorage it will just load them over the network and store them again. This is a bit bugged because:
1. The registration should not exist any more if quota is wiped by QM. (bug 1183245)
2. The service worker script may change without triggering new install/activate events. This could break the site.
3. The ScriptLoader fallback code is buggy. For example in bug 1406547 we discovered it allows scripts to be redirected, which is not correct for service workers.
Rather than do this broken fallback we will instead unregister the service worker in this case. This will emulate what should happen when a QM wipe occurs.
Updated•7 years ago
|
Priority: -- → P2
Assignee | ||
Comment 1•7 years ago
|
||
Hmm... apparently we use this "fallback" path for the initial installation of the scripts as well. The offlining in ServiceWorkerScriptCache is only for the update path.
So there are a couple problems here.
1. We don't block redirects like we're supposed to in this path AFAICT.
2. We need to support initial installation, but not reloading the script later.
Assignee | ||
Updated•7 years ago
|
Summary: unregister service worker if ScriptLoader cannot find its scripts in CacheStorage → fix service worker ScriptLoader issues
Assignee | ||
Comment 2•7 years ago
|
||
Assignee | ||
Comment 3•7 years ago
|
||
Assignee | ||
Comment 4•7 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=41eb0ad43974ed9a3a684b930ce94c31285988dc
I still need to add some logic to unregister the service worker if we hit the network when we're not first evaluating.
Assignee | ||
Comment 5•7 years ago
|
||
I think the best thing to do here:
1. Let the fetch event fail and reset to network.
2. The update should then run.
3. If the update fails, then clear the registration.
So I will add logic in the update code to clear the registration if it fails when the main script is missing from the old cache.
Assignee | ||
Comment 6•7 years ago
|
||
Attachment #8961546 -
Attachment is obsolete: true
Assignee | ||
Comment 7•7 years ago
|
||
Attachment #8961550 -
Attachment is obsolete: true
Assignee | ||
Comment 8•7 years ago
|
||
Assignee | ||
Comment 9•7 years ago
|
||
Attachment #8961920 -
Attachment is obsolete: true
Assignee | ||
Comment 10•7 years ago
|
||
This avoids the crash in bug 1406547, but runs afoul the "removed a registration controlling clients" assertion.
Blocks: 1406547
Assignee | ||
Comment 11•7 years ago
|
||
Perhaps we can avoid the most common situation by moving this code:
https://searchfox.org/mozilla-central/rev/56274d0a016a6833e5da7770ce70580b6f5abb21/dom/serviceworkers/ServiceWorkerManager.cpp#2410-2474
Right now this code starts controlling the reserved/initial client before we realize the service worker is corrupted. If we move this later, then we can avoid controlling the client at all.
On the other hand, that won't help if there are other windows already open with the given service worker controller. I think that situation is unlikely, though, given if there are other windows already controlled then the service worker is most likely *not* corrupted.
So I will try moving the "control this loading client" code to after ServiceWorkerPrivate determines that things are ok to proceed.
Also, I need to write a test for the one method I have to reproduce this crash.
Assignee | ||
Comment 12•7 years ago
|
||
I investigated a bit more why we still need the code in ScriptLoader to store some scripts. I was having trouble understanding why ServiceWorkerScriptCache was inadequate.
The answer is that ServiceWorkerScriptCache always offlines the main script. It does know about any new importScripts(), however, because the script must be evaulated to detect those. It only offlines importScripts() it has seen from previous versions of the script.
So the ScriptLoader code to store in cache API is only really used for importScripts.
Assignee | ||
Comment 13•7 years ago
|
||
Attachment #8961917 -
Attachment is obsolete: true
Assignee | ||
Comment 14•7 years ago
|
||
We don't need to block redirecting service worker scripts in ScriptLoader since we should never be loading them over network there.
Attachment #8961919 -
Attachment is obsolete: true
Attachment #8962356 -
Attachment is obsolete: true
Assignee | ||
Comment 15•7 years ago
|
||
Attachment #8961932 -
Attachment is obsolete: true
Assignee | ||
Comment 16•7 years ago
|
||
I tried doing comment 12, but its turning out to be quite thorny. There are a number of paths we have to handle and ScriptLoader doesn't do exactly what we want on error.
I'm going to investigate the "force clients to be uncontrolled" approach instead. I don't like it as much since it means doing something sites don't expect. Its also possible for us to accidentally abuse it in the future. It might be the most reasonable thing to do for now, though.
Assignee | ||
Comment 17•7 years ago
|
||
Actually, I think I can just loosen the assertion if we are in the "forced uninstall due to corrupt storage" state. That seems to work ok.
Assignee | ||
Comment 18•7 years ago
|
||
Attachment #8962395 -
Attachment is obsolete: true
Assignee | ||
Comment 19•7 years ago
|
||
Just need to write some tests now.
Assignee | ||
Comment 20•7 years ago
|
||
Assignee | ||
Comment 21•7 years ago
|
||
Assignee | ||
Comment 22•7 years ago
|
||
Assignee | ||
Comment 23•7 years ago
|
||
Comment on attachment 8962394 [details] [diff] [review]
P1 Only load service worker importScripts() from the network in ScriptLoader on first evaluation. r=asuth
Andrew, we currently have two different ways we load and store service worker scripts for offline usage:
1. The ServiceWorkerScriptCache code loads/stores the main script and any importScripts() it knows about. That is, any importScripts() seen in the last version of the registration.
2. The ScriptLoader also loads/stores importScripts(). It does this for importScripts() that ServiceWorkerScriptCache hasn't seen yet. Basically, we do it here because we need the ScriptLoader to be evaluating the script to find the importScripts() call.
The problem is that the ScriptLoader path in (2) could also be triggered for main scripts if the CacheStorage data went missing. This was bad for a couple reasons. First, the new script could be different and not trigger install/activate events. Second, this path did not protect against redirects.
This patch closes off the ScriptLoader (2) path for main scripts and importScripts() after the initial installation phase.
Attachment #8962394 -
Flags: review?(bugmail)
Assignee | ||
Comment 24•7 years ago
|
||
Comment on attachment 8962739 [details] [diff] [review]
P2 Uninstall the service worker registration if update fails while its scripts are missing from offline storage. r=asuth
So the P1 patch causes us to fail to load a broken service worker script. What do we do then?
Well, every navigation will trigger an update whether it fired a FetchEvent or not. With that in mind we can handle the problem during update.
This patch causes us to note if the old scripts are missing. If they are we set a flag indicating that we're in a bad state. If the update can get the new script then we allow it to install. In addition, we force mark it as skip-waiting to get rid of the broken service worker as soon as possible.
If the update fails and we've set our "OMG its broken" flag, then we are in a bad place. For lack of anything better to do we force uninstall the registration.
Assignee | ||
Comment 25•7 years ago
|
||
Comment on attachment 8962739 [details] [diff] [review]
P2 Uninstall the service worker registration if update fails while its scripts are missing from offline storage. r=asuth
See comment 24.
Attachment #8962739 -
Flags: review?(bugmail)
Assignee | ||
Comment 26•7 years ago
|
||
Comment on attachment 8962875 [details] [diff] [review]
P3 Add a browser chrome test verifying service workers handles its scripts being wiped. r=asuth
This patch adds a test that verifies:
1. We trigger a real update if the underlying scripts went missing and we had to reinstall.
2. We remove the registration if the update fails.
3. A redirect will not succeed as an update.
Attachment #8962875 -
Flags: review?(bugmail)
Assignee | ||
Updated•7 years ago
|
Attachment #8962876 -
Attachment is obsolete: true
Assignee | ||
Comment 27•7 years ago
|
||
The test here is hitting bug 1447300 on the verify run. I guess I'll fix that as well.
Depends on: 1447300
Assignee | ||
Comment 28•7 years ago
|
||
Assignee | ||
Comment 29•7 years ago
|
||
I'd like to get this into 60 beta if I can. The sooner we can catch these corrupted storage issues the better.
status-firefox59:
--- → wontfix
status-firefox60:
--- → affected
status-firefox61:
--- → affected
status-firefox-esr52:
--- → disabled
Assignee | ||
Comment 30•7 years ago
|
||
Attachment #8963120 -
Attachment is obsolete: true
Comment 31•7 years ago
|
||
Comment on attachment 8962394 [details] [diff] [review]
P1 Only load service worker importScripts() from the network in ScriptLoader on first evaluation. r=asuth
Review of attachment 8962394 [details] [diff] [review]:
-----------------------------------------------------------------
Excellent comments, thank you.
Attachment #8962394 -
Flags: review?(bugmail) → review+
Comment 32•7 years ago
|
||
Comment on attachment 8962739 [details] [diff] [review]
P2 Uninstall the service worker registration if update fails while its scripts are missing from offline storage. r=asuth
Review of attachment 8962739 [details] [diff] [review]:
-----------------------------------------------------------------
Really great comments in here too.
::: dom/serviceworkers/ServiceWorkerUpdateJob.h
@@ +13,5 @@
> namespace mozilla {
> namespace dom {
>
> +namespace serviceWorkerScriptCache {
> +enum class OnFailure : uint8_t;
This is a weird thing to forward declare given how tiny ServiceWorkerScriptCache.h is (it only includes nsString.h, etc.), but okay. If the intent is to hide the actual enum values from this file and its includers, please do say that in a comment though.
Attachment #8962739 -
Flags: review?(bugmail) → review+
Assignee | ||
Comment 33•7 years ago
|
||
Comment 34•7 years ago
|
||
Comment on attachment 8962875 [details] [diff] [review]
P3 Add a browser chrome test verifying service workers handles its scripts being wiped. r=asuth
Review of attachment 8962875 [details] [diff] [review]:
-----------------------------------------------------------------
Fantastic, very readable and well-commented test. I suggest a few additional comments for my own laziness in the future.
::: dom/serviceworkers/test/browser_storage_recovery.js
@@ +1,2 @@
> +"use strict";
> +
I suggest adding a file-level block comment up here before the constants that explains things a bit more, like:
This test registers a SW for a scope that will never control a document and therefore never trigger a "fetch" functional event that would automatically attempt to update the registration. The overlap of the PAGE_URI and SCOPE is incidental. checkForUpdate is the only thing that will trigger an update of the registration and so there is no need to worry about Schedule Job races to coalesce an update job.
@@ +12,5 @@
> + return !!reg.installing;
> + });
> +}
> +
> +async function wipeStorage(u) {
I suggest adding a block comment like as follows. I do think your line comments before the invocations are great for experts, but generally worry about code like this getting cargo-culted without understanding:
Delete all of our chrome-namespace Caches for this origin, leaving any content-owned caches in place. This is exclusively for simulating loss of the origin's storage without loss of the registration and without having to worry that future enhancements to QuotaClients/ServiceWorkerRegistrar will break this test. If you want to wipe storage for an origin, use QuotaManager APIs
::: dom/serviceworkers/test/storage_recovery_worker.sjs
@@ +14,5 @@
> + setState("redirect", "true");
> + }
> +
> + response.setHeader("Content-Type", "application/javascript");
> + response.write("");
For byte-comparison paranoia purposes, would it better for the body to be "// I am not empty.js!" or something like that (since empty.js is an empty string too)?
Attachment #8962875 -
Flags: review?(bugmail) → review+
Assignee | ||
Comment 35•7 years ago
|
||
Comment on attachment 8963158 [details] [diff] [review]
P4 Immediately start closing worker scripts if their initial compilation fails. r=baku
Andrea, this causes us to start closing the worker thread when we notice the initial worker script failed to compile.
Attachment #8963158 -
Flags: review?(amarchesini)
Assignee | ||
Comment 36•7 years ago
|
||
Comment on attachment 8963341 [details] [diff] [review]
P5 Remove the service worker script load failure runnable. r=asuth
Andrew, this removes the fail runnable from the service worker startup path. It turns out its really not needed and is causing duplicate callbacks. If the script fails to compile the thread is still started and our various WorkerRunnables still get dispatched there. They can then notice the worker thread is shutting down themselves and report back their own callback.
Attachment #8963341 -
Flags: review?(bugmail)
Assignee | ||
Comment 37•7 years ago
|
||
I need to tweak the test to explicitly disable the script redirecting. I think the --verify test runs are confusing the .sjs state handling a bit somehow.
Assignee | ||
Comment 38•7 years ago
|
||
(In reply to Andrew Sutherland [:asuth] from comment #34)
> For byte-comparison paranoia purposes, would it better for the body to be
> "// I am not empty.js!" or something like that (since empty.js is an empty
> string too)?
This shouldn't be necessary because redirects are an error.
Assignee | ||
Comment 39•7 years ago
|
||
Attachment #8962875 -
Attachment is obsolete: true
Attachment #8963650 -
Flags: review+
Assignee | ||
Comment 40•7 years ago
|
||
Assignee | ||
Comment 41•7 years ago
|
||
Those last TV failures seem to have been due to the redirect being cached in http cache.
Attachment #8963650 -
Attachment is obsolete: true
Assignee | ||
Updated•7 years ago
|
Attachment #8963711 -
Flags: review+
Updated•7 years ago
|
Attachment #8963341 -
Flags: review?(bugmail) → review+
Updated•7 years ago
|
Attachment #8963158 -
Flags: review?(amarchesini) → review+
Assignee | ||
Comment 42•7 years ago
|
||
(In reply to Andrew Sutherland [:asuth] from comment #32)
> > +namespace serviceWorkerScriptCache {
> > +enum class OnFailure : uint8_t;
>
> This is a weird thing to forward declare given how tiny
> ServiceWorkerScriptCache.h is (it only includes nsString.h, etc.), but okay.
> If the intent is to hide the actual enum values from this file and its
> includers, please do say that in a comment though.
I'd like to leave this as is. We forward declare enums in other places in the tree. Even if the header is small today, it will get more stuff added in the future. Forward declaring whenever possible is about de-coupling header dependencies and I think is a good thing as a general principal.
Assignee | ||
Comment 43•7 years ago
|
||
Give risk from P4 and P5 I'd prefer to just let this ride the trains.
Comment 44•7 years ago
|
||
Pushed by bkelly@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/0b60b1f171f2
P1 Only load service worker importScripts() from the network in ScriptLoader on first evaluation. r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/2a78a425ee42
P2 Uninstall the service worker registration if update fails while its scripts are missing from offline storage. r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/e1a568d82d1c
P3 Add a browser chrome test verifying service workers handles its scripts being wiped. r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/cb8d4e3b8d01
P4 Immediately start closing worker scripts if their initial compilation fails. r=baku
https://hg.mozilla.org/integration/mozilla-inbound/rev/c22501d278b4
P5 Remove the service worker script load failure runnable. r=asuth
Comment 45•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/0b60b1f171f2
https://hg.mozilla.org/mozilla-central/rev/2a78a425ee42
https://hg.mozilla.org/mozilla-central/rev/e1a568d82d1c
https://hg.mozilla.org/mozilla-central/rev/cb8d4e3b8d01
https://hg.mozilla.org/mozilla-central/rev/c22501d278b4
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla61
You need to log in
before you can comment on or make changes to this bug.
Description
•