Off thread parse scripts as soon as they are fetched
Categories
(Core :: Performance, enhancement)
Tracking
()
People
(Reporter: denispal, Assigned: denispal)
References
(Regressed 1 open bug)
Details
(Keywords: perf:pageload)
Attachments
(3 files, 2 obsolete files)
Many scripts are currently parsed synchronously right before execution. This is an unnecessary blocker on the main thread. We should instead consider parsing them off thread immediately when the fetch is complete.
Data we've collected suggests there is ample time between a fetch end and the execution start of most scripts indicating that much of the parsing can be done off-thread and be ready by the time the script needs to be executed.
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Comment 1•5 years ago
|
||
A blocker for this work is that the script dom element is currently passed to the JS parser during compilation. This element is usually not ready when the script is fetched since most of the fetches are done speculatively before the dom parser even gets to the script tag. Therefore, bug 1501608 needs to be addressed first by removing the dom element parameter that is passed to the compiler and replacing it with a callback instead.
Assignee | ||
Comment 2•5 years ago
|
||
WIP: Off-thread speculative parsing prototype
Updated•5 years ago
|
Assignee | ||
Comment 3•5 years ago
|
||
Preliminary results from the prototype from try: https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=51dcc2edecadad8b21cb7d64b514761d98ab608c&newProject=try&newRevision=1679c0f7b8f0bccbe6caff8d1f4250407192642e&framework=10
Summary is many improvements in the raptor tests ranging from +3%-30%. Average appears to be around 20%. There is also a notable regression in the facebook test which needs to investigated.
Assignee | ||
Comment 4•5 years ago
|
||
Attaching browsertime visual metrics results for the Pixel 2. Overall geomean is about a 3% improvement. Similar results are seen on Desktop Windows 10 on the reference laptop and a higher end laptop.
Updated•5 years ago
|
Comment 5•5 years ago
|
||
Pushing to try since I don't know whether the patch works with the tests,
https://treeherder.mozilla.org/#/jobs?repo=try&revision=f900ff3e7b08912a6a674d9a799157a610d509de
Comment 6•5 years ago
|
||
ok, that is colorful
Comment 7•5 years ago
|
||
Summary of the changes required:
-
Remove the ELEMENT_SLOT property of the SSO
-
Add a callback function in Scriptloader that will return the script dom element when given the loaded script
-
Some scripts are compiled directly from the Event Listener Manager, so add a new loadedscript derivation for the Event Listener Manager.
-
Change the mElement type ScriptLoadRequest from nsIScriptElement to Element. This is because we do not have access to an nsIScriptElement from the Event listener manager.
-
Remove mParserBlockingRequest restriction for off thread parsing. This should allow all fetched scripts to immediately begin off thread parsing via OnStreamComplete->PrepareLoadedRequest->AttemptAsyncScriptCompile
-
We now need to explicitly check for the dom element in ProcessOffThreadRequest because it is very likely that we have finished compilation before the dom parser has the element ready. Previously, we only off thread parsed parser blocking scripts so the element was always available when we reached ProcessOffThreadRequest. If the element is not ready, then the request should be eventually processed when it becomes a parser blocking request or from one of the request lists (e.g. mLoadedAsyncRequests) via ProcessPendingRequests.
-
We need multiple checks in ProcessPendingRequests for mWasCompiledOMT so we can properly unblock on load. I think this may be best done one time in ProcessRequest but I didn't get around to making that change.
-
In LookupPreloadRequest, we need to call SetReady if the script was compiled off thread speculatively since the element is now available. Normally, this is done during processing but if the element was not ready at the time, then we were not able to process the script. If we reached LookupPreloadRequest then the script has become a parser blocking script so we can process it properly.
Known Problems: The CC is deleting the ScriptLoadRequest object from the LoadedScript while we still need it alive. This looks to several SEGV issues since the callback will try to dereference a nullptr when called from the debugger. We could potentially work around this by moving the mElement object outside of the ScriptLoadRequest and into the LoadedScript and other classes that use it. It makes things a bit less organized, but the CC problem may disappear (although it's still important to know what's going on) and it will also let us allocate much less memory when creating ELMScript objects since we don't need the ScriptLoadRequest there.
Risks: I did my best making sure that the execution order of all defer, async, and blocking scripts is maintained. If there is a bug somewhere and we are not executing some scripts that we should be, then that could be a very likely reason for the performance improvements seen.
Smaug: Trying to fix assertions
The main change was to revert changes to ScriptLoader::AddAsyncRequest
That fixed, IIRC, the assertions within ScriptLoader.
Known assertions are coming from JS engine during shutdown
Something like "Assertion failure: isEmpty() (failing this assertion means this LinkedList's creator is buggy: it should have removed all this list's elements before the list's destruction), at /builds/worker/workspace/build/src/obj-firefox/dist/include/mozilla/LinkedList.h:434"
Debug 64bit linux + wpt tests seem to trigger that.
Assignee | ||
Comment 8•5 years ago
|
||
Results from live sites using gve on the S7E.
Updated•5 years ago
|
Updated•5 years ago
|
Assignee | ||
Comment 9•5 years ago
|
||
The changes proposed here will speculatively parse all external scripts off thread as soon as they are fetched. The exception is non-async inserted scripts which may potentially degrade onLoad event by quite a bit. This should save us some time since currently all scripts are parsed right before execution or while they are blocking the dom parser.
Assignee | ||
Comment 10•5 years ago
|
||
tp6 results can be found here: https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=84a674ba8b2bd5bd990b87fbbea7db7c1d2fc1c7&newProject=try&newRevision=a22e228351b595688235acc83bdde2b90dfbe429&framework=10
There is a regression for fnbp in raptor-tp6-facebook-firefox due to a different execution order of async scripts. Unfortunately, I think this is unavoidable right now until we can properly prioritize async script execution, unless we try to replicate the exact same behavior as before for async scripts which means we also parse them later. I performed visual metric testing on the live site itself and did not see any regressions, however.
Comment 11•4 years ago
|
||
Comment 12•4 years ago
|
||
bugherder |
Updated•4 years ago
|
Assignee | ||
Comment 13•4 years ago
|
||
Latest results should be roughly similar to this run:
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=b66d0ed5382ee7b68ea075eead4c11de5eb9d06d&newProject=try&newRevision=90e3c352c32e8c69321672596a464a2c1fa7e6ce&framework=10&pageTitle=Performance+Results+for+bug+1606652
Assignee | ||
Updated•4 years ago
|
Updated•3 years ago
|
Description
•