Closed Bug 1606652 Opened 5 years ago Closed 4 years ago

Off thread parse scripts as soon as they are fetched

Categories

(Core :: Performance, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla79
Performance Impact high
Tracking Status
firefox73 --- wontfix
firefox79 --- fixed

People

(Reporter: denispal, Assigned: denispal)

References

(Regressed 1 open bug)

Details

(Keywords: perf:pageload)

Attachments

(3 files, 2 obsolete files)

Many scripts are currently parsed synchronously right before execution. This is an unnecessary blocker on the main thread. We should instead consider parsing them off thread immediately when the fetch is complete.

Data we've collected suggests there is ample time between a fetch end and the execution start of most scripts indicating that much of the parsing can be done off-thread and be ready by the time the script needs to be executed.

Assignee: nobody → dpalmeiro

A blocker for this work is that the script dom element is currently passed to the JS parser during compilation. This element is usually not ready when the script is fetched since most of the fetches are done speculatively before the dom parser even gets to the script tag. Therefore, bug 1501608 needs to be addressed first by removing the dom element parameter that is passed to the compiler and replacing it with a callback instead.

Depends on: 1501608
Attached file WIP: Off-thread speculative parsing prototype (obsolete) (deleted) —

WIP: Off-thread speculative parsing prototype

Attachment #9118386 - Attachment description: Retrieve DOM script element from a callback function → WIP: Off-thread speculative parsing prototype

Preliminary results from the prototype from try: https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=51dcc2edecadad8b21cb7d64b514761d98ab608c&newProject=try&newRevision=1679c0f7b8f0bccbe6caff8d1f4250407192642e&framework=10

Summary is many improvements in the raptor tests ranging from +3%-30%. Average appears to be around 20%. There is also a notable regression in the facebook test which needs to investigated.

Attached file Pixel2_performance_results.txt (deleted) —

Attaching browsertime visual metrics results for the Pixel 2. Overall geomean is about a 3% improvement. Similar results are seen on Desktop Windows 10 on the reference laptop and a higher end laptop.

Whiteboard: [qf:p1:pageload]

ok, that is colorful

Summary of the changes required:

  • Remove the ELEMENT_SLOT property of the SSO

  • Add a callback function in Scriptloader that will return the script dom element when given the loaded script

  • Some scripts are compiled directly from the Event Listener Manager, so add a new loadedscript derivation for the Event Listener Manager.

  • Change the mElement type ScriptLoadRequest from nsIScriptElement to Element. This is because we do not have access to an nsIScriptElement from the Event listener manager.

  • Remove mParserBlockingRequest restriction for off thread parsing. This should allow all fetched scripts to immediately begin off thread parsing via OnStreamComplete->PrepareLoadedRequest->AttemptAsyncScriptCompile

  • We now need to explicitly check for the dom element in ProcessOffThreadRequest because it is very likely that we have finished compilation before the dom parser has the element ready. Previously, we only off thread parsed parser blocking scripts so the element was always available when we reached ProcessOffThreadRequest. If the element is not ready, then the request should be eventually processed when it becomes a parser blocking request or from one of the request lists (e.g. mLoadedAsyncRequests) via ProcessPendingRequests.

  • We need multiple checks in ProcessPendingRequests for mWasCompiledOMT so we can properly unblock on load. I think this may be best done one time in ProcessRequest but I didn't get around to making that change.

  • In LookupPreloadRequest, we need to call SetReady if the script was compiled off thread speculatively since the element is now available. Normally, this is done during processing but if the element was not ready at the time, then we were not able to process the script. If we reached LookupPreloadRequest then the script has become a parser blocking script so we can process it properly.

Known Problems: The CC is deleting the ScriptLoadRequest object from the LoadedScript while we still need it alive. This looks to several SEGV issues since the callback will try to dereference a nullptr when called from the debugger. We could potentially work around this by moving the mElement object outside of the ScriptLoadRequest and into the LoadedScript and other classes that use it. It makes things a bit less organized, but the CC problem may disappear (although it's still important to know what's going on) and it will also let us allocate much less memory when creating ELMScript objects since we don't need the ScriptLoadRequest there.

Risks: I did my best making sure that the execution order of all defer, async, and blocking scripts is maintained. If there is a bug somewhere and we are not executing some scripts that we should be, then that could be a very likely reason for the performance improvements seen.

Smaug: Trying to fix assertions
The main change was to revert changes to ScriptLoader::AddAsyncRequest
That fixed, IIRC, the assertions within ScriptLoader.

   Known assertions are coming from JS engine during shutdown
   Something like "Assertion failure: isEmpty() (failing this assertion means this LinkedList's creator is buggy: it should have removed all this list's elements before the list's destruction), at /builds/worker/workspace/build/src/obj-firefox/dist/include/mozilla/LinkedList.h:434"
   Debug 64bit linux + wpt tests seem to trigger that.
Attached file speculative_parsing_S7E.txt (deleted) —

Results from live sites using gve on the S7E.

Attachment #9118386 - Attachment is obsolete: true
Attachment #9127280 - Attachment is obsolete: true
Attachment #9127280 - Attachment is obsolete: false
Attachment #9127280 - Attachment is obsolete: true

The changes proposed here will speculatively parse all external scripts off thread as soon as they are fetched. The exception is non-async inserted scripts which may potentially degrade onLoad event by quite a bit. This should save us some time since currently all scripts are parsed right before execution or while they are blocking the dom parser.

tp6 results can be found here: https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=84a674ba8b2bd5bd990b87fbbea7db7c1d2fc1c7&newProject=try&newRevision=a22e228351b595688235acc83bdde2b90dfbe429&framework=10

There is a regression for fnbp in raptor-tp6-facebook-firefox due to a different execution order of async scripts. Unfortunately, I think this is unavoidable right now until we can properly prioritize async script execution, unless we try to replicate the exact same behavior as before for async scripts which means we also parse them later. I performed visual metric testing on the live site itself and did not see any regressions, however.

Pushed by dpalmeiro@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/fc5808ee37a1 Speculatively off thread parse external scripts as soon as they are fetched. r=smaug
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla79
Regressions: 1634641
Regressions: 1645746
Regressions: 1265637
Regressions: 1646793
Regressions: 1649765
Regressions: 1650745
Regressions: 1652126
Regressions: 1663051
Performance Impact: --- → P1
Keywords: perf:pageload
Whiteboard: [qf:p1:pageload]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: