Worse performance with new Fission compatible `drawSnapshot` API
Categories
(Core :: Graphics: Layers, defect, P3)
Tracking
()
Fission Milestone | Future |
People
(Reporter: whimboo, Unassigned, NeedInfo)
References
(Blocks 2 open bugs)
Details
(Keywords: perf, regression)
The following gecko profile shows the process of taking a full screenshot via Marionette and the new drawScreenshot
API for https://www.blueorigin.com/latest/gallery (1503 x 4938 pixels) :
Steps which takes a very long time:
- 5s to draw the snapshot
- ~4s to move the screenshot data to the parent process (1.2GB)
- ~4s in
toDataURL()
- Marionette: ~4s to transfer the base64 encoded data via a socket to the Marionette client
'
Matt, can you please have a look at the first 3 parts?
Reporter | ||
Comment 1•5 years ago
|
||
Using Puppeteer and running https://github.com/puppeteer/puppeteer/blob/master/examples/screenshot-fullpage.js with Chromium the following results can be seen:
$ time PUPPETEER_PRODUCT=chrome PUPPETEER_EXECUTABLE_PATH=.local-chromium/mac-706915/chrome-mac/Chromium.app/Contents/MacOS/Chromium NODE_PATH=../ node examples/screenshot-fullpage.js
PUPPETEER_PRODUCT=chrome PUPPETEER_EXECUTABLE_PATH= NODE_PATH=../ node 4.53s user 1.91s system 52% cpu 12.366 total
$ file full.png
full.png: PNG image data, 750 x 11018, 8-bit/color RGBA, non-interlaced
So it takes 12s in total including the page load.
Comment 2•5 years ago
|
||
:whimboo, since this bug is a regression, could you fill (if possible) the regressed_by field?
For more information, please visit auto_nag documentation.
Reporter | ||
Comment 3•5 years ago
|
||
For Marionette specifically it regressed with bug 1559592, but the problem actually is with the implmentation of the new API. Given that this is a core issue not sure what I would have to set.
Updated•5 years ago
|
Updated•5 years ago
|
Updated•4 years ago
|
Comment 5•4 years ago
|
||
Matt, do you know if this is still an issue we need to track or has this been fixed already?
Comment 6•4 years ago
|
||
I don't think we need to block on this, but no it hasn't been fixed.
Reporter | ||
Comment 7•4 years ago
|
||
Maybe this is also related to what we can see on bug 1559592. Creating snapshots from pages including Apple emojis cause a poor performance in creating page thumbnails for Ctrl-Tab on MacOS.
Comment 8•4 years ago
|
||
I think the best thing to do here is to record into file handles, rather than shmem.
The current code records into normal memory (reallocating to grow as needed if the recording is big), copies into shmem, then copies the shmem back into memory in the parent process. Recording directly into a file would remove the reallocations, and both of those copies.
We already do the same thing for printing recording, using DrawEventRecorderPRFileDesc instead of DrawEventRecorderMemory.
We'd need CrossProcessPaint to allocate a temporary file for each content process it wants to record, and send that along with the request to record. We could then instantiate DrawEventRecorderPRFileDesc using the file handle, record into that, and then just notify the parent that we've completed (and it can read from the file).
Reporter | ||
Comment 10•4 years ago
|
||
(In reply to Matt Woodrow (:mattwoodrow) from comment #9)
Sadly the builds for MacOS opt are busted. So I don't have binaries to download. Re-triggered the build job doesn't help due to some TaskCluster failures. Matt, could you please push a new try build, or at least test yourself locally? The Marionette test that I'm using for the measurement is:
from marionette_harness import MarionetteTestCase
class Test(MarionetteTestCase):
def test_screenshot(self):
self.marionette.navigate("https://www.blueorigin.com/latest/gallery")
screenshot = self.marionette.screenshot(full=True)
Then it can be run via MOZ_PROFILER_STARTUP=1 MOZ_PROFILER_SHUTDOWN=marionette_profile.json mach marionette-test test.py
.
Here a profile with the most recent Firefox Nightly build, which looks even worse compared to the build that I used when filing this bug:
https://share.firefox.dev/3rc6vAG
So I'm kinda looking forward to see performance improvements here.
Reporter | ||
Comment 11•4 years ago
|
||
And one more thing, could we actually get a performance test for the drawSnapshot
API?
Comment 12•4 years ago
|
||
Looks like this one built properly: https://treeherder.mozilla.org/jobs?repo=try&tier=1%2C2%2C3&revision=998c2bbb7ce165c2cfb892270f3d29420c9ea08f
Reporter | ||
Comment 13•4 years ago
|
||
The improvements here seem to be marginal: https://share.firefox.dev/3r8HXbQ.
It still takes a long time, at least given the duration of Task PWindowGlobal::Msg_DrawSnapshot
which reports 11s compared to 7.5s before.
Comment 14•4 years ago
|
||
(In reply to Henrik Skupin (:whimboo) [⌚️UTC+1] from comment #13)
The improvements here seem to be marginal: https://share.firefox.dev/3r8HXbQ.
Same profile with symbols: https://share.firefox.dev/3pIqBT0 (and here with CPU data visible).
Comment 15•4 years ago
|
||
Interesting, those profiles no longer show time spent in unnecessary copies :( Is the total time worse than before? I guess the overhead of using files is worse than just copying.
Back to the drawing board I guess!
Reporter | ||
Comment 16•4 years ago
|
||
Here some numbers as comparison to Chrome, which contain not only the drawSnapshot
time, but everything including the base64 encoding and the transmission of the screenshot over the socket to geckodriver. In this case only for the visible viewport:
Firefox (current mozilla-central):
- Time before screenshot: 1612298483.7496388
- Time after screenshot: 1612298485.118722
Screenshot: 2560x1636
Firefox (try build with your changes):
- Time before screenshot: 1612298848.8437572
- Time after screenshot: 1612298850.0715609
Screenshot: 2560x1636
Chrome:
- Time before screenshot: 1612298380.4292579
- Time after screenshot: 1612298382.065851
Screenshot: 2400x1588
So actually for images of this size your new code path is actually faster, and we also beat Chrome by around 0.4s. Here a profile for that particular scenario.
So maybe creating screenshots of the full page (which is huge!) triggers some slow code path?
Reporter | ||
Comment 17•4 years ago
|
||
I can have a look for a comparison between Firefox and Chrome for a full page screenshot tomorrow. I would have to do it via Puppeteer and the CDP protocol.
Updated•2 years ago
|
Updated•2 years ago
|
Description
•