Closed Bug 1348280 Opened 8 years ago Closed 7 years ago

Thumbnail causes jank on sites with a requestAnimationFrame

Categories

(Firefox :: New Tab Page, defect)

55 Branch
defect
Not set
normal

Tracking

()

RESOLVED FIXED
Firefox 57
Performance Impact high
Tracking Status
firefox55 --- wontfix
firefox57 --- fixed

People

(Reporter: gregtatum, Assigned: bytesized)

References

(Blocks 1 open bug)

Details

Attachments

(3 files)

Profile: https://perf-html.io/public/0b8e110a65348992a16fe6ba8c16b22bf24d8594/timeline/?range=8.0482_9.8080~8.6736_9.1553&thread=2 Profiled site: http://gregtatum.com/poems/wandering-lines/1/ On my Macbook there is a 68ms delay on the content thread when the screenshot is taken for the New Tab page. I've noticed that this happens quite frequently when I'm working on visualizations and loading the page. This translates into about 4-5 dropped frames on my computer anytime this happens, which gives the user perception of slowness and jankiness in the browser that is unrelated to the user's actual implementation. I think a fix for this would be to find a way to move this work to a separate thread if possible. Most of the time is spent in actually reading the pixels values out.
Whiteboard: [qf]
Summary: Screenshot causes jank on sites with a requestAnimationFrame → Thumbnail causes jank on sites with a requestAnimationFrame
Hey felipe, I hope it's cool that I'm putting this under bug 1363771 for now, since it's jank from the front-end code that can occur kind of randomly. Not actually sure what we can do here in the short term - although OMTP would probably help. Hey mchang, out of curiosity, would we need to do anything special with the thumbnailing code to take advantage of OMTP? Thumbnailing code in question http://searchfox.org/mozilla-central/rev/b7e531410ebc1c7d74a78e86a894e69608db318c/toolkit/components/thumbnails/PageThumbUtils.jsm#182-192
Flags: needinfo?(mchang)
Whiteboard: [qf] → [qf:p1]
(In reply to Mike Conley (:mconley) - Digging out of needinfo / review backlog. from comment #1) > Hey felipe, I hope it's cool that I'm putting this under bug 1363771 for > now, since it's jank from the front-end code that can occur kind of randomly. > > Not actually sure what we can do here in the short term - although OMTP > would probably help. > > Hey mchang, out of curiosity, would we need to do anything special with the > thumbnailing code to take advantage of OMTP? > > Thumbnailing code in question > http://searchfox.org/mozilla-central/rev/ > b7e531410ebc1c7d74a78e86a894e69608db318c/toolkit/components/thumbnails/ > PageThumbUtils.jsm#182-192 I don't think it would help in this case. For now, we're going to still render canvas/webgl on the main thread and IIRC, drawWindow has to do the readback. Readback is slow :(.
Flags: needinfo?(mchang)
I'll note that bug 1353584 is changing a thumbnail call to use requestIdleCallback (I'm not sure if it's the same code path that triggers the thumbnail capture here, but it's possible). It won't make anything faster, but it would ensure that the thumbnail is not captured during the animation and make it jank. We should probably retest the reported testcase here once bug 1353584 lands
I have been taking a look at this bug and trying to figure out how to approach it. As far as I can tell, this work cannot easily be moved off thread. The only ideas that I have gotten so far are trying to reduce the frequency of the and increase the probability that capture occurs during idle time. @mconley Are there any additional avenues I should explore?
Assignee: nobody → ksteuber
Flags: needinfo?(mconley)
@gregtatum Can you still reproduce this? I can see the offending code show up in my profiles, but I only see about 2ms delay instead of the >50ms that your profile shows.
Flags: needinfo?(gtatum)
The shortest path to success might be in using requestIdleCallback here: http://searchfox.org/mozilla-central/rev/c329d562fb6c6218bdb79290faaf015467ef89e2/toolkit/components/thumbnails/content/backgroundPageThumbsContent.js#163-178 To increase the likelihood that the user isn't busy when the capture occurs. I suspect that'd go a long way. Another thing we could try to do is to have the docshell be not-active _except_ when we want to do the capture. That way, when the page for thumbnailing is loading, we don't do intermediate paints while it's sorting itself out. For that last idea, we could: 1) Set browser.docShellIsActive to false around here: http://searchfox.org/mozilla-central/source/toolkit/components/thumbnails/BackgroundPageThumbs.jsm#289 2) Inside here: http://searchfox.org/mozilla-central/rev/c329d562fb6c6218bdb79290faaf015467ef89e2/toolkit/components/thumbnails/content/backgroundPageThumbsContent.js#163-178 before doing the capture, wait until an idle callback, then set the content docshell active before doing the capture, and then set it inactive again after the capture completes. "docShell" should exist as a global in a framescript, so in backgroundPageThumbsContent.js, I believe you can just do: docShell.isActive = true; (capture) docShell.isActive = false; That's what I can think of off the top of my head. I haven't tested to see if any of this will actually work.
Flags: needinfo?(mconley)
Kirk: I can't seem to reproduce it now. I just spent about 5 minutes trying to do it, and I couldn't find any samples.
Flags: needinfo?(gtatum)
Despite not being able to reproduce this problem, I am going to proceed with the work described in Comment 6 because it is work that ought to be done regardless.
Attachment #8897998 - Flags: review?(markh)
Hmm. My patch seems to be causing a test [1] to time out. I will have to take a look into why this is. [1] toolkit/components/thumbnails/test/browser_thumbnails_bg_bad_url.js
Ok, I believe that I fixed the problem. The patch should be ready assuming it passes review.
Comment on attachment 8897998 [details] Bug 1348280 - Try to ensure thumbnail creation happens when the thread is idle https://reviewboard.mozilla.org/r/169298/#review175732 This looks fine to me - thanks! I've pushed another try run and it can be landed if that looks green(ish)
Attachment #8897998 - Flags: review?(markh) → review+
Looks like the Try run has an assertion failure. Thanks for pushing that. I'll take a look into it.
@mstange Do you have any insight that could help me here? I feel a bit lost. One of the modifications to |docShell.isActive| is causing an assertion here: [1][2]. So far I have reproduced this assertion in Ubuntu and Windows. I added some logging, and it seems that |mPuppetWidget->GetLayerManager()->GetBackendType()| is returning |1| (LAYERS_BASIC). Using rr, I was able to determine that LAYERS_BASIC is being used due to the Layer Manager being created as a BasicLayerManager here: [3]. I am afraid that I know pretty much nothing about this area of the code base, so any help you could give me would be much appreciated. [1] http://searchfox.org/mozilla-central/rev/48ea452803907f2575d81021e8678634e8067fc2/dom/ipc/TabChild.cpp#2451 [2] https://treeherder.mozilla.org/#/jobs?repo=try&revision=7ef94d04942e&selectedJob=124474743 [3] http://searchfox.org/mozilla-central/rev/b258e6864ee3e809d40982bc5d0d5aff66a20780/widget/PuppetWidget.cpp#603
Flags: needinfo?(mstange)
Do you know if your patch actually changes the type of layer manager that's used for the thumbnail tab? Or was it LAYERS_BASIC even before your patch, and the assertion was just never hit? It's possible that your patch causes GetLayerManager to be called at an earlier time, during which the connection to the compositor hasn't been established yet, and then that call would inadvertently cause the creation of a basic layer manager. I'm unsure whether LAYERS_BASIC is the right layer manager type to use here. If it's what we have been using for thumbnail tabs all along, then we probably just need to adjust the assertion. But if we were using LAYERS_CLIENT before, then we need to delay the call to GetLayerManager until the connection to the compositor has been set up.
Flags: needinfo?(mstange)
I am not sure how to continue from here, since I do not really know for sure whether my patch is changing the layout manager used or whether it is safe for the assertion to be changed to allow for LAYERS_BASIC. Returning the needinfo to you as discussed on IRC. In the meantime, I am going to try to see if I can gain a bit more understanding of that assertion.
Flags: needinfo?(mstange)
Kirk, it would help me to understand this to know what the stack looks like for the GetLayerManager call that returns LAYERS_BASIC. Is it happening during the SetDocShellIsActive call? David would know better, but it seems like we ought to be using LAYERS_CLIENT here. This is a normal remote <browser> element that's being added to a normal XUL document as far as I know.
@billm I have attached the stack, taken when the Layer Manager is created as a BasicLayerManager, here [1]. [1] http://searchfox.org/mozilla-central/rev/b258e6864ee3e809d40982bc5d0d5aff66a20780/widget/PuppetWidget.cpp#603
OK, that sort of makes sense. I think what's happening is that we're sending the InitRendering message with aIsLayersConnected=false. That causes the layer manager to be set to basic. That doesn't seem right to me, but I don't really understand this code. Could you look at this, Kats? It looks like bug 1351777 was supposed to make this happen, but I don't really understand why.
Flags: needinfo?(bugmail)
I think this is basically fine. Bug 1351777 intentionally made this case use BasicLayerManager, because in this case we're not actually compositing this content to the screen. All we need to do is draw it and then take a snapshot for the thumbnail. If we try to create a ClientLayerManager in this scenario we actually don't have all the information (i.e. a valid CompositorOptions object) that we need to properly instantiate it. I ran a stock m-c build on Linux and in my run it looks like the thumbnail process is creating a BasicLayerManager, so I don't think your patch actually changes that behaviour. The only behaviour change is that it's now tripping an assertion, because the docshell is being made active and that function has an assert about which layer managers are allowed. I think the assert is wrong and actually kinda pointless. We can just remove that assert and any other similar asserts that end up getting triggered. The PuppetWidget has legitimate reasons to create all of the different layer managers that it has codepaths for (in PuppetWidget::GetLayerManager) so there's no point littering the rest of TabChild with assertions about allowed layer manager types. (Not to mention that the assertions themselves call GetLayerManager which technically has nontrivial side-effects, so we should not call it in debug-only codepaths). I'd be happy to write a patch to take out the assertions if you aren't comfortable doing it, let me know.
Flags: needinfo?(bugmail)
Was getting a permafail on a (seemingly unrelated) test. After looking into it, I discovered that I was getting it even if I built without my patch. I rebased on the most recent mozilla-central and the problem seems to have disappeared (locally at least). I am going to give it another pass through try just to be sure though... Also cancelling needinfo from :mstange since it seems that my question about whether changing the assertions has already been answered.
Flags: needinfo?(mstange)
Attachment #8902873 - Flags: review?(bugmail)
Comment on attachment 8902873 [details] Bug 1348280 - Remove LayerManager type assertions from TabChild https://reviewboard.mozilla.org/r/174566/#review180234
Attachment #8902873 - Flags: review?(bugmail) → review+
Pushed by ksteuber@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/3d475d6b6b5b Remove LayerManager type assertions from TabChild r=kats https://hg.mozilla.org/integration/autoland/rev/9134ee2f8dcf Try to ensure thumbnail creation happens when the thread is idle r=markh
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 57
Performance Impact: --- → P1
Whiteboard: [qf:p1]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: