Closed
Bug 1295217
Opened 8 years ago
Closed 7 years ago
frequent jank/hang/pause in Firefox UI during nsCocoaWindow::setTitle/SetSizeConstraints
Categories
(Core :: Widget: Cocoa, defect, P2)
RESOLVED
INCOMPLETE
Version | Tracking | Status |
---|---|---|
firefox50 | --- | unaffected |
firefox51 | --- | wontfix |
firefox52 | --- | wontfix |
firefox53 | --- | fix-optional |
People
(Reporter: myk, Unassigned)
References
Details
(Keywords: regression, Whiteboard: tpi:+, gfx-noted)
I've been seeing frequent jank/hang/pause in the Firefox UI on Nightly builds for the last few weeks. This morning I installed the Gecko profiler and generated some profiles.
https://cleopatra.io/#report=bec536839e9660d1122da6feeadf3ea42e7d2d72&selection=0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54
https://cleopatra.io/#report=dca66657ec5b6cfaa15b2a79629162750f5f2134&selection=0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,20,21,22,23,24,25,26,128,129,130,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155,156,19
https://cleopatra.io/#report=1a95da3c49ce24688e44920effa4a95bed729d63&selection=0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50
I'm unfamiliar with the Gecko profiler, so I may be misreading the profiles, but all of them report that most of the time is being spent in mach_msg_trap, and in each profile CGSSetWindowCornerMask is the mach_msg_trap caller for either most or about half of that time.
Two of the three profiles get there from nsCocoaWindow::setTitle, which calls [NSWindow _doSetTitle:andDefeatWrap:] in AppKit, which then proceeds through a series of calls to CGSSetWindowCornerMask.
One of the three profiles gets there from nsCocoaWindow::SetSizeConstraints, which calls [NSWindow _commonMinMaxSizeChanged], which then proceeds through a series of calls to CGSSetWindowCornerMask.
Since both of those call paths go through nsCocoaWindow, I'm optimistically filing this in the Widget: Cocoa component. But please feel free to redirect it to a more appropriate component!
More info about my environment, in case it matters: I have two browser windows, both fullscreen, each one on its own Mac OS X space. The first window has six pinned tabs (keep.google.com, calendar.google.com, web.telegram.org, web.whatsapp.org, www.messenger.com, hangouts.google.com). The second window has five pinned tabs (keep.google.com, calendar.google.com, www.irccloud.com, and two slack.com subdomains). Both of them also occasionally host other tabs, but the jank doesn't seem to depend on them. It occurs even when only the pinned tabs are open.
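The caller breakdown described above (how many mach_msg_trap samples were reached through CGSSetWindowCornerMask, and from which nsCocoaWindow entry point) can be illustrated with a minimal sketch. The profiler UI does this aggregation itself; the names `Stack`, `CountSamplesThrough`, and `ExampleSamples`, and the sample data, are hypothetical and only mirror the frame names quoted in the report.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// One sampled call stack, ordered from the outermost frame to the leaf.
using Stack = std::vector<std::string>;

// Counts the samples whose leaf frame is `leaf` and whose stack also passes
// through `ancestor` somewhere above the leaf.
int CountSamplesThrough(const std::vector<Stack>& samples,
                        const std::string& leaf,
                        const std::string& ancestor) {
  int count = 0;
  for (const Stack& stack : samples) {
    if (stack.empty() || stack.back() != leaf) {
      continue;  // This sample did not end in the frame we care about.
    }
    // Search every frame except the leaf itself.
    if (std::find(stack.begin(), stack.end() - 1, ancestor) != stack.end() - 1) {
      ++count;
    }
  }
  return count;
}

// Made-up samples shaped like the call paths described in the report.
std::vector<Stack> ExampleSamples() {
  return {
      Stack{"nsCocoaWindow::setTitle", "[NSWindow _doSetTitle:andDefeatWrap:]",
            "CGSSetWindowCornerMask", "mach_msg_trap"},
      Stack{"nsCocoaWindow::setTitle", "[NSWindow _doSetTitle:andDefeatWrap:]",
            "CGSSetWindowCornerMask", "mach_msg_trap"},
      Stack{"nsCocoaWindow::SetSizeConstraints",
            "[NSWindow _commonMinMaxSizeChanged]", "CGSSetWindowCornerMask",
            "mach_msg_trap"},
      Stack{"(hypothetical idle event loop)", "mach_msg_trap"},
  };
}

// Usage: attribute the mach_msg_trap samples to the frames they came through.
void CheckExample() {
  const std::vector<Stack> samples = ExampleSamples();
  assert(CountSamplesThrough(samples, "mach_msg_trap", "CGSSetWindowCornerMask") == 3);
  assert(CountSamplesThrough(samples, "mach_msg_trap", "nsCocoaWindow::setTitle") == 2);
}
```

In this invented data set, three of the four mach_msg_trap samples pass through CGSSetWindowCornerMask, which is the kind of ratio the report describes.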
Comment 1•8 years ago
I'm also seeing this happen on Win10. Youtube in particular seems to trigger it.
Reporter
Comment 2•8 years ago
(In reply to timbugzilla from comment #1)
> I'm also seeing this happen on Win10. Youtube in particular seems to trigger
> it.
Hmm, I suspect that's a different issue, since this bug appears to be specific to Mac (based on the profiles I ran). There's a discussion in the support forum about a hang with YouTube in fullscreen, and you might be able to resolve your issue via the suggestions in that discussion:
https://support.mozilla.org/en-US/questions/979557
Comment 3•8 years ago
spohl, any ideas here?
marking P2 for now since we only have one user reporting the issue. a regression range would be helpful.
Flags: needinfo?(spohl.mozilla.bugs)
Priority: -- → P2
Whiteboard: tpi:+
Reporter
Comment 4•8 years ago
(In reply to Jim Mathies [:jimm] from comment #3)
> marking P2 for now since we only have one user reporting the issue. a
> regression range would be helpful.
I've bisected it down to https://hg.mozilla.org/mozilla-central/rev/bfc47d8a87ef from bug 1230641.
Blocks: 1230641
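For readers unfamiliar with how a regression range is narrowed to a single changeset, the following is a minimal sketch of bisection. It is not the procedure the reporter necessarily followed (Mozilla's mozregression tool automates this against real builds); the function name `FirstBadRevision`, the revision labels, and the `isBad` predicate are hypothetical stand-ins for "build this revision and check whether the hang reproduces".

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Returns the index of the first revision for which `isBad` returns true.
// Assumes the revisions are in chronological order, the first one is good,
// the last one is bad, and every revision after the first bad one is bad.
std::size_t FirstBadRevision(
    const std::vector<std::string>& revisions,
    const std::function<bool(const std::string&)>& isBad) {
  assert(revisions.size() >= 2);
  std::size_t good = 0;                    // Index of a known-good revision.
  std::size_t bad = revisions.size() - 1;  // Index of a known-bad revision.
  while (bad - good > 1) {
    const std::size_t mid = good + (bad - good) / 2;
    if (isBad(revisions[mid])) {
      bad = mid;   // The regression is at or before mid.
    } else {
      good = mid;  // The regression is after mid.
    }
  }
  return bad;
}

// Usage with placeholder revision labels: pretend the hang first appears in
// "r5", which stands in for the real changeset.
void CheckExample() {
  const std::vector<std::string> revisions = {"r0", "r1", "r2", "r3",
                                              "r4", "r5", "r6", "r7"};
  auto isBad = [](const std::string& rev) { return rev >= "r5"; };
  assert(FirstBadRevision(revisions, isBad) == 5);
}
```

Each step halves the remaining range, so eight candidate revisions need at most three test builds.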
Reporter
Updated•8 years ago
Keywords: regression
Comment 5•8 years ago
This bug frightens me. Myk, what is your macOS version?
Reporter
Comment 6•8 years ago
(In reply to Markus Stange [:mstange] from comment #5)
> This bug frightens me. Myk, what is your macOS version?
I'm running OS X El Capitan version 10.11.6 (15G31) on a MacBook Pro (Retina, 15-inch, Mid 2015).
Comment 7•8 years ago
Matt, seems like fallout from bug 1230641?
Flags: needinfo?(matt.woodrow)
Whiteboard: tpi:+ → tpi:+, gfx-noted
Comment 8•8 years ago
Flags: needinfo?(spohl.mozilla.bugs)
Updated•8 years ago
status-firefox50: --- → unaffected
status-firefox51: --- → affected
status-firefox52: --- → affected
status-firefox53: --- → affected
Version: unspecified → 51 Branch
Comment 9•8 years ago
Myk, just to make sure, is it reproducible on Firefox 51 beta/52 aurora too?
Flags: needinfo?(myk)
Reporter
Comment 10•8 years ago
(In reply to Astley Chen [:astley] UTC+8 from comment #9)
> Myk, just to make sure, is it reproducible on Firefox 51 beta/52 aurora too?
Unfortunately, I'm having trouble reproducing the bug on those versions, as Beta's chrome process hangs indefinitely shortly after startup (the OS reports that the application has "stopped responding"), while Aurora's content process hangs (each tab's browser pane displays a throbber that throbs indefinitely).
I've tried to reproduce in a new profile by loading IRCCloud in a pinned tab, as I suspect that the behavior is related to pinned tabs that frequently update their titles. But so far I haven't succeeded. I'll try recreating the session more extensively next. (Leaving the needinfo request to remind me to do this.)
Reporter
Comment 11•8 years ago
After reproducing my tabset this morning in a separate profile, I reproduced the bug this afternoon. Or rather, I might have reproduced it. I'm experiencing identical symptoms, and my profiles all end up blocking in mach_msg_trap like before; but their stacks look different. Here's an example:
https://clptr.io/2hZ1goO
Flags: needinfo?(myk)
Comment 12•8 years ago
Does setting media.video-queue.hw-accel-size to 3 make any difference?
Flags: needinfo?(myk)
Reporter
Comment 13•8 years ago
(In reply to Myk Melez [:myk] [@mykmelez] from comment #11)
> After reproducing my tabset this morning in a separate profile, I reproduced
> the bug this afternoon.
Erm, I meant to say: I reproduced the bug this afternoon *on Aurora* (hence partly answering the question about whether this is reproducible on Beta/Aurora).
> Or rather, I might have reproduced it. I'm
> experiencing identical symptoms, and my profiles all end up blocking in
> mach_msg_trap like before; but their stacks look different.
My stacks now look different on Nightly as well. For example, this Nightly stack looks like the one I saw on Aurora last week:
https://clptr.io/2i9QPu0
Reporter
Comment 14•8 years ago
(In reply to Milan Sreckovic [:milan] from comment #12)
> Does setting media.video-queue.hw-accel-size to 3 make any difference?
No such luck, I'm afraid. The stack I just referenced in comment 13 is from a Nightly build with that preference set to 3 (after which I restarted Nightly).
Flags: needinfo?(myk)
Comment 15•8 years ago
If the regression is from bug 1230641, then it's likely the changes to widget/cocoa/nsChildView.mm that cause these symptoms. That code will execute regardless of the video playback preferences.
Interestingly, that code was added to fix excessive time in mach_msg_trap in bug 1230641 comment 48 for the video playback case. We may be too aggressive when resetting the opacity value:
bool isFullscreen = (styleMask & NSFullScreenWindowMask) || !(styleMask & NSTitledWindowMask);
Matt: why is the code after the || also setting fullscreen and flipping opacity? Can we get away without it?
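To make the quoted condition concrete, here is a minimal standalone sketch that evaluates it for a few style masks. The two constants mirror the values of AppKit's `NSTitledWindowMask` (1 << 0) and `NSFullScreenWindowMask` (1 << 14) so the example compiles without Cocoa headers; the helper name `TreatAsFullscreen` is hypothetical and does not appear in nsChildView.mm.

```cpp
#include <cstdint>

// Values mirror AppKit's NSTitledWindowMask and NSFullScreenWindowMask.
// They are redefined here only so the sketch builds without Cocoa headers.
constexpr std::uint64_t kTitledWindowMask = 1ull << 0;
constexpr std::uint64_t kFullScreenWindowMask = 1ull << 14;

// Hypothetical helper wrapping the quoted condition: a window is treated as
// fullscreen if the fullscreen bit is set OR the titlebar bit is missing.
constexpr bool TreatAsFullscreen(std::uint64_t styleMask) {
  return (styleMask & kFullScreenWindowMask) || !(styleMask & kTitledWindowMask);
}

// An ordinary titled window is not treated as fullscreen.
static_assert(!TreatAsFullscreen(kTitledWindowMask), "titled window");
// A titled window in native fullscreen is.
static_assert(TreatAsFullscreen(kTitledWindowMask | kFullScreenWindowMask),
              "native fullscreen window");
// A window without a titlebar is too; this is the part after the `||`.
static_assert(TreatAsFullscreen(0), "window without a titlebar");
```

The third case is the one the question is about: any window without a titlebar takes the same path as a native fullscreen window.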
Comment 16•8 years ago
Looks like Xidorn added that check in bug 1105939, maybe he remembers more.
I don't think it matters though, the idea is that when we're fullscreen (or just not drawing a titlebar) then we stop masking out the rounded corners and we mark our GL context as being opaque (since we no longer need opacity).
It sounds like Myk's windows are fullscreen, so it would appear that making the GL context opaque causes CGSSetWindowCornerMask to hang/pause.
It's not obvious why that would happen, or even why cocoa is calling that function at all (it comes from a call to _NSSpaceIsVisible? weird side effect).
Does anyone know enough about Cocoa to know what this call is doing and how we might avoid it?
Flags: needinfo?(matt.woodrow) → needinfo?(xidorn+moz)
Reporter
Comment 17•8 years ago
(In reply to Matt Woodrow (:mattwoodrow) from comment #16)
> It sounds like Myk's windows are fullscreen, so it would appear that making
> the GL context opaque causes CGSSetWindowCornerMask to hang/pause.
Yes, this only happens when my two windows are fullscreen.
Comment 18•8 years ago
(In reply to Matt Woodrow (:mattwoodrow) from comment #16)
> Looks like Xidorn added that check in bug 1105939, maybe he remembers more.
>
> I don't think it matters though, the idea is that when we're fullscreen (or
> just not drawing a titlebar) then we stop masking out the rounded corners
> and we mark our GL context as being opaque (since we no longer need opacity).
>
> It sounds like Myk's windows are fullscreen, so it would appear that making
> the GL context opaque causes CGSSetWindowCornerMask to hang/pause.
>
> It's not obvious why that would happen, or even why cocoa is calling that
> function at all (it comes from a call to _NSSpaceIsVisible? weird side
> effect).
>
> Does anyone know enough about Cocoa to know what this call is doing and how
> we might avoid it?
Maybe Markus?
Flags: needinfo?(mstange)
Comment 19•8 years ago
(In reply to Matt Woodrow (:mattwoodrow) from comment #16)
> Looks like Xidorn added that check in bug 1105939, maybe he remembers more.
>
> I don't think it matters though, the idea is that when we're fullscreen (or
> just not drawing a titlebar) then we stop masking out the rounded corners
> and we mark our GL context as being opaque (since we no longer need opacity).
That's right. This shouldn't matter, unless the state can switch back and forth automatically without user action, which shouldn't happen.
Flags: needinfo?(xidorn+moz)
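One way to check the "switch back and forth" possibility raised above would be to count how often the computed fullscreen state actually changes. The sketch below is purely illustrative and is not code from the tree: `FullscreenStateTracker` and `CountTransitions` are hypothetical names, and the input is a made-up sequence of recomputed states.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: remembers the last fullscreen-like state it was given
// and reports whether a new value is an actual transition.
class FullscreenStateTracker {
 public:
  // Returns true only when `isFullscreen` differs from the last value seen.
  // The very first call always counts as a transition.
  bool Update(bool isFullscreen) {
    if (mHasValue && mLast == isFullscreen) {
      return false;
    }
    mHasValue = true;
    mLast = isFullscreen;
    ++mTransitions;
    return true;
  }

  int Transitions() const { return mTransitions; }

 private:
  bool mHasValue = false;
  bool mLast = false;
  int mTransitions = 0;
};

// Feeds a sequence of recomputed states through the tracker and returns the
// number of real transitions.
int CountTransitions(const std::vector<bool>& states) {
  FullscreenStateTracker tracker;
  for (bool state : states) {
    tracker.Update(state);
  }
  return tracker.Transitions();
}

// Usage: a window that stays fullscreen yields one transition no matter how
// often the state is recomputed; a flapping state yields one per change.
void CheckExample() {
  assert(CountTransitions({true, true, true}) == 1);
  assert(CountTransitions({true, false, true, false}) == 4);
}
```

A transition count that keeps growing while the user leaves the windows alone would support the flapping hypothesis; a count that stays at one would point elsewhere.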
Comment 20•8 years ago
(In reply to Andrew Overholt [:overholt] from comment #18)
> > Does anyone know enough about Cocoa to know what this call is doing and how
> > we might avoid it?
>
> Maybe Markus?
Unfortunately not, no :(
It would be nice to know if updating to 10.12 fixes this, but if Myk upgrades and that fixes it, then we've lost the only machine that reproduces this bug (that we know of).
Myk, can you profile the WindowServer process with Instruments when this happens and attach the profile?
Flags: needinfo?(mstange) → needinfo?(myk)
Reporter
Comment 21•8 years ago
(In reply to Markus Stange [:mstange] from comment #20)
> Myk, can you profile the WindowServer process with Instruments when this
> happens and attach the profile?
Instruments doesn't list WindowServer in its lists of Applications and Running Processes, but I can sample "all processes," which includes WindowServer. Here's a profile that used the Time Profiler template and contains several short runs during which I experienced pauses. The profile is too large to attach to this bug, even after compression, so I've uploaded it to people-mozilla.org:
https://people-mozilla.org/~myk/Instruments3.trace.tbz2
Flags: needinfo?(myk)
Updated•8 years ago
Flags: needinfo?(mstange)
Comment 22•8 years ago
Since the scenario only happened with two fullscreen windows and it's too late for 51, I would say it's not a blocking issue. Marking 51 as wontfix.
Comment 23•8 years ago
Thanks Myk. Unfortunately I wasn't able to find any useful information in the profile. Can you try to get another profile with "Record Waiting Threads" and "Callstacks: User & Kernel" checked?
Flags: needinfo?(mstange) → needinfo?(myk)
Reporter
Comment 24•8 years ago
(In reply to Markus Stange [:mstange] from comment #23)
> Thanks Myk. Unfortunately I wasn't able to find any useful information in
> the profile. Can you try to get another profile with "Record Waiting
> Threads" and "Callstacks: User & Kernel" checked?
Yes, here's such a profile:
https://people-mozilla.org/~myk/Instruments4.trace.tbz2
Flags: needinfo?(myk)
Comment 25•8 years ago
Thanks! I was able to find the hang in there. Firefox is waiting for the WindowServer, and the WindowServer is blocked in IOAccelFlushSurfaceOnFramebuffers. We're not completely sure what that means, but it's probably waiting for the GPU hardware. E.g. it could be waiting for a GPU switch. Or your GPU has gone bad somehow - it could be a hardware problem.
It's very hard to say why our change to use an opaque GLContext triggered this.
Comment 26•7 years ago
Is this still reproducing with current Nightlies? We switched to the 10.11 SDK, which may have had an effect on this.
Flags: needinfo?(myk)
Reporter
Comment 27•7 years ago
Hmm, I can't reproduce it with the latest Nightly. Note, however, that I replaced my Mac last month, upgrading to macOS 10.12 in the process. And I no longer have the old Mac, so I can't test there.
Flags: needinfo?(myk)
Comment 28•7 years ago
Thanks. Let's close for now until we have a way to reproduce this again.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → INCOMPLETE