1295217 - frequent jank/hang/pause in Firefox UI during nsCocoaWindow::setTitle/SetSizeConstraints

Reporter

Description

8 years ago

I've been seeing frequent jank/hang/pause in the Firefox UI on Nightly builds for the last few weeks. This morning I installed the Gecko profiler and generated some profiles.

https://cleopatra.io/#report=bec536839e9660d1122da6feeadf3ea42e7d2d72&selection=0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54

https://cleopatra.io/#report=dca66657ec5b6cfaa15b2a79629162750f5f2134&selection=0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,20,21,22,23,24,25,26,128,129,130,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155,156,19

https://cleopatra.io/#report=1a95da3c49ce24688e44920effa4a95bed729d63&selection=0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50

I'm unfamiliar with the Gecko profiler, so I may be misreading the profiles, but all of them report that most of the time is being spent in mach_msg_trap, and in all of the profiles, either most or half of the time the mach_msg_trap caller is CGSSetWindowCornerMask.

Two of the three profiles get there from nsCocoaWindow::setTitle, which calls [NSWindow _doSetTitle:andDefeatWrap:] in AppKit, which then proceeds through a series of calls to CGSSetWindowCornerMask.

One of the three profiles gets there from nsCocoaWindow::SetSizeConstraints, which calls [NSWindow _commonMinMaxSizeChanged], which then proceeds through a series of calls to CGSSetWindowCornerMask.

Since both of those call paths go through nsCocoaWindow, I'm optimistically filing this in the Widget: Cocoa component. But please feel free to redirect it to a more appropriate component!

More info about my environment, in case it matters: I have two browser windows, both fullscreen, each one on its own Mac OS X space. The first window has six pinned tabs (keep.google.com, calendar.google.com, web.telegram.org, web.whatsapp.org, www.messenger.com, hangouts.google.com). The second window has five pinned tabs (keep.google.com, calendar.google.com, www.irccloud.com, and two slack.com subdomains). Both of them also occasionally host other tabs, but the jank doesn't seem to depend on them. It occurs even when only the pinned tabs are open.

timbugzilla

Comment 1

8 years ago

I'm also seeing this happen on Win10. YouTube in particular seems to trigger it.

Myk Melez [:myk] [@mykmelez]

Reporter

Comment 2

8 years ago

(In reply to timbugzilla from comment #1)
> I'm also seeing this happen on Win10. Youtube in particular seems to trigger
> it.

Hmm, I suspect that's a different issue, since this bug appears to be specific to Mac (based on the profiles I ran). There's a discussion in the support forum about a hang with YouTube in fullscreen, and you might be able to resolve your issue via the suggestions in that discussion: https://support.mozilla.org/en-US/questions/979557

Jim Mathies [:jimm]

Comment 3

8 years ago

spohl, any ideas here? marking P2 for now since we only have one user reporting the issue. a regression range would be helpful.

Flags: needinfo?(spohl.mozilla.bugs)

Priority: -- → P2

Whiteboard: tpi:+

Myk Melez [:myk] [@mykmelez]

Reporter

Comment 4

8 years ago

(In reply to Jim Mathies [:jimm] from comment #3)
> marking P2 for now since we only have one user reporting the issue. a
> regression range would be helpful.

I've bisected it down to https://hg.mozilla.org/mozilla-central/rev/bfc47d8a87ef from bug 1230641.

Blocks: 1230641

Myk Melez [:myk] [@mykmelez]

Reporter

Updated

8 years ago

Keywords: regression

Markus Stange [:mstange]

Comment 5

8 years ago

This bug frightens me. Myk, what is your macOS version?

Myk Melez [:myk] [@mykmelez]

Reporter

Comment 6

8 years ago

(In reply to Markus Stange [:mstange] from comment #5)
> This bug frightens me. Myk, what is your macOS version?

I'm running OS X El Capitan version 10.11.6 (15G31) on a MacBook Pro (Retina, 15-inch, Mid 2015).

Virtual_ManPL [:Virtual] 🇵🇱 - (please needinfo? me - so I will see your comment/reply/question/etc.)

Updated

8 years ago

See Also: → https://bugzilla.mozilla.org/show_bug.cgi?id=1301301

Mason Chang [Inactive] [:mchang]

Comment 7

8 years ago

Matt, seems like fallout from bug 1230641?

Flags: needinfo?(matt.woodrow)

Whiteboard: tpi:+ → tpi:+, gfx-noted

Virtual_ManPL [:Virtual] 🇵🇱 - (please needinfo? me - so I will see your comment/reply/question/etc.)

Updated

8 years ago

See Also: https://bugzilla.mozilla.org/show_bug.cgi?id=1301301 →

Stephen A Pohl [:spohl] (OOO until 9/8)

Comment 8

8 years ago

Clearing n-i based on comment 4 and comment 7.

Flags: needinfo?(spohl.mozilla.bugs)

Ryan VanderMeulen [:RyanVM]

Updated

8 years ago

status-firefox50: --- → unaffected

status-firefox51: --- → affected

status-firefox52: --- → affected

status-firefox53: --- → affected

Version: unspecified → 51 Branch

Astley Chen (inactive)

Comment 9

8 years ago

Myk, just to make sure, is it reproducible on Firefox 51 beta/52 aurora too ?

Flags: needinfo?(myk)

Myk Melez [:myk] [@mykmelez]

Reporter

Comment 10

8 years ago

(In reply to Astley Chen [:astley] UTC+8 from comment #9)
> Myk, just to make sure, is it reproducible on Firefox 51 beta/52 aurora too ?

Unfortunately, I'm having trouble reproducing the bug on those versions, as Beta's chrome process hangs indefinitely shortly after startup (the OS reports that the application has "stopped responding"), while Aurora's content process hangs (each tab's browser pane displays a throbber that throbs indefinitely).

I've tried to reproduce in a new profile by loading IRCCloud in a pinned tab, as I suspect that the behavior is related to pinned tabs that frequently update their titles. But so far I haven't succeeded. I'll try recreating the session more extensively next. (Leaving the needinfo request to remind me to do this.)

Myk Melez [:myk] [@mykmelez]

Reporter

Comment 11

8 years ago

After reproducing my tabset this morning in a separate profile, I reproduced the bug this afternoon. Or rather, I might have reproduced it. I'm experiencing identical symptoms, and my profiles all end up blocking in mach_msg_trap like before; but their stacks look different. Here's an example: https://clptr.io/2hZ1goO

Flags: needinfo?(myk)

Milan Sreckovic [:milan] (needinfo for best results)

Comment 12

8 years ago

Does setting media.video-queue.hw-accel-size to 3 make any difference?

Flags: needinfo?(myk)

Myk Melez [:myk] [@mykmelez]

Reporter

Comment 13

8 years ago

(In reply to Myk Melez [:myk] [@mykmelez] from comment #11)
> After reproducing my tabset this morning in a separate profile, I reproduced
> the bug this afternoon.

Erm, I meant to say: I reproduced the bug this afternoon *on Aurora* (hence partly answering the question about whether this is reproducible on Beta/Aurora).

> Or rather, I might have reproduced it. I'm
> experiencing identical symptoms, and my profiles all end up blocking in
> mach_msg_trap like before; but their stacks look different.

My stacks now look different on Nightly as well, f.e. this Nightly stack looks like the one I saw on Aurora last week: https://clptr.io/2i9QPu0

Myk Melez [:myk] [@mykmelez]

Reporter

Comment 14

8 years ago

(In reply to Milan Sreckovic [:milan] from comment #12)
> Does setting media.video-queue.hw-accel-size to 3 make any difference?

No such luck, I'm afraid. The stack I just referenced in comment 13 is from a Nightly build with that preference set to 3 (after which I restarted Nightly).

Flags: needinfo?(myk)

Jet Villegas (inactive)

Comment 15

8 years ago

If the regression is from bug 1230641, then it's likely the changes to widget/cocoa/nsChildView.mm that cause these symptoms. That code will execute regardless of the video playback preferences. Interestingly, that code was added to fix excessive time in mach_msg_trap in bug 1230641 comment 48 for the video playback case.

We may be too aggressive when resetting the opacity value:

bool isFullscreen = (styleMask & NSFullScreenWindowMask) || !(styleMask & NSTitledWindowMask);

Matt: why is the code after the || also setting fullscreen and flipping opacity? Can we get away without it?
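To make the question concrete, here is a minimal standalone sketch of what the quoted check evaluates to for different window kinds. It is not the code in nsChildView.mm: the constants kTitledMask and kFullScreenMask are illustrative stand-ins for the AppKit style-mask bits (their values are assumed for the example), and TreatAsFullscreen, FullscreenBitOnly, and DemoStyleMasks are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative stand-ins for AppKit's NSTitledWindowMask and
// NSFullScreenWindowMask, redefined so the sketch builds without Cocoa
// headers. The bit positions are assumptions made for this example.
constexpr std::uint32_t kTitledMask = 1u << 0;
constexpr std::uint32_t kFullScreenMask = 1u << 14;

// Mirrors the quoted check: true for fullscreen windows and also for any
// window that has no titlebar.
constexpr bool TreatAsFullscreen(std::uint32_t styleMask) {
  return (styleMask & kFullScreenMask) != 0 || (styleMask & kTitledMask) == 0;
}

// Hypothetical narrower variant that ignores the titlebar bit.
constexpr bool FullscreenBitOnly(std::uint32_t styleMask) {
  return (styleMask & kFullScreenMask) != 0;
}

// Walks three window kinds through both predicates.
inline void DemoStyleMasks() {
  // Titled, windowed: neither predicate fires.
  assert(!TreatAsFullscreen(kTitledMask));
  assert(!FullscreenBitOnly(kTitledMask));
  // Titled, fullscreen: both fire.
  assert(TreatAsFullscreen(kTitledMask | kFullScreenMask));
  assert(FullscreenBitOnly(kTitledMask | kFullScreenMask));
  // No titlebar, not fullscreen: only the quoted check fires.
  assert(TreatAsFullscreen(0));
  assert(!FullscreenBitOnly(0));
}
```

The only case where the two predicates disagree is a window without a titlebar that is not fullscreen, so dropping the second clause would change behavior only for such windows, not for the fullscreen windows described in this report.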

Matt Woodrow (:mattwoodrow)

Comment 16

8 years ago

Looks like Xidorn added that check in bug 1105939, maybe he remembers more.

I don't think it matters though, the idea is that when we're fullscreen (or just not drawing a titlebar) then we stop masking out the rounded corners and we mark our GL context as being opaque (since we no longer need opacity).

It sounds like Myk's windows are fullscreen, so it would appear that making the GL context opaque causes CGSSetWindowCornerMask to hang/pause.

It's not obvious why that would happen, or even why cocoa is calling that function at all (it comes from a call to _NSSpaceIsVisible? weird side effect).

Does anyone know enough about Cocoa to know what this call is doing and how we might avoid it?

Flags: needinfo?(matt.woodrow) → needinfo?(xidorn+moz)

Myk Melez [:myk] [@mykmelez]

Reporter

Comment 17

8 years ago

(In reply to Matt Woodrow (:mattwoodrow) from comment #16)
> It sounds like Myk's windows are fullscreen, so it would appear that making
> the GL context opaque causes CGSSetWindowCornerMask to hang/pause.

Yes, this only happens when my two windows are fullscreen.

Andrew Overholt [:overholt]

Comment 18

8 years ago

(In reply to Matt Woodrow (:mattwoodrow) from comment #16)
> Looks like Xidorn added that check in bug 1105939, maybe he remembers more.
>
> I don't think it matters though, the idea is that when we're fullscreen (or
> just not drawing a titlebar) then we stop masking out the rounded corners
> and we mark our GL context as being opaque (since we no longer need opacity).
>
> It sounds like Myk's windows are fullscreen, so it would appear that making
> the GL context opaque causes CGSSetWindowCornerMask to hang/pause.
>
> It's not obvious why that would happen, or even why cocoa is calling that
> function at all (it comes from a call to _NSSpaceIsVisible? weird side
> effect).
>
> Does anyone know enough about Cocoa to know what this call is doing and how
> we might avoid it?

Maybe Markus?

Flags: needinfo?(mstange)

Xidorn Quan [:xidorn] UTC+11

Comment 19

8 years ago

(In reply to Matt Woodrow (:mattwoodrow) from comment #16)
> Looks like Xidorn added that check in bug 1105939, maybe he remembers more.
>
> I don't think it matters though, the idea is that when we're fullscreen (or
> just not drawing a titlebar) then we stop masking out the rounded corners
> and we mark our GL context as being opaque (since we no longer need opacity).

That's right. This shouldn't matter, unless the state can switch back and forth automatically without user action, which shouldn't happen.
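One way to test whether the state does flip without user action would be to record each computed value and count real transitions. The following is a rough sketch under that assumption, not existing Gecko code: FullscreenStateTracker and DemoSteadyFullscreen are hypothetical names, and nothing here touches Cocoa.

```cpp
#include <cassert>

// Hypothetical helper: remembers the last fullscreen-like state and reports
// whether a new value differs, so a caller could skip redundant opacity
// updates and count how often the state actually flips.
class FullscreenStateTracker {
 public:
  // Returns true only when isFullscreen differs from the last recorded value.
  // The first call always counts as a change.
  bool Update(bool isFullscreen) {
    if (mInitialized && mLast == isFullscreen) {
      return false;
    }
    mInitialized = true;
    mLast = isFullscreen;
    ++mTransitions;
    return true;
  }

  // Number of real state changes seen so far.
  int transitions() const { return mTransitions; }

 private:
  bool mInitialized = false;
  bool mLast = false;
  int mTransitions = 0;
};

// A window that stays fullscreen while other events (for example title
// changes) trigger repeated evaluations should report a single transition.
inline void DemoSteadyFullscreen() {
  FullscreenStateTracker tracker;
  assert(tracker.Update(true));   // first observation
  assert(!tracker.Update(true));  // repeated evaluation, no change
  assert(!tracker.Update(true));
  assert(tracker.transitions() == 1);
}
```

If a counter like this kept climbing while the windows stayed fullscreen, that would point at the state flapping. If it stayed at one, the cost would have to come from somewhere other than repeated opacity changes.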

Flags: needinfo?(xidorn+moz)

Markus Stange [:mstange]

Comment 20

8 years ago

(In reply to Andrew Overholt [:overholt] from comment #18)
> > Does anyone know enough about Cocoa to know what this call is doing and how
> > we might avoid it?
>
> Maybe Markus?

Unfortunately not, no :(

It would be nice to know if updating to 10.12 fixes this, but if Myk upgrades and that fixes it, then we've lost the only machine that reproduces this bug (that we know of).

Myk, can you profile the WindowServer process with Instruments when this happens and attach the profile?

Flags: needinfo?(mstange) → needinfo?(myk)

Myk Melez [:myk] [@mykmelez]

Reporter

Comment 21

8 years ago

(In reply to Markus Stange [:mstange] from comment #20)
> Myk, can you profile the WindowServer process with Instruments when this
> happens and attach the profile?

Instruments doesn't list WindowServer in its lists of Applications and Running Processes, but I can sample "all processes," which includes WindowServer. Here's a profile that used the Time Profiler template and contains several short runs during which I experienced pauses.

The profile is too large to attach to this bug, even after compression, so I've uploaded it to people-mozilla.org: https://people-mozilla.org/~myk/Instruments3.trace.tbz2

Flags: needinfo?(myk)

Stephen A Pohl [:spohl] (OOO until 9/8)

Updated

8 years ago

Flags: needinfo?(mstange)

Gerry Chang [:gchang]

Comment 22

8 years ago

As the scenario happens only with two fullscreen windows and it's too late for 51, I would say it's not a blocking issue. Marking 51 as wontfix.

status-firefox51: affected → wontfix

Markus Stange [:mstange]

Comment 23

8 years ago

Thanks Myk. Unfortunately I wasn't able to find any useful information in the profile. Can you try to get another profile with "Record Waiting Threads" and "Callstacks: User & Kernel" checked?

Flags: needinfo?(mstange) → needinfo?(myk)

Myk Melez [:myk] [@mykmelez]

Reporter

Comment 24

8 years ago

(In reply to Markus Stange [:mstange] from comment #23)
> Thanks Myk. Unfortunately I wasn't able to find any useful information in
> the profile. Can you try to get another profile with "Record Waiting
> Threads" and "Callstacks: User & Kernel" checked?

Yes, here's such a profile: https://people-mozilla.org/~myk/Instruments4.trace.tbz2

Flags: needinfo?(myk)

Markus Stange [:mstange]

Comment 25

8 years ago

Thanks! I was able to find the hang in there. Firefox is waiting for the WindowServer, and the WindowServer is blocked in IOAccelFlushSurfaceOnFramebuffers.

We're not completely sure what that means, but it's probably waiting for the GPU hardware. E.g. it could be waiting for a GPU switch. Or your GPU has gone bad somehow - it could be a hardware problem.

It's very hard to say why our change to use an opaque GLContext triggered this.

Randell Jesup [:jesup] (needinfo me)

Updated

8 years ago

status-firefox52: affected → wontfix

status-firefox53: affected → fix-optional

Stephen A Pohl [:spohl] (OOO until 9/8)

Comment 26

7 years ago

Is this still reproducing with current Nightlies? We switched to the 10.11 SDK, which may have had an effect on this.

Flags: needinfo?(myk)

Myk Melez [:myk] [@mykmelez]

Reporter

Comment 27

7 years ago

Hmm, I can't reproduce it with the latest Nightly. Note, however, that I replaced my Mac last month, upgrading to macOS 10.12 in the process. And I no longer have the old Mac, so I can't test there.

Flags: needinfo?(myk)

Stephen A Pohl [:spohl] (OOO until 9/8)

Comment 28

7 years ago

Thanks. Let's close for now until we have a way to reproduce this again.

Status: NEW → RESOLVED

Closed: 7 years ago

Resolution: --- → INCOMPLETE