Closed Bug 1668875 Opened 4 years ago Closed 4 years ago

[Fission] Lots of tab spinners

Categories

(Core :: DOM: Content Processes, defect, P2)

defect

Tracking

()

RESOLVED FIXED
84 Branch
Fission Milestone M6c
Tracking Status
firefox-esr78 --- wontfix
firefox81 --- wontfix
firefox82 --- wontfix
firefox83 --- wontfix
firefox84 --- fixed

People

(Reporter: overholt, Assigned: emilio)

References

(Blocks 1 open bug)

Details

Attachments

(3 files)

Nightly, Windows 10, Fission enabled

I see a lot more tab spinners these days than I used to. I captured a bit of a profile while I was seeing them on all my (GMail, GDoc) tabs: https://share.firefox.dev/3joE5Ad

Maybe this is a dupe of bug 1561095?

I should have realized a lot of my GSuite tabs are the same process :)

I captured another long input delay (no "Firefox Is Not Responding" warning from Windows): https://share.firefox.dev/36F4Qgj

That is spending a lot of time under MediaFeatureValuesChangedAllDocuments under ThemeChanged, redrawing a gazillion SVGs... Do you have something in your system changing stuff like system colors, etc periodically?

Other than that it seems very weird, ThemeChanged() is only supposed to happen very sporadically.

Oh wow, and the parent process seems stuck in some a11y code under GetProxiedAccessibleInSubtree for a huge time.

Attached file about:support from this laptop (deleted) —

(In reply to Emilio Cobos Álvarez (:emilio) from comment #2)

That is spending a lot of time under MediaFeatureValuesChangedAllDocuments under ThemeChanged, redrawing a gazillion SVGs... Do you have something in your system changing stuff like system colors, etc periodically?

I don't think I do as they never change. I have a 2017 Surface Laptop with an external 4k display. Here's my about:support.

(In reply to Emilio Cobos Álvarez (:emilio) from comment #3)

Oh wow, and the parent process seems stuck in some a11y code under GetProxiedAccessibleInSubtree for a huge time.

I think maybe my laptop's touch screen is enabling a11y support?

(In reply to Emilio Cobos Álvarez (:emilio) from comment #2)

That is spending a lot of time under MediaFeatureValuesChangedAllDocuments under ThemeChanged, redrawing a gazillion SVGs... Do you have something in your system changing stuff like system colors, etc periodically?

Could that have something to do with dragging windows from one screen to another?

Perhaps, maybe we incorrectly call themechanged when the resolution changes or something like that...

Depends on: 1670051

See also https://bugzilla.mozilla.org/show_bug.cgi?id=1551982#c4 for some previous investigation on ThemeChanged

Can you take a profile now that bug 1670051 is in? That should help narrow stuff a bit.

Flags: needinfo?(overholt)

(In reply to Andrew Overholt [:overholt] from comment #4)

(In reply to Emilio Cobos Álvarez (:emilio) from comment #3)

Oh wow, and the parent process seems stuck in some a11y code under GetProxiedAccessibleInSubtree for a huge time.

I think maybe my laptop's touch screen is enabling a11y support?

@ Jamie, do you know why the parent process might hang in GetProxiedAccessibleInSubtree? Does a Windows laptop with a touch screen enable a11y support in Firefox?

(In reply to Emilio Cobos Álvarez (:emilio) from comment #2)

That is spending a lot of time under MediaFeatureValuesChangedAllDocuments under ThemeChanged, redrawing a gazillion SVGs... Do you have something in your system changing stuff like system colors, etc periodically?

Other than that it seems very weird, ThemeChanged() is only supposed to happen very sporadically.

btw, I filed a similar Google Docs hang bug 1661311 two months ago and my profile there also includes excessive ThemeChanged notifications. In bug 1661311 comment 7, jgilbert proposed throttling/debouncing ThemeChanged notifications so they don't fire ASAP.

Severity: -- → S2
Fission Milestone: --- → M6c
Flags: needinfo?(jteh)
Priority: -- → P2

(In reply to Chris Peterson [:cpeterson] from comment #9)

Oh wow, and the parent process seems stuck in some a11y code under GetProxiedAccessibleInSubtree for a huge time.

@ Jamie, do you know why the parent process might hang in GetProxiedAccessibleInSubtree?

That very likely means an a11y client asked for an Accessible, but the target content process is blocked. The hard part to figure out here is whether that's because a11y is slow or whether that's a symptom of non-a11y blocking in the content process. My guess is the latter (because otherwise we'd see long a11y calls in the content process profile), but that unfortunately means we end up with jank in the parent process (because some a11y client happened to query for something while the content process was busy).

Does a Windows laptop with a touch screen enable a11y support in Firefox?

It seems to in some cases, especially on Surface machines. We have code to block this, but it either doesn't work any more or it doesn't work on certain devices. See bug 1531537, bug 1626789.

Flags: needinfo?(jteh)
Flags: needinfo?(overholt)

that profile looks very odd, doesn't hit the marker added in bug 1670051... I added another patch so that we record it on the child too...

Here's another but I don't think your patch from bug 1670051 will have made it into my build yet: https://share.firefox.dev/35kXXOX

Another (but I missed starting the profiler when it first happened so the first ~30 s of being unable to switch tabs): https://share.firefox.dev/3jc8hh3

I can reproduce this fairly consistently. STR (with Fission enabled):

  1. Have multiple tabs with multiple gDocs opened.
  2. Make sure the current tab with focus is a gDoc.
  3. Dock/undock my Jabra wireless headphones to/from their USB charging stand

At this point, the current gDoc tab gets very sluggish, it becomes hard to switch tabs, and when I can switch tabs, switching to another gSuite app shows the spinner in the middle of the content window of the tab.

Here is a profile switching from a gDoc to gCal: https://share.firefox.dev/2T8EbQO
Here is a profile switching from gDoc to gDoc: https://share.firefox.dev/31qaMq2

Here's a profile with a 2 minute event processing delay: https://share.firefox.dev/37m7ub3

Ok, so here's the stack from Mike's profile in the parent process... That's something :)

(root) []
0x7ff980a6cea1 [ntdll.dll]
BaseThreadInitThunk [KERNEL32.DLL]
__scrt_common_main_seh() [firefox.exe]
wmain(int, wchar_t**) [firefox.exe]
XRE_main(int, char**, mozilla::BootstrapConfig const&) [xul.dll]
XREMain::XRE_main []
XREMain::XRE_main(int, char**, mozilla::BootstrapConfig const&) [xul.dll]
XREMain::XRE_mainRun() [xul.dll]
nsAppStartup::Run() [xul.dll]
nsAppShell::Run() [xul.dll]
nsBaseAppShell::Run() [xul.dll]
MessageLoop::Run() [xul.dll]
MessageLoop::RunHandler() [xul.dll]
mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) [xul.dll]
nsThread::ProcessNextEvent(bool, bool*) [xul.dll]
nsBaseAppShell::OnProcessNextEvent(nsIThreadInternal*, bool) [xul.dll]
nsAppShell::ProcessNextNativeEvent(bool) [xul.dll]
0x7ff97fa4be16 [MSCTF.dll]
PeekMessageW [USER32.dll]
int _PeekMessage(struct tagMSG *,struct HWND__*,unsigned int,unsigned int,unsigned int,unsigned int,int) [USER32.dll]
NtUserPeekMessage [win32u.dll]
0x7ff980a9fe34 [ntdll.dll]
fnINDEVICECHANGE [USER32.dll]
DispatchClientMessage [USER32.dll]
int64_t UserCallWinProcCheckWow(struct _ACTIVATION_CONTEXT *,int64_t ( *)(struct tagWND *,unsigned int,uint64_t,int64_t),struct HWND__*,enum _WM_VALUE,uint64_t,int64_t,void *,int) [USER32.dll]
static nsWindow::WindowProc(HWND__*, unsigned int, unsigned long long, long long) [xul.dll]
static nsWindow::WindowProcInternal(HWND__*, unsigned int, unsigned long long, long long) [xul.dll]
nsWindow::ProcessMessage(unsigned int, unsigned long long&, long long&, long long*) [xul.dll]
nsPresContext::ThemeChanged() [xul.dll]
mozilla::base_profiler_markers_detail::AddMarkerToBuffer<mozilla::baseprofiler::markers::Text,nsTLiteralString<char> >(mozilla::ProfileChunkedBuffer&, mozilla::ProfilerStringView<char> const&, mozilla::MarkerCategory const&, mozilla::MarkerOptions&&, bool (*)(mozilla::ProfileChunkedBuffer&), nsTLiteralString<char> const&) [xul.dll]
profiler_capture_backtrace_into(mozilla::ProfileChunkedBuffer&) [xul.dll]

So Mike's issue is https://searchfox.org/mozilla-central/rev/25d5a4443a7e13cfa58eff38f1faa5e69f0b170f/widget/windows/nsWindow.cpp#5299-5309

So off-hand seems needed because stuff like device changes can change the any-pointer / any-hover media queries etc, but probably not for this device change, for once, and also we should probably try to detect media media query dependency better with SVGs....

The SVG code invalidating too much causes hangs with pages that have
lots of SVGs.

Assignee: nobody → emilio
Status: NEW → ASSIGNED
Pushed by ealvarez@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/40bc2dbc5201 Invalidate images for media query changes more granularly. r=heycam
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 84 Branch

Can you confirm this is better now? If it's still laggy we should figure out something else, there are other potential optimizations we could make.

Flags: needinfo?(mconca)
Flags: needinfo?(overholt)

Does this bug also affect non-Fission users? Should we uplift this fix to 83 Beta?

If this bug only affects Fission users, we don't need to uplift to Beta because Fission is only available in Nightly.

(In reply to Emilio Cobos Álvarez (:emilio) from comment #23)

Can you confirm this is better now? If it's still laggy we should figure out something else, there are other potential optimizations we could make.

It does not appear to be any better for me. Switching from gDoc to another gDoc results in a spinner, and that second gDoc is not responsive to user input for over a minute.

Latest profile: https://share.firefox.dev/2HpGMU0

Flags: needinfo?(mconca) → needinfo?(emilio)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---

This should make the optimization landed earlier in this bug apply for
some of the NotifyThemeChanged() calls in nsWindow.cpp which are causing
all the extra invalidations.

If we know that system colors/fonts didn't change, we can avoid doing a
bunch of reflow work and the patch from earlier in the bug can avoid
re-rasterizing images too.

Flags: needinfo?(emilio)

(With yesterday's (or the day before's?) build I got https://share.firefox.dev/31CTRjR but I think Emilio's work here will fix it so I'm just posting for posterity.)

Blocks: 1673318
Attachment #9183129 - Attachment description: Bug 1668875 - Distinguish theme changes that can and cannot affect style/layout. r=#style,tnikkel → Bug 1668875 - Distinguish theme changes that can and cannot affect style/layout. r=#style,tnikkel,mstange
Pushed by ealvarez@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/a7080bae91c1 Distinguish theme changes that can and cannot affect style/layout. r=tnikkel
Status: REOPENED → RESOLVED
Closed: 4 years ago4 years ago
Resolution: --- → FIXED

Can you confirm this looks better now? :-)

Flags: needinfo?(mconca)

Much, MUCH better. No spinners or slow-downs now. Nice work.

Flags: needinfo?(mconca)

I concur with Mike (17 days late). Thanks!

Flags: needinfo?(overholt)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: