Closed Bug 659633 Opened 13 years ago Closed 13 years ago

Pseudo-hang when a XUL tree element is instantiated

Categories

(Core :: Disability Access APIs, defect)

2.0 Branch
x86
Windows 7
defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla7
Tracking Status
firefox6 --- fixed

People

(Reporter: ecfbugzilla, Assigned: surkov)

References

(Blocks 1 open bug)

Details

(Keywords: perf, regression, verified-aurora)

Attachments

(4 files)

I noticed that after I use Firefox 4 for some time opening Adblock Plus preferences hangs - the window displays and freezes for several seconds then. This only happens after a lengthy browsing session which is why I don't know what is so special about this window and whether the problem happens on trunk as well, experiments involving a browser restart take way too much time. JavaScript Deobfuscator shows that the problem is connected to the tree widget, after the initial paint nsITreeView::GetTreeCell() is being called for all cells of all rows (not only the visible ones) several times which is what causes the issue (row count of something around ten thousand combined with a custom tree view implementation). I attached WinDbg to the process. This way I could determine that paint isn't the culprit here, it is fast as always. The call stack for the offending call: > 0052d298 7051b1c2 xul!nsXULTreeGridCellAccessible::Init(void) [e:\builds\moz2_slave\rel-2.0-w32-bld\build\accessible\src\xul\nsxultreegridaccessible.cpp @ 1100] > 0052d2a4 705423b8 xul!nsDocAccessible::BindToDocument(class nsAccessible * aAccessible = 0x081ce580, struct nsRoleMapEntry * aRoleMapEntry = 0x351b63e0)+0x6f [e:\builds\moz2_slave\rel-2.0-w32-bld\build\accessible\src\base\nsdocaccessible.cpp @ 1370] > 0052d2c8 703dc7b8 xul!nsXULTreeGridRowAccessible::GetCellAccessible(class nsITreeColumn * aColumn = 0x081ce580)+0x8d [e:\builds\moz2_slave\rel-2.0-w32-bld\build\accessible\src\xul\nsxultreegridaccessible.cpp @ 742] > 0052d2dc 70541c10 xul!nsXULTreeGridRowAccessible::GetChildAt(unsigned int aIndex = 0x81ce580)+0x1f [e:\builds\moz2_slave\rel-2.0-w32-bld\build\accessible\src\xul\nsxultreegridaccessible.cpp @ 712] > 0052d2f8 70541c26 xul!nsDocAccessible::CacheChildrenInSubtree(class nsAccessible * aRoot = 0x174859b4)+0x2b [e:\builds\moz2_slave\rel-2.0-w32-bld\build\accessible\src\base\nsdocaccessible.cpp @ 1952] > 0052d314 70541c26 xul!nsDocAccessible::CacheChildrenInSubtree(class nsAccessible * aRoot = 0x079461cc)+0x41 [e:\builds\moz2_slave\rel-2.0-w32-bld\build\accessible\src\base\nsdocaccessible.cpp @ 1955] > 0052d330 70541c6c xul!nsDocAccessible::CacheChildrenInSubtree(class nsAccessible * aRoot = 0x079461cc)+0x41 [e:\builds\moz2_slave\rel-2.0-w32-bld\build\accessible\src\base\nsdocaccessible.cpp @ 1955] > 0052d340 7054a85f xul!nsDocAccessible::NotifyOfInitialUpdate(void)+0x39 [e:\builds\moz2_slave\rel-2.0-w32-bld\build\accessible\src\base\nsdocaccessible.cpp @ 1514] > 0052d358 6fef0ee6 xul!nsDocAccessibleWrap::NotifyOfInitialUpdate(void)+0xf [e:\builds\moz2_slave\rel-2.0-w32-bld\build\accessible\src\msaa\nsdocaccessiblewrap.cpp @ 288] > 0052d360 6fef0a23 xul!nsCOMPtr<nsIProxyInfo>::operator=(class nsIProxyInfo * rhs = 0x083bf740)+0x18 [e:\builds\moz2_slave\rel-2.0-w32-bld\build\obj-firefox\dist\include\nscomptr.h @ 665] > 0052d368 6fd22b59 xul!nsIViewManager::UpdateViewBatch::EndUpdateViewBatch(unsigned int aUpdateFlags = 0x47ea6e3d)+0x1b [e:\builds\moz2_slave\rel-2.0-w32-bld\build\obj-firefox\dist\include\nsiviewmanager.h @ 336] > 0052d370 6fcce5ea xul!nsRefPtr<nsIDOMEventListener>::~nsRefPtr<nsIDOMEventListener>(void)+0xc [e:\builds\moz2_slave\rel-2.0-w32-bld\build\obj-firefox\dist\include\nsautoptr.h @ 970] > 0052d420 6fc39b04 xul!PresShell::FlushPendingNotifications(mozFlushType aType = Flush_Content (1))+0x32a [e:\builds\moz2_slave\rel-2.0-w32-bld\build\layout\base\nspresshell.cpp @ 4953] > 0052d43c 6fc44cf2 xul!TimerThread::UpdateFilter(unsigned int aDelay = 0x5a840fc0, class mozilla::TimeStamp aTimeout = class mozilla::TimeStamp, class mozilla::TimeStamp aNow = class mozilla::TimeStamp, double smoothSlack = 1.9716673574817238e-236)+0x254 [e:\builds\moz2_slave\rel-2.0-w32-bld\build\xpcom\threads\timerthread.cpp @ 236] > 0052d474 6fc44a5c xul!nsTimerImpl::Fire(void)+0x92 [e:\builds\moz2_slave\rel-2.0-w32-bld\build\xpcom\threads\nstimerimpl.cpp @ 409] > 0052d47c 6fcbf194 xul!nsTimerEvent::Run(void)+0x1c [e:\builds\moz2_slave\rel-2.0-w32-bld\build\xpcom\threads\nstimerimpl.cpp @ 519] > 0052d4b0 6fcf317a xul!nsThread::ProcessNextEvent(int mayWait = 0, int * result = 0x00000000)+0x324 [e:\builds\moz2_slave\rel-2.0-w32-bld\build\xpcom\threads\nsthread.cpp @ 633] > 0052d4ec 6fe0f5ba xul!mozilla::ipc::MessagePump::Run(class base::MessagePump::Delegate * aDelegate = 0x706f2554)+0x6a [e:\builds\moz2_slave\rel-2.0-w32-bld\build\ipc\glue\messagepump.cpp @ 110] > 0052d4f8 6fe0f5a3 xul!MessageLoop::RunInternal(void)+0x11 [e:\builds\moz2_slave\rel-2.0-w32-bld\build\ipc\chromium\src\base\message_loop.cc @ 219] > 0052d534 71398b90 xul!MessageLoop::RunHandler(void)+0x1d [e:\builds\moz2_slave\rel-2.0-w32-bld\build\ipc\chromium\src\base\message_loop.cc @ 203] That's where I am stuck right now, I don't know the accessibility code well enough to determine what is going wrong here. Anybody have a hint on what I should check? Note that I've seen similar pseudo-hangs before when closing a tab. Attaching the debugger showed tons of accessibility objects being destroyed. Firefox version: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1
Blocks: a11yperf
(In reply to comment #0) > That's where I am stuck right now, I don't know the accessibility code well > enough to determine what is going wrong here. Anybody have a hint on what I > should check? this is accessible tree creation, it looks like it's just take too long to create the tree for XUL trees. On the another hand for XUL trees I think we could create tree lazily, this should get rid the problem. Other question how does it happen that a11y is enabled for you?
(In reply to comment #1) > this is accessible tree creation, it looks like it's just take too long to > create the tree for XUL trees. Checked properly, I have a tree with 31k lines and four visible columns, I see 127k GetCellText calls - so there is apparently only one call per tree cell after all (plus some calls from regular paint operations). Which still isn't fun, a tree is supposed to work efficiently with large amounts of data, it should only consider visible rows rather than instantiating objects for each single cell. Apparently, this is happening for all tree widgets, not just the one in Adblock Plus. I've created a minimal testcase - it is also reporting that GetCellText() is being called for all tree cells, at least in the affected browser instance. A second browser instance running with an identical copy of the profile reports only calls for the visible rows. > Other question how does it happen that a11y is enabled for you? Good question. It seems to enable itself after an lengthy browsing session and I have no idea why that happens. As I said, right now I have two Firefox instances running on identical profiles - one is affected (has been running for a week), the other (started yesterday) isn't. The only preferences related to accessibility are: user_pref("accessibility.typeaheadfind.autostart", false); user_pref("accessibility.typeaheadfind.flashBar", 0); user_pref("accessibility.win32.force_disabled", true); The last one was added after I noticed hangs when closing browser tabs. Doesn't help obviously.
Attached file Testcase (deleted) —
This testcase needs to be copied somewhere where it is accessible via chrome://, then it can simply be opened in a browser tab. It contains a tree widget and will register a tree view for it that does almost nothing but counts GetCellText() calls. It also claims to have 20000 rows. The page will then display the number of calls. Normally it displays something like 84 - calls only for the visible rows. In my "broken" browser instance it hangs for a few seconds and displays 60175 then.
(In reply to comment #1) > Other question how does it happen that a11y is enabled for you? Is there a single flag controlling whether a11y is enabled? Maybe I can attach a breakpoint to it to see when this flag is flipped.
a11y gets enabled either when extension starts to use nsIAccessibiltityRetrieval or some application sends WM_GETOBJECT to Firefox. I don't think there's another way to start a11y on Windows (and your two-browsers experience looks quite strange). After a11y is started then there's no way to stop it until you close Firefox. Thanks, Wladimir, for the testcase, I think I can come with patch to fix this issue.
(In reply to comment #5) > a11y gets enabled either when extension starts to use > nsIAccessibiltityRetrieval Ok, found it without debugging - a search in the extensions directory was enough. It's DOM Inspector, it triggers nsIAccessibilityRetrieval when the inspector window is opened. I will file an additional DOMi bug, paying such a high price by default simply because of using DOMi sounds unreasonable.
Btw, the issue isn't reproducible in the current 7.0a1 nightly (which of course doesn't mean that it is really fixed).
No longer blocks: abp
Hm, DOMi shouldn't start a11y just because you run it (of course if View->Show Accessible Nodes isn't turned on). Please cc me on that bug.
(In reply to comment #7) > Btw, the issue isn't reproducible in the current 7.0a1 nightly (which of > course doesn't mean that it is really fixed). I guess it should be because a11y simply isn't started there.
My bad, it is reproducible in the nightly as well - but "Show accessible nodes" preference (inspector.dom.showAccessibleNodes) needs to be checked. And apparently it isn't checked by default, I must have switched it on at some point. So no DOM Inspector bug here.
Attached patch patch (deleted) — Splinter Review
Assignee: nobody → surkov.alexander
Status: NEW → ASSIGNED
Attachment #535562 - Flags: review?(trev.saunders)
Attachment #535562 - Flags: feedback?(marco.zehe)
(In reply to comment #3) > Created attachment 535278 [details] > Testcase > > This testcase needs to be copied somewhere where it is accessible via > chrome://, then it can simply be opened in a browser tab. It contains a tree I have tried, but don't get *any* reaction. Either I copy the file to a wrong place and it doesn't get loaded without giving me an error 404, or the file doesn't work on my system at all for some reason. Can you give me some more specific details where I could copy this to make sure it works? Thanks!
Attached file Test extension (deleted) —
Here the same testcase as an extension. Install it, then open chrome://testextension/content/test.xul in the browser. Normally it shows you an alert message saying something like "Number of calls: 87". If the number of calls is beyond 60000 then something is wrong.
Comment on attachment 535562 [details] [diff] [review] patch f=me. After I found a different way of loading the file properly, I found that this indeed speeds up matters. Also I ran the whole test suite, and all tests still pass.
Attachment #535562 - Flags: feedback?(marco.zehe) → feedback+
The extension definitely shows the bug fixed: I get a "number of calls" of 163. And all 20000 rows are there.
Comment on attachment 535562 [details] [diff] [review] patch >diff --git a/accessible/src/base/nsAccessible.h b/accessible/src/base/nsAccessible.h >--- a/accessible/src/base/nsAccessible.h >+++ b/accessible/src/base/nsAccessible.h >@@ -322,20 +322,34 @@ public: > */ > nsAccessible* GetEmbeddedChildAt(PRUint32 aIndex); > > /** > * Return index of the given embedded accessible child. > */ > PRInt32 GetIndexOfEmbeddedChild(nsAccessible* aChild); > >- PRUint32 GetCachedChildCount() const { return mChildren.Length(); } >- nsAccessible* GetCachedChildAt(PRUint32 aIndex) const { return mChildren.ElementAt(aIndex); } >+ /** >+ * Return number of content children/content child at index. The content >+ * child is created from markup in contrast to it's never constructed by its >+ * parent accessible (like treeitem accessibles for XUL trees). >+ */ >+ PRUint32 ContentChildCount() const { return mChildren.Length(); } >+ nsAccessible* ContentChildAt(PRUint32 aIndex) const >+ { return mChildren.ElementAt(aIndex); } >+ next time can the rename be a seperate patch from the real change?thanks :) Otherwise I think the patch does what we want. I'm a little worried about this causing other problems in xul trees so lets watch out for that.
Attachment #535562 - Flags: review?(trev.saunders) → review+
(In reply to comment #17) > next time can the rename be a seperate patch from the real change?thanks :) it makes sense for big patches, but here I'm not sure :)
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Flags: in-testsuite?
Resolution: --- → FIXED
Target Milestone: --- → mozilla7
Marco, let's get it running on trunk for few days and then ask approval for Firefox 6. This can result into quite serious performance regression.
We'll need a note to drivers here in the bug explaining the patch, and the risk.
(In reply to comment #21) > We'll need a note to drivers here in the bug explaining the patch, and the > risk. I don't see anything risky in the patch, changes may be not trivial but shouldn't regress us. Therefore I'd take couple days to make sure we don't spot regressions.
(In reply to comment #22) > (In reply to comment #21) > > We'll need a note to drivers here in the bug explaining the patch, and the > > risk. > I don't see anything risky in the patch, changes may be not trivial but > shouldn't regress us. Therefore I'd take couple days to make sure we don't > spot regressions. Unfortunately, this patch needs bug 660149 and bug 660153 to apply cleanly. Since Aurora is in pure bug fix mode from regressions introduced recently, I do currently not see how we can get this one into Firefox 6, especially when it depends on two other patches.
Marco, the fix is in nsDocAccessible::CacheChildrenInSubtree(nsAccessible* aRoot), anything else (including two other patches) is cleaning up.
OK, can you then please provide an Aurora-specific patch that contains the fix for this bug without any additional cleanup that we can then request approval on just in case? And also, can you clearly point out when this regressed? Is this an FX4 regression?
(In reply to comment #25) > OK, can you then please provide an Aurora-specific patch that contains the > fix for this bug without any additional cleanup that we can then request > approval on just in case? Sure, it's trivial. Trevor, can you take this? to make sure the reviewing is close enough to patch works ;) > And also, can you clearly point out when this > regressed? Is this an FX4 regression? yes.
Attached patch patch against aurora (deleted) — Splinter Review
apply only the fix against aurora. tests pass, and I've confirmed with the extension the number of calls was ~150
Attachment #538128 - Flags: approval-mozilla-aurora?
Additions to the approval request: This is a regression introduced in Firefox 4 which impairs all extensions doing lots of tree table manipulation when accessibility is turned on. It's a safe fix and improves performance by a big magnitude (see comments above for value comparisons).
Keywords: perf, regression
Attachment #538128 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Verified fixed in Aurora build: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:6.0a2) Gecko/20110616 Firefox/6.0a2
Keywords: verified-aurora
Depends on: 673757
Mozilla/5.0 (Windows NT 6.1; rv:7.0) Gecko/20100101 Firefox/7.0 Verified issue on Windows 7 using the test case attached in Comment 14 - the alert message shows up with a "Number of calls: 120". Setting resolution to VERIFIED FIXED.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: