630072 - quora.com bloats out of control, part 1: js_UnbrandAndClearSlots leaks

Reporter

Description

•

14 years ago

Numbers for linux32 bit running w/ Flash:

Memory according to Ubuntu
about:home - 27M
quora.com - 52M
after 200 quora DOMWindows - 430-460M

The initial load of Quora isn't bad. It's actually quite heavy in all browsers. But after clicking around for 200 DOMWindows in one tab in a debug build, our heap has grown out of control.

Reading my debug logs, I don't see DocShell bloat, but I do see DOMWindow bloat. I'll attach my shutdown log in a debug build. Notice the build shuts down cleanly (no leaks), but DOMWindows with very old serial numbers are still alive at shutdown. The last 10 DOMWindows destroyed are a grab bag:

--DOMWINDOW == 10 (0xad2cacc) [serial = 44] [outer = (nil)] [url = http://www.quora.com/Is-Mexico-a-part-of-North-America-or-Central-America]
--DOMWINDOW == 9 (0xbac54c4) [serial = 89] [outer = (nil)] [url = http://www.quora.com/Urban-Dictionary]
--DOMWINDOW == 8 (0x9a671834) [serial = 62] [outer = (nil)] [url = http://www.quora.com/Why-might-someone-think-that-being-called-a-gentleman-would-be-bad]
--DOMWINDOW == 7 (0x987fdc6c) [serial = 123] [outer = (nil)] [url = http://www.quora.com/Android-Market/Are-there-any-tools-that-help-Android-developers-monitor-sales]
--DOMWINDOW == 6 (0xc139a54) [serial = 100] [outer = (nil)] [url = http://www.quora.com/If-I-owned-a-domain-name-can-I-use-Posterous-iPhone-app-to-blog-there]
--DOMWINDOW == 5 (0xb3d40c4) [serial = 78] [outer = (nil)] [url = http://www.quora.com/Why-is-calamansi-not-in-the-Merriam-Webster-dictionary]
--DOMWINDOW == 4 (0x94c7485c) [serial = 133] [outer = (nil)] [url = http://www.quora.com/What-are-the-best-social-media-monitoring-product-websites]
WARNING: NS_ENSURE_TRUE(!(mAsyncExecutionThread)) failed: file /home/sayrer/dev/mozilla-central/storage/src/mozStorageConnection.cpp, line 674
--DOMWINDOW == 3 (0x9a31ecc) [serial = 7] [outer = (nil)] [url = http://www.quora.com/Business-Development]
--DOMWINDOW == 2 (0x9ecf0a74) [serial = 112] [outer = (nil)] [url = http://www.quora.com/Where-do-I-find-investors-for-an-Internet-startup-idea]
--DOMWINDOW == 1 (0xa4af277c) [serial = 200] [outer = (nil)] [url = http://www.quora.com/Business-Development]
--DOMWINDOW == 0 (0xe3781e4) [serial = 192] [outer = (nil)] [url = http://www.quora.com/Business-Development]
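For anyone checking their own shutdown log for the same pattern, here is a minimal sketch that pulls the serial numbers out of the `--DOMWINDOW` lines and flags the ones that are much older than the newest window. It assumes the line format quoted in the description; the function names and the age threshold of 100 are hypothetical choices for illustration, not part of any Mozilla tooling.

```python
import re

# Matches lines of the form quoted in the description:
# --DOMWINDOW == N (0xADDR) [serial = S] [outer = ...] [url = ...]
DOMWINDOW_RE = re.compile(
    r"--DOMWINDOW == (?P<remaining>\d+) \((?P<addr>0x[0-9a-fA-F]+)\) "
    r"\[serial = (?P<serial>\d+)\] \[outer = (?P<outer>[^\]]*)\] "
    r"\[url = (?P<url>[^\]]*)\]"
)


def parse_destroyed_windows(log_text):
    """Return (remaining, serial, url) tuples in the order they appear in the log."""
    return [
        (int(m.group("remaining")), int(m.group("serial")), m.group("url"))
        for m in DOMWINDOW_RE.finditer(log_text)
    ]


def stale_serials(records, min_age):
    """Serials that are at least min_age older than the newest serial seen."""
    if not records:
        return []
    newest = max(serial for _, serial, _ in records)
    return [serial for _, serial, _ in records if newest - serial >= min_age]


# Three of the lines from the description, used as sample input.
SAMPLE = """\
--DOMWINDOW == 3 (0x9a31ecc) [serial = 7] [outer = (nil)] [url = http://www.quora.com/Business-Development]
--DOMWINDOW == 1 (0xa4af277c) [serial = 200] [outer = (nil)] [url = http://www.quora.com/Business-Development]
--DOMWINDOW == 0 (0xe3781e4) [serial = 192] [outer = (nil)] [url = http://www.quora.com/Business-Development]
"""

records = parse_destroyed_windows(SAMPLE)
old = stale_serials(records, min_age=100)  # threshold is an arbitrary assumption
print(old)
```

Windows that only die at shutdown and have serials far below the newest one are the ones worth looking at.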

Robert Sayre

Reporter

Comment 1

•

14 years ago

Attached file shutdown log (deleted) — Details

Robert Sayre

Reporter

Updated

•

14 years ago

blocking2.0: --- → ?

Robert Sayre

Reporter

Comment 2

•

14 years ago

Just tried a similar exercise on Facebook.com and it went much better.

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 3

•

14 years ago

I wonder if in this case we keep the DOM Windows in bfcache and don't evict them for 20-30 mins: http://mxr.mozilla.org/mozilla-central/source/docshell/shistory/src/nsSHEntry.cpp#61

Robert, what are the exact steps to reproduce? I've never used (or even heard of) quora.com.

Robert Sayre

Reporter

Comment 4

•

14 years ago

I just logged into quora.com and clicked around using the suggested questions and tags.

OS: Linux → Windows CE

Robert Sayre

Reporter

Updated

•

14 years ago

OS: Windows CE → Linux

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 5

•

14 years ago

I guess I need some more exact steps to reproduce, since when I have only quora.com open and I click the questions and tags, DOMWindow count stays between 16 and 22. And memory usage goes up perhaps 40 megs after some page loads. This is on 32 bit linux, debug build, with Flash.

Robert Sayre

Reporter

Comment 6

•

14 years ago

I was able to prevent this bug by setting browser.sessionhistory.max_entries to 0. Does that mean it is bfcache related? I do get a lot of this warning with browser.sessionhistory.max_entries set to 50 (the default):

WARNING: NS_ENSURE_TRUE(txToRemove) failed: file /home/sayrer/dev/mozilla-central/docshell/shistory/src/nsSHistory.cpp, line 1251

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 7

•

14 years ago

I should probably just hide that warning, since it can happen in valid cases too. It is just that NS_ENSURE_TRUE is quite handy. But anyway, sounds like we're keeping DOMWindows in bfcache. Would be great to be able to reproduce.

Boris Zbarsky [:bzbarsky]

Comment 8

•

14 years ago

sayrer, is the problem there if browser.sessionhistory.max_entries is left at the default 50 but you change browser.sessionhistory.max_total_viewers to 0? What about setting max_total_viewers to 1?

Robert Sayre

Reporter

Comment 9

•

14 years ago

(In reply to comment #8)
> sayrer, is the problem there if browser.sessionhistory.max_entries is left at
> the default 50 but you change browser.sessionhistory.max_total_viewers to 0?

Still happens.

> What about setting max_total_viewers to 1?

Still happens.
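To make the three preference experiments from comments 6, 8 and 9 easy to repeat, here is a minimal sketch that renders them as `user_pref(...)` lines suitable for a profile's user.js. The preference names and values come from the comments above; the helper name and the experiment labels are hypothetical.

```python
def user_pref_line(name, value):
    """Format one preference in the user_pref("name", value); syntax used by user.js."""
    if isinstance(value, bool):
        rendered = "true" if value else "false"
    elif isinstance(value, int):
        rendered = str(value)
    else:
        # Strings are quoted; backslashes and double quotes are escaped.
        rendered = '"' + str(value).replace("\\", "\\\\").replace('"', '\\"') + '"'
    return f'user_pref("{name}", {rendered});'


# Settings tried in this bug, keyed by a descriptive (made-up) label.
EXPERIMENTS = {
    "max_entries=0 (comment 6, bug not seen)": {
        "browser.sessionhistory.max_entries": 0,
    },
    "max_total_viewers=0 (comment 9, still happens)": {
        "browser.sessionhistory.max_entries": 50,
        "browser.sessionhistory.max_total_viewers": 0,
    },
    "max_total_viewers=1 (comment 9, still happens)": {
        "browser.sessionhistory.max_entries": 50,
        "browser.sessionhistory.max_total_viewers": 1,
    },
}

for label, prefs in EXPERIMENTS.items():
    print(f"// {label}")
    for name, value in prefs.items():
        print(user_pref_line(name, value))
```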

Boris Zbarsky [:bzbarsky]

Comment 10

•

14 years ago

That doesn't obviously sound like bfcache to me, then...

Julian Seward [:jseward]

Comment 11

•

14 years ago

Attached image Overall growth (massif visualiser pic) (deleted) — Details

I can reproduce this, running on Massif (x64-linux). Just one tab. Don't even need to click through 200 questions, more like 20 ish. Attached is a png showing heap use going up and up. Two more pngs to follow.

Julian Seward [:jseward]

Comment 12

•

14 years ago

Attached image Top level heap allocators, about half way up (119 MB) (deleted) — Details

Julian Seward [:jseward]

Comment 13

•

14 years ago

Attached image Top level heap allocators at peak residency (215MB) (deleted) — Details

At least at first glance, there appears to be twice as much of pretty much everything, compared to the midpoint snapshot.

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 14

•

14 years ago

Well, that happens if there are many DOMWindows alive, especially if they are in bfcache when also presshell etc. are kept alive. I wish I could reproduce this. I'll retry tomorrow. Also, anyone who can produce this, does waiting 30mins (yes minutes), release some of the memory?

David Mandelin [:dmandelin]

Comment 15

•

14 years ago

(In reply to comment #14)
> Also, anyone who can produce this, does waiting 30mins (yes minutes),
> release some of the memory?

Is there a constant somewhere that can be changed to make the wait shorter?

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 16

•

14 years ago

See comment 3

Julian Seward [:jseward]

Comment 17

•

14 years ago

Attached image Results if left idle for 30 mins (deleted) — Details

(In reply to comment #14)
> Also, anyone who can produce this, does waiting 30mins (yes minutes),
> release some of the memory?

At least to a first approximation, no. The attached picture is a bit hard to make sense of, because the horizontal axis is in instructions, not wallclock time. What I did was:

1. start up, ramp up mem usage as discussed above. This takes us out to about 33 billion insns (3.3e+10 on x axis)
2. Left it idle for 45 mins
3. Scrolled up and down a few times, so as to burn up a couple billion instructions, in order to give an identifiable post-idle period on the graph.

I believe (3) produces the small plateau from about 3.4e10 to 3.6e10 (x axis) at around 1.15e+08 (y axis). So the memory use has fallen somewhat, but not by any means back to the starting level of around 2e+07.

I can provide the entire heap profile file for further examination if that's helpful, either in raw form (massif.out.pid, 33MB) or as text after passing through ms_print.

Nicholas Nethercote [inactive]

Comment 18

•

14 years ago

(In reply to comment #17)
> Created attachment 508601 [details]
> Results if left idle for 30 mins
>
> At least to a first approximation, no. The attached picture is a bit
> hard to make sense of, because the horizontal axis is in instructions,
> not wallclock time.

You can use Massif's --time-unit=ms option to get around this problem.
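As a minimal sketch of that suggestion, the following builds a Valgrind command line that runs Massif with a millisecond time axis. The `--tool`, `--time-unit` and `--massif-out-file` options are standard Valgrind/Massif options; the `./firefox` path and the output file name are assumptions for illustration, and the helper itself is hypothetical.

```python
import shlex


def massif_command(binary, out_file="massif.out", extra_args=()):
    """Build a Massif command line whose time axis is wallclock milliseconds."""
    return [
        "valgrind",
        "--tool=massif",
        "--time-unit=ms",                 # milliseconds instead of instructions executed
        f"--massif-out-file={out_file}",  # where the heap profile is written
        binary,
        *extra_args,
    ]


# "./firefox" is a placeholder path, not taken from this bug.
cmd = massif_command("./firefox")
print(shlex.join(cmd))
```

The resulting profile can then be passed through ms_print as before; only the unit on the horizontal axis changes.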

Boris Zbarsky [:bzbarsky]

Comment 19

•

14 years ago

smaug, I wonder whether we should add a bunch of NSPR logging so we can get some visibility into what session history is doing...

Johnny Stenback (:jst)

Comment 20

•

14 years ago

peterv, could you look into this with your patches from bug 628599 applied? Seems this is potentially bad, but if it's not, we can certainly unblock on here.

Assignee: nobody → peterv

blocking2.0: ? → final+

Whiteboard: [hardblocker]

Steve Roussey (:sroussey)

Comment 21

•

14 years ago

Not sure if this will just be noise: I sometimes see my FF4 session grow to gigabytes with only 2-4 tabs open. On these occasions it seems to grow and shrink when I shut down, repeating the cycle once for each tab that was open. Once while shutting down, this grow-and-shrink cycle caused it to crash (it was hitting close to 4GB). Unfortunately, I can't get anything from the crash report: https://crash-stats.mozilla.com/report/pending/d7f6e256-28cc-4a5a-9f34-e89a62110126

Nicholas Nethercote [inactive]

Updated

•

14 years ago

Blocks: mlk2.0

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 22

•

14 years ago

re-tested on 64bit linux using a build which has the patches for Bug 630947 and bug 614347. DOMWindow count stays between 16 and 24.

Peter Van der Beken [:peterv]

Comment 23

•

14 years ago

(In reply to comment #22)
> re-tested on 64bit linux using a build which has the patches for Bug 630947 and
> bug 614347. DOMWindow count stays between 16 and 24.

I'm having a hard time reproducing this even without those patches, DOMWindow count also stays around 16-24 and on shutdown all windows that are being released were still open.

blocking2.0: final+ → ?

Target Milestone: --- → mozilla2.0b11

Robert Sayre

Reporter

Updated

•

14 years ago

blocking2.0: ? → final+

Robert Sayre

Reporter

Comment 24

•

14 years ago

(In reply to comment #23)
>
> I'm having a hard time reproducing this even without those patches,

The best way to reproduce this is to click on a question, and then click on related questions. It's the question page DOMWindows that stay around. The others (like buttons, etc) are collected.

Attachments:
shutdown log (deleted), text/plain, 14 years ago, Robert Sayre
Overall growth (massif visualiser pic) (deleted), image/png, 14 years ago, Julian Seward [:jseward]
Top level heap allocators, about half way up (119 MB) (deleted), image/png, 14 years ago, Julian Seward [:jseward]
Top level heap allocators at peak residency (215MB) (deleted), image/png, 14 years ago, Julian Seward [:jseward]
Results if left idle for 30 mins (deleted), image/png, 14 years ago, Julian Seward [:jseward]
gdb log in defineGetter for InstallTrigger (deleted), text/plain, 14 years ago, Peter Van der Beken [:peterv]
patch (deleted), patch, 14 years ago, Andreas Gal :gal
Updated patch (deleted), patch, 14 years ago, Johnny Stenback (:jst), mrbkap : review+, dvander : review+
patch (deleted), patch, 14 years ago, Andreas Gal :gal
patch (deleted), patch, 14 years ago, Andreas Gal :gal, brendan : review+