Closed Bug 639780 Opened 14 years ago Closed 13 years ago

Significant memory leak introduced between 4.0b10 and 4.0b12; causing regular system OOMs

Categories

(Core Graveyard :: Plug-ins, defect)

x86_64
Linux
defect
Not set
critical

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: mozilla, Unassigned)

References

Details

(Keywords: memory-leak, regression, Whiteboard: [MemShrink:P2])

Attachments

(11 files)

User-Agent: Mozilla/5.0 (X11; Linux i686; rv:2.0b12) Gecko/20100101 Firefox/4.0b12
Build Identifier: Mozilla/5.0 (X11; Linux i686; rv:2.0b12) Gecko/20100101 Firefox/4.0b12

Two weeks ago I upgraded from 4.0b10 on both of my machines to 4.0b12 on both of my machines. Since then, one of the machines (named intercal) has started OOMing regularly, and the other (named perl) experienced abnormally high memory usage given how few tabs it has open. The OOMs on intercal seem to be correlated with abnormally high memory usage for Flash's plugin-container. I did not upgrade Flash anytime recently, so it would seem that the high Flash memory usage is related to my recent FF upgrade.

Here are my recent OOMs:

$ grep -A1 'Out of memory' /var/log/kern.log
Mar 6 09:35:38 intercal kernel: [5268880.338201] Out of memory: Kill process 424 (firefox-bin) score 370 or sacrifice child
Mar 6 09:35:38 intercal kernel: [5268880.338212] Killed process 424 (firefox-bin) total-vm:3768460kB, anon-rss:3037200kB, file-rss:0kB
--
Mar 6 17:33:16 intercal kernel: [5297538.593814] Out of memory: Kill process 16228 (firefox-bin) score 673 or sacrifice child
Mar 6 17:33:16 intercal kernel: [5297538.593826] Killed process 16296 (plugin-containe) total-vm:632640kB, anon-rss:100232kB, file-rss:1636kB
--
Mar 7 00:02:04 intercal kernel: [5320866.717173] Out of memory: Kill process 16228 (firefox-bin) score 582 or sacrifice child
Mar 7 00:02:04 intercal kernel: [5320866.717186] Killed process 2000 (plugin-containe) total-vm:309004kB, anon-rss:7280kB, file-rss:0kB
--
Mar 7 23:49:12 intercal kernel: [5406493.667396] Out of memory: Kill process 16228 (firefox-bin) score 494 or sacrifice child
Mar 7 23:49:12 intercal kernel: [5406493.667410] Killed process 4195 (plugin-containe) total-vm:355372kB, anon-rss:10648kB, file-rss:4kB
--
Mar 7 23:54:12 intercal kernel: [5406794.459170] Out of memory: Kill process 16228 (firefox-bin) score 495 or sacrifice child
Mar 7 23:54:12 intercal kernel: [5406794.459181] Killed process 16228 (firefox-bin) total-vm:5250364kB, anon-rss:4065652kB, file-rss:0kB

This last one, at 5GB, is between 2x and 4x the memory usage I saw under 4.0b10.

Note that I run two independent (`ff4 -no-remote` from different homedirs) FF instances on intercal. I only run Flash in one of those two instances (pid 16228 in the OOMs). Memory usage in the non-Flash instance is currently as follows:

Memory mapped: 1,432,354,816
Memory in use: 1,286,469,108

perl has 2 windows, 7 tabs in one, 22 tabs in the other. Memory usage is currently as follows:

Memory mapped: 1,439,694,848
Memory in use: 696,473,970

There was one GoDaddy tab I left open for about a day, at which point I noticed perl lagging badly. I closed that single tab on a guess that the memory usage was somehow related to JS, and FF immediately dropped about 600MB in RSS, which meant I could use my system again. At the same time, 700MB of RSS is pretty insane for 30 tabs.

Reproducible: Always

Steps to Reproduce:
Don't know yet. I tend to have a bunch of tabs open at a time (80-120 per FF instance on intercal). On 4.0b10 and earlier, with this quantity of tabs, the instances would generally run up to ~2GB apiece and sit there.
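For anyone triaging a similar report, the "Killed process" lines quoted above can be pulled apart mechanically to see which process was killed and how large it was at the time. The following is a minimal sketch, assuming the kernel log format shown in this report; the function name `parse_oom_kills` and the `SAMPLE` text are illustrative and not part of any Firefox or kernel tooling.

```python
import re

# Matches the "Killed process" lines written by the kernel OOM killer,
# in the format quoted in this report (other kernel versions may differ).
KILLED_RE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"total-vm:(?P<total_vm>\d+)kB, "
    r"anon-rss:(?P<anon_rss>\d+)kB, "
    r"file-rss:(?P<file_rss>\d+)kB"
)


def parse_oom_kills(log_text):
    """Return one dict per 'Killed process' line found in log_text."""
    kills = []
    for m in KILLED_RE.finditer(log_text):
        kills.append({
            "pid": int(m.group("pid")),
            "name": m.group("name"),
            "total_vm_kb": int(m.group("total_vm")),
            "anon_rss_kb": int(m.group("anon_rss")),
            "file_rss_kb": int(m.group("file_rss")),
        })
    return kills


# Three of the lines from the report, used here as sample input.
SAMPLE = """\
Mar 6 09:35:38 intercal kernel: [5268880.338212] Killed process 424 (firefox-bin) total-vm:3768460kB, anon-rss:3037200kB, file-rss:0kB
Mar 6 17:33:16 intercal kernel: [5297538.593826] Killed process 16296 (plugin-containe) total-vm:632640kB, anon-rss:100232kB, file-rss:1636kB
Mar 7 23:54:12 intercal kernel: [5406794.459181] Killed process 16228 (firefox-bin) total-vm:5250364kB, anon-rss:4065652kB, file-rss:0kB
"""

for kill in parse_oom_kills(SAMPLE):
    print(kill["pid"], kill["name"], kill["anon_rss_kb"], "kB anon-rss")
```

Sorting the result by `anon_rss_kb` makes it easy to separate the small plugin-container victims from the kills where firefox-bin itself held several gigabytes.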
Oh, I should mention that intercal has 8GB of ram (with no swap). perl has 1.5GB of ram and 400MB of swap.

The complete memory usage listing (from about:memory) for intercal's non-Flash instance is:

Memory mapped: 1,432,354,816
Memory in use: 1,351,033,924
malloc/allocated 1,351,039,412
malloc/mapped 1,432,354,816
malloc/committed 1,432,354,816
malloc/dirty 3,428,352
js/gc-heap 727,711,744
js/string-data 10,509,872
js/mjit-code 0
storage/sqlite/pagecache 20,116,632
storage/sqlite/other 1,981,776
images/chrome/used/raw 0
images/chrome/used/uncompressed 13,383,240
images/chrome/unused/raw 0
images/chrome/unused/uncompressed 0
images/content/used/raw 21,499,639
images/content/used/uncompressed 1,827,705
images/content/unused/raw 0
images/content/unused/uncompressed 0
layout/all 47,127,374
layout/bidi 630
gfx/surface/image 2,244,048
content/canvas/2d_pixel_bytes 8,471,040

The complete memory usage listing for perl is:

Memory mapped: 1,439,694,848
Memory in use: 696,473,970
malloc/allocated 696,477,434
malloc/mapped 1,439,694,848
malloc/committed 1,439,694,848
malloc/dirty 2,703,360
js/gc-heap 324,009,984
js/string-data 7,278,700
js/mjit-code 3,731,521
storage/sqlite/pagecache 115,389,704
storage/sqlite/other 1,561,992
images/chrome/used/raw 0
images/chrome/used/uncompressed 156,160
images/chrome/unused/raw 0
images/chrome/unused/uncompressed 0
images/content/used/raw 4,623,817
images/content/used/uncompressed 2,066,400
images/content/unused/raw 4,208
images/content/unused/uncompressed 23,120
layout/all 15,101,251
layout/bidi 1,766
gfx/surface/image 167,616
content/canvas/2d_pixel_bytes 6,246,596
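Listings like the two above are easier to compare once they are turned into numbers. Here is a rough sketch, assuming the Firefox 4-era about:memory text format pasted in this bug (reporter paths containing a `/`, plus the two "Memory ..." summary lines); `parse_about_memory` is a hypothetical helper name, not a Firefox API.

```python
import re

# Summary lines look like "Memory mapped: 1,439,694,848"; reporter lines look
# like "js/gc-heap 324,009,984". Whitespace between entries may be spaces or
# newlines, so the same pattern works on the flattened and the one-per-line form.
ENTRY_RE = re.compile(
    r"(Memory mapped|Memory in use):\s+(\d[\d,]*)"
    r"|(\S+/\S+)\s+(\d[\d,]*)"
)


def parse_about_memory(text):
    """Map each reporter name in a pasted listing to its value in bytes."""
    values = {}
    for m in ENTRY_RE.finditer(text):
        key = m.group(1) or m.group(3)
        raw = m.group(2) or m.group(4)
        values[key] = int(raw.replace(",", ""))
    return values


# A few entries from perl's listing, used as sample input.
PERL_SAMPLE = (
    "Memory mapped: 1,439,694,848 Memory in use: 696,473,970 "
    "malloc/allocated 696,477,434 js/gc-heap 324,009,984 "
    "storage/sqlite/pagecache 115,389,704 images/chrome/used/raw 0"
)

snapshot = parse_about_memory(PERL_SAMPLE)
share = snapshot["js/gc-heap"] / snapshot["Memory in use"]
print(f"js/gc-heap is {share:.1%} of memory in use")
```

Parsing each pasted snapshot this way allows a per-reporter diff between two points in time instead of comparing the numbers by eye.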
Assignee: nobody → general
Component: General → JavaScript Engine
Product: Firefox → Core
QA Contact: general → general
Version: unspecified → Trunk
Why is this a JS issue? It's plugin-container that's using all the RAM... the about:memory numbers above are for the main process only.
Assignee: general → nobody
Component: JavaScript Engine → Plug-ins
QA Contact: general → plugins
bz: It's possible that there are multiple issues here. The perl browser has zero plugins (to be specific, about:plugins says "No enabled plugins found"), and yet was using 600MB of RSS for a single, JS-laden tab. That's clearly not a plugin-container bug.

At this point, I've restarted the intercal Flash browser, but haven't used Flash at all (I use Flashblock). Memory seems to be holding steady (if you count 500MB RSS swings as "steady"), but most of the OOMs have happened while I've been away at work, so we'll see what happens when I get back home tonight. I set up a quick loop to capture memory usage (`ps u`) for both processes at 60-second intervals, just to see if that shows anything interesting.

If there's anything you want me to try, let me know; I'm all ears.
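A sampling loop of the kind described above might look roughly like the following. This is a sketch under stated assumptions, not the reporter's actual script: it reads `VmRSS` from `/proc/<pid>/status` (Linux only) rather than calling `ps u`, and the names `rss_kb_from_status`, `sample_rss_kb`, and `monitor` are made up for illustration.

```python
import time


def rss_kb_from_status(status_text):
    """Extract VmRSS (in kB) from the text of /proc/<pid>/status, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:   3215196 kB".
            return int(line.split()[1])
    return None


def sample_rss_kb(pid):
    """Read the current resident set size of a live process (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return rss_kb_from_status(f.read())


def monitor(pids, out_path, interval_s=60, samples=None):
    """Append 'timestamp pid rss_kb' lines to out_path every interval_s seconds.

    Runs forever when samples is None, otherwise stops after that many rounds.
    """
    taken = 0
    while samples is None or taken < samples:
        # Mode "a" appends, so restarting the monitor never truncates old data.
        with open(out_path, "a") as out:
            for pid in pids:
                try:
                    rss = sample_rss_kb(pid)
                except (FileNotFoundError, ProcessLookupError):
                    rss = None  # the process has exited, e.g. it was OOM-killed
                out.write(f"{int(time.time())} {pid} {rss}\n")
        taken += 1
        if samples is None or taken < samples:
            time.sleep(interval_s)
```

A call such as `monitor([16228], "rss.log")` would log one line per minute for that pid; the pid and file name here are placeholders.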
Blocks: mlk2.0
Keywords: mlk, regression
Note that I locked my screen and went to sleep immediately after I started running the measurements. Despite all the sawtoothiness, though, the base memory rate seems to be holding steady. We'll see if that holds up throughout the day while I'm at work. Again, note that even though this browser has the Flash plugin ("Flash+" in the title), I use Flashblock and have not yet tried playing any Flash content.
> That's clearly not a plugin-container bug. Sure, but it's also not clearly a "bug" depending on what the page is doing... Worth investigating, for sure.
Keywords: mlk
x0 on this graph is the same time as x0 on the other graph. Also, the Y axis limits are identical between the two (0:1,700,000 bytes). Clearly, in addition to the sawtoothiness, there's a linear increase in RSS, which I'd classify as a leak of some sort. Again, we'll see where this goes once I get back from work today. And to reiterate, this happened while the machine was locked and I was sleeping, so there was no human interaction until the very end of the data on the graph. Finally, I haven't yet been, but I'll start recording similar metrics for perl; my memory usage has been going up, so I presume it's seeing some kind of leak as well, but it'd be useful to see what the actual data looks like.
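One way to separate a steady ramp from GC sawtooth noise is to fit a straight line to the samples and look at the slope. The sketch below uses an ordinary least-squares fit from first principles; the data is synthetic and chosen only for illustration, since the original graph data is not attached to this bug, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two or more samples, with matching lengths")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic data, NOT the reporter's measurements: a 1,000 kB/minute ramp
# with a GC-like sawtooth of up to 200,000 kB layered on top.
minutes = list(range(600))
rss_kb = [900_000 + 1_000 * t + 200_000 * ((t % 30) / 30) for t in minutes]

slope, intercept = fit_line(minutes, rss_kb)
print(f"underlying trend: about {slope:.0f} kB per minute")
```

A slope that stays clearly positive over many hours of idle time, as in the graph described here, points to growth that the periodic collections are not reclaiming.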
Oh, here's a current memory breakdown for intercal's non-flash browser (that's showing the linear ramp in memory usage):

Memory mapped: 1,579,155,456
Memory in use: 1,536,258,396
malloc/allocated 1,536,206,540
malloc/mapped 1,579,155,456
malloc/committed 1,579,155,456
malloc/dirty 2,519,040
js/gc-heap 871,366,656
js/string-data 12,834,334
js/mjit-code 0
storage/sqlite/pagecache 20,149,648
storage/sqlite/other 2,034,104
images/chrome/used/raw 0
images/chrome/used/uncompressed 13,383,240
images/chrome/unused/raw 0
images/chrome/unused/uncompressed 0
images/content/used/raw 21,499,639
images/content/used/uncompressed 1,827,705
images/content/unused/raw 0
images/content/unused/uncompressed 0
layout/all 47,126,301
layout/bidi 630
gfx/surface/image 2,244,048
content/canvas/2d_pixel_bytes 8,471,040
FWIW, it just occurred to me that the units in both graphs are off by three orders of magnitude; they should be kilobytes, not bytes (so the y scales run from 0 to 1.7GB).
I returned home to an unresponsive-but-not-yet-OOM-killed Flash+ browser. Flash- is sitting pretty. Graphs from all three browser instances forthcoming.
Attached image intercal non-flash memory usage graph (deleted)
The non-flash instance on intercal continues its nice linear ramp
The flash-enabled instance was fine for a long while, and then it suddenly blew up. To reiterate, I was at work at the time, and the machine was locked. Also, as I had mentioned, the Flash+ instance is completely unresponsive right now. It doesn't redraw anything. It _is_ still alive, though, so I'll leave it and see what happens. Given the memory usage, though, I imagine something might get OOM-killed by the morning:

$ free -to
         total     used    free  shared  buffers  cached
Mem:   8199588  8065452  134136       0   134964  285508
Swap:        0        0       0
Total: 8199588  8065452  134136
Attached image Memory usage graph for perl's browser (deleted)
perl seems pretty well-behaved. The noise at the beginning and end is from when I was using the browser to interact with bugzilla (this morning and now). As a reminder, perl's browser has no plugins (namely, no Flash)
Finally, this is the current memory breakdown for intercal's non-flash browser (that's showing the linear ramp in memory usage). Note that js/gc-heap is growing steadily with each one of these I post:

Memory mapped: 1,858,076,672
Memory in use: 1,827,098,152
malloc/allocated 1,827,105,688
malloc/mapped 1,858,076,672
malloc/committed 1,858,076,672
malloc/dirty 1,368,064
js/gc-heap 1,138,753,536
js/string-data 14,886,114
js/mjit-code 0
storage/sqlite/pagecache 20,218,224
storage/sqlite/other 2,034,440
images/chrome/used/raw 0
images/chrome/used/uncompressed 13,383,240
images/chrome/unused/raw 0
images/chrome/unused/uncompressed 0
images/content/used/raw 21,499,639
images/content/used/uncompressed 1,827,705
images/content/unused/raw 0
images/content/unused/uncompressed 0
layout/all 47,127,374
layout/bidi 630
gfx/surface/image 2,244,048
content/canvas/2d_pixel_bytes 8,471,040

And here's perl's:

Memory mapped: 1,437,597,696
Memory in use: 739,658,102
malloc/allocated 739,661,566
malloc/mapped 1,437,597,696
malloc/committed 1,437,597,696
malloc/dirty 2,412,544
js/gc-heap 324,009,984
js/string-data 6,504,054
js/mjit-code 6,118,204
storage/sqlite/pagecache 137,863,000
storage/sqlite/other 1,749,384
images/chrome/used/raw 0
images/chrome/used/uncompressed 157,184
images/chrome/unused/raw 0
images/chrome/unused/uncompressed 0
images/content/used/raw 5,154,241
images/content/used/uncompressed 4,202,468
images/content/unused/raw 0
images/content/unused/uncompressed 0
layout/all 17,555,945
layout/bidi 1,486
gfx/surface/image 91,840
content/canvas/2d_pixel_bytes 6,246,596
First off: if there's anything people want me to try/check/whatever, please let me know. Nothing would make me happier than if this were fixed, so if there's some (reasonable) way I can be of assistance, please speak up.

That said, a short update. Unfortunately, an errant > instead of >> last night killed my previous data; oops.

That said, intercal flash+ memory usage went up slightly after I grabbed a core dump last night, and then dropped by 35MB overnight. It's currently sitting at 3215196kB of RSS, solid; no fluctuation at all.

perl instance is sitting right around 759MB. Base rate is flat, and it's getting tiny GC fluctuations (in other words, looks very reasonable).

intercal flash- is continuing its ramp; current detailed numbers are below. We're up to 1.3GB classified as js/gc-heap:

Memory mapped: 2,031,091,712
Memory in use: 1,920,810,456
malloc/allocated 1,920,817,992
malloc/mapped 2,031,091,712
malloc/committed 2,031,091,712
malloc/dirty 3,354,624
js/gc-heap 1,308,622,848
js/string-data 12,601,150
js/mjit-code 0
storage/sqlite/pagecache 20,259,928
storage/sqlite/other 2,141,136
images/chrome/used/raw 0
images/chrome/used/uncompressed 13,384,712
images/chrome/unused/raw 0
images/chrome/unused/uncompressed 0
images/content/used/raw 21,498,361
images/content/used/uncompressed 1,825,657
images/content/unused/raw 0
images/content/unused/uncompressed 0
layout/all 47,259,690
layout/bidi 630
gfx/surface/image 2,244,048
content/canvas/2d_pixel_bytes 8,471,040
Double-update for intercal flash-.

Sometime yesterday:

Memory mapped: 2,497,708,032
Memory in use: 2,409,595,346
malloc/allocated 2,409,602,882
malloc/mapped 2,497,708,032
malloc/committed 2,497,708,032
malloc/dirty 3,104,768
js/gc-heap 1,768,947,712
js/string-data 16,135,706
js/mjit-code 0
storage/sqlite/pagecache 20,364,064
storage/sqlite/other 2,140,408
images/chrome/used/raw 0
images/chrome/used/uncompressed 13,384,712
images/chrome/unused/raw 0
images/chrome/unused/uncompressed 0
images/content/used/raw 21,498,361
images/content/used/uncompressed 1,825,657
images/content/unused/raw 0
images/content/unused/uncompressed 0
layout/all 47,273,051
layout/bidi 630
gfx/surface/image 2,244,048
content/canvas/2d_pixel_bytes 8,471,040

Half an hour ago:

Memory mapped: 2,942,304,256
Memory in use: 2,834,912,766
malloc/allocated 2,834,920,302
malloc/mapped 2,942,304,256
malloc/committed 2,942,304,256
malloc/dirty 3,125,248
js/gc-heap 2,204,106,752
js/string-data 17,475,580
js/mjit-code 0
storage/sqlite/pagecache 20,432,920
storage/sqlite/other 2,192,712
images/chrome/used/raw 0
images/chrome/used/uncompressed 13,397,864
images/chrome/unused/raw 0
images/chrome/unused/uncompressed 0
images/content/used/raw 21,498,361
images/content/used/uncompressed 1,825,657
images/content/unused/raw 0
images/content/unused/uncompressed 0
layout/all 47,281,341
layout/bidi 630
gfx/surface/image 2,257,200
content/canvas/2d_pixel_bytes 8,471,040
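Taken together, the six listings posted so far for intercal's non-Flash instance allow a quick check of how much of the overall heap growth is js/gc-heap. The sketch below copies the `malloc/allocated` and `js/gc-heap` values from those listings; because the comments do not give exact timestamps, it compares sizes only and makes no claim about rates. The helper names are illustrative.

```python
# (malloc/allocated, js/gc-heap) in bytes, copied from the about:memory
# listings posted in this bug for intercal's non-Flash instance, oldest first.
SNAPSHOTS = [
    (1_351_039_412, 727_711_744),
    (1_536_206_540, 871_366_656),
    (1_827_105_688, 1_138_753_536),
    (1_920_817_992, 1_308_622_848),
    (2_409_602_882, 1_768_947_712),
    (2_834_920_302, 2_204_106_752),
]


def step_deltas(snapshots):
    """Growth in (allocated, gc_heap) between each pair of adjacent snapshots."""
    return [
        (later[0] - earlier[0], later[1] - earlier[1])
        for earlier, later in zip(snapshots, snapshots[1:])
    ]


def gc_heap_share_of_growth(first, last):
    """Fraction of the growth in malloc/allocated that is js/gc-heap growth."""
    allocated_growth = last[0] - first[0]
    if allocated_growth == 0:
        raise ValueError("malloc/allocated did not change between snapshots")
    return (last[1] - first[1]) / allocated_growth


for d_alloc, d_gc in step_deltas(SNAPSHOTS):
    print(f"allocated +{d_alloc:,} bytes, js/gc-heap +{d_gc:,} bytes")

share = gc_heap_share_of_growth(SNAPSHOTS[0], SNAPSHOTS[-1])
print(f"js/gc-heap share of total growth: {share:.1%}")
```

On these figures, nearly all of the growth in `malloc/allocated` between the first and last listing is accounted for by `js/gc-heap`, which is consistent with the observation that the other reporters stay essentially flat.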
So, I woke up yesterday morning to the revelation that _all three_ of my browser instances (across both machines) had been OOM-killed. Fun graphs forthcoming. I think my conclusion so far is that there's a slow, consistent memory leak somewhere (which causes js/gc-heap to grow without bound), as well as a spontaneous leak which causes the browser to blow up pretty immediately. One thought: is it possible that having dialogs open somehow hampers GC effectiveness/retards GC frequency/something like that? For the intercal/Flash+ instance that blew up and then stopped responding, iirc it did have a "do you want to accept this cookie?" dialog open (judging by the window title; the contents weren't being drawn).
Does the memory leak still occur with all add-ons disabled?
Same constant-base-rate linear ramp
It looked reasonable up until everything broke. It's interesting to note that the base level of memory usage increased after each large swing (this also goes for some of the swings from ~800 to ~1500 minutes). Also interesting is that each large swing shows at least two different slopes — it starts bleeding memory at a constant rate, and then at some point, it starts going at an even faster constant rate.
Attached image Pre-OOM memory usage for perl (deleted)
Perl was well-behaved until things took a sudden turn for the worse. I should note that this browser instance wasn't really killed by the OOM-killer; it hit the "out of memory" message which, presumably, means that FF's custom malloc was unable to allocate more memory and killed the process. It's interesting that the rate of increase tapered off some near the end.
Had a power outage, so sort of forgot about all this until just now. I'll set up my monitoring again.

[1739260.164336] Out of memory: Kill process 28908 (firefox-bin) score 570 or sacrifice child
[1739260.164351] Killed process 28908 (firefox-bin) total-vm:5782716kB, anon-rss:4681200kB, file-rss:0kB

This was the non-flash instance on intercal.

As for running without plugins, I can probably do a short A/B test on the nonflash intercal instance, but I find the web-browsing experience to be nearly unbearable without vertical tabs, so it'll have to be a short-lived experiment.

For the FF folks: is there any way to get a feel for where the memory is actually being used/held, rather than just "it's in the JS heap"? It would be pretty useful if I could see the memory footprint of each tab (and it'd make it that much easier to figure out if it's a problem related to resources for closed or open tabs).
Not yet. I think Mike Shaver is working on something there.
Keywords: mlk
No longer blocks: mlk2.0
No longer blocks: mlk-fx5+
Whiteboard: [MemShrink:P2]
Omari: any chance you can try your experiments again with a Nightly build of FF6 from http://nightly.mozilla.org/? Numerous leak fixes and similar improvements have occurred since the FF4 betas you tried. Thanks!
In particular, bug 656120 has been fixed, and it has helped with a lot of bug reports involving slow increases in memory usage.
I'm going to close this; there hasn't been any extra info from the reporter in two months, and there's a good chance some MemShrink-related fixes have fixed the original problem. Omari, please reopen if you can still reproduce with Firefox 7 or later; also note that Firefox 7 has per-compartment reporters in about:memory, which give a lot more detail about how the JS engine uses memory.
Status: UNCONFIRMED → RESOLVED
Closed: 13 years ago
Resolution: --- → INCOMPLETE
Howdy, all. Sorry for the lack of response for awhile. Here's a quick update and the current state of affairs: For one, I think the bugfix that njn mentioned did fix the immediate issue of FF's memory usage growing to the point of getting OOM-killed repeatedly. That said, memory usage is still significantly higher than I'd expect. I'm currently running Nightly 2011-10-01 (10.0a1) on perl. I'm currently at 2.5GB resident for just one of my two FF instances. Things that look ultra-suspicious are 322MB for "gc-heap-chunk-dirty-unused" and 700MB for "heap-unclassified". In particular, I recorded the reported memory usage (attachments forthcoming) before and after hitting the "Minimize memory usage" button. gc-heap-chunk-dirty-unused decreased by a whole 2MB and heap-unclassified by 7MB. By percentage, this is nothing.
Status: RESOLVED → UNCONFIRMED
Resolution: INCOMPLETE → ---
Note in particular gc-heap-chunk-dirty-unused and heap-unclassified.
Attachment #564640 - Attachment mime type: application/octet-stream → application/xhtml+xml
Attachment #564646 - Attachment mime type: application/octet-stream → application/xhtml+xml
Attached file Memory usage after browser restart (deleted)
In a short moment of non-stupidity, I grabbed a memory usage dump after restarting the browser. To be explicit, this was a hard shutdown rather than a clean shutdown — I hit ^C in the console I was running the browser from. As such, the browser should have the same state as it did before I killed it (and in particular, I've waited for it to finish lazy-loading all of my zillion tabs). After the restart, the browser's resident size is reduced by _1 GB_. Referring to the metrics I mentioned before, heap-unclassified is down by 300MB to 430MB (from 700MB before), and gc-heap-chunk-dirty-unused is down to 26MB (from 320MB). That alone is a savings of 0.6GB.
heap-unclassified of 20-30% isn't that unusual, unfortunately. We have various efforts to improve this linked off of Bug 563700. gc-heap-chunk-dirty-unused I think is the result of heap fragmentation. Until we get a much fancier GC, we probably won't be able to avoid that entirely. How long did you have the browser running before you restarted it? A few hours? A few days? Some addons can also cause memory use to increase, so you can try disabling those. Unfortunately, without more specifics about what pages or browsing behavior are causing your memory usage to increase, there's nothing we can really act on here. Also, you can copy and paste about:memory as plain text, which makes it a little easier to view.
Attachment #564671 - Attachment mime type: application/octet-stream → application/xhtml+xml
Omari, thanks for the updates. I'm going to close this again, because there's no longer a clear problem, such as a leak. If Firefox is using more memory than you might expect, that alone doesn't make for a terribly good bug report. There's not enough data here to take any concrete actions, unfortunately.

The good news is that we have many other bugs open for reducing memory usage in general; search for bugs with "MemShrink" in the whiteboard :)

Nb: We have bug 668809 open for the goal of startup memory usage matching memory usage after browsing for some time.
Status: UNCONFIRMED → RESOLVED
Closed: 13 years ago
Resolution: --- → INCOMPLETE
Product: Core → Core Graveyard