Closed
Bug 403481
Opened 17 years ago
Closed 17 years ago
minefield memory usage keeps growing as png images are dynamically loaded (maps.google.com)
Categories
(Core :: Graphics, defect, P2)
Tracking
()
RESOLVED
FIXED
People
(Reporter: nirvn.asia, Assigned: vlad)
References
()
Details
(Keywords: memory-leak, regression)
Attachments
(3 files, 4 obsolete files)
(deleted),
image/png
|
Details | |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
pavlov
:
review+
|
Details | Diff | Splinter Review |
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9a9pre) Gecko/2007110404 Minefield/3.0a9pre
Build Identifier:
Memory usage of latest firefox minefield grows infinitely as it loads pngs. The problem can easily be seen by navigating maps.google.com for a couple of minutes.
It seems a loaded image will leak some part of itself as memory consumption keeps growing on maps.google.com even if you happen to request map tile image files you had previously charged. Furthermore, memory isn't released after closing page.
Reproducible: Always
Steps to Reproduce:
1. Go to maps.google.com
2. Spend five minutes exploring a part of the world
Actual Results:
Memory usage grows indefinitely
Expected Results:
Memory usage is not reaching +200meg in five minutes
This was confirmed also being a problem in safe mode with all extensions disabled, under a new profile.
Reporter | ||
Comment 1•17 years ago
|
||
Evidence A
Comment 3•17 years ago
|
||
I can confirm this issue. The memory allocation reach 315MB+ since the nov 11 build.
Updated•17 years ago
|
Status: UNCONFIRMED → NEW
Ever confirmed: true
Comment 4•17 years ago
|
||
Just as a data point, using RAMBack doesn't appear to recover any memory.
Comment 5•17 years ago
|
||
This just loads a single image, and then in a setInterval resets the img.src = src + ? + random number. Memory usage will creep up while viewing this page.
Comment 6•17 years ago
|
||
(In reply to comment #3)
> I can confirm this issue. The memory allocation reach 315MB+ since the nov 11
> build.
>
Sorry, It's with the Nov 7 build that this problem started.
Comment 7•17 years ago
|
||
So either Bug 386585 or Bug 399630?
Comment 9•17 years ago
|
||
(In reply to comment #8)
> backing out Bug 399630 doesn't seem to help.
But whoever debugs this, better to re-check, that is a small patch anyway.
Comment 10•17 years ago
|
||
I ran my testcase with trace-malloc. I dumped a bloat log right after loading it, and then let it run for a little bit and dumped another log. This attachment is diffbloatdump between the two.
Comment 11•17 years ago
|
||
And this is diffbloatdump --allocation-count on the same logs.
Updated•17 years ago
|
Attachment #288377 -
Attachment is obsolete: true
Updated•17 years ago
|
Attachment #288378 -
Attachment is obsolete: true
Comment 12•17 years ago
|
||
Ok, I re-ran with --depth=20 and filtered them through fix-linux-stack:
http://people.mozilla.com/~tmielczarek/allocations.1-2.diff.log
http://people.mozilla.com/~tmielczarek/allocations.1-2.count.diff.log
Each of those is ~3.5Mb.
Comment 13•17 years ago
|
||
So the bulk of the stuff there is:
* sqlite stuff from the URL classifier
* image-related stuff from the GTK drag code
My initial guess is that the URL classifier was still loading its db or something like that, and the leak here is related to the GTK drag code. But I could be wrong. Taking another diff after doing the steps to see the leak a few times longer than you did it for this log should help show which it is.
Reporter | ||
Comment 14•17 years ago
|
||
Hrm; why was this specific memory issue/leak was not discovered by one of the nice automated tests and leak detector mechanism that were setup in the past few months? Anyone? It seems this could have gone unnoticed for still quite some time, making it pretty hard to find the regression window.
Comment 15•17 years ago
|
||
Because we don't run google maps as a testcase? New leak testing is being worked on, that's not the issue here.
I re-ran using google maps, I guess my testcase isn't actually representative:
644289 malloc
785075 realloc (/home/luser/obj-firefox-debug/tools/trace-malloc/lib/../../../
../mozilla/tools/trace-malloc/lib/nsTraceMalloc.c:1182)
774892 PR_Realloc (/home/luser/obj-firefox-debug/nsprpub/pr/src/malloc/../..
/../../../mozilla/nsprpub/pr/src/malloc/prmem.c:483)
742156 NS_Realloc_P (/home/luser/obj-firefox-debug/xpcom/base/../../../moz
illa/xpcom/base/nsMemoryImpl.cpp:292)
742156 nsTArray_base::EnsureCapacity(unsigned int, unsigned int) (/home/
luser/obj-firefox-debug/xpcom/build/nsTArray.cpp:104)
742556 char* nsTArray<char>::AppendElements<char>(char const*, unsigne
d int) (/home/luser/obj-firefox-debug/modules/libpr0n/src/../../../dist/include/
xpcom/nsTArray.h:501)
742556 imgContainer::AddRestoreData(char const*, unsigned int) (/hom
e/luser/obj-firefox-debug/modules/libpr0n/src/../../../../mozilla/modules/libpr0
n/src/imgContainer.cpp:586)
742556 ReadDataOut (/home/luser/obj-firefox-debug/modules/libpr0n/
decoders/png/../../../../../mozilla/modules/libpr0n/decoders/png/nsPNGDecoder.cp
p:309)
Updated•17 years ago
|
Attachment #288349 -
Attachment is obsolete: true
Comment 16•17 years ago
|
||
Full log at: http://people.mozilla.com/~tmielczarek/gmaps-allocations.diff.log
~3.5Mb
Comment 17•17 years ago
|
||
wfm on Windows XP. Taking bug.
Status: NEW → ASSIGNED
Version: unspecified → Trunk
Updated•17 years ago
|
Assignee: nobody → glennrp
Status: ASSIGNED → NEW
Updated•17 years ago
|
Status: NEW → ASSIGNED
Comment 18•17 years ago
|
||
Has anyone verified that backing out the patch from bug #386585 stops the leakage?
Reporter | ||
Comment 19•17 years ago
|
||
ted, this is not google maps specific, leak has to do with loading images in general (seems to be png specific). I'm simply surprised it was not caught as part of other leak test that has images in it.
Btw, comment was meant to help, not to criticize :)
Updated•17 years ago
|
Flags: blocking1.9?
Reporter | ||
Comment 20•17 years ago
|
||
Has the libpng update (bug 403239) healed the fox?
Comment 21•17 years ago
|
||
There is nothing in that update having anything to do with memory allocation. What is more interesting is whether backing out bug #386585 heals it.
Comment 23•17 years ago
|
||
Notice that the original report says "Gecko/2007110404 Minefield/3.0a9pre".
20071104 was before libpng-1.2.22 was checked in.
Comment 24•17 years ago
|
||
Oh, comment 6 says Nov 7.
Someone needs to find the right regression range.
Comment 25•17 years ago
|
||
I've been reviewing checkins back to October 10, and the only one that looks suspicious is bug #296818. I didn't see anything else apparently involving PNG and memory allocations.
Reporter | ||
Comment 26•17 years ago
|
||
glenn, I didn't report the bug using a faulty build. I had to downgrade to November 4th nightly, not affected by this bug, as memory usage was hurting my poor 512meg ram operating system. Sorry for any confusion.
Reporter | ||
Comment 27•17 years ago
|
||
Confirming memory usage is normal using the hourly build (
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9a9pre) Gecko/2007110614 Minefield/3.0a9pre) right before libpng 1.2.22 was applied to trunk.
Remains to test whether following hourly with patch applied is problematic.
Reporter | ||
Comment 28•17 years ago
|
||
Hrm, memory usage is normal using the hourly build (Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9a9pre) Gecko/2007110620 Minefield/3.0a9pre) which, according to the hourly-archive page, has libpng 1.2.22 patch applied.
Smaug/anyone, have you tried a build without patch from bug 399630 applied?
Comment 29•17 years ago
|
||
(In reply to comment #20)
> Has the libpng update (bug 403239) healed the fox?
>
Has anyone tested this? I originally thought this update had nothing to
do with memory allocation, but it does remove use of snprintf() that was
introduced in bug #386585.
Comment 30•17 years ago
|
||
Memory usage seems normal using Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9b2pre) Gecko/2007111504
Comment 31•17 years ago
|
||
Mathiew, can we mark this FIXED now?
Comment 32•17 years ago
|
||
(In reply to comment #31)
> Mathiew, can we mark this FIXED now?
>
This bug is not resolved with the latest build
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9b2pre) Gecko/2007111620 Minefield/3.0b2pre ID:2007111620
Comment 33•17 years ago
|
||
oops, it's not added in the trunk yet, my bad
Reporter | ||
Comment 34•17 years ago
|
||
glenn, nope, memory usage is still out of control using Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9b2pre) Gecko/2007111800 Minefield/3.0b2pre
Reporter | ||
Comment 35•17 years ago
|
||
libpng might not be the guilty one. download more hourly builds to try and find the patch which led to leak, I came down to this:
The regression happened between 2007-11-06 20:55 and 2007-11-07 12:52
http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=all&branch=HEAD&branchtype=match&dir=&file=&filetype=match&who=&whotype=match&sortby=Date&hours=2&date=explicit&mindate=2007-11-06+20%3A55&maxdate=2007-11-07+12%3A52&cvsroot=%2Fcvsroot
Comment 36•17 years ago
|
||
There's an hourly archive at:
- http://hourly-archive.localgho.st/
Which has the last 14 days' builds. So if you are real quick, then you could get a more accurate range. But soon the builds from the 6th are going to fall off the end..
Comment 37•17 years ago
|
||
I keep getting hit by this now on windows...even though I said WFM the other day. I was using the hybrid mode (the satellite view with road names mode), panning around and firefoxs memory consumption just skyrocketed even after stopping panning...firefox soon became unresponsive and killed through task manager.
OS: Linux → Windows XP
Updated•17 years ago
|
OS: Windows XP → All
Updated•17 years ago
|
Keywords: mlk,
regression
Reporter | ||
Comment 38•17 years ago
|
||
steve, yep, am using this page. Doing my best but Cambodian internet connections are not reliable nor very fast.
Hope range we have now can be enough for coders to figure out which patch is guilty.
Comment 39•17 years ago
|
||
The change to nsWindow.cpp near the end of the patch for bug #380464, which is in the range looks a little suspicious.
if (mThebesSurface && !mThebesSurface->CairoStatus()) {
[some code removed]
else {
mThebesSurface = nsnull;
}
becomes
if (mThebesSurface && mThebesSurface->CairoStatus() != 0)
mThebesSurface = nsnull;
which seems to reverse the conditions for setting mThebesSurface to nsnull.
Comment 40•17 years ago
|
||
(In reply to comment #39)
Oh, never mind, they are the same. The new version is easier to read, too.
Comment 41•17 years ago
|
||
Ive just finished testing the hourly builds and as far as i can tell the regression starts in the 2007-11-06 22:57 build. Can someone confirm?
Comment 42•17 years ago
|
||
kritenks: There was a problem with Firefox downloading the large phishing-sites database when it was started with a fresh or outdated profile, that had similar symptoms (see bug #404645 and dupes, and Asa's comment about the Firefox 3 beta1 release on Mozillazine and slashdot). This probably confused some of our range testing. The range for the phishing bug seems to be from 19 Nov through 21 Nov, and affected earlier minefield nightly builds run during that period.
Kurt: Re: comment #37. Please see if the problem has gone away on Windows. It is working now on Windows for me, but locked up two days ago, so I suppose I was seeing the phishing database bug.
Comment 43•17 years ago
|
||
(In reply to comment #42)
> Kurt: Re: comment #37. Please see if the problem has gone away on Windows.
> It is working now on Windows for me, but locked up two days ago, so I suppose I
> was seeing the phishing database bug.
>
Must have been that bug because this WFM once again. Sorry about that I just remembered reading this bug a while ago and I was trying to find something with google maps as soon as I started firefox up and figured I finally hit this bug. Changing back to linux.
OS: All → Linux
Reporter | ||
Comment 44•17 years ago
|
||
so based on kritenks hourly test and mine, regression happened between 2007-11-06 20:50 to 22:57:
http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=all&branch=HEAD&branchtype=match&dir=&file=&filetype=match&who=&whotype=match&sortby=Date&hours=2&date=explicit&mindate=2007-11-06+20%3A50&maxdate=2007-11-06+22%3A57&cvsroot=%2Fcvsroot
that leaves us with 5 bugs:
bug 398880, getElementsByClassName('') leaks an nsVoidArray
Bug 313984, ASSERTION: Either two people are trying to submit or the previous submit
bug 394751, hang with long lines of text and page break characters
bug 396315, correctly fetch cmaps for all fonts
bug 399369, disable font fallback when a character is in the PUA
... all of them unfortunately seems unrelated to png image memory leak problem.
Comment 45•17 years ago
|
||
(In reply to comment #44)
> so based on kritenks hourly test and mine, regression happened between
> 2007-11-06 20:50 to 22:57:
Well that regression range in incorrect. According to the tinderbox pages there was no Linux build that started at 2007-11-06 20:50.
There was a build at 20:43 and the next one started at 20:55.
I am doing CVS pulls by time from the source tree from just before and just after the libpng update to verify it as the cause.
Comment 46•17 years ago
|
||
I found the real culprit.
This is a regression caused by the ceck-in for bug 380464.
This was verified via back-out.
Comment 47•17 years ago
|
||
Vlad can you take a look since this is caused by bug 380464?
Assignee: nobody → vladimir
Comment 48•17 years ago
|
||
I've been seeing a pixmap leak in minefield with the maps at www.openstreetmap.org. After a few hours I've seen xrestop report 4GB+ of pixmaps in use.
The commit for bug 380464 looks like it removes a gdk_pixmap_unref(pixmap) call. Perhaps I'm missing something, but how does it get freed now?
Comment 49•17 years ago
|
||
Resolving INVALID and removing the block of 1.9, per comments 42, 46, and 48.
Assignee: vladimir → glennrp
Flags: blocking1.9+
Updated•17 years ago
|
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → INVALID
Comment 50•17 years ago
|
||
How is this INVALID? There's a clear bug with a known regression range. It may not be related to your libpng update, but that doesn't make it any less of a bug.
Status: RESOLVED → REOPENED
Resolution: INVALID → ---
Updated•17 years ago
|
Assignee: glennrp → nobody
Status: REOPENED → NEW
Updated•17 years ago
|
Assignee: nobody → vladimir
Flags: blocking1.9?
Comment 51•17 years ago
|
||
OK. But shouldn't the flags show that this depends on bug #380464, instead of the other way around, and reopen bug #380464?
Status: NEW → ASSIGNED
Comment 52•17 years ago
|
||
This bug is already marked as blocking 380464. It's been that way since 2007-11-23 09:54:55 PST.
I'll leave it to Vlad to decide how he wants to handle this otherwise.
Comment 53•17 years ago
|
||
And no, the standard procedure on Bugzilla is that regressions from bugs are marked as blocking the original bug, not depends on. There's long discussion in the newsgroups about the reasoning behind that.
Comment 54•17 years ago
|
||
Just to clarify.
1. Bugs tat cause regressions are only re-opened if it is decided to clar the regression by backing out the patch.
2. The dependency filed has to do with fixing a bug. as in this bug cannot be fixed until the bug it depends on is fixed. SO, if you switch it form fixing to breaking you have to reverse the state. It is all logical if not confusing. I have filed bug 405215 that suggest modifying the filed labels to clarify the way we use them.
Comment 56•17 years ago
|
||
The attached patch seems to make the pixmap usage stable on my machine. While browsing the map it goes up to around 25MB and then drops back to 8MB about a minute after the tab is closed.
NB: The patch does not address the !xrenderFormat case.
I'm not an X, GTK or Mozilla coder so I can't claim that it is the right way to fix the problem.
Comment 57•17 years ago
|
||
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9b2pre) Gecko/2007112504 Minefield/3.0b2pre
Browsing http://maps.google.co.uk/ in hybrid mode for about 5 minutes caused System Monitor to report:
Process Name: firefox-bin
Virtual Memory: 245.8 MiB
Resident Memory: 89.0 MiB
Writable Memory: 72.3 MiB
Shared Memory: 15.1 MiB
X Server Memory: 1.7 GiB
Closing all tabs and waiting for a few minutes returns little or no memory. Test using hybrid mode for quickest reproduction.
Assignee | ||
Comment 58•17 years ago
|
||
Too much code got removed in 380464; put this rather important chunk back.
Attachment #290241 -
Flags: superreview?(pavlov)
Attachment #290241 -
Flags: review?(pavlov)
Updated•17 years ago
|
Attachment #290241 -
Flags: superreview?(pavlov)
Attachment #290241 -
Flags: superreview+
Attachment #290241 -
Flags: review?(pavlov)
Attachment #290241 -
Flags: review+
Comment 59•17 years ago
|
||
Attachment 290241 [details] [diff] fixes the massive leak I was seeing (while using Google Maps) in the X server's address space due to our leaking pixmaps (as shown by xrestop).
Comment 60•17 years ago
|
||
Attachment 290421 [details] [diff] works for me too.
There is still a possibility of a leak though: if gdk_pixmap_new() succeeds and the gfxXlibSurface() call fails.
Assignee | ||
Comment 61•17 years ago
|
||
Bulletproof some more and free the pixmap in all error cases.
Attachment #290241 -
Attachment is obsolete: true
Attachment #290287 -
Flags: review?(pavlov)
Updated•17 years ago
|
Attachment #290287 -
Flags: review?(pavlov) → review+
Assignee | ||
Comment 62•17 years ago
|
||
Fix checked in.
Status: ASSIGNED → RESOLVED
Closed: 17 years ago → 17 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•