Closed
Bug 708499
Opened 13 years ago
Closed 13 years ago
Reproducible memory leak when calling nsISemanticUnitScanner
Categories
(Core :: XPConnect, defect)
Tracking
()
VERIFIED
FIXED
Tracking | Status | |
---|---|---|
firefox9 | + | fixed |
firefox10 | + | fixed |
firefox11 | --- | unaffected |
People
(Reporter: simon, Assigned: bholley)
References
Details
(4 keywords, Whiteboard: [MemShrink:P1] [qa!], [qa!:9], [qa!:10])
Attachments
(7 files)
(deleted),
text/plain
|
Details | |
(deleted),
patch
|
akeybl
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
bzbarsky
:
review+
akeybl
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
bzbarsky
:
review+
akeybl
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
bzbarsky
:
review+
akeybl
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
bzbarsky
:
review+
akeybl
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
bzbarsky
:
review+
akeybl
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
The attached script reproducibly causes Firefox to consume an additional ~500MB of memory that it won't release after clicking "Minimize Memory Usage" in about:memory. The memory all goes into heap-unclassified, not into js. Two of us have reproduced this on Firefox 9.0b4. The script needs Chrome privileges (e.g., by pasting the code in ExecuteJS).
This seems to be new in Firefox 9; I have tested on both Firefox 8 and Nightly, and it doesn't seem to happen in either of them. It's probably more general than an issue with nsISemanticUnitScanner, since nsISemanticUnitScanner hasn't changed in quite a long time. Our example code is a simplified version of code used in our Firefox extension. Apologies if this has already been reported in another bug, but since I have no idea where the leak is actually taking place.
Reporter | ||
Updated•13 years ago
|
tracking-firefox9:
--- → ?
Reporter | ||
Updated•13 years ago
|
Keywords: regression
Updated•13 years ago
|
Attachment #579921 -
Attachment mime type: application/octet-stream → text/plain
Comment 1•13 years ago
|
||
So this happens in Firefox 9 but not on trunk?
Reporter | ||
Comment 2•13 years ago
|
||
Firefox 9 and 10 leak. Firefox 8 and the trunk don't.
Comment 3•13 years ago
|
||
OK. I'm working on a beta build, but in the meantime would you be willing to try finding a regression range for this? Either for the problem appearing or going away?
Comment 4•13 years ago
|
||
Seems to have gone away between 2011-11-25-03-10-16-mozilla-central and 2011-11-26-03-10-27-mozilla-central.
Comment 5•13 years ago
|
||
Dan, what are the build IDs for those in about:config?
Comment 6•13 years ago
|
||
20111125031016 and 20111126031027
Comment 7•13 years ago
|
||
No, I meant the part that looks like an http://hg.mozilla.org/mozilla-central/... url.
Comment 8•13 years ago
|
||
Reporter | ||
Comment 9•13 years ago
|
||
First appeared between 9/26 and 9/27 nightlies: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=c722928d8b69&tochange=1f800c226837
Comment 10•13 years ago
|
||
Er, yes. Sorry about the type in comment 5. :(
So the fix range is http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=84117219ded0&tochange=c58bad0b4640
That includes a bunch of changes to xpconnect in the second inbound merge; I bet one of those fixed this.
For the original regression range, my money is totally on bug 683802.
Bobby, can you take a look? I bet we're failing to NS_Free the heap-allocated PRUnichar* string we create for the in wstring param. Of course I don't understand why the hell we heap-allocate there to start with.....
Blocks: 683802
Status: UNCONFIRMED → NEW
tracking-firefox10:
--- → ?
Component: General → XPConnect
Ever confirmed: true
Keywords: mlk
QA Contact: general → xpconnect
Assignee | ||
Comment 11•13 years ago
|
||
I will look into it tomorrow.
Comment 12•13 years ago
|
||
So yeah. On beta I see XPCConvert::JSData2Native calling nsMemory::Alloc (with length 80000) then the Next() call, then no call to nsMemory::Free.
On trunk, I get an nsMemory::Free from CallMethodHelper::~CallMethodHelper calling CallMethodHelper::CleanupParam. In particular, dp->DoesValNeedCleanUp() is true for param 0 on trunk, but false in beta.
Comment 13•13 years ago
|
||
On trunk, CallMethodHelper::ConvertIndependentParam does:
2649 // Flag cleanup for anything that isn't self-contained.
2650 if (!type.IsArithmetic())
2651 dp->SetValNeedsCleanup();
but on beta presumably not. In particular, on beta this happens for T_CHAR_STR and DOMSTRING and such, but not T_WCHAR_STR.
But the T_WCHAR_STR case in xpcconvert is definitely making a copy.
Comment 14•13 years ago
|
||
And that copy-making started with http://hg.mozilla.org/mozilla-central/rev/1a6a02df6029
Before that, unless useAllocator was true we just used the incoming string's chars (which was a good thing!).
Ouch, this is pretty bad. We should get this into 9 if at all possible.
Updated•13 years ago
|
Whiteboard: [MemShrink]
Updated•13 years ago
|
Updated•13 years ago
|
Comment 16•13 years ago
|
||
Tracking for FF9. We're looking for the lowest risk fix at this point - we're planning to code freeze tomorrow evening PT. Please consider low risk backouts at this point. Thanks!
Backing out the giant refactoring might be more risky than just fixing this, unfortunately :-/
Assignee | ||
Comment 18•13 years ago
|
||
Ok, so it looks like I actually fixed this explicitly in the second set of patches (the ones on trunk). See the commit message here:
http://hg.mozilla.org/mozilla-central/rev/8790687685ee
So the fix is probably just to add:
case nsXPTType::T_WCHAR_STR:
next to the T_CHAR_STR case. I'm going to reason about this for a little bit to make sure, test out a patch, and then post it for review.
Updated•13 years ago
|
Whiteboard: [MemShrink] → [MemShrink] [qa+]
Assignee | ||
Comment 19•13 years ago
|
||
Looking over the patch, I believe that there's an analogous leak in the array-of-wstring case: http://hg.mozilla.org/mozilla-central/rev/1a6a02df6029#l5.139
This, thankfully, was also fixed in the subsequent patchset, which clarified the semantics of this stuff considerably. This one fixed it:
http://hg.mozilla.org/mozilla-central/rev/d7e55d8251a6
and this one made the fix even clearer:
http://hg.mozilla.org/mozilla-central/rev/2c2730b0cbf7
Now, whether we actually use arrays-of-wstrings anywhere in gecko is anyone's guess. But it's probably wise to fix it anyway.
However, I really want to verify this stuff. I was under the impression that we were running tracemalloc on xpcshell tests on tinderbox, which we clearly aren't, because we test all of these cases in the xpcshell test framework that I wrote.
In fact, I just discovered that xpcshell doesn't support tracemalloc period, which is bad. I'm going to look into how much trouble it might be to turn it on.
Comment 20•13 years ago
|
||
> So the fix is probably just to add:
For 9.0, I agree. But can we do a followup on NOT allocating in this situation? The old code used not to, in many cases, and it's fundamentally really bad behavior when we can use the existing string data. Especially for an interface like this, where the string data passed in is expected to be large.
Failing that, we should consider nuking all use of wstring in IDL in favor of AString, which doesn't copy all the string data.
> Now, whether we actually use arrays-of-wstrings anywhere in gecko is anyone's guess.
http://mxr.mozilla.org/mozilla-central/search?string=array&find=idl&findi=&filter=wstring&hitlimit=&tree=mozilla-central
Presumably we care about the JS-calling-C++ uses (so e.g. nsIExpatSink is not a problem, and probably not prompt/promptservice). Seems like we have lots of those too, though: the debugger service, spellchecking stuff, stringbundles, nsIProcess, password manager, login manager, various PSM apis). So if there's a leak for wstring arrays, we're almost certainly hitting it.
Assignee | ||
Comment 21•13 years ago
|
||
So here's the state of affairs. I spent a while trying to make tracemalloc work for xpcshell (see bug 708831), but eventually I decided just to use Instruments (native osx developer swiss army knife that does leak analysis), which seems to be a pretty slick tool.
However, the testcase situation wasn't ideal. I could fix the one attached to this bug, obviously, but I wanted to do a more thorough analysis of all of the ~100 cases (permutations of base type, in-outness, and array-ness) for different argument types to be sure we can't leak. In the days since this beta train left the station (the original nightly station, that is), I've added a fair amount more test coverage of these corner cases. So the first thing I did was to backport the most modern testing code to mozilla-beta. This is the first patch. There's not a real reason to land it on beta, but it also makes it easier for everyone to verify that the code doesn't leak. Since it's just test code, there's not really any risk. So I'd be in favor of landing it to beta as the closest thing we'll get to test coverage for this bug. I don't have a super strong opinion on this though, so I'll let the drivers make the call.
The second patch fixes the wstring leaks: both the one that prompted this bug, and the array case mentioned in comment 19.
The third patch fixes yet another leak that I found by running Instruments on the new test coverage. It's a regression from http://hg.mozilla.org/mozilla-central/rev/04dc934f61d5#l2.50 where we leak sized-string out-params. Whether these ever come up in gecko is something I can't say, but I think it's probably a good idea to take the patch.
With those two patches, we don't appear to leak on any of the test cases. Huzzah!
Assignee | ||
Comment 22•13 years ago
|
||
Here's the backported test coverage.
Assignee | ||
Comment 23•13 years ago
|
||
Here's the wstring leak fix
Assignee | ||
Updated•13 years ago
|
Attachment #580306 -
Attachment description: mozilla-beta fix - part 1 - backport tests - v1 → mozilla-beta fix - part 1 - backport tests. v1
Assignee | ||
Comment 24•13 years ago
|
||
Assignee | ||
Comment 25•13 years ago
|
||
Comment on attachment 580307 [details] [diff] [review]
mozilla-beta fix - part 2 - fix wstring leak. v1
Flagging bz for review.
Attachment #580307 -
Attachment is patch: true
Attachment #580307 -
Flags: review?(bzbarsky)
Assignee | ||
Comment 26•13 years ago
|
||
Comment on attachment 580308 [details] [diff] [review]
mozilla-beta fix - part 3 - fix sized out-param leak. v1
Flagging bz for review here too.
This one's a bit confusing, so I'll provide a bit of context.
There are 3 different types of parameters we can be dealing with in ConvertDependentParams: Arrays, sized-strings, and iid_is interface pointers.
Interface pointers are always flagged with SetValNeedsCleanup(), right above the context barrier of this patch :-(. This appears to be the reason for the !IsInterfacePointer() check, though it's quite dumb. Regardless, we don't have to worry about them one way or the other, because flagging for cleanup is idempotent and it has already happened.
When we clean up arrays, we always deallocate the array buffer itself, so we use ValNeedsCleanup() to indicate whether the _contents_ of the array need cleanup. This is true for any pointer type, which is what the existing code is trying to do.
sized strings are pointer types, and thus always need to be cleaned up (given that, at the moment, we're always allocating). We _used_ to clandestinely flag them for cleanup before this patch:
http://hg.mozilla.org/mozilla-central/rev/04dc934f61d5#l2.50
Shared parameters don't make it into xpconnect, so the !IsShared() always resulted in true. When I removed it, I mistook the || for an && (or something). What I should have done was just removed that part of the predicate altogether. This was the regression.
So in the new world, we flag everything for cleanup, exception arrays, where we flag if and only if the datum type is a pointer.
Let me know if there's any other clarification I can provide.
Attachment #580308 -
Attachment is patch: true
Attachment #580308 -
Flags: review?(bzbarsky)
Comment 27•13 years ago
|
||
Comment on attachment 580307 [details] [diff] [review]
mozilla-beta fix - part 2 - fix wstring leak. v1
r=me
Attachment #580307 -
Flags: review?(bzbarsky) → review+
Comment 28•13 years ago
|
||
Comment on attachment 580308 [details] [diff] [review]
mozilla-beta fix - part 3 - fix sized out-param leak. v1
r=me
Attachment #580308 -
Flags: review?(bzbarsky) → review+
Assignee | ||
Comment 29•13 years ago
|
||
Pushed this to try: https://tbpl.mozilla.org/?tree=Try&rev=5f4f63faf9d1
Assignee | ||
Comment 30•13 years ago
|
||
So, this had a bit of a rocky try run, which is somewhat troubling given that this code is days away from release. At the same time though, it's hard to imagine these patches causing any of those failures. What do people think? Should I push again? Is there an easy way to get readable stacks out of those couple of minidumps?
Comment 31•13 years ago
|
||
> What do people think?
Push the base revision you were building on top of to try as well, just in case?
I see known intermittents and known permaoranges (M4 on snow leopard is hidden on beta for the same failure).
Assignee | ||
Comment 33•13 years ago
|
||
Comment on attachment 580307 [details] [diff] [review]
mozilla-beta fix - part 2 - fix wstring leak. v1
Ok, flagging for approval.
Attachment #580307 -
Flags: approval-mozilla-beta?
Assignee | ||
Updated•13 years ago
|
Attachment #580308 -
Flags: approval-mozilla-beta?
Assignee | ||
Comment 34•13 years ago
|
||
Comment on attachment 580306 [details] [diff] [review]
mozilla-beta fix - part 1 - backport tests. v1
bz r+ed these on IRC.
Attachment #580306 -
Flags: approval-mozilla-beta?
Comment 35•13 years ago
|
||
Comment on attachment 580306 [details] [diff] [review]
mozilla-beta fix - part 1 - backport tests. v1
[Triage Comment]
Approving these three patches because this contributes to a major known memory leak, and the patches are considered low risk. There will be a followup for aurora 10 in the coming days.
Attachment #580306 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Updated•13 years ago
|
Attachment #580307 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Updated•13 years ago
|
Attachment #580308 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Assignee | ||
Comment 36•13 years ago
|
||
Pushed these 3 patches to mozilla-beta:
http://hg.mozilla.org/releases/mozilla-beta/rev/4d4553e11c57
http://hg.mozilla.org/releases/mozilla-beta/rev/2dc222dac518
http://hg.mozilla.org/releases/mozilla-beta/rev/06f94caa18d4
Assuming all goes well there, here are the things left to do on this bug:
1 - Prepare the same 3 patches on mozilla-aurora
2 - Make sure that everything is good on mozilla-central.
I'll do these at the beginning of next week. I need to go get outside and take a break from all this. If there's some emergency with the beta landing please call me on my mobile (listed on phonebook).
Assignee | ||
Comment 37•13 years ago
|
||
These landed without incident on beta. I've spun up analogous patches for aurora, and verified on my local build (with valgrind, this time) that they fix the problem.
Flagging bz for rubber stamp on the 3 patches.
Assignee | ||
Comment 38•13 years ago
|
||
Flagging bz for rs.
Attachment #581361 -
Flags: review?(bzbarsky)
Assignee | ||
Comment 39•13 years ago
|
||
Flagging bz for rs.
Attachment #581362 -
Flags: review?(bzbarsky)
Assignee | ||
Comment 40•13 years ago
|
||
Flagging bz for rs.
Attachment #581363 -
Flags: review?(bzbarsky)
Assignee | ||
Comment 41•13 years ago
|
||
Pushed the aurora patches to try as well: https://tbpl.mozilla.org/?tree=Try&rev=20239c628fc5
Comment 42•13 years ago
|
||
Comment on attachment 581361 [details] [diff] [review]
mozilla-aurora fix - part 1 - backport tests. v1
r=me
Attachment #581361 -
Flags: review?(bzbarsky) → review+
Comment 43•13 years ago
|
||
Comment on attachment 581362 [details] [diff] [review]
mozilla-aurora fix - part 2 - fix wstring leak. v1
r=me
Attachment #581362 -
Flags: review?(bzbarsky) → review+
Comment 44•13 years ago
|
||
Comment on attachment 581363 [details] [diff] [review]
mozilla-aurora fix - part 3 - fix sized out-param leak. v1
r=me
Attachment #581363 -
Flags: review?(bzbarsky) → review+
Assignee | ||
Comment 45•13 years ago
|
||
Comment on attachment 581361 [details] [diff] [review]
mozilla-aurora fix - part 1 - backport tests. v1
Flagging for aurora. Hopefully this is a no-brainer since we landed on beta.
Attachment #581361 -
Flags: approval-mozilla-aurora?
Assignee | ||
Comment 46•13 years ago
|
||
Comment on attachment 581362 [details] [diff] [review]
mozilla-aurora fix - part 2 - fix wstring leak. v1
Flagging for aurora.
Attachment #581362 -
Flags: approval-mozilla-aurora?
Assignee | ||
Comment 47•13 years ago
|
||
Comment on attachment 581363 [details] [diff] [review]
mozilla-aurora fix - part 3 - fix sized out-param leak. v1
Flagging for aurora.
Attachment #581363 -
Flags: approval-mozilla-aurora?
Updated•13 years ago
|
Whiteboard: [MemShrink] [qa+] → [MemShrink:P1] [qa+]
Updated•13 years ago
|
Attachment #581361 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Updated•13 years ago
|
Attachment #581362 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 48•13 years ago
|
||
Comment on attachment 581363 [details] [diff] [review]
mozilla-aurora fix - part 3 - fix sized out-param leak. v1
[Triage Comment]
Approving for aurora, as we've already landed the analogous patches on beta.
Attachment #581363 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Assignee | ||
Comment 49•13 years ago
|
||
Assignee | ||
Updated•13 years ago
|
Assignee | ||
Comment 50•13 years ago
|
||
Everything should be good at this point, since mc is unaffected.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Comment 51•13 years ago
|
||
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:9.0) Gecko/20100101 Firefox/9.0
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:10.0a2) Gecko/20111218 Firefox/10.0a2
Verified on Firefox 9 RC and Firefox 10 Aurora using the test case provided in comment 0.
STR:
1. Start Firefox 9 RC/Aurora.
2. Install Execute JS add-on.
3. Execute the code in test case from comment 0 using JS Execute.
4. Open about:memory in new tab and check values.
Firefox does not consume additional memory anymore after executing the code.
Setting to verified as Firefox Nightly is unaffected.
Status: RESOLVED → VERIFIED
Keywords: verified-aurora,
verified-beta
Whiteboard: [MemShrink:P1] [qa+] → [MemShrink:P1] [qa!], [qa!:9], [qa!:10]
You need to log in
before you can comment on or make changes to this bug.
Description
•