Closed Bug 633382 Opened 14 years ago Closed 14 years ago

Xray wrappers don't cache resolved native properties on the holder object

Categories

(Core :: DOM: Core & HTML, defect)

x86
macOS
defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla2.0b12
Tracking Status
blocking2.0 --- final+

People

(Reporter: whimboo, Assigned: gal)

References

()

Details

(Keywords: memory-leak, regression, Whiteboard: [hardblocker][has patch], fixed-in-tracemonkey)

Attachments

(3 files, 1 obsolete file)

Attached file test profile (deleted) —
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:2.0b12pre) Gecko/20110209 Firefox/4.0b12pre ID:20110209030359

After copying the XPI files of  Personas Interactive and PageTweak inside extensions folder of a profile and starting Minefield, we will grab all available memory. In some cases it was even hard to prevent a system freeze.

Steps:
1. Extract the example profile
2. Start Minefield with that profile

Wait some seconds (about 5) and watch how the memory usage goes up.
That's a regression for Firefox 4. I will check for the range.
Regressed recently between the builds 11020803 and 11020903.

PASS: http://hg.mozilla.org/mozilla-central/rev/3470891975c7
FAIL: http://hg.mozilla.org/mozilla-central/rev/fd0817e454fe

Pushlog:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=3470891975c7&tochange=fd0817e454fe

Not sure which of those checkins could be related. Olli, could it somehow depend on bug 630947? I could do a hg bisect tomorrow.
blocking2.0: --- → ?
over to extension compatibility for now, this doesn't look likely to be add-ons manager stuff. Probably will need bisecting to narrow down the field though.
Component: Add-ons Manager → Extension Compatibility
Product: Toolkit → Firefox
QA Contact: add-ons.manager → extension.compatibility
I don't think that it's an extension issue, because it has been regressed in-between the above mentioned builds. Moving to Core:General for now, until I have more details.
Component: Extension Compatibility → General
Product: Firefox → Core
QA Contact: extension.compatibility → general
Thankfully we have still have hourly builds on Feb 8th, which gave me the chance to nail down this problem really quickly. The regression started between the builds 20110208161707 and 20110208162137:

http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=39a631a71d18&tochange=4c0ca7d630b9

Expose console and InstallTrigger as content objects (bug 631225, r=dtownsend). a=blocker
author	Andreas Gal <gal@mozilla.com>
	Mon Feb 07 23:18:18 2011 -0800 (at Mon Feb 07 23:18:18 2011 -0800)
changeset 62193	4c0ca7d630b9
parent 62192	39a631a71d18
child 62194	10677c174653
pushlog:	4c0ca7d630b9
Blocks: 631225
Whiteboard: [mlk]
Ok, while minimizing the testcase I noticed that this problem only occurs when PageTweak is installed and the first-run page of Personas Interactive is open.

Updated steps:
1. Install PageTweak (https://addons.mozilla.org/de/firefox/addon/pagetweak-ad-blocker-video-dow/)
2. Open http://pages.brandthunder.com/btpersonas/firstrun?overlay

The memory will grow quite fast immediately.
Summary: Huge memory leak when migrating Personas Interactive and PageTweak → Huge memory leak when having PageTweak installed and opening the given URL
Disabling AutoPagerize in the preferences of PageTweak stops this problem.
For now I would blame the ConsoleAPI.
Component: General → DOM
QA Contact: general → general
I think the bisected regressor is a red herring. Can someone try the patch in bug 630072? I fixed a leak there that might fix this.
Assignee: nobody → gal
I don't think the patch in bug 630072 applies cleanly to trunk.
At least it didn't earlier today.
Andreas, I did another bisect this time with hg and I see the exact same changeset as the regressor:

The first bad revision is:
changeset:   62193:4c0ca7d630b9
user:        Andreas Gal <gal@mozilla.com>
date:        Mon Feb 07 23:18:18 2011 -0800
summary:     Expose console and InstallTrigger as content objects (bug 631225, r=dtownsend). a=blocker

Also while waiting for a while I see the following slow script warning:

file:///Volumes/data/build/obj/minefield/dist/Minefield.app/Contents/MacOS/components/ConsoleAPI.js:102
And that's exactly the newly introduced evalInSandbox call:
http://mxr.mozilla.org/mozilla-central/source/dom/base/ConsoleAPI.js?force=1#87
Henrik, the object that code added isn't wrapped right. That doesn't make that code the regressor. I will refresh the patch in 630072 if needed. The patch is against TM, not central.
Depends on: 630072
Depends on: 633672
Depends on: 633738
Depends on: 633879
blocking2.0: ? → final+
Whiteboard: [mlk] → [mlk][may be dup of 630072][hardblocker]
jst: can you test with the patch combination that fixed quora for you?
There's still issues here, even with the quora patches applied, but this isn't looking like a leak, there's some runaway JS that keeps allocating like crazy here. After a starting Firefox and waiting ~5s the browser locks up and a while later the slow script dialog pops up. When randomly breaking in the debugger while it appears locked up, I got this from DumpJSStack():

0 XPCOMUtils_QueryInterface(iid = {6368a821-d3e2-4cbd-9699-5a8ba569e2f3}) ["resource://gre/modules/XPCOMUtils.jsm":257]
    this = [object Object]
1 <TOP LEVEL> ["<unknown>":0]
    <failed to get 'this' value>
2 log(message = Error: Permission denied for <http://platform0.twitter.com> to get property Location.href) ["chrome://pagetweak/content/gmcompiler.js":1050]
3 anonymous(list = [object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],
[many many more pages of the same]
t],[object Object]) ["chrome://pagetweak/content/gmcompiler.js":713]
    i = 325
4 anonymous(url = "http://wedata.net/databases/AutoPagerize/items.json", res = [object Object]) ["chrome://pagetweak/content/gmcompiler.js":762]
    r_keys = url,nextLink,insertBefore,pageElement
    info = [object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],
[many many more pages of the same]
ct Object]
5 anonymous(res = [object Object]) ["chrome://pagetweak/content/gmcompiler.js":921]
    this = [object Object]
6 <TOP LEVEL> ["<unknown>":0]
    <failed to get 'this' value>
7 anonymous(1007) ["chrome://pagetweak/content/xmlhttprequester.js":86]
    this = [object XrayWrapper [object Window @ 0x7fffc9275390 (native @ 0x7fffc8886878)]]
8 <TOP LEVEL> ["<unknown>":0]
    <failed to get 'this' value>


and once more (different run):
0 CA_init(aWindow = [object XrayWrapper [object Window @ 0x7fffc8b5c9e0 (native @ 0x7fffc98de478)]]) ["file:///home/jst/fast/work/tip/fb-dbg/dist/bin/components/ConsoleAPI.js":102]
    contentObject = undefined
    sandbox = [object Sandbox]
    chromeObject = [object Object]
    self = [object Object]
    id = 17
    this = [object Object]
1 <TOP LEVEL> ["<unknown>":0]
    <failed to get 'this' value>
2 log(message = Error: Permission denied for <http://platform0.twitter.com> to get property Location.href) ["chrome://pagetweak/content/gmcompiler.js":1050]
3 anonymous(list = [object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],...
Is the console API caching expensively here?  I don't know what happened with those patches in the end.
Whiteboard: [mlk][may be dup of 630072][hardblocker] → [mlk][may be dup of 630072][softblocker]
the storage bugs haven't landed yet, so they shouldn't be affecting this.

I'd have to look at what PageTweak does to see if it's hitting the console or not. It looks like based on c#11 that the fix for exposedProps is the culprit here.

cc'ing andreas.
I have patches in flight to fix some leaks. I land the last one in an hour and then I will try to reproduce this. Thanks for all the help with STRs so far.
This is not a dup but we have a diagnosis. Bug in Xray wrappers. Details coming.
Whiteboard: [mlk][may be dup of 630072][softblocker] → [mlk][hardblocker]
Summary: Huge memory leak when having PageTweak installed and opening the given URL → Xray wrappers don't cache resolved native properties on the holder object
There are three bugs here:

When resolving a native property on an xray wrapper, we forgot to look on the holder first whether we already resolved this property. This is usually slow, but not fatal. In case of window.wrappedJSObject.console we run CA_init which makes a sandbox every time, which makes a new compartment every time. Making a compartment uses a lot of memory (mostly for the prop tree, tracer, JM state etc) that we don't account towards gcMallocBytes, so we end up using a ton of memory. 

We need 3 fixes. Each of these fixes alone makes the symptoms go away.

1) This bug: check the holder object before re-resolving.

2) Use contentWindow.Function.bind() instead of the sandbox. Content might intercept that but can't do anything malicious with it anyway. We don't care. This avoids making a sandbox every time.

3) When making a compartment, we have to update gcMallocBytes to create visible memory pressure for the GC.
Attached patch patch (obsolete) (deleted) — Splinter Review
Attachment #512367 - Flags: review?(mrbkap)
Whiteboard: [mlk][hardblocker] → [mlk][hardblocker][has patch]
Filed 3) as bug 634155.
Filed 2) as bug 634156.
Neither 2) nor 3) are blockers.
Comment on attachment 512367 [details] [diff] [review]
patch

Nix the braces around the single-line if statement and r=mrbkap.
Attachment #512367 - Flags: review?(mrbkap) → review+
http://hg.mozilla.org/tracemonkey/rev/5b7eab632ba6
Whiteboard: [mlk][hardblocker][has patch] → [mlk][hardblocker][has patch], fixed-in-tracemonkey
Backed out in http://hg.mozilla.org/tracemonkey/rev/c323382f09cd - broke the reftest harness (so all of C,R,J), and 300-odd passwordmgr tests in mochitest-5. Hasn't gotten around to finishing any mochitest-other runs yet, but I wouldn't expect that to go well if the passwordmgr ones were so unhappy.
Whiteboard: [mlk][hardblocker][has patch], fixed-in-tracemonkey → [mlk][hardblocker][has patch]
26 mochitest-chrome and 83 mochitest-browser-chrome, unsurprising things like webconsole and focus and plaintext links that mess about in content.
Thanks for having my back philor.
cdleary-bot mozilla-central merge info:
http://hg.mozilla.org/mozilla-central/rev/5b7eab632ba6
http://hg.mozilla.org/mozilla-central/rev/c323382f09cd (backout)
Note: not marking as fixed because last changeset is a backout.
Attached patch patch (deleted) — Splinter Review
Attachment #512367 - Attachment is obsolete: true
Attachment #512624 - Flags: review?(mrbkap)
Attachment #512624 - Flags: review?(mrbkap) → review+
Attached patch Additional fix from mrbkap. (deleted) — Splinter Review
This is an additional fix from mrbkap that fixes xslt related test failures seen with the above patch. This also fixes other oranges, specifically ones that triggers us to re-use inner window objects, which leaves wrappers laying around referring to the old document, and potentially other old values as well.
Attachment #512690 - Flags: review+
http://hg.mozilla.org/tracemonkey/rev/951f34044122
Keywords: hang
Whiteboard: [mlk][hardblocker][has patch] → [mlk][hardblocker][has patch], fixed-in-tracemonkey
Landed on mozilla-central:

http://hg.mozilla.org/mozilla-central/rev/f9e8182eb125
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Comment on attachment 512690 [details] [diff] [review]
Additional fix from mrbkap.

> JS_TransplantObject(JSContext *cx, JSObject *origobj, JSObject *target)
> {
>+     // This function is called when an object moves between two
>+     // different compartments. In that case, we need to "move" the
>+     // window from origobj's compartment to target's compartment.
..12345

Fibonacci indentation! 4, please, rs=me on followup.

Is C++ one-line style better than ye old

/*
 * Big words
 * here.
 */

style? Dunno, sometimes the extra /* and */ on their own line help, sometimes // looks like chicken-scratch.

/be
Andreas, I have tested the most recent tinderbox build and it looks fine now. But one more question. The memory usage is quite different when loading the given website and the pageteak add-on enabled or disabled.

          enabled     disabled
Memory     350MB       150MB

With the PageTweak add-on enabled the cpu load is also higher and it takes much longer until it drops to 0%. Is that only the cause of the extension or should we get better in performance?
The extension does a ton of logging. That might eat CPU and memory. We shoiuld look into it though. Paging shaver and his about:memory swiss army knive.
Thanks Andreas! I will file a follow-up bug for that. But for now it looks great with this patch in place! Thanks for this quick turnaround. Marking as verified fixed with Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:2.0b12pre) Gecko/20110215 Firefox/4.0b12pre
Status: RESOLVED → VERIFIED
Target Milestone: --- → mozilla2.0b12
Blocks: 634673
I was able to replicate this with an ad-hoc endurance test, and can also confirm that the huge spike in memory usage is fixed in today's build.

Firefox 4.0b12pre (2.0b12pre, en-US, 20110214030347)
Add-ons: PageTweak (1.0.5), Personas Interactive (1.0.6)
http://mozmill.blargon7.com/#/endurance/report/aa40f05493d5cee5fc40cc7c1501c206

Firefox 4.0b12pre (2.0b12pre, en-US, 20110217030357)
Add-ons: PageTweak (1.0.5), Personas Interactive (1.0.6)
http://mozmill.blargon7.com/#/endurance/report/aa40f05493d5cee5fc40cc7c1501cb2f
Status: VERIFIED → RESOLVED
Closed: 14 years ago14 years ago
(In reply to comment #41)
> cdleary-bot mozilla-central merge info:
> http://hg.mozilla.org/mozilla-central/rev/951f34044122

Chris, can you please explain that comment? What's different to the patch which has already been landed on the 15th? I'm not sure why the verified status has been removed.
Status: RESOLVED → VERIFIED
Keywords: mlk
Whiteboard: [mlk][hardblocker][has patch], fixed-in-tracemonkey → [hardblocker][has patch], fixed-in-tracemonkey
No longer blocks: mlk2.0
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: