Closed Bug 630072 Opened 14 years ago Closed 14 years ago

quora.com bloats out of control, part 1: js_UnbrandAndClearSlots leaks

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla2.0b11
Tracking Status
blocking2.0 --- final+

People

(Reporter: sayrer, Assigned: gal)

References

Details

(Keywords: memory-leak, regression, Whiteboard: [hardblocker], fixed-in-tracemonkey [has patch]mozmill-test-needed][mozmill-endurance])

Attachments

(7 files, 3 obsolete files)

Numbers for linux32 bit running w/ Flash: Memory according to Ubuntu about:home - 27M quora.com - 52M after 200 quora DOMWindows - 430-460M The initial load of Quora isn't bad. It's actually quite heavy in all browsers. But after clicking around for 200 DOMWindows in one tab in a debug build, our heap has grown out of control. Reading my debug logs, I don't see DocShell bloat, but I do see DOMWindow bloat. I'll attach my shutdown log in a debug build. Notice the build shuts down cleanly (no leaks), but DOMWindows with very old serial numbers are still alive at shutdown. The last 10 DOMWindows destroyed are a grab bag: --DOMWINDOW == 10 (0xad2cacc) [serial = 44] [outer = (nil)] [url = http://www.quora.com/Is-Mexico-a-part-of-North-America-or-Central-America] --DOMWINDOW == 9 (0xbac54c4) [serial = 89] [outer = (nil)] [url = http://www.quora.com/Urban-Dictionary] --DOMWINDOW == 8 (0x9a671834) [serial = 62] [outer = (nil)] [url = http://www.quora.com/Why-might-someone-think-that-being-called-a-gentleman-would-be-bad] --DOMWINDOW == 7 (0x987fdc6c) [serial = 123] [outer = (nil)] [url = http://www.quora.com/Android-Market/Are-there-any-tools-that-help-Android-developers-monitor-sales] --DOMWINDOW == 6 (0xc139a54) [serial = 100] [outer = (nil)] [url = http://www.quora.com/If-I-owned-a-domain-name-can-I-use-Posterous-iPhone-app-to-blog-there] --DOMWINDOW == 5 (0xb3d40c4) [serial = 78] [outer = (nil)] [url = http://www.quora.com/Why-is-calamansi-not-in-the-Merriam-Webster-dictionary] --DOMWINDOW == 4 (0x94c7485c) [serial = 133] [outer = (nil)] [url = http://www.quora.com/What-are-the-best-social-media-monitoring-product-websites] WARNING: NS_ENSURE_TRUE(!(mAsyncExecutionThread)) failed: file /home/sayrer/dev/mozilla-central/storage/src/mozStorageConnection.cpp, line 674 --DOMWINDOW == 3 (0x9a31ecc) [serial = 7] [outer = (nil)] [url = http://www.quora.com/Business-Development] --DOMWINDOW == 2 (0x9ecf0a74) [serial = 112] [outer = (nil)] [url = http://www.quora.com/Where-do-I-find-investors-for-an-Internet-startup-idea] --DOMWINDOW == 1 (0xa4af277c) [serial = 200] [outer = (nil)] [url = http://www.quora.com/Business-Development] --DOMWINDOW == 0 (0xe3781e4) [serial = 192] [outer = (nil)] [url = http://www.quora.com/Business-Development]
Attached file shutdown log (deleted) —
blocking2.0: --- → ?
Just tried a similar exercise on Facebook.com and it went much better.
I wonder if we in this case keep the DOM Windows in bfcache and not evict them before 20-30 mins http://mxr.mozilla.org/mozilla-central/source/docshell/shistory/src/nsSHEntry.cpp#61 Robert, what are the exact steps to reproduce? I've never used (or even heard of) quora.com.
I just logged into quora.com and clicked around using the suggested questions and tags.
OS: Linux → Windows CE
OS: Windows CE → Linux
I guess I need some more exact steps to reproduce, since when I have only quora.com open and I click the questions and tags, DOMWindow count stays between 16 and 22. And memory usage goes up perhaps 40 megs after some page loads. This is on 32 bit linux, debug build, with Flash.
I was able to prevent this bug by setting browser.sessionhistory.max_entries to 0. Does that mean it is bfcache related? I do get a lot of this warning with browser.sessionhistory.max_entries to 50 (the default). WARNING: NS_ENSURE_TRUE(txToRemove) failed: file /home/sayrer/dev/mozilla-central/docshell/shistory/src/nsSHistory.cpp, line 1251
I should probably just hide that warning, since it can happen in valid cases too. It is just that NS_ENSURE_TRUE is quite handy. But anyway, sounds like we're keeping DOMWindows in bfcache. Would be great to be able to reproduce.
sayrer, is the problem there if browser.sessionhistory.max_entries is left at the default 50 but you change browser.sessionhistory.max_total_viewers to 0? What about setting max_total_viewers to 1?
(In reply to comment #8) > sayrer, is the problem there if browser.sessionhistory.max_entries is left at > the default 50 but you change browser.sessionhistory.max_total_viewers to 0? Still happens. > What about setting max_total_viewers to 1? Still happens.
That doesn't obviously sound like bfcache to me, then...
Attached image Overall growth (massif visualiser pic) (deleted) —
I can reproduce this, running on Massif (x64-linux). Just one tab. Don't even need to click through 200 questions, more like 20 ish. Attached is a png showing heap use going up and up. Two more pngs to follow.
At least at first glance, there appears to be twice as much of pretty much everything, compared to the midpoint snapshot.
Well, that happens if there are many DOMWindows alive, especially if they are in bfcache when also presshell etc. are kept alive. I wish I could reproduce this. I'll retry tomorrow. Also, anyone who can produce this, does waiting 30mins (yes minutes), release some of the memory?
(In reply to comment #14) > Also, anyone who can produce this, does waiting 30mins (yes minutes), > release some of the memory? Is there a constant somewhere that can be changed to make the wait shorter?
Attached image Results if left idle for 30 mins (deleted) —
(In reply to comment #14) > Also, anyone who can produce this, does waiting 30mins (yes minutes), > release some of the memory? At least to a first approximation, no. The attached picture is a bit hard to make sense of, because the horizontal axis is in instructions, not wallclock time. What I did was: 1. start up, ramp up mem usage as discussed above This takes us out to about 33 billion insns (3.3e+10 on x axis) 2. Left it idle for 45 mins 3. Scrolled up and down a few times, so as to burn up a couple billion instructions, in order to give an identifiable post-idle period on the graph. I believe (3) produces the small plateau from about 3.4e10 to 3.6e10 (x axis) at around 1.15e+08 (y axis). So the memory use has fallen somewhat, but not by any means back to the starting level of around 2e+07. I can provide the entire heap profile file for further examination if that's helpful, either in raw form (massif.out.pid, 33MB) or as text after passing through ms_print.
(In reply to comment #17) > Created attachment 508601 [details] > Results if left idle for 30 mins > > At least to a first approximation, no. The attached picture is a bit > hard to make sense of, because the horizontal axis is in instructions, > not wallclock time. You can use Massif's --time-unit=ms option to get around this problem.
smaug, I wonder whether we should add a bunch of NSPR logging so we can get some visibility into what session history is doing...
peterv, could you look into this with your patches from bug 628599 applied? Seems this is potentially bad, but if it's not, we can certainly unblock on here.
Assignee: nobody → peterv
blocking2.0: ? → final+
Whiteboard: [hardblocker]
Not sure if this will just be noise: I sometimes see my FF4 session grow to gigabytes for only a 2-4 tabs. On these occasions it seems to grow and shrink when I shut down, cycling the number of times that there were tabs open. Once while shutting down, this grow-and-shrink cycle caused it to crash (it was hitting close to 4GB). Unfortunately, I can't get anything from the crash report: https://crash-stats.mozilla.com/report/pending/d7f6e256-28cc-4a5a-9f34-e89a62110126
re-tested on 64bit linux using a build which has the patches for Bug 630947 and bug 614347. DOMWindow count stays between 16 and 24.
(In reply to comment #22) > re-tested on 64bit linux using a build which has the patches for Bug 630947 and > bug 614347. DOMWindow count stays between 16 and 24. I'm having a hard time reproducing this even without those patches, DOMWindow count also stays around 16-24 and on shutdown all windows that are being released were still open.
blocking2.0: final+ → ?
Target Milestone: --- → mozilla2.0b11
blocking2.0: ? → final+
(In reply to comment #23) > > I'm having a hard time reproducing this even without those patches, The best way to reproduce this is to click on a question, and then click on related questions. It's the question page DOMWindows that stay around. The others (like buttons, etc) are collected.
I looked at the behavior for comment 24 with my shiny new GC heuristics which is much more aggressive about GCing and CCing iteratively in this case. This is not a GC scheduling problem, and I think this is not a CC problem in general. We keep GCing and we keep CCing frequently. What I did notice is that GC times are going up consistently, and with that CC times are going up massively too (even though we suspect always the same low number of objects). I think this means that we are leaking _GC objects_. The GC heap is growing and grow, or to be more precise the gray portion of it (XPConnect roots).
I still wonder why I, nor peterv, can't reproduce the problem. Does quora.com give different content to Europe or something?
When I've seen FF4 grow to almost 4GB, I noticed that over time it increasingly would "pause" for longer and longer time periods. Clicking in the window would first grey out the window and clicking again would give a "Window is not responsive" message from Windows. I would sometimes see it loose a few 100MB and gain it back and thensome on a single page refresh. Sadly, the situation would only repeat in seemingly non-deterministic ways.
(In reply to comment #22) > re-tested on 64bit linux using a build which has the patches for Bug 630947 and > bug 614347. DOMWindow count stays between 16 and 24. smaug, can you reproduce the problem without those patches, per comment 24? If so, it would seem we have our fix. I repro'd that behavior easily with a random week-old TM build. I can try with the patches later to see if that helps.
I re-tested this today with trunk build and couldn't reproduce. I was following the STR from comment 24. Note, the patch for Bug 630947 landed yesterday.
smaug, I am pretty sure this is a true leak and not a GC scheduling problem. I see the trash persisting across global GCs and CCs (GCs are getting slower and slower every time).
This could very well be a true leak. I just can't reproduce it, which is why I started to think if quora.com shows different content for Europeans.
With GC/CC logging I get this pattern when clicking through quora pages: GC 106ms GC 117ms GC 117ms GC 120ms GC 124ms GC 130ms GC 134ms GC 127ms GC 145ms GC 137ms --close window-- CC 229,556 objects (!!) 246ms CC 36,803 objects GC 72ms CC 6 objects GC 25ms CC 6 objects GC 24ms
This is definitely a true leak. With my patch we GC and CC a ton here, but the CC doesn't destroy much until we close the tab, at which point some 200k objects are destroyed. GCs and CCs are getting longer and longer along the way, GC up to 250ms and CC up to 500ms. This is definitely a true leak.
(Note: comment 32 removed uninteresting CCs during the initial GCs which didn't collect anything or much)
So by "leak" we mean a cycle that's apparently manually broken somewhere at page teardown, right?
We could try disabling the things we manually cycle-break on page teardown to get some idea of which objects participate in the cycle, I guess...
This is not DOM windows sticking around. I see DOM global objects being created and then destroyed when I click on the next question (++DOM --DOM in debug builds). Overall count stays around 15-25. The leak is elsewhere.
Once quora starts bloating navigating the page to cnn.com does _NOT_ release the garbage (multiple GCs/CCs occur in between). Setting the bf cache to 0 and navigating does also _NOT_ release the garbage. Closing the tab immediately releases the garbage.
This page seems to keep consuming more and more memory over time (Chrome grows a little, then stays put). Not sure if it is helpful, but it is representative of stuff I work on, and might at least be a data point. But also be completely unrelated (sorry if so): http://dev.sencha.com/deploy/ChartsDemo/examples/chart/Live%20Animated.html
Ok, we do end up leaking the main inner window after all: When I close the tab and we finally break the cycle the DOM windows unwind: --DOMWINDOW == 30 (0x7fffc6419078) [serial = 236] [outer = (nil)] [url = http://www.quora.com/From-a-founders-perspective-how-should-convertible-note-terms-be-structured-if-the-startup-does-not-expect-any-subsequent-fundraising-rounds] --DOMWINDOW == 29 (0x7fffc61b2078) [serial = 239] [outer = (nil)] [url = http://www.quora.com/How-much-equity-as-a-percentage-of-the-overall-number-of-shares-should-a-CXO-role-expect-from-a-company-that-is-just-finished-raising-its-A-round] --DOMWINDOW == 28 (0x7fffc9fafc78) [serial = 216] [outer = (nil)] [url = http://www.quora.com/What-are-standard-caps-for-Series-A-Valuations-within-early-stage-convertible-notes] --DOMWINDOW == 27 (0x7fffc9faf878) [serial = 212] [outer = (nil)] [url = http://www.quora.com/What-are-some-examples-of-top-VCs-having-done-convertible-notes-with-early-stage-companies] --DOMWINDOW == 26 (0x7fffc8570078) [serial = 208] [outer = (nil)] [url = http://www.quora.com/What-is-the-real-difference-between-raising-a-small-equity-seed-round-sometimes-called-a-series-seed-with-no-board-seats-and-raising-convertible-debt-with-a-cap] --DOMWINDOW == 25 (0x7fffcae58878) [serial = 146] [outer = (nil)] [url = http://www.quora.com/How-can-the-same-person-hold-more-than-1-board-seat-at-the-same-company] --DOMWINDOW == 24 (0x7fffc8085078) [serial = 145] [outer = (nil)] [url = http://www.quora.com/How-did-Mark-Zuckerberg-retain-control-of-Facebook-and-what-exactly-did-Sean-Parker-do-to-accomplish-this] --DOMWINDOW == 23 (0x7fffd0053878) [serial = 142] [outer = (nil)] [url = http://www.quora.com/What-did-Sequoia-do-to-****-off-Sean-Parker] --DOMWINDOW == 22 (0x7fffc8186478) [serial = 139] [outer = (nil)] [url = http://www.quora.com/What-will-Sean-Parkers-next-move-be-after-Founders-Fund] --DOMWINDOW == 21 (0x7fffc76fa478) [serial = 88] [outer = (nil)] [url = http://www.quora.com/How-much-equity-should-I-expect-when-joining-a-company-of-two-people-as-an-employee-when-the-company-is-roughly-one-year-old] --DOMWINDOW == 20 (0x7fffc4ebf878) [serial = 243] [outer = (nil)] [url = http://www.quora.com/How-much-equity-should-I-expect-when-joining-a-company-of-two-people-as-an-employee-when-the-company-is-roughly-one-year-old] --DOMWINDOW == 19 (0x7fffc649a878) [serial = 232] [outer = (nil)] [url = http://www.quora.com/If-a-startups-valuation-is-x-how-much-larger-should-its-cap-on-a-convertible-note-be] --DOMWINDOW == 18 (0x7fffcc727478) [serial = 228] [outer = (nil)] [url = http://www.quora.com/What-are-standard-caps-for-Series-A-Valuations-within-early-stage-convertible-notes] --DOMWINDOW == 17 (0x7fffcb18ac78) [serial = 224] [outer = (nil)] [url = http://www.quora.com/If-a-startups-valuation-is-x-how-much-larger-should-its-cap-on-a-convertible-note-be] --DOMWINDOW == 16 (0x7fffca859878) [serial = 220] [outer = (nil)] [url = http://www.quora.com/Does-a-cap-on-a-convertible-note-put-signaling-pressure-on-a-series-A-valuation] --DOMWINDOW == 15 (0x7fffc764a878) [serial = 204] [outer = (nil)] [url = http://www.quora.com/When-does-a-company-start-to-give-up-board-seats-Angel-round-Series-A] --DOMWINDOW == 14 (0x7fffc7867878) [serial = 135] [outer = (nil)] [url = http://www.quora.com/Sean-Parker-1/What-does-Sean-Parker-think-the-next-big-revolution-in-Web-X-0-will-be]
I linked my quora account to facebook and a bunch of facebook iframes are loaded in the background every time I navigate to a new quora question. Maybe this only happens with the facebook link?
jst was able to reproduce this without facebook link.
For me we reliably create 4 new nsHTMLDocuments on every page navigation, and we never clean them up until the window is closed.
It's possible the fix for bug 611653 is somehow involved, since it changes the way we clean up global windows. That fix landed to TM on Feb 1, and this bug seems to go back before then, so probably not. But it's at least possible it changed the symptoms.
I just tested a build from late 2008, and I'm seeing this same problem there as well. I'm not sure it's worth looking any further back... This is an *old* "leak".
Oh, and in that old build from 2008, we actually leak windows, whereas now we don't, we just hold on to them for way too long.
This is really a horrible leak and it makes CC and GC unbearably slow because so much slow XPConnect marking is involved. We clearly shouldn't block on this, but lets call it .x and we take an approved patch? peterv, do you have cycles to look into this?
blocking2.0: final+ → .x
Whiteboard: [hardblocker]
I must admit that I wasn't expecting to see the phrases "really a horrible leak" and "clearly shouldn't block on this" so close together.
I am seeing documents stay around longer than they should be. Looks like InstallTrigger at least can create cycles involving multiple windows: --[XPCWrappedNativeScope::mGlobalJSObject]-> 0x23633140 [JS Object (Window) (global=23633140)] --[InstallTrigger getter]-> 0x257305b0 [JS Object (Proxy) (global=23633140)] --[wrappedObject]-> 0x1d5d9ea8 [JS Object (Function) (global=1f25e780)] --[upvars[0]]-> 0x1f264708 [JS Object (Proxy) (global=1f2d8e10)] --[proto]-> 0x1f264990 [JS Object (Proxy) (global=16aa7460)] --[proto]-> 0x1f2648b8 [JS Object (Proxy) (global=16aa7460)] --[wrappedObject]-> 0x20da80f0 [JS Object (Object) (global=20da80a0)] --[parent]-> 0x20da80a0 [JS Object (Window) (global=20da80a0)] I'll see if I can figure out more about that. (In reply to comment #49) > I must admit that I wasn't expecting to see the phrases "really a horrible > leak" and "clearly shouldn't block on this" so close together. Personally, given that this leak has been around for so long, I'm not sure why we'd block this release on it. We should resolve it as quickly as possible, but how is blocking the release going to help with that?
(In reply to comment #50) > I am seeing documents stay around longer than they should be. Looks like > InstallTrigger at least can create cycles involving multiple windows: > > --[XPCWrappedNativeScope::mGlobalJSObject]-> 0x23633140 [JS Object > (Window) (global=23633140)] > --[InstallTrigger getter]-> 0x257305b0 [JS Object (Proxy) > (global=23633140)] > --[wrappedObject]-> 0x1d5d9ea8 [JS Object (Function) (global=1f25e780)] > --[upvars[0]]-> 0x1f264708 [JS Object (Proxy) (global=1f2d8e10)] > --[proto]-> 0x1f264990 [JS Object (Proxy) (global=16aa7460)] > --[proto]-> 0x1f2648b8 [JS Object (Proxy) (global=16aa7460)] > --[wrappedObject]-> 0x20da80f0 [JS Object (Object) (global=20da80a0)] > --[parent]-> 0x20da80a0 [JS Object (Window) (global=20da80a0)] > InstallTrigger was rewritten for FF4, Bug 550936.
Yes, I agree with peter here. I want to kill this leak asap, but would you keep FF4 out of the hands of hundreds of millions of users just because of this leak? Yes, there were InstallTrigger changes and yes, jst and I saw also InstallTrigger being still around.
09:54:18 < peterv> ok, the InstallTrigger is weird 09:54:41 < peterv> it almost looks like we compile the getter in the old window's scope and install in the new window or something 09:54:50 < peterv> and so new window keeps old alive 09:54:59 < peterv> but proxies are involved 09:55:17 < peterv> I'll be back probably in two hours 09:55:25 < peterv> ok 09:55:43 < peterv> InstallTrigger is a getter
0x23633140 is the old window peter says he has also seen this with the proto chain, not just InstallTrigger, but lets fix this stuff step by step
Ok ok! Sorry, I missed the fact that the leak is pre-existing; it's hard keeping track of all these different leak bugs.
Ok, now I remember where I saw InstallTrigger. It was still set on the window after the window was clear scoped, probably because its a ... getter! And the getter is parented to the window. So the getter is cached in the shape. But how does this make it onto the new window? Anyway, this is a good working theory so far. I will reproduce and poke around.
I wonder if this is related to bug 611242 comment 3. That bug is fixed, but I don't think we ever figured out where the per-tabness came from.
So I was in the defineGetter for InstallTrigger, and there's definitely something weird. The getter is a proxy parented to the new inner window (http://www.quora.com/Are-there-any-Chrome-extensions-like-FlashGot-for-FireFox), but checking proxy->proto->proto->parent we find the old inner window (http://www.quora.com/Which-extension-for-Chrome-would-be-the-best-replacement-for-SEOBooks-SEO-for-Firefox-plugin).
I might have an idea whats going on here. The wrapper map still has a wrapper for the proto, but its the wrong global and we pick it up. Have to find blake.
Alright, we have a couple problems here. a) When we pick up a wrapper from the wrapper map, we re-parent it to the current global object. We only do this for the wrapper itself, not its proto chain of wrappers. I can fix this. b) The global object we calculate might not be accurate for fast natives (which is almost everything), so I need Waldo's patch here. c) When we clear an object, dmandelin's new clear code doesn't clear method shapes so we are entrailing the getter which is parented to the wrong window potentially. Working on a fix for this. There might be more problems still.
Attached patch patch (obsolete) (deleted) — Splinter Review
Assignee: peterv → gal
blocking2.0: .x → final+
Keywords: regression
Some part of it is a regression. The attached patch fixes a bunch of leaks, but not all. Need peterv to find the rest. What's remaining is likely the original leak, so that might not be a regression any more.
As Andreas mentions we're still leaking with that patch. Right now it looks like the editor is holding on to the document, I'm still tracking down who holds on to the editor.
blocking2.0: final+ → ---
blocking2.0 was removed accidentally.
blocking2.0: --- → ?
blocking2.0: ? → final+
Blocks: 633382
Still applies cleanly to TM trunk.
Andreas, I'm still seeing the InstallTrigger getter hold an old window alive with a slightly different path: --[InstallTrigger getter]-> 0x7fffcd03bcc0 [JS Object (Proxy) (global=7fffcd066a20)] --[wrappedObject]-> 0x7fffd21ea4d0 [JS Object (Function) (global=7fffdeb35f30)] --[upvars[0]]-> 0x7fffdebba0d0 [JS Object (Proxy) (global=7fffdaa6c0d8)] --[proto]-> 0x7fffdaa93340 [JS Object (Proxy) (global=7fffdaa6c0d8)] --[wrappedObject]-> 0x7fffd475d108 [JS Object (XPC_WN_ModsAllowed_NoCall_Proto_JSClass - Window) (global=7fffd475b120)] --[parent]-> 0x7fffd475b120 [JS Object (Window) (global=7fffd475b120)] --[xpc_GetJSPrivate(obj)]-> 0x7fffe1cb3ac0 [XPCWrappedNative (Window)] --[mIdentity]-> 0x7fffd46a5078 [nsGlobalWindow]
Yeah, this makes sense. We re-parent the wrapper and its proto chain to a more recent window but if we don't do that for the getter we still leak. I have to try to reproduce this.
The steps I'm using are: * load www.quora.com * search for 'fi', Firefox will be first in the dropdown list, select it * select a question * select a question from the sidebar on the right * select a question from the sidebar on the right At this point it looks like the www.quora.com window is alive, held by the chain I've posted in comment 66. So that is probably the first window loaded in that compartment. That's with attachment 511565 [details] [diff] [review] applied.
If we just keep the original window alive, I would be willing to downgrade to soft blocker. Yesterday we were keeping _all_ windows alive, bloating like nuts. I will check and then get my patch reviewed and landed.
Attached patch Updated patch (obsolete) (deleted) — Splinter Review
This is Andreas' patch, but removes some leftover js_UnbrandAndClearSlots cruft in jsobj.h and jsscope.h. Applies cleanly to current mozilla-central.
we should land the patch and investigate the rest of the leak in a new bug
Attachment #511858 - Flags: review?(mrbkap)
Attachment #511858 - Attachment is patch: true
Attachment #511858 - Attachment mime type: application/octet-stream → text/plain
Comment on attachment 511858 [details] [diff] [review] Updated patch Dvander, can you please review the ClearScope part.
Attachment #511858 - Flags: review?(dvander)
Attachment #511565 - Attachment is obsolete: true
Comment on attachment 511858 [details] [diff] [review] Updated patch I'm holding my breath that changing the signature of JS_ClearScope isn't going to bite us, but I "like" the approach.
Attachment #511858 - Flags: review?(mrbkap) → review+
Attachment #511858 - Flags: review?(dvander) → review+
Gal's patch landed: http://hg.mozilla.org/mozilla-central/rev/3fb25cc2c040 There's more to do here, so leaving bug open.
Here is my rationale for the change: If you ignore the return (OOM), you are no worse off than before: you cleared the scope, but it doesn't have enough slots for JM. Yes, this API change is lame but hey its the best we can do right now. This is too important to fix to ignore it and I don't like working around API design bugs such as ClearScope not being able to fail.
Blocks: 633672
No longer blocks: 633672
Depends on: 633672
Whiteboard: [hardblocker], fixed-in-tracemonkey
JST is saying the isWrapper() assert is hitting, that means we have a wrapper that has a non-wrapper proto. Weird! mrbkap? + do { + JS_ASSERT(obj->isWrapper()); + obj->setParent(global);
I had to back this out due to hitting the assertion: Assertion failure: obj->isWrapper(), at ../../../js/src/jscompartment.cpp:276 in the mochichrome test js/src/xpconnect/tests/chrome/test_evalInSandbox.xul. And it seems we also hit the same assertion in the browser chrome test toolkit/components/console/hudservice/tests/browser/browser_webconsole_bug_583816_tab_focus.js. http://hg.mozilla.org/mozilla-central/rev/704dd93dad14
Comment on attachment 511858 [details] [diff] [review] Updated patch >+ /* >+ * Compile-and-go scripts might rely on these slots to be present, >+ * so set a bunch of dummy properties to make sure compiled code >+ * doesn't reach into empty space. >+ */ >+ uint32 n = 0; >+ while (obj->slotSpan() < span) { >+ if (!JS_DefinePropertyById(cx, obj, INT_TO_JSID(n), JSVAL_VOID, NULL, NULL, 0)) >+ return false; >+ ++n; >+ } > } > Do not make JS_ClearScope fallible. If any compile-n-go scripts remain that use global vars (how can this be), I don't see how changing those slots to undefined can help. What's really going wrong here? /be
(In reply to comment #78) > Comment on attachment 511858 [details] [diff] [review] > Updated patch > > >+ /* > >+ * Compile-and-go scripts might rely on these slots to be present, > >+ * so set a bunch of dummy properties to make sure compiled code > >+ * doesn't reach into empty space. > >+ */ > >+ uint32 n = 0; > >+ while (obj->slotSpan() < span) { > >+ if (!JS_DefinePropertyById(cx, obj, INT_TO_JSID(n), JSVAL_VOID, NULL, NULL, 0)) > >+ return false; > >+ ++n; > >+ } > > } > > > > Do not make JS_ClearScope fallible. If any compile-n-go scripts remain that use > global vars (how can this be), I don't see how changing those slots to > undefined can help. Indeed compile-n-go means run at most once, so if any remain, they are currently running! How in the world can it be necessary to set a running script's vars to undefined? Everything runs on the main thread, so this means JS_ClearScope is nesting on a stack under which script is in js::Interpret. This should *never happen*. /be
Whiteboard: [hardblocker], fixed-in-tracemonkey → [hardblocker], fixed-in-tracemonkey [has patch]
The script could enter a nested event loop. I think we try to avoid calling JS_ClearScope if script is running, but if there are multiple windows involved and code that belongs to both of them running we'd only avoid clearing the one the script was entered in, I bet. A stack to the JS_ClearScope (both JS and C++) would likely make it very clear what's happening.
Ok, but setting slot to undefined won't reliably cause the script to fail, or prevent it from setting slots to other values (IINM). Yeah, we need to get the stack, if any. That might reveal a lot. /be
Stack for the chrome test assertion: Assertion failure: obj->isWrapper(), at ../../../mozilla/js/src/jscompartment.cpp:276 #0 0x00007ffff7bce29b in raise () from /lib64/libpthread.so.0 #1 0x00007ffff656a118 in JS_Assert (s=0x7ffff6e5da0c "obj->isWrapper()", file= 0x7ffff6e5d928 "../../../mozilla/js/src/jscompartment.cpp", ln=276) at ../../../mozilla/js/src/jsutil.cpp:83 #2 0x00007ffff6438786 in JSCompartment::wrap (this=0x7fffde6e7000, cx= 0x7fffcd22b400, vp=0x7fffffff87e8) at ../../../mozilla/js/src/jscompartment.cpp:276 #3 0x00007ffff6438b19 in JSCompartment::wrap (this=0x7fffde6e7000, cx= 0x7fffcd22b400, objp=0x7fffffff8860) at ../../../mozilla/js/src/jscompartment.cpp:348 #4 0x00007ffff656d3cb in JSCrossCompartmentWrapper::get (this=0x7ffff7bb8fe0, cx=0x7fffcd22b400, wrapper=0x7fffcc8691a0, receiver=0x7fffcc868048, id= ..., vp=0x7fffffff91d0) at ../../../mozilla/js/src/jswrapper.cpp:516 #5 0x00007ffff650fd95 in js::JSProxy::get (cx=0x7fffcd22b400, proxy= 0x7fffcc8691a0, receiver=0x7fffcc868048, id=..., vp=0x7fffffff91d0) at ../../../mozilla/js/src/jsproxy.cpp:800 #6 0x00007ffff64c766e in js_GetPropertyHelperWithShapeInline (cx= 0x7fffcd22b400, obj=0x7fffcc868048, receiver=0x7fffcc868048, id=..., getHow=1, vp=0x7fffffff91d0, shapeOut=0x7fffffff8ab8, holderOut= 0x7fffffff8ab0) at ../../../mozilla/js/src/jsobj.cpp:5379 #7 0x00007ffff64c782c in js_GetPropertyHelperInline (cx=0x7fffcd22b400, obj= 0x7fffcc868048, receiver=0x7fffcc868048, id=..., getHow=1, vp= 0x7fffffff91d0) at ../../../mozilla/js/src/jsobj.cpp:5412 #8 0x00007ffff64c787a in js_GetPropertyHelper (cx=0x7fffcd22b400, obj= 0x7fffcc868048, id=..., getHow=1, vp=0x7fffffff91d0) at ../../../mozilla/js/src/jsobj.cpp:5418 #9 0x00007ffff66e69f9 in js::Interpret (cx=0x7fffcd22b400, entryFrame= 0x7fffdd1fe1a0, inlineCallCount=0, interpMode=JSINTERP_NORMAL) at ../../../mozilla/js/src/jsinterp.cpp:4135 #10 0x00007ffff64a0895 in js::RunScript (cx=0x7fffcd22b400, script= 0x7fffcd403fc0, fp=0x7fffdd1fe1a0) at ../../../mozilla/js/src/jsinterp.cpp:640 #11 0x00007ffff64a1d07 in js::Execute (cx=0x7fffcd22b400, chain= 0x7fffcc868048, script=0x7fffcd403fc0, prev=0x0, flags=0, result= 0x7fffffffa560) at ../../../mozilla/js/src/jsinterp.cpp:1006 #12 0x00007ffff63fd4f7 in EvaluateUCScriptForPrincipalsCommon (cx= 0x7fffcd22b400, obj=0x7fffcc868048, principals=0x7fffdf0b5298, chars= 0x7fffcd5f4280, length=25, filename= 0x7fffcd238f80 "chrome://mochitests/content/chrome/js/src/xpconnect/tests/ch rome/test_evalInSandbox.xul", lineno=101, rval=0x7fffffffa560, compileVersion= JSVERSION_DEFAULT) at ../../../mozilla/js/src/jsapi.cpp:5060 #13 0x00007ffff63fd633 in JS_EvaluateUCScriptForPrincipals (cx=0x7fffcd22b400, obj=0x7fffcc868048, principals=0x7fffdf0b5298, chars=0x7fffcd5f4280, length=25, filename= 0x7fffcd238f80 "chrome://mochitests/content/chrome/js/src/xpconnect/tests/chrome/test_evalInSandbox.xul", lineno=101, rval=0x7fffffffa560) at ../../../mozilla/js/src/jsapi.cpp:5087 #14 0x00007ffff596e441 in xpc_EvalInSandbox (cx=0x7fffd7dadc00, sandbox= 0x7fffcc868048, source=..., filename= 0x7fffcd238f80 "chrome://mochitests/content/chrome/js/src/xpconnect/tests/chrome/test_evalInSandbox.xul", lineNo=101, jsVersion=JSVERSION_DEFAULT, returnStringOnly=0, rval=0x7fffdd1fe138) at ../../../../../mozilla/js/src/xpconnect/src/xpccomponents.cpp:3688 #15 0x00007ffff596dd3f in nsXPCComponents_Utils::EvalInSandbox (this= 0x7fffcd4109d0, source=...) at ../../../../../mozilla/js/src/xpconnect/src/xpccomponents.cpp:3578 #16 0x00007ffff620bfaa in NS_InvokeByIndex_P (that=0x7fffcd4109d0, methodIndex= 6, paramCount=1, params=0x7fffffffaa60) at ../../../../../../../mozilla/xpcom/reflect/xptcall/src/md/unix/xptcinvoke_x86_64_unix.cpp:195 #17 0x00007ffff59adb84 in CallMethodHelper::Invoke (this=0x7fffffffaa20) at ../../../../../mozilla/js/src/xpconnect/src/xpcwrappednative.cpp:3126 #18 0x00007ffff59abb79 in CallMethodHelper::Call (this=0x7fffffffaa20) at ../../../../../mozilla/js/src/xpconnect/src/xpcwrappednative.cpp:2392 #19 0x00007ffff59a7c16 in XPCWrappedNative::CallMethod (ccx=..., mode= XPCWrappedNative::CALL_METHOD) at ../../../../../mozilla/js/src/xpconnect/src/xpcwrappednative.cpp:2356 #20 0x00007ffff59b798c in XPC_WN_CallMethod (cx=0x7fffd7dadc00, argc=2, vp= 0x7fffdd1fe138) at ../../../../../mozilla/js/src/xpconnect/src/xpcwrappednativejsops.cpp:1613 #21 0x00007ffff64a480e in js::CallJSNative (cx=0x7fffd7dadc00, native= 0x7ffff59b7719 <XPC_WN_CallMethod(JSContext*, uintN, jsval*)>, argc=2, vp= 0x7fffdd1fe138) at ../../../mozilla/js/src/jscntxtinlines.h:697 #22 0x00007ffff6668641 in CallCompiler::generateNativeStub (this= 0x7fffffffb5f0) at ../../../mozilla/js/src/methodjit/MonoIC.cpp:808 #23 0x00007ffff6664a20 in js::mjit::ic::NativeCall (f=..., ic=0x7fffcd2eb0e0) at ../../../mozilla/js/src/methodjit/MonoIC.cpp:1015 #24 0x00007fffdb846099 in ?? () #25 0x00007fffdb844ca8 in ?? () #26 0x0000000000000000 in ?? () (gdb) call js_DumpObject(obj) object 0x7fffcc851898 class 0x7ffff7b571a0 XPC_WN_ModsAllowed_NoCall_Proto_JSClass flags: delegate own_shape inDictionaryMode hasPropertyTable properties: "__defineSetter__": slot 11 "__defineGetter__": slot 10 "__lookupSetter__": slot 9 "__lookupGetter__": slot 8 enumerate "dump": slot 7 enumerate "scrollByLines": slot 6 enumerate "getSelection": slot 5 enumerate shared "top": slot -1 enumerate shared "parent": slot -1 enumerate shared "name": slot -1 enumerate "addEventListener": slot 4 enumerate "removeEventListener": slot 3 enumerate "dispatchEvent": slot 2 enumerate "getComputedStyle": slot 1 enumerate shared "localStorage": slot -1 enumerate shared "globalStorage": slot -1 enumerate shared "sessionStorage": slot -1 proto <Global Scope Polluter object at 0x7fffcc845900> parent <Window object at 0x7fffcc845828> private 0x7fffcd52fc40 slots: 0 (reserved) = undefined 1 = <function getComputedStyle at 0x7fffcc854b80 (JSFunction at 0x7fffcc854b80)> 2 = <function dispatchEvent at 0x7fffcc854c00 (JSFunction at 0x7fffcc854c00)> 3 = <function removeEventListener at 0x7fffcc854c80 (JSFunction at 0x7fffcc854c80)> 4 = <function addEventListener at 0x7fffcc854d00 (JSFunction at 0x7fffcc854d00)> 5 = <function getSelection at 0x7fffcc854f00 (JSFunction at 0x7fffcc854f00)> 6 = <function scrollByLines at 0x7fffcc854f80 (JSFunction at 0x7fffcc854f80)> 7 = <function dump at 0x7fffcc854e80 (JSFunction at 0x7fffcc854e80)> 8 = <function __lookupGetter__ at 0x7fffcc855080 (JSFunction at 0x7fffcc855080)> 9 = <function __lookupSetter__ at 0x7fffcc855100 (JSFunction at 0x7fffcc855100)> 10 = <function __defineGetter__ at 0x7fffcc855180 (JSFunction at 0x7fffcc855180)> 11 = <function __defineSetter__ at 0x7fffcc855200 (JSFunction at 0x7fffcc855200)>
Scripts can be run on a window using call after it was closed. We clear the scope because otherwise we leak. JM crashes after clearscope when running code. We need the clear scope. We need JM to not crash. Nobody likes this hack but we need to ship now. I don't need the bool return. After 4 we will try to remove the clear scope on window close. I will post a smaller patch without the API change.
How do compile-n-go scripts run after the window has been closed? Do you mean functions? /be
If we really need something like what the patch does, then rather than re-add properties to increase slotSpan, just (infallibly) trim the shape path back to the old slotSpan. /be
Summary: quora.com bloats out of control → quora.com bloats out of control, part 1: Wrappers hold objects from old scopes alive
How? Is setting slotSpan sufficient?
No, but clearing scopes back to below the saved slotSpan is apparently wrong -- can you say why? What happens, what bug (resolved already) records a malfunction? See JSObject::clear -- you want a variant that while-loops until !shape->parent OR shape->slotSpan == the saved slotSpan. Something still smells here, and needs to be diagnosed. Is this function code or global code that is running, with JS_ClearScope nesting under it? Or is function code (compilen-n-go) called again later? If so, can we kill such JS engine entry hard, instead of screwing around with global slots? /be
Shapes can keep function objects alive (branding). We can't use any part of the old shape paths. This is a terrible 10 year old hack. Nobody likes it. I wish we could fix this for real for 4 but we don't have that luxury. We need a low risk spot fix. If you want to take the bug feel free but if I keep it I really want to stick to a minimal fix. Too much to do. Too little time.
(In reply to comment #87) > No, but clearing scopes back to below the saved slotSpan is apparently wrong -- > can you say why? What happens, what bug (resolved already) records a > malfunction? I believe bug 611653 is what first brought the issue of compile-and-go functions running in a window where we've cleared scope.
The patch was backed out, so what is fixed here? My point is we need to understand the real problem that results in leaks if we get rid of JS_ClearScope. Without that, we're playing whack-a-mole. Andreas: I reviewed dmandelin's unbranding patch, no need to rehash that. That patch did deal with unbranding, the issue here (besides the patch bouncing) is the setting to undefined of slots up to the old slotSpan. Fixing this bug is not a matter of who jumps on the hand grenade, especially if the patch keeps bouncing. We need a better understanding of the problem before hacking on JS_ClearScope. jst: thanks, re-reading. /be
Johnny, that's the unbranding patch dmandelin did, with five reviews including mine. What I was looking for is a much older bug, where we tried getting rid of JS_ClearScope usage and just letting old inner windows become garbage. Blake may remember, or possibly no one knows what caused the malfunctions (leaks at least). If we can figure that out, I think we can make a good short term patch. If we can't, I'm not sure what to do. At the least, instead of defining int-id'ed properties with undefined values in slots that could be in use by a running script, we'd be better off killing all running scripts and prevening any from running. Just voiding global slots may or may not cure leaks, but it can't be good for those scripts or any effects they can commit. /be
It's not the slots. You are holding shapes which hold getters setters and functions. Unbranding and clearing slots is not sufficient.
(In reply to comment #92) > It's not the slots. You are holding shapes which hold getters setters and > functions. Unbranding and clearing slots is not sufficient. Then why are you doing the latter in the patch that bounced for this bug? The patch for bug 611653 did unbrand correctly AFAICT, and all JS_ClearScope implementations have always removed shapes holding getters, setters, and methods. That's what clearing the global scope has always done. Again, what is the leak or other problem that arises if we don't bother clearing global scope at all? If we have a latent memory leak, maybe we need something like global clearing to work around it. But we would be better off not even letting compile-and-go scripts (functions? must be) run against the cleared global. Why not batten down the hatches? /be
Bug 470510 is about leaks when we remove JS_ClearScope. It's been on my plate to debug why, but there's been too much other stuff to do.
Comment on attachment 511858 [details] [diff] [review] Updated patch >- if (obj->isNative()) { >- js_UnbrandAndClearSlots(mContext, obj); As bug 611653 comment 78 and later pointed out, unbranding is infallible. These removed lines acted accordingly. The new patch does not, and that's a bug. >@@ -3838,16 +3840,36 @@ JS_ClearScope(JSContext *cx, JSObject *o > if (obj->isNative()) > js_ClearNative(cx, obj); > >+ js_InitRandom(cx); >+ > /* Clear cached class objects on the global object. */ > if (obj->isGlobal()) { >+ if (!obj->unbrand(cx)) >+ return false; So this should just call obj->unbrand(cx) and ignore the r.v. >+ /* >+ * Compile-and-go scripts might rely on these slots to be present, >+ * so set a bunch of dummy properties to make sure compiled code >+ * doesn't reach into empty space. >+ */ >+ uint32 n = 0; >+ while (obj->slotSpan() < span) { >+ if (!JS_DefinePropertyById(cx, obj, INT_TO_JSID(n), JSVAL_VOID, NULL, NULL, 0)) >+ return false; >+ ++n; >+ } From bug 596145 it seems we have events targeting nodes with handlers scoped by the global that was clear-scoped. The new-in-Fx4 GNAME ops optimize name to slot. Therefore we must prevent such control flows from reaching any functions compiled via a compile-n-go load of their enclosing script. So while laying down undefines is tempting, I continue to believe that if we can't fix the underlying leaks, we ought to stop any such precompiled-as-if-the-global-is-never-cleared scripts (functions) from running. It may be that these control flows are the source of the leaks peterv has not yet diagnosed. Johnny, Blake, anyone: why are events flowing to an inner window that has been cleared? /be
bz may know this: if the inner is back or forward in history, should its DOM not be frozen from getting events? If cleared but not yet back or forward, that seems like a bad intermediate state in which to receive events. /be
(In reply to comment #96) > bz may know this: if the inner is back or forward in history, should its DOM > not be frozen from getting events? Of course (duh), if held by the bfcache, the DOM code notices the inner window is frozen and it won't clear scope. nsGlobalWindow::SetNewDocument calls FreeInnerObjects, which calls ClearScopeWhenAllScriptsStop, which takes pains to run itself via a termination function if the context is executing any script. In the stack roc included in bug 596145, we must have cleared the relevant window scope when nothing was running. Ergo something is busted in that machinery, or the clear happened earlier. Let's say the clear happened earlier. Perfectly plausible to have a window or frame navigate while another holds a function compile-specialized to the pre-navigation global (inner window). Then it must be the case that that inner was not bfcached. So it was clear-scoped. This line of reasoning leads to the conclusion that the gname optimization is unsound without either (a) removing clear-scope (fixing undiagnosed leaks); or (b) validating globals as "uncleared" before trying to run (function) scripts against them. Again I would prefer (b) to hacking on how clear works, if (a) is not possible. /be
OS: Linux → Windows CE
Blocks: 633879
Summary: quora.com bloats out of control, part 1: Wrappers hold objects from old scopes alive → quora.com bloats out of control, part 1: js_UnbrandAndClearSlots leaks
I narrowed the scope of this bug and split off the rest.
Attached patch patch (obsolete) (deleted) — Splinter Review
Attachment #511858 - Attachment is obsolete: true
Attached patch patch (deleted) — Splinter Review
Attachment #512087 - Attachment is obsolete: true
Attachment #512088 - Flags: review?(brendan)
Comment on attachment 512088 [details] [diff] [review] patch >+MSG_DEF(JSMSG_CLEARED_SCOPE, 269, 0, JSEXN_TYPEERR, "cleared scope object") A bit terse, and yet not specific. How about "attempting to run a script on a dead global object"? Or s/dead/cleared/ if that helps (I think it does not). >+++ b/js/src/jsapi.h >@@ -1957,22 +1957,26 @@ struct JSClass { > #define JSCLASS_MARK_IS_TRACE (1<<(JSCLASS_HIGH_FLAGS_SHIFT+3)) > #define JSCLASS_INTERNAL_FLAG2 (1<<(JSCLASS_HIGH_FLAGS_SHIFT+4)) > > /* Indicate whether the proto or ctor should be frozen. */ > #define JSCLASS_FREEZE_PROTO (1<<(JSCLASS_HIGH_FLAGS_SHIFT+5)) > #define JSCLASS_FREEZE_CTOR (1<<(JSCLASS_HIGH_FLAGS_SHIFT+6)) > > /* Additional global reserved slots, beyond those for standard prototypes. */ >-#define JSRESERVED_GLOBAL_SLOTS_COUNT 5 >+#define JSRESERVED_GLOBAL_SLOTS_COUNT 6 > #define JSRESERVED_GLOBAL_THIS (JSProto_LIMIT * 3) > #define JSRESERVED_GLOBAL_THROWTYPEERROR (JSRESERVED_GLOBAL_THIS + 1) > #define JSRESERVED_GLOBAL_REGEXP_STATICS (JSRESERVED_GLOBAL_THROWTYPEERROR + 1) > #define JSRESERVED_GLOBAL_FUNCTION_NS (JSRESERVED_GLOBAL_REGEXP_STATICS + 1) > #define JSRESERVED_GLOBAL_EVAL_ALLOWED (JSRESERVED_GLOBAL_FUNCTION_NS + 1) >+#define JSRESERVED_GLOBAL_FLAGS (JSRESERVED_GLOBAL_EVAL_ALLOWED + 1) >+ >+/* Global flags. */ >+#define JSGLOBAL_FLAGS_CLEARED 0x01 ..1234567890123456789012345678901234567890 Nit: indent to column 40, as the earlier #define macro bodies are (before the JSRESERVED_GLOBAL_MY_NAME_IS_TOO_LONG bunch :-). Thanks, this is more like it. Or not, since I have a nagging feeling this will result in overt errors where before we got obscure NaN or undefined or just(!) crash bugs. No good deed goes unpunished, so bring it! /be
Attachment #512088 - Flags: review?(brendan) → review+
Testing with the patch that landed here, and the patches from bug 633738 and bug 633879 I can no longer reproduce this leak!
Status: NEW → RESOLVED
Closed: 14 years ago
OS: Windows CE → All
Hardware: x86 → All
Resolution: --- → FIXED
Version: unspecified → Trunk
> should its DOM not be frozen from getting events? There is no way to enforce that; you can dispatch events from script using dispatchEvent, and you can hold refs to things that are history-cached.
Seems like we could use wrapper brain transplants to avoid that. Though once we have a memory cache that works well, and maybe some cached script compilation stuff, I wonder if bfcache is the big win it was before people really adapted to tabbed browsing!
I'd just like to say that while I understand the need for this patch, and actually think that preventing functions compiled in the scope of destroyed windows is generally for the best, this change breaks quite a lot of code in ways that I'm getting tired of seeing so late in the release cycle.
(In reply to comment #107) > I'd just like to say that while I understand the need for this patch, and > actually think that preventing functions compiled in the scope of destroyed > windows is generally for the best, this change breaks quite a lot of code in > ways that I'm getting tired of seeing so late in the release cycle. Breaks what? Please give URLs or examples? I'm not doubting but it's hard to act on "breaks quite a lot of code". Also, if we gave those functions undefined instead of an error they might be as broken. We need details to know, but what we did before this patch landed wasn't "right". It might have been expected more, though other browsers don't all do the same as we did. /be
The (wrong) behavior of FF has been for many years to clear all slots. Running a method against such a window is almost never successful. At least now we throw some detectable error. The real bug here is clearing the window which we do to paper over a leak/cycle collector problem which has existed for a decade and counting. I will make getting rid of that JS_ClearScope my top priority for post-FF4 (which is days away, really). Sorry that we can't provide a better fix for FF4. We are simply out of time.
Like I said, I like this change, I'm just unhappy that it happened so late in the release cycle. I think in most cases the errors that it throws will probably uncover more bugs than they cause, but I also think that it will cause unexpected breakage in extension code that's already been marked compatible with 4.0.* since AMO began allowing that over a month ago. The code that I've seen breaking has been code of mine carefully constructed not to (directly) access properties on the global, or otherwise timers that have fired just after window destruction. I'm not terribly bothered by the latter, and I've gotten rid of most of the former, but I'm certainly not the only one who's written such code. However, since I fortunately decided to mark my last beta compatible only to b11, after I got burned marking the previous beta 1.0 compatible, I'm not worried about this for my part. I will mention, though, that this error appears in the console after restoring a session from about:sessionrestore: Error: attempt to run compile-and-go script on a cleared scope Source file: chrome://browser/content/aboutSessionRestore.js Line: 279
I've seen it on google docs, too. If it were operating on a closure or similar which didn't manipulate the cleared global itself, would it see a change in behaviour here?
(In reply to comment #111) > I've seen it on google docs, too. If it were operating on a closure or similar > which didn't manipulate the cleared global itself, would it see a change in > behaviour here? Yes, absolutely. An attempt in js::RunScript when trying to run the function compiled against the defunct global is immediately thwarted by an exception. Previously we'd run such a function and if it happened to use globals it might see undefined values. Looks like we need to back off. One idea: if the script uses no globals, let it run. That might be enough. /be
Depends on: 635548
I didn't see this answered above: is there a reason that a call into a JS_ClearScope'd window must necessarily flow through RunScript? The counterexample I'm thinking of is a same-origin (so, no wrapper indirection) cross-global call.
Whiteboard: [hardblocker], fixed-in-tracemonkey [has patch] → [hardblocker], fixed-in-tracemonkey [has patch]mozmill-test-needed][mozmill-endurance]
Keywords: mlk
No longer blocks: mlk2.0
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: