865959 - [meta] need more aggressive GC on reload for pages that use lots of memory

Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond)

Reporter

Description

•

11 years ago

Attached file big-alloc.html (deleted) — Details

The attached testcase allocates a 1GB arraybuffer.  You should be able to keep reloading it practically forever, but right now we still have the objects from the previous page in memory, so future allocations fail. (32-bit build)

We should be looking at the memory in use by a page, and if it's over some threshold, we should aggressively reclaim on reload or navigation away (e.g. we shouldn't keep those pages around in bfcache but blow them away right away).

Andrew McCreight [:mccr8]

Updated

•

11 years ago

Whiteboard: [MemShrink]

Luke Wagner [:luke]

Comment 1

•

11 years ago

Duplicating the proposal in bug 865960 comment 3:
 1. on large ArrayBuffer allocations (and other places with the same large-allocation problem), if we fail, we call some "large allocation failure handler" that is a callback into the DOM
 2. from this handler, we drop everything in the BF Cache in the whole browser
 3. after calling the large allocation failure handler, we retry the allocation.  If this fails, then print an error message (or, even better, attempt to throw, so that the page can give the user the advice to close other tabs)

Here I say "BF Cache", but I don't really understand what I'm talking about and I'm told reload is a separate problem.  I'm hoping that the same callback could deal with reload as well.

Whiteboard: [MemShrink]

Andrew McCreight [:mccr8]

Updated

•

11 years ago

Whiteboard: [MemShrink]

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Updated

•

11 years ago

Whiteboard: [MemShrink] → [MemShrink:P2]

Nicholas Nethercote [inactive]

Comment 2

•

11 years ago

This is relevant to the post-fix state of bug 902922.  http://p3d.in/3axnM is the test case.

BTW, we now have code that can measure with reasonable accuracy how much memory a page (tab) is using (thanks to bug 918207).  On my fast Linux desktop, measuring the JS memory consumption of a gmail tab takes about 8ms (once bug 930876 lands), and measuring the non-JS memory consumption takes about 5ms.  So, fairly fast.

Nicholas Nethercote [inactive]

Comment 3

•

11 years ago

> BTW, we now have code that can measure with reasonable accuracy how much
> memory a page (tab) is using

Just to clarify:  the measurement of JS memory consumption is very comprehensive;  the measurement of everything else is moderately comprehensive (e.g. much of DOM, layout, style stuff is covered, but images aren't measured per-tab (bug 921300 is open for that), and lots of gfx stuff isn't measured).

Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond)

Reporter

Updated

•

11 years ago

Whiteboard: [MemShrink:P2] → [MemShrink:P2] [games:p1]

Andrew McCreight [:mccr8]

Comment 4

•

11 years ago

Maybe hooking something into WindowDestroyedEvent::Run would be the way to go?  That's where we do NukeCrossCompartmentWrappers.  Maybe we could have some kind of nuke-giant-array-buffers.

Andrew McCreight [:mccr8]

Comment 5

•

11 years ago

You could try printing something in that method to see if the timing is in line with what you are thinking of.

Andrew McCreight [:mccr8]

Comment 6

•

11 years ago

Let me try to throw something together...

Assignee: nobody → continuation

Andrew McCreight [:mccr8]

Comment 7

•

11 years ago

Attached patch nuke array buffers, WIP (obsolete) (deleted) — Details — Splinter Review

Here's a rough patch that seems to more or less handle the reload case.  I spam clicked reload for ten or so seconds on the test case, then quickly measured memory and it was 1gig.  In contrast, without that patch, there was an extra gig or two of heap-unclassified hanging around.

Basically, when we're notified that a window is destroyed, right in the place where we nuke cross compartment wrappers, we iterate over all objects in the zone (this should actually be the compartment, now that I think about it).  If we find any array buffers, we call stealContents on them, and then call js_free on the contents, freeing the memory.  This seems to take about a second, but it looks like that's all the js_free call.

Andrew McCreight [:mccr8]

Comment 8

•

11 years ago

Let me know if this helps the scenario you are thinking of.  Comments 0 and 1 have a ton of different scenarios, so I'm not sure what you are really after.

Nicholas Nethercote [inactive]

Comment 9

•

11 years ago

> we iterate over all objects
> in the zone (this should actually be the compartment, now that I think about
> it).

It might be easier to just create a new IterateCompartmentCells() function.

Andrew McCreight [:mccr8]

Comment 10

•

11 years ago

Yeah, I was going to write a less hacky patch with a new iterator if this is the right approach.  I think you still have to iterate over the entire zone, but filtering the compartment out in the iterator rather than the callback seems like it would be more efficient.

Marco Mucci [:MarcoM]

Updated

•

11 years ago

Blocks: gecko-games

Andrew McCreight [:mccr8]

Comment 11

•

11 years ago

Luke sent me this via email:

"Here's my current ideal:
  - whenever a page becomes inactive (or whatever the term is that means "not the current window of any tab"), if that page is flagged as containing a live large ArrayBuffer (which we could maintain with a counter on the zone that gets inc'd on alloc and dec'd on finalize), we'd immediately nuke it and schedule a GC+CC "real soon".
  - when certain large allocations are about to fail (viz., the ArrayBuffer's js_calloc), we synchronously: (1) nuke all non-active pages (evict the entire bf cache), (2) GC+CC."

Andrew McCreight [:mccr8]

Comment 12

•

11 years ago

For the second one, it sounds like maybe we want to hook up these certain large allocations to the browser's memory pressure notification system.  If we are doing this for large allocations, then we probably still have a decent amount of non-contiguous free memory, so I'd expect we can actually do something without falling over.  Then we'd want to make sure things like flushing the bfcache and synchronous GC/CC are hooked up to the memory pressure notification.  Does that sound right?

For the first one, I'm still not entirely clear on it.  It sounds like maybe you want pages with large ArrayBuffers to not ever be put in the bfcache?  I don't know if there's some mechanism to flag a window as uncachable.  You'd get worse navigation back-and-forth for these pages, but are they are stressing the limit of how much memory the browser can even hold, you are probably better off doing that.  Then, I think we'd want some kind of ArrayBuffer nuking mechanism like I added in the patch here to so we don't have to wait 5-15 seconds for GC/CC to cycle.

Andrew McCreight [:mccr8]

Comment 13

•

11 years ago

Are you really concerned with the general case of "pages that use lots of memory" or is dealing with "pages that have really large array buffers" sufficient?  What is the memory usage breakdown typically like for array buffer vs. images vs. sound vs. whatever for the games you are concerned about?

Andrew McCreight [:mccr8]

Comment 14

•

11 years ago

I'm trying to scope this to be more specific than the "make GC better" of the original summary, but more general than the initial example test case of "rapidly reload a page that just has a 1gig ArrayBuffer".

Luke Wagner [:luke]

Comment 15

•

11 years ago

(In reply to Andrew McCreight [:mccr8] from comment #12)
> For the second one, it sounds like maybe we want to hook up these certain
...
> pressure notification.  Does that sound right?

Yes.  What does 'hooking up to the memory pressure notification' entail, exactly?

> For the first one, I'm still not entirely clear on it.  It sounds like maybe
> you want pages with large ArrayBuffers to not ever be put in the bfcache?  I
> don't know if there's some mechanism to flag a window as uncachable.

Yes, that is a good way to say it.  It sounds like there is already a predicate that evaluates whether a window is cacheable (bz said if you have an onUnload handler you are not cacheable).


> You'd get worse navigation back-and-forth for these pages, but are they are
> stressing the limit of how much memory the browser can even hold, you are
> probably better off doing that.

Agreed

> Then, I think we'd want some kind of
> ArrayBuffer nuking mechanism like I added in the patch here to so we don't
> have to wait 5-15 seconds for GC/CC to cycle.

Ah, good idea!  So maybe the Zone keeps a list of its contained huge buffers and we have NukeCrossCompartmentWrappers also neuter all buffers in this list.  That way we get the synchronous free() call w/o the jank of a sync GC+CC.

(Just to confirm: if you are uncacheable, you get immediately Nuke'd?)

(In reply to Andrew McCreight [:mccr8] from comment #13)
> Are you really concerned with the general case of "pages that use lots of
> memory" or is dealing with "pages that have really large array buffers"
> sufficient?

The latter would be sufficient; although the former may be an interesting longer term goal.

> What is the memory usage breakdown typically like for array
> buffer vs. images vs. sound vs. whatever for the games you are concerned
> about?

There are a couple huge array buffers (>100MB: one for the heap, one for the VFS, several held by the AudioBuffer) that take the lion's share of memory, but I've also probably <200MB combined other WebGL, JS GC heap, JIT code and other resources that would be useful to clear "soon".  I think synchronously neutering the large ABs would get us most of the way, though.

(In reply to Andrew McCreight [:mccr8] from comment #14)
> I'm trying to scope this to be more specific than the "make GC better" of
> the original summary, but more general than the initial example test case of
> "rapidly reload a page that just has a 1gig ArrayBuffer".

Yes, much appreciated!  As stated above, it's not *quite* as simple as a single 1GB AB; more like several >50MB ABs that total more than 512MB.  It'd also be reasonable to only do this eviction (to preserve the bfcache benefits) on 32-bit systems.

Andrew McCreight [:mccr8]

Updated

•

11 years ago

Depends on: 936128

Andrew McCreight [:mccr8]

Updated

•

11 years ago

Depends on: 936218

Andrew McCreight [:mccr8]

Updated

•

11 years ago

Depends on: 936236

Andrew McCreight [:mccr8]

Comment 16

•

11 years ago

> (Just to confirm: if you are uncacheable, you get immediately Nuke'd?)

I'd expect so, but I haven't tested it.

Andrew McCreight [:mccr8]

Comment 17

•

11 years ago

If it proves useful, we could add more things to WindowDestroyedEvent::Run.  For instance, I get the impression that dumping the JIT code is no big deal.  tn also points out we could probably throw out image data for the current tab, under the assumption that mostly big images will only be used by a single tab.

Andrew McCreight [:mccr8]

Comment 18

•

11 years ago

This is kind of a meta bug.  I think I've filed everything mentioned in this bug prior to comment 17, and that should be enough to go on for now.

Keywords: meta

Andrew McCreight [:mccr8]

Comment 19

•

11 years ago

We usually don't put MemShrink tags on meta bugs so I'm removing it.

Whiteboard: [MemShrink:P2] [games:p1] → [games:p1]

Andrew McCreight [:mccr8]

Updated

•

10 years ago

Depends on: 965916

Luke Wagner [:luke]

Updated

•

10 years ago

No longer depends on: 936218

Luke Wagner [:luke]

Comment 20

•

10 years ago

Attached patch notify-on-failure (WIP) (obsolete) (deleted) — Details — Splinter Review

This unfinished patch synchronously sends the low-memory notification and it seems to fix the testcase on 32-bit Windows.

I'll probably hoist the logic from AllocateArrayBufferContents into some JSRuntime::{malloc,calloc,realloc}Reentrant functions that we can use later.  Also, Andrew, do you know if "heap-minimize" is the right "topic" for the notification?

Assignee: continuation → luke

Attachment #827756 - Attachment is obsolete: true

Status: NEW → ASSIGNED

Andrew McCreight [:mccr8]

Comment 21

•

10 years ago

I'm not really sure.  about:memory does "heap-minimize", but then other place I see "memory-pressure" being used (like the code I added in bug 965916).  I don't know what the relationship between those is.  Maybe njn does?

Nicholas Nethercote [inactive]

Comment 22

•

10 years ago

I DXR'd for "heap-minimize" and got this: http://dxr.mozilla.org/mozilla-central/source/xpcom/base/nsIMemory.idl#17

So "memory-pressure" is the topic, and "heap-minimize" is one of several sub-topics. The sub-topics are used only in a few places, it looks like. I can't tell you anything more than that.

Luke Wagner [:luke]

Comment 23

•

10 years ago

Oops, yes, I meant "sub-topic".  "heap-minimize" it is, then.

Luke Wagner [:luke]

Comment 24

•

10 years ago

Comment on attachment 8374498 [details] [diff] [review]
notify-on-failure (WIP)

This should have been in bug 936236.

Attachment #8374498 - Attachment is obsolete: true

Luke Wagner [:luke]

Comment 25

•

10 years ago

So unfortunately, attachment 742147 [details] still OOMs (on 32-bit Windows; not sure why it worked before but not now).  There is an improvement, though: the second reload always succeeds.  My first thought was background sweeping (hence bug 936236 comment 28), but that doesn't fix it.

Digging in, I can see that, on the reloads that fail, the big array buffer is never finalized (synchronously or even if I wait a while).  On reloads that succeed, the finalization happens before the NotifyObservers call returns.  In both fail/success cases, I can confirm the GC/CC/GC tri happen synchronously.  So it seems like something is temporarily keeping the window reachable on the refreshes that fail.  BF Cache is suspect, but (1) according to bz, refresh doesn't put the previous page in the cache, (2) manually adding window.unload listener doesn't fix either.

Andrew, perhaps you could help see how the should-be-dead window is staying alive?

Andrew McCreight [:mccr8]

Comment 26

•

10 years ago

> Andrew, perhaps you could help see how the should-be-dead window is staying alive?
Sure, I can look later today, or maybe early next week.

If you want to look at it before I get to it, what you do is:
1. Run the browser with XPCOM_CC_LOG_ALL=1
2. Run find_roots.py <log name> nsGlobalWindow on the CC log where you think the window should be cleaned up already (from https://github.com/amccreight/heapgraph ), which you can figure out by grepping for nsGlobalWindow, and seeing the last one where the window in question shows up.

Luke Wagner [:luke]

Comment 27

•

10 years ago

Any chance you could give a look?  (If not, I'll take a stab, but I think you are much more experienced interpreting the output.)

Andrew McCreight [:mccr8]

Comment 28

•

10 years ago

I started looking at it, but I got distracted by something.  I'll try to pick it up again...

Anthony Vaughn [:anthony][:avaughn][San Francisco - Pacific]

Updated

•

8 years ago

Whiteboard: [games:p1] → [games:p?]

BugBot [:suhaib / :marco/ :calixte]

Comment 29

•

2 years ago

The bug assignee is inactive on Bugzilla, so the assignee is being reset.

Assignee: mail → nobody

Status: ASSIGNED → NEW

BugBot [:suhaib / :marco/ :calixte]

Updated

•

2 years ago

Summary: need more aggressive GC on reload for pages that use lots of memory → [meta] need more aggressive GC on reload for pages that use lots of memory

BMO Automation

Updated

•

2 years ago

Severity: normal → S3

big-alloc.html 11 years ago Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond) (deleted), text/html		Details
nuke array buffers, WIP 11 years ago Andrew McCreight [:mccr8] (deleted), patch		Details \| Diff \| Splinter Review
notify-on-failure (WIP) 10 years ago Luke Wagner [:luke] (deleted), patch		Details \| Diff \| Splinter Review