Closed
Bug 746253
Opened 13 years ago
Closed 13 years ago
OOM when creating a lot of garbage without pauses with IGC on
Categories
(Core :: JavaScript Engine, defect)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
mozilla15
People
(Reporter: azakai, Assigned: till)
References
(Blocks 1 open bug)
Details
Attachments
(3 files)
(deleted),
application/octet-stream
|
Details | |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
igor
:
review+
|
Details | Diff | Splinter Review |
STR:
1. Unpack attached archive
2. python -m SimpleHTTPServer 8888
3. Browse to localhost:8888/client.html
In Firefox, memory usage rises dramatically until we run out of it, and there is a silent failure (the only way you can tell an OOM happened is to view memory usage on your machine). In Chrome, memory usage rises very little, and loading proceeds up to a later point where a "Shader compilation halt" exception is thrown (this is expected).
Reporter | ||
Updated•13 years ago
|
Attachment #615804 -
Attachment mime type: text/plain → application/octet-stream
Reporter | ||
Comment 1•13 years ago
|
||
(The expected exception can be seen in the web console.)
Reporter | ||
Comment 2•13 years ago
|
||
This only happens when javascript.options.mem.gc_incremental is set to true. So it looks like when IGC is on, garbage accumulates until we eventually OOM, it is not collected quickly enough.
There are 2 bugs here, though. Aside from garbage not being collected quickly enough, there is the question of why garbage is accumulating at all. The code itself is emscripten-compiled and does not generate any significant amount of garbage. So perhaps this is TI-generated stuff? (Note that all the unnecessary GCing probably explains why it takes much longer to load in Firefox than Chrome.)
Comment 3•13 years ago
|
||
What does about:memory look like when the memory usage is rising? In other words, what is it that is using more and more memory?
Whiteboard: [MemShrink]
Reporter | ||
Comment 4•13 years ago
|
||
The problem is that the memory increase happens during a continuous span of processing, without returning to the browser event loop, the UI hangs during that time. So I can't open about:memory until the OOM happens and memory goes back down.
Actually come to think of it, that might be connected to the OOM, if IGC only runs when code stops running.
Reporter | ||
Updated•13 years ago
|
Blocks: gecko-games
Reporter | ||
Comment 5•13 years ago
|
||
Turns out there was a bug in the JS here that did generate garbage, so GCing is expected in this testcase. Still, there remains the bug shown here that when garbage is created quickly before returning to the browser event loop, we can OOM with IGC on.
Reporter | ||
Updated•13 years ago
|
Summary: BananaBread OOM → OOM when creating a lot of garbage without pauses with IGC on
I'll take a look at this soon.
Assignee: general → wmccloskey
This seems to fix the problem. All these GCs are TOO_MUCH_MALLOC GCs. When one of these triggered, we would start incremental GC. However, since the event loop wasn't spinning, we wouldn't GC until the next 100MB of malloc bytes were used. There are checks in place if gcBytes grows too quickly, but not gcMallocBytes. This fixes that problem.
Attachment #620548 -
Flags: review?(igor)
Assignee | ||
Comment 8•13 years ago
|
||
This problem indicates that there is value in having gcMallocBytes tracked in the runtime, too, as opposed to what I was saying in bug 723350. Maybe the solution for that would be to set a flag on the runtime as soon as at least one compartment has triggered a TOO_MUCH_MALLOC GC?
I'll test with the attached test-case and a modified version of the patch from comment 7 and report back.
Comment 9•13 years ago
|
||
(In reply to Bill McCloskey (:billm) from comment #7)
> Created attachment 620548 [details] [diff] [review]
> patch
>
> This seems to fix the problem. All these GCs are TOO_MUCH_MALLOC GCs. When
> one of these triggered, we would start incremental GC. However, since the
> event loop wasn't spinning, we wouldn't GC until the next 100MB of malloc
> bytes were used. There are checks in place if gcBytes grows too quickly, but
> not gcMallocBytes. This fixes that problem.
But what is exactly the problem then? Bad incremental GC scheduling so its slice does not run as needed?
Assignee | ||
Comment 10•13 years ago
|
||
This patch builds on Bug 723350 and is strictly a reduced version of billm's patch that leaves out the runtime trigger.
Testing with azakai's test-case shows that it effectively removes the unbounded memory growth and leads to non-incremental GCs that only collect the affected compartment instead of being global.
Attachment #620649 -
Flags: review?(igor)
(In reply to Igor Bukanov from comment #9)
> But what is exactly the problem then? Bad incremental GC scheduling so its
> slice does not run as needed?
Yes, it's bad incremental scheduling. Once an incremental GC starts, there are three reasons why we'll do more slices:
1. There's a 100ms timer in nsJSEnvironment.
2. If we draw a frame, we do another slice.
3. If gcBytes grows too large, we'll immediately finish the GC.
Since we never run the event loop in this code, #1 and #2 will not trigger. And since the demo allocates a lot of malloc bytes but almost no JS objects, we won't hit #3.
We will do a slice every time the malloc trigger hits, which is every 100MB of malloc allocation. But this is not frequent enough to stop us from OOMing.
Trigger #3 is a "safety valve" to ensure that we GC frequently enough. The problem is that it wasn't tracking malloc bytes. This patch makes it so it does.
I'm fine with Till's patch since it fixes the problem as well.
Comment 12•13 years ago
|
||
Comment on attachment 620649 [details] [diff] [review]
Use non-incremental GCs for compartmental TOO_MUCH_MALLOC GCs
Review of attachment 620649 [details] [diff] [review]:
-----------------------------------------------------------------
As a spot fix I like this better.
Attachment #620649 -
Flags: review?(igor) → review+
Comment 13•13 years ago
|
||
(In reply to Bill McCloskey (:billm) from comment #11)
> 1. There's a 100ms timer in nsJSEnvironment.
> 2. If we draw a frame, we do another slice.
> 3. If gcBytes grows too large, we'll immediately finish the GC.
>
> Since we never run the event loop in this code, #1 and #2 will not trigger.
> And since the demo allocates a lot of malloc bytes but almost no JS objects,
> we won't hit #3.
We have all that wonderful operational callback infrastructure including the extra thread that reliably triggers the callback to stop the long-running script or do other things. We should use that to trigger the incremental GC scheduling reliably.
Attachment #620548 -
Flags: review?(igor)
(In reply to Igor Bukanov from comment #13)
> We have all that wonderful operational callback infrastructure including the
> extra thread that reliably triggers the callback to stop the long-running
> script or do other things. We should use that to trigger the incremental GC
> scheduling reliably.
Yeah, I guess it would be nice to use that to trigger the incremental slices.
However, I still don't see this patch as a spot fix. I've tried to design the incremental GC heuristics so that we always start the GC before any allocation triggers hit. If we do hit an allocation trigger (like gcTriggerBytes or gcMaxMallocBytes) then I think we should always do a non-incremental GC right away.
It would probably make sense to start an incremental GC from MaybeGC if gcMallocBytes == gcMaxMallocBytes/2, or something like that. I'm not sure how useful that would be in practice, though.
Comment 16•13 years ago
|
||
Assuming that attachment 620649 [details] [diff] [review] is the patch intended for checkin, it does not apply cleanly to inbound. Please rebase and re-request.
Keywords: checkin-needed
Assignee: wmccloskey → tschneidereit+bmo
Target Milestone: --- → mozilla15
Comment 18•13 years ago
|
||
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•