Closed Bug 961324 Opened 11 years ago Closed 9 years ago

Counters for nursery and tenured heap allocation

Categories

(Core :: JavaScript Engine, defect)

25 Branch
x86
macOS
defect
Not set
normal

RESOLVED WONTFIX

People

(Reporter: fitzgen, Unassigned)

References

(Blocks 2 open bugs)

Details

In the developer tools' memory tool, we want to be able to present populations and total sizes for categories of objects while the system runs. Experience suggests that running counters often become inaccurate, and the best way to get reliable numbers is to do a full traversal of the heap (a "census"). However, we want to produce a graph showing heap usage with fine granularity, and running a full traversal for every data point might not be feasible.

Thus, we need a hybrid approach in which we conduct a full census occasionally, and then compute intermediate population estimates by counting allocations as they occur.

(We don't need to count deallocations as they occur. Instead, we simply wait until the next census to supply authoritative numbers, and begin counting afresh from that point. Note that no modern, efficient GC can identify the exact moment when an object becomes unreachable; they all delay deallocation to the end of a GC cycle. Generational collectors may put off examining the tenured heap for quite a while, meaning that objects may not be recognized as garbage for quite some time.)

To implement this, we need a hook in SpiderMonkey that will report each newly allocated object to our aggregator to be counted. This hook should distinguish allocations in the nursery, allocations directly in the tenured heap, and promotions from the nursery to the tenured heap. We also need to be notified when the nursery has been cleared after a minor GC cycle. By tracking the nursery population separately from that of the tenured heap, and by counting promotions from the nursery to the tenured heap, we can infer how many objects were not promoted, and thus improve the accuracy of our intermediate populations.
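
As a rough illustration of the bookkeeping described above, here is a minimal sketch of such an aggregator. It is not SpiderMonkey code: the class `HeapEstimator` and all of its method names are hypothetical, and it assumes the engine would call one method per event (nursery allocation, direct tenured allocation, promotion, nursery cleared) and that a census reports the tenured population after the nursery has been evicted.

```python
from dataclasses import dataclass


@dataclass
class HeapEstimator:
    """Hypothetical aggregator: census baseline plus allocation counters."""

    tenured: int = 0   # estimated tenured population since the last census
    nursery: int = 0   # objects allocated in the nursery since the last minor GC
    promoted: int = 0  # promotions seen during the current minor GC

    def on_census(self, tenured_count: int) -> None:
        # A census is authoritative, so discard every running estimate.
        # Assumption: the census is taken with an empty nursery.
        self.tenured = tenured_count
        self.nursery = 0
        self.promoted = 0

    def on_nursery_alloc(self) -> None:
        self.nursery += 1

    def on_tenured_alloc(self) -> None:
        self.tenured += 1

    def on_promotion(self) -> None:
        # The object moves from the nursery to the tenured heap.
        self.promoted += 1
        self.tenured += 1

    def on_nursery_cleared(self) -> int:
        # Anything allocated in the nursery but not promoted is dead.
        died = self.nursery - self.promoted
        self.nursery = 0
        self.promoted = 0
        return died

    def estimate(self) -> int:
        # Upper bound: tenured deaths are unknown until the next census.
        return self.tenured + self.nursery
```

Because tenured deallocations are never counted, `estimate()` can only drift upward between censuses; the next `on_census` call corrects it.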
Depends on: 961328
No longer blocks: 960671
> To implement this, we need a hook in SpiderMonkey that will 
> report each newly allocated object to our aggregator to be counted.

This isn't really going to work. The problem is that JavaScript makes a startlingly large number of allocations, even for trivial code. Our allocator's fast path is ~5 inline instructions, so even bumping an extra counter is going to have a catastrophic impact on our benchmark scores. A call of any sort is, of course, right out.

That said, we do track a huge amount of relevant information. It's an old adage, but true in this case: performance, simplicity, correctness; pick 2. We can't compromise performance, but if you are willing to compromise a bit on the other two, we should be able to build a kick-ass visualization with the data we have, even if not the exact chart you have in mind right now.

The way I see this working is: census when we do a full GC, with periodic polling for updates by the devtools in between. The key here is that the polling can be done on human-relevant timescales, so it would have effectively zero impact on performance.

Please keep in mind that even full GCs are highly performance sensitive: we do very, very little work per-object on the main thread, so it will be non-trivial to even get object counts, although I think it could be done with a bit of work. Additionally, I would just ignore the nursery, or think of it as ballast for the moment. We have no idea what objects are in it until we collect it. Our best bet is just to collect it when we poll and work exclusively in the tenured heap. Nursery collections take microseconds and happen a few times a second, so I wouldn't worry about a few extra when the devtools are open.
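
To make the polling idea concrete, here is a toy sketch of "collect the nursery when we poll, then look only at the tenured heap". `FakeEngine` and `poll` are made-up names standing in for the engine and the devtools side; nothing here is a real SpiderMonkey API, and the `survives` flag is a simplification that replaces actual reachability.

```python
class FakeEngine:
    """Toy stand-in for the JS engine (hypothetical, not SpiderMonkey)."""

    def __init__(self):
        self.nursery = []   # (size, survives) pairs, unknown until collected
        self.tenured = []   # sizes of objects in the tenured heap
        self.minor_gcs = 0

    def allocate(self, size, survives):
        # The fast path only appends to the nursery; no counters are bumped.
        self.nursery.append((size, survives))

    def minor_gc(self):
        # Promote survivors into the tenured heap and empty the nursery.
        self.tenured.extend(size for size, survives in self.nursery if survives)
        self.nursery.clear()
        self.minor_gcs += 1

    def tenured_bytes(self):
        return sum(self.tenured)


def poll(engine):
    """One devtools poll: evict the nursery, then read tenured usage."""
    engine.minor_gc()
    return engine.tenured_bytes()
```

All of the cost sits in `poll`, which runs at whatever rate the tool chooses, and the allocation path itself is left untouched.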
What sort of time resolution of scanning are you looking for?  Particularly if you are only interested in a single tab, scanning the heap isn't too many ms in my experience.
IMHO, 1-5Hz would be more than sufficient for basically all practical uses, which would lean towards just scanning every few hundred ms. That said, a memory use chart which updates at 60Hz would be totally rad.
Well, the object metadata callback, at least, does need to get called for every allocation. The JIT takes a slow path when the callback is set:

https://hg.mozilla.org/integration/mozilla-inbound/file/c75f13d4f160/js/src/jit/IonMacroAssembler.cpp#l653

This tool is not going to be enabled all the time; it's fine to take a slow path when the user cares.
(In reply to Andrew McCreight [:mccr8] from comment #2)
> What sort of time resolution of scanning are you looking for?  Particularly
> if you are only interested in a single tab, scanning the heap isn't too many
> ms in my experience.

That would be nice. We want to basically conduct a "census" of the heap, visiting each object and deciding how to categorize it along several axes (DOM node? Which tag? Inserted? If not, which constructor? etc.)

We also want the ability to produce a "core dump" of the heap, that can serve as a complete snapshot of the heap's contents. But those would be very heavyweight, and necessarily taken only at the developer's request.
> That would be nice. We want to basically conduct a "census" of the heap,
> visiting each object and deciding how to categorize it along several axes
> (DOM node? Which tag? Inserted? If not, which constructor? etc.)

Have a look at http://mxr.mozilla.org/mozilla-central/source/js/src/vm/MemoryMetrics.cpp#258. That's the code used by the JS memory reporter to visit and measure every cell in the GC heap.
Yeah, that's extremely similar to what I'd imagined. However, I think the memory tool will want a broader set of categories than what I see in cstats right now. I wonder how we could share as much as possible...
Nick, are we good here now that we have the census infrastructure?
Flags: needinfo?(nfitzgerald)
On the one hand, this was supposed to be complementary to the census infrastructure and fill the gaps between taking censuses, so no, the census infrastructure doesn't make this obsolete. On the other hand, I think we have largely decided to say "mu" to the question this API was an attempt to provide answers for. We don't need it in the foreseeable future.
Status: NEW → RESOLVED
Closed: 9 years ago
Flags: needinfo?(nfitzgerald)
Resolution: --- → WONTFIX
Hi Fitzgen,

Is this feature still desirable?  I landed Bug 1473213 to provide such counters. Right now they're enabled only in Nightly, but they also don't seem to affect performance (not that I can isolate from other patches landed at the same time).  Nevertheless, I have Bug 1480001 to enable these only when the profiler is active, so any concerns about the allocation hot-path are now moot.

You could have this feature and I guess enable it either if devtools is open/collecting memory info or the profiler is recording.

(Looking at this bug because I'm reviewing all memory profiling bugs.)
Flags: needinfo?(nfitzgerald)
In general, I think we should start from specific developer problems or feature plans, dream up ways to implement exciting things, and work backwards from there to specific hacks.

If we have specific plans for UI changes to present this information - say, displaying it in perf.html - then it makes sense to pursue it. But otherwise, we've spent time before building things that don't end up really getting used, and while it's nice to say, "Hey, this facility is available if anyone has ideas" it doesn't end up feeling that satisfying.
(In reply to Jim Blandy :jimb from comment #11)
> If we have specific plans for UI changes to present this information - say,
> displaying it in perf.html.

:-)  That's exactly why I added it.

I just happened to be reviewing old memory profiling bugs and came across this so if there's a more general need for devtools I wanted to let people know that the underlying support is now available.
Oh, great!
I think Jim covered everything :)
Flags: needinfo?(nfitzgerald)
Blocks: heapprof