Closed Bug 1303754 Opened 8 years ago Closed 7 years ago

Reconsider lazy source code options

Categories: Core :: JavaScript Engine, defect
Severity: normal
Status: RESOLVED FIXED
People: Reporter: jandem, Assigned: jandem
Attachments: 1 file, 1 obsolete file

We have some flags (per compilation or per compartment) to discard source code completely, or to call a callback whenever we need the source code (for Function.prototype.toString etc.). These were added to save memory, but they inhibit lazy parsing, so they can actually make us use more memory.
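
For context, here is a minimal sketch, not the actual loader code, of the per-compilation knob involved; the helper and its setup are hypothetical, but setSourceIsLazy() is the option referred to below:

    // Minimal sketch, not the actual mozJSComponentLoader code. The helper is
    // hypothetical; setSourceIsLazy() is the real JS::CompileOptions flag.
    #include "jsapi.h"

    static void ConfigureForLazySource(JSContext* cx)
    {
        JS::CompileOptions options(cx);
        // Ask the engine not to keep the source in memory; it is fetched through
        // a source hook when e.g. Function.prototype.toString needs it. The
        // downside described above: with lazy or discarded source the engine
        // cannot do syntax-only (lazy) parsing, so every function is fully
        // parsed up front.
        options.setSourceIsLazy(true);
    }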

After removing the options.setSourceIsLazy call in mozJSComponentLoader::ObjectForLocation, I get the following results:

GC heap, parent process:

-2,076,640 B (92.69%) -- gc-things
  -1,303,832 B (58.19%) ── scripts
     557,376 B (-24.88%) ── lazy-scripts
    -530,368 B (23.67%) ── scopes
    -331,984 B (14.82%) ── objects
    -280,352 B (12.51%) ── shapes
    -211,864 B (09.46%) ── strings
      33,888 B (-1.51%) ── object-groups
      -9,504 B (00.42%) ── base-shapes

GC heap, 3 content processes combined:

-1,096,520 B (113.43%) -- gc-things
  -735,216 B (76.06%) ── scripts [3]
   346,560 B (-35.85%) ── lazy-scripts [3]
  -294,368 B (30.45%) ── scopes [3]
  -171,440 B (17.74%) ── strings [3]
  -109,384 B (11.32%) ── shapes [3]
   -90,832 B (09.40%) ── objects [3]
   -41,904 B (04.33%) ── object-groups [3]

Note that we have a lot more lazy scripts now and fewer scripts, scopes, strings, etc.

The actual win is a bit bigger because these numbers are only for the GC heap.
Attached patch Patch (obsolete) (deleted) — Splinter Review
Not sure if there's a better reviewer for this...

There's still the uncompressed-source-cache issue we talked about, but this patch is definitely a win locally and shrinks the GC heap a lot.
Attachment #8792597 - Flags: review?(luke)
Can you try throwing this at AWSY? I'm curious to see the behavior on a larger test set.

You can spin up a test with the following try syntax:

> try: -b o -p linux64 -u none -t none -awsy jandem_bug_1303754

I'd suggest doing two runs, one with just a baseline (no patch) and one with your changes.
Flags: needinfo?(jdemooij)
(In reply to Eric Rahm [:erahm] from comment #2)
> You can spin up a test with the following try syntax:

I actually did just that earlier today: https://treeherder.mozilla.org/#/jobs?repo=try&author=jandemooij@gmail.com&group_state=expanded

But it's not showing up here: https://areweslimyet.com/?series=jandem_lazyparsing - do I have to wait a bit longer?
Flags: needinfo?(jdemooij)
Flags: needinfo?(erahm)
And thanks for the reminder, I wanted to ping you about AWSY but forgot :)
(In reply to Jan de Mooij [:jandem] from comment #3)
> (In reply to Eric Rahm [:erahm] from comment #2)
> > You can spin up a test with the following try syntax:
> 
> I actually did just that earlier today:
> https://treeherder.mozilla.org/#/jobs?repo=try&author=jandemooij@gmail.com&group_state=expanded
> 
> But it's not showing up here:
> https://areweslimyet.com/?series=jandem_lazyparsing - do I have to wait a bit longer?

It looks like the try watcher was notified the build finished before the archives were uploaded; I'll reschedule the runs now. Should take about 1.5 hours.
Flags: needinfo?(erahm)
(In reply to Eric Rahm [:erahm] from comment #5)
> It looks like the try watcher was notified the build finished before the
> archives were uploaded; I'll reschedule the runs now. Should take about 1.5
> hours.

Thanks!
So most of the lines on AWSY show an improvement: https://areweslimyet.com/?series=jandem_lazyparsing

But I'm a bit worried about "JS: After TP5 [+30s]" in the last graph. Not sure why that shows a 9.61 MB regression. I'll investigate, maybe we're double counting stuff or does about:memory detect that?
(In reply to Jan de Mooij [:jandem] from comment #7)
> But I'm a bit worried about "JS: After TP5 [+30s]" in the last graph. Not
> sure why that shows a 9.61 MB regression. I'll investigate, maybe we're
> double counting stuff or does about:memory detect that?

Looking at the 5 iterations before/after, most of them show a clear win if we look at js-main-runtime:

Before: 132.92, 121.13, 132.53, 134.42, 120.78
After:  118.74, 118.36, 118.03, 128.08, 130.39

AWSY shows us the 5th iteration (120.78 -> 130.39), but that's unfortunate here as it seems to be an outlier and the 4 other iterations show a win.

If we look at that 130.39 number, we see 12.72 MB for "gc-heap", with 10.48 MB of unused-arenas and 1 MB of unused-chunks. Some of the other iterations I looked at have 0 for unused-arenas - I wonder if this is because of a race somewhere, where we're not deallocating them fast enough?
See comment 8. Could a race explain the large values we (occasionally) see for unused-arenas? I think I've noticed this before and fixing it should make about:memory and AWSY less noisy.
Flags: needinfo?(jcoppeard)
(In reply to Jan de Mooij [:jandem] from comment #7)
> But I'm a bit worried about "JS: After TP5 [+30s]" in the last graph. Not
> sure why that shows a 9.61 MB regression. I'll investigate, maybe we're
> double counting stuff or does about:memory detect that?

DMD will detect double counting, but about:memory will not.
(In reply to Jan de Mooij [:jandem] from comment #9)
> See comment 8. Could a race explain the large values we (occasionally) see
> for unused-arenas? I think I've noticed this before and fixing it should
> make about:memory and AWSY less noisy.

Fixing this (or understanding it and updating AWSY to avoid it) would be great. As-is we have a pretty noisy, pseudo-bimodal distribution [1]. A 9-10 MiB drift seems within our current threshold, and I agree that overall it looks like a win here.

[1] https://treeherder.mozilla.org/perf.html#/graphs?series=%5Bmozilla-inbound,0c9fbb47d39eaeca9628dafa82aecffb7659459b,1,4%5D
Comment on attachment 8792597 [details] [diff] [review]
Patch

Review of attachment 8792597 [details] [diff] [review]:
-----------------------------------------------------------------

Forwarding to bholley, since he wrote the code and is actually a peer in this module :)
Attachment #8792597 - Flags: review?(luke) → review?(bobbyholley)
Comment on attachment 8792597 [details] [diff] [review]
Patch

Review of attachment 8792597 [details] [diff] [review]:
-----------------------------------------------------------------

I didn't write this code (just shuffled it around), but happy to rubberstamp given that smart people have measured this.
Attachment #8792597 - Flags: review?(bobbyholley) → review+
Pushed by jandemooij@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/aad183efb09a
Don't use the lazySource option for JSMs, so we can benefit from syntax parsing. r=bholley
(In reply to Jan de Mooij [:jandem] from comment #9)
The reported value itself is calculated by subtracting all the other measurements from the total. The total comes from an atomic variable which is modified by both the main thread and the background sweep thread. However, GC would have to be active for there to be a possibility of getting this wrong, and I assume we wait for GC to finish before measuring.

It might be because we skip decommitting unused arenas if we are in high-frequency GC mode.  It's possible that the last GC happened in this mode and so we didn't decommit and this left a lot of unused arenas committed.
Flags: needinfo?(jcoppeard)
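
To make that concrete, a rough sketch, with made-up names rather than the real GC reporter code, of how a figure derived by subtraction can be skewed when the atomic total is still being updated by the background sweep thread:

    // Rough sketch with made-up names; the real accounting lives in the GC's
    // memory reporters.
    #include <atomic>
    #include <cstddef>

    std::atomic<size_t> gcHeapTotalBytes;            // main + background sweep threads
    size_t scriptBytes, lazyScriptBytes, scopeBytes; // measured on the main thread

    size_t UnusedArenaBytes()
    {
        // "Unused" is whatever remains after subtracting everything we measured
        // from the total. If the background thread commits or decommits arenas
        // between this load and the per-category measurements, the difference is
        // skewed -- which is why this should only matter while a GC is active.
        size_t total = gcHeapTotalBytes.load(std::memory_order_relaxed);
        return total - (scriptBytes + lazyScriptBytes + scopeBytes /* + ... */);
    }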
To make memory reporting more accurate, we already call a function [1] called jemalloc_purge_freed_pages() on OSX which simply protects and unprotects all chunks with madvise(MADV_FREE)d pages to force the OS to page them out. We should probably add something similar for the GC, but maybe we could also run the decommit phase before that on all OSes.

[1] https://dxr.mozilla.org/mozilla-central/source/xpcom/base/nsMemoryReporterManager.cpp#445
[2] https://dxr.mozilla.org/mozilla-central/source/memory/mozjemalloc/jemalloc.c#6735
Er, I meant to link [2] after the description of the implementation.
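
For illustration, a minimal sketch of the protect/unprotect trick described above; the function name and the assumption that we already know the region are mine, not mozjemalloc's actual chunk walk in [2]:

    #include <sys/mman.h>
    #include <cstddef>

    // Illustrative only: force the OS to actually reclaim pages that were
    // madvise(MADV_FREE)'d so a later RSS reading is not inflated.
    // jemalloc_purge_freed_pages() walks mozjemalloc's chunk metadata; here we
    // simply assume we already know one such region.
    static void PurgeFreedRegion(void* addr, std::size_t len)
    {
        // Cycling the protection forces the kernel to drop the lazily-freed
        // physical pages backing this range.
        mprotect(addr, len, PROT_NONE);
        mprotect(addr, len, PROT_READ | PROT_WRITE);
    }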
Depends on: 1304719
Unfortunately, this regressed the (desktop) installer size by a few MB (bug 1304719). The problem is that omni.ja is *smaller* now, but it's harder to compress (it contains more compressed source code instead of bytecode).

Talos also reports various regressions in ts_paint, tpaint, sessionrestore, etc.

I think I'll just back this out for now. Content processes don't use the startup cache, so maybe we could make this change only there. I'll do some measurements with that.
Keywords: leave-open
Depends on: 1304979
Attached patch Patch (deleted) — Splinter Review
Here's a patch to disable lazy source code *if the startup cache is disabled* (i.e. content processes).

This way it should not affect the parent process, the startup cache, or the installer size, but we still avoid fully parsing all modules in content processes.
Attachment #8792597 - Attachment is obsolete: true
Attachment #8799741 - Flags: review?(bobbyholley)
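
A rough sketch of the new condition, as a hypothetical helper rather than the patch itself; reuseLoaderGlobal and haveStartupCache stand in for mReuseLoaderGlobal and the loader's startup-cache check quoted in the review below:

    // Sketch only; parameter names stand in for mReuseLoaderGlobal and the
    // loader's check for a usable startup cache.
    static bool ShouldUseLazySource(bool reuseLoaderGlobal, bool haveStartupCache)
    {
        // With a startup cache (parent process), keeping lazy source avoids the
        // omni.ja / installer-size and startup regressions from bug 1304719.
        // Without one (content processes), lazy source only forces eager full
        // parses, so prefer syntax-only parsing instead.
        return !reuseLoaderGlobal && haveStartupCache;
    }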
Comment on attachment 8799741 [details] [diff] [review]
Patch

Review of attachment 8799741 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/xpconnect/loader/mozJSComponentLoader.cpp
@@ +716,5 @@
>          LOG(("Slow loading %s\n", nativePath.get()));
>  
> +        // Use lazy source if both of these conditions hold:
> +        //
> +        // (1) mReuseLoaderGlobal is false. If mReuseLoaderGlobal is true, we

FWIW, we could rip mReuseLoaderGlobal out at this point, since b2g is gone.
Attachment #8799741 - Flags: review?(bobbyholley) → review+
Pushed by jandemooij@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/9c89fbb659a8
Don't use lazy source code if the startup cache is not available (i.e. content processes). r=bholley
Should this get marked as fixed?
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED