864927 - optimize in-chrome workers to be more memory efficient

Reporter

Description

•

12 years ago

The RIL JS worker thread uses a lot of memory. I strongly suspect that if we rewrote it in C++, we'd be able to reduce the compoment's memory usage significantly. Here's a dump of the worker thread's memory usage. Most-to-all of unused-arenas will go away after bug 829482, but even after that this worker uses way too much memory. > 2.83 MB (08.26%) -- worker(resource://gre/modules/ril_worker.js, 0x42fd6400) > ├──1.56 MB (04.56%) -- gc-heap > │ ├──1.53 MB (04.47%) ── unused-arenas > │ └──0.03 MB (00.09%) -- (2 tiny) > │ ├──0.03 MB (00.09%) ── chunk-admin > │ └──0.00 MB (00.00%) ── unused-chunks > ├──0.61 MB (01.77%) -- compartment(web-worker) > │ ├──0.29 MB (00.84%) -- gc-heap > │ │ ├──0.09 MB (00.27%) ── unused-gc-things [2] > │ │ ├──0.08 MB (00.24%) -- shapes > │ │ │ ├──0.04 MB (00.11%) ── tree > │ │ │ ├──0.03 MB (00.10%) ── dict > │ │ │ └──0.01 MB (00.03%) ── base > │ │ ├──0.06 MB (00.19%) -- objects > │ │ │ ├──0.03 MB (00.10%) ── non-function > │ │ │ └──0.03 MB (00.09%) ── function > │ │ ├──0.04 MB (00.13%) ── scripts > │ │ └──0.01 MB (00.03%) ── sundries [2] > │ ├──0.16 MB (00.47%) ── script-data > │ ├──0.05 MB (00.15%) -- objects-extra > │ │ ├──0.04 MB (00.12%) ── slots > │ │ └──0.01 MB (00.04%) ── elements > │ ├──0.03 MB (00.09%) ── type-inference/script-main > │ ├──0.03 MB (00.08%) ── other-sundries [2] > │ ├──0.03 MB (00.07%) ── analysis-temporary > │ └──0.02 MB (00.07%) -- shapes-extra > │ ├──0.01 MB (00.04%) ── compartment-tables > │ └──0.01 MB (00.03%) ── dict-tables > ├──0.41 MB (01.20%) -- runtime > │ ├──0.13 MB (00.36%) ── gc-marker > │ ├──0.11 MB (00.33%) ── script-sources > │ ├──0.06 MB (00.18%) ── atoms-table > │ ├──0.06 MB (00.18%) ── jaeger-code > │ ├──0.04 MB (00.10%) ── runtime-object > │ ├──0.00 MB (00.01%) ── dtoa > │ ├──0.00 MB (00.01%) ── stack > │ ├──0.00 MB (00.01%) ── temporary > │ ├──0.00 MB (00.00%) ── contexts > │ ├──0.00 MB (00.00%) ── script-filenames > │ ├──0.00 MB (00.00%) ── unused-code > │ ├──0.00 MB (00.00%) ── ion-code > │ ├──0.00 MB (00.00%) ── math-cache > │ └──0.00 MB (00.00%) ── regexp-code > └──0.25 MB (00.72%) -- compartment(web-worker-atoms) > ├──0.14 MB (00.41%) -- gc-heap > │ ├──0.13 MB (00.38%) ── strings > │ ├──0.01 MB (00.02%) ── unused-gc-things > │ └──0.00 MB (00.00%) ── sundries > ├──0.10 MB (00.30%) ── string-chars/non-huge > └──0.00 MB (00.01%) ── other-sundries

Gregor Wagner [:gwagner]

Comment 1

•

12 years ago

(In reply to Justin Lebar [:jlebar] from comment #0) > The RIL JS worker thread uses a lot of memory. I strongly suspect that if > we rewrote it in C++, we'd be able to reduce the compoment's memory usage > significantly. > I think it would be easier to make the GC parameters tunable for workers?

Justin Lebar (not reading bugmail)

Reporter

Comment 2

•

12 years ago

> I think it would be easier to make the GC parameters tunable for workers? If you want to try that, I don't think it could hurt. But what I see here to my naive eyes doesn't look like GC parameter problems (aside from unused-arenas, of course); it looks like we have a bunch of things inside a runtime / component, and they each eat up a tiny bit of space.

Justin Lebar (not reading bugmail)

Reporter

Updated

•

12 years ago

Whiteboard: [MemShrink]

Justin Lebar (not reading bugmail)

Reporter

Comment 3

•

12 years ago

There are four relevant workers here: * RIL worker (this bug) * Net worker (bug 864931) * wifi worker x2 (bug 864932)

Ken Chang[:kenkai]

Comment 4

•

12 years ago

(In reply to Justin Lebar [:jlebar] from comment #0) > The RIL JS worker thread uses a lot of memory. I strongly suspect that if > we rewrote it in C++, we'd be able to reduce the compoment's memory usage > significantly. > > Here's a dump of the worker thread's memory usage. Most-to-all of > unused-arenas will go away after bug 829482, but even after that this worker > uses way too much memory. > > > 2.83 MB (08.26%) -- worker(resource://gre/modules/ril_worker.js, 0x42fd6400) Justin, commercial RIL is implemented by C++. Could you please also analyse the memory size of commercial RIL? This answer will help us to clearly know the benefit of rewriting ril worker.

Patrick Wang (Chih-Kai Wang) (:kk1fff)

Comment 5

•

12 years ago

commercial ril 35.66 MB (100.0%) -- explicit ├───9.34 MB (26.18%) ++ js-non-window ├───7.07 MB (19.82%) -- workers/workers() │ ├──2.45 MB (06.88%) -- worker(resource://gre/modules/wifi_worker.js, 0x44658400) │ │ ├──1.77 MB (04.97%) -- gc-heap │ │ │ ├──1.73 MB (04.86%) ── unused-arenas │ │ │ └──0.04 MB (00.11%) ++ (3 tiny) │ │ └──0.68 MB (01.90%) ++ (3 tiny) │ ├──2.32 MB (06.51%) -- worker(resource://gre/modules/net_worker.js, 0x44185400) │ │ ├──1.74 MB (04.89%) -- gc-heap │ │ │ ├──1.70 MB (04.78%) ── unused-arenas │ │ │ └──0.04 MB (00.11%) ++ (3 tiny) │ │ └──0.58 MB (01.62%) ++ (3 tiny) │ └──2.29 MB (06.43%) -- worker(resource://gre/modules/wifi_worker.js, 0x42e44400) │ ├──1.79 MB (05.01%) -- gc-heap │ │ ├──1.75 MB (04.92%) ── decommitted-arenas │ │ └──0.03 MB (00.09%) ++ (3 tiny) │ └──0.51 MB (01.43%) ++ (3 tiny) ├───5.67 MB (15.90%) ── heap-unclassified ├───4.31 MB (12.09%) ++ window-objects ├───3.61 MB (10.13%) ++ images ├───2.73 MB (07.65%) ── freetype ├───1.84 MB (05.15%) ++ (11 tiny) └───1.09 MB (03.07%) ── xpti-working-set and the library file takes less than 500k. moz ril 39.84 MB (100.0%) -- explicit ├───9.93 MB (24.91%) -- workers/workers() │ ├──2.86 MB (07.17%) -- worker(resource://gre/modules/ril_worker.js, 0x42d39400) │ │ ├──1.54 MB (03.86%) -- gc-heap │ │ │ ├──1.50 MB (03.76%) ── unused-arenas │ │ │ └──0.04 MB (00.10%) ++ (3 tiny) │ │ ├──0.66 MB (01.66%) ++ compartment(web-worker) │ │ ├──0.41 MB (01.03%) ++ runtime │ │ └──0.25 MB (00.62%) ++ compartment(web-worker-atoms) │ ├──2.45 MB (06.15%) -- worker(resource://gre/modules/wifi_worker.js, 0x42d40c00) │ │ ├──1.77 MB (04.45%) -- gc-heap │ │ │ ├──1.73 MB (04.35%) ── unused-arenas │ │ │ └──0.04 MB (00.10%) ++ (3 tiny) │ │ └──0.68 MB (01.70%) ++ (3 tiny) │ ├──2.32 MB (05.83%) -- worker(resource://gre/modules/net_worker.js, 0x43f81400) │ │ ├──1.74 MB (04.37%) -- gc-heap │ │ │ ├──1.70 MB (04.27%) ── unused-arenas │ │ │ └──0.04 MB (00.10%) ++ (3 tiny) │ │ └──0.58 MB (01.45%) ++ (3 tiny) │ └──2.29 MB (05.76%) -- worker(resource://gre/modules/wifi_worker.js, 0x42d39800) │ ├──1.79 MB (04.48%) -- gc-heap │ │ ├──1.75 MB (04.40%) ── decommitted-arenas │ │ └──0.03 MB (00.08%) ++ (3 tiny) │ └──0.51 MB (01.28%) ++ (3 tiny) ├───9.52 MB (23.89%) ++ js-non-window ├───5.19 MB (13.02%) ── heap-unclassified ├───4.79 MB (12.02%) ++ images ├───4.26 MB (10.69%) ++ window-objects ├───2.73 MB (06.85%) ── freetype ├───1.69 MB (04.23%) ++ (10 tiny) ├───1.08 MB (02.71%) ── xpti-working-set └───0.67 MB (01.68%) -- startup-cache ├──0.67 MB (01.68%) ── data └──0.00 MB (00.00%) ── mapping

Justin Lebar (not reading bugmail)

Reporter

Comment 6

•

12 years ago

Note that heap-unclassified is ~500kb bigger with the commercial RIL. That may or may not be noise in the data; DMD would tell us. 500kb sounds like a lot for the RIL, but in any case, if we never ship the Mozilla RIL (can someone confirm this?), I'm happy to WONTFIX this bug. We can file a separate bug on the commercial RIL taking up 500kb...

Justin Lebar (not reading bugmail)

Reporter

Comment 7

•

12 years ago

We can re-open this one if we ever start shipping this RIL. For now, WONTFIX.

Status: NEW → RESOLVED

Closed: 12 years ago

Resolution: --- → WONTFIX

ying.xu

Updated

•

11 years ago

Status: RESOLVED → REOPENED

Resolution: WONTFIX → ---

Whiteboard: [MemShrink] → [MemShrink][tarako][~5MB]

Thinker Li [:sinker]

Comment 8

•

11 years ago

How is 5MB from? RIL in JS is good for stability, I would rather to keep it in JS instead of rewriting it in C++. I would rather to squash memory from other modules before starting this bug.

[:fabrice] Fabrice Desré

Comment 9

•

11 years ago

Yeah, discussing with Andreas he'd rather keep it in js too. The issue we face and should fix though is that on multi-sim devices like Tarako we use one worker per SIM. And each one uses around 2.5MB (maybe a bit less now that we removed jsctypes from the worker). Could we refactor to use a single worker with multiple sims?

ying.xu

Comment 10

•

11 years ago

OK (In reply to Thinker Li [:sinker] from comment #8) > How is 5MB from? RIL in JS is good for stability, I would rather to keep it > in JS instead of rewriting it in C++. I would rather to squash memory from > other modules before starting this bug.

Ken Chang[:kenkai]

Comment 11

•

11 years ago

(In reply to Fabrice Desré [:fabrice] from comment #9) > Yeah, discussing with Andreas he'd rather keep it in js too. The issue we > face and should fix though is that on multi-sim devices like Tarako we use > one worker per SIM. And each one uses around 2.5MB (maybe a bit less now > that we removed jsctypes from the worker). > > Could we refactor to use a single worker with multiple sims? It is possible. Vicamo will provide more details. By the way, is it really useless to rewrite RIL worker in C++? Rewriting RIL worker in C++ is very very huge task. If it have to be a target in the future, we need to have a plan to start this task now. Or we just pass this problem to future again.

Vicamo Yang [:vicamo][:vyang]

Comment 12

•

11 years ago

(In reply to Ken Chang[:ken] from comment #11) > (In reply to Fabrice Desré [:fabrice] from comment #9) > > Could we refactor to use a single worker with multiple sims? > It is possible. Vicamo will provide more details. Filed as bug 960894. > By the way, is it really useless to rewrite RIL worker in C++? Rewriting RIL > worker in C++ is very very huge task. If it have to be a target in the > future, we need to have a plan to start this task now. Or we just pass this > problem to future again. If memory pressure here is already not sufficient to make us re-consider this bug, there will probably be no other reason strong enough to re-open it again. I won't call it a uselees move. It's interesting and could be a chance to refactor our code in a more extreme way. But that is still not strong enough after asking advices from Thinker and Cervantes. WIP last night: https://github.com/vicamo/b2g_mozilla-central/tree/bugzilla/864927/rebuild-from-scratch

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Updated

•

11 years ago

Whiteboard: [MemShrink][tarako][~5MB] → [MemShrink:P1][tarako][~5MB]

Andreas Gal :gal

Updated

•

11 years ago

Summary: Rewrite RIL worker in C++ → optimize in-chrome workers to be more memory efficient

Andreas Gal :gal

Comment 13

•

11 years ago

I am sure we can win back most of the memory here with a little bit of JS engine work. At minimum it should be possible to start a ChromeWorker and disable the JIT from the get go if you know you won't need it (just interpreter). Also, we want workers to be able to share heaps more efficiently as well as code storage. I am sure the JS peeps have more ideas to improve this case.

Andreas Gal :gal

Updated

•

11 years ago

Assignee: nobody → nihsanullah

Andreas Gal :gal

Updated

•

11 years ago

Product: Firefox OS → Core

Andreas Gal :gal

Comment 14

•

11 years ago

Naveed, please discuss a plan with the team and file dependent bugs as needed. It would be good if we can get some initial improvements (if not improvements definitely an analysis) here quickly.

Component: General → JavaScript Engine

Nicolas B. Pierron [:nbp]

Comment 15

•

11 years ago

(In reply to Justin Lebar (not reading bugmail) from comment #0) > > 2.83 MB (08.26%) -- worker(resource://gre/modules/ril_worker.js, 0x42fd6400) > > ├──1.56 MB (04.56%) -- gc-heap > > │ ├──1.53 MB (04.47%) ── unused-arenas Is this still present? I thought we reduced the GC limit to start after 1 MB, and we are now supposed to decommit unused pages (Bug 941792) > > ├──0.61 MB (01.77%) -- compartment(web-worker) I am afraid there is no simple solution for this part. > > │ └──0.02 MB (00.07%) -- shapes-extra > > │ └──0.01 MB (00.03%) ── dict-tables Is there any reason to have objects with a lot of property in the ril_worker? > > ├──0.41 MB (01.20%) -- runtime Bug 916021 is opened to address these issues.

Depends on: 941783, 916021

[:fabrice] Fabrice Desré

Comment 16

•

11 years ago

Attached file memory-reports (deleted) — Details

Here's an up to date memory report from a dual sim device (hence the two ril workers for now - fix pending to share a single worker).

Nicholas Nethercote [inactive]

Comment 17

•

11 years ago

(In reply to Fabrice Desré [:fabrice] from comment #16) > Here's an up to date memory report from a dual sim device (hence the two ril > workers for now - fix pending to share a single worker). Here's some of the output from that report. │ ├──2,954,804 B (07.51%) -- worker(resource://gre/modules/ril_worker.js, 0x436c9c00) │ │ ├──1,677,088 B (04.26%) -- runtime │ │ │ ├────942,560 B (02.40%) ── script-sources │ │ │ ├────336,528 B (00.86%) ── script-data │ │ │ ├────262,144 B (00.67%) ── atoms-table │ │ │ ├─────65,536 B (00.17%) ++ code │ │ │ ├─────40,960 B (00.10%) ── runtime-object │ │ │ ├─────16,384 B (00.04%) ── gc-marker │ │ │ ├──────4,096 B (00.01%) ── dtoa │ │ │ ├──────4,096 B (00.01%) ── interpreter-stack │ │ │ ├──────4,096 B (00.01%) ── temporary │ │ │ ├────────688 B (00.00%) ── contexts │ │ │ ├──────────0 B (00.00%) ── math-cache │ │ │ └──────────0 B (00.00%) ── regexp-data │ │ ├────622,576 B (01.58%) ++ zone(0x446c5c00) │ │ ├────428,080 B (01.09%) ++ zone(0x446c5000) │ │ ├────194,292 B (00.49%) ++ zone(0x446c5800) │ │ └─────32,768 B (00.08%) ++ gc-heap │ └──2,918,164 B (07.42%) -- worker(resource://gre/modules/ril_worker.js, 0x446c4800) │ ├──1,677,088 B (04.26%) -- runtime │ │ ├────942,560 B (02.40%) ── script-sources │ │ ├────336,528 B (00.86%) ── script-data │ │ ├────262,144 B (00.67%) ── atoms-table │ │ ├─────65,536 B (00.17%) ++ code │ │ ├─────40,960 B (00.10%) ── runtime-object │ │ ├─────16,384 B (00.04%) ── gc-marker │ │ ├──────4,096 B (00.01%) ── dtoa │ │ ├──────4,096 B (00.01%) ── interpreter-stack │ │ ├──────4,096 B (00.01%) ── temporary │ │ ├────────688 B (00.00%) ── contexts │ │ ├──────────0 B (00.00%) ── math-cache │ │ └──────────0 B (00.00%) ── regexp-data │ ├────585,936 B (01.49%) ++ zone(0x449c0000) │ ├────428,080 B (01.09%) ++ zone(0x449bec00) │ ├────194,292 B (00.49%) ++ zone(0x449bf800) │ └─────32,768 B (00.08%) ++ gc-heap The obvious thing: do we really need two workers? Because a ton of that stuff is entirely duplicated. The script-sources are identical, the script-data is identical, the atoms-tables are probably identical or almost identical, etc, etc. Also, I've lost track of the various source/parsing optimizations we have going now... can we say "don't bother storing the source for this worker, I promise I won't call toSource() on it"? Relatedly, I bet that script source isn't compressed because we only have one core on these phones. And I've been wondering for a while whether our atoms-tables could be made smaller. I've never quite worked out exactly which strings get atomized and why. Finally, it's not shown in the entries I've cut and pasted above, but the various zone sub-trees for these workers seem reasonable. There's nothing that sticks out as obviously wasteful, e.g. no big chunks of unused GC heap or anything like that -- indeed, the "Other Measurements" section has this: 3,096,576 B (100.0%) -- decommitted ├──2,564,096 B (82.80%) -- workers/workers() │ ├──1,298,432 B (41.93%) ── worker(resource://gre/modules/ril_worker.js, 0x446c4800)/gc-heap/decommitted-arenas │ └──1,265,664 B (40.87%) ── worker(resource://gre/modules/ril_worker.js, 0x436c9c00)/gc-heap/decommitted-arenas └────532,480 B (17.20%) ── js-non-window/gc-heap/decommitted-arenas which shows that decomitting is working well. (In reply to Andreas Gal :gal from comment #13) > I am sure we can win back most of the memory here with a little bit of JS > engine work. At minimum it should be possible to start a ChromeWorker and > disable the JIT from the get go if you know you won't need it (just > interpreter). The JITs aren't a problem -- generated code is only 64 KiB in each worker.

[:fabrice] Fabrice Desré

Comment 18

•

11 years ago

(In reply to Nicholas Nethercote [:njn] from comment #17) > (In reply to Fabrice Desré [:fabrice] from comment #16) > > Here's an up to date memory report from a dual sim device (hence the two ril > > workers for now - fix pending to share a single worker). > ... > The obvious thing: do we really need two workers? Because a ton of that > stuff is entirely duplicated. The script-sources are identical, the > script-data is identical, the atoms-tables are probably identical or almost > identical, etc, etc. As I said, work is underway (in bug 960894) to have a single worker. > Also, I've lost track of the various source/parsing optimizations we have > going now... can we say "don't bother storing the source for this worker, I > promise I won't call toSource() on it"? Relatedly, I bet that script source > isn't compressed because we only have one core on these phones. I don't think we can do that by worker. That would be nice to be able to do "no source" like we have "use strict". nbp, does that look feasible/stupid ?

Flags: needinfo?(nicolas.b.pierron)

Andreas Gal :gal

Comment 19

•

11 years ago

This is chrome JS so anything goes. I recommend we just disable toSource in general for chrome via a pref. At least for FFOS we can definitely make that work. I am pretty sure we don't depend on it.

Andreas Gal :gal

Comment 20

•

11 years ago

If I read the code correctly for chrome we already load the source lazily and do not keep it around. Is "script-sources" maybe the bytecode?

Nicholas Nethercote [inactive]

Comment 21

•

11 years ago

> Is "script-sources" maybe the bytecode? If you hover over an entry in about:memory you get a tool-tip describing it. "script-sources" is definitely the source code. Bytecode is either in "script-data" (in the zone) or "scripts/malloc-heap/data" (in the compartments)... I don't recall how that split works now. Till would know. So maybe the lazy-loading isn't working, or isn't enabled in that repository, or maybe something is causing it to be loaded?

Andreas Gal :gal

Comment 22

•

11 years ago

Oh boy can we save a ton of memory if comment 21 is right. Looking at the source we should be using lazy script loading for chrome. Something must be broken.

Andreas Gal :gal

Comment 23

•

11 years ago

mozJSComponentLoader.cpp:810 CompileOptions options(cx); options.setPrincipals(nsJSPrincipals::get(mSystemPrincipal)) .setNoScriptRval(mReuseLoaderGlobal ? false : true) .setVersion(JSVERSION_LATEST) .setFileAndLine(nativePath.get(), 1) .setSourcePolicy(mReuseLoaderGlobal ? CompileOptions::NO_SOURCE : CompileOptions::LAZY_SOURCE);

Andreas Gal :gal

Comment 24

•

11 years ago

pref("jsloader.reuseGlobal", true); so we are supplying NO_SOURCE here

Nicolas B. Pierron [:nbp]

Comment 25

•

11 years ago

(In reply to Fabrice Desré [:fabrice] from comment #18) > (In reply to Nicholas Nethercote [:njn] from comment #17) > > Also, I've lost track of the various source/parsing optimizations we have > > going now... can we say "don't bother storing the source for this worker, I > > promise I won't call toSource() on it"? Relatedly, I bet that script source > > isn't compressed because we only have one core on these phones. > > I don't think we can do that by worker. That would be nice to be able to do > "no source" like we have "use strict". > nbp, does that look feasible/stupid ? This question was already asked in Bug 941786. The current state is that we are keeping a (not well) compressed version of the source on the runtime to parse the functions on-demand, so as long as we do not execute a function, we keep a hunk to remember its properties such as the captured variables and its inner functions. Knowing that the bytecode is larger than the source, keeping the source in addition to these LazyScript should be smaller than keeping the bytecode around without keeping the source, as long as not all functions are used permanently. Lately, the JS Engine is doing a relazification, which evict the script-data of functions which are no-longer used. I think the "no source" annotation makes sense if all the code is constantly active, in which case most of the script-data is kept in addition to the script-source. I most cases many functions are unused most of the time, and keeping the LazyScript makes more sense than removing the source. Otherwise, this will substitute the script-source and replace it by even more script-data in the memory report. In Bug 944659, which suggests to remove the source even if we have LazyScripts at the cost of loading the source again, but this might be critical in terms of performance as mentioned Luke (Bug 944659 comment 59). (In reply to Nicholas Nethercote [:njn] from comment #17) > │ ├──2,954,804 B (07.51%) -- worker(resource://gre/modules/ril_worker.js, > 0x436c9c00) > │ │ ├──1,677,088 B (04.26%) -- runtime > │ │ ├────… > │ │ ├────428,080 B (01.09%) ++ zone(0x446c5000) > │ │ ├────194,292 B (00.49%) ++ zone(0x446c5800) We have bug opened to either merge the self-hosted zone with atoms zone (see Bug 921213 comment 1) or share the atoms compartment with the main thread. > And I've been wondering for a while whether our atoms-tables could be made > smaller. I've never quite worked out exactly which strings get atomized and > why. I am not sure if Bug 921176 is related to the atoms table, but the problem sounds similar.

Flags: needinfo?(nicolas.b.pierron)

David Teller [:Yoric] - still alive but not very active

Comment 26

•

11 years ago

The use of require() might be to blame for some of the memory usage. This function works as follows: - load the source code of the module with a XHR; - do trivial rewriting on that source code to enforce modularization; - create object URL on the rewritten source code; - importScripts() on said object URL; - revoke object URL. This is certainly inefficient in terms of source caching. Once bug 568953 has landed for good, we should be able to either rewrite require() using ES6 modules or move away from require() altogether.

David Teller [:Yoric] - still alive but not very active

Updated

•

11 years ago

Whiteboard: [MemShrink:P1][tarako][~5MB] → [MemShrink:P1][tarako][~5MB][Async:blocker]

Andreas Gal :gal

Comment 27

•

11 years ago

I removed require.js. It minimally reduces script sources (10%).

Andreas Gal :gal

Comment 28

•

11 years ago

Why are we not sharing the script sources cache between the two workers?

Till Schneidereit [:till]

Comment 29

•

11 years ago

(In reply to Nicholas Nethercote [:njn] from comment #17) > The obvious thing: do we really need two workers? Because a ton of that > stuff is entirely duplicated. The script-sources are identical, the > script-data is identical, the atoms-tables are probably identical or almost > identical, etc, etc. In the long run, bug 941818 should enable us to share much more of this. We can only share script-sources and bytecode within a runtime, and have one atoms compartment per runtime, so not creating a runtime per worker would get our memory usage down by a lot, I bet. > > Also, I've lost track of the various source/parsing optimizations we have > going now... can we say "don't bother storing the source for this worker, I > promise I won't call toSource() on it"? We can't right now. As nbp says, whether it's a good idea depends on how much lazy parsing and relazification (bug 886193) happens in such a worker: if most of the scripts are going to be run infrequently, not saving the source will cost us memory because it disables relazification and thus collection of JSScripts and their bytecode - which usually is a lot larger than the source. > Relatedly, I bet that script source > isn't compressed because we only have one core on these phones. We should really switch to LZ4 for this compression (bug 837110), at least for the single-core case. While we now have LZ4 in MFBT, I abandoned that project last time around because LZ4 doesn't yet support interruptible compression. However, we could use it right now by compressing JS source in independent chunks - which has the advantage of allowing us to decompress individual chunks instead of the whole source for a file each time a function's source is requested. This is quite a bit of work, however, because we need to teach the parser to deal with the chunks, first. (In reply to Nicholas Nethercote [:njn] from comment #21) > Bytecode is either in "script-data" (in the zone) or > "scripts/malloc-heap/data" (in the compartments)... I don't recall how that > split works now. Till would know. It's script-data in the runtime: all bytecode is shared per-runtime, so that identical functions from different globals (say, dozens of jquery usages) share the same bytecode. > So maybe the lazy-loading isn't working, or isn't enabled in that > repository, or maybe something is causing it to be loaded? (In reply to Andreas Gal :gal from comment #24) > pref("jsloader.reuseGlobal", true); so we are supplying NO_SOURCE here The other option, LAZY_SOURCE, wouldn't help us here, either: LAZY_SOURCE means that we don't do lazy parsing or relazification, but that the devtools *might* get to the source, and toSource *might* work. A source hook is invoked, but it's not guaranteed to provide the source: if it doesn't, "[no source]" is returned as the function's body. The only ways to enable lazy functions here would be to change this to SAVE_SOURCE and eat the source memory usage, or use LOCAL_SOURCE after the patches in bug 944659 have landed. Whether that's a good idea remains to be seen, however: the jank caused by retrieving the sources from, maybe on-disk, archives might be too much. nbp already mentioned bug 944659 comment 59.

Status: REOPENED → NEW

Till Schneidereit [:till]

Comment 30

•

11 years ago

(In reply to Andreas Gal :gal from comment #28) > Why are we not sharing the script sources cache between the two workers? Because it's not trivial. Ideally, we would share this kind of read-only data between processes, even. Especially on b2g, but also with e10s, that would probably save a lot of memory. Last May, I proposed a scheme for doing just that in bug 876173. It's certainly even less trivial, so doing more sharing across workers might be a good intermediate step. Or doing bug 941818.

Andreas Gal :gal

Comment 31

•

11 years ago

Can you please talk to naveed to give this priority. Clearly this is very important for us so we can continue to use JS inside gecko for FFOS.

Till Schneidereit [:till]

Comment 32

•

11 years ago

Naveed, some of the tasks mentioned here need prioritization. Specifically, and in no particular order, the following would probably help quite a bit: - lz4 compression of JS source on single-core systems (bug 837110) - lazy-parsing for self-hosted JS (bug 900292) - don't store source for local scripts, but still lazily parse them (bug 944659) - sharing read-only data between workers or even processes (no bug for the first, bug 876173 for the second) - Implement DOM workers using JS helper threads (bug 941818) Theoretically, I could work on the first two, but that'd mean neglecting Shumway stuff. The third is almost ready, but might not work out performance-wise. The last two are large projects, and probably things someone from the e10s team would be best suited for.

Flags: needinfo?(nihsanullah)

Nicholas Nethercote [inactive]

Comment 33

•

11 years ago

Thanks for the summary, Till. Fabrice also mentioned bug 960894 which will merge N RIL workers into one; that bug appears to be making good progress and so is likely to be the first big win here.

Ben Turner (not reading bugmail, use the needinfo flag!)

Comment 34

•

11 years ago

Also, you are all wonderful people. Thanks for looking into this.

Joe Cheng [:jcheng] (please needinfo)

Comment 35

•

11 years ago

remove [tarako] whiteboard for tarako we will track bug 960894

Whiteboard: [MemShrink:P1][tarako][~5MB][Async:blocker] → [MemShrink:P1][~5MB][Async:blocker]

Nicholas Nethercote [inactive]

Comment 36

•

11 years ago

I think this is basically covering the same ground as bug 916021. Since that bug has more dependencies, I'm going to mark this as a duplicate of that one.

Nicholas Nethercote [inactive]

Updated

•

11 years ago

Status: NEW → RESOLVED

Closed: 12 years ago → 11 years ago

No longer depends on: 916021, 941783

Resolution: --- → DUPLICATE

Nicholas Nethercote [inactive]

Comment 38

•

11 years ago

> I think this is basically covering the same ground as bug 916021. Since that > bug has more dependencies, I'm going to mark this as a duplicate of that one. Actually, this bug is more general, since JSRuntime is only one part of a worker's memory consumption. So I'll reverse the direction of the duplication shortly, assuming nobody objects. Sorry for the bugspam.

Nicholas Nethercote [inactive]

Updated

•

11 years ago

Blocks: 941783

Status: RESOLVED → REOPENED

Depends on: 876173, 900292, 921176, 921213, 941786, 941804, 941805, 944659, 964057, 964059, 886193, 921140, 921224

Resolution: DUPLICATE → ---

Mike Lee [:mlee]

Updated

•

11 years ago

See Also: → https://bugzilla.mozilla.org/show_bug.cgi?id=962585

Andrew McCreight [:mccr8]

Updated

•

11 years ago

Depends on: 970177

Nicholas Nethercote [inactive]

Comment 40

•

11 years ago

For those who aren't CC'd to the blockers, bhackett's been making some great progress on worker runtime size in bug 964057 and bug 964059. See bug 964059 comment 13 for some numbers.

Nicholas Nethercote [inactive]

Comment 41

•

10 years ago

This seems like less of a problem now, and lots of the blocking bugs have been fixed... downgrading to MemShrink:P2. Please comment if you disagree.

Whiteboard: [MemShrink:P1][~5MB][Async:blocker] → [MemShrink:P2][~5MB][Async:blocker]

Till Schneidereit [:till]

Updated

•

9 years ago

Flags: needinfo?(nihsanullah)

Matthew Gaudet (he/him) [:mgaudet]

Updated

•

3 years ago

Assignee: nihsanullah → nobody

BMO Automation

Updated

•

2 years ago

Severity: normal → S3