Closed
Bug 1384236
Opened 7 years ago
Closed 6 years ago
Consider caching l10n resources differently in L10nRegistry
Categories
(Core :: Internationalization, enhancement, P2)
Core
Internationalization
Tracking
()
RESOLVED
FIXED
mozilla63
Tracking | Status | |
---|---|---|
firefox63 | --- | fixed |
People
(Reporter: zbraniecki, Assigned: zbraniecki)
References
(Blocks 1 open bug)
Details
Attachments
(1 file)
As per bug 1333980 comment 29:
Initially, L10nRegistry caches whole messagecontext based on the combination of sources and resourceIDs.
That means that opening two windows with browser.xul localized using Fluent will result in the same MessageContext being shared between them.
It may make sense to consider caching the parsed AST of the FTL file instead.
That would mean that if any other Localization context wants to use the same file (say "brand.ftl") they wouldn't need to parse it.
The cost is that we'd be creating a new MessageContext for each new browser.xul, but we wouldn't be parsing FTL for it, just reuse the cache.
It's something I'd like to decide on before we start using Fluent in Firefox but not block the landing of L10nRegistry, hence this bug.
Assignee | ||
Updated•7 years ago
|
Updated•7 years ago
|
Priority: -- → P3
Assignee | ||
Comment 1•7 years ago
|
||
While working on L10nRegistry, :mossop and I discussed different levels of caching in L10nRegistry.
At the moment we have two:
1) We cache files as they're loaded.
So, if a file "brand.ftl" is loaded from the same source for the same locale multiple times, starting from the second attempt it'll just return a cached resolved promise.
2) We cache resolved contexts.
This, in result means that, for example, if we open a new window, the browser.xul gets a cached context rather than a new one (since the first browser.xul generated the context already).
This dual-caching is a bit unfortunate because it means that:
a) we cache the same bits twice - once in a raw string (1), and then parsed in context (2).
b) we return the same context between different documents. I'm not sure if it bears any security consequences, but I believe it to be pretty unexpected since contexts are stateful.
From the perspective of the flow, it seems that a more "natural" fit would be to cache parsed FTL files and construct a new context for each new call.
This would mean that we cache things once, we avoid re-parsing (fix (a)), and we create new stateful context each time (fix (b)).
As far as I understand, we originally decided to try other approaches to keep the MessageContext API clean and avoid having to standardize AST of the parsed FTL as part of the API.
I now believe that it would be valuable to expose two additional methods (maybe for Gecko+L10nRegistry use only?):
1) A method to parse FTL file
2) A sister method to `MessageContext.addMessages` that accepts the parsed data structure rather than a string
:mossop, :stas, :pike - Thoughts?
Flags: needinfo?(stas)
Flags: needinfo?(l10n)
Flags: needinfo?(dtownsend)
Comment 2•7 years ago
|
||
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #1)
> 1) We cache files as they're loaded.
> 2) We cache resolved contexts.
I don't think this is necessarily a bad design. It separates concerns: #1 saves the IO time and #2 saves the CPU/parsing time. These two caching mechanisms serve different purposes and as such they don't struck me as redundant.
> b) we return the same context between different documents. I'm not sure if
> it bears any security consequences, but I believe it to be pretty unexpected
> since contexts are stateful.
If we're worried about MessageContext's mutability, we could expose a freeze() or a seal() method which forbids any more calls to addMessages(). Being able to re-use the same MessageContext instance is a feature of its design. If there's a problem with that design, let's fix it.
> As far as I understand, we originally decided to try other approaches to
> keep the MessageContext API clean and avoid having to standardize AST of the
> parsed FTL as part of the API.
> I now believe that it would be valuable to expose two additional methods
> (maybe for Gecko+L10nRegistry use only?):
>
> 1) A method to parse FTL file
> 2) A sister method to `MessageContext.addMessages` that accepts the parsed
> data structure rather than a string
I'm opposed to this idea. I don't want the API to expose the internal representation of parsed messages to the outside world. That already happens with MessageContext.getMessage and I'm not happy with it.
If we still do something like this, I would ask that we explicitly not standardize the internal implementation and make it implementation dependent.
Flags: needinfo?(stas)
Comment 4•7 years ago
|
||
(In reply to Staś Małolepszy :stas from comment #2)
> (In reply to Zibi Braniecki [:gandalf][:zibi] from comment #1)
>
> > 1) We cache files as they're loaded.
> > 2) We cache resolved contexts.
>
> I don't think this is necessarily a bad design. It separates concerns: #1
> saves the IO time and #2 saves the CPU/parsing time. These two caching
> mechanisms serve different purposes and as such they don't struck me as
> redundant.
It doubles the memory used by locale data though doesn't it?
> > b) we return the same context between different documents. I'm not sure if
> > it bears any security consequences, but I believe it to be pretty unexpected
> > since contexts are stateful.
>
> If we're worried about MessageContext's mutability, we could expose a
> freeze() or a seal() method which forbids any more calls to addMessages().
> Being able to re-use the same MessageContext instance is a feature of its
> design. If there's a problem with that design, let's fix it.
>
> > As far as I understand, we originally decided to try other approaches to
> > keep the MessageContext API clean and avoid having to standardize AST of the
> > parsed FTL as part of the API.
> > I now believe that it would be valuable to expose two additional methods
> > (maybe for Gecko+L10nRegistry use only?):
> >
> > 1) A method to parse FTL file
> > 2) A sister method to `MessageContext.addMessages` that accepts the parsed
> > data structure rather than a string
>
> I'm opposed to this idea. I don't want the API to expose the internal
> representation of parsed messages to the outside world. That already happens
> with MessageContext.getMessage and I'm not happy with it.
>
> If we still do something like this, I would ask that we explicitly not
> standardize the internal implementation and make it implementation dependent.
I would prefer this too.
Flags: needinfo?(dtownsend)
Assignee | ||
Comment 5•7 years ago
|
||
Would you accept that as an internal performance optimization for Gecko?
Flags: needinfo?(stas)
Comment 6•7 years ago
|
||
(In reply to Dave Townsend [:mossop] from comment #4)
> It doubles the memory used by locale data though doesn't it?
It does. But a solution to this may be for instance to drop the first type of caching :)
MessageContext already is a thin-wrapper around its parsed messages. The difference between it and what this bug is proposing is that MessageContext is a wrapper around multiple FTL files at once.
A quick grep through mozilla-central suggests that it's not very common to have more than one DTD file in a XUL document. It's mostly brand.dtd (for obvious reasons) and global.dtd (I guess to get the direction of the current locale) that get included the most.
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #5)
> Would you accept that as an internal performance optimization for Gecko?
I think it would be helpful to know which problem we're trying to solve before we commit to optimizing performance. My preferred route would be to not change anything for now and measure performance without any caching, with #1 enabled, with #2 enabled, and with both enabled. It may turn out that the simplest optimization available to us is to only IO-cache brand.dtd and go with type-2 caching for everything else.
If the numbers show that we can save milliseconds by doing what this bug proposes, here's another idea. We could add a static MessageContext.from(...contexts) method which takes a variable number of other MessageContexts and merges them into a new one. It throws if the source contexts are not in the same locale. We could then cache MessageContexts per localization file and dynamically create MessageContexts for XUL/HTML files from them and avoid re-parsing. This would make use re-create Intl objects for every MessageContext though.
Flags: needinfo?(stas)
Assignee | ||
Comment 7•7 years ago
|
||
> My preferred route would be to not change anything for now and measure performance without any caching, with #1 enabled, with #2 enabled, and with both enabled.
Sounds good. I don't expect to get to perf measuring before Fx57 branches off, but we then have over a month to iron this out.
> If the numbers show that we can save milliseconds by doing what this bug proposes, here's another idea. We could add a static MessageContext.from(...contexts) method which takes a variable number of other MessageContexts and merges them into a new one. It throws if the source contexts are not in the same locale. We could then cache MessageContexts per localization file and dynamically create MessageContexts for XUL/HTML files from them and avoid re-parsing. This would make use re-create Intl objects for every MessageContext though.
I like it. My gut feeling is that we should even be able to make the Intl objects a non-issue if we don't instantiate them prematurely.
Then L10nRegistry would create a MessageContext per FTL file and cache those, and build document's MessageContexts out of those. As long as the memory overhead is close to zero, I think it may be a decent solution.
Assignee | ||
Updated•6 years ago
|
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Assignee | ||
Comment 10•6 years ago
|
||
Based on All Hands discussion with the stakeholders, we decided to introduce `FluentResource` and `FluentResource.fromString`. This patch makes use of that to limit parsing of resources and cache each resource separately.
Comment 11•6 years ago
|
||
mozreview-review |
Comment on attachment 8991203 [details]
Bug 1384236 - Cache l10n resources differently in L10nRegistry.
https://reviewboard.mozilla.org/r/256172/#review263126
LGTM. Is `intl/l10n/fluent.js.patch` still maintained? does that change as part of this patch?
Attachment #8991203 -
Flags: review?(l10n) → review+
Assignee | ||
Comment 12•6 years ago
|
||
Thank you for the review Axel!
Unfortunately, there's a perf regression with this patch: https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=2b16262842dc&newProject=try&newRevision=8a55e7949f65&framework=1
Will work on it now.
Assignee | ||
Comment 13•6 years ago
|
||
Ok, as expected, the regression is caused by lack of shared intl memoizer. Stas - are you ok with adding an optional shared memoizer to MessageContext constructor?
Flags: needinfo?(stas)
Assignee | ||
Comment 14•6 years ago
|
||
And there was a bug in our caching logic in FileSource. I'm going to separate out the three projects:
- this bug will be only about caching using FluentResource
- a separate bug for using shared intl memoizer
- bug 1464156 about changing the behavior for partial contexts
This will make it easier to track regressions both in perf and memory.
Flags: needinfo?(stas)
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Assignee | ||
Comment 17•6 years ago
|
||
Got a green try and talos: https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=2b16262842dc&newProject=try&newRevision=ebd39258e147&framework=1
Pike, setting ni? to give you a chance to take another look since I removed some changes from this patch.
I'll post separate patches in the respective bugs.
Flags: needinfo?(l10n)
Assignee | ||
Comment 19•6 years ago
|
||
We used to have:
```
if (this.cache[fullPath] === false) {
return false;
}
```
which meant that we returned cache *only* if it was a currently-resolving promise.
I now replaced it with:
```
if (this.cache[fullPath] !== true) {
return this.cache[fullPath];
}
```
we now return cache for all values of cache except of `true` (because `true` means that IndexedFileSource listed it, but we don't have a value for it yet) - so for two cases a) cache is a Promise currently being resolved, and b) cache is an already resolved value.
Thank you!
Comment 20•6 years ago
|
||
Pushed by zbraniecki@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/24a8cf2131d1
Cache l10n resources differently in L10nRegistry. r=Pike
Comment 21•6 years ago
|
||
Backed out changeset 24a8cf2131d1 (bug 1384236) for failing at extensions/test/xpcshell/test_webextension_langpack.js on a CLOSED TREE
Backout link: https://hg.mozilla.org/integration/autoland/rev/d40742f21b92f26d50e9e2c8ee213d886fe81cb7
Push with failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=24a8cf2131d1964a65b96c770728f03467818dce
Log link: https://treeherder.mozilla.org/logviewer.html#?job_id=188074176&repo=autoland&lineNumber=1876
Log snippet:
[task 2018-07-13T16:53:32.580Z] 16:53:32 INFO - TEST-START | xpcshell.ini:toolkit/mozapps/extensions/test/xpcshell/test_webextension_langpack.js
[task 2018-07-13T16:53:35.655Z] 16:53:35 WARNING - TEST-UNEXPECTED-FAIL | xpcshell.ini:toolkit/mozapps/extensions/test/xpcshell/test_webextension_langpack.js | xpcshell return code: 0
Flags: needinfo?(gandalf)
Comment hidden (mozreview-request) |
Assignee | ||
Comment 23•6 years ago
|
||
The reason langpack test failed is because it tested for skipping an incomplete context. In such case the current code returns `null`. L10nRegistry tests tested for the same scenario but didn't verify that a MessageContext was returned.
I fixed the code and added more specificity to the test.
Flags: needinfo?(gandalf)
Comment 24•6 years ago
|
||
Pushed by zbraniecki@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/cf60869ce62a
Cache l10n resources differently in L10nRegistry. r=Pike
Comment 25•6 years ago
|
||
bugherder |
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
status-firefox63:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla63
You need to log in
before you can comment on or make changes to this bug.
Description
•