Memory leak and high CPU when opening prefs/options/settings or on startup, due to language packs matching build locales with older translations that lack some strings causing excessive locale loading
Categories
(Core :: Internationalization, defect, P2)
Tracking
()
People
(Reporter: fayolle-florent, Assigned: dminor)
References
(Blocks 1 open bug)
Details
(Keywords: nightly-community, perf:resource-use, Whiteboard: [MemShrink])
Attachments
(16 files, 2 obsolete files)
(deleted),
image/png
|
Details | |
(deleted),
application/json
|
Details | |
(deleted),
image/png
|
Details | |
(deleted),
image/png
|
Details | |
(deleted),
image/png
|
Details | |
(deleted),
application/x-xpinstall
|
Details | |
(deleted),
application/x-xpinstall
|
Details | |
(deleted),
application/x-javascript
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details |
With my own profile, after start Firefox nightly, it starts hanging about 10s after. I have ~10 pinned tabs and a tab showing about:home.
Interestingly, it seems that the issue occurs only with French builds, not English ones.
I bissected manually and discovered the regression appeared between 2020-05-29-21-46-31 and and 2020-05-30-09-46-43.
I'd be glad to help diagnosing the issue, but I am not sure where to start:
- what in my profile cause the issue?
- why not with Engligh builds?
At your disposal for more information.
Florent
Comment 1•4 years ago
|
||
Bugbug thinks this bug should belong to this component, but please revert this change in case of error.
Updated•4 years ago
|
Updated•4 years ago
|
Comment 2•4 years ago
|
||
Could that be related to bug 1636067 ?
Comment 3•4 years ago
|
||
It'd take more to investigate.
On the surface it doesn't seem to, because the report is about about:home
, and contains a regression window much newer than bug 1636067.
On the other hand, since we don't really understand the nature of bug 1636067 it could be that it is triggered by some cofactors between changes to DOM, cache, extensions, and localization, and the regression window reported is a red herring.
or maybe it is a different bug. I think it's too early to tell.
Comment 4•4 years ago
|
||
Can you record a performance profile? I can't really tell from comment #0 if the browser starts out usable and then hangs, or if it hangs on startup. The instructions in https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Reporting_a_Performance_Problem should work if it's usable initially and then hangs for a bit; otherwise,t he steps at https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Profiling_with_the_Built-in_Profiler#Profiling_Firefox_Startup would help to profile startup itself (for "the add-on", read "the builtin profiler", see instructions for enabling that on the first page)
The pushlog for the window described in comment #0 is https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=a58cc68b0c51645d34cbc7f8f8a30edae77ab396&tochange=8aaca63ec5c6dd73365ba31d1972771cb847d5bc . I don't see anything super obvious there, but equally we don't really know what area the problem is in right now...
Reporter | ||
Comment 5•4 years ago
|
||
Can you record a performance profile?
Yes! Hopefully I could!
Here it is: https://share.firefox.dev/2U3FXDU
Florent
Comment 6•4 years ago
|
||
Great! Yes, the profile looks related to fluent like the other issues...
Are there any errors in the browser console (not the regular devtools, the browser one, see the specific entry in the web developer menu) about missing strings or anything else mentioning "fluent"? You may need to toggle the errors/warnings/logs buttons at the top of the browser console to see them.
Comment 7•4 years ago
|
||
Oh, and one more question - in about:config, if you filter for disable_xul
, does anything show up?
Reporter | ||
Comment 8•4 years ago
|
||
Are there any errors in the browser console (not the regular devtools, the browser one, see the specific entry in the web developer menu) about missing strings or anything else mentioning "fluent"? You may need to toggle the errors/warnings/logs buttons at the top of the browser console to see them.
Yes, there are! See the attached screenshot.
Oh, and one more question - in about:config, if you filter for disable_xul, does anything show up?
Nope :)
Cheers!
Reporter | ||
Comment 9•4 years ago
|
||
Updated•4 years ago
|
Updated•4 years ago
|
Comment 10•4 years ago
|
||
Could you please post your about:support text?
Updated•4 years ago
|
Comment 11•4 years ago
|
||
Am I reading the screenshot correctly that there are 999 (likely more than 1k?) warnings for missing translations?
That doesn't make a lot of sense, given that French is missing maybe 20 strings in that timeframe.
Is the problem still there with the latest Nightly?
Comment 12•4 years ago
|
||
The error message seems to indicate that the missing string is for tab-context-undo-close-tabs (the 999 ocurrences) but it also says lower in the log that it is also missing in en-US.
Reporter | ||
Comment 13•4 years ago
|
||
It is very difficult to have Firefox functionning. I'll do my best though.
Also worth mentioning (I discovered that lately, sorry): I don't have this slowness when running Firefox in safe-mode. And I wonder if that extension couldn't be the cause of my issues: https://github.com/eoger/tabcenter-redux/
I'll take you inform and try posting about:support
. May it be OK if I collect the info of this page in safe mode?
Cheers!
Comment 14•4 years ago
|
||
(In reply to Florent Fayolle from comment #14)
It is very difficult to have Firefox functionning. I'll do my best though.
Also worth mentioning (I discovered that lately, sorry): I don't have this slowness when running Firefox in safe-mode. And I wonder if that extension couldn't be the cause of my issues: https://github.com/eoger/tabcenter-redux/
It's possible but unlikely; safe mode has other consequences that seem to influence whether this issue occurs.
I'll take you inform and try posting
about:support
. May it be OK if I collect the info of this page in safe mode?
Sure.
(In reply to Francesco Lodolo [:flod] from comment #12)
Am I reading the screenshot correctly that there are 999 (likely more than 1k?) warnings for missing translations?
That doesn't make a lot of sense, given that French is missing maybe 20 strings in that timeframe.
Is the problem still there with the latest Nightly?
The issue seems to be that the code complaining about missing strings just runs repeatedly. That's also why things are slow. The problem is that we don't really understand why this code is running repeatedly. See the fx-desktop-dev matrix discussion between Markus and Zibi over the last 12 hours or so.
Reporter | ||
Comment 15•4 years ago
|
||
The about:support as requested.
It's possible but unlikely; safe mode has other consequences that seem to influence whether this issue occurs.
You are right. I uninstalled my addons (in my copied profile) and nothing changed.
Updated•4 years ago
|
Reporter | ||
Comment 16•4 years ago
|
||
Good news, it seems that I don't encounter the issue anymore!
You may resolve as FIXED if you consider you don't need more information about what happened.
Comment 17•4 years ago
|
||
(In reply to Florent Fayolle from comment #17)
Good news, it seems that I don't encounter the issue anymore!
I'm happy for you, but this issue has come and then gone for other people who have encountered it, too... (cf. https://bugzilla.mozilla.org/show_bug.cgi?id=1636067#c45 )
So unclear if this is really fixed. Is it possible for you to try if you can still reproduce with a slightly older version of nightly, and if so, perhaps use mozregression --find-fix
(pointing it to your profile) to figure out when this got fixed?
Reporter | ||
Comment 18•4 years ago
|
||
(In reply to :Gijs (he/him) from comment #18)
(In reply to Florent Fayolle from comment #17)
Good news, it seems that I don't encounter the issue anymore!
I'm happy for you, but this issue has come and then gone for other people who have encountered it, too... (cf. https://bugzilla.mozilla.org/show_bug.cgi?id=1636067#c45 )
So unclear if this is really fixed. Is it possible for you to try if you can still reproduce with a slightly older version of nightly, and if so, perhaps use
mozregression --find-fix
(pointing it to your profile) to figure out when this got fixed?
Unfortunately, we miss locales with mozregression (bug1401649). Are you sure it could be a good tool to investigate with?
However, if you look at the browser console, you see lots of warning related to the tab-context-undo-close-tabs
l10n key. That key has been introduced here: https://hg.mozilla.org/mozilla-central/rev/3d607fcf4deef85cd48f6cd7d0507b62244a0e5c
So I guess the bug is solved only in appearance, because the French translation for this key has been probably integrated (still this is only an hypothesis).
The "good" news is that I still have the issue when I go to about:preferences
for the same reason (see the attached screenshot). So there is a room for more investigations :).
Reporter | ||
Comment 19•4 years ago
|
||
Comment 20•4 years ago
|
||
(In reply to Florent Fayolle from comment #19)
(In reply to :Gijs (he/him) from comment #18)
(In reply to Florent Fayolle from comment #17)
Good news, it seems that I don't encounter the issue anymore!
I'm happy for you, but this issue has come and then gone for other people who have encountered it, too... (cf. https://bugzilla.mozilla.org/show_bug.cgi?id=1636067#c45 )
So unclear if this is really fixed. Is it possible for you to try if you can still reproduce with a slightly older version of nightly, and if so, perhaps use
mozregression --find-fix
(pointing it to your profile) to figure out when this got fixed?Unfortunately, we miss locales with mozregression (bug1401649). Are you sure it could be a good tool to investigate with?
I assumed you had the locale installed in the profile as a langpack, but I guess that was a mistake...
But yes, it would appear that providing the missing entity fixed things... that is interesting. The warnings are sort of strange in that the expectation is that there is 1 warning, not 1000 - the large number of extra ones indicate something went wrong, but it is not clear what. And in a clean profile, AIUI from the other bugs, the problem does not reproduce (for reasons we don't understand).
I don't suppose you are familiar with the devtools / debugger? The broken code is JS, so for the about:preferences case, it's possible it can be debugged with the browser toolbox, to find out what is going wrong...
Updated•4 years ago
|
Comment 21•4 years ago
|
||
According to comment 16, he's on Nightly, so no reliable language pack support there.
I tried installing the same French build on macOS as comment 16
http://archive.mozilla.org/pub/firefox/nightly/2020/06/2020-06-03-09-29-05-mozilla-central-l10n/
As I expected, I'm only getting one warning for tab-context-undo-close-tabs
, and one for sitedata-cookies-exceptions
when I open Preferences.
Same situation on Ubuntu with that build.
@Florent
Which version/distribution of Linux are you using?
Reporter | ||
Comment 22•4 years ago
|
||
@Florent
Which version/distribution of Linux are you using?
I am using Debian Sid. I attempt to do a dist-upgrade if, by chance, an update changes the issue.
I don't suppose you are familiar with the devtools / debugger? The broken code is JS, so for the about:preferences case, it's possible it can be debugged with the browser toolbox, to find out what is going wrong...
I am quite familiar with the devtools as I use them nearly every day. I also used to contribute to the Devtools and Firebug (though it was years ago), so I am also a bit familiar with Browser Debugger.
But I don't understand why I cannot make the breakpoints halt on the functions the profiler highlights (especially generateBundles
and generateResourceSetsForLocale
). There might be a bad recursion in generateResourceSetsForLocale
, though I am far from being confident about this hypothesis.
For what it's worth, another performance profile (the preference tab is open at 6 seconds) : https://share.firefox.dev/377OuL8
I will try the dist-upgrade and if it doesn't work anyway, I may ask for some help debugging. I am on Riot in the nightly channel, if I may ping you for that.
Comment 23•4 years ago
|
||
(In reply to Florent Fayolle from comment #23)
I am quite familiar with the devtools as I use them nearly every day. I also used to contribute to the Devtools and Firebug (though it was years ago), so I am also a bit familiar with Browser Debugger.
But I don't understand why I cannot make the breakpoints halt on the functions the profiler highlights (especially
generateBundles
andgenerateResourceSetsForLocale
). There might be a bad recursion ingenerateResourceSetsForLocale
, though I am far from being confident about this hypothesis.
Hm, odd, setting breakpoints on lines inside generateBundles
or generateResourceSetsForLocale
works for me, if I do it prior to opening about:preferences . It's possible something's broken and it may work if you quit the debugger, then restart the browser, and then open the debugger and set the breakpoints, then open about:preferences ? Also make sure you get the "right" generateBundles - you want the L10nRegistry.jsm one.
Some recursion is expected, though the code isn't the easiest to follow...
Reporter | ||
Comment 24•4 years ago
|
||
I'll make a screencast of what I do with the Browser Debugger if that may help. I may miss something important…
Also worth to mention that the breakpoint is hit when I open a new tab.
Comment 25•4 years ago
|
||
One more question, since I realize about:support doesn't list language packs.
If you open about:addons, do you have a Languages pane on the left? Are there any languages installed?
Reporter | ||
Comment 26•4 years ago
|
||
If you open about:addons, do you have a Languages pane on the left? Are there any languages installed?
Yes. French and American ones.
Reporter | ||
Comment 27•4 years ago
|
||
Reporter | ||
Comment 28•4 years ago
|
||
Comment 29•4 years ago
|
||
(In reply to Florent Fayolle from comment #27)
If you open about:addons, do you have a Languages pane on the left? Are there any languages installed?
Yes. French and American ones.
OK, that shouldn't happen.
- Do you remember installing them?
- How did you install Nightly on your system?
We had a bug recently that caused 77.* language packs to be compatible with Nightly, but it's fixed now, so those should be marked as incompatible.
We also updated about:support to include information about language packs, could you update your Nightly, then copy and paste the raw data from about:support to this bug?
Reporter | ||
Comment 30•4 years ago
|
||
OK, that shouldn't happen.
- Do you remember installing them?
- How did you install Nightly on your system?
I reset my firefox profile around April, so I guess I should have installed the language pack during that time.
I probably have installed it through this page : https://addons.mozilla.org/fr/firefox/language-tools/
Through the contextual menu ("Languages ⇒ Add dictionnary").
We also updated about:support to include information about language packs, could you update your Nightly, then copy and paste the raw data from about:support to this bug?
Ooops, I should have read your request before. I may be able to restore my Firefox profile with the language pack, hang on!
Florent
Comment 31•4 years ago
|
||
I think I might have found a way to replicate this, it would be great if someone else can try.
I'm using a recent Italian Nightly build, but I would expect this to work on an English build as well (I'm on 20200629154604, 80.0a1), given the language pack should override it.
-
Enable unsigned language pack by setting
extensions.langpacks.signatures.required
to false, and enable UI language switcher setting tointl.multilingual.enabled
true. -
Install the two language packs that I'll attach to the bug (en-US and fr). They're old language packs (e.g. for English) that I've unpacked, removed signatures, set compatibility to 80.a1, and added an empty featureGates file to avoid localization falling back for lack of a FTL file in the context.
-
Set French as first requested locale, and English as second locale. Restart the browser. In my case I had to force quit the browser.
Comment 32•4 years ago
|
||
Comment 33•4 years ago
|
||
Comment 34•4 years ago
|
||
NI to see if you can reproduce. In case, I guess I can also share the whole profile, since it doesn't contain any personal info.
Updated•4 years ago
|
Comment 36•4 years ago
|
||
I can reproduce. Investigating.
Comment 37•4 years ago
|
||
OK, my first confusion is that I'm seeing multiple calls to generateBundles
from the window being opened. I added logging in https://searchfox.org/mozilla-central/rev/31d8600b73dc85b4cdbabf45ac3f1a9c11700d8e/intl/l10n/Localization.jsm#462 and it shows:
Creating cached bundles instance for branding/brand.ftl, browser/branding/sync-brand.ftl, browser/branding/brandings.ftl, toolkit/global/textActions.ftl, browser/browser.ftl, browser/browserContext.ftl, browser/browserSets.ftl, browser/menubar.ftl, browser/protectionsPanel.ftl, browser/appmenu.ftl, preview/interventions.ftl, browser/sidebarMenu.ftl, browser/allTabsMenu.ftl, browser/places.ftl, sync, not-eager
4 times on startup (there's only 1 browser window).
All of them are directly from C++, with stacks:
> xul.dll!mozilla::intl::Localization::Activate(const bool aEager) Line 99 C++
xul.dll!mozilla::dom::HTMLSharedElement::DoneAddingChildren(bool aHaveNotified) Line 64 C++
[Inline Frame] xul.dll!mozilla::dom::PrototypeDocumentContentSink::CloseElement(mozilla::dom::Element * aElement) Line 400 C++
xul.dll!mozilla::dom::PrototypeDocumentContentSink::ResumeWalkInternal() Line 450 C++
[Inline Frame] xul.dll!mozilla::dom::PrototypeDocumentContentSink::ResumeWalk() Line 405 C++
xul.dll!mozilla::dom::PrototypeDocumentContentSink::OnPrototypeLoadDone(nsXULPrototypeDocument * aPrototype) Line 258 C++
[Inline Frame] xul.dll!std::_Func_class<void>::_Empty() Line 1283 C++
[Inline Frame] xul.dll!std::_Func_class<void>::operator()() Line 421 C++
xul.dll!nsXULPrototypeDocument::NotifyLoadDone() Line 419 C++
> xul.dll!mozilla::intl::Localization::OnChange() Line 189 C++
xul.dll!mozilla::intl::Localization::AddResourceId(const nsTSubstring<char16_t> & aResourceId) Line 201 C++
xul.dll!mozilla::dom::Document::LocalizationLinkAdded(mozilla::dom::Element * aLinkElement) Line 0 C++
xul.dll!mozilla::dom::HTMLLinkElement::BindToTree(mozilla::dom::BindContext & aContext, nsINode & aParent) Line 134 C++
xul.dll!nsINode::InsertChildBefore(nsIContent * aKid, nsIContent * aChildToInsertBefore, bool aNotify) Line 1519 C++
[Inline Frame] xul.dll!nsINode::AppendChildTo(nsIContent * aKid, bool aNotify) Line 855 C++
xul.dll!mozilla::dom::PrototypeDocumentContentSink::ResumeWalkInternal() Line 493 C++
[Inline Frame] xul.dll!mozilla::dom::PrototypeDocumentContentSink::ResumeWalk() Line 405 C++
xul.dll!mozilla::dom::PrototypeDocumentContentSink::OnPrototypeLoadDone(nsXULPrototypeDocument * aPrototype) Line 258 C++
[Inline Frame] xul.dll!std::_Func_class<void>::_Empty() Line 1283 C++
[Inline Frame] xul.dll!std::_Func_class<void>::operator()() Line 421 C++
xul.dll!nsXULPrototypeDocument::NotifyLoadDone() Line 419 C++
> xul.dll!mozilla::intl::Localization::Activate(const bool aEager) Line 99 C++
[Inline Frame] xul.dll!mozilla::dom::PrototypeDocumentContentSink::CloseElement(mozilla::dom::Element * aElement) Line 400 C++
xul.dll!mozilla::dom::PrototypeDocumentContentSink::ResumeWalkInternal() Line 450 C++
[Inline Frame] xul.dll!mozilla::dom::PrototypeDocumentContentSink::ResumeWalk() Line 405 C++
xul.dll!mozilla::dom::PrototypeDocumentContentSink::OnPrototypeLoadDone(nsXULPrototypeDocument * aPrototype) Line 258 C++
[Inline Frame] xul.dll!std::_Func_class<void>::_Empty() Line 1283 C++
[Inline Frame] xul.dll!std::_Func_class<void>::operator()() Line 421 C++
xul.dll!nsXULPrototypeDocument::NotifyLoadDone() Line 419 C++
> xul.dll!mozilla::intl::Localization::Activate(const bool aEager) Line 102 C++
[Inline Frame] xul.dll!mozilla::dom::Document::OnL10nResourceContainerParsed() Line 3970 C++
xul.dll!mozilla::dom::Document::OnParsingCompleted() Line 3980 C++
xul.dll!mozilla::dom::PrototypeDocumentContentSink::MaybeDoneWalking() Line 593 C++
xul.dll!mozilla::dom::PrototypeDocumentContentSink::StyleSheetLoaded(mozilla::StyleSheet * aSheet, bool aWasDeferred, nsresult aStatus) Line 674 C++
xul.dll!mozilla::css::Loader::NotifyObservers(mozilla::css::SheetLoadData & aData, nsresult aStatus) Line 1546 C++
xul.dll!mozilla::SharedStyleSheetCache::LoadCompleted(mozilla::SharedStyleSheetCache * aCache, mozilla::css::SheetLoadData & aData, nsresult aStatus) Line 280 C++
Zibi, why is that? It seems... bad.
(Going to keep looking, but the steps from flod's comment mean I cannot use the JS debugger because the problem happens too early in startup, which is making this a bit tricky.)
Comment 38•4 years ago
|
||
Set French as first requested locale, and English as second locale. Restart the browser. In my case I had to force quit the browser.
In case it matters, I can reproduce without even doing this; after installing the locales, I could no longer open the prefs. So it's just trying to localize into English in my debugging.
The other (viz comment 38) issue here is, I think, to do with the combination of the approach we use to generate bundles, and the fact that the langpacks always come first (in terms of what file sources we use to look up ftl files and thus messages) but in this case are older ("more incomplete") than the app pack.
When we first enter formatWithFallbackSync
for the initial translation of browser.xhtml, we want some 400-odd strings, from 15 files. We go through 1025 (!) "bundles" until we find a combination that provides all the necessary strings. That (and creating/finding all those bundles) is what's slow. On a profile without the langpack, we manage in our first "bundle".
This is on top of the fact that both langpacks and the app resources get represented as "file sources" twice - once for browser/
and once for toolkit/
, and we apparently cannot pre-determine where to look for a given file, so "even" in the "good" case, we have some false starts finding this bundle in the generateResourceSetsForLocale
method: we first try to find everything in browser
, but toolkit/global/textActions.ftl
lives in the toolkit package, which means we then have to "backtrack" and try again after looking for that file in the toolkit package. It's not clear to me why we cannot associate files ("resource ids") to "types" of file sources. This would speed the process up by at least half, as far as I can tell.
Still, the fundamental issue seems to be about preferring langpacks and what happens if strings do not exist in those langpacks (and then we end up falling back to the app resource if available in the relevant locale). I don't really know how to fix this - although we know which messages are missing, we also don't know what file they're "supposed" to come from, so we cannot target the search so that we only try fallback bundles for the "relevant" messages.
We can fix the particular occurrence I'm seeing by down-prioritizing the langpacks if they are for the app locale and predate the app (based on dates / build ids). I don't know if that fixes the case from comment #32 where the preferred locale does not match the build locale.
Zibi, any ideas?
Comment 39•4 years ago
|
||
and we apparently cannot pre-determine where to look for a given file
We can. When working on L10nRegistry I was particularly concerned about the cost of using I/O to "try" for files all around, while we know whether the file is there or not at build time (of langpack or omni.ja).
In order to have this option available I introduced FileSource
and IndexedFileSource
. We don't use IndexedFileSource
but we could switch to it and provide a list of available file per source to speed up lookups.
That avenue was originally especially promising for "smaller" sources which override just several files and would have them whitelisted and allow for fast-bail for all others.
We can fix the particular occurrence I'm seeing by down-prioritizing the langpacks if they are for the app locale and predate the app (based on dates / build ids).
That seems to be bug 1420173. At the time when I was working on it, this has been deprioritized and I was asked to move to work on other things.
I don't know if that fixes the case from comment #32 where the preferred locale does not match the build locale.
I'm concerned that there's some remaining underlying bug. Even in the worst case scenario, as far as I understand, we should still not blow up the search to that magnitude. If your langpack has missing files, L10nRegistry should just add another fallback, not 1000 of them.
Zibi, why is that? It seems... bad.
It does, but it's not the same document. We load more windows during startup, so if you want to verify that, you'd need to dump document uri's. If all four were from the same document (browser.xhtml) then I'd be worried.
Comment 40•4 years ago
|
||
(In reply to Zibi Braniecki [:zbraniecki][:gandalf] from comment #40)
I don't know if that fixes the case from comment #32 where the preferred locale does not match the build locale.
I'm concerned that there's some remaining underlying bug. Even in the worst case scenario, as far as I understand, we should still not blow up the search to that magnitude. If your langpack has missing files, L10nRegistry should just add another fallback, not 1000 of them.
There aren't missing files, just missing strings, so all the bundles are valid, but formatWithFallbackSync
continues going through the bundle iterator (until it finds one with 0 missing translations). All the files exist in at least 2 filesources - the langpack and the app. browser.xhtml asks for 15 ftl files. So for each of those files, there are 2 options. So the total number of possible bundles is on the order of 2 to the power of 15. The fact that we find one that works after 2 to the power of 10 "already" presumably has to do with whether the messages missing from the langpack are in the first or the 15th file in the list. But I don't see any reason way in which we could "shortcut" this generation process, because the registry is oblivious to the messages required from each resource and thus can't guide its search for a "winning" bundle combination. Am I missing something?
Comment 41•4 years ago
|
||
That makes sense, sadly.
Here's an idea, and I might be able to hack something up.
It's perfectly fine to put all resources for one language into one bundle. It might actually be good. It used to be frowned upon, and we have comments everywhere that says we should error, but also, we don't have to.
What this does is it allows us to generate one bundle per language, and then do fallback IO on missing strings only. We re-yield the same bundle instance in that case.
That way, we can reduce the generator to only go through each source once, instead of counting to 100 by counting 1 a hundred times.
In this case, the loop would drop to 30 yields of one bundle.
That would also eliminate the memory problem, as we reiterate over the same bundle, and don't cache 2**15 different ones (which would explain the memory problem part)
Comment 42•4 years ago
|
||
It's OK to put all resources for one language into a single bundle.
If we fall back for missing strings, load all resources from the
next source, as we don't know which resource we need to fall back
for anyway.
Comment 43•4 years ago
|
||
Gonna punt this to Axel whose patch seems to fix the issue, and who's opened a discourse thread on the exact way forward here.
Updated•4 years ago
|
Updated•4 years ago
|
Updated•4 years ago
|
Updated•4 years ago
|
Comment 47•4 years ago
|
||
It's perfectly fine to put all resources for one language into one bundle.
I agree with Axel that putting all locales from one language into one bundle would be an option, but I also agree that it would be a quite significant change in the nature of bundles and may have consequences beyond of scope of this bug.
One alternative proposal that may be less invasive would be to hard separate packaged and non-packaged locales.
Basically, the issue is that we have a packaged locale with two FileSource's - toolkit and browser - one per omni.ja for a locale X.
Then, we have a langpack with the same locale (third source).
If the langpack misses any string, L10nRegistry tries to construct the right fallback chain combining all available permutations of files between those three sources.
While toolkit and browser mostly don't overlap, langpack and packaged overlap a lot. In most cases they overlap 100% in terms of files.
If we were to augment L10nRegistry to not mingle between langpack of locale X and packaged source of locale X, we'd be reducing this to basically two variants:
- FileBundle with 15 files from langpack-x
- FileBundle with 15 files from toolkit omni.ja and browser omni.ja
This would address the memory and performance issues and wouldn't require us to redefine what a FileBundle is meant to be and wouldn't limit our fallbacking since I don't think we really want to fallback between langpack and packaged of the same locale.
The simplest way I imagine to achieve such model would be to add a new field to the FileSource
which would indicate whether the source is packaged
or langpack
and then not mix between them.
Comment 48•4 years ago
|
||
I agree that now is not a good time to drive change through the bundle implementations.
Given my patch depends on that, I'll abandon that and unassign myself.
If we revisit the idea of adding Remote Settings, we'll need to clearly understand what that's expected to be in the light of our findings here.
Updated•4 years ago
|
Comment 49•4 years ago
|
||
(In reply to Zibi Braniecki [:zbraniecki][:gandalf] from comment #48)
It's perfectly fine to put all resources for one language into one bundle.
I agree with Axel that putting all locales from one language into one bundle would be an option, but I also agree that it would be a quite significant change in the nature of bundles and may have consequences beyond of scope of this bug.
One alternative proposal that may be less invasive would be to hard separate packaged and non-packaged locales.
Basically, the issue is that we have a packaged locale with two FileSource's - toolkit and browser - one per omni.ja for a locale X.
Then, we have a langpack with the same locale (third source).If the langpack misses any string, L10nRegistry tries to construct the right fallback chain combining all available permutations of files between those three sources.
Four sources - the langpack also gets split up into toolkit and browser bits.
While toolkit and browser mostly don't overlap, langpack and packaged overlap a lot. In most cases they overlap 100% in terms of files.
If we were to augment L10nRegistry to not mingle between langpack of locale X and packaged source of locale X, we'd be reducing this to basically two variants:
- FileBundle with 15 files from langpack-x
What is a "filebundle" ?
- FileBundle with 15 files from toolkit omni.ja and browser omni.ja
This would address the memory and performance issues and wouldn't require us to redefine what a FileBundle is meant to be and wouldn't limit our fallbacking since I don't think we really want to fallback between langpack and packaged of the same locale.
The simplest way I imagine to achieve such model would be to add a new field to the
FileSource
which would indicate whether the source ispackaged
orlangpack
and then not mix between them.
Not mix where?
Given Axel has unassigned, are you able to drive this?
Comment 50•4 years ago
|
||
Four sources - the langpack also gets split up into toolkit and browser bits.
Correct, I was wrong in my description.
So, we'd end up with:
toolkit-langpack-x-browser
+browser-langpack-x-browser
0-toolkit
+5-browser
the only permutations would happen for files that overlap between toolkit
and browser
within the same source
(langpack
vs packaged
)
What is a "filebundle" ?
I meant FluentBundle
.
Not mix where?
Not mix when producing iterations in the L10nRegistry generator.
Given Axel has unassigned, are you able to drive this?
I am not available at the moment.
Comment 51•4 years ago
|
||
Given that we now know the problem, I am flabbergasted that we have neither moved this bug to a component, nor prioritized it in any way.
I have no idea if I am moving it to the right one, but Firefox/Untriaged is definitely the wrong one.
Updated•4 years ago
|
Comment 52•4 years ago
|
||
This could belong to various different qf:* classifications, picking up resource for now. Sounds like p1 for the cases when this happens.
Zibi, do you think you could find someone to take a look at this?
Comment 53•4 years ago
|
||
I've been working on an unrelated project for Shirley, but my team is planning to pick up L10nRegistry conversion to Rust, so we'll look into this. I can't promise any ETA or even when we'll start as of yet.
Comment 55•4 years ago
|
||
The severity field is not set for this bug.
:m_kato, could you have a look please?
For more information, please visit auto_nag documentation.
Updated•4 years ago
|
Updated•4 years ago
|
Updated•4 years ago
|
Comment 58•4 years ago
|
||
Here's a minimized test case that reproduces the hang in JS scenario. (I'll update it to also support Rust variant once I get the PRs in bug 1660392 to support registerSources/removeSources).
Comment 59•4 years ago
|
||
A proposal on how to solve it in l10nregistry-rs - https://github.com/zbraniecki/l10nregistry-rs/issues/12
Comment 60•4 years ago
|
||
Here's an updated test that tests against the Rust L10nRegistry. It uses the list of resIds from Preferences which is the pathological case (21 resIds). browser.xhtml is smaller (16 resIds)
The uncommented resIds is the largest number before JS OOMs, and the performance is roughly 2x better with Rust than JS, but that's definitely not enough :)
The current algorithm, for all 21 resIds, will compute 2097152 (two million) permutations.
The algorithm suggested in https://github.com/zbraniecki/l10nregistry-rs/issues/12 will shrink that to 2 (two).
Comment 61•4 years ago
|
||
(In reply to Zibi Braniecki [:zbraniecki][:gandalf] from comment #61)
The current algorithm, for all 21 resIds, will compute 2097152 (two million) permutations.
The algorithm suggested in https://github.com/zbraniecki/l10nregistry-rs/issues/12 will shrink that to 2 (two).
I assume there are some trade-offs to this approach. Could you help me understand which combinations we're giving up? Mostly trying to figure out what's the possible impact for users.
Comment 62•4 years ago
|
||
There are, and i had it listed in my original comment and then lost that part while editing. Thank you for calling it out!
The cost is that we lose a fallback scenario where we construct a localization bundle out of resources mixed between packaged and language pack.
Imagine a scenario where 20 resources are in langpack, and one is missing.
Currently one of the fallbacks would be those 20 from langpack plus the remaining one from packaged.
In the new model, we will load all 21 from packaged since we could not find all 21 in langpack.
(With a caveat that we also will gain flexibility in identifying how many missing resources invalidate a bundle, so maybe one missing file will not and then we will create a bundle with 20 locales from langpack, localize what we can, and then create a bundle with 21 from packaged and localize the rest).
The bottom line is that we'll lose a "mixed" bundle with some resources from langpack and some from packaged. But we will not lose ability to end up with translations on a screen where what was available came from langpack, and the rest came from packaged, so i think the impact is acceptable.
Lmk if you'd like me to document the trade-off better - it's a bit tricky to describe but i can try to draw some diagrams :)
Comment 63•4 years ago
|
||
(In reply to Zibi Braniecki [:zbraniecki][:gandalf] from comment #63)
There are, and i had it listed in my original comment and then lost that part while editing. Thank you for calling it out!
The cost is that we lose a fallback scenario where we construct a localization bundle out of resources mixed between packaged and language pack.
Imagine a scenario where 20 resources are in langpack, and one is missing.
Uh, so just to be clear, ISTR the failures we encountered before were not because an entire file was missing, but because a single message could not be translated with the given combination of resources (because some of them were old), see comment 41. This also meant we kept re-parsing all of these resources (which is what caused the memory leak).
Is this addressed by your new strategy? Because your comment talks about entire files missing.
And, orthogonal question: what's the timeline for a fix landing here?
Comment 64•4 years ago
|
||
Is this addressed by your new strategy?
Yes, that's precisely what the algorithm proposal solves.
Because your comment talks about entire files missing.
Flod asked what's the trade-off. I described what fallback scenarios we will not use when the new model is deployed.
In other words, the tradeoff is that on the pros side we will not be constructing millions of Bundles, on the cons, we will construct fewer permutations of fallbacks.
I tried to explain that while we will miss some potential fallback scenarios, they are quirky (combined packages+langpack).
what's the timeline for a fix landing here?
The only remaining blocker, bug 1660392 is in the review, and my hope is that we'll work on the fix for this bug in February.
Comment 65•4 years ago
|
||
Hi, I found this bug after a "quick search" and it seems like I might have this exact bug. I posted a screen recording on Reddit with some information. I'm using Canadian English dictionary + language pack and the French dictionary + language pack on Firefox 85 & Firefox Developer Edition 86.
Comment 66•4 years ago
|
||
I narrowed it down to this specific language pack : https://addons.mozilla.org/en-CA/firefox/addon/english-ca-language-pack/
Comment 67•4 years ago
|
||
(In reply to Nato Boram from comment #67)
I narrowed it down to this specific language pack : https://addons.mozilla.org/en-CA/firefox/addon/english-ca-language-pack/
@Nato
I'm sorry that you're having this issue, but the root of the problem is known (it's not a specific language pack, it's a set of conditions), and just needs resources to be fixed. Happy to follow-up on the Reddit thread.
Comment 69•3 years ago
|
||
This is now unblocked and :dminor has a patch that should address teh issue in https://github.com/mozilla/l10nregistry-rs/pull/14 .
He's on PTO until the end of next week, but after that I'll work with him on finding a way to validate it and land into Gecko.
NI on Dan to remind him to take a look at it when he's back :)
Assignee | ||
Updated•3 years ago
|
Updated•3 years ago
|
Assignee | ||
Comment 74•3 years ago
|
||
Assignee | ||
Comment 75•3 years ago
|
||
Depends on D125235
Assignee | ||
Comment 76•3 years ago
|
||
Depends on D125236
Assignee | ||
Comment 77•3 years ago
|
||
Depends on D125237
Assignee | ||
Comment 78•3 years ago
|
||
Depends on D125238
Assignee | ||
Comment 79•3 years ago
|
||
Depends on D125239
Assignee | ||
Comment 80•3 years ago
|
||
Depends on D125240
Assignee | ||
Comment 81•3 years ago
|
||
Depends on D125241
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Comment 83•3 years ago
|
||
Comment 84•3 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/e46c1762af52
https://hg.mozilla.org/mozilla-central/rev/624153727c52
https://hg.mozilla.org/mozilla-central/rev/55b7540af357
https://hg.mozilla.org/mozilla-central/rev/3b37fd3ee78e
https://hg.mozilla.org/mozilla-central/rev/c973e0fbc5ee
https://hg.mozilla.org/mozilla-central/rev/d750cef3fb6a
https://hg.mozilla.org/mozilla-central/rev/b1c1070d30b5
https://hg.mozilla.org/mozilla-central/rev/e7f3e1f5b625
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Comment 86•3 years ago
|
||
Can we uplift this to esr91? We have a bunch of reports of this causing problems in Thunderbird. (Yet unclear if it's the only issue around lang-packs.)
Comment 87•3 years ago
|
||
(In reply to Magnus Melin [:mkmelin] from comment #87)
Can we uplift this to esr91? We have a bunch of reports of this causing problems in Thunderbird. (Yet unclear if it's the only issue around lang-packs.)
The issues are very severe indeed, ranging from crashes (OOM | small crashes in version 91 are up 50%, and 30% have lang language packs installed), to users unable to access add-on manager in order to remediate by disabling language packs (unless they have started in safe mode).
Updated•3 years ago
|
Comment 88•3 years ago
|
||
(In reply to Magnus Melin [:mkmelin] from comment #87)
Can we uplift this to esr91? We have a bunch of reports of this causing problems in Thunderbird. (Yet unclear if it's the only issue around lang-packs.)
bug 1660392 is in the "depends on" list, and AIUI is required for this fix, and didn't make ESR. Uplifting that and all its dependencies, which ends up meaning a bunch of FTL feature work that landed in 92, feels like it's a really sizable change to be making on ESR... Zibi, how feasible would this be?
It's probably worth having a separate bug tracking the TB breakage and trying to analyze what is causing it (e.g. are there particular locales where this is predominantly happening, and/or are there somehow more TB users who have both a langpack and a built-in package of localization for a given locale, and/or is TB code simply asking for strings that do not exist, which could AIUI trigger similar catastrophic fallback behaviour with the previous fluent strategy for finding suitable language information).
Comment 89•3 years ago
|
||
Zibi, how feasible would this be?
Safe, but very involved. It's not just bug 1660392 but also bug 1672317. The total volume of change is a result of over year of work.
It's probably worth having a separate bug tracking the TB breakage and trying to analyze what is causing it
Agree. I'm wondering why we see it in Tb 91 and we haven't before. Is there a small-scale fix that would save us (like, hardcoding an ftl resource that is missing, or fine-tuning the old L10nRegistry.js to not fallback on a particular missing file?).
Assignee | ||
Comment 91•3 years ago
|
||
I think this would be too large a change to uplift to ESR, given the dependencies.
Updated•3 years ago
|
Updated•3 years ago
|
Comment 92•3 years ago
|
||
Thunderbird can track this in bug 1725955, bug 1674132, and bug 1737922.
Comment 93•3 years ago
|
||
(In reply to :Gijs (he/him) from comment #89)
It's probably worth having a separate bug tracking the TB breakage
Found a way to reproduce and at least work around one case in bug 1728744.
Updated•3 years ago
|
Updated•3 years ago
|
Description
•