Closed Bug 1182037 Opened 9 years ago Closed 7 years ago

RTL 2.5: Eastern Arabic Numerals

Categories

(Firefox OS Graveyard :: Gaia, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: nefzaoui, Assigned: nefzaoui)

References

Details

Attachments

(2 files)

We should try to implement Eastern Arabic Numerals into the System since they are core elements of the Arabic language in some Arabic countries.
Blocks: 1182033
Currently prototyping an extension to l20n.js that auto-processes those numerals and converts them to Eastern, leveraging the Intl APIs. Will come up soon with a PR.
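For context, a minimal sketch of how the Intl API can produce Eastern Arabic digits on its own. This is not the code of the prototype mentioned above; the function name `toEasternArabic` is hypothetical, and the `-u-nu-arab` locale extension is used here as an assumption so that the result does not depend on a locale's default numbering system.

```javascript
// Hypothetical helper, not taken from the attached patch.
// The "-u-nu-arab" Unicode extension explicitly requests Arabic-Indic digits.
function toEasternArabic(value) {
  const formatter = new Intl.NumberFormat('ar-u-nu-arab', {
    useGrouping: false // keep the digits contiguous, no thousands separator
  });
  return formatter.format(value);
}

console.log(toEasternArabic(2015));
```

The number itself stays an ordinary JS number; only the displayed string changes.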
Assignee: nobody → nefzaoui
Comment on attachment 8632847 [details] [gaia] anefzaoui:eastern-arabic-numerals-experiment > mozilla-b2g:master
Zibi, Stas, what do you think of this as an elementary step towards the bug's goal? I initially wanted to integrate this into either l10n.js or l20n.js, but then thought it would be better kept standalone, since it could be extended to cover Bug 1182036. Thanks!
Attachment #8632847 - Flags: feedback?(stas)
Attachment #8632847 - Flags: feedback?(gandalf)
I'll look more into the code, but first thing I see is that you want to use Intl. Currently, we're waiting for bug 1172609 to land in order to start preloading Intl in bug 1171856. Without that, whichever app will start using this first will see a perf hit. I'm setting a dependency on bug 1171856 so that we first have it loaded during system start and then we can use it in apps.
Blocks: 1171856
Great job, Ahmed. I like how small and self-contained this is. I do wonder, however, if this wouldn't make a nice addition to the localization library. In the current implementation, the numeral transformation happens after the DOM is localized and consequently each affected element is modified twice.

The question is: should this be specific to the DOM translation, or inherent to the localization library? If we're using l20n.js to localize a CLI node app, will we want numerals to be transformed to Eastern Arabic as well?

We already did something similar in bug 1154438 for embedded text. Should we just use the same technique and detect numerals in the resolver while the string is being composed? Is it at all possible?

Are there cases when you'd want to show Western Arabic numerals for Arabic text which would normally use Eastern? Do we need an explicit hint to the localization lib to apply the transformation logic, or can this be deduced, for instance by checking the contents of the translation string or the contents of the interpolated placeable?
(In reply to Staś Małolepszy :stas from comment #5)
> Great job, Ahmed. I like how small and self-contained this is. I do wonder,
> however, if this wouldn't make a nice addition to the localization library.

I'm totally up for that, one library to rule them all! :)

> In the current implementation, the numeral transformation happens after the
> DOM is localized and consequently each affected element is modified twice.
>
> The question is: should this be specific to the DOM translation, or
> inherent to the localization library?

I like its specificity to the DOM, but I'm in favour of it being part of the localization library. However, I will need lots of help to know how and where exactly to integrate it; l20n.js is too big at first glance.

> If we're using l20n.js to localize a
> CLI node app, will we want numerals to be transformed to Eastern Arabic as
> well?

As far as I'm concerned, and also AFAIK, only Firefox OS needs Eastern numerals for now. I can't imagine a case where we want to show them in a CLI app.

> We already did something similar in bug 1154438 for embedded text. Should
> we just use the same technique and detect numerals in the resolver while the
> string is being composed? Is it at all possible?

The case with numerals is a little bit different from l10n key strings, which have somewhere to be imported from and are then processed. Numerals are already there in the DOM tree.

> Are there cases when
> you'd want to show Western Arabic numerals for Arabic text which would
> normally use Eastern?

Yes, ideally only system-related numerals should be converted; anything else (e.g. numbers that are manually entered by the user) should not be altered.

> Do we need an explicit hint to the localization lib
> to apply the transformation logic or can this be deduced, for instance by
> checking the contents of the translation string or the contents of the
> interpolated placeable?

We should manually fire the script once per app, so yes, a hint.

What do you think?
Status update: Added a simple test and fixed most of the logic and perf bugs in the lib. I don't think there's more to add for now except trying to use and test it more extensively across gaia, so we'll just have to keep waiting for bug 1171856. Also, as far as I understand, this bug depends on bug 1171856 rather than blocking it.
No longer blocks: 1171856
Depends on: 1171856
Attached image: Screenshots w/ patch applied (deleted)
Some screenshots from across the OS with the patch applied.
Need-info'ing Zibi for perf results of this patch.
Flags: needinfo?(gandalf)
Just an update from me. I didn't have time to work on that this week, but it's at the top of my list for Monday.

I like the patch, but am a little worried about the performance impact. I'd like to see if we can optimize the behavior in a few ways:

1) The current code initializes MutationObserver for all locales. It seems that it should only be enabled for a limited number of languages, so NumeralsHelper.init should probably just set up an event listener for language changes and only turn on the MO if the language matches.

2) As Stas pointed out, it seems that we'll be localizing/internationalizing elements. I think that not only for performance reasons, but also from the perspective of the developer API, it would be nice not to have to understand two systems.

On the other hand, the dialpad case is one where we don't need l10n, we just need the numbers.

So that's more interesting: it's something similar to Intl.DateTimeFormat or Intl.NumberFormat formatting, where we sometimes do this as part of l10n and sometimes separately. I'm wondering if this could/should even be part of Intl.NumberFormat...

I mean, there doesn't seem to be a case where we'd need to translate ١٢٣ to 123 in the UI. How about we just start passing digits through Intl.NumberFormat the same way as we pass dates through Intl.DateTimeFormat?

3) The regexps in _convertSingleElement seem expensive. I'd like to look closer at them.

I'm keeping the NI, I'll toy with this next week.
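Point 1 above could be sketched roughly as follows. This is a hedged illustration rather than the patch's code: `usesNonLatinDigits` and `createLanguageGate` are hypothetical names, and `enable`/`disable` stand in for whatever starts and stops the numeral handling (for example a MutationObserver).

```javascript
// Hypothetical check: does this locale default to a non-Latin numbering system?
function usesNonLatinDigits(locale) {
  const { numberingSystem } = new Intl.NumberFormat(locale).resolvedOptions();
  return numberingSystem !== 'latn';
}

// Hypothetical gate: only switch the expensive machinery on when the current
// language needs it, and switch it off again when it no longer does.
function createLanguageGate(enable, disable) {
  let active = false;
  return function onLanguageChange(locale) {
    const wanted = usesNonLatinDigits(locale);
    if (wanted && !active) {
      enable();
    } else if (!wanted && active) {
      disable();
    }
    active = wanted;
  };
}
```

In a page, the returned function could be called with `navigator.language` once at startup and again from a `languagechange` listener on `window`.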
Ok. So I spent some time thinking about it more, and I'm staying with my conclusion from Comment 10.

We should treat numbers the same way as we treat date and time: store them in JS as variables and format them for display.

In other words, instead of the NumeralsHelper I want the dialer to do something like this:

function updateKeypad() {
  for (var key of document.querySelectorAll('.keypad-key')) {
    key.textContent =
      Intl.NumberFormat(navigator.languages).format(key.getAttribute('data-value'));
  }
}

And fire it on window.onlanguagechange.

Ahmed, what do you think?
Flags: needinfo?(gandalf) → needinfo?(nefzaoui)
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #10)
> Just an update from me. I didn't have time to work on that this week, but
> it's at the top of my list for Monday.
>
> I like the patch, but am a little worried about performance impact, I'd like
> to see if we can optimize the behavior in a few ways:
>
> 1) The current code initializes MutationObserver for all locales. It seems
> that it should be only enabled for a limited number of languages, so
> NumeralsHelper.init should probably just set up event listener for languages
> change and only turn on MO if language matches.

Agreed. Will update.

> 2) As Stas pointed out, it seems that we'll be localizing/internationalizing
> elements. I think that not only for performance reasons, but also from the
> perspective of the developer API, it would be nice not to have to understand
> two systems.

I agree it should probably be merged into l20n.js. I'll need help with that.

> On the other hand, the dialpad case is one where we don't need l10n, we just
> need the numbers.
>
> So that's more interesting - it's something similar to Intl.DateTimeFormat
> formatting or Intl.NumberFormat formatting where we sometimes do this as part of
> l10n and sometimes separately. I'm wondering if this could/should be even
> part of Intl.NumberFormat...

As you said, it would be nice not to have to understand two systems. I think if we're going to handle Arabic numerals we should unify our methods, and since this library is an experiment that could become the solution, I think we should stick with it as the only thing handling numbers across gaia.

> I mean, there doesn't seem to be the case where we'd need to translate ١٢٣
> to 123 in the UI. How about we just start passing digits through
> Intl.NumberFormat the same way as we pass dates through Intl.DateTimeFormat?
>
> 3) The regexps in _convertSingleElement seem expensive. I'd like to look
> closer at them

They do seem expensive to me too. I appreciate the help.

> I'm keeping the NI, I'll toy with this next week

(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #11)
> Ok. So I spent some time thinking about it more, and I'm staying with my
> conclusion from Comment 10.
>
> We should treat numbers the same way as we treat date and time - store it in
> JS as variables and format for display.
>
> In other words, instead of the NumeralHelper I want dialer to do sth like
> this:
>
> function updateKeypad() {
>   for (var key of document.querySelectorAll('.keypad-key')) {
>     key.textContent =
>       Intl.NumberFormat(navigator.languages).format(key.getAttribute('data-value'));
>   }
> }
>
> And fire it for window.onlanguagechange change.
>
> Ahmed, what do you think?

Hence, my reply. I still think we should unify how we handle those numbers all across gaia, or expand this helper library to take exceptions like the dialer keypad into consideration.
Flags: needinfo?(nefzaoui) → needinfo?(gandalf)
(In reply to Ahmed Nefzaoui [:Nefzaoui] from comment #12)
> As you said, it would be nice not to have to understand two systems. I think
> if we're going to handle Arabic numerals we should unify our methods, and
> since this library is an experiment that could become the solution, I think
> we should stick with it only handling numbers across gaia.

(...)

> > In other words, instead of the NumeralHelper I want dialer to do sth like
> > this:
> >
> > function updateKeypad() {
> >   for (var key of document.querySelectorAll('.keypad-key')) {
> >     key.textContent =
> >       Intl.NumberFormat(navigator.languages).format(key.getAttribute('data-value'));
> >   }
> > }
> >
> > And fire it for window.onlanguagechange change.
> >
> > Ahmed, what do you think?
>
> Hence, my reply. Still think we should unify how to handle those numbers all
> across gaia. Or expand this helper library to put exceptions like dialerpad
> into consideration.

I don't think I understand your response.

What I'm saying is that Intl.NumberFormat seems sufficient to handle number formatting into any numbering system, including Eastern Arabic. That removes the need for a regexp, for a Mutation Observer and for Eastern->Western transformations. It is the same way we avoid those three things for datetimes, and the same approach we will take for the other things Intl supports: collations and currencies.

Are you ok with not pursuing NumeralsHelper and instead focusing on using the raw Intl API for your goal? (potentially using the IntlHelper - bug 1191011 - for retranslations)
Flags: needinfo?(gandalf) → needinfo?(nefzaoui)
My understanding is that intl_helper.js from bug 1191011 handles elements one at a time. The whole purpose of numerals_helper is to avoid having to look up each element separately, by following the l10n.js approach (adding a data-intl-numerals attribute to all the elements needing conversion). So yes, I don't mind which road we take to convert a single element. My goal here is to make it automatic, with the fewest possible additions to either the DOM or JS. What do you think?
Flags: needinfo?(nefzaoui)
(In reply to Ahmed Nefzaoui [:Nefzaoui] from comment #14)
> My understanding is that intl_helper.js from bug 1191011 handles elements
> one at a time.

intl_helper only deals with caching of the Intl formatters, it doesn't handle elements.

> The whole purpose of numerals_helper is to not have to look up for each
> element separately. By following l10n.js approach (adding a
> data-intl-numerals to all the elements needing conversion).
> So yes I don't mind which road to take to accomplish conversion to a single
> element. My goal here is to make it automatic and with the least possible
> additions to either the DOM or js.
>
> What do you think?

I struggle to understand why data-intl-numerals should be handled any differently from data-intl-datetime and data-intl-currency.

For that reason, I prefer not to store information about the content type of the element in the form of an HTML attribute, but rather expect the values to come from JS and be stored as variables.

So, instead of:

<span data-intl-numerals>1</span> and numeral helper

Use:

<span id="keypad-1"></span> and

var formatter = Intl.NumberFormat(navigator.languages);
document.getElementById('keypad-1').textContent = formatter.format(1);

which is what we do for datetimes and will do for currencies and collations.

Does it make sense?
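The idea of caching Intl formatters, mentioned above, could look roughly like this. It is a sketch under assumptions only: the names `getNumberFormatter` and `resetFormatterCache` are hypothetical and this is not the intl_helper.js API from bug 1191011.

```javascript
// Hypothetical cache keyed by locale list and options.
const formatterCache = new Map();

function getNumberFormatter(locales, options = {}) {
  const key = JSON.stringify([locales, options]);
  if (!formatterCache.has(key)) {
    // Constructing a formatter is the expensive part, so do it once per key.
    formatterCache.set(key, new Intl.NumberFormat(locales, options));
  }
  return formatterCache.get(key);
}

// Meant to be called when the language changes, so that formatters built for
// the previous language are not reused.
function resetFormatterCache() {
  formatterCache.clear();
}
```

Call sites would then ask for `getNumberFormatter(navigator.languages).format(value)` instead of constructing a new formatter each time.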
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #15)
> (In reply to Ahmed Nefzaoui [:Nefzaoui] from comment #14)
> > My understanding is that intl_helper.js from bug 1191011 handles elements
> > one at a time.
>
> intl_helper only deals with caching of the Intl formatters, it doesn't
> handle elements.
>
> > The whole purpose of numerals_helper is to not have to look up for each
> > element separately. By following l10n.js approach (adding a
> > data-intl-numerals to all the elements needing conversion).
> > So yes I don't mind which road to take to accomplish conversion to a single
> > element. My goal here is to make it automatic and with the least possible
> > additions to either the DOM or js.
> >
> > What do you think?
>
> I struggle to understand why data-intl-numerals should be handled any
> differently from data-intl-datetime and data-intl-currency.
>
> For that reason, I prefer not to store information about the content type of
> the element in form of HTML attribute, but rather expect them to come from
> JS and be stored as variables.
>
> So, instead of:
>
> <span data-intl-numerals>1</span> and numeral helper
>
> Use:
>
> <span id="keypad-1"></span> and
>
> var formatter = Intl.NumberFormat(navigator.languages);
> document.getElementById('keypad-1').textContent = formatter.format(1);
>
> which is what we do for datetimes and will do for currencies and collations.
>
> Does it make sense?

My question is: can your suggestion automatically convert *all* the numerals in a page (presumably, an app) without handling them *one by one* manually?
(In reply to Ahmed Nefzaoui [:Nefzaoui] from comment #16)
> My question is: does your suggestion have the possibility to automatically
> convert *all* the numerals in a page (presumably, app) without calling them
> *one by one* manually ?

Oh, no. Same as with dates, times and currencies, the front end has to provide the values programmatically from JS.

What I'm arguing is that it should be like that, because this way you control how you store the numbers that you're displaying.

My approach is that we should not store that content in HTML. It should be stored in JS and formatted on display. This means that you format numbers, dates, times and currencies the same way regardless of whether they are part of a localized message or the full content of the element.

This also means that as a developer you choose how you format that number using the Intl API formatting options.

Lastly, when we start enabling formatting of numbers/dates/times in L20n for localizers, they will decide how they format those variables.
Stas, Pike, opinions?
Flags: needinfo?(stas)
Flags: needinfo?(l10n)
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #17)
> (In reply to Ahmed Nefzaoui [:Nefzaoui] from comment #16)
> > My question is: does your suggestion have the possibility to automatically
> > convert *all* the numerals in a page (presumably, app) without calling them
> > *one by one* manually ?
>
> Oh, no. Same as dates

AFAIK, there are far more numbers than dates across gaia, so this would require more manual work.

> What I'm arguing is that it should be like that, because this way you
> control how you store the numbers that you're displaying.
>
> My approach is that we should not store that content in HTML. It should be
> stored in JS and formatted on display. This means that you format numbers,
> dates, times and currencies the same way regardless of whether they are
> part of a localized message or the full content of the element.
>
> This also means that as a developer you choose how you format that number
> using Intl API formatting options.
>
> Lastly, when we'll start enabling formatting of numbers/dates/times in L20n
> for localizers, they will decide how they format those variables.

I'm unopinionated here. It's not that I'm insisting on my approach; it's just an attempt at an automated solution that scales to the size of an operating system that keeps expanding over time, where we would otherwise have to keep watching every change or addition to its code to make sure it is numerals-proof.
TBH, I fail to see the advantage of manual, but then again if you guys think manual is better then I'm totally up for that.
(In reply to Ahmed Nefzaoui [:Nefzaoui] from comment #19)
> AFAIK, the amount of numbers across gaia is way more than the amount of
> dates, which requires more manual work.

I'd like to verify that. It feels to me that the number of numerical elements is actually not that high. Like, there are literally 11 on the dialer screen, 10 of which are keys that can be looped over, and one is the typed-number display, which is already provided from JS.

> TBH, I fail to see the advantage of manual, but then again if you guys think manual is better
> then I'm totally up to that.

I understand that you don't find the arguments I listed above convincing then. Oh well :)

On top of them, I'll add several more:

* The programmatic approach costs less memory/CPU cycles, because we don't have to run a Mutation Observer on the whole document
* It allows for more specific output, since developers (and localizers in the future) can format numbers to their needs
* It unifies the approach to number formatting regardless of the output (so, if the data goes to an external API, like Bluetooth, NFC, etc., developers format it the same way as if it were going to HTML)
* It works the same way as date, time and currency formatting
* It allows us to dive deeper into the Intl API to get more CLDR data (like numerical ranges)
* It allows for proper formatting of numbers that are put as placeables into localization strings, without a costly RegExp
* It unifies how we treat all number formats, not just ar-EG as a special case.

I'll stop here and wait for Stas' and Axel's opinions :)
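The point about placeables can be illustrated with a small sketch. Everything here is an assumption made for illustration: `formatMessage` is a hypothetical function, and the array-of-parts message shape is not the l20n.js format or resolver. Values that are JS numbers get formatted before they are joined into the string; strings are passed through untouched, so nothing has to scan the finished text for digits.

```javascript
// Hypothetical message composer: `parts` mixes literal strings and numbers.
function formatMessage(parts, locale) {
  const numberFormatter = new Intl.NumberFormat(locale);
  return parts
    .map((part) =>
      // Only typed numbers are formatted; digit characters inside strings
      // (for example user-entered text) are left exactly as they are.
      typeof part === 'number' ? numberFormatter.format(part) : part
    )
    .join('');
}

console.log(formatMessage(['You have ', 1500, ' unread messages'], 'en-US'));
```

Because the decision is made on the type of the value rather than on its rendered characters, a user-entered string such as a phone number is never rewritten by accident.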
Completely orthogonal thought: Could we just use a font to achieve this? My general concern with these transcriptions is that we actually want the binary code point for a 9 if we dial a 9, regardless of whether we display it in Eastern Arabic or Bengali numerals.
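The concern about keeping the code point for "9" can also be addressed without a font, by separating the dialed value from its label. A minimal sketch, assuming a hypothetical keypad model (`buildKeypad` and `dial` are illustrative names, not dialer code):

```javascript
// Hypothetical keypad model: `value` is what gets dialed (always ASCII),
// `label` is only what the user sees.
function buildKeypad(locale) {
  const formatter = new Intl.NumberFormat(locale, { useGrouping: false });
  const keys = [];
  for (let digit = 0; digit <= 9; digit++) {
    keys.push({ value: String(digit), label: formatter.format(digit) });
  }
  return keys;
}

// Builds the string that would be handed to the telephony layer.
function dial(keys, pressedIndexes) {
  return pressedIndexes.map((index) => keys[index].value).join('');
}
```

Whatever numbering system the labels use, `dial` only ever sees the ASCII digits stored in `value`.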
Flags: needinfo?(l10n)
I don't know if we could, but I don't think it would solve the problem of number formatting. Number formatting is not only about the glyphs we display, but also decimal separators, thousand separators etc. It feels to me that formatting numbers via Intl.NumberFormat gives us exactly that. For all cases, all languages.
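To make the separator point concrete, here is a small sketch using `Intl.NumberFormat.prototype.formatToParts`; the helper name `separatorsFor` is hypothetical.

```javascript
// Collects the grouping and decimal separators a locale uses.
function separatorsFor(locale) {
  const parts = new Intl.NumberFormat(locale).formatToParts(1234567.891);
  const find = (type) => {
    const part = parts.find((candidate) => candidate.type === type);
    return part ? part.value : null;
  };
  return { group: find('group'), decimal: find('decimal') };
}

console.log(separatorsFor('en-US'));
console.log(separatorsFor('de-DE'));
console.log(separatorsFor('ar-u-nu-arab'));
```

A font swap would change the digit glyphs but leave the `.` and `,` of the source string in place, which is exactly what differs between these locales.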
Comment on attachment 8632847 [details] [gaia] anefzaoui:eastern-arabic-numerals-experiment > mozilla-b2g:master
Passing by: I don't like the fact that the font-size is being adjusted "magically" for Arabic. As I mentioned on the GitHub PR:
* if the same font is used for English and Arabic alphabets, then the font should be fixed, not the stylesheets;
* if another font is used specifically for the Arabic alphabet, then we must make sure that this font-size adjustment is applied to all languages that use this font, not just Arabic (thinking of Farsi, Urdu…).
Leaving an f- to keep an eye on the solution you'll find.
Attachment #8632847 - Flags: feedback-
Just an update on my part: I'm reviving :ahmed's idea in the form of a more generalized Intl-DOM binding using `data-intl-format`, `data-intl-value` and `data-intl-options`. That should help with non-l10n-related Intl-formatted elements (including numbers, durations, datetimes, units etc.). I don't have a bug for this yet, so I'll keep this one open, and once I have something working I'll post it here.
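No implementation of that binding is attached here, so the following is only a guess at how such attributes might be consumed. The attribute semantics are assumptions: `data-intl-format` is taken to be "number" or "datetime", `data-intl-value` the raw value (epoch milliseconds for dates), and `data-intl-options` a JSON string of Intl options. `formatIntlElement` is a hypothetical name.

```javascript
// Hypothetical consumer of data-intl-* attributes. `element` only needs
// `dataset` and `textContent`, so a real DOM element or a plain object works.
function formatIntlElement(element, locales) {
  const { intlFormat, intlValue, intlOptions } = element.dataset;
  const options = intlOptions ? JSON.parse(intlOptions) : {};
  const value = Number(intlValue);
  switch (intlFormat) {
    case 'number':
      element.textContent = new Intl.NumberFormat(locales, options).format(value);
      break;
    case 'datetime':
      element.textContent = new Intl.DateTimeFormat(locales, options).format(new Date(value));
      break;
    default:
      // Unknown formats are left untouched.
      break;
  }
  return element;
}
```

In a page this could be applied to every match of `document.querySelectorAll('[data-intl-format]')` at startup and again on language change.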
Clearing an old needinfo on me. Let me know if I can help with this in the future.
Flags: needinfo?(stas)
Attachment #8632847 - Flags: feedback?(stas)
Firefox OS is not being worked on
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX
Comment on attachment 8632847 [details] [gaia] anefzaoui:eastern-arabic-numerals-experiment > mozilla-b2g:master
I still like the idea, but we don't see enough raw-Intl fields in the Firefox UI to justify writing a MutationObserver-based approach to handle them. If we do at some point, I'd like to see a more generic solution handling date/relativetime/number etc.
Attachment #8632847 - Flags: feedback?(gandalf)