Closed Bug 114365 (mathvariant) Opened 23 years ago Closed 11 years ago

[MathML2.0] Support for the 'mathvariant' attribute

Categories

(Core :: MathML, defect, P4)

defect

Tracking

()

RESOLVED FIXED
mozilla28
Tracking Status
relnote-firefox --- 28+

People

(Reporter: rbs, Assigned: jkitch)

References

(Blocks 3 open bugs)

Details

(Keywords: dev-doc-complete, Whiteboard: [documentation: see comments 112 and 116][mentor=fredw][lang=c++][parity-webkit])

Attachments

(6 files, 28 obsolete files)

(deleted), patch
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
[See also 65951] MathML 2.0 has introduced a new attribute which amongst other things allows referencing characters in plane-1 (remember the artefact of <font face="Symbol"> a </font> to get alpha). These characters are currently mapped to PUA assignments in Mozilla, and users can access them via their entity references. Since plane-1 characters break nearly all current applications (including James Clark's nsgmls which sits behind most validating XML parsers), |mathvariant| provides a work-around that some people find useful. Here is a screenshot: http://www.w3.org/Math/testsuite/Presentation/TokenElements/mi/mimathvariant13.html Another peculiarity of |mathvariant| attribute is that it is a flag for style invariant characters (bug 65951), so I guess this attribute needs to be supported at some stage. I am filing this bug to jot down some thougths as I was reading. Some values of |mathvariant| can be mapped directly to CSS in mathml.css, others require back-end code. (maybe the CSS rules should inlcude the |font| shorthand to reset everything first in order to meet the style invariant requirement). <mi mathvariant='normal'>a</mi> -- can be mapped directly to CSS <mi mathvariant='bold'>a</mi> -- ditto <mi mathvariant='italic'>a</mi> -- ditto <mi mathvariant='bold-italic'>a</mi> -- ditto <mi mathvariant='double-struck'>a</mi> -- CSS + code in GFX, specifically use [mathvariant='double-struck'] { font-family: -moz-double-struck /* then, detect this font-family in GFX and */ /* simply convert the character indices */ /* to the expected PUA code points */ } <mi mathvariant='bold-fraktur'>a</mi> -- ditto here + |font-weight: bold| <mi mathvariant='script'>a</mi> -- similar tweaking here and below <mi mathvariant='bold-script'>a</mi> <mi mathvariant='fraktur'>a</mi> <mi mathvariant='sans-serif'>a</mi> -- these can be mapped directly to CSS <mi mathvariant='bold-sans-serif'>a</mi> <mi mathvariant='sans-serif-italic'>a</mi> <mi mathvariant='sans-serif-bold-italic'>a</mi> <mi mathvariant='monospace'>a</mi> <mi mathvariant='bold'>ab</mi>
I checked in a partial fix for (CSS rules) while I was doing the work to support MathML styling attributes.
Possible plan of action: CSS === - add recognition of new -moz values for |font-variant|: font-variant: -moz-script | -moz-fraktur | -moz-double-struck (currently |font-variant| only supports |small-caps|). - add CSS rules in mathml.css: [mathvariant="double-struck"] { font-variant: -moz-double-struck; } [mathvariant="fraktur"], [mathvariant="bold-fraktur"] { font-variant: -moz-fraktur; } [mathvariant="script"], [mathvariant="bold-script"] { font-variant: -moz-script; } MathML ====== - merge mi/mtext/ms into a single token class: nsMathMLTokenFrame - make mo inherits from the token class - make the token class set a frame state bit in their child nsTextFrame (to flag that since the parent is MathML-related, the child might need special processing - see below). nsTextFrame / nsTextTranformer ============================== For platforms that support surrogate pairs, when font.variant has one of the new values, transform the text to the corresponding surrogate representation and let the font code deal with the rest. For platforms that don't support surrogate pairs, transform the text to the corresponding PUA values and let the font code deal with these. ucvmath ======= It might be necessary to setup converters that work with surrogate input. If not, the transformation stage would have to output PUA values. In fact, a first implementation could be entirely based on PUA values (i.e., the current converters are retained). Then, the migration to surrogate (on platforms that support it) could be another stage. issues ====== Since the text might already been using the surrogate representation of these plane-1 characters, it might be necessary to transform this text to the existing PUA values so that the font code can deal with them. With a quick test to see if the MathML frame state bit is set, a scan can be done to attempt to perform the transformation. This way, the penalty of the scan won't affect non-MathML frames.
Filed the following sub-tasks: bug 162403: Extend conveter APIs to surrogate pairs bug 162405: Add support for font-variant: -moz-script | -moz-fraktur | -moz-double-struck bug 162412: Merge mi/mtext/ms into a single token class: nsMathMLTokenFrame
Depends on: 162403, 162405, 162412
Attached patch fixup some CSS rules in mathml.css (obsolete) (deleted) — Splinter Review
The "font" shorthand doesn't play nice with the propagation of the scriptlevel. Using "font" required to use "font-size: inherit" as well, and it was initially thought that this would have recovered the size that "font" would have reset, but it actually means that the element inherits the size of the parent... that's not what is needed in a fraction, superscript, etc, as can be seen in this example: <msub> <mi>x</mi> <mi mathvariant="bold">a</mi> </msub> So, the fix now is to stay clear of "font" and manually reset all the font-xxx properties that have to be reset when mathvariant is specified.
Attachment #194890 - Flags: superreview?(bzbarsky)
Attachment #194890 - Flags: review?(bzbarsky)
Attachment #194890 - Flags: superreview?(bzbarsky)
Attachment #194890 - Flags: superreview+
Attachment #194890 - Flags: review?(bzbarsky)
Attachment #194890 - Flags: review+
Comment on attachment 194890 [details] [diff] [review] fixup some CSS rules in mathml.css This one was checked in.
Attachment #194890 - Attachment is obsolete: true
The following mathvariant test does not display properly in Firefox 2.0.0.11: http://www.w3.org/Math/testsuite/testsuite/Presentation/TokenElements/mi/mimathvariant14.xml Checking for updates (just now) with this version results in a message that none are available. The About Box with this version yields the following information: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11 In addition, I just downloaded and re-installed the MathML fonts suggested for windows; i.e., http://web.mit.edu/is/topics/webpublishing/mathml/fonts-win.html
Depends on: 413115
http://www.w3.org/TR/2007/WD-MathML3-20071214/chapter3.html#presm.symbolchars "Processing applications that accept SMP characters are required to treat the corresponding BMP and attribute combinations identically." I guess this means that copying the text should provide the Plane 1 characters, and that might require modifying the content rather than performing a transformation for the TextRuns.
Depends on: 449396
QA Contact: ian → mathml
Blocks: mathml-2
Assignee: rbs → nobody
Alias: mathvariant
Is there any progress on this? Supporting mathvariant would be helpful for use with applications targeting MathML 3.0 such as the popular MathJax: http://www.mathjax.org MathJax makes heavy use of mathvariant.
Please fix this bug! Math is no fun without \mathbb, \mathcal and \mathfrac.
(In reply to Brad Bell from comment #6) > The following mathvariant test does not display properly in Firefox 2.0.0.11: > http://www.w3.org/Math/testsuite/testsuite/Presentation/TokenElements/mi/ > mimathvariant14.xml > ... snip ... The web page for this test has changed to: http://www.w3.org/Math/testsuite/mml2-testsuite/Presentation/TokenElements/mi/mimathvariant13.xml
(In reply to Brad Bell from comment #10) > The web page for this test has changed to: > http://www.w3.org/Math/testsuite/mml2-testsuite/Presentation/TokenElements/ > mi/mimathvariant13.xml You may want to try the build at http://www.wg9s.com/mozilla/firefox/ with MathJax fonts installed (bug 701758)
Using latest nightly and all the suggested fonts at https://developer.mozilla.org/en/Mozilla_MathML_Project/Fonts the webpage for this test: http://www.w3.org/Math/testsuite/mml2-testsuite/Presentation/TokenElements/mi/mimathvariant13.xml displays correctly. Perhaps this bug should be marked as fixed.
Only the most important characters in double-struck, script or fraktur should now render correctly and only with MathJax fonts installed.
I got feedback on this bug again. Relying on existing CSS properties and adding new properties as suggested in bug 162405 does not seem the right approach. mathvariant is supposed to map existing characters to their corresponding bold, fraktur, sans-serif etc counterpart, if they exist. So maybe a better implementation is the following: - use a single CSS property -moz-mathvariant with values default, normal, bold, italic, bold-italic, double-struck, bold-fraktur, script, bold-script, fraktur, sans-serif, bold-sans-serif, sans-serif-italic, sans-serif-bold-italic, monospace. - map the mathvariant attribute to this property in nsMathMLElement - do the char remapping in layout/generic, in the files that handle the text I'm wondering if we could also do something in layout/generic to handle at the same time the cases of single <mi>x</mi> and token elements with leading/trailing spaces <mtext> text </mtext>, that would solve bugs like bug 518592, bug 770710, bug 527201 etc Of course this approach won't address the issue mentioned in comment 7.
(In reply to Frédéric Wang (:fredw) from comment #14) > - do the char remapping in layout/generic, in the files that handle the text Have a look at nsTextRunTransformations, which does something similar. > I'm wondering if we could also do something in layout/generic to handle at > the same time the cases of single <mi>x</mi> and token elements with > leading/trailing spaces <mtext> text </mtext>, that would solve bugs like > bug 518592, bug 770710, bug 527201 etc I don't know exactly what would be involved here, but perhaps something similar could also make text frames return intrinsic widths (or perhaps frame bounds, even) that include glyph overflow to fix the problems with math content overflowing in matrices.
(In reply to Karl Tomlinson (:karlt) from comment #15) > > I'm wondering if we could also do something in layout/generic to handle at > > the same time the cases of single <mi>x</mi> and token elements with > > leading/trailing spaces <mtext> text </mtext>, that would solve bugs like > > bug 518592, bug 770710, bug 527201 etc > > I don't know exactly what would be involved here, but perhaps something > similar could also make text frames return intrinsic widths (or perhaps > frame bounds, even) that include glyph overflow to fix the problems with > math content overflowing in matrices. Interesting. My idea was to use two other -moz-mathvariant properties "math-text" (for non-mi token) and "math-identifier" which would be set in mathml.css by mtext, mo, ms, mn { -moz-mathvariant: math-text; } mi { -moz-mathvariant: math-identifier; } I'm not exactly sure about the priority between CSS rules applied in nsMathMLElement and mathml.css. So we may need to map the mathvariant attribute to -moz-mathvariant in mathml.css instead: [mathvariant="fraktur"] { -moz-mathvariant: fraktur; } etc And I don't remember exactly about the priority of CSS rules but we must do that in a way that the mathvariant rules override the rules for token elements (perhaps adding more rules). Now, in layout/generic we add MathML token elements specific stuff (leading/trailing space removal, computation of intrinsic widths) when -moz-mathvariant is not "default". When -moz-mathvariant is "math-identifier" we render the text as italic when the mi content is made of a single variant char. When -moz-mathvariant is not "default", "math-identifier" or "math-text" we perform the appropriate mathvariant transformation. If we can detect that the text frame type is MathML in layout/generic, one of the value "default" or "math-text" can be removed.
I forgot to say that we'll keep the [mathvariant] rule to reset the CSS font properties and won't use font rules for other mathvariant attributes. So in my opinion, mathvariant="bold" should only affect the characters for which a corresponding bold Unicode code point exists in contrast to "font-weight: bold" which puts all characters in bold. BTW, I think that a -moz-mathvariant property to distinguish MathML tokens will enable to fix bug 560100 without adding new CSS properties. In general, I expect that it will simplify nsMathMLTokenFrame a lot.
Depends on: 785956
Mass change: setting priority to 2 for bugs that would be nice fix if Gecko's MathML support is enabled by default in MathJax but that are not in my opinion strictly required or for which a workaround could be written in the MathJax code.
Priority: -- → P2
Keywords: helpwanted
Whiteboard: [mentor=fredw][lang=c++]
Priority: P2 → P4
Blocks: 838509
Whiteboard: [mentor=fredw][lang=c++] → [mentor=fredw][lang=c++][parity-webkit]
Assignee: nobody → jkitch.bug
@James Kitchener: that would definitely great if you could work on this bug. Here are my latest thoughts on this: 1) The TEXT_FORCE_TRIM_WHITESPACE flags are not correctly set by nsMathMLTokenFrame::ForceTrimChildTextFrames. See Patch V5 from bug 415413 for how it could be fixed and changed to a more general TEXT_IS_IN_TOKEN_MATHML flag. I guess the function could also be modified to set another flag TEXT_IS_IN_SINGLE_CHAR_MI when you are in an <mi> element with only a single char. 2) A -moz-mathvariant CSS property should be implemented. The mathvariant attribute will be mapped to that property in content/mathml/ as usual. It should be allowed on token elements and math/mstyle. 3) As Karl suggested, nsTextRunTransformations should be modified to do the mapping when TEXT_IS_IN_TOKEN_MATHML is set. Basic rules are: - if there is an explicit -moz-mathvariant value, use it. - otherwise single mi maps to "italic" ; multiple mi and other token elements to "normal". Use the TEXT_IS_IN_SINGLE_CHAR_MI as a help. - the mapping should only be done for characters where a transformation exists: latin/greek alphabet and digits (see https://en.wikipedia.org/wiki/Mathematical_alphanumeric_symbols) 4) You'll have to clean up the old CSS rules in mathml.css, the invariant char and mi code etc You can have a look at my experimental patches that I wrote some time ago. These are probably not quite correct but will give you an idea of the places to modify: https://github.com/fred-wang/MozillaCentralPatches/blob/master/mathml-token-1.diff https://github.com/fred-wang/MozillaCentralPatches/blob/master/mathml-token-2.diff https://github.com/fred-wang/MozillaCentralPatches/blob/master/mathml-token-3.diff https://github.com/fred-wang/MozillaCentralPatches/blob/master/mathml-token-4.diff https://github.com/fred-wang/MozillaCentralPatches/blob/master/mathml-token-5.diff Note that this is what the spec says, but there is also the question of whether we should keep using font-weight and font-style to emulate bold, italic and bolditalic. Indeed some people may not have fonts with characters from the math alpha num block and will see "missing char" symbols...
Attached patch Part 0: CSS related changes (obsolete) (deleted) — Splinter Review
Other than some slight rearranging, there has been little change since the experimental patches. One thing I am stuck on is having -moz-mathvariant override fontstyle and fontweight. layout/style/nsRuleNode.cpp appears to be a possible place to make the change, but I am unable to access both text (mathvariant) and font (fontstyle/fontweight) properties at the same time.
Attachment #807752 - Flags: feedback?(fred.wang)
Latin, Greek and number transforms are implemented. Arabic? ones are not. I've tested this code and it works properly, but I'm not sure if the algorithm is ideal. There are a large number of exceptions which results in a long switch statement. I'm not certain that the TEXT_IS_IN_SINGLE_CHAR_MI flag has an appropriate bit, but the reserve areas are already full and I couldn't find a user of 59.
Attachment #807758 - Flags: feedback?(fred.wang)
At most superficial changes since the experimental version.
Attached patch Part 3: MathML frame changes (obsolete) (deleted) — Splinter Review
Attachment #807760 - Flags: feedback?(fred.wang)
Attached patch Part 4: Remoe legacy stuff (obsolete) (deleted) — Splinter Review
Little change from the experimental patch.
Attached patch Part 5: tests (obsolete) (deleted) — Splinter Review
Tests 1 and 2 work, test 3 doesn't as the functionality is not yet implemented. What else should I be testing for?
Attachment #807763 - Flags: feedback?(fred.wang)
Thank you James. That's great that you can take over my work on this! - (In reply to James Kitchener from comment #22) > Latin, Greek and number transforms are implemented. Arabic? ones are not. I think Arabic variants still don't have any unicode positions reserved... so let's ignore the arabic mathvariant for now. > I've tested this code and it works properly, but I'm not sure if the > algorithm is ideal. There are a large number of exceptions which results in > a long switch statement. Yes, unfortunately there are exceptions in these Unicode positions. The only thing I can think right now would be to rely on hash tables to remap these characters. But not sure if it's worth doing so, given that there are not so many exceptions. > I'm not certain that the TEXT_IS_IN_SINGLE_CHAR_MI flag has an appropriate > bit, but the reserve areas are already full and I couldn't find a user of 59. I think 59 is ok, per bug 785720 comment 18.
(In reply to James Kitchener from comment #25) > Created attachment 807761 [details] [diff] [review] > Part 4: Remoe legacy stuff > > Little change from the experimental patch. I think this is one of my old attempts for bug 415413. Could you please open a new bug that will block both bug 114365 and bug 415413? And extract from this patch the TEXT_FORCE_TRIM_WHITESPACE => TEXT_IS_IN_TOKEN_MATHML renaming + the changes to nsMathMLTokenFrame::ForceTrimChildTextFrames (renamed MarkTextFramesAsTokenMathML). These changes should be pretty trivial and thus easily accepted.
(In reply to James Kitchener from comment #26) > Created attachment 807763 [details] [diff] [review] > Part 5: tests > > Tests 1 and 2 work, test 3 doesn't as the functionality is not yet > implemented. > > What else should I be testing for? Some ideas: - default value of mathvariant for single-char mi and multi-char mi (tested by mi-mathvariant-1.xhtml) - single-char mi for unicode characters that are already italic (tested by mi-mathvariant-2.xhtml) - single-char mi for unicode characters for which there is no italic variant in Unicode (should not have any effect) - mathvariant on single-char mi (the mathvariant property should override the default italicness) - normal mathvariant mapping on mtext (mathvariant-1.html) - mathvariant exceptions on mtext (mathvariant-2.html) - mathvariant vs fontweight/fontstyle (mathvariant-3.html) - mathvariant on characters that are already in the Mathematical Alphanumeric Symbols or are exceptions (should not have any effect). - mathvariant on characters for which there is no equivalent mathvariant form in Unicode (should not have any effect) - mathvariant on multi-char token elements (should apply to all the characters) - mathvariant on mstyle (should apply to all token element descendants like single-char mi, mtext etc) Also, it might be possible that fixing this bug is all what we need to address the _moz- attribute issue in the html5lib test suite. So you may want to try Henri's patch from bug 527201. (In reply to James Kitchener from comment #21) > Created attachment 807752 [details] [diff] [review] > One thing I am stuck on is having -moz-mathvariant override fontstyle and > fontweight. layout/style/nsRuleNode.cpp appears to be a possible place to > make the change, but I am unable to access both text (mathvariant) and font > (fontstyle/fontweight) properties at the same time. I think it's not a too serious issue if you are not able to fix that. fontstyle/fontweight are deprecated and hopefully are not really used in combination with mathvariant. Also, people can set font-style and font-weight via CSS and in theory mathvariant should protect from style change too ; but this should not happen too often either. As comparison, MathJax or WebKit do not handle these cases very well.
Comment on attachment 807752 [details] [diff] [review] Part 0: CSS related changes Review of attachment 807752 [details] [diff] [review]: ----------------------------------------------------------------- Let's remove the Arabic mathvariant for now. What about making mathvariant a font property rather than a text property? I originally made it a text property since that looked more a character change than a style change, but I'm not quite sure what the rules are (BTW, you might want to check https://developer.mozilla.org/en-US/docs/Adding_a_new_style_property)
Attachment #807752 - Flags: feedback?(fred.wang) → feedback+
Comment on attachment 807763 [details] [diff] [review] Part 5: tests Review of attachment 807763 [details] [diff] [review]: ----------------------------------------------------------------- I haven't checked the details. You might want to check indentation and remove trailing whitespace. See also comment 29. ::: layout/reftests/mathml/mathvariant-2.html @@ +5,5 @@ > + </head> > + <body> > + <math> > + <mrow> > + <mtext mathvariant="italic ">h</mtext> extra space in the attribute value
Attachment #807763 - Flags: feedback?(fred.wang) → feedback+
Comment on attachment 807760 [details] [diff] [review] Part 3: MathML frame changes Review of attachment 807760 [details] [diff] [review]: ----------------------------------------------------------------- That looks good to me. Some things to remember to remove: - in mathml.css, the _moz-math-font-style and mathvariant rules. - in mathfont.properties: the mathvariant remappings.
Attachment #807760 - Flags: feedback?(fred.wang) → feedback+
(In reply to James Kitchener from comment #26) > What else should I be testing for? Also, you could test dynamic changes: - adding/removing/modifying mathvariant on an mstyle/token element - modifying the text content of an mi element (i.e. changing single-char <=> multi-char). In particular, I'm wondering if TEXT_IS_IN_SINGLE_CHAR_MI is set/reset correctly in that case.
Depends on: 919164
No longer depends on: 785956
Blocks: 69409
So strictly speaking, in MarkTextFramesAsTokenMathML you should only set TEXT_IS_IN_SINGLE_CHAR_MI when the nsBlockFrame child has only *one* nsTextFrame child and when this child has only one single char (modulo the whitespace trimming)
(In reply to Frédéric Wang (:fredw) from comment #29) > Also, it might be possible that fixing this bug is all what we need to > address the _moz- attribute issue in the html5lib test suite. So you may > want to try Henri's patch from bug 527201. > I've tried a modified version of the testcase. _moz-math-columnalighn and _moz-math-rowalign are still present but there are no _moz- attributes for various combinations of fontweight, fontstyle and mathvariant. (My copy of the tree is a few days old, so it may have been fixed in the meantime).
(In reply to James Kitchener from comment #35) > (In reply to Frédéric Wang (:fredw) from comment #29) > I've tried a modified version of the testcase. > > _moz-math-columnalighn and _moz-math-rowalign are still present but there > are no _moz- attributes for various combinations of fontweight, fontstyle > and mathvariant. > > (My copy of the tree is a few days old, so it may have been fixed in the > meantime). Yes, these attributes will be removed when bug 731667 is fixed. I just meant that the html5lib test suite is likely to not contain any columnalign/rowalign attributes so perhaps it's not necessary to wait for bug 731667 to remove the workaround in parser_datreader.js (cf Henri's patch) and that your patch will be enough. BTW, your patch may also allow to pass test 32 of http://fred-wang.github.io/AcidTestsMathML/acid3/
(In reply to Frédéric Wang (:fredw) from comment #27) > Thank you James. That's great that you can take over my work on this! > > - (In reply to James Kitchener from comment #22) > > Latin, Greek and number transforms are implemented. Arabic? ones are not. > > I think Arabic variants still don't have any unicode positions reserved... > so let's ignore the arabic mathvariant for now. They were added to Unicode 6.1, in the U+1EE00–U+1EEFF block.
(In reply to Khaled Hosny from comment #37) > > I think Arabic variants still don't have any unicode positions reserved... > > so let's ignore the arabic mathvariant for now. > > They were added to Unicode 6.1, in the U+1EE00–U+1EEFF block. Ah thanks, I was reading the MathML3 draft and didn't see any mention http://www.w3.org/Math/draft-spec/chapter3.html I'll send a mail to the MathML mailing list.
So the Math WG has updated the draft and added a reference to the Unicode document: http://www.unicode.org/charts/PDF/U1EE00.pdf IIUC, "Double-struck symbols 1EEA1 ب ARABIC MATHEMATICAL DOUBLE-STRUCK BEH ≈ <font> 0628  arabic letter beh" means that <mtext mathvariant="double-struck">&#x0628;</mtext> will map to &#x1EEA1; but I'm not really sure about those that have an additional arrow: "Isolated symbols 1EE00 ا ARABIC MATHEMATICAL ALEF → FE8D  arabic letter alef isolated form ≈ <font> 0627  arabic letter alef" would mean that <mtext mathvariant="isolated">&#x0627;</mtext> maps to &#x1EE00; or &#xFE8D?
Blocks: 518592
(In reply to Frédéric Wang (:fredw) from comment #39) > would mean that <mtext mathvariant="isolated">&#x0627;</mtext> maps to > &#x1EE00; or &#xFE8D? U+FE8D is a compatibility character (and so are the whole Arabic Presentation Forms A and B blocks) and should not be used for anything: http://www.unicode.org/faq/middleeast.html#1.
Thanks Khaled. @James: David updated the XML Entity Declarations for Characters draft, so the mapping can also be found here: http://lists.w3.org/Archives/Public/www-math/2013Sep/0012.html Regarding the tests, I'm wondering if rather than just testing a few mappings like in mathvariant-1 and mathvariant-2 we should rather have an exhaustive list, similar to what Martin Robinson started to do in WebKit: https://bugs.webkit.org/attachment.cgi?id=186282&action=diff#a/LayoutTests/mathml/presentation/mathvariant.html_sec1
Depends on: 923890
Status update: Feedback up to comment 36 has been addressed. Revised patches will be uploaded when I am back. Support for Arabic mathvariant is blocked on actually supporting the new 1EExx characters. (I have filed bug 923890 for this). I have a plan for implementing it, but at present all it would accomplish is unreadable equations.
(In reply to Frédéric Wang (:fredw) from bug 923890 comment #3) > OK, I see. It seems to me that this should not block the mathvariant bug, we > can just do the mapping to the plane 1 characters and expect that people > will have the appropriate Arabic fonts installed. This is true for the > double-struck or fraktur latin characters too, so I don't think the Arabic > one should be handled differently. One of the concern I have though, is for > the very common case of single-char mi like <mi>x</mi> that currently > renders correctly without math fonts but might render incorrectly if we rely > on the Math Alpha Num block instead of the style=italic style. I'm not quite > sure what would be the best solution, here. It looks like I can selectively apply font-style and font-weight within nsMathVariantTextRunFactory::RebuildTextRun, without resorting to CSS. It should work on a character by character basis, excluding inappropriate characters (e.g. non-alphanumeric symbols). I'm planning on using it for bold, italic and bold-italic mathvariants.
No longer depends on: 923890
According to http://www.w3.org/2003/entities/2007doc/double-struck.html, dāl (0xO62F) maps to "ARABIC MATHEMATICAL DOUBLE STRUCK DAL" (U+1EEA3) But there appears to be a similar character (ḏāl (0x0630)). Is the second one (0x0630) left unchanged or would both be transformed to the same character?
(In reply to James Kitchener (:jkitch) from comment #44) > According to http://www.w3.org/2003/entities/2007doc/double-struck.html, dāl > (0xO62F) maps to "ARABIC MATHEMATICAL DOUBLE STRUCK DAL" (U+1EEA3) > > But there appears to be a similar character (ḏāl (0x0630)). > > Is the second one (0x0630) left unchanged or would both be transformed to > the same character? 0x0630 is there near the end of the table (the fourth from below).
(In reply to James Kitchener (:jkitch) from comment #43) > (In reply to Frédéric Wang (:fredw) from bug 923890 comment #3) > > OK, I see. It seems to me that this should not block the mathvariant bug, we > > can just do the mapping to the plane 1 characters and expect that people > > will have the appropriate Arabic fonts installed. This is true for the > > double-struck or fraktur latin characters too, so I don't think the Arabic > > one should be handled differently. One of the concern I have though, is for > > the very common case of single-char mi like <mi>x</mi> that currently > > renders correctly without math fonts but might render incorrectly if we rely > > on the Math Alpha Num block instead of the style=italic style. I'm not quite > > sure what would be the best solution, here. > > It looks like I can selectively apply font-style and font-weight within > nsMathVariantTextRunFactory::RebuildTextRun, without resorting to CSS. It > should work on a character by character basis, excluding inappropriate > characters (e.g. non-alphanumeric symbols). I'm planning on using it for > bold, italic and bold-italic mathvariants. Karl, any idea about these bold, italic and bold-italic mathvariants? Should we follow the spec and use the characters from the math alpha num char? Or keep using CSS style to guarantee correct display even when people don't have the appropriate fonts?
Flags: needinfo?(karlt)
(In reply to Frédéric Wang (:fredw) from comment #46) > (In reply to James Kitchener (:jkitch) from comment #43) > > (In reply to Frédéric Wang (:fredw) from bug 923890 comment #3) > > > OK, I see. It seems to me that this should not block the mathvariant bug, we > > > can just do the mapping to the plane 1 characters and expect that people > > > will have the appropriate Arabic fonts installed. This is true for the > > > double-struck or fraktur latin characters too, so I don't think the Arabic > > > one should be handled differently. One of the concern I have though, is for > > > the very common case of single-char mi like <mi>x</mi> that currently > > > renders correctly without math fonts but might render incorrectly if we rely > > > on the Math Alpha Num block instead of the style=italic style. I'm not quite > > > sure what would be the best solution, here. > > > > It looks like I can selectively apply font-style and font-weight within > > nsMathVariantTextRunFactory::RebuildTextRun, without resorting to CSS. It > > should work on a character by character basis, excluding inappropriate > > characters (e.g. non-alphanumeric symbols). I'm planning on using it for > > bold, italic and bold-italic mathvariants. > > Karl, any idea about these bold, italic and bold-italic mathvariants? Should > we follow the spec and use the characters from the math alpha num char? Or > keep using CSS style to guarantee correct display even when people don't > have the appropriate fonts? If we are considering OpenType math fonts (with MATH table) in the future, then following the spec is really the only option to get proper rendering.
So the "single char MI without mathvariant" case should be rendered using fontstyle, everything else uses the the normal character mapping?
(In reply to Khaled Hosny from comment #47) > If we are considering OpenType math fonts (with MATH table) in the future, > then following the spec is really the only option to get proper rendering. I'm not following the connection here. I didn't expect the MATH table to map bold, italic and bold-italic mathvariants. I thought mathvariant was a MathML quirk to support SMP characters using only the BMP plane, and to support these characters before math fonts were widely available. (In reply to Frédéric Wang (:fredw) from comment #46) > Karl, any idea about these bold, italic and bold-italic mathvariants? Should > we follow the spec and use the characters from the math alpha num char? Or > keep using CSS style to guarantee correct display even when people don't > have the appropriate fonts? MacOS 10.7 ships STIX, and Vista, and I assume later NT systems, have Cambria Math, so that leaves XP systems that don't have MS Office 2007 or PowerPoint Viewer, most of which systems will be unsupported after April. So, we could transform all these variants/characters to their mathematical alphanumeric symbols. But is there an advantage in doing that over using style and weight from within the
Flags: needinfo?(karlt)
... text run transformations?
Even if the advantage is just that it is simpler, that might be good enough reason. I don't feel strongly, but staying with style/weight feels safe. If code is added to support the italic case, then I assume it wouldn't be too difficult to also support bold?
The italics case actually consists of two subcases. 1. Only single char <mi> italics are supported 2. All mathvariant=italic usage is supported The difference between the two depends on whether the text-run can consist of both transformable and non-transformable characters. Assuming case 1, adding bold support will take some effort (the same amount of effort as supporting case 2 actually), but I know what needs to be done. For case 2, supporting bold is trivial.
Two precisions: - In any case, I think the "single char MI without mathvariant" must be treated the same as mathvariant="italic" (otherwise the corresponding reftest will fail anyway). And that italic/bold-italic/bold must use consistent methods. - When I said "keeping CSS style", I just meant applying style/weight somewhere in the code path rather than performing a unicode change. If James is able to do that directly from within the text run transformation code, then perhaps it is better (to group the mathvariant implementation in the same place and ensure that mathvariant inhibits other CSS style/weight rules). Personally, I'd prefer to follow the specification and do something clear and consistent ; using different methods for mathvariant has always been the source of confusion. I think it is safe to think that the Desktop systems will have the appropriate fonts installed. However, my concern is for the mobile platforms like Android or FirefoxOS, given that there have not been any progress on https://code.google.com/p/android/issues/detail?id=36011 or bug 775060...
Yes, thanks, mobile platforms are a good reason to use style/weight, and the idea of doing it from the transformation code sounds good to me. (In reply to James Kitchener (:jkitch) from comment #52) > The difference between the two depends on whether the text-run can consist > of both transformable and non-transformable characters. I would not expect a mix of transformable and non-transformable characters to be common, and transforming only some of the characters within one token seems odd to me. I'd be quite happy with an easy way out for this case, probably not transforming at all unless all characters should be transformed in the same way.
(In reply to Karl Tomlinson (:karlt) from comment #49) > (In reply to Khaled Hosny from comment #47) > > If we are considering OpenType math fonts (with MATH table) in the future, > > then following the spec is really the only option to get proper rendering. > > I'm not following the connection here. > I didn't expect the MATH table to map bold, italic and bold-italic > mathvariants. > > I thought mathvariant was a MathML quirk to support SMP characters using > only the BMP plane, and to support these characters before math fonts were > widely available. My understanding of using CSS styles here is that italic, bold etc. will use font styles just like regular text italic and bold. If true, this wouldn’t work with an OpenType math implementation since there is usually only a single math font with italic and bold etc. expected to be taken from the Math block, and glyphs there will have the proper italic correction, accent placement info, sub/superscript placement info and optical size variants, which all are required for proper rendering (not to mention that math italic can have different design from text italics).
Yes, I guess we are confronted to what to choose between the ideal situation with a single Open Type MATH and the intermediary hack using mixed fonts / styles. In the future, I think we want to avoid mixing fonts for stretchy characters. It seems that we should push to fix bugs regarding math font availability / autoinstallation or to better document how to use Web fonts. On a side note, a similar issue is that the mathml.css stylesheet contains default font-family for STIX or MathJax. If the text of the Web page is different, this may lead to inconsistency between text / math fonts ; especially the font-size difference may be significant. MathJax has hacks to try to recompute the font-size of the math elements and make it match the surrounding text, but ideally I'd like to have a clean solution here too and remove the font-family from the stylesheet in the future. Of course for all these problems, having a better support for math font would help.
(In reply to Khaled Hosny from comment #55) > (In reply to Karl Tomlinson (:karlt) from comment #49) > My understanding of using CSS styles here is that italic, bold etc. will use > font styles just like regular text italic and bold. If true, this wouldn’t > work with an OpenType math implementation since there is usually only a > single math font with italic and bold etc. expected to be taken from the > Math block, and glyphs there will have the proper italic correction, accent > placement info, sub/superscript placement info and optical size variants, > which all are required for proper rendering (not to mention that math italic > can have different design from text italics). Thank you. I had forgotten that math fonts may not have italic and bold members on the font family. The font system will slant and embolden glyphs when requested, but it will not be as good as using the intended glyphs. If sites know that math fonts are being used, perhaps because the site is providing the font, they can specify the appropriate Unicode code points for the mathematical alphanumeric symbols, but it is unfortunate that these are rendered differently. I guess we could eventually check whether the primary font supports mathematical alphanumeric symbols before choosing between a unicode or style transformation, but I don't want to hold back the work here waiting for such a solution. Moving the mathvariant style selection to the textrun transformation seems a step in the right direction, which can be improved later.
Attached patch bug-114365-testjs.patch (obsolete) (deleted) — Splinter Review
Attached patch bug-114365-testjs.patch (obsolete) (deleted) — Splinter Review
Attachment #816923 - Attachment is obsolete: true
Attached patch Part 0: CSS related changes (obsolete) (deleted) — Splinter Review
Minor changes since the last patch. Mathvariant is now a font property, rather than a text property and now overrides fontweight and fontstyle.
Attachment #807752 - Attachment is obsolete: true
Attachment #820245 - Flags: feedback?(fred.wang)
Attachment #820245 - Attachment description: css.diff → Part 0: CSS related changes
Changes since the last patch: 1. Arabic now supported 2. bold/italic now performed through fontweight/fontstyle rather than character transformations. The code also enforces the mathvariant overriding fontweight/fontstyle behaviour. 3. A hashtable is used for Arabic and Latin lookups. Any comments before I send this off to review?
Attachment #807758 - Attachment is obsolete: true
Attachment #807758 - Flags: feedback?(fred.wang)
Attachment #820249 - Flags: feedback?(fred.wang)
fontstyle and fontweight now report a deprecation warning.
Attachment #807759 - Attachment is obsolete: true
Attachment #820250 - Flags: review?(fred.wang)
Attached patch Part 3: MathML frame changes (obsolete) (deleted) — Splinter Review
Comments have been applied
Attachment #807760 - Attachment is obsolete: true
Attachment #820251 - Flags: review?(fred.wang)
Attached patch Part 4: Remove legacy stuff (obsolete) (deleted) — Splinter Review
Correct patch this time. This mostly removes the previous mathvariant implementation. There is some rearrangements made to mathml.css which were copied from your experimental patches, but I don't think there are any behavioural differences.
Attachment #807761 - Attachment is obsolete: true
Attachment #820253 - Flags: review?(fred.wang)
Attached patch Part 5: tests (obsolete) (deleted) — Splinter Review
Dynamic and exhaustive tests have been added.
Attachment #807763 - Attachment is obsolete: true
Attachment #820255 - Flags: review?(fred.wang)
Comment on attachment 820245 [details] [diff] [review] Part 0: CSS related changes Review of attachment 820245 [details] [diff] [review]: ----------------------------------------------------------------- ::: layout/style/nsRuleNode.cpp @@ +3286,5 @@ > const nsCSSValue* weightValue = aRuleData->ValueForFontWeight(); > + if (aFont->mMathVariant != NS_MATHML_MATHVARIANT_NONE) { > + // -moz-math-variant overrides font-weight > + aFont->mFont.weight = NS_FONT_WEIGHT_NORMAL; > + } if (eCSSUnit_Enumerated == weightValue->GetUnit()) { do you mean } else if { here?
Attachment #820245 - Flags: feedback?(fred.wang) → feedback+
Comment on attachment 820249 [details] [diff] [review] Part 1: TextFrame and Textrun transformation changes Review of attachment 820249 [details] [diff] [review]: ----------------------------------------------------------------- ::: layout/generic/nsTextRunTransformations.cpp @@ +534,5 @@ > + } > + if (aMathVar > NS_MATHML_MATHVARIANT_STRETCHED) { > + // Illegal value > + return aCh; > + } So if this is never reached, should it raise some kind of assertions? @@ +540,5 @@ > + if ('A' <= aCh && aCh <= 'Z') { > + baseChar = aCh - 'A'; > + varType = kIsLatin; > + } > + else if ('a' <= aCh && aCh <= 'z') { I think here and elsewhere the coding style is } else if { (I haven't checked if this is to be consistent with the rest of the file) @@ +541,5 @@ > + baseChar = aCh - 'A'; > + varType = kIsLatin; > + } > + else if ('a' <= aCh && aCh <= 'z') { > + baseChar = MATH_BOLD_SMALL_A-MATH_BOLD_UPPER_A + aCh - 'a'; Perhaps here and elsewhere, you should explain why you consider the difference MATH_BOLD_SMALL_A-MATH_BOLD_UPPER_A. @@ +550,5 @@ > + varType = kIsNumber; > + } > + else if (GREEK_UPPER_ALPHA <= aCh && aCh <= GREEK_UPPER_OMEGA) { > + if (aCh == HOLE_GREEK_UPPER_THETA) > + // Nothing at this code point is transformed Perhaps the exceptions HOLE_GREEK_UPPER_THETA, GREEK_LETTER_DIGAMMA && aMathVar == NS_MATHML_MATHVARIANT_BOLD etc that return immediately should considered at the beginning (after the aMathVar > NS_MATHML_MATHVARIANT_STRETCHED check). That way, you won't have to bother with them in the rest of the code. @@ +561,5 @@ > + + aCh-GREEK_LOWER_ALPHA; > + varType = kIsGreekish; > + } > + else if (0x0600 <= aCh && aCh <= 0x06FF) { > + no need for this blank line. @@ +567,5 @@ > + } > + else { > + > + switch(aCh) { > + case GREEK_UPPER_THETA: bad indent. The "case" statement should be shift by 2 spaces with respect to the "switch" statement. @@ +629,5 @@ > + > + varType = kIsGreekish; > + } > + > + if (varType == kIsNumber) { Can you add a comment to explain that how the Unicode code points for digits form contiguous blocks in the order bold, double-struck etc Similarly elsewhere and how it relates to the computation of baseChar above. @@ +677,5 @@ > + } > + // exclude _NONE and _NORMAL > + return baseChar + MATH_BOLD_UPPER_ALPHA + > + multiplier*(MATH_ITALIC_UPPER_ALPHA - MATH_BOLD_UPPER_ALPHA); > + The is extra space on this blank line and after the "MATH_BOLD_UPPER_ALPHA +". @@ +680,5 @@ > + multiplier*(MATH_ITALIC_UPPER_ALPHA - MATH_BOLD_UPPER_ALPHA); > + > + } > + > + extra blank line @@ +707,5 @@ > + else { > + > + // Must be Latin > + if (aMathVar > NS_MATHML_MATHVARIANT_MONOSPACE) { > + //Latin doesn't support the arabic mathvariants missing space between "//" and "Latin" @@ +711,5 @@ > + //Latin doesn't support the arabic mathvariants > + return aCh; > + } > + multiplier = aMathVar - 2; // Offset to avoid _NONE and _NORMAL variants > + tempChar = baseChar + MATH_BOLD_UPPER_A + extra space at the end of the line @@ +716,5 @@ > + multiplier*(MATH_ITALIC_UPPER_A - MATH_BOLD_UPPER_A ); > + } > + > + if (!gMathVarCharTable) { > + InitMathVarTable(); bad indent @@ +722,5 @@ > + uint32_t newChar; > + bool found = gMathVarCharTable->Get(tempChar, &newChar); > + if (found) { > + return newChar; > + } else if ( varType == kIsLatin) { extra space before varType @@ +1100,5 @@ > + nsAutoTArray<nsStyleContext*,50> styleArray; > + nsAutoTArray<uint8_t,50> canBreakBeforeArray; > + bool mergeNeeded = false; > + > + bool singleCharMI = extra space at end of line @@ +1107,5 @@ > + uint32_t length = aTextRun->GetLength(); > + const PRUnichar* str = aTextRun->mString.BeginReading(); > + nsRefPtr<nsStyleContext>* styles = aTextRun->mStyles.Elements(); > + > + uint8_t mathVar; extra space at end of line @@ +1131,5 @@ > + > + if (ch == ch2 && ch != 0x20) { > + // Don't perform the transformation if a character cannot be transformed. > + // There is an exception for whitespace as it both common and innocuous. > + doTransform = false; So doTransform is only for bold/italic/bold-italic, right? If so it would need a better name. There are other space chars that could happen for example "U+00A0 no-break space" is frequent in MathJax. Perhaps it's safer and more simple to keep the current behavior and always perform the CSS style... We will fix that later when we implement mathvarian bold/italic/bold-italic correctly. @@ +1135,5 @@ > + doTransform = false; > + } > + if (mathVar == NS_MATHML_MATHVARIANT_BOLD || > + mathVar == NS_MATHML_MATHVARIANT_BOLD_ITALIC || > + mathVar == NS_MATHML_MATHVARIANT_ITALIC) { bad indent @@ +1136,5 @@ > + } > + if (mathVar == NS_MATHML_MATHVARIANT_BOLD || > + mathVar == NS_MATHML_MATHVARIANT_BOLD_ITALIC || > + mathVar == NS_MATHML_MATHVARIANT_ITALIC) { > + // Undo the change as it will be handled as a font styling. Can you open a follow-up bug to implement mathvariant with only char-transforms in the future? And mention the bug here? @@ +1140,5 @@ > + // Undo the change as it will be handled as a font styling. > + ch2 = ch; > + } > + > + if (ch2 == uint32_t(-1)) { I'm not exactly sure when this happens, but using unsigned and -1 looks weird... @@ +1223,5 @@ > + } > + > + if (mergeNeeded) { > + // Now merge multiple characters into one multi-glyph character as required > + // and deal with skipping deleted accent chars The reference to "accent chars" that you copy from the case transform routine is not relevant here.
Attachment #820249 - Flags: feedback?(fred.wang) → feedback+
Comment on attachment 820250 [details] [diff] [review] Part 2: Parse the mathvariant, fontstyle and fontweight properties Review of attachment 820250 [details] [diff] [review]: ----------------------------------------------------------------- ::: content/mathml/content/src/nsMathMLElement.cpp @@ +652,5 @@ > + // "Specified the font style to use for the token. Deprecated in favor of > + // mathvariant." > + // > + // values: "normal" | "italic" > + // default: normal (except on <mi>) extra space at end of line.
Attachment #820250 - Flags: review?(fred.wang) → review+
Comment on attachment 820251 [details] [diff] [review] Part 3: MathML frame changes Review of attachment 820251 [details] [diff] [review]: ----------------------------------------------------------------- ::: layout/mathml/nsMathMLTokenFrame.cpp @@ +59,1 @@ > } I think all these cases can now be grouped in one big if statement that returns eMathMLFrameType_ItalicIdentifier? @@ +65,5 @@ > { > + nsIFrame* child = nullptr; > + uint32_t childCount = 0; > + > + // Set flags on child text frames extra space at end of line @@ +68,5 @@ > + > + // Set flags on child text frames > + // - to force them to trim their leading and trailing whitespaces. > + // - Indicate which frames are suitable for mathvariant > + // - flag single charancter <mi> frames for special italic treatment typo
Attachment #820251 - Flags: review?(fred.wang) → review+
Comment on attachment 820253 [details] [diff] [review] Part 4: Remove legacy stuff Review of attachment 820253 [details] [diff] [review]: ----------------------------------------------------------------- ::: layout/mathml/mathml.css @@ +74,5 @@ > + - scriptminsize -> -moz-script-min-size > + - scriptlevel -> -moz-script-level > + - mathsize -> font-size > + - mathcolor -> color > + - mathbackground -> background} extra }
Attachment #820253 - Flags: review?(fred.wang) → review+
Comment on attachment 820255 [details] [diff] [review] Part 5: tests Review of attachment 820255 [details] [diff] [review]: ----------------------------------------------------------------- Thanks, I haven't checked the details, but that looks a great set of tests. Can you remove the mathvariant-5.html.rej file? ::: layout/reftests/mathml/mathvariant-4.html @@ +7,5 @@ > + <math> > + <mrow> > + <mtext mathvariant="bold">&#x1d49c;</mtext> > + <mtext mathvariant="bold">&#x212c;</mtext> > + <mtext mathvariant="bold">&#x00e1;</mtext> So perhaps choose another mathvariant if we keep the current behavior for bold/italic/bold-italic.
Attachment #820255 - Flags: review?(fred.wang) → review+
Thanks James, the approach looks good to me (I'm still not sure about the best approach for part 1, though) and it's great to see mathvariant finally implemented properly and old hacks removed. I think you should ask David Baron to review part 0 and Karl Tomlinson to review part 1, as I'm less familiar about these parts of the code.
(In reply to Frédéric Wang (:fredw) from comment #67) > Comment on attachment 820249 [details] [diff] [review] > Part 1: TextFrame and Textrun transformation changes > > Review of attachment 820249 [details] [diff] [review]: > ----------------------------------------------------------------- > There are other space chars that could happen for example "U+00A0 no-break > space" is frequent in MathJax. Perhaps it's safer and more simple to keep > the current behavior and always perform the CSS style... We will fix that > later when we implement mathvarian bold/italic/bold-italic correctly. The existing implementation ensures that only valid characters are transformed. If it was changed to transform everything, it will break layout/reftests/mathml/mi-mathvariant-2.xhtml and regress bug 413115. The robust solution is to divide the textRun into transformable and non-transformable subStrings and process them piecewise. nsFontVariantTextRunFactory::RebuildTextRun() does this so I have an example implementation from which to work.
(In reply to Frédéric Wang (:fredw) from comment #67) > Comment on attachment 820249 [details] [diff] [review] > Part 1: TextFrame and Textrun transformation changes > > Review of attachment 820249 [details] [diff] [review]: > ----------------------------------------------------------------- > > @@ +1136,5 @@ > > + } > > + if (mathVar == NS_MATHML_MATHVARIANT_BOLD || > > + mathVar == NS_MATHML_MATHVARIANT_BOLD_ITALIC || > > + mathVar == NS_MATHML_MATHVARIANT_ITALIC) { > > + // Undo the change as it will be handled as a font styling. > > Can you open a follow-up bug to implement mathvariant with only > char-transforms in the future? And mention the bug here? > I've opened 930504 for removing the fontstyle/fontweight special case. I'll also try to structure this patch so that doing so is easy.
Blocks: 930504
(In reply to James Kitchener (:jkitch) from comment #73) > (In reply to Frédéric Wang (:fredw) from comment #67) > > Comment on attachment 820249 [details] [diff] [review] > > Part 1: TextFrame and Textrun transformation changes > > > > Review of attachment 820249 [details] [diff] [review]: > > ----------------------------------------------------------------- > > There are other space chars that could happen for example "U+00A0 no-break > > space" is frequent in MathJax. Perhaps it's safer and more simple to keep > > the current behavior and always perform the CSS style... We will fix that > > later when we implement mathvarian bold/italic/bold-italic correctly. > > The existing implementation ensures that only valid characters are > transformed. If it was changed to transform everything, it will break > layout/reftests/mathml/mi-mathvariant-2.xhtml and regress bug 413115. > > The robust solution is to divide the textRun into transformable and > non-transformable subStrings and process them piecewise. > nsFontVariantTextRunFactory::RebuildTextRun() does this so I have an example > implementation from which to work. OK, I see. So let's keep it that way until bug 930504 is fixed. But please add at least the non-breaking space &#xA0; since it is mentioned on http://www.w3.org/TR/MathML/chapter2.html#fund.collapse
Attached patch Part 0: CSS related changes (obsolete) (deleted) — Splinter Review
CSS related additions to support mathvariant. -moz-math-variant is an implementation detail which is explicitly inaccessible. People wishing to determine mathvariant status can query the mathvariant attribute within a MathML element.
Attachment #820245 - Attachment is obsolete: true
Attachment #822284 - Flags: review?(dbaron)
Feedback from comment 67 has been addressed and detailed comments describing the algorithm have been added. All or nothing transformation has been implemented for bold, italic and bold-italic with an exception for whitespace (at least for the moment. They will eventually adopt the standard behaviour in bug 930504). The remaining mathvariant options transform valid characters and pass untransformable characters unchanged (in this instance it is the easy option). Should the all or nothing behaviour be adopted here as well? I've looked at the standard, and it doesn't mention this situation.
Attachment #820249 - Attachment is obsolete: true
Attachment #822301 - Flags: review?(karlt)
Attached patch Part 3: MathML frame changes (obsolete) (deleted) — Splinter Review
Attachment #820251 - Attachment is obsolete: true
Attached patch Part 4: Remove legacy stuff (obsolete) (deleted) — Splinter Review
Attachment #820253 - Attachment is obsolete: true
Attached patch Part 5: tests (obsolete) (deleted) — Splinter Review
Attachment #820255 - Attachment is obsolete: true
(In reply to James Kitchener (:jkitch) from comment #77) > Created attachment 822301 [details] [diff] [review] > The remaining mathvariant options transform valid characters and pass > untransformable characters unchanged (in this instance it is the easy > option). Should the all or nothing behaviour be adopted here as well? I've > looked at the standard, and it doesn't mention this situation. I think only the characters that can be transformed should be transformed and the others should remain unchanged. IIUC, that it is what your patch does (except for italic/bold-italic/bold for now).
Attachment #817164 - Attachment is obsolete: true
Blocks: 941607
Attachment #822301 - Flags: review?(roc)
Attachment #822284 - Flags: review?(roc)
Attachment #822284 - Flags: review?(roc)
Attachment #822284 - Flags: review?(dbaron)
Attachment #822284 - Flags: review?(cam)
Comment on attachment 822301 [details] [diff] [review] Part 1: TextFrame and Textrun transformation changes Review of attachment 822301 [details] [diff] [review]: ----------------------------------------------------------------- ::: layout/generic/nsTextFrame.h @@ +24,5 @@ > // This state bit is set on children of token MathML elements > #define TEXT_IS_IN_TOKEN_MATHML NS_FRAME_STATE_BIT(32) > > +// XXX Not in reserved space. Need to find a permanent home? > +#define TEXT_IS_IN_SINGLE_CHAR_MI NS_FRAME_STATE_BIT(59) Need to document what this flag means! Actually I don't see why we need a frame state bit here. Can you explain that? ::: layout/generic/nsTextRunTransformations.cpp @@ +312,5 @@ > > +/* > + Entries for the mathvariant hash table. The data is divided into pairs. > + The first entry in each pair represents the hashtable key, the second > + contains the value. Please describe what the keys and values actually mean! It seems simpler and more efficient just to binary-search the table in-place (and make it pre-sorted) instead of building the hash table. Squeezing out every last drop of performance here just isn't worth it. @@ +315,5 @@ > + The first entry in each pair represents the hashtable key, the second > + contains the value. > + Table must be terminated with an 0 to avoid an infinite loop. > +*/ > +static uint32_t mathVarMappingTable[] = { static const @@ +1120,5 @@ > } > > void > +nsMathVariantTextRunFactory::RebuildTextRun(nsTransformedTextRun* aTextRun, > + gfxContext* aRefContext) I'd like to put nsMathVariantTextRunFactory into its own file(s) please.
Attachment #822301 - Flags: review?(roc)
Attachment #822301 - Flags: review?(karlt)
Attachment #822301 - Flags: review-
Comment on attachment 822284 [details] [diff] [review] Part 0: CSS related changes Review of attachment 822284 [details] [diff] [review]: ----------------------------------------------------------------- I think we want to treat this new property like the MathML properties in that if MathML is disabled, we want to be able to pretend that a whole nsStyleFont's worth of properties was specified even if -moz-math-variant wasn't. So please add a check for this new property in AreAllMathMLPropertiesUndefined in nsRuleNode.cpp. r=me with that and the comments below addressed. ::: layout/style/nsCSSPropList.h @@ +3352,5 @@ > + VARIANT_HK, > + kMathVariantKTable, > + CSS_PROP_NO_OFFSET, > + eStyleAnimType_None) > +#endif // !defined(CSS_PROP_LIST_EXCLUDE_INTERNAL) Please place this within the !defined(CSS_PROP_LIST_ONLY_COMPONENTS_OF_ALL_SHORTHAND) block just above so that an "all: reset" property doesn't reset -moz-math-variant. ::: layout/style/nsRuleNode.cpp @@ +3249,5 @@ > + // -moz-math-variant: enum, inherit, initial > + SetDiscrete(*aRuleData->ValueForMathVariant(), aFont->mMathVariant, > + aCanStoreInRuleTree, > + SETDSC_ENUMERATED, aParentFont->mMathVariant, > + NS_MATHML_MATHVARIANT_NONE, 0, 0, 0, 0); Since this is an inherited property, use the SETDSC_UNSET_INHERIT flag so that "unset" gets handled like "inherit". ::: layout/style/nsStyleConsts.h @@ +519,5 @@ > // defaults per MathML spec > #define NS_MATHML_DEFAULT_SCRIPT_SIZE_MULTIPLIER 0.71f > #define NS_MATHML_DEFAULT_SCRIPT_MIN_SIZE_PT 8 > > +// See nsStyleText nsStyleFont ::: layout/style/nsStyleStruct.cpp @@ +90,5 @@ > nsStyleFont::nsStyleFont(const nsFont& aFont, nsPresContext *aPresContext) > : mFont(aFont) > , mGenericID(kGenericFont_NONE) > , mExplicitLanguage(false) > + , mMathVariant(NS_MATHML_MATHVARIANT_NONE) How about we initialize mMathVariant inside nsStyleFont::Init, like the other MathML properties are.
Attachment #822284 - Flags: review?(cam) → review+
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #83) > ::: layout/generic/nsTextFrame.h > @@ +24,5 @@ > > // This state bit is set on children of token MathML elements > > #define TEXT_IS_IN_TOKEN_MATHML NS_FRAME_STATE_BIT(32) > > > > +// XXX Not in reserved space. Need to find a permanent home? > > +#define TEXT_IS_IN_SINGLE_CHAR_MI NS_FRAME_STATE_BIT(59) > > Need to document what this flag means! > > Actually I don't see why we need a frame state bit here. Can you explain > that? > Concretely, for most token frames the mapping @mathvariant to -moz-mathvariant is straightforward: <mtext>x</mtext> => normal <mtext mathvariant="fraktur">x</mtext> => fraktur except for <mi> whose default value is a bit special: <mi>x</mi> => italic <mi> x </mi> => italic <mi>sin</mi> => normal that is the default is "italic" for single-char <mi> (with whitespaces trimmed) and "normal" for multi-char <mi>. The logic is slightly more involved so I'm not sure we can do that in mathml.css or in nsMathMLElement::MapMathMLAttributesInto. Hence I've suggested to James to set a frame bit in nsMathMLTokenFrame::MarkTextFramesAsTokenMathML (see attachment 822303 [details] [diff] [review]) where we already set the TEXT_IS_IN_TOKEN_MATHML bit. > ::: layout/generic/nsTextRunTransformations.cpp > @@ +312,5 @@ > > > > +/* > > + Entries for the mathvariant hash table. The data is divided into pairs. > > + The first entry in each pair represents the hashtable key, the second > > + contains the value. > > Please describe what the keys and values actually mean! > > It seems simpler and more efficient just to binary-search the table in-place > (and make it pre-sorted) instead of building the hash table. Squeezing out > every last drop of performance here just isn't worth it. > Yes, I don't know why I suggest that. Sorry, James!
Attached patch Part 0: CSS related changes (obsolete) (deleted) — Splinter Review
Review comments addressed.
Attachment #822284 - Attachment is obsolete: true
Attached patch Part 0: CSS related changes (obsolete) (deleted) — Splinter Review
Overlooked something.
Attachment #8338329 - Attachment is obsolete: true
Comment on attachment 8338333 [details] [diff] [review] Part 0: CSS related changes >diff --git a/layout/style/nsStyleStruct.h b/layout/style/nsStyleStruct.h >--- a/layout/style/nsStyleStruct.h >+++ b/layout/style/nsStyleStruct.h >@@ -109,16 +109,17 @@ public: > // size on this nsStyleFont? > bool mAllowZoom; // [inherited] > > // The value mSize would have had if scriptminsize had never been applied > nscoord mScriptUnconstrainedSize; > nscoord mScriptMinSize; // [inherited] length > float mScriptSizeMultiplier; // [inherited] > nsCOMPtr<nsIAtom> mLanguage; // [inherited] >+ uint8_t mMathVariant; // [inherited] > }; Sorry, should have mentioned this before: do you mind moving mMathVariant just beneath mScriptLevel to reduce the size of an nsStyleFont (because of alignment of its members)?
Review comments have been addressed. The hashtable has been converted into several arrays which are selected as needed. This removes the need to encode the data, ensuring that all values actually represent unicode character points.
Attachment #822301 - Attachment is obsolete: true
Attachment #8338340 - Flags: review?(roc)
Attached patch Part 0: CSS related changes (obsolete) (deleted) — Splinter Review
Done
Attachment #8338333 - Attachment is obsolete: true
Assorted whitespace fixes.
Attachment #8338340 - Attachment is obsolete: true
Attachment #8338340 - Flags: review?(roc)
Attachment #8338348 - Flags: review?(roc)
Attached patch Part 5: tests (obsolete) (deleted) — Splinter Review
This now includes a fix for mpadded-7, mpadded-8 and mpadded-9. Characters such as '|' and '_' are no longer transformed by mathvariant, breaking the test. It now uses style="font-family: monospace" which should be equivalent to the older mathvariant implementation. I've also added tests for mathvariant="monospace" (using 'i' and 'X', unless you can think of better representative characters).
Attachment #822305 - Attachment is obsolete: true
Attachment #8338347 - Attachment description: css.diff → Part 0: CSS related changes
Attachment #8338348 - Attachment description: transform.diff → Part 1: TextFrame and Textrun transformation changes
Attached patch Part 0: CSS related changes (obsolete) (deleted) — Splinter Review
Updated patch from James to fix a build error.
Attachment #8338347 - Attachment is obsolete: true
Attached patch Part 4: Remove legacy stuff (deleted) — Splinter Review
Refresh patch + remove _moz_fontstyle from GkAtomList.h
Attachment #822304 - Attachment is obsolete: true
Results from James: https://tbpl.mozilla.org/?tree=Try&rev=f1bd8d4d4f85 So the new assertions seems to be due to the verification added in AreAllMathMLPropertiesUndefined. I believe the SetDiscrete on mathvariant will set it not the MATHVARIANT_NONE even when MathML is not used. As a workaround, I also allowed aRuleData->ValueForMathVariant()->GetUnit() == eCSSUnit_Initial but perhaps there is a best way to fix that.
Comment on attachment 822303 [details] [diff] [review] Part 3: MathML frame changes Review of attachment 822303 [details] [diff] [review]: ----------------------------------------------------------------- ::: layout/mathml/nsMathMLTokenFrame.cpp @@ +46,5 @@ > + mathVariant == NS_MATHML_MATHVARIANT_ITALIC || > + mathVariant == NS_MATHML_MATHVARIANT_ITALIC || > + mathVariant == NS_MATHML_MATHVARIANT_BOLD_ITALIC || > + mathVariant == NS_MATHML_MATHVARIANT_SCRIPT || > + mathVariant == NS_MATHML_MATHVARIANT_BOLD_SCRIPT || So the remaining failing test on Linux for attachment 8338560 [details] seems to be due to the fact that MATHVARIANT_SCRIPT and MATHVARIANT_BOLD_SCRIPT are considered ItalicIdentifier rather than upright.
(In reply to Frédéric Wang (:fredw) from comment #95) > So the new assertions seems to be due to the verification added in > AreAllMathMLPropertiesUndefined. I believe the SetDiscrete on mathvariant > will set it not the MATHVARIANT_NONE even when MathML is not used. As a > workaround, I also allowed aRuleData->ValueForMathVariant()->GetUnit() == > eCSSUnit_Initial but perhaps there is a best way to fix that. Oh, you also need to skip that property in nsInitialStyleRule::MapRuleInfoInto (in nsStyleSet.cpp) if MathML is disabled.
Blocks: 923890
Blocks: 731667
Comment on attachment 8338470 [details] [diff] [review] Part 0: CSS related changes >+CSS_PROP_FONT( >+ -moz-math-variant, >+ _moz_math_variant, >+ MathVariant, For consistency with the other MathML properties (scriptminsize etc), shouldn't this be -moz-math-variant, math_variant, MathVariant, (without the _moz prefix)?
Attached patch Part 0: CSS related changes (deleted) — Splinter Review
Addresses comments 98 and 99.
Attachment #8338470 - Attachment is obsolete: true
Attached patch Part 3: MathML frame changes (deleted) — Splinter Review
Addresses comment 96.
Attachment #822303 - Attachment is obsolete: true
Comment on attachment 8338348 [details] [diff] [review] Part 1: TextFrame and Textrun transformation changes Review of attachment 8338348 [details] [diff] [review]: ----------------------------------------------------------------- Mostly cosmetic comments. Just about done :-) ::: layout/generic/MathVariantTextRunFactory.cpp @@ +3,5 @@ > + * License, v. 2.0. If a copy of the MPL was not distributed with this > + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */ > + > + > +#include "MathVariantTextRunFactory.h" Unnecessary blank line @@ +7,5 @@ > +#include "MathVariantTextRunFactory.h" > +#include "nsStyleConsts.h" > +#include "nsStyleContext.h" > +#include "nsTextFrameUtils.h" > +/* Blank line after #includes @@ +18,5 @@ > + > +struct MathVarMapping > +{ > + uint32_t key; > + uint32_t replacement; mKey, mReplacement @@ +34,5 @@ > + As a replacement, 0 is reserved to indicate no mapping was found. > +*/ > +#define ARABICINITIALSIZE 20 > +static const MathVarMapping arabicInitialMapTable[ARABICINITIALSIZE] = { > + { 0x628, 0x1EE21 }, gArabicInitialMapTable (same below) @@ +191,5 @@ > + > +// Finds a MathVarMapping struct with the specified key (aKey) within aTable. > +// aTable must be an array, whose length is specified by aNumElements > +static uint32_t > +mathvarMappingSearch(uint32_t aKey, const MathVarMapping* aTable, uint32_t aNumElements) MathvarMappingSearch @@ +199,5 @@ > + while (high > low) { > + uint32_t midPoint = (low+high) >> 1; > + if (aKey == aTable[midPoint].key) { > + return aTable[midPoint].replacement; > + } else if (aKey > aTable[midPoint].key) { No need for "else" after return. @@ +255,5 @@ > +#define MATH_BOLD_PI_SYMBOL 0x1D6E1 > + > + > +static uint32_t > +MathVariant(uint32_t aCh, uint8_t aMathVar) Document what this function does @@ +258,5 @@ > +static uint32_t > +MathVariant(uint32_t aCh, uint8_t aMathVar) > +{ > + > + uint32_t baseChar; Delete blank line @@ +275,5 @@ > + return aCh; > + } > + if (aMathVar > NS_MATHML_MATHVARIANT_STRETCHED) { > + NS_ASSERTION(aMathVar <= NS_MATHML_MATHVARIANT_STRETCHED, > + "Illegal mathvariant value"); Just NS_ERROR since we know the condition is false @@ +375,5 @@ > + varType = kIsGreekish; > + } > + > + if (varType == kIsNumber) { > + switch(aMathVar) { Space before ( (same below) @@ +447,5 @@ > + * lookup table. > + */ > + case NS_MATHML_MATHVARIANT_INITIAL: > + mapTable = arabicInitialMapTable; > + numElements = ARABICINITIALSIZE; Instead of using a #define here, just do "numElements = ArrayLength(gArabicInitialMapTable);" (from mfbt/Utils.h). @@ +472,5 @@ > + } > + newChar = mathvarMappingSearch(aCh, mapTable, numElements); > + } else { > + > + // Must be Latin Remove blank line @@ +484,5 @@ > + // characters are located within their unicode block (less an offset to > + // avoid _NONE and _NORMAL variants) > + // See the kIsNumber case for an explanation of the following calculation > + tempChar = baseChar + MATH_BOLD_UPPER_A + > + multiplier*(MATH_ITALIC_UPPER_A - MATH_BOLD_UPPER_A ); Remove space before ) @@ +488,5 @@ > + multiplier*(MATH_ITALIC_UPPER_A - MATH_BOLD_UPPER_A ); > + // There are roughly twenty characters that are located outside of the > + // mathematical block, so so the spaces where they ought to be are used > + // as keys for a lookup table containing the correct character mappings. > + newChar = mathvarMappingSearch(tempChar, latinExceptionMapTable, LATINEXCEPTIONSIZE); Use ArrayLength here too @@ +528,5 @@ > + bool doMathvariantStyling = true; > + > + for (uint32_t i = 0; i < length; ++i) { > + > + int extraChars = 0; Remove blank line here @@ +549,5 @@ > + mathVar == NS_MATHML_MATHVARIANT_BOLD_ITALIC || > + mathVar == NS_MATHML_MATHVARIANT_ITALIC) { > + if (ch == ch2 && ch != 0x20 && ch != 0xA0) { > + // Don't perform the transformation if a character cannot be > + // transformed. There is an exception for whitespace as it both common "as it is" @@ +649,5 @@ > + // the data would break the cache. > + aTextRun->ResetGlyphRuns(); > + aTextRun->CopyGlyphDataFrom(child, 0, child->GetLength(), 0); > + } > + Delete blank line
Comments addressed. Also some minor whitespace and comment improvements.
Attachment #8338348 - Attachment is obsolete: true
Attachment #8338348 - Flags: review?(roc)
Attachment #8340708 - Flags: review?(roc)
https://tbpl.mozilla.org/?tree=Try&rev=a703caf18ba6 One test failure specific to Android 2.2 I suspect there is some difference in the bounding box or font metric calculations between style="font-style:italic" and the italic fontstyle set by the TextRunTransformation. Is this something worth chasing up? The test will eventually need changing to avoid using explicit font-style when bug 930504 lands, as style="font-style:italic" and mathvariant="italic" will likely give different results.
Comment on attachment 8340708 [details] [diff] [review] Part 1: TextFrame and Textrun transformation changes Review of attachment 8340708 [details] [diff] [review]: ----------------------------------------------------------------- Thanks! ::: layout/generic/MathVariantTextRunFactory.cpp @@ +295,5 @@ > + if (aCh == GREEK_LETTER_DIGAMMA) { > + if (aMathVar == NS_MATHML_MATHVARIANT_BOLD) > + return MATH_BOLD_CAPITAL_DIGAMMA; > + else > + return aCh; Don't do "else" after "return". Just "return aCh;" after the "if" statement.
Attachment #8340708 - Flags: review?(roc) → review+
Curious that it's Android-only. (In reply to James Kitchener (:jkitch) from comment #104) > I suspect there is some difference in the bounding box or font metric > calculations between style="font-style:italic" and the italic fontstyle set > by the TextRunTransformation. But only on Android, which is odd. > Is this something worth chasing up? I think it's worth spending a little bit of time examining the possibilities. But not too much. > The test will eventually need changing to avoid using explicit font-style > when bug 930504 lands, as style="font-style:italic" and mathvariant="italic" > will likely give different results. In that case, it seems reasonable to modify the test appropriately now.
Comment 105 addressed. (In reply to Robert O'Callahan (:roc) (Mozilla Corporation) (away Nov 26-29) from comment #106) > But only on Android, which is odd. > This isn't the first time such behaviour has appeared in nsMathMLmmultiscriptsFrame related tests. Bug 827713 had one which would only appear with OSX10.6 (10.7+ were fine). I'll upload the test changes after a try run.
Attachment #8340708 - Attachment is obsolete: true
(In reply to James Kitchener (:jkitch) from comment #104) > https://tbpl.mozilla.org/?tree=Try&rev=a703caf18ba6 > > One test failure specific to Android 2.2 > > I suspect there is some difference in the bounding box or font metric > calculations between style="font-style:italic" and the italic fontstyle set > by the TextRunTransformation. I wonder whether the italic and normal fonts have different ascent/descent metrics on Android. If the font-style is set only in the TextRunTransformation, then that won't change the logical font metrics of the element, which are based on the element font-style, which is normal. That is OK, I think, but it would be nice to make the test continue to work. > The test will eventually need changing to avoid using explicit font-style > when bug 930504 lands, as style="font-style:italic" and mathvariant="italic" > will likely give different results. I wouldn't expect style="font-style:italic to help, but mathvariant="italic" in 414123-ref.xhtml should make it pass.
Attached patch Part 5: tests (deleted) — Splinter Review
Now fixes layout/reftests/mathml/414123 by comparing it with an explicit mathvariant, rather than faking it with font-style. Green try run https://tbpl.mozilla.org/?tree=Try&rev=a3143918816e
Attachment #8338352 - Attachment is obsolete: true
Keywords: checkin-needed
Depends on: 945509
Documentation changes: mathvariant is now fully implemented. It works by replacing the characters within the tag by new ones that correspond to the original character and the mathvariant selected which is located within the mathematical blocks of unicode. Each transformable character only supports a subset of the possible mathvariants. Characters without a valid representation are displayed unaltered. The lists of transformable characters and the mathvariant values they accept can be found at http://www.w3.org/TR/MathML2/chapter6.html#chars.letter-like-tables http://lists.w3.org/Archives/Public/www-math/2013Sep/0012.html A warning might be needed to say that users without fonts supporting the mathematical blocks (0x1D*** and 0x1EE**) will see missing character symbols instead. Note there is a special case for bold, italic and bold-italic. These are still implemented using font-style/fontweight to allow them to remain readable even with fonts lacking the mathematical blocks. The treatment of non-transformable characters is also handled differently. If a non-transformable character is found within the tag (other than certain whitespace characters), the transformation will not occur for any of the characters. This behaviour will eventually be standardised to follow the majority behaviour in bug 930504.
Jesse: please make sure this feature is covered by the MathML fuzzer.
Flags: sec-review?(jruderman)
(In reply to Frédéric Wang (:fredw) from comment #36) > BTW, your patch may also allow to pass test 32 of http://fred-wang.github.io/AcidTestsMathML/acid3/ FYI, I just checked that, and indeed we obtained one point more the MathML Acid 3 test!
> Jesse: please make sure this feature is covered by the MathML fuzzer. The attributes {mathvariant, fontweight, fontstyle} are already tested by my MathML-generation module. I was missing a few values for mathvariant, so I added those (fuzzing rev cbe72a97462a).
Flags: sec-review?(jruderman) → sec-review+
(In reply to James Kitchener (:jkitch) from comment #112) > Documentation changes: > > A warning might be needed to say that users without fonts supporting the mathematical blocks (0x1D*** and 0x1EE**) will see missing character symbols instead. I believe all the non-Arabic characters are already covered by the STIX fonts. Khaled Hosny is working on creating glyphs for his Amiri and XITS fonts (https://github.com/khaledhosny/xits-math/issues/36#issuecomment-29606565), so hopefully they will be available when Gecko 28 is released. We'll have to make these Arabic fonts available on MDN, in the font installers etc (bug 923890).
Whiteboard: [mentor=fredw][lang=c++][parity-webkit] → [documentation: see comments 112 and 116][mentor=fredw][lang=c++][parity-webkit]
Blocks: 1043358
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: