Closed Bug 215219 (CTL2) Opened 21 years ago Closed 17 years ago

add a bridge to Pango from Xft rendering code for CTL

Categories

(Core :: Layout: Text and Fonts, defect)

x86
Linux
defect
Not set
normal

Tracking

()

RESOLVED WONTFIX
mozilla1.7final

People

(Reporter: jshin1987, Assigned: jshin1987, NeedInfo)

References

Details

(Keywords: intl)

Attachments

(17 files, 15 obsolete files)

(deleted), text/plain
Details
(deleted), text/html
Details
(deleted), text/html
Details
(deleted), image/jpeg
Details
(deleted), text/html
Details
(deleted), image/jpeg
Details
(deleted), image/jpeg
Details
(deleted), image/png
Details
(deleted), text/html
Details
(deleted), text/plain
Details
(deleted), text/html
Details
(deleted), image/png
Details
(deleted), patch
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
(deleted), image/png
Details
(deleted), patch
Details | Diff | Splinter Review
There's a patch to take advantage of Pango (installed on the system) to render complex scripts at http://www.eng.buffalo.edu/%7Edsjoshi/mozilla-xft-ctl.html This is an alternative to bug 214715(going all the way to Pango) and Prabhat's approach summarized in bug 199857
I forgot to mention that this info. was kindly passed along by Aloc. Below are some thoughts of mine: A similar approproach was taken by SILA (http://sila.mozdev.org). We might be able go along a similar path on Windows (Uniscribe) and MacOS X (ATSUI). I just took a brief look at your patch and it looks fine (other than some factoring that need to be done and CTL range check that you wrote about). In addiiton to DrawString, text-dimension related functions have to be modified as well. > > --with-default-toolkit=gtk2). I don't see any reason that we need gtk2 for this patch. Is there any? > > xpcom module. It also bypasses the DrawingDataFillSpec funcition for > > these ranges (actually it is just a copy of the original > > DrawingDataFillSpec with minor modification to take care of changed > > glyph string lengths (for indic scripts ) > > xpcom module. It also bypasses the DrawingDataFillSpec funcition for > > these ranges (actually it is just a copy of the original > > DrawingDataFillSpec with minor modification to take care of changed > > glyph string lengths (for indic scripts ) We may be able to do some factoring here. In one aspect, I share Prabhat's concern. We have to extend this to support not just rendering but also 'editing' (cursor movement, selection, etc). Because Pango already offers APIs to get clustering info and there's Prabhat's implementation with PangoLite, it wouldn't be hard to find a wrapper for that to be used in nsTextFrame.cpp. Now we haev a couple of options in incorporating Pango with Mozilla: - bug 214715 : recast everything in Pango: it'd not be very hard to do this because Pango is very 'smart' (sometimes too smart). MathML support is a headache in this case. On the other hand, overall code would be a lot simpler and leaner. - the approach taken by Dhananjay : built upon the current code. MathML (and scripts not well supported by Pango but supported by Mozilla) can be supportd without fixing Pango. code would a bit more complex than now. The downside of both approaches is that Mozilla is dependent on Pango, but I guess it makes _a lot of_ sense to make use of it when available (the same is true of Uniscribe and ATSUI). Where it's not available, we can use PangoLite(as summarized in bug 199857) Perhaps, what we have to do is to make a consensus on the way we interact with/invoke Pango APIs (the system-installed or built-in PangoLite) so that from the consumers' (API users : renderer and nsTextFrame, etc) point of view, both are equivalent.
Summary: add a bridge to Pango from Xft rendering code for CTL : → add a bridge to Pango from Xft rendering code for CTL :
It seems like the patch was written in such a way that it can be an 'almost' drop-in replacement for PangoLite, which means that we can make either run-time or compile-time selection between two options. Reassinging to me for some polishing and impedance matching. I'm not able to work on this right now, but hoping to get back soon(?). BTW, I'm not yet sure which is better, the approach of bug 214715 or this. Perhaps in the long run, the former is better (assuming that we can solve MathML issue) than this. On the other hand, there are a couple of things we want to keep our grip on ('invisible' char. handling, transliteration, etc), for which the latter is better.
Assignee: prabhat.hegde → jshin
I more or less concur with what Jungshik says. However, before making a full commitment, i'd like to suggest to investigate ripping Indic OT layout from ICU. The reason being a number of OT layout bugs are fixed there (since we use it for Ooo) and also consistent testing on MSFT compatible OT fonts. The secondary reason is two-fold. One - we don't need additional work to support our non-Gnome friends. 2> Possibly can be re-used for windows 98 & 2k (where OT support does not exist for non Hindi scripts as far as i know.) I also propose restructuring nsULE so that edit operation API can be re-used. Thereby all that is needed will be shaping code. Secondly i would suggest a cache based performance enhancement which will eliminate the need to go through a clustering call for edit operations. prabhat.
I modify my comment earlier. Here is an option i would prefer. A> New compile-time switch to choose system pango. B> Use Dhanjay's approach (Re-use pango API for both shaping and edit operations). C> Possibly add new compile time switch *in case* we want to have an ICU layout to replace pango in Dhanjay's approach.
I worked on selection yesterday and this module works fine with prabhats cluster recognition code very well indeed. I have cleaned up the code a lot and fixed most bugs. please check the new screenshot at http://www.eng.buffalo.edu/~dsjoshi/mozilla-xft-ctl.html newer source code is also there. This module still needs ULE/ILE for selection but not dvng-x dsjoshi
Thanks for the update. I've given some thought to this bug on the road and I'm inclined toward a slightly different approach (more similar to what's done in SILA with Graphite fonts/rendering API). In SILA, Graphite fonts are identified by an 'invasive' method and a new font class is added (or it can be done by adding a 'flag' member variable in nsXftFontUnicode) for Graphite fonts. When a list of matching fonts (candidate fonts) is generated, Graphite fonts are idnetified and instances of the class for Graphite font are ctor'd. This font class has usual drawing and measuring methods implemented in terms of the Graphite API wrapper service(it's also written in xpcom). I had been pondering over this possibility with Pango, but hadn't come up with a way to identify 'Pango fonts' (there's no such thing as Pango fonts. What I meant is fonts that can render given char/strings with Pango APIs) until Dhananjay's patch 'inspired' me. Graphite fonts have a unique signature so that it's easy to identify them. What did I come up with? I don't have to identify them other than making sure that they're opentype fonts and meet other creteria (lang, family name, etc. If necessary, a static const precompiled CCMap can be added to nsFontMetricsXft.cpp for the Unicode range checking). We can either add a new class (say, nsFontXftOpenType) or flag it with a new boolean member variable in nsFontXftUnicode. Now here's a little problem. Unless there's a fontconfig API that lets us tell opentype fonts from other fonts, we have to give up 'lazy opening' of XftFont (GetXftFont) and resort to FT2 API to identify opentype fonts. In effect, Gfx:Win always 'looked into' fonts (not for identifying opentype fonts but for getting truetype cmaps. Therefore, ftang didn't have to worry about this when he implemented SILA. Anyway, this is why I used the word 'invasive' when describing SILA's idnetification of Graphite fonts. I'm adding Keith to CC to see what he thinks about adding an API to fontconfig for 'font type' identification if there isn't one yet. Incidentally, we can just go all the way to Pango (except for MathML and Korean - in case of the latter, until I fix Pango. In other words, except for 'custom fonts' listed in fontEncoding.properties file) and let Pango run the show. In that case, we don't need to worry about 'OTF' checking at all. This is an alternative to bug 214715 that enable us to keep a leverage (if/where necessary) while taking advantage of Pango's intelligence. I guess we still need 'Pango wrapper API' service(xpcom) because it's not only for rendering but also for editing/selection. Needless to say, there are a couple of 'combinations' of the above (run-time/compile-time Pango presence checking, ICU, etc...) I'm writing all this from memory(w/o actually looking at the code) so that there may be some loose ends.
in EnumarateGlyphs: A suitable font is determined. This font is passed to the DrawStringCTLCallback function along with the string. In DrawStringCallback: FcResult v; if(FcGetPattern(nsFontXFT::XftFont,FC_FAMILY,0,&v ==FcResultMatch) { //we got the family string v.u.s send it to pango with the unichar string //continue with the old function } in pango module use this family string to set the font for context this way we do not need to have anycode to identify fonts separately. Assumes: All the Fonts that are available to mozilla are also available to pango (actually pango gets to use all fonts, but mozilla only uses fonts that are particular rendering system xft/x. cause I cannot see any x fonts listed in the fonts prefs. only xft fonts. I might be wrong about this assumption though. dsjoshi
That's a safe assumption. At least in the Xft code we use fontconfig as does pango. Also, if we ever want to do downloadable fonts (I hope we do) we should just use the standard path code in fontconfig to add it to the path.
I dont' think my earlier comment was posted. //earlier comment I realised that in thai sometimes you get more glyphs than there are characters. So I now allocate 200% of the length to store glyph indices in DrawStringCTL callback function. So now it works for thai. Should work for other scripts too. Thai Screenshot and latest patch/source is at http://www.eng.buffalo.edu/~dsjoshi/mozilla-xft-ctl.html
I'd be concerned that there were characters that required more than two glyphs to render.
I Understood What blizzard is saying and consequently I have done a couple of things 1: at the expense of an additional function call ( first analyze text and find length) and next allocate memory accordingly and get the glyph indices. 2: wrote additional code to take care of the situation when pango_itemize returns more than one item earlier the patch was working for thai, only with the translation page on unicode webpage and not with bbc thai. After the additions it works for both satisfactorily (without crashing !! and without any assertions from pango) please check the screenshots and source at the same url as earlier. Jungshik could you please check if the screenshots of thai are correctly reordered. It is my belief that with minor modifications and optimizations it can be incorporated in the main trunk. dj
Sorry Thai was not rendered properly in your screenshot. The latest patch of yours has 'raghindi' hardcoded and I guess that's why it didn't work for Thai. BTW, I couldn't make 'configure' work with gtk2 + xft + ctl on RedHat 9 although I don't have any problem with gtk + xft + ctl (that can render Thai, Devanagari, Tamil and Korean well). For some reason, configure thinks that I don't have gmodule support. Anyway, I realized that to use Pango's opentype font support, we need gtk2/glib 2.x. So, this is for gtk2 port only.
Status: NEW → ASSIGNED
This is the 0th cut so that there are a lot of rough edges to smooth out. Because of the difficulty I have with xft+ctl+gtk2, I couldn't test this patch, which was written to implement what I mentioned earlier: > Incidentally, we can just go all the way to Pango (except for MathML and Korean > - in case of the latter, until I fix Pango. In other words, except for 'custom > fonts' listed in fontEncoding.properties file) and let Pango run the show. In > that case, we don't need to worry about 'OTF' checking at all. The directory organization under intl/ctl was on purpose to be distinct from the existing directories to avoid the 'collison' for the now. Obviously, we need to figure out the best way to organize things there. And, that's not the only thing but there are a lot to work out. Partly because we have to move around 'spacing vectors', APIs in nsICTLService are not so 'pretty'. Neiter are some changes made in nsFontMetricsXft.cpp. At the moment, 'spacing vectors' don't seem to be used often, but we can't ignore them because layout will certainly use them to fully implement CSS/XSL
I probably should have checked whether the new source was correctly uploaded. Now it is. Regarding: xft+ctl+gtk2 I had the same problem with the source version 1.4 of mozilla A workaround for this, I found out is to edit the configure script search for "gmodule support" remove the test program compilation in the script or set the glib flags manually I do not have the problem with version 1.5a
Attached patch patch v0.1 (that works more or less) (obsolete) (deleted) — Splinter Review
I still have a trouble getting 'configure' script to work for gtk2+ctl+xft (1.5b trunk). I worked around it for the now by making intl/ctl get compiled even when SUNCTL is not defined. This patch can render Devanagari, Tamil, Thai with opentype fonts by invoking Pango APIs (Pango on my Linux box seems to have a little problem with ZWJ/ZWNJ). For MathML, Tamil and Korean, custom-encoded fonts can be used as well. Because Arabic and Hebrew are handled by Mozilla's own BIDI code, Pango APIs are not invoked on them. A better way is to disable Mozilla's own BIDI code for Gtk2(+CTL). As in Dhananjay's latest patch, I made multiple 'items' taken care of. Todo: 1. implement translating 'spacing' info in PangoGlyphString to 'spacing' to use with |FillSpecBuffer|. Currently, a blind copy is made. 2. ASCII/8bit code path has some problems. Sometimes the width/text dimension come out wrong. In ISO-8859-1/Windows-1252 encoding, the text inside a button doesn't get rendered. In UTF-8 or other encodings, buttons present no problem. 3. Split 'GetGlyphsForUCS4String' and 'GetGlyphsForUTF16String' in nsICTLService into 3 or 4 methods : Analyze (similar to Dhananjay's), RetrieveGlyphs, RetrieveSpacing (only invoked where necessary), FreeGlyphs (may not need this). 4. Think more about nsICTLService interface. - Do we need it as an xpcom service? - If so, we also have to implement PrevCluster, NextCluster or similar methods for the caret movement and selection. - Do we want them to be dependent on the rendering APIs (e.g. Pango)? We can just implement the Unicode grapheme boundary 'algorithm' independently of the rendering 'backend' - Do we want to wrap Uniscribe(Win32), PangoLite, ICU, ATSUI(MacOS X) APIs with this interface as well?
Attachment #129850 - Attachment is obsolete: true
> 2. ASCII/8bit code path has some problems. It turned out that for some reason pango_shape fills up glyphString with invalid glyph indices when a PS type1 font is selected. With 'Nimbus Roman No. 1'(type 1) selected, apparently pango_shape returns '0x373'(well beyond the 'glyph index' upperlimit of a type1 font) for 'A'. When I changed Western fonts to truetype fonts (Luxi Sans, etc), everything worked fine. It's strange that it doesn't work for Type1 fonts because pango_shape() appears to invoke pango_xft_font_get_glyph (), which in turn calls XftCharIndex(). XftCharIndex() sure works for both type1 and ttf. Besides, gedit and other gtk2 applications run fine with type1 fonts... > Sometimes the width/text dimension come out wrong. To measure the width, nsRenderingContextGFX invokes the UTF16 version of nsFontMetricsXft::GetWidth[1] even for ASCII strings while it invokes the 8bit version of DrawString for 8bit strings. My patch doesn't invoke Pango APIs in the ASCII code path. Therefore, the aforementioned problem with Type1 fonts only showed up in the form of misaligned text (char. drawing is fine). It also showed up when 'spacing' is pre-calculated in nsTextFrame.cpp with nsIRenderingContext::GetWidth() (e.g. when rendering <p style="text-align:justified">......</p>.) As for the font type/feature ID issue with fontconfig (comment #6), I've been discussing off-line with Keith. It happened that SIL (the home of Graphite among many other things linguistic) people also wanted a similar feature to identify Graphite fonts (in SILA : http://sila.mozdev.org, the method that got Keith to 'shudder' is used to identify Graphite fonts) and it was filed as fontconfig b# 105 ( http://fontconfig.org/cgi-bin/bugzilla/show_bug.cgi?id=105) [1] Actually, there's no 8bit version of GetWidth in nsFontMetricsGTK/nsFontMetricsXft
Attached patch v 0.2 (obsolete) (deleted) — Splinter Review
In this patch, I implemented the 'realignment' of 'spacing' array with glyphs. nsTextFrame.cpp assigns 'spacing' value to each character, but after the conversion to glyphs from characters, these values can't be used because char -> glyph mapping is not one to one in general. It still has a problem because for some strange reason, on my Linux, glyph.geometry.x_offset is not scaled (with the font size) while glyph.geometry.width is scaled where 'glyph' is PangoGlyphInfo Another problem is that family name and size are not sufficient to fully specify a font. We also need weight, style, variant, etc before calling pango_font_description*(), which means we have to move across Xpcom boundary a whole bunch of font properties. Perhaps, we can wrap them up all in nsFont with fields filled with 'matched' values. Being v0.2, it's littered with debug statements, quick and dirty/ad-hocish code and #if 0'd block. I'm not thrilled at this patch, but I'm uploading it to put it somwwhere safe :-) BTW, I didn't split nsICTLService::GetGlyphsForUCS4String into three methods in this patch. Maybe I'll do it later. Another BTW, nsICTLService is not scriptable (no intend to make it scriptable) so that I may just as well write C++ header file instead of nsICTLService.idl.
Attachment #129921 - Attachment is obsolete: true
I just realized that this approach (of getting glyph strings from Pango and feed them to Xft APIs) if opentype GPOS table is used. In that case, we have to retrieve not only glyphs but also x and y position info. as well from Pango. Perhaps, it's doable, but nsICTLService APIs would get more complicated. In this light, the approach of bug 214715 looks more attractive (only if we can come up with an 'escape clause' for MathML and scripts that are not well-handled by Pango that Mozilla supports well) As for aligning the spacing array with glyphs, there's write-up on a related issue at http://people.redhat.com/otaylor/grid-fitting
Attached patch v 0.3 (see also comment #22) (obsolete) (deleted) — Splinter Review
I moved new system-pango stuffs out of intl/ctl and into intl/ctl2. intl/ctl2 was made to get compiled in when gtk2 and xft are enabled (without CTL enabled). I did that way partly because I had a trouble compiling with gtk2+xft+ctl and partly because I wanted to make this bug 'orthogonal to' bug 203406 and bug 204286. As a result, nsFontMetricsXft.cpp is littered with #ifdef'd lines. Anyway, applying this patch to the trunk would get complex scripts rendered by Pango (just enable xft and choose gtk2 toolkit). Users of South and SE Asian scripts may want to use it while the overal strategy for CTL in Mozilla is worked out. I haven't yet solved the problem with aligning 'spacing' array with glyphs so that 'justified' runs of text in complex scripts wouldn't come out right.
Attachment #130110 - Attachment is obsolete: true
I have added elementary support for editing operations in other languages. Edited the following file in main mozilla source editor/libeditor/html/nsHTMLEditor.cpp In the HandleKeyPress Function screenshot and source at earlier url. Currently I am not indentifying the 'isogroup' of the keymap that is currently active since I haven't yet found a way to identify it in mozilla. cut-copy-paste-selection works fine
patch for file widget/src/gtk2/nsGtkKeyUtils.cpp
Comment on attachment 130373 [details] [diff] [review] v 0.3 (see also comment #22) >Index: gfx/src/gtk/Makefile.in .... >+ifdef MOZ_ENABLE_XFT >+ifeq ($(MOZ_WIDGET_TOOLKIT),gtk2) >+REQUIRES += ctl >+endif >+endif In the above lines, 'ctl' should be 'ctl2'. Somehow I must have uploaded the second latest patch(or a patch made before I corrected the above bug) instead of the latest. Rajkumar S kindly pointed out that to me.
Attachment #130373 - Attachment description: v 0.3 → v 0.3 (see also comment #22)
Fixed 3 problems with Jungshik Shin's patches so that it can be applied on a new tree and compiled. I have put a tarball of mozilla with this and Dhananjay Joshi's input patches at http://www.linuxense.com/oss/render/mozilla/index.html raj
Attached file Simple test case for Malayalam (deleted) —
This is a simple test case for testing Malayalam rendering. I have not included all complex cases, but will add them eventually. Latest version of this page can be found at http://www.linuxense.com/oss/render/mozilla/index-ml.html A free (beer) OT font for testing Malayalam rendering can be found at http://www.supersoftweb.com/Unicode.htm raj
Attached patch v 0.3a (Makefile.in changes) (obsolete) (deleted) — Splinter Review
Identical to attachment 130373 [details] [diff] [review] except for three changes in Makefile.in's that fix compilation errors (I didn't notice them because I ran make separately in each directory, but Raj came across them while compiling from the scratch). Attachment 131127 [details] [diff] doesn't include new files so that I'm uploading this with all the files included.
Attachment #130373 - Attachment is obsolete: true
Attachment #131127 - Attachment is obsolete: true
Re comment #5: > Pango on my Linux box seems to have a little problem with ZWJ/ZWNJ. This is a known pango bug (http://bugzilla.gnome.org/show_bug.cgi?id=118302) that causes the inhibition of conjuct formation by ZWNJ and the representation of (independent) half-form with ZWJ not to work. Re comment #20 and comment #21 : dsjoshi, can you file (a) separate bug(s) for those issues and add me to CC?
Attached file Malayalam test page (deleted) —
I combined my Devanagari test page (with code points shifted to Malayalam range), Tamil test page(ditto) and several test cases in Raj's page. This is rather to reveal Pango's shortcomings[1] so that this is not for this bug, in a sense. Nonetheless, I thought this may be useful here as well. [1] gedit has identical problems on linux and Mozilla-Win renders most test cases the same way as Yudit does on Linux, which is an indication that the font used (ThoolikaUnicode) is fine. Curiously, MS IE 6(that directly invokes Uniscribe) has some obvious defects that Mozilla (that relies on ExtTextOutW) doesn't have.
It seems that Control characters like ZWNJ/ZWJ are being shown as blank boxes (http://www.peacefulaction.org/sayamindu/Screenshot.png) Can those be supressed? This does not happen with other apps that use Pango, I I think this has something to do with mozilla.
Suppresing it is not to deal with the problem at its root(it's just hiding it). As I wrote in comment #26, it's a known Pango bug that is to blame(see Gnome bugzilla bug I quoted for details). When that's fixed, Mozilla with this patch will get automatically fixed.
I applied this patch to moz 1.5 on my redhat 9 system. I am interested in seeing indic languages work. configure cmd: ./configure --prefix=/home/jatin/mozilla-1.5 --enable-default-toolkit=gtk2 --enable-debug --disable-tests --enable-crypto --enable-xft --with-system-zilb --enable-ctl The string to be rendered by nsRenderingContextGTK::DrawString consists of unicode (indic) and ascii characters. When the string is rendered, indic characters are rendered correctly, but ascii is totally incorrect. If the string is only ascii - it renders correctly - as it goes thru a different set of functions. So my thinking is that the glyphs returned by nsSystemPangoLE::GetGlyphsForUCS4String for ascii are incorrect - probably because of incorrect font being selected. i have seen that the same patch works fine on gnome 2.4 integrated into the morphix CD released by the IndLinux.org team. So could this be only a configuration mistake for the font subsystem? I would like pointers on how i should go further in this regard Thanks
re: comment #30 I haven't had a chance to work on this recently. I'll take a look again. BTW, you don't need 'enable-ctl' for this patch. (see comment #19). see also bug 189433. Hmm, you managed to build with gtk2 + ctl.
Attached patch a patch updated to the trunk (obsolete) (deleted) — Splinter Review
Nothing has changed since attachment 131157 [details] [diff] [review], but nsFontMetricsXft.cpp has changed since September so that I had to update it. With this applied to 1.7alpha, I can't reproduce the problem reported in comment #30 with Malaylalam + English text. Both Latin letters and Malaylalam letters are rendered fine. Could you attach a sample file and a screenshot and tell me what font you used? I'll also test Devanagari and Tamil, soon.
Attachment #131157 - Attachment is obsolete: true
Attached image screenshot ( justified rendering test) (deleted) —
This is to show where I'm now. Five different 'justification' options are tested. Without any justification, it works fine. With 'text-align="justify"', combining character sequences don't get rendered as they should. At least, I manage to get them rendered 'together' even with 'letter-spacing: 0.2em'. Note that there's no difference between the 4th line and the 3rd line as far as 'intra-syllable' ('intra-grapheme-cluster') spacing is concerned, which indicates that each grapheme cluster is treated as a unit. This is done at the 'drawing' stage, but it has to move up eventually to 'nsText*" where 'spacing' is added (see bug 229896) Fonts used are Raghindi for Devanagari and ThoolikaUnicode for Malaylalam.
Attached patch a new patch (now supports text justification) (obsolete) (deleted) — Splinter Review
This patch now can render 'justified' text as well.I'm gonna upload a screenshot soon. With this patch, Mozilla's rendering ability is only limited by Pango. That is, what Pango can render can be rendered by Mozilla, too. The exception is Arabic/Hebrew because at the moment, I don't invoke nsICTLService for Arabic/Hebrew assuming that they're 'pre-shaped'.
Attachment #138292 - Attachment is obsolete: true
This test case uses Raghindi for Devanagair and ThoolikaTraditionalUnicode for Malaylalam. Both are freely available opentype fonts with GSUB/GPOS tables.
Attached image screenshot rendering attachment 138332 (deleted) —
compare this with attachment 138315 [details] (sorry for the quality. I degraded it too much). In this screenshot, you can see that all graphemes in justified text (four cases) are rendered exactly the same way as unjustified text (the last case)
This appears to be using the mini-pango in moz instead of the system pango installation?
added to comment #37 Does it possible to removes libmozpango.so libmozpango-thaix.so libmozpango-dvngx.so ... from Mozilla, and uses system Pango instead? pros: - lesser module maintenance - smaller footprint - get fresher functionalities/fixes of Pango, including new languages (libmozpango updates is more delayed) cons: - broken if inappropriate version/none of Pango found.
re comment #37 & #38. This code does not use mini-pango, which is built ONLY when 'enable-ctl' is turned on. I can't even build with 'enable-ctl' and 'gtk2'. The very purpose of this bug is to take advantage of the pango installed on the system instead of using mozilla's own mini-pango. I wonder how that fact (which was made clear from the beginning, comment #0) could be missed. If it did use minpango, it would not render anything other than Thai and Devanagari because minipango supports only Thai and Devanagari with 'custom-encoded' fonts. However, my screenshot was taken of with __opentype_ fonts for Devanagari and Malaylalam (raghindi and ThoolikaUnicode).
This is to show that with this patch Mozilla-Xft (+gtk2) is better than Mozilla-Win on Win2k/XP in complex script rendering. Mozilla-Win currently uses TextOutExW (or something like that) instead of Uniscribe APIs (used by MS IE, a rough equivalent of Pango). When rendering justified text (text-align: justify and/or letter-spacing), Mozilla-Win hands over one Unicode character at a time to TextOutExW. Without the 'context', TextOutExW just draws it with the default glyph. Therefore, only the last example (unjustified) in attachment 1383328 is rendered correctly by Mozilla-Win on Win2k/XP because Mozilla passes a sequence of characters to TextOutExW() with 'context' preserved.
re comment #30 : jatin, I guess your problem is related to what I described in comment #16. However, I couldn't reproduce it any more with type1 fonts (such as Luxi Serif). I have about a month old CVS version of Pango, fontconfig and Xft on my Linux box. I'm not sure which and what, but you may try updating those libraries to the CVS head and see if that makes a difference.
Attached patch patch 0.5(now includes intl/ctl2 directory) (obsolete) (deleted) — Splinter Review
I've just realized that I had not included intl/ctl2 in attachment 138331 [details] [diff] [review], which may have confused blizzard. This patch includes intl/ctl2 (the heart of the patch). My tree was somehow messed up and firebird didn't even start (I clobbered and am building it again) so that I'm not 100% sure of this patch.
Attachment #138331 - Attachment is obsolete: true
Comment on attachment 138471 [details] [diff] [review] patch 0.5(now includes intl/ctl2 directory) I just confirmed that this patch works as intended (that is, I recovered successfully from a slightly messed-up tree) Indian Linux distribution builders can apply this patch to the cvs. 1.6branch would have a couple of rejects, but can be easily fixed manually.
Alias: CTL2
Summary: add a bridge to Pango from Xft rendering code for CTL : → add a bridge to Pango from Xft rendering code for CTL
well, that page doesn't have 'many' scripts covered. :-) The following page has links to several Unicode example pages with a lot more comprehensive script coverage. http://www.i18nguy.com/unicode/unicode-example-intro.html Some may not work for complex scripts because non-opentype fonts are specified for complex scripts.
Yeah, that's exactly why I was confused. Let me know when you're ready for me to look at this, jshin.
Attached patch patch updated to the trunk (obsolete) (deleted) — Splinter Review
Updated the patch to the trunk. It's become a moving target :-). blizzard, thanks. I'll let you know when I'm ready. BTW, is there any way to build gtk1 (as opposed to gtk2) and pango together? I guess not, but if that's possible, I can avoid littering with '#ifdef ... GTK2'. I also removed APIs for cursor movement and selection because I now think they should be implemented in an XP-way per UTR #29 (see bug 229896). There's a little problem in that approach, though because UTR #29 and actual platform-dependent drawing APIs may not agree on what makes up 'grapheme clusters'. Anyway, that's a bit off-topic here. With cursor movement/selection related APIs removed, there's little reason to use xpcom because what's left of nsICTLService can be folded into nsFontMetricsXft.cpp (unless we want to offer an alternative to Pango, e.g. ICU or built-in pango-lite).
Attachment #138471 - Attachment is obsolete: true
With reference to Hebrew and Arabic, if Pango's shaping is at least as good as our own (which seems likely, but I don't know) we could use it, setting a hint on the rendering context as we do in Windows so that our own routines don't get called. http://lxr.mozilla.org/seamonkey/source/gfx/src/windows/nsRenderingContextWin.cpp#629
pango with gtk1.2 will never work, so don't bother using ifdefs.
The latest trunk code definitely makes it better. But can you guys see this page correctly? http://www.abhyuday.org/xprajna/html/jagat_prakash.php i am attaching a file of what I see in justified text.
Attached image justifies devnagri HTML page (deleted) —
This is how I see this page
Your screenshot looks odd. However, as far as I can see, my build renders the text exactly the same way as Pango does in gedit (needless to say, the *inter*-cluster spacing is different because it's justified and gedit doesn't have a notion of justification). BTW, why does the page have the following CSS? FONT-FAMILY: "Arial Unicode MS", Mangal, GH-Amrita; Arial Unicode MS has 'nominal' glyphs for Devanagari so that it'll deprive Mangal (a opentype font for Devanagari) of a chance to render Devanagari unless you set 'always use my fonts' in firebird (and __uncheck__ 'allows documents to use other fonts' in mozilla). Why don't you turn on 'always use my fonts' in firebird and set fonts for Devanagari to 'Raghindi' (or Mangal if you can use it) to see if there's any difference? The above CSS snippet should be Font-Family: Mangal, Raghindi, GH-Amrita, Arial MS Unicode, serif;
re comment #50, did you apply the latest patch (attachment 138670 [details] [diff] [review]) to the trunk source and build with gtk2 + xft?
Mozilla 1.6 with the patch in attachment 138471 [details] [diff] [review] applied is available at http://i18nl10n.com/mozilla/mozilla-i686-pc-linux-gnu-gtk2-pango.tar.gz It doesn't have an installer but just a gzipped tar ball (because I couldn't build mozilla-installer in a tree separate from the source tree). I'm hoping to move it to 'contrib' directory of ftp.mozilla.org.
Couple of notes here: The #ifdef GTK2 stuff is bordering on making a lot of this code unreadable. I want to find another way to do that or just say that it's time that we just say that if you want to use Xft you need to use gtk2 as well. The CTL interface to native pango seems like it doesn't need to be there. If we're going to use pango, and given my statement above, let's put it in the Xft file instead of using the interface. (Really, I want to have an nsFontMetricsPango.cpp instead, but history is going to leave us with a mis-named file.)
blizzard, I agree with you on both points. As for the second point, I already mentioned it in comment #47 (the last paragraph). We can always pull it out of nsFontMetricsXft.cpp if later we have another consumer (like printing code). > I want to find another way to do that or just say that it's time that we > just say that if you want to use Xft you need to use gtk2 as well As for the first point, do you have any idea as to how to avoid littering nsFontMetricsXft.cpp with #ifdef's without going with the second option? One way (not very attractive) would be to make a second copy of nsFontMetricsXft.* and name them nsFontMetricsPango.* The first copy is for gtk1+xft without Pango and the second is for gtk2+xft with Pango. Only one of them will be compiled depending on the configuration option. This makes it hard to maintain the code (we always have to do twice when we fix things). Anyway, there's a precedent, nsFontMetricsGtk.* and nsFontMetricsXlib.*. I'm for the second option if others are not against it. There should be no problem on Linux (but on other Unix, the situation may be different) That is, I can't think of anyone installing gtk2 but not pango on Linux. How about embedders and small devices? Then, I'm not sure. Their target markets may not require pango.
re comment #41 on the problem with PS type 1 font. My suspicion was sorta confirmed by Lin (linchear@rogers.com), the author of Pango module for Khmer (not yet committed to the Pango cvs). After upgrading Xft to the cvs trunk, the problem went away. BTW, he wrote to me that, as it should, my patch just works for Khmer with Pango supporting Khmer.
Just FYI, I made a 1.6 binary with my patch here applied and blizzard put it up at ftp.mozilla.org. It's available at ftp://ftp.mozilla.org/pub/mozilla.org/mozilla/releases/mozilla1.6/contrib/mozilla-i386-pc-linux-gnu-gtk2-pango.tar.gz Please, refer to ftp://ftp.mozilla.org/pub/mozilla.org/mozilla/releases/mozilla1.6/contrib/README-pango for more information about the build. While using it as my everyday browser, I found mozilla crashes when rendering _some_ (not all) query results at bugzilla (the surest way is to query bugzilla with layout-ctl as a component). With a debug build (identical to the optimized build at ftp.mozilla.org), I got a stack trace, but the result is not very much illuminating because it crashes where it seems to have nothing to do with Gfx (and layout). I'll try to figure out what's wrong.
I'd rather not fork the file again. I suspect that there are people out there using gtk1.2 and xft together (this is how the Mozilla in Red Hat 8 shipped, for example) but I think that aside from that one instance it's time to assume that if you've got a working xft setup you're going to have a working gtk2 setup as well. That being said, let's make the plunge. As for printing, let's talk about that for a bit. I would love to make this file the central location for turning a string into a glyph string that both printing and on-screen rendering can use. From what I've seen you've had a little more experience understanding what needs to happen on the postscript side (don't you need the characters as well as the fonts?) Do you have an opinions on how to do this all at once?
The above and below base marks (abvm and blwm) are not rendered correctly. Mozilla seems to ignore abvm and blwm anchor points in pango+gtk2+xft+ctl enabled build of mozilla. I never finished submitting the bug, You'll need my font at http://www.ece.uic.edu/~msripada/delme/Devanagari.ttf
I confirmed the problem reported in comment #61 while talking to Blade_X over IRC. I mentioned the problem in comment #18, but forgot about it partly because with Raghindi font, it's not so easy to tell the difference. Blade_X's font (Devanagari : his font name is the same as the name of the script) makes it easy to tell the difference. I guess we can move forward this with the understanding that GPOS issue has to be fixed. re: comment #59. 1. Re: making gtk2 a requirement for Xft build : I'm willing to do it although I have a little reservation about shutting down the door for embedders for small devices who may want to stay with gtk1 + xft. BTW, SuSE has been including Gtk1 + xft builds for as long as RedHat has been, but I guess they have no trouble switching to Gtk2 + Xft. I noticed that Fedora core1 has gtk2 + xft (1.4) 2. Re: printing. Phew, that's tough. I wish I had a far better understanding as to how PS printing works on Mozilla. A lot of things appear to get tangled and I haven't figured out a way to sort them out without the code duplication. Just as much as you do, I really like to make 'the file' the center of Unicode strings to glyphs conversion for the screen rendering and printing.
I don't know what's required specifically, either. However, if we have a list of glyphs, files for those glyphs and possibly the original string as well, that might be enough. Adding the local PS dude for some ideas on how we can use pango to choose our fonts and the glyphs for postscript rendering. What does postscript need? And in what form would that information be useful?
This probably doesn't answer all of your questions, but it's a start. I'd be happy to try to answer any followup questions you have. The basic PS printing module only knows about the 13 standard postscript fonts. All of the font-metric (character width) information is built into the PS module; there's some code to load FM data for additional fonts, but it's broken (bug 203794). Each font is assumed to be available to the printer; there's no provision to embed fonts in the postscript output. The freetype extension gives access to additional fonts, and it will embed fonts in the postscript if needed. For string width calculations, mozilla uses the builtin font-metric tables for characters below 0x00ff. For other characters it just uses one of two hardcoded width values (cryllic and everything else). 8-bit strings are printed by selecting the current western font (e.g. Times-Roman) with a latin1 encoding vector, copying the raw text of the string into the postscript, and issuing a PS "show" command to draw the text on the page. To print 16-bit strings, mozilla uses a procedure called "unicodeshow". unicodeshow receives the string as a series of 16-bit UCS-2 character codes; one character at a time it tries to find an appropriate glyph and print it. Western text is normally handled by looking up the character code in a hash table which maps codes to glyph names; if the code is in the table, unicodeshow will try to print the specified glyph from the current western font. Unicodeshow can also look for the character code in a user-specified CID font, and an additional user-specified font using a user-specified code-to-glyph map. Both of these features require the user to set some hidden prefs and to add information to the PS output before printing. I don't know how many people go to the trouble of setting this up. In short, mozilla only really knows about the current font for latin1 characters. It doesn't really look at the strings being printed (other than the string width calculations); it just copies them into the postscript. Mozilla doesn't know anything about the user-supplied fonts other than their names, and it has no logic to switch to special fonts for math symbols, or to embed a font containing the Euro character in the document when needed.
This should be discussed in bug 190031. I should have referred to it when I wrote the last paragraph of comment #62. I'm now adding my response there (bug 190031).
I removed intl/ctl2 and included 'string to glyph' converter in nsFontMetricsXft.cpp. It's not yet ready for review. I have to do some clean-up. configure.in was changed so that xft + gtk1 is not supported any more.
Attachment #138670 - Attachment is obsolete: true
Let me know when you're ready for me to look at this. I'm chomping at the bit.
Attached patch update (obsolete) (deleted) — Splinter Review
I'd not say it's easy to read this patch. Anyway, blizzard, can you take a look? After a couple of iterations, it'll get clearer.
Attachment #141453 - Attachment is obsolete: true
Blocks: 233462
I'm being ambitious here. If blizzard can review soon and we can iterate a couple of times, we might make it in 1.7final. If not, we'll do it in 1.8alpha.
Target Milestone: --- → mozilla1.7final
Blocks: 197649
Hello, While the Pango-enabled Mozilla 1.6 binary in contrib works wonderfully for complex text such as Devanagari, it tends to mess up right-to-left text such as Arabic. Please see the accompanying attachment arabic-rendering.png for an example. The upper screenshot is that of regular Mozilla 1.6 rendering the BBC home page in Arabic, while the lower one shows the Pango-enabled Mozilla 1.6. As can be seen, the Pango-enabled browser shows some of the Arabic text in a left-to-right order, and some other lines in the correct right-to-left order. Just a heads-up; once these patches hit mainstream, I suppose it wouldn't be a good thing if they broke RTL support.
We'll have to fix that, to be sure. (dbaron points out that there might be some good information in bug 83958 about this.) The BIDI code probably reorders the characters before passing it down to this level. Is pango re-reversing them again? It's very possible. dbaron claims that there's a flag to indicate if the OS supports setting the directionality of the text, but I haven't found it in the code yet. This also gets tricky because the way this patch is right now (if I'm reading it correctly,) we might end up using pango for some of the text and not for other parts of the text.
Oh, while I'm thinking of it, it would be helpful if you could attach a very small html test case that we could use that mis-renders with pango.
(In reply to comment #71) > We'll have to fix that, to be sure. ...good information in bug 83958 about this.) That's for sure. I'll take a look at it. > The BIDI code probably reorders the characters before passing it down to this > level. Is pango re-reversing them again? It's very possible. Yes, the BIDI code does that, which is why I tried to exclude Arabic and Hebrew characters from Pango processing. However, my simplistic test (IsPreShaped() ?)) may miss some cases or mess up some non-trivial BIDI cases. > dbaron claims that there's a flag to indicate if the OS supports setting > the directionality of the text, but I haven't found it in the code yet. There is. See comment #48. However, I didn't touch that part because my intention was to exclude Arabic/Hebrew from the Pango processing in this bug. Later we can turn on that flag (as we do on Windows) when with Pango, we can do the job as well as layout does. > This also gets tricky because > the way this patch is right now (if I'm reading it correctly,) we might end up > using pango for some of the text and not for other parts of the text. My intent is to avoid using Pango whenever it makes sense (e.g. ASCII code path). Anyway, that could mess up BIDI cases. BTW, as you wrote, it'd be nice to have a simple test to demonstrate the problem. I appreciated comment #70, but without a simple test case clearly indicating which part gets rendered wrong, I can't do much.
Yeah, when I looked at the patch IsPreShaped() seemed _really_ hackish to me. Looking at the the code, it looks like we can use that hint for arabic since we know that we're going to shape it. That would solve this particular problem. Do we need to add a couple more kinds of hints to indicate that we are going to not use pango for the ASCII case? Is that even a problem? My immediate response is that it isn't. What about for hebrew for ltr and the various indian languages that require shaping? All of those are going to be outside of the ascii code path. So the more that I think about it the more that I think that adding those hints is just the right way to go.
(In reply to comment #74) > Yeah, when I looked at the patch IsPreShaped() seemed _really_ hackish to me. > Looking at the the code, it looks like we can use that hint for arabic since we > know that we're going to shape it. IsPreShaped() already checks for Arabic as well as for Hebrew. That's why I need a simplified test case to see what's wrong with the test in IsPreShaped(). > Do we need to add a couple more kinds of hints to indicate that we are going to > not use pango for the ASCII case? Is that even a problem? My immediate > response is that it isn't. The current patch does use Pango if it's PRUnichar* we're dealing with no matter what. It doesn't use Pango if we deal with 'char *' (which is supposed to be a pure ASCII string.) I'm not sure what your 'it' or 'that' is. Are you worried that Pango is used when PRUnichar* is purely made up of ASCII characters ?
(In reply to comment #73) ... > > There is. See comment #48. However, I didn't touch that part because my > intention was to exclude Arabic/Hebrew from the Pango processing in this bug. > Later we can turn on that flag (as we do on Windows) when with Pango, we can do > the job as well as layout does. > I don't understand the Linux-related technicalities here (I know the Windows world much better), but perhaps you guys can clarify what I am looking for. (If I should be looking at another bug, please tell me which one.) My requirement, or at least hope, is that Mozilla should display the fully pointed and accented text of the Hebrew Bible, on Linux (and all other platforms) as well as it now can on Windows. 1) Is there currently any way to do this, with the standard Mozilla releases on Linux? I have been told there is not, but I can't confirm this. 2) Could such support be provided with Pango? Would this in principle work with an OpenType font like SBL Hebrew (http://www.sbl-site.org/Resources/default.aspx) or Ezra SIL (http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&item_id=EzraSIL_Home - which by the way does not have Graphite tables)? 3) Is the best hope to see this support in Mozilla for this bug to be fixed and then the restriction mentioned above (from comment #73) to be lifted?
I have enabled ctl2 on Mdk cooker Mozilla 1.6 package (using patch 0.5) and I'm getting random crashing when loading some pages (gnomedesktop.org or when loggin on bugzilla.gnome.org) with the following stacktrace (sorry, it is not a full debug build) : Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 16384 (LWP 24445)] 0x41d1d45a in nsFontXftUnicode::GetFamilyName() () from /usr/lib/mozilla-1.6/components/libgfx_gtk.so (gdb) bt #0 0x41d1d45a in nsFontXftUnicode::GetFamilyName() () from /usr/lib/mozilla-1.6/components/libgfx_gtk.so #1 0x08a327a8 in ?? () #2 0x41d2f5ab in nsICharRepresentable::GetIID()::iid () from /usr/lib/mozilla-1.6/components/libgfx_gtk.so
jshin: do you have a version of this patch which works with the current trunk? I applied the latest version attached with some modifications, but it crashes when bringing up a browser window: #0 0x0775e4be in pango_font_description_free () from /usr/lib/libpango-1.0.so.0 #1 0x07763a03 in pango_context_set_font_description () from /usr/lib/libpango-1.0.so.0 #2 0x012dec3d in ConvertUCS4ToGlyphs (aFamilyName=0x8b9b638 "Nimbus Roman No9 L", aFontSize=0, aSrc=0xbfe95114, aSrcSpacing=0x0, aSrcLen=1, aP2T=0, aDest=@0xbfe920f0, aDestSpacing=0x0, aDestLen=@0xbfe920ec) at /home/tor/src/moz-pango/mozilla/gfx/src/gtk/nsFontMetricsXft.cpp:3121 #3 0x012dc0fb in nsFontXftUnicode::GetTextExtents32 (this=0x8c56338, aString=0xbfe95114, aLen=1, aIsASCII=0, aGlyphInfo=@0xbfe95010) at /home/tor/src/moz-pango/mozilla/gfx/src/gtk/nsFontMetricsXft.cpp:2151 #4 0x012dbbfe in nsFontXft::GetWidth32 (this=0x8c56338, aString=0xbfe95114, aLen=1) at /home/tor/src/moz-pango/mozilla/gfx/src/gtk/nsFontMetricsXft.cpp:2020 #5 0x012dacc9 in nsFontMetricsXft::GetWidthCallback (this=0x8b9abe0, aString=0xbfe95114, aLen=1, aFont=0x8c56338, aData=0xbfe98050) at /home/tor/src/moz-pango/mozilla/gfx/src/gtk/nsFontMetricsXft.cpp:1709 #6 0x012da33b in nsFontMetricsXft::EnumerateXftGlyphs (this=0x8b9abe0, aString=0xbfe95114, aLen=1, aCallback= {__pfn = 0x12dabba <nsFontMetricsXft::GetWidthCallback(unsigned int const*, unsigned int, nsFontXft*, void*)>, __delta = 0}, aCallbackData=0xbfe98050) at /home/tor/src/moz-pango/mozilla/gfx/src/gtk/nsFontMetricsXft.cpp:1529 #7 0x012da431 in nsFontMetricsXft::EnumerateGlyphs (this=0x8b9abe0, aString=0xbfe9810e, aLen=1, aCallback= {__pfn = 0x12dabba <nsFontMetricsXft::GetWidthCallback(unsigned int const*, unsigned int, nsFontXft*, void*)>, __delta = 0}, aCallbackData=0xbfe98050) at /home/tor/src/moz-pango/mozilla/gfx/src/gtk/nsFontMetricsXft.cpp:1550 #8 0x012d9964 in nsFontMetricsXft::RawGetWidth (this=0x8b9abe0, aString=0xbfe9810e, aLength=1) at /home/tor/src/moz-pango/mozilla/gfx/src/gtk/nsFontMetricsXft.cpp:1311 #9 0x012d875b in nsFontMetricsXft::CacheFontMetrics (this=0x8b9abe0) at /home/tor/src/moz-pango/mozilla/gfx/src/gtk/nsFontMetricsXft.cpp:949 #10 0x012d854b in nsFontMetricsXft::RealizeFont (this=0x8b9abe0) at /home/tor/src/moz-pango/mozilla/gfx/src/gtk/nsFontMetricsXft.cpp:883 #11 0x012d7bf3 in nsFontMetricsXft::Init (this=0x8b9abe0, aFont=@0x8b99da4, aLangGroup=0x8b96e70, aContext=0x8b86a18) at /home/tor/src/moz-pango/mozilla/gfx/src/gtk/nsFontMetricsXft.cpp:539 ...
any updates on this bug? it would be a good one to get for 1.8 if we can.
Flags: blocking1.8a+
Yeah, I'm making some progress offline.
Attached patch patch update (obsolete) (deleted) — Splinter Review
While blizzard is working on his patch (if i understand correctly, his patch is actually taking the approach suggested in bug 214715 and is quite a radical rewrite), here's my patch updated to the trunk for those who want to experiment with it.
Flags: blocking1.8a1+ → blocking1.8a1-
Blocks: 245157
Jungshik, I use Gentoo and depend on your patch to get mozilla compiled so that Devanagari shows up properly. I've filed a bug with Gentoo that the maintainers say that they won't fix. Could you update your patch so that it applies cleanly to moz 1.7 ? appending the build output below and am attaching the patch output as well.
Attached file failed patch attempt (deleted) —
I'll upload my 1.7 build (patched) soon to 'contributed' directory of ftp.mozilla.org.
Comment on attachment 151878 [details] failed patch attempt if you're uploading a text file, please say so.
Attachment #151878 - Attachment mime type: application/octet-stream → text/plain
Attached patch patch for 1.7 branch (obsolete) (deleted) — Splinter Review
This is for those who want to build their own 1.7branch builds with Pango patch applied.
+IsPreShaped(const FcChar32 *aString, PRUint32 aLen) +{ + while (--aLen) I often fear this kind of operations with unsigned integers. It will cause an infinite loop if started with zero.
(In reply to comment #87) > +IsPreShaped(const FcChar32 *aString, PRUint32 aLen) > +{ > + while (--aLen) > > I often fear this kind of operations with unsigned integers. It will cause an > infinite loop if started with zero. I haven't come across that problem in my build (that I use every day), but to be safe, I'll replace PRUint32 with PRInt32 and |--aLen| with |--aLen >= 0| wtih casting at call-sites. re: comment #70: > While the Pango-enabled Mozilla 1.6 binary in contrib works wonderfully for > complex text such as Devanagari, it tends to mess up right-to-left text such as > Arabic. I can't find any difference between Mozilla-Gtk2 (1.7) and Mozilla-Gtk2+Pango(1.7) when I tried BBC Arabic pages a couple of days ago. As blizzard wrote, can you write a small test case that reveals the problem?
> I can't find any difference between Mozilla-Gtk2 (1.7) and > Mozilla-Gtk2+Pango(1.7) when I tried BBC Arabic pages a couple of days ago. As > blizzard wrote, can you write a small test case that reveals the problem? Mozilla-Gtk2+Pango 1.6 and 1.7 (two builds, one with --enable-ctl and one with --disable-ctl) render it the same way for me -- Arabic text is rendered in the wrong direction. It's the same for Urdu (which uses a similar script). I'll see if I can narrow things down a bit when I have a little time. (I don't know Arabic, but I can read the script.)
OK, I've attached a simple test case that doesn't render well with the Pango patch in either the contrib 1.6 binary with gtk2+xft+pango or with my build of the 1.7 source with the latest patch (using gtk2+xft).
^ In the attached screenshots, the upper window is that of a patched Mozilla 1.7 and the lower one is that of stock Firefox 0.9 rendering the same page. (The lower one is correct; note how the glyphs are laid out left-to-right rather than right-to-left in the upper window.)
Attachment #144979 - Attachment is obsolete: true
(In reply to comment #91) > (The lower one is correct; note how the glyphs are laid out left-to-right > rather than right-to-left in the upper window.) I just verified with a build of stock 1.7; its rendering looks like the lower window.
Thanks a lot for the test case. I'll try to see what's wrong. All those characters belong to the range I'm filtering out (with IsPreShaped()) so that they're supposed to be treated the old way (instead of being handled by Pango).
+PRBool +IsPreShaped(const FcChar32 *aString, PRUint32 aLen) +{ + while (--aLen) + { + if (0x0590 <= aString[aLen] && aString[aLen] <= 0x6ff || + 0xfd1b <= aString[aLen] && aString[aLen] <= 0xfdff) + return PR_TRUE; + } + + return PR_FALSE; } I'm not sure where the values in the second line of the condition come from. I suspect you meant to write 0xfb1d and 0xfbff. You also probably want to test for the range 0xfe70-0xfefc.
(In reply to comment #94) > +PRBool > +IsPreShaped(const FcChar32 *aString, PRUint32 aLen) > + if (0x0590 <= aString[aLen] && aString[aLen] <= 0x6ff || > + 0xfd1b <= aString[aLen] && aString[aLen] <= 0xfdff) > > I'm not sure where the values in the second line of the condition come from. I > suspect you meant to write 0xfb1d and 0xfbff. You also probably want to > test for the range 0xfe70-0xfefc. Oh my gosh. I have no idea how I could mess that up. (actually, I guess I know. I meant '0xfb1d' but exchanged 'b' and 'd' when typing it.) I meant to cover all Hebrew and Arabic presentation form blocks in U+Fxxx (I still don't understand how come I missed Arabic presentation form B). I'll fix that and update the patch. Thanks a lot for catching it. BTW, you meant 0xfb1d - 0xfdff (instead of 0xfbff), right?
(In reply to comment #95) > Oh my gosh. I have no idea how I could mess that up. (actually, I guess I know. > I meant '0xfb1d' but exchanged 'b' and 'd' when typing it.) I meant to cover all > Hebrew and Arabic presentation form blocks in U+Fxxx (I still don't understand > how come I missed Arabic presentation form B). I'll fix that and update the > patch. Thanks a lot for catching it. BTW, you meant 0xfb1d - 0xfdff (instead of > 0xfbff), right? Are Hebrew and Arabic the only RTL scripts that come to Pango pre-shaped? (Checking for Unicode codepoint range sounds like a maintenance nightmare in the long run -- what if the ranges that come preshaped change? And are we sure these are the only ranges that correspond to RTL scripts?) Is there no way that's more maintainable (say a flag that indicates that Gecko already knows this is an RTL script and has reordered the glyphs prior to sending down to Pango)? Incidentally, the BBC Arabic page uses windows-1256 encoding. Would this range checking work for non-Unicode encodings, or is everything converted to UCS-2 (or UCS-4) internally?
(In reply to comment #96) > Are Hebrew and Arabic the only RTL scripts that come to Pango pre-shaped? > (Checking for Unicode codepoint range sounds like a maintenance nightmare in the > long run -- what if the ranges that come preshaped change? And are we sure these > are the only ranges that correspond to RTL scripts?) As I understand it, 0590-08FF and 00010800-00010FFF, plus the preshaped Presentation Forms which you have already discovered, are allocated or roadmapped for RTL scripts. See http://www.unicode.org/roadmaps/. Only some of these scripts have complex joining behaviour, and in principle all characters (even the Presentation Forms) can take combining marks. Microsoft ran into problems with Uniscribe in that at first it was used only for certain scripts. Later they realised that its shaping capabilities were needed even for Latin script. So the latest version of Uniscribe (comes with Office 2003) processes all characters, not just official complex script ones. Let's not make the same mistake here. So my advice would be not to filter out anything as preshaped but to let Pango decide which characters it needs to work on. As the Presentation Forms are in fact deprecated, there is little point in treating them sepcially just for performance reasons. > > Is there no way that's more maintainable (say a flag that indicates that Gecko > already knows this is an RTL script and has reordered the glyphs prior to > sending down to Pango)? Reordering and complex script shaping should be treated as independent processes. There are RTL scripts which are not complex, and there are LTR scripts which are complex.
> (In reply to comment #94) > BTW, you meant 0xfb1d - 0xfdff (instead of 0xfbff), right? I didn't, but now that I understand better what you were trying to do, I agree that 0xfdff is what you want. (In reply to comment #96) > Are Hebrew and Arabic the only RTL scripts that come to Pango pre-shaped? > (Checking for Unicode codepoint range sounds like a maintenance nightmare in the > long run -- what if the ranges that come preshaped change? And are we sure these > are the only ranges that correspond to RTL scripts?) I had the same concern when I filed bug 244036 :) > Incidentally, the BBC Arabic page uses windows-1256 encoding. Would this range > checking work for non-Unicode encodings, or is everything converted to UCS-2 (or > UCS-4) internally? Yes, to UTF-16 to be precise.(In reply to comment #95)
(In reply to comment #96) > (In reply to comment #95) > Are Hebrew and Arabic the only RTL scripts that come to Pango pre-shaped? > (Checking for Unicode codepoint range sounds like a maintenance nightmare in the > long run -- what if the ranges that come preshaped change? And are we sure This is not a long-term solution as my 'XXX' comment in the patch indicates. Eventually, I want to delegate as much as possible to Pango. Needless to say, Hebrew and Arabic are not the only RTL scripts, but they're the only scripts that reach Gfx pre-shaped and reordered in the _current_ mozilla partly because only for them, presentation-forms are encoded in Unicode. re comment #97 > Let's not make the same mistake here. So my advice would be not to filter > out anything as preshaped but to let Pango decide which characters > it needs to work on. Gee, you apparently haven't read my 'XXX' comment right above |IsPreShaped| in attachment 151928 [details] [diff] [review]. I'm very well aware of the need. Please, note that this is an interim solution only. blizzard has been working on a better solution. I'm updating this patch and am going to release my build for Indian people to use because I want to give them something to use right now instead of a year(?) from now.
Attached patch 1.7 brach patch updated (deleted) — Splinter Review
With the change per smontagu's comment, now it works for Arabic as well.
Attachment #151928 - Attachment is obsolete: true
This is the latest patch for the trunk.
Attachment #141894 - Attachment is obsolete: true
Attachment #148391 - Attachment is obsolete: true
The 1.7 patch is causing crash in epiphany 1.2.6 at startup (probably when loading a page), which complains about "free".
Attached patch glue patch for pango code (deleted) — Splinter Review
This code gets the code that I just checked in building. Pretty simple changes, actually. Should be easy to get into the tree.
Attachment #155008 - Flags: review?(jshin)
Comment on attachment 155008 [details] [diff] [review] glue patch for pango code I don't think this patch belongs here especially considering that you're planning to add nsFontMetricsPango. Why don't you attach it to bug 214715? Besides, it seems like some unrelated patches crept in.
Attachment #155008 - Flags: review?(jshin)
I've been having a problem with rendering Devanagri script on a build with this patch (215219.patch.1.7branch2). While certain web pages render properly (such as the BBC Hindi page at http://bbc.co.uk/hindi ) , some others like Google ( http://www.google.co.in/hi ) or my colleague's blog ( http://blog.sarai.net/users/ravikant ) use the font that unpatched builds do to render Hindi text ( example at http://www.geocities.com/kream77/badrendering.html - this is how the last webpage renders on my machine). The really frustrating thing is that it worked just fine in Mozilla 1.6 . If the patch can't be updated to fix this inconsistency in rendering, how do I find out which is the font that is being used to render hindi text by default ?
(In reply to comment #106) > I've been having a problem with rendering Devanagri script on a build with this > patch (215219.patch.1.7branch2). While certain web pages render properly (such > as the BBC Hindi page at > http://bbc.co.uk/hindi ) , some others like Google ( http://www.google.co.in/hi You may write to google to specify 'lang=hi' in their Hindi page. I wrote several times to add 'lang=xx' to all of their localized pages, but they'd not listen. > ) or my colleague's blog > ( http://blog.sarai.net/users/ravikant ) use the font that unpatched builds do > to render Hindi text ( example at > The really frustrating thing is that it worked just fine in Mozilla 1.6 . What exactly do you mean by Mozilla 1.6? Do you mean 1.6 without any patch. ( in that case, is it Xft or X11core build? ) or 1.6 with the patch for this bug? > If the patch can't be updated to fix this inconsistency in rendering, how do I > find out which is the font that is being used to render hindi text by default ? It depends on fonts you have on your system and fonts specified in the web page if there's no 'lang=hi' in the web page. Without 'lang="hi"' and with fonts like 'Arial, Helvetica' specified in the web page, Mozilla will use the first font that supports Devanagari character which may not be suitable for rendering text written in Devanagari because it doesn't have any opentype table for Devanagari. If you have pango 1.5.1(?) or later, you may want to build with '--enable-pango' configuration option. See bug 214715.
Re: comment # 107 > What exactly do you mean by Mozilla 1.6? Do you mean 1.6 without any patch. ( in that case, is it Xft or X11core build? ) or 1.6 with the patch for this bug? 1.6 with the patch for this bug. > Without 'lang="hi"' and with fonts like 'Arial, Helvetica' specified in the web page, Mozilla will use the first font that supports Devanagari character Aha! how do I find out which font this would be ? I'm desperate enough to try renaming it and seeing if anything bad happens or symlinking it to the proper Devanagari font. The webpage http://www.geocities.com/kream77/badrendering.html has been updated with examples of the way that different browsers render it - Mozilla, including Jungshik Shin's contributed build, Konqueror (perfectly) and Opera (imperfectly, again) and in a few hours i'll try and have an example of a patched Mozilla 1.6 rendering it properly.
re: comment # 106 > If you have pango 1.5.1(?) or later, you may want to build with '--enable-pango' configuration option. If I upgrade to >=pango-1.5.1, and configure mozilla with --enable-pango, do i still need to apply this patch ?
No, you shouldn't need this patch.
I'm still trying to track down the bug... now even a patched version of Mozilla 1.6 does not render properly. I did a fresh install and installed mozilla *without* adding any Hindi fonts but obviously there's some builtin font which has some **** Devanagari characters that is installed along with xorg-x11. Wish there was some way to find out which font it is. It's obviously now a problem with my dependencies - because indic characters in 1.6 used to work earlier. Here are it's dependencies, along with the package versions they belong to. Can someone tell me what I should downgrade ? Also, is this thing finally fixed in Moz 1.8 ? ldd /usr/lib/mozilla/mozilla-bin linux-gate.so.1 => (0xffffe000) libmozjs.so => /usr/lib/mozilla/libmozjs.so (0x40021000) net-www/mozilla-1.7.3 libplds4.so => /usr/lib/mozilla/libplds4.so (0x400a1000) net-www/mozilla-1.7.3 libplc4.so => /usr/lib/mozilla/libplc4.so (0x400a4000) net-www/mozilla-1.7.3 libnspr4.so => /usr/lib/mozilla/libnspr4.so (0x400a9000) net-www/mozilla-1.7.3 libpthread.so.0 => /lib/libpthread.so.0 (0x400dd000) sys-libs/glibc-2.3.3.20040420 libdl.so.2 => /lib/libdl.so.2 (0x400ee000) sys-libs/glibc-2.3.3.20040420 libgtk-x11-2.0.so.0 => /usr/lib/libgtk-x11-2.0.so.0 (0x400f2000) x11-libs/gtk+-2.4.9 libgdk-x11-2.0.so.0 => /usr/lib/libgdk-x11-2.0.so.0 (0x403bc000) x11-libs/gtk+-2.4.9 libatk-1.0.so.0 => /usr/lib/libatk-1.0.so.0 (0x4042c000) dev-libs/atk-1.6.1 libgdk_pixbuf-2.0.so.0 => /usr/lib/libgdk_pixbuf-2.0.so.0 (0x40446000) x11-libs/gtk+-2.4.9 libpangoxft-1.0.so.0 => /usr/lib/libpangoxft-1.0.so.0 (0x4045c000) x11-libs/pango-1.6.0 libpangox-1.0.so.0 => /usr/lib/libpangox-1.0.so.0 (0x40461000) x11-libs/pango-1.6.0 libpango-1.0.so.0 => /usr/lib/libpango-1.0.so.0 (0x4046d000) x11-libs/pango-1.6.0 libgobject-2.0.so.0 => /usr/lib/libgobject-2.0.so.0 (0x404a4000) dev-libs/glib-2.4.5 libgmodule-2.0.so.0 => /usr/lib/libgmodule-2.0.so.0 (0x404dc000) dev-libs/glib-2.4.5 libglib-2.0.so.0 => /usr/lib/libglib-2.0.so.0 (0x404e0000) dev-libs/glib-2.4.5 libX11.so.6 => /usr/X11R6/lib/libX11.so.6 (0x4055d000) x11-base/xorg-x11-6.7.0 libm.so.6 => /lib/libm.so.6 (0x40624000) sys-libs/glibc-2.3.3.20040420 libstdc++.so.5 => /usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.4/libstdc++.so.5 (0x40647000) sys-devel/gcc-3.3.4 libgcc_s.so.1 => /usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.4/libgcc_s.so.1 (0x40710000) sys-devel/gcc-3.3.4 libc.so.6 => /lib/libc.so.6 (0x40719000) sys-libs/glibc-2.3.3.20040420 /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000) sys-libs/glibc-2.3.3.20040420 libXrandr.so.2 => /usr/X11R6/lib/libXrandr.so.2 (0x4082a000) x11-base/xorg-x11-6.7.0 libXi.so.6 => /usr/X11R6/lib/libXi.so.6 (0x4082e000) x11-base/xorg-x11-6.7.0 libXinerama.so.1 => /usr/X11R6/lib/libXinerama.so.1 (0x40837000) x11-base/xorg-x11-6.7.0 libXext.so.6 => /usr/X11R6/lib/libXext.so.6 (0x4083a000) x11-base/xorg-x11-6.7.0 libXft.so.2 => /usr/X11R6/lib/libXft.so.2 (0x40848000) x11-base/xorg-x11-6.7.0 libfreetype.so.6 => /usr/lib/libfreetype.so.6 (0x4085a000) media-libs/freetype-2.1.5 libfontconfig.so.1 => /usr/lib/libfontconfig.so.1 (0x408c4000) media-libs/fontconfig-2.2.2 libXcursor.so.1 => /usr/X11R6/lib/libXcursor.so.1 (0x408eb000) x11-base/xorg-x11-6.7.0 libXrender.so.1 => /usr/X11R6/lib/libXrender.so.1 (0x408f5000) x11-base/xorg-x11-6.7.0 libpangoft2-1.0.so.0 => /usr/lib/libpangoft2-1.0.so.0 (0x408fd000) x11-libs/pango-1.6.0 libexpat.so.0 => /usr/lib/libexpat.so.0 (0x40924000) dev-libs/expat-1.95.7 libz.so.1 => /lib/libz.so.1 (0x40943000) sys-libs/zlib-1.2.1
Here's Mozilla's latest nightly incorrectly rendering Hindi text on my system.
Blocks: 270012
Blocks: 60546
For complex script users, I updated my patch for suite 1.7/aviary 1.0 branches. This patch also includes the patch for bug 234182 (ported to branches). Btw, with this patch applied, I can't reproduce the problem reported in attachment 152059 [details]. The rendering of Arabic with this patch is identical to that without this patch.
Is there a patched build for this? I can't get the patch to apply cleanly to mozilla-1.7.11. Is anyone interested in the .out file of the patch reject?
We don't need this in the current code base.
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Resolution: --- → WONTFIX
Component: Layout: CTL → Layout: Text
QA Contact: arthit → layout.fonts-and-text
comment 116 is a copy of comment 3, with spam added comment 117 is a copy of part of comment 71, with spam added comment 118 is a copy of comment 82, with spam added
Flags: needinfo?(askaboutwp)
(In reply to David Baron [:dbaron] ⌚️UTC+13 (busy, returning 8 February) from comment #119) > comment 116 is a copy of comment 3, with spam added > comment 117 is a copy of part of comment 71, with spam added > comment 118 is a copy of comment 82, with spam added This bug was WONTFIXED in 2008, and since then (with a sharp increase in the past year) practically nothing but spam. Leave it at that, or set restricted access?
Flags: needinfo?(dbaron)
The spam would probably just move elsewhere, but maybe dolske can set it.
Flags: needinfo?(dbaron) → needinfo?(dolske)
Restricting per comment 134.
Flags: needinfo?(dolske)
Restrict Comments: true
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: