Closed Bug 78201 Opened 23 years ago Closed 22 years ago

[BiDi]:Arabic 2 byte fonts don't seem to render / Add support for iso8859-6.8x fonts

Categories

(Core :: Layout: Text and Fonts, defect)

Sun
Solaris
defect
Not set
major

Tracking

()

RESOLVED FIXED

People

(Reporter: prabhat.hegde, Assigned: smontagu)

References

()

Details

(Keywords: relnote)

Attachments

(11 files, 6 obsolete files)

(deleted), patch (smontagu: review+)
On my BiDi builds on both Solaris and Linux [trunk nightly (04/29) + sources from erik@netscape], I am not able to view any Arabic output. All I get is the glyph for an invalid character (?). The font I am using is a 2-byte font based on iso-8859-6. [Please email me for the font, as I am not sure whether I can attach it here.] The problem may be with 1-byte input and 2-byte output (just guessing at this point). Test URLs: www.ayna.com, www.al-jazirah.com
Reassign to mkaply@us.ibm.com.
Assignee: nhotta → mkaply
What is the encoding of the two-byte font?
Changing QA contact to mahar@eg.ibm.com
QA Contact: andreasb → mahar
Move to BiDi Hebrew/Arabic component. prabhat: Please attach your font again, and please specify the MIME type correctly when you attach it. Is it a zip file? Is it true that the font is encoded in Unicode? What is the XLFD?
Component: Internationalization → BiDi Hebrew & Arabic
Reassign to katakai@japan.sun.com. katakai: this is a Linux font issue. I think you know the font code well enough to make this happen. Please work with bstell for details if you need help. The first question we want answered: is this font a TrueType font? An ISO-10646 font? What is the XLFD? If it is an ISO-10646 font, then we will use the ISO-10646 font path in GTK.
Assignee: mkaply → katakai
Mass-move all BiDi Hebrew and Arabic qa to me, zach@zachlipton.com. Thank you Gilad for your service to this component, and best of luck to you in the future. Sholom.
QA Contact: mahar → zach
QA to mahar.
QA Contact: zach → mahar
Blocks: 115714
What is the status of this bug ? Will this be fixed for 1.0 ?
Comment on attachment 32644 [details] The two byte arabic fonts that i tested with. Uhm... this tar.Z is missing the matching fonts.dir file... ;-(
*** Bug 136101 has been marked as a duplicate of this bug. ***
It looks as if we have to add some special voodoo to use the font encoding from attachment 36498 [details].
hi simon, I believe this is because the font calls itself "iso8859-6" while actually being unicode encoded "iso10646". prabhat.
prabhat: Do you know which encoding the following fonts (distributed with Solaris) have?
1. /usr/openwin/lib/locale/ar/X11/fonts/TrueType
2. /usr/openwin/lib/locale/ar/X11/fonts/Type1
hi roland, as i mentioned it is iso10646. As mentioned earlier it calls itself "iso8859-6". I am attaching arabic_font_info.tar.Z which has fonts.dir and also ttmap file for your info. prabhat
> Do you know which encoding the following fonts (distributed with Solaris)
> have:
> 1. /usr/openwin/lib/locale/ar/X11/fonts/TrueType
> 2. /usr/openwin/lib/locale/ar/X11/fonts/Type1
Are these fonts or are these directories?
Brian Stell wrote:
> > 1. /usr/openwin/lib/locale/ar/X11/fonts/TrueType
> > 2. /usr/openwin/lib/locale/ar/X11/fonts/Type1
>
> Are these fonts or are these directories?
These are directories which contain fonts for the X11 system (Solaris puts the X11-related code into /usr/openwin, but that's another story). Solaris separates its fonts per locale and then per font type ("ar" == Arabic locale, "TrueType"/"Type1" are the font types).
prabhat: I still have no luck; the '?' glyphs do not disappear... Hacking the fonts.dir and renaming the fonts to
-- snip --
NASKHMT.ttf -monotype-naskh-medium-r-normal--0-0-0-0-p-0-iso10646-1
NASKHBD.ttf -monotype-naskh-bold-r-normal--0-0-0-0-p-0-iso10646-1
-- snip --
causes Xsun to ask for a matching encoding file ("Cannot find encoding file for iso10646-1" in /var/dt/Xerrors). It looks like I have to search a little harder for a solution...
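Much of this thread turns on the last two XLFD fields (charset registry and encoding), such as the `iso10646-1` suffix in the fonts.dir entries above. As a minimal sketch, not Mozilla's actual font code, the charset can be pulled out of a well-formed 14-field XLFD like this. The names `SplitXlfd` and `XlfdCharset` are hypothetical and introduced only for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Split an XLFD font name into its 14 dash-separated fields.
// Returns an empty vector if the name does not start with '-' or does not
// have exactly 14 fields. (Hypothetical helper, for illustration only.)
std::vector<std::string> SplitXlfd(const std::string& xlfd) {
  std::vector<std::string> fields;
  if (xlfd.empty() || xlfd[0] != '-') return fields;
  std::string::size_type start = 1;
  for (;;) {
    const std::string::size_type dash = xlfd.find('-', start);
    if (dash == std::string::npos) {
      fields.push_back(xlfd.substr(start));  // last field: charset encoding
      break;
    }
    fields.push_back(xlfd.substr(start, dash - start));
    start = dash + 1;
  }
  if (fields.size() != 14) fields.clear();
  return fields;
}

// Return "registry-encoding" (fields 13 and 14), or "" for a malformed name.
std::string XlfdCharset(const std::string& xlfd) {
  const std::vector<std::string> fields = SplitXlfd(xlfd);
  if (fields.empty()) return std::string();
  assert(fields.size() == 14);  // guaranteed by SplitXlfd
  return fields[12] + "-" + fields[13];
}
```

For the first fonts.dir entry quoted above, `XlfdCharset` yields `iso10646-1`. A name ending in `-iso8859-6.8x` would yield `iso8859-6.8x`, which is the kind of string a font-selection table would have to recognize.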
hi roland, I don't think hacking the font encoding will help. As you mentioned, you also need to add a ttmap for the encoding, add an entry in ttmaps.dir, and so on. One idea is to change the converter currently used in nsFontMetrics for this font for Solaris only, i.e. enhance the iso8859-6 'SingleByteconverter' to handle 8859-6 + Unicode-encoded presentation forms A & B. prabhat.
Prabhat Hegde wrote:
> I don't think hacking font-encoding will help. As you mentioned, you also
> need to add ttmap for the encoding, add entry in ttmaps.dir and so on.
>
> One idea is to change the converter currently used in nsFontMetrics for this
> font for Solaris only. ie iso8859-6 'SingleByteconverter' to be enhanced to
> handle 8859-6 + unicode encoded presentation forms A & B.
I would prefer a solution which is not Solaris-specific, since many people use SPARCs with X terminals or copy these fonts around... Adding special encoder support and always treating the ISO-8859-6 fonts as double-byte fonts may be a solution. It looks like we simply need someone who hacks the ISO-8859-6 converter code a little bit... :)
prabhat: Do you know whether the PostScript Type1 fonts in /usr/openwin/lib/locale/ar/X11/fonts/Type1/ contain the presentation forms a&b, too ? If yes - how can I use them from the X11 API ?
> One idea is to change the converter currently used in nsFontMetrics for this
> font for Solaris only. ie iso8859-6 'SingleByteconverter' to be enhanced to
> handle 8859-6 + unicode encoded presentation forms A & B.
How would this work? Would it be applied to all iso8859-6 fonts? Would there be a different encoding, e.g. "iso8859-6x"?
Prabhat: please please please please do not attach files as zip or application/octet-stream. Attach a separate attachment for each file you want to attach, please. I don't have the right tool to look at your attachment from my Mac.
Hmm, I still could not find out how to browse Arabic with iso-8859-6 fonts. Has anyone succeeded?
- Sun's ar fonts: failed, as in this bug report
- Tried LangBox Arabic fonts from http://www.langbox.com/AraZilla/linux/arafontfull-1.2-4.i386.rpm
Does anyone know where to find correct iso-8859-6 fonts?
There are two ways I have succeeded in displaying Arabic characters without iso-8859-6 fonts:
1. Use FreeType with a font path to /usr/openwin/lib/locale/ar/TrueType. Great!! However, there is no printing solution for now if we use TrueType. We are planning to support Xprint for Arabic. The X server should be able to load and display Arabic glyphs.
2. Use iso10646-1 fonts. I tried Arabic BDF fonts from http://crl.nmsu.edu/~mleisher/download.html. It works well.
Comment on attachment 83639 [details] not working - langbox 8859-6x fonts from http://www.langbox.com/AraZilla/linux/arafontfull-1.2-4.i386.rpm AFAIK we do not have any mapping for "iso-8859-6x" fonts in our fontmetrics code
What about treating ISO-8859-6 fonts like iso10646-1 in our fontmetrics system, e.g. use the X11 API to query which glyphs are present in the font and not try to "guess" it based on the encoding ? That seems to work for the Solaris 2.8 fonts from /usr/openwin/lib/locale/ar/X11/fonts/TrueType (at least "xfd" treats them as 16bit font then...).
Does anyone know which encoding standard the fonts "SHA1____.PFA" and "SHA2____.PFA" from /usr/openwin/lib/locale/ar/X11/fonts/Type1/ implement ?
These encodings are new to me and are certainly not any of the common standard or proprietary encodings from ISO, ASMO, Microsoft, IBM or Apple, but each one is a folding of part of the Unicode set into 8 bits (assuming that they both start from 0x00 at the top left).
SHA1____.PFA is like a variant of ISO-8859-6:
0x00-0x7F - equivalent to ASCII
0x80-0xEF - equivalent to Unicode 0x0600-0x066F
SHA2____.PFA maps the presentation forms from Unicode 0xFE70-0xFEFF to 0x70-0xFF
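The two foldings described in this comment can be written down as plain range arithmetic. The following is only a sketch under stated assumptions: the presentation-form range is read as U+FE70-U+FEFF (Arabic Presentation Forms-B), and each range is assumed to map linearly with no gaps, which the comment suggests but does not verify glyph by glyph. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Fold a UTF-16 code unit into the 8-bit layout described for SHA1____.PFA:
// ASCII stays in place, U+0600..U+066F lands on 0x80..0xEF.
// Returns -1 when the character is not covered. (Hypothetical helper.)
int FoldForSha1(std::uint16_t ch) {
  if (ch <= 0x007F) return ch;
  if (ch >= 0x0600 && ch <= 0x066F) {
    const int code = 0x80 + (ch - 0x0600);
    assert(code <= 0xEF);  // 0x70 code points fit exactly into 0x80..0xEF
    return code;
  }
  return -1;
}

// Fold a UTF-16 code unit into the layout described for SHA2____.PFA:
// U+FE70..U+FEFF (assumed range) lands on 0x70..0xFF.
// Returns -1 when the character is not covered. (Hypothetical helper.)
int FoldForSha2(std::uint16_t ch) {
  if (ch >= 0xFE70 && ch <= 0xFEFF) {
    const int code = 0x70 + (ch - 0xFE70);
    assert(code <= 0xFF);  // 0x90 code points fit exactly into 0x70..0xFF
    return code;
  }
  return -1;
}
```

Under these assumptions U+0627 (ARABIC LETTER ALEF) folds to 0xA7 in the first font and is absent from the second, so a renderer would need both fonts to show shaped text.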
Simon Montagu wrote:
> These encodings are new to me and certainly not any of the common standard or
> proprietary encodings from ISO, ASMO, Microsoft, IBM or Apple, but each one is
> a folding of part of the Unicode set into 8 bits
> (assuming that they both start from 0x00 at the top left).
Mhhh... what about creating a new X11 font encoding (we may call it "sun.unicode.plane" :) which lists the Unicode plane and offset being used in the font (this would be sufficient for 8bit fonts; 16bit fonts would need multiple entries in fonts.dir)?
Clarification: We are talking about two things here. Solaris ships with two kinds of Arabic fonts:
1. TrueType fonts in /usr/openwin/lib/locale/ar/X11/fonts/TrueType/ - they are listed as *-iso-8859-6 fonts in /usr/openwin/lib/locale/ar/X11/fonts/TrueType/fonts.dir - but the Solaris TrueType font engine identifies them correctly as 16bit fonts.
2. PS Type 1 fonts in /usr/openwin/lib/locale/ar/X11/fonts/Type1/ - which seem to represent the Unicode blocks for Arabic (see comment #34) squished into 8bit fonts
For [1] I propose to check whether ISO-8859-6 fonts are 16bit X11 fonts. If they are 16bit fonts we should assume that they have the Arabic glyphs in the expected places.
For [2] I propose the idea listed in comment #35 (new X11 encoding scheme "sun.unicode.plane").
I filed a separate bug (bug 159430 ("[RFE] Add support for X11 fonts which represent single unicode blocks")) for the discussion around [2] from comment #36
BTW: I filed bug 158894 ("RFE: Document how to treat TrueType fonts as iso10646-1 with Solaris/Xsun") to document how users can use TrueType fonts with iso10646-1 encoding on Solaris/Xsun to view/print Arabic pages.
> 2. PS Type 1 fonts in /usr/openwin/lib/locale/ar/X11/fonts/Type1/ - which seem
> to represent the unicode blocks for arabic (see comment #34) squished into
> 8bit fonts
> For [1] I propose to check whether ISO-8859-6 fonts are 16bit X11 fonts.
Is it valid to use iso8859-x for a 16 bit font?
> If they are 16bit fonts we should assume that they have the arabic glyphs in
> the expected places
Is there any standard that describes this? Doing this if no standard exists makes me very nervous.
> For [2] I propose the idea listed in comment #35 (new X11 encoding scheme
> "sun.unicode.plane")
0x00-0x7F - equivalent to ASCII
0x80-0xEF - equivalent to Unicode 0x0600-0x066F
...Unicode 0xFF70-0xFEFF to 0x70-0xFF (ignore the typo)
These do not have a simple mapping, which the suggestion in comment #35 assumes.
Just to be clear: both ftang and I are happy to support these fonts and if there is no standard we can just start the encoding-registry name with something like "x_" to indicate this is a non standard encoding.
From email, Prabhat Hegde <prabhath@mpkmail.eng.Sun.COM> writes: The Arabic language folks at Sun finally fixed their font encoding, which they say is standards-based (LangBox). This font is called iso8859-6.8x and is also supported by Gnome/Pango. How do I get this font to appear in the Arabic font selection on my Solaris box? Edit->Preferences->Font->Languages->Arabic. Currently only iso8859-6 based fonts show up in the selection box.
blizzard: here is a small task where you could get your feet wet working on fonts.
Thanks to a quick education from ftang, Simon and I found the following:
A> A UnicodeToLangBox converter already exists in ucvlatin, written long ago by ftang, which can be reused after syncing with the latest code base.
B> This converter needs to be modified to handle Arabic presentation forms B.
Finally, the converter is not as complicated as when Frank wrote it, since Arabic presentation forms are already generated by layout. Hence only mapping, but no shaping logic, is needed. I am on it right now and should have a patch by tomorrow.
Attached patch Patch (obsolete) (deleted) — Splinter Review
This is mostly Prabhat's work with some contributions from me. intl/uconv changed under me while I was working on it, so there may be some oddities.
Attached patch Patch with all the files in it this time (obsolete) (deleted) — Splinter Review
Attachment #101334 - Attachment is obsolete: true
Over to smontagu...
Assignee: katakai → smontagu
Comment on attachment 101339 [details] [diff] [review] Patch with all the files in it this time r=Roland.Mainz@informatik.med.uni-giessen.de Patch builds&works, I can see the arabic fonts properly and print them (with Xprint), too (assuming that the matching fonts&*.enc&*.ttmap files are available).
Attachment #101339 - Flags: review+
I forgot one minor nit (no need to file a new patch for that):
-- snip --
- NS_IMETHOD FillInfo(PRUint32* aInfo);
-};
-
+static PRUnichar uni2lbox [] =
+ {
+ 0xC1, /* FE80 */
+ 0xC2 ,
+ 0xC2 ,
-- snip --
Can you make that array |const|, please?
Would you add a font-lang group for Arabic?
 nsFontLangGroup FLG_JA = { "ja", nsnull };
 nsFontLangGroup FLG_KO = { "ko", nsnull };
+nsFontLangGroup FLG_AR = { "ar", nsnull };
 nsFontLangGroup FLG_NONE = { nsnull , nsnull };
Brian Stell wrote:
> would you add a font-lang group for arabic?
... which reminds me that we didn't add one for indic ("hi-IN") either... ;-(
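The nit above concerns a lookup table indexed by presentation-form code point. As a rough sketch of that pattern, and not the patch itself, a `const` table plus a bounds-checked lookup could look like the following. Only the three values visible in the quoted snippet are used. The assumption that consecutive entries correspond to U+FE80, U+FE81 and U+FE82 is mine (the snippet labels only the first), and `LookupSample` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// First code point covered by the sample table (per the "/* FE80 */" comment
// in the quoted patch).
static const std::uint16_t kFirstCodePoint = 0xFE80;

// Read-only table: making it const lets the compiler place it in read-only
// storage and prevents accidental writes. Only the three entries shown in the
// quoted snippet are reproduced; the real table is longer.
static const std::uint16_t kUni2LboxSample[] = {
    0xC1,  // U+FE80
    0xC2,  // U+FE81 (assumed consecutive)
    0xC2,  // U+FE82 (assumed consecutive)
};

static const std::size_t kSampleCount =
    sizeof(kUni2LboxSample) / sizeof(kUni2LboxSample[0]);

// Return the font code for `ch`, or -1 if `ch` is outside the sample table.
int LookupSample(std::uint16_t ch) {
  if (ch < kFirstCodePoint) return -1;
  const std::size_t index = static_cast<std::size_t>(ch - kFirstCodePoint);
  if (index >= kSampleCount) return -1;
  const int code = kUni2LboxSample[index];
  assert(code <= 0xFF);  // sample entries are 8-bit font codes
  return code;
}
```

Because layout has already chosen the presentation form, a lookup of this shape needs no shaping state, which matches the "only mapping, no shaping" remark earlier in the thread.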
Comment on attachment 101339 [details] [diff] [review] Patch with all the files in it this time Thanks for the r=, but the patch needs polish at the very least. Also, we have a problem with lam-alef ligatures (as attachment 101369 [details] shows) which must be fixed.
Attachment #101339 - Flags: review+ → needs-work+
Yup - I think it still needs some bug fixing. It's not just the combo (LAM+ALEF) case. I am not a native user or expert, so the only way I can tell is by comparing with the Windows version and also IE. I tried the following sites:
assafir.com (MAC-ARABIC) - this was the best.
aljazeera.com (Windows)
bbc.co.uk -> Arabic (Unicode encoded)
However, it is worthwhile to integrate it so that Arabic testers on Solaris can start testing. I can tell that selection is badly broken.
Simon Montagu wrote:
> Thanks for the r=, but the patch needs polish at the very least. Also, we have
> a problem with lam-alef ligatures (as attachment 101369 [details] shows) which must be
> fixed.
I thought this is a problem with the Solaris fonts - or did I misunderstand you here?
Prabhat Hegde wrote:
> However, it is worthwhile to integrate it so that arabic testers on Solaris
> can start testing. I can tell that selection is badly broken.
That's why I gave my r= for it. The code works IMHO "good enough" for trunk and we can't really kill all issues in one step. Without it we're completely screwed without the evil iso10646-1 fonts (there's still the problem that the ISO8859-6.8x encoding files for Solaris+Xfree86 aren't publicly available... ;-( ).
Langbox encodings are publicly available: http://www.langbox.com/arabic/FontSet_ISO8859-6-8X.html
Prabhat Hegde wrote:
> Langbox encodings are publicly available:
> http://www.langbox.com/arabic/FontSet_ISO8859-6-8X.html
... I was thinking about the encoding files (*.enc and *.ttmap) in this case.
Attached patch Patch v.2 with lam-alef handling (obsolete) (deleted) — Splinter Review
I made the changes that Roland and Brian asked for, did some general clean-up, and added handling for lam-alef. This exposed a bug in nsRenderingContextGTK::GetTextDimensions: text with lam-alef was laid out incorrectly and couldn't be selected properly, unless I set MOZILLA_GFX_DISABLE_FAST_MEASURE in my environment. (Thanks are due to Roland for help in identifying this problem).
Attachment #101339 - Attachment is obsolete: true
Attached patch Patch v.3 (obsolete) (deleted) — Splinter Review
Oops! removed a printf
Attachment #101630 - Attachment is obsolete: true
Roland Mainz wrote:
> Brian Stell wrote:
> > would you add a font-lang group for arabic?
>
> ... which reminds me that we didn't add one for indic ("hi-IN") either... ;-(
Filed bug 172515 ("Some nsFontLangGroup entries missing in X11 font code") for that issue...
The font changes look good and I am qualified to review them. r=bstell@ix.netcom.com for the font changes. I don't work in the converter code enough to be qualified to review those changes. Perhaps ftang can do that.
Adding dependency on bug 172683 ("Problem with layout of Arabic lam-alef ligatures and nsRenderingContext{GTK|Xlib}::GetTextDimensions()") since lam-alef rendering with the LangBox iso8859-6.8x fonts is broken due to that bug...
Depends on: 172683
Attachment #101636 - Attachment is obsolete: true
ftang, please review the converter and Mac build changes in attachment 102093 [details] [diff] [review].
Comment on attachment 102093 [details] [diff] [review] Patch v.4: merged to tip (with fix for bug 172515) and added Mac build changes
r=ftang with only one minor issue. Please change the return value from NS_OK_UENC_EXACTLENGTH to NS_OK in the following function:
+NS_IMETHODIMP nsUnicodeToLangBoxArabic8::GetMaxLength(
+const PRUnichar * aSrc, PRInt32 aSrcLength,
+ PRInt32 * aDestLength)
+{
+ *aDestLength = 2*aSrcLength;
+ return NS_OK_UENC_EXACTLENGTH;
Then just update the patch and mark it has-review. The rest of the code looks good.
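One plausible reading of this review comment is that `2*aSrcLength` is a worst-case bound rather than an exact size, so the function should not report it as exact. The sketch below illustrates that distinction only. `LengthKind`, `GetMaxLengthSketch` and `EncodedLengthSketch` are hypothetical stand-ins, not Mozilla APIs, and the rule that a lam-alef ligature (U+FEF5-U+FEFC) takes two output bytes while everything else takes one is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-ins for the two result codes discussed in the review:
// kUpperBound plays the role of NS_OK, kExact of NS_OK_UENC_EXACTLENGTH.
enum class LengthKind { kUpperBound, kExact };

struct MaxLength {
  std::size_t bytes;
  LengthKind kind;
};

// Lam-alef ligatures in Arabic Presentation Forms-B.
bool IsLamAlefLigature(char16_t ch) { return ch >= 0xFEF5 && ch <= 0xFEFC; }

// Assumed encoding cost: two bytes for a lam-alef ligature, one otherwise.
std::size_t EncodedLengthSketch(const std::u16string& text) {
  std::size_t bytes = 0;
  for (char16_t ch : text) bytes += IsLamAlefLigature(ch) ? 2 : 1;
  assert(bytes <= 2 * text.size());  // never exceeds the advertised bound
  return bytes;
}

// Report the worst case and label it as a bound, not an exact length.
MaxLength GetMaxLengthSketch(std::size_t srcLength) {
  return MaxLength{2 * srcLength, LengthKind::kUpperBound};
}
```

With these assumptions a two-character string without ligatures encodes to 2 bytes while the bound is 4, so a caller that trusted an "exact" result would over-count the output.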
Attachment #102093 - Flags: review+
Attachment #102093 - Attachment is obsolete: true
Comment on attachment 102096 [details] [diff] [review] Patch addressing ftang's review comments r=ftang per comment 66
Attachment #102096 - Flags: review+
Comment on attachment 102096 [details] [diff] [review] Patch addressing ftang's review comments sr=roc+moz
Attachment #102096 - Flags: superreview+
Fix checked in.
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
There is one final thing to do: we need an entry in the >=1.2b release notes with the following information:
- iso8859-6.8x support added for Unix/Linux incl. Xprint
- Information on where to get the iso8859-6.8x-encoded fonts
Keywords: relnote
prabhat: Is there a way to contribute the iso8859-6.8x *.enc dir file to Xfree86.org and create a patch for Solaris's "ar"-locale with the fixed arabic fonts and *.enc/*.ttmap files ?
Filed release note item under bug 174672 comment #1 ...
Hi roland, OK - I'll look at the *.enc file (as you know, even a minor contribution needs to go via legal). As for Solaris, I believe the Ar locale owners will create a patch.
Prabhat Hegde wrote:
> i'll look at *.enc file
> (as you know even minor contribution needs to go via legal).
;-((
> As to Solaris, i believe Ar locale owners will create a patch.
Well, AFAIK you'll have to patch the Solaris "ar" TrueType fonts, too - per smontagu some glyphs are missing in these fonts (however, it should be easy to add them via "pfaedit" (more details on demand) ... :))
Summary: [BiDi]:Arabic 2 byte fonts don't seem to render → [BiDi]:Arabic 2 byte fonts don't seem to render / Add support for iso8859-6.8x fonts
Blocks: 199741
No longer blocks: 199741
Blocks: 199741
Filed http://bugs.xfree86.org//cgi-bin/bugzilla/show_bug.cgi?id=420 ("RFE: Add encodings files for Arabic LangBox encodings iso8859-6.8, iso8859-6.8x and iso8859-6.16") to get support for these encodings in Xfree86...
Component: Layout: BiDi Hebrew & Arabic → Layout: Text
QA Contact: mahar → layout.fonts-and-text