Closed Bug 655980 Opened 14 years ago Closed 14 years ago

Issues with accented characters using Georgia

Categories

(Core :: Layout: Text and Fonts, defect)

x86
macOS
defect
Not set
normal

RESOLVED DUPLICATE of bug 609604

People

(Reporter: code, Unassigned)

Details

User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:2.0.1) Gecko/20100101 Firefox/4.0.1
Build Identifier: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:2.0.1) Gecko/20100101 Firefox/4.0.1

I keep stumbling upon web pages in languages that use accented characters (French, German) that show rendering issues in Firefox 4 (and other browsers using Gecko 2: SeaMonkey 2.1, Fennec, Firefox 5). The accents or umlauts are offset and an additional space is added:
- é is rendered as ´e
- ä is rendered as ¨a

To reproduce the issue, we need the following:
1) Firefox 4 (or another Gecko 2 browser)
2) A web page using the font Georgia (a system font on Mac OS X, Windows XP, Windows Vista)
3) Some text on that web page with accented characters that have been inserted by a user in a non-conventional way involving copy-paste. The most typical case is copy-paste from a PDF.

If we remove one of those factors, the issue disappears:
1) If we view the page in Firefox 3.6, Opera, IE or a WebKit-based browser, all looks good.
2) If we use a font other than Georgia, the characters look good in FF4.
3) If we replace the imported accented characters and type them properly in a text editor, they look good in FF4.

I'm no expert in font rendering forensics, but I expect that somebody will tell me that:
a) The Georgia.ttf font has issues.
b) The copy-paste from the PDF has introduced some error in those accented characters, and is not the recommended way of writing web pages.

Unfortunately we must face the reality of the web:
a) Georgia is one of the fonts that have been considered "web-safe" for over a decade, so it is currently used on gazillions of websites.
b) Most websites nowadays run a CMS, so we cannot expect all content to be written in an HTML editor. People will happily continue copy-pasting text from PDFs and Word documents. It may be a "bad practice", but we cannot realistically change that, nor fix Adobe's PDF export bugs.
Reproducible: Always

Steps to Reproduce:
1. Grab some accented text from a PDF. You may use the sample PDF linked from my test page.
2. Paste that text into an HTML file.
3. Save the HTML file, and make sure that you have a CSS rule saying "font-family: Georgia;"

Actual Results:
The accented characters are displayed incorrectly: the accents or umlauts are offset and an additional space is added
- é is rendered as ´e
- ä is rendered as ¨a

Expected Results:
Firefox should display the accented characters correctly.

Possibly related: Bug #609604 - Wrong font rendering of combining marks (accents) without anchors
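
For anyone without the sample PDF at hand, the steps above can probably be approximated with a short script. This is only a sketch, and it rests on an assumption that is not confirmed by the report itself: that the pasted text contains decomposed sequences (base letter followed by a combining mark). The function name `build_test_page` and the markup are made up for illustration.

```python
import unicodedata


def build_test_page(text: str) -> str:
    """Return a small HTML page that shows `text` in Georgia.

    The text is converted to NFD first, so every accented letter becomes
    a base letter followed by a combining mark. This imitates (as an
    assumption) what copy-paste from a PDF leaves behind.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8">\n'
        "<style>p { font-family: Georgia; font-size: 32px; }</style>\n"
        "</head><body>\n"
        f"<p>{decomposed}</p>\n"
        "</body></html>\n"
    )


# "\u00e9" is é and "\u00e4" is ä, written as escapes so the input is unambiguous.
page = build_test_page("\u00e9 \u00e4")
print("e\u0301" in page)  # True: é is now e + U+0301
```

Writing `page` to a file with `encoding="utf-8"` and opening it in an affected build should then show whether the offset accents appear.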
The problem here is that the text uses combining characters from the U+03xx range in Unicode (the Combining Diacritical Marks block, U+0300-U+036F) to encode the diacritics, but Georgia does not support those characters. Hence, Gecko finds an alternative font for the diacritic; but then the base character and diacritic are drawn with different fonts, and this prevents proper positioning.

Most keyboard layouts will generate "precomposed" Unicode characters where the letter and accent are combined into a single codepoint; Georgia does support lots of these, and so they display as expected.

There are a couple of potential solutions to this. One is bug 543200: ensure that we use the same font (wherever possible) for a complete "cluster" of base letter + diacritics. Another is to apply Unicode canonical composition during the rendering process, so that even when the text is encoded using combining diacritics, we convert to the precomposed forms (where such forms exist) for rendering. This would be helpful in a number of cases, but needs to be done with care: it's possible for the opposite problem to arise, where a font supports both the base character and the combining diacritic, but does *not* directly support the precomposed character.
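
To make the difference between the two encodings concrete, here is a small illustration using Python's standard `unicodedata` module. It works on the text itself and is not how Gecko does (or would do) composition at render time; the helper `code_points` and the variable names are invented for this example.

```python
import unicodedata


def code_points(text: str) -> list[str]:
    """List each code point in `text` as 'U+XXXX NAME'."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


pasted = "e\u0301"  # base letter followed by a combining acute accent
typed = "\u00e9"    # the precomposed character most keyboards produce

print(code_points(pasted))
# ['U+0065 LATIN SMALL LETTER E', 'U+0301 COMBINING ACUTE ACCENT']
print(code_points(typed))
# ['U+00E9 LATIN SMALL LETTER E WITH ACUTE']

# Canonical composition (NFC) turns the pasted form into the typed form.
composed = unicodedata.normalize("NFC", pasted)
print(composed == typed)  # True

# Where no precomposed form exists, NFC leaves the sequence as it is.
no_precomposed = unicodedata.normalize("NFC", "q\u0301")
print(len(no_precomposed))  # 2
```

A site that controls its own content could, for instance, apply the same NFC step when text is saved, which would sidestep the problem for letters that have precomposed forms. That is a content-side workaround, not a fix for the rendering issue described here.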
Status: UNCONFIRMED → NEW
Ever confirmed: true
Oops, I misread the display when checking the Georgia font; looks like it does support some of the combining marks, but lacks proper OpenType tables to position them. See bug 609604 comment 16. So I think this is really a duplicate of that bug.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → DUPLICATE
Thanks, Jonathan, for the explanations. Good to know the issue has been recognized. In the meantime, that's one more incentive for designers to go with @font-face webfonts.