Closed Bug 123408 Opened 23 years ago Closed 17 years ago

[ps] No standard Postscript characters outside ISO 8859-1 print

Categories

(Core :: Printing: Output, defect, P2)

x86
Linux
defect

Tracking

()

RESOLVED FIXED
Future

People

(Reporter: Markus.Kuhn, Unassigned)

Attachments

(3 files)

The Mozilla printer driver is not capable of printing characters outside the ISO
8859-1 set, even if the characters in a document are all part of the standard
PostScript fonts.

The example page

  http://www.cl.cam.ac.uk/~mgk25/ucs/groff_char.txt

shows all the characters that a Linux man page can contain and that can be
printed easily using the standard PostScript fonts. Mozilla prints only an empty
box for most of these characters at the moment.

Mozilla also uses in the PostScript output the wrong glyphs for the characters '
(straight single quotation mark, U+0027) and ` (grave accent, U+0060). For
detailed background information on this problem, please have a look at

  http://www.cl.cam.ac.uk/~mgk25/ucs/quotes.html

For inspiration on what an extension of the Mozilla PostScript driver might be
able to achieve, have a look at

  http://www.pps.jussieu.fr/~jch/software/cedilla/

or (for a far more primitive approach) at

  http://www.cl.cam.ac.uk/~mgk25/ucs/utf2ps.c
  http://www.cl.cam.ac.uk/~mgk25/ucs/utf2ps.ps
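
To make the quotation-mark problem described above concrete: in the Adobe Glyph List, U+0027 and U+0060 correspond to the glyph names quotesingle and grave, while quoteright and quoteleft belong to U+2019 and U+2018. The following is only a minimal Python sketch of a lookup that keeps the two pairs apart; the table and the function name `glyph_name` are illustrative and are not taken from the Mozilla PostScript module.

```python
# Illustrative lookup only; not Mozilla's code.
# Glyph names follow the Adobe Glyph List.
QUOTE_GLYPHS = {
    0x0027: "quotesingle",  # APOSTROPHE (straight single quotation mark)
    0x0060: "grave",        # GRAVE ACCENT
    0x2018: "quoteleft",    # LEFT SINGLE QUOTATION MARK
    0x2019: "quoteright",   # RIGHT SINGLE QUOTATION MARK
}


def glyph_name(ch: str) -> str:
    """Return the PostScript glyph name for one of the four quote characters."""
    return QUOTE_GLYPHS[ord(ch)]
```

A driver that emits quoteright for U+0027 or quoteleft for U+0060 would produce the symptom reported here.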
Markus, what build are you using?  There was a recent fix to the postscript
print module to handle this exact problem....
I should note that in the linux 2002-02-02-22 build the following characters
from http://www.cl.cam.ac.uk/~mgk25/ucs/groff_char.txt print boxes:

hyphen, circle, handleft, handright, overline, carriage return 

A number of other characters are not printing (most of the greek letters and the
arrows) but are not printing an empty box either.
Looks good now. Reporter, could you try a new build?
Doesn't look anything resembling good here... See page 5 of the output.
Sorry. I should have been more specific. It works fine on Windows.

This bug should be confirmed, but I can't test it on Linux.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Assignee: rods → dcone
*** Bug 129896 has been marked as a duplicate of this bug. ***
Just testing whether I can submit UTF-8 characters in comments via bugzilla, as
in this case I will use them in my next followup:
capital-sigma = 'Σ', aleph = 'ℵ', less-than-or-equal = '≤', euro = '€',
copyright = '©'
Can you read/print these?
I have now tested the Postscript print driver of Mozilla 0.9.9 for Linux. It is
a great improvement, most of the characters from the PostScript standard font
repertoire (including Symbol) are now available in printouts.

However, there are still a small number of characters missing that should be
trivial to add. I have listed them in

  http://www.cl.cam.ac.uk/~mgk25/ucs/Postscript-Moz0.9.9.txt

Just print the above file/page, which contains the missing/wrong characters in
UTF-8. The full PostScript repertoire can be seen in the test texts

  http://www.cl.cam.ac.uk/~mgk25/ucs/Postscript.txt
  http://www.cl.cam.ac.uk/~mgk25/ucs/groff_char.txt

In particular:

  - The glyphs for U+0027 and U+0060 are still those for U+2019 and U+2018.
  - U+00AD SOFT HYPHEN is not shown (it shouldn't be shown in HTML 4.0, where
    it is used as a control character to denote a hyphenation opportunity,
    but it is a normal graphical character in preformatted plaintext, where
    it is used to distinguish hyphens that were inserted because of a
    line break.)
  - The uppercase Greek letters that are homoglyphs to Latin letters
    are missing.
  - Lowercase phi has the phi symbol shape (see note on this in Unicode 3.2)
  - A few maths symbols are missing.
  - Many of the math symbols (bracket components) in the Adobe Symbol font
    that previously had to be mapped to the Unicode private use area are now
    properly encoded in Unicode starting with Unicode 3.2. Here is an updated
    Unicode 3.2 mapping for the Symbol font:

0xE1    0x27E8  # MATHEMATICAL LEFT ANGLE BRACKET       # angleleft
0xE6    0x239B  # LEFT PARENTHESIS UPPER HOOK   # parenlefttp
0xE7    0x239C  # LEFT PARENTHESIS EXTENSION    # parenleftex
0xE8    0x239D  # LEFT PARENTHESIS LOWER HOOK   # parenleftbt
0xE9    0x23A1  # LEFT SQUARE BRACKET UPPER CORNER      # bracketlefttp
0xEA    0x23A2  # LEFT SQUARE BRACKET EXTENSION # bracketleftex
0xEB    0x23A3  # LEFT SQUARE BRACKET LOWER CORNER      # bracketleftbt
0xEC    0x23A7  # LEFT CURLY BRACKET UPPER HOOK # bracelefttp
0xED    0x23A8  # LEFT CURLY BRACKET MIDDLE PIECE       # braceleftmid
0xEE    0x23A9  # LEFT CURLY BRACKET LOWER HOOK # braceleftbt
0xEF    0x23AA  # CURLY BRACKET EXTENSION       # braceex
0xF1    0x27E9  # MATHEMATICAL RIGHT ANGLE BRACKET      # angleright
0xF4    0x23AE  # INTEGRAL EXTENSION    # integralex
0xF6    0x239E  # RIGHT PARENTHESIS UPPER HOOK  # parenrighttp
0xF7    0x239F  # RIGHT PARENTHESIS EXTENSION   # parenrightex
0xF8    0x23A0  # RIGHT PARENTHESIS LOWER HOOK  # parenrightbt
0xF9    0x23A4  # RIGHT SQUARE BRACKET UPPER CORNER     # bracketrighttp
0xFA    0x23A5  # RIGHT SQUARE BRACKET EXTENSION        # bracketrightex
0xFB    0x23A6  # RIGHT SQUARE BRACKET LOWER CORNER     # bracketrightbt
0xFC    0x23AB  # RIGHT CURLY BRACKET UPPER HOOK        # bracerighttp
0xFD    0x23AC  # RIGHT CURLY BRACKET MIDDLE PIECE      # bracerightmid
0xFE    0x23AD  # RIGHT CURLY BRACKET LOWER HOOK        # bracerightbt
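
As a rough illustration of how a mapping table in the format above could be consumed, here is a small Python sketch that parses such lines into a dictionary keyed by Symbol font byte. The function name `parse_symbol_map` and the sample subset are assumptions made for this example; the sketch does not reflect how the Mozilla PostScript module actually stores its tables.

```python
# Hypothetical helper: parse "0xE1  0x27E8  # UNICODE NAME  # glyphname" lines
# into {symbol_font_byte: (unicode_code_point, glyph_name)}.
def parse_symbol_map(text):
    table = {}
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("#")]
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        codes = fields[0].split()
        if len(codes) != 2:
            continue
        byte, code_point = (int(c, 16) for c in codes)
        table[byte] = (code_point, fields[2])
    return table


# Three lines copied from the table above.
SAMPLE = """\
0xE1    0x27E8  # MATHEMATICAL LEFT ANGLE BRACKET       # angleleft
0xF1    0x27E9  # MATHEMATICAL RIGHT ANGLE BRACKET      # angleright
0xF4    0x23AE  # INTEGRAL EXTENSION    # integralex
"""

symbol_map = parse_symbol_map(SAMPLE)
```

Inverting the dictionary (code point to byte) would give the direction a printer driver needs when it encounters one of these Unicode characters in a document.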
Reporter:
Does this print OK for you ? Any suggestions, comments etc. ?
Summary: No standard Postscript characters outside ISO 8859-1 print → [ps] No standard Postscript characters outside ISO 8859-1 print
Attached file HTML 4 Character Entities (deleted)
Here is another test page that might be useful in this context.
It shows all the character entities defined in HTML 4.

The last time I printed it, using build 200205308 on a Mandrake
system with the CUPS printing system, only a few characters
outside ISO-8859-1 came out.
Blocks: 157675
Priority: -- → P2
Target Milestone: --- → Future
Now that the PS module has Truetype to Postscript printing (bug 144663) can we 
close this?


If this bug is fixed, sure.  When I print that page with build 2003-01-30-08 on
Linux, I see exactly the same problem as illustrated in attachment 92202.  Rest
assured that I have fonts that are quite capable of displaying those chars; some
of these are even truetype fonts.

Is the truetype printing disabled by default or something?
yes, to enable Truetype printing enable direct FreeType2 display; see:
http://www.mozilla.org/projects/fonts/unix/enabling_truetype.html
Does that work on RedHat 6.2?
yes, if the system has the freetype2 library and the build has this enabled
(Redhat rpms have this disabled)
I should clarify.  Will Freetype2 build usefully on a RH 6.2 system?

In either case, this bug is not fixed if what I print looks like garbage... You
can of course decide that it's wontfix on old Linux systems, but if you're going
to drop support for them like that it'd be nice if mozilla.org made that clear.
It was developed with Freetype2 2.0.6 on Redhat 6.2.
I just checked and it works with Freetype 2.1.3 on Redhat 6.2.
When I print a webpage (when having FreeType2 enabled) my Lexmark Optra-E312
printer (PostScript capable) does not understand the PostScript file generated
by Mozilla and doesn't print at all.

By accident I found that ps2pdf and pdf2ps fix that.

ps2pdf mozilla.ps; pdf2ps mozilla.pdf; lpr mozilla.ps

So, does Mozilla generate broken (but understandable for ps2pdf) PostScript
files, or do I have a bad printer?


Then the page is printed correctly.

When I print a webpage with FreeType disabled, most of the local ISO-8859-2
characters are missing (replaced with spaces). But the printer at least prints
something without the need to ps2pdf&&pdf2ps the output.

For example, try printing the http://www.wp.pl/ website and check letters
such as a with tail, z with abovedot, n with acute, s with acute, etc.

Does your printer speak PS level3? If not, you need to filter PS output
generated by Mozilla through ghostscript (what you did with ps2pdf and pdf2ps is
sorta like that, except that your method is unnecessarily complex). Recent versions
of ghostscript speak PS level3 perfectly well, and modern printing systems for
POSIX systems make it very easy to configure filters like that.  See
'international known issues' in 'release notes' (help | release notes) for more
details.
could you please attach a sample printout of the simplest page that causes this
issue. If possible it would help if the page was extremely simple, say just one
accented character.
to make the sample file: create an HTML page that displays just one accented
character and then print it to file.

the code should only require level 2 (or 2015) not level 3. There is a test in
the printout that checks for this and explains what to do

alternately: you could add this to your moz print command 
gs -q -sDEVICE=pswrite -sOutputFile=- -dNOPAUSE -dBATCH
-dMozConvertedToLevel2=true - | lpr [OPTIONS]
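
For anyone who prefers to script the filter step rather than edit the print command, the pipeline above could be wrapped roughly as follows. This is only a sketch: the function names are made up for illustration, the Ghostscript flags are copied verbatim from the command above, and it assumes `gs` and `lpr` are installed and on the PATH. The `pswrite` device named in that command may not be available in every Ghostscript build.

```python
import subprocess


def gs_level2_command():
    """Ghostscript argv copied from the command above; reads PostScript on stdin."""
    return [
        "gs", "-q", "-sDEVICE=pswrite", "-sOutputFile=-",
        "-dNOPAUSE", "-dBATCH", "-dMozConvertedToLevel2=true", "-",
    ]


def print_via_gs(ps_path, lpr_args=()):
    """Pipe the file at ps_path through Ghostscript into lpr.

    Returns the (gs, lpr) exit codes. Assumes both programs are installed.
    """
    with open(ps_path, "rb") as src:
        gs = subprocess.Popen(gs_level2_command(), stdin=src,
                              stdout=subprocess.PIPE)
        lpr = subprocess.Popen(["lpr", *lpr_args], stdin=gs.stdout)
        gs.stdout.close()  # so gs gets SIGPIPE if lpr exits early
        lpr.wait()
        return gs.wait(), lpr.returncode
```

Calling `print_via_gs("mozilla.ps")` would then stand in for the manual ps2pdf and pdf2ps round trip mentioned earlier.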
To comment 21. 

Yes, this is a PostScript Level 2 printer.

I've set up CUPS to convert everything to Level 2 and it now prints correctly.
Thanks and sorry for spamming this bug.
*** Bug 152882 has been marked as a duplicate of this bug. ***
Assignee: dcone → printing
Mass-resolving some bugs related to the old unix printing system. The old code has been removed from the tree, and the bugs don't occur with the cairo-based printing code.
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED