Closed
Bug 382398
Opened 17 years ago
Closed 14 years ago
checksetup.pl localized messages should be output in the console's charset
Categories
(Bugzilla :: Installation & Upgrading, enhancement)
Tracking
()
RESOLVED
FIXED
Bugzilla 4.0
People
(Reporter: vitaly.fedrushkov, Assigned: mkanat)
References
Details
(Keywords: intl)
Attachments
(1 file, 3 obsolete files)
patch (deleted) | timello: review+
Problem running checksetup.pl from a console that is not UTF-8 capable.
We have messages.html.tmpl in UTF-8, which is right, but Windows people (besides
Cygwin bash users) do use different text charsets -- for example, Windows-1251
here in Russia.
Keeping messages in different charsets within a single file is not good.
[based on bug 352608 comment 3]
Assignee
Comment 1•17 years ago
As a note, in case I don't fix this -- the solution is for Bugzilla::Install::Util::install_string to encode things into the console charset if the console charset is not UTF-8.
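A minimal sketch of the conversion this comment describes, assuming the message arrives as UTF-8 bytes and the console charset is already known. The helper name `to_console_bytes` is hypothetical and is not Bugzilla's `install_string`; only the `Encode` calls are real.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper for illustration; not Bugzilla's install_string().
# $utf8_bytes: message bytes as stored in a UTF-8 template.
# $console_charset: an encoding name that Encode knows, e.g. 'cp1251'.
sub to_console_bytes {
    my ($utf8_bytes, $console_charset) = @_;

    # A UTF-8 console can take the template bytes unchanged.
    return $utf8_bytes if $console_charset =~ /^utf-?8$/i;

    # Otherwise decode to Perl characters, then re-encode for the console.
    my $text = decode('UTF-8', $utf8_bytes);
    return encode($console_charset, $text);
}

# "Vérification" as UTF-8 bytes, converted for a cp1252 console.
my $converted = to_console_bytes("V\xC3\xA9rification", 'cp1252');
print join(' ', map { sprintf '%02X', ord } split //, $converted), "\n";
```

How the console charset is determined is left open here; that question is what the rest of the thread is about.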
Status: UNCONFIRMED → NEW
Ever confirmed: true
OS: Windows XP → All
Hardware: PC → All
Summary: localized checksetup.pl charset → checksetup.pl localized messages should be output in the console's charset
Assignee
Updated•17 years ago
Target Milestone: --- → Bugzilla 3.2
Reporter
Updated•17 years ago
Comment 2•17 years ago
How about checking the 'LANG' shell environment variable?
If the user uses cmd.exe on Windows, we can assume that the user can use UTF-8.
On other shells, I think we should assume that the shell cannot display UTF-8 when $ENV{LANG} doesn't include '.UTF-8'.
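The LANG heuristic proposed here could be sketched as follows. The function name `lang_claims_utf8` is hypothetical, and the pattern also accepts the `utf8` spelling and a trailing `@modifier`, which is an assumption beyond what the comment states.

```perl
use strict;
use warnings;

# Hypothetical helper: true when a LANG-style value names UTF-8 as its
# charset, e.g. "ru_RU.UTF-8" or "de_DE.utf8@euro".
sub lang_claims_utf8 {
    my ($lang) = @_;
    return 0 unless defined $lang;
    return $lang =~ /\.utf-?8(?:@|$)/i ? 1 : 0;
}

for my $value ('ru_RU.UTF-8', 'ru_RU.CP1251', 'C') {
    printf "%-14s %s\n", $value, lang_claims_utf8($value) ? 'UTF-8' : 'not UTF-8';
}

# In a real script the input would be the environment variable itself.
print "current LANG: ", (lang_claims_utf8($ENV{LANG}) ? 'UTF-8' : 'not UTF-8'), "\n";
```

This only reports what the environment claims; it cannot tell which charset the console actually uses when LANG is unset, as is typical in cmd.exe.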
Assignee
Comment 3•17 years ago
No, you can just use the POSIX locale functions for it, and they should return something sensible on Windows, I think. The only difficult part is that POSIX locales don't map to Encode's understanding of character sets, necessarily.
Reporter
Comment 4•17 years ago
(In reply to comment #2)
> If the user uses cmd.exe on Windows, we can assume that the user can use UTF-8.
Wrong assumption: Russian Windows uses codepage 1251 for text windows. Cygwin bash works well, however.
Comment 5•16 years ago
So we're talking about
binmode(STDOUT, ":encoding($charset)");
binmode(STDERR, ":encoding($charset)");
here, and we're trying to find a way to determine $charset?
Assignee
Comment 6•16 years ago
(In reply to comment #5)
> So we're talking about
>
> binmode(STDOUT, ":encoding($charset)");
> binmode(STDERR, ":encoding($charset)");
>
> here, and we're trying to find a way to determine $charset?
If :encoding($charset) will properly translate utf-8 into that charset, then yeah. I'm not sure if the POSIX locale functions work on Windows or not, but if they do, that would possibly give us the info we need on all platforms. Otherwise there might be some Win32:: function we can use.
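On the question of whether `:encoding($charset)` translates properly: the layer converts Perl character strings, not raw UTF-8 bytes, so the text has to be decoded first. The sketch below shows this on an in-memory filehandle rather than STDOUT so the resulting bytes can be inspected; the helper name `bytes_written_as` is hypothetical.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: print $text through an :encoding($charset) layer
# and return the bytes that would have reached the console.
sub bytes_written_as {
    my ($charset, $text) = @_;
    my $buffer = '';
    open(my $fh, '>', \$buffer) or die "cannot open in-memory handle: $!";
    binmode($fh, ":encoding($charset)");
    print {$fh} $text;
    close($fh);    # flushes the layer into $buffer
    return $buffer;
}

# Template bytes are UTF-8; decode them into characters first.
my $text  = decode('UTF-8', "trouv\xC3\xA9");
my $bytes = bytes_written_as('cp1252', $text);
print join(' ', map { sprintf '%02X', ord } split //, $bytes), "\n";
```

The same `binmode` call applied to STDOUT and STDERR would give the behaviour discussed above, provided `$charset` matches the console.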
Reporter
Comment 7•16 years ago
Can we rely on console windows using TrueType fonts? Then we could enforce codepage 65001.
Comment 8•16 years ago
I don't think so. Even worse, I suspect codepages may not be a subset of cp 65001.
Assignee
Comment 9•16 years ago
Yeah, I'm pretty sure all Windows consoles use bitmap fonts by default.
Reporter
Comment 10•16 years ago
Workaround, tested on Russian Windows:
Select Lucida Console as cmd window font
Run chcp 65001 before checksetup.pl
Comment 11•15 years ago
Bugzilla 3.2 is restricted to security bugs only. Moreover, this bug is either assigned to nobody or has had no traction for several months now. Rather than retargeting it at each new release, I'm clearing the target milestone, and the bug will be retargeted to some sensible release when someone starts fixing this bug for real (Bugzilla 3.8 more likely).
Target Milestone: Bugzilla 3.2 → ---
Assignee
Updated•15 years ago
Severity: normal → enhancement
Target Milestone: --- → Bugzilla 3.8
Assignee
Comment 12•15 years ago
Okay, this does it. I didn't test it on Windows, but I did test that the POSIX::setlocale function works on Windows (which it does).
If you try to print out a character that your encoding doesn't support, Perl throws warnings.
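The warnings mentioned here come from the output layer at print time. A different technique, shown below as a sketch and not as what the patch does, is to check ahead of time whether a message fits the target charset by using Encode's `FB_CROAK` check value. The helper name `has_unmappable_chars` is hypothetical.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: true if $text contains a character that
# $charset cannot represent.
sub has_unmappable_chars {
    my ($charset, $text) = @_;
    my $copy = $text;    # encode() with a CHECK value may modify its input
    my $ok = eval { Encode::encode($charset, $copy, Encode::FB_CROAK()); 1 };
    return $ok ? 0 : 1;
}

# U+00E9 (e with acute) exists in cp1252; U+0416 (Cyrillic Zhe) does not.
for my $char ("\x{E9}", "\x{0416}") {
    printf "U+%04X in cp1252: %s\n", ord($char),
        has_unmappable_chars('cp1252', $char) ? 'unmappable' : 'ok';
}
```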
Comment 13•15 years ago
To work correctly, it requires the patch from bug 550765.
Depends on: 550765
Comment 14•15 years ago
Comment on attachment 434739 [details] [diff] [review]
v1
This is a huge improvement over what we have currently, but there are still a few bits which are displayed incorrectly; see the output of checksetup.pl below, with French templates installed. Problems are:
1)
Wide character in print at Bugzilla/Install/Requirements.pm line 340.
Vérification des modules Perl DBD disponibles�
2)
ATTENTION : Vous devez définir le paramètre max_allowed_packet dans votre
configuration MySQL à au moins 3276750. Actuellement, il est défini à 1048576.
Vous pouvez définir ce paramètre dans la section [mysqld] de votre fichier de
configuration MySQL.
-----------
C:\Program Files\Apache Software Foundation\Apache2.2\htdocs\bugzilla>checksetup.pl
* Bugzilla 3.7 avec Perl 5.10.1
* sur Win7 Build 7100
Vérification des modules Perl.
Vérification de CGI.pm (v3.33) ok: v3.45 trouvé
Vérification de Digest-SHA (tout) ok: v5.48 trouvé
Vérification de TimeDate (v2.21) ok: v2.24 trouvé
Vérification de DateTime (v0.28) ok: v0.53 trouvé
Vérification de DateTime-TimeZone (v0.79) ok: v1.11 trouvé
Vérification de DBI (v1.41) ok: v1.609 trouvé
Vérification de Template-Toolkit (v2.22) ok: v2.22 trouvé
Vérification de Email-Send (v2.16) ok: v2.198 trouvé
Vérification de Email-MIME (v1.861) ok: v1.863 trouvé
Vérification de Email-MIME-Encodings (v1.313) ok: v1.313 trouvé
Vérification de Email-MIME-Modifier (v1.442) ok: v1.444 trouvé
Vérification de URI (tout) ok: v1.52 trouvé
Wide character in print at Bugzilla/Install/Requirements.pm line 340.
Vérification des modules Perl DBD disponibles�
Vérification de DBD-Pg (v1.45) non trouvé
Vérification de DBD-mysql (v4.00) ok: v4.011 trouvé
Vérification de DBD-Oracle (v1.19) non trouvé
Les modules Perl suivants sont optionnels :
Vérification de GD (v1.20) ok: v2.44 trouvé
Vérification de Chart (v2.1) ok: v2.4.1 trouvé
Vérification de Template-GD (tout) ok: v1.56 trouvé
Vérification de GDTextUtil (tout) ok: v0.86 trouvé
Vérification de GDGraph (tout) ok: v1.44 trouvé
Vérification de XML-Twig (tout) ok: v3.34 trouvé
Vérification de MIME-tools (v5.406) ok: v5.427 trouvé
Vérification de libwww-perl (tout) ok: v5.829 trouvé
Vérification de PatchReader (v0.9.4) ok: v0.9.5 trouvé
Vérification de perl-ldap (tout) ok: v0.39 trouvé
Vérification de Authen-SASL (tout) ok: v2.13 trouvé
Vérification de RadiusPerl (tout) ok: v0.17 trouvé
Vérification de SOAP-Lite (v0.710.06) ok: v0.710.10 trouvé
Vérification de JSON-RPC (tout) ok: v0.96 trouvé
Vérification de Test-Taint (tout) ok: v1.04 trouvé
Vérification de HTML-Parser (v3.40) ok: v3.64 trouvé
Vérification de HTML-Scrubber (tout) ok: v0.08 trouvé
Vérification de Email-MIME-Attachment-Stripper (tout) ok: v1.316 trouvé
Vérification de Email-Reply (tout) ok: v1.202 trouvé
Vérification de TheSchwartz (tout) non trouvé
Vérification de Daemon-Generic (tout) non trouvé
Vérification de mod_perl (v1.999022) non trouvé
***********************************************************************
* MODULES OPTIONNELS *
***********************************************************************
* Certains modules Perl ne sont pas indispensables pour Bugzilla, *
* mais en installant la dernière version, vous pourrez accéder à des *
* fonctionnalités supplémentaires. *
* *
* Les modules optionnels que vous n'avez pas installés sont listés *
* ci-dessous, avec le nom de la fonctionnalité qu'ils activent. Sous *
* ce tableau se trouvent les commandes pour installer chaque module. *
***********************************************************************
* MODULE NAME * ENABLES FEATURE(S) *
***********************************************************************
* TheSchwartz * File d'attente de courrier *
* Daemon-Generic * File d'attente de courrier *
* mod_perl * mod_perl *
***********************************************************************
* Note pour les utilisateurs Windows *
***********************************************************************
* Pour installer les modules listés ci-dessous, vous devez d'abord *
* exécuter la commande suivante en tant qu'administrateur : *
* *
* ppm repo add theory58S http://cpan.uwinnipeg.ca/PPMPackages/10xx/
***********************************************************************
COMMANDES POUR INSTALLER LES MODULES OPTIONNELS :
TheSchwartz: ppm install TheSchwartz
Daemon-Generic: ppm install Daemon-Generic
mod_perl: ppm install mod_perl
Reading ./localconfig...
OPTIONAL NOTE: If you want to be able to use the 'difference between two
patches' feature of Bugzilla (which requires the PatchReader Perl module
as well), you should install patchutils from:
http://cyberelk.net/tim/patchutils/
Vérification de DBD-mysql (v4.00) ok: v4.011 trouvé
Checking for MySQL (v4.1.2) ok: found v5.5.1-m2-community
ATTENTION : Vous devez définir le paramètre max_allowed_packet dans votre
configuration MySQL à au moins 3276750. Actuellement, il est défini à 1048576.
Vous pouvez définir ce paramètre dans la section [mysqld] de votre fichier de
configuration MySQL.
Suppression des modèles compilés existants.
Précompilation des modèles.terminé.
Checking for GraphViz (any) ok: found
Attachment #434739 -
Flags: review?(LpSolit) → review-
Comment 15•15 years ago
Unless there is a technical limitation, we should really take it for 3.6. Otherwise the output is unreadable, with all lines being of the form:
Vérification des modules Perl�
Vérification de CGI.pm (v3.33) ok: v3.45 trouvé
Vérification de Digest-SHA (tout) ok: v5.48 trouvé
Vérification de TimeDate (v2.21) ok: v2.24 trouvé
Flags: blocking3.6?
Target Milestone: Bugzilla 3.8 → Bugzilla 3.6
Assignee
Comment 16•15 years ago
It's too much of an enhancement and refactoring at this point to take for 3.6.
Bug 550765 should resolve the issues with checksetup.pl, provided that the templates are stored in UTF-8 and the user's terminal encoding is UTF-8 (which should be the most common encoding for modern terminals).
Flags: blocking3.6? → blocking3.6-
Target Milestone: Bugzilla 3.6 → Bugzilla 3.8
Comment 17•15 years ago
(In reply to comment #16)
> templates are stored in UTF-8 and the user's terminal encoding is UTF-8 (which
> should be the most common encoding for modern terminals).
It's not on Windows, which is what comment 15 is about.
Assignee
Comment 18•15 years ago
Ahh. Well, that's been a problem for quite some time (since Bugzilla 3.2), and it's what this bug is about. This patch affects every command-line script in Bugzilla, though, not just checksetup, so I don't want to mess around with that while we're in an RC stage.
FWIW, there are many languages (Russian, CJK, anything that isn't ISO-8859-1) that will do nothing but throw warnings on Windows's default charset, so checksetup.pl will become entirely a string of warnings. I think that's not a safe thing to do post-RC, also, but it's probably OK for 3.8 because we will have some time to test and get feedback and see if it really is a problem in practical situations.
Assignee
Comment 19•15 years ago
Okay, I figured it out. There were two problems:
1) We were calling init_console twice, which was leading to double-encoding characters.
2) We didn't set encoding() on STDERR.
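Why double-encoding garbles accented characters can be seen with Encode alone. This is only an illustration of the effect named in point 1, not the actual code path through `init_console`: once text has been encoded to bytes, encoding it again treats each byte as a separate character.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

my $text  = decode('UTF-8', "\xC3\xA9");    # one character, U+00E9
my $once  = encode('UTF-8', $text);         # correct: two bytes
my $twice = encode('UTF-8', $once);         # wrong: each byte encoded again

for my $pair (['once', $once], ['twice', $twice]) {
    my ($label, $bytes) = @$pair;
    printf "%-6s %s\n", $label,
        join(' ', map { sprintf '%02X', ord } split //, $bytes);
}
```

For point 2, the same encoding layer has to be applied to STDERR as well as STDOUT, since warnings and some messages go there.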
Attachment #434739 -
Attachment is obsolete: true
Attachment #440395 -
Flags: review?(LpSolit)
Comment 20•14 years ago
Without your patch, the output on Windows 7 is:
* Bugzilla 3.7 avec Perl 5.10.1
* sur Win7 Build 7600
V├®rification des modules PerlÔǪ
V├®rification de CGI.pm (v3.33) ok: v3.48 trouv├®
With your patch:
* Bugzilla 3.7 avec Perl 5.10.1
* sur Win7 Build 7600
VÚrification des modules Perlà
VÚrification de CGI.pm (v3.33) ok: v3.48 trouvÚ
This is only slightly better; all letters with accents are still rendered incorrectly.
Comment 21•14 years ago
Hmm, although the shell uses cp1252, the last few lines of checksetup.pl are displayed correctly when using cp850.
Assignee
Comment 22•14 years ago
(In reply to comment #20)
> VÚrification des modules Perlà
> VÚrification de CGI.pm (v3.33) ok: v3.48 trouvÚ
I can't reproduce this issue. Using the current French templates, the lines appear correctly for me using Windows's default terminal settings. Do you have something unusual about your terminal configuration?
I do see a problem with a single message in checksetup.pl--the one printed about the DBD modules. But that's it.
Assignee
Comment 23•14 years ago
Also, you might want to try throwing some debug code into set_output_encoding to see what Bugzilla thinks your terminal's encoding is. Mine says cp1252.
Assignee
Comment 24•14 years ago
Okay, so the problem that I was experiencing (and possibly that you were experiencing as well) is that CGI.pm sets binmode on STDOUT, but only on Windows! I'm going to report it to them as a bug.
Assignee
Comment 25•14 years ago
I've reported the CGI.pm bug here:
https://rt.cpan.org/Ticket/Display.html?id=57524
Assignee
Comment 26•14 years ago
Okay, this works around the CGI.pm bug. Calling set_output_encoding over and over is harmless, because it does nothing if the output encodings are already correct.
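The "does nothing if already correct" behaviour can be sketched with `PerlIO::get_layers`. This is an assumption-laden illustration on an in-memory handle, not the patch itself; the helper name `ensure_encoding_layer` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: add an :encoding layer only if the handle does not
# already have one. Returns 1 if a layer was added, 0 if nothing was done.
sub ensure_encoding_layer {
    my ($fh, $charset) = @_;
    return 0 if grep { /^encoding/ } PerlIO::get_layers($fh);
    binmode($fh, ":encoding($charset)") or die "binmode failed: $!";
    return 1;
}

my $buffer = '';
open(my $fh, '>', \$buffer) or die "cannot open in-memory handle: $!";

my $first  = ensure_encoding_layer($fh, 'cp1252');    # adds the layer
my $second = ensure_encoding_layer($fh, 'cp1252');    # finds it, does nothing
my $layer_count = grep { /^encoding/ } PerlIO::get_layers($fh);

print "first call added a layer: $first\n";
print "second call added a layer: $second\n";
close($fh);
```

Because the second call returns early, repeated calls cannot stack a second encoding layer and so cannot double-encode output.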
Attachment #440395 -
Attachment is obsolete: true
Attachment #445575 -
Flags: review?(LpSolit)
Attachment #440395 -
Flags: review?(LpSolit)
Comment 27•14 years ago
Hum, this change doesn't help. The output remains the same.
Comment 28•14 years ago
I added some debug code into set_output_encoding() as follows:
sub set_output_encoding {
    # If we've already set an encoding layer on STDOUT, don't
    # add another one.
    my @stdout_layers = PerlIO::get_layers(STDOUT);
    print "\nSTDOUT layers are " . join("/", @stdout_layers) . "\n";
    return if grep(/^encoding/, @stdout_layers);
    my $encoding;
    my $locale = setlocale(LC_CTYPE);
    print "LC_CTYPE = $locale\n";
    if ($locale =~ /\.([^\.]+)$/) {
        $encoding = $1;
        print "found encoding $encoding\n";
        if (ON_WINDOWS) {
            $encoding = "cp$encoding";
            print "Windows detected. Setting encoding to $encoding\n";
        }
    }
    $encoding = Encode::resolve_alias($encoding) if $encoding;
    print "encoding alias is $encoding\n";
    ...
}
And now the output of checksetup.pl becomes:
C:\Program Files\Bugzilla\bugzilla>..\perl\perl\bin\perl.exe checksetup.pl -t
STDOUT layers are unix/crlf
LC_CTYPE = French_Switzerland.1252
found encoding 1252
Windows detected. Setting encoding to cp1252
encoding alias is cp1252
* Bugzilla 3.7 avec Perl 5.10.1
* sur Win7 Build 7600
VÚrification des modules Perlà
STDOUT layers are unix/crlf
LC_CTYPE = French_Switzerland.1252
found encoding 1252
Windows detected. Setting encoding to cp1252
encoding alias is cp1252
VÚrification de CGI.pm (v3.33) ok: v3.48 trouvÚ
STDOUT layers are unix/crlf/encoding(cp1252)/utf8
VÚrification de Digest-SHA (tout) ok: v5.48 trouvÚ
STDOUT layers are unix/crlf/encoding(cp1252)/utf8
VÚrification de TimeDate (v2.21) ok: v2.24 trouvÚ
Is the mix encoding(cp1252)/utf8 expected?
Reporter
Comment 29•14 years ago
Have you tried explicit 'chcp 65001' before checksetup.pl? Any changes in output?
Comment 30•14 years ago
(In reply to comment #29)
> Have you tried explicit 'chcp 65001' before checksetup.pl? Any changes in
> output?
What's that?
Comment 31•14 years ago
chcp returns 850, even though LC_CTYPE says 1252.
Comment 32•14 years ago
Oh, and chcp 65001 before checksetup.pl has no effect, with or without the patch applied.
Comment 33•14 years ago
(In reply to comment #32)
> Oh, and chcp 65001 before checksetup.pl has no effect, with or without the
> patch applied.
I take that back. I changed the font used by cmd.exe to Lucida, and now your trick works great, without mkanat's patch!
Assignee
Comment 34•14 years ago
Ohhh, I think maybe we have to use a different function to get the console encoding, on Windows. I know what it is, I'll provide another patch and see if it makes a difference.
Assignee
Comment 35•14 years ago
Okay, this patch uses OutputCP instead of setlocale, now, on Windows. Does this fix your problem?
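A rough sketch of the detection strategy described here, under the assumption that "OutputCP" refers to `Win32::Console::OutputCP()`. On Windows it asks the console for its output code page (the value `chcp` reports, 850 in comment 31) instead of trusting the locale (1252 there). The helper names are hypothetical, and the patch's actual code may differ.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_CTYPE);

# Hypothetical helpers for illustration only.
sub charset_from_codepage {
    my ($codepage) = @_;
    return "cp$codepage";    # e.g. 850 -> "cp850"
}

sub charset_from_locale {
    my ($locale) = @_;
    # Take the part after the last dot, e.g. "ru_RU.KOI8-R" -> "KOI8-R".
    return unless defined $locale && $locale =~ /\.([^.]+)$/;
    return $1;
}

sub detect_console_charset {
    if ($^O eq 'MSWin32') {
        # Assumption: Win32::Console is installed; it is loaded lazily so
        # the script still compiles on other platforms.
        require Win32::Console;
        return charset_from_codepage(Win32::Console::OutputCP());
    }
    return charset_from_locale(setlocale(LC_CTYPE));
}

my $charset = detect_console_charset();
print "console charset: ", (defined $charset ? $charset : 'unknown'), "\n";
```

The returned name would still need to be checked against what Encode supports, for example with `Encode::resolve_alias`, before being used in an `:encoding(...)` layer.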
Attachment #445575 -
Attachment is obsolete: true
Attachment #445601 -
Flags: review?(LpSolit)
Attachment #445575 -
Flags: review?(LpSolit)
Comment 36•14 years ago
As I said in comment 33, I see no difference now that I set the font to Lucida, so I cannot review your patch as "ok, this fixes my problem". Vitaly, does this patch help in your case?
Assignee
Updated•14 years ago
Attachment #445601 -
Flags: review?(LpSolit) → review?(timello)
Updated•14 years ago
Attachment #445601 -
Flags: review?(timello) → review+
Comment 37•14 years ago
Comment on attachment 445601 [details] [diff] [review]
v4
It works! I tested it using cp1252. I printed some Portuguese words with accents which were written in UTF-8. They were all printed the way they should be. I suppose it will work for other languages too.
Updated•14 years ago
Flags: approval?
Assignee
Updated•14 years ago
Flags: approval? → approval+
Assignee
Comment 38•14 years ago
Committing to: bzr+ssh://bzr.mozilla.org/bugzilla/trunk/
modified Bugzilla.pm
modified checksetup.pl
modified Bugzilla/Install/Requirements.pm
modified Bugzilla/Install/Util.pm
Committed revision 7257.
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED