Closed
Bug 63608
Opened 24 years ago
Closed 16 years ago
What's related (related links) sidebar can't handle non-Western characters
Categories
(SeaMonkey :: Sidebar, defect)
Tracking
(Not tracked)
RESOLVED
INCOMPLETE
People
(Reporter: jshin, Unassigned)
References
()
Details
(Keywords: intl)
* Symptom:
The What's Related (related links) sidebar renders
everything in a font for ISO-8859-1, so
non-Western-European characters are not rendered
correctly in the 'related links' sidebar.
The default encoding (set in Edit | Preferences | Navigator
| Languages) doesn't seem to affect how the 'related links'
sidebar is rendered.
* To reproduce: go to any popular non-English (or
non-Western-European) site likely to have related links
in a non-Western-European language and see how the related
links are rendered in the 'related links' sidebar.
Change the default encoding in the Preferences dialog
to the encoding of the site and see if there's
any difference.
* Suggestion:
Given that there are so
many sites with no encoding information or incorrect
encoding information, it's very hard to get
the encoding of links in the 'related links'
sidebar right.
However, as a zeroth order approximation, Mozilla
may render everything in 'related links' sidebar
as if they're in the default encoding set in
Preference dialog. If a user sets her/his
default encoding to a particular encoding,
sites (s)he visits are more likely to be in
that encoding than any other encodings and
so are related links to sites (s)he visits.
However, for some users, only a small portion
of the sites visited (let alone a majority)
may actually be in the default encoding. For them,
the zeroth order approximation doesn't work
very well. A better approach would be to use
the encoding of the page currently being viewed
to render links in the 'related links' sidebar.
There's no guarantee that related links to
the site are in the same encoding as the site,
but it's pretty likely that they are.
Maybe, when the encoding of related links
can be explicitly determined (from the HTTP header
or a meta tag, or because it is cached in memory
from a previous visit), that should be honored, and
the two approximations mentioned above should be used
only as fallbacks when the encoding cannot
be determined explicitly.
There's a problem, though, with depending on
the encoding info. provided via http header
or meta tag because there are a lot of sites
that get it wrong.
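The fallback order suggested above could be sketched roughly as follows. This is only an illustration in Python, not Mozilla code: the function name `pick_sidebar_encoding` and its three parameters are hypothetical, and the final ISO-8859-1 fallback is an assumption that mirrors the behavior reported in the symptom.

```python
import codecs
from typing import Optional


def _known(label: Optional[str]) -> Optional[str]:
    """Return Python's normalized codec name, or None for a missing or unknown label."""
    if label is None:
        return None
    label = label.strip()
    if not label:
        return None
    try:
        return codecs.lookup(label).name
    except LookupError:
        return None


def pick_sidebar_encoding(explicit: Optional[str],
                          page: Optional[str],
                          default_pref: Optional[str]) -> str:
    """Hypothetical helper: choose an encoding for the related links.

    Order of preference, as proposed in the report:
      1. an explicitly declared encoding (HTTP header or meta tag),
      2. the encoding of the page currently being viewed,
      3. the user's default encoding from the preferences.
    """
    for candidate in (explicit, page, default_pref):
        name = _known(candidate)
        if name is not None:
            return name
    # Assumed last resort, matching the behavior described in the symptom.
    return codecs.lookup("iso-8859-1").name


print(pick_sidebar_encoding(None, "EUC-KR", "ISO-8859-1"))
```

Note that this sketch trusts an explicit label whenever it names a known charset, so it does nothing about sites that declare the wrong encoding.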
Comment 1•24 years ago (Reporter)
Added the URL of a site with non-English related links.
Keywords: intl
Comment 2•24 years ago
Changing QA contact to blee since he is familiar with the international
issues involved in Related Links. I thought the related
links data (NS6) is tagged as UTF-8, so the problem
would have to do with the server database not being correct.
QA Contact: shrir → blee
This needs to be fixed for the next release -- added the nsbeta1 keyword.
This may be a server-side problem. In the first What's Related release, the
server only converted Latin1 and Japanese encodings to UTF-8. I don't know if
the WR vendor fixed this for other encodings. So the first step would be to
verify that the client is receiving good data from the WR server.
Keywords: nsbeta1
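As a first step for the verification mentioned above, one could check whether the bytes coming back from the WR server are at least well-formed UTF-8. A minimal Python sketch, assuming the raw response bytes for a title are available; `is_valid_utf8` is a hypothetical helper name, not part of any client code:

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Sample titles are made up for illustration.
good = "日本語".encode("utf-8")
bad = "日本語".encode("shift_jis")  # unconverted Shift_JIS bytes

print(is_valid_utf8(good), is_valid_utf8(bad))
```

Passing this check is necessary but not sufficient: text that was converted to UTF-8 from the wrong source charset is still valid UTF-8, it just shows the wrong characters.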
Comment 4•24 years ago
Adding tpringle, kmurray and lbaliman to cc: list.
Comment 5•24 years ago
Assigning to TPringle for resolution from Alexa. May need help from Bob Jung.
Adding Bobj to cc: list.
Assignee: matt → tpringle
Priority: -- → P2
nav triage team:
Resetting priority so that this bug gets retriaged.
Priority: P2 → --
Removing nsbeta1+ from status whiteboard; need to figure out what to do in general
with What's Related.
Whiteboard: nsbeta1+
Comment 10•24 years ago
Changing QA contact to jonrubin@netscape.com.
QA Contact: andreasb → jonrubin
Comment 11•24 years ago
Jon: is this still a problem in NS 6.01, which uses the Netscape WR tab rather
than the Alexa tab?
Comment 12•24 years ago
Is Alexa sending back info correctly converted to UTF-8? In Alexa's
original implementation, it only did so for ISO-Latin1 and Japanese
charset encodings. I don't know if Alexa ever fixed its server to convert
other charsets (e.g., Korean charsets) to UTF-8.
It appears from external usage that Mozilla only displays ISO-Latin1 WR
titles. For pages with non-ISO-Latin1 titles (even Japanese), the WR sidebar
displays the URL instead.
Is this because that is what Alexa is returning? Or is the browser doing
something?
Netscape 4.x displayed Japanese titles in the WR dropdown. But it appears
that the WR info returned to 4.x is different than what is returned to
Mozilla. When I try both, I get a different list of related URLs for the
same URL. Are they pointing to the same WR server/URL or is Alexa sniffing
the browser?
Comment 13•24 years ago
Ccing Matt and Myron - do you guys know the answer to this?
Comment 14•24 years ago
Does this affect 6.01, as vishy asked? We are checking in that code instead of
using the Alexa tab. If it doesn't, this bug is only for Mozilla and not
Netscape.
Comment 15•24 years ago
Vishy, 6.01 appears to be fine. Japanese characters display correctly.
Comment 16•24 years ago
I checked Korean as well. I can see Korean characters in 6.01, but I cannot
verify as to whether they make any sense. But Japanese is definitely displaying
properly.
Comment 17•24 years ago (Reporter)
In NS 6.0, Korean characters look *mostly* fine, but Korean characters
in some sites are treated as if they're ISO-8859-1. I guess this is due
to the fact that they're regarded as *non-Korean* sites (and as a result
the conversion to Unicode was not done properly) when the DB entries
for them were made. For instance, try <http://www.ohmynews.com>
and there are two entries in 'What's related' with Korean characters
properly displayed and 5 entries with Korean characters garbled
(rendered as though they're ISO-8859-1).
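The garbling described here can be reproduced in a few lines. The Python sketch below is only an illustration: the sample title is made up, and EUC-KR is assumed as the charset the site really used, since the actual DB conversion code is not known.

```python
# Hypothetical title; any Hangul text covered by EUC-KR behaves the same way.
title = "한국어"

# Bytes as a Korean site might serve them (EUC-KR is an assumption here).
raw = title.encode("euc_kr")

# Mis-conversion: the bytes are read as ISO-8859-1 before conversion to UTF-8.
garbled = raw.decode("iso-8859-1")
wrong_utf8 = garbled.encode("utf-8")

# Correct conversion: decode with the real charset first.
right_utf8 = raw.decode("euc_kr").encode("utf-8")


def undo_latin1_misread(text: str, real_charset: str = "euc_kr") -> str:
    """Reverse the mis-conversion, given the charset the bytes were really in."""
    return text.encode("iso-8859-1").decode(real_charset)


print(repr(garbled))
print(undo_latin1_misread(wrong_utf8.decode("utf-8")))
```

Both `wrong_utf8` and `right_utf8` are valid UTF-8, which is why the client cannot tell the good entries from the garbled ones without knowing the original charset.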
Comment 18•24 years ago
marking as nsbeta1- per i18n triage.
Comment 19•24 years ago
Todd - Who's got the answer to this one? This is a server-side issue, correct?
Comment 20•24 years ago
Matt, do you think Shawkat would know?
Comment 21•24 years ago
Assigning TM = M0.9.2 | P3.
Linda - Can you work with Todd on this one?
Comment 22•24 years ago
Todd, this was first reported quite a while ago, so I checked W/R results today
(05.25.01) and I am seeing corrupted (German) characters in the W/R results.
Please let me know how you want to proceed and what I can do to help.
Comment 23•23 years ago
Lynn, I think I see What's Related extended characters working in DE 6.1b, but
not FR 6.1b. Would you please confirm?
Teruko, would you please check JA 6.1b?
Comment 24•23 years ago
It depends on what you're looking at. The default language for the browser is
currently set wrong on FR, so you won't see the right information.
Updated•23 years ago
Assignee: tpringle → vishy
Target Milestone: mozilla0.9.2 → mozilla1.0
Comment 25•23 years ago
Changing milestone, reassigning to vishy.
Comment 26•23 years ago
I see a lot of sites are still broken, even Netscape ones,
for example
http://www.atour.co.jp/golf/index2.html
http://home.netscape.com/zh/tw/
http://home.netscape.com/zh/cn/
http://home.netscape.com/ko/
http://www.edu.cn/
etc. This is a server-side issue. I think we should first run the top 100 intl
QA sites against this bug (to see how many of the What's Related links for those
top 100 intl sites are broken). It is very sad that this kind of problem still
happens years after integrating the "What's Related" service into the client.
Comment 27•23 years ago
I quickly walked through the JA top 100 sites and at least the following sites
are broken for "What's related":
http://www.rakuten.co.jp
http://www.cool.ne.jp
http://www.tok2.com
http://www.suntory.co.jp
http://www.otd.co.jp
http://www.fujitv.co.jp
http://www.melma.com
http://www.alpha-net.ne.jp
Comment 28•23 years ago
-> samir for investigation with help from Frank.
Assignee: vishy → sgehani
Target Milestone: mozilla1.0 → mozilla0.9.5
Comment 29•23 years ago
mass change, switching qa contact from jonrubin to ruixu.
QA Contact: jonrubin → ruixu
Comment 30•23 years ago
Marking nsbranch- as it was decided in the August bug triage that we wouldn't
have enough time in eMojo to fix this. Let's revisit for MachV.
Keywords: nsbranch-
Comment 31•23 years ago
I just tried these again:
> http://www.rakuten.co.jp
> http://www.cool.ne.jp
> http://www.tok2.com
> http://www.suntory.co.jp
> http://www.otd.co.jp
> http://www.fujitv.co.jp
> http://www.melma.com
> http://www.alpha-net.ne.jp
and all but fujitv returned Japanese results in the What's Related sidebar
panel. fujitv reported no related links at all.
Comment 32•23 years ago
removed keyword nsbranch since it now has nsbranch-, per pdt mtg.
Keywords: nsbranch
Comment 33•23 years ago
Mass-moving lower-priority 0.9.5 bugs off to 0.9.6 to make way for remaining
0.9.4/eMojo bugs, and MachV planning, performance and feature work. If you
disagree with any of these targets, please let me know.
Target Milestone: mozilla0.9.5 → mozilla0.9.6
Comment 36•23 years ago
Sidebar triage team: commercial client investigation to be done by Sujay with
help from i18n QA. (The Mozilla What's Related content is entirely web-based.)
Target Milestone: mozilla0.9.9 → Future
Comment 37•23 years ago
As per Sujay's email, filed bug 12290 in bugscape for commercial build.
Updated•20 years ago
Product: Browser → Seamonkey
Updated•16 years ago
Assignee: samir_bugzilla → nobody
Priority: P3 → --
QA Contact: ruixu → sidebar
Target Milestone: Future → ---
Comment 38•16 years ago
Currently the Alexa server seems smart enough to return just the URL as the link if the <title> of the linked page contains non-Western characters. Please re-open if this is not the case.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → INCOMPLETE
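The server behavior observed in this last comment can be approximated as below. This is a guess at the rule, not Alexa's actual logic: the function `display_label` is hypothetical, and "non-Western" is assumed here to mean "not representable in ISO-8859-1".

```python
def display_label(title: str, url: str) -> str:
    """Hypothetical rule: show the title only if it fits in ISO-8859-1."""
    if not title:
        return url
    try:
        title.encode("iso-8859-1")
    except UnicodeEncodeError:
        # Title contains characters outside Latin-1; fall back to the URL.
        return url
    return title


print(display_label("Café Zeitung", "http://example.com/"))
print(display_label("한국어", "http://www.ohmynews.com"))
```

Under this rule a page with a Korean or Japanese title would be listed by its URL only, which avoids garbled text but does not show the real title.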