Closed
Bug 42221
Opened 24 years ago
Closed 24 years ago
DE: Searching using extended char. cuts off word in sidebar search widget
Categories
(SeaMonkey :: Search, defect, P2)
Tracking
(Not tracked)
VERIFIED
FIXED
M18
People
(Reporter: lynnw, Assigned: nhottanscp)
References
Details
(Whiteboard: [nsbeta3+][nsbeta2-])
Attachments
(4 files)
Performing a search from the sidebar using an extended character such as an "a"
with an umlaut (use a word such as "Männer") produces the correct results, but
the word left in the search box is cut off where the extended character should
be (in this case, if the user searched for "männer", all the user will see in
the search box after the search is finished is "m").
Putting on [nsbeta2-]. Once you start typing, this is set, and you would have
to start again if hitting reload. This is the way it works.
Whiteboard: [nsbeta-]
Let me clarify this situation:
1. In sidebar search, enter an English word, such as "car", into the search box.
Click on "search" button. When the results pop up, the search box in the
sidebar still contains the word "car".
2. In the sidebar search, enter a foreign word such as "männer", into the search
box. Click on "search" button. When the results pop up, the search box in the
sidebar contains the word "m". The rest of the word has been dropped. If the
word had been "manner", the whole word would have been in the box. I don't
understand what the last comment on this bug has to do with the situation I have
described here.
Comment 6•24 years ago
Nav triage team: NEED INFO; reassigning to Frank.
Assignee: slamm → ftang
Whiteboard: [nsbeta2-] → [nsbeta2-][NEED INFO]
Comment 7•24 years ago
nhotta, can you take a look at this? Is that caused by non-convertible chars?
Assignee: ftang → nhotta
Assignee
Comment 8•24 years ago
Accepting, it still happens (used win32 build ID 2000072704).
Adding rjc, ftang to cc.
Status: NEW → ASSIGNED
Keywords: nsbeta3
Comment 9•24 years ago
Per i18n triage meeting, the status is now [nsbeta3+].
Please check on the current status of this problem.
I see it working with 7/29 build.
Whiteboard: [nsbeta2-][NEED INFO] → [nsbeta3+]
Comment 10•24 years ago
could someone tell me how to generate the a-umlaut on a US pc keyboard, so that I may test this properly?
Comment 11•24 years ago
alt + 0228 (on the num pad) ä
Tip: Run Character Map (charmap.exe) to see an entire font and the ALT key
combinations needed to create extended characters on a non native keyboard.
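For anyone checking the number used with ALT: a small sketch (JavaScript, not part of the original report) confirming that decimal 228 is the code for "ä". The variable names are only illustrative.

```javascript
// Decimal 228 is 0xE4, the code for "ä" in Windows-1252, ISO-8859-1 and Unicode.
const altCode = 228;
const typedChar = String.fromCharCode(altCode);

// Show the character next to its Unicode code point in U+XXXX notation.
const codePointLabel = "U+" + altCode.toString(16).toUpperCase().padStart(4, "0");
console.log(typedChar, codePointLabel);
```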
Reporter
Comment 12•24 years ago
You can generate the a+umlaut with this command: ALT+NUMLOCK+0228.
Assignee
Comment 13•24 years ago
I cannot see the problem using M18 win32 build ID2000080308.
Marking as FIXED.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Reporter
Comment 14•24 years ago
Actually, I saw the same problem again this morning while using the JA PR2
browser, so it doesn't appear to be fixed. Sometimes it works, but sometimes it
doesn't. Will have to pinpoint the problem.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee
Comment 15•24 years ago
Since I saw it's fixed in M18, please reopen when you see that again in M18
(ID2000080308 or later).
Status: REOPENED → RESOLVED
Closed: 24 years ago → 24 years ago
Resolution: --- → FIXED
Comment 16•24 years ago
yeah this went out broken in PR2 (it was nsbeta2-). It is however VERIFIED Fixed in m18 builds (2000080908).
Status: RESOLVED → VERIFIED
Reporter
Comment 18•24 years ago
This is still an issue in the Aug. 15, 2000 build of Netscape 6. It seems to be
something wrong with the way the Google search results are being handled. I will
attach a file showing the problems in the sidebar search box and another problem
in the sidebar results.
There is also another bug I will open against the US NetscapeSearch.src file,
since this file does not show Google results if a user is looking at ODP results
and then clicks on the link to see more results using Google. I'll let you know
what the bug number is once I create it.
Status: CLOSED → REOPENED
Resolution: FIXED → ---
Reporter
Comment 19•24 years ago
Assignee
Comment 20•24 years ago
I used the same build but I cannot reproduce this. ODP found the search results
for "Männer" and did not go to Google.
Reporter
Comment 21•24 years ago
Sorry, forgot to mention that this is probably a bug in the German sherlock
file.
Component: Sidebar → Search
Keywords: deb2
Summary: Searching using extended char. cuts off word in sidebar search widget → DE: Searching using extended char. cuts off word in sidebar search widget
Reporter
Comment 22•24 years ago
There was no issue with the US NetscapeSearch.src file. Looks OK.
Assignee
Comment 23•24 years ago
Okay, then the bug has to be reassigned. Who is responsible for the German
version?
Reporter
Comment 24•24 years ago
Correction, I AM seeing it in the US NetscapeSearch.src file.
1. Select the US Netscape Search for your sidebar search.
2. Type in männer in the sidebar search box and click on Search button.
3. Once your results come up, go to the bottom of the result page and click on
the link "Additional results for ' männer ' using Google."
4. Note that the results you get look like what you see in the attached image
file.
Assignee
Comment 25•24 years ago
Assignee
Comment 26•24 years ago
Assignee
Comment 27•24 years ago
I still cannot reproduce. Lynn, the build id of your screen shot says
2000080712. Is it also reproducible with today's build?
Reporter
Comment 28•24 years ago
Note that in the screenshot you took, the Google results aren't even appearing
in the sidebar, thus you can't see the problem I'm talking about. This
problem with Google results not showing up in the sidebar was another bug I had
started to report, but then I thought perhaps it was fixed. Looks like it
isn't???
Reporter
Comment 29•24 years ago
This is still an issue in the Aug. 15, 2000 build of Netscape 6.
Assignee
Comment 30•24 years ago
So I cannot get to the problem because the side bar does not update, any
workaround?
Reporter
Comment 31•24 years ago
The odd thing is that it does update sometimes. Try removing the Users50 profile
and use the default sidebar search. It's working for me right now.
Assignee
Comment 32•24 years ago
I found that there is a problem parsing the result page when it contains named
entities (e.g. &auml;). It is possible to change the parser to support the HTML 4
Latin1 entities (96 entities).
Status: REOPENED → ASSIGNED
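To make the entity problem concrete, here is a minimal sketch (JavaScript, not the actual patch) of turning named Latin-1 entities in a result page into characters. The table is a hypothetical nine-entry subset of the HTML 4 Latin-1 set, and the function name is made up for illustration.

```javascript
// Hypothetical subset of the HTML 4 Latin-1 entity table (the full set covers U+00A0..U+00FF).
const LATIN1_ENTITIES = {
  nbsp: 0xa0, Auml: 0xc4, Ouml: 0xd6, Uuml: 0xdc, szlig: 0xdf,
  auml: 0xe4, eacute: 0xe9, ouml: 0xf6, uuml: 0xfc,
};

// Replace known named entities with their characters; leave unknown ones untouched.
function decodeLatin1Entities(text) {
  return text.replace(/&([A-Za-z0-9]+);/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(LATIN1_ENTITIES, name)
      ? String.fromCharCode(LATIN1_ENTITIES[name])
      : match
  );
}

console.log(decodeLatin1Entities("M&auml;nner"));
```

Unknown names are passed through unchanged so that a partial table never drops text from the page.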
Reporter
Comment 33•24 years ago
I see that Google and also another search engine we use for Germany do not
define their charset in their search result page. Could this be the reason why
the sidebar is unable to show the extended characters?
Reporter
Comment 34•24 years ago
No, I checked our search pages and we don't define the charset either. I'll look
at the latest US build to see if this problem has been fixed.
Assignee
Comment 35•24 years ago
Charset "ISO-8859-1" is used when no charset/encoding is specified in the
sherlock file. So the Google German result page is interpreted as "ISO-8859-1".
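A rough sketch of that fallback rule (JavaScript, for illustration only; the real logic lives in the C++ search service). `declaredCharset` is a hypothetical stand-in for whatever the sherlock file declares, and only two charsets are handled.

```javascript
// ISO-8859-1 maps each byte 0x00..0xFF to the code point with the same value.
function decodeLatin1(bytes) {
  let out = "";
  for (const b of bytes) out += String.fromCharCode(b);
  return out;
}

// Decode a result page, falling back to ISO-8859-1 when no charset was declared.
function decodeResultPage(bytes, declaredCharset) {
  const charset = (declaredCharset || "ISO-8859-1").toUpperCase();
  if (charset === "ISO-8859-1") return decodeLatin1(bytes);
  if (charset === "UTF-8") return new TextDecoder("utf-8").decode(bytes);
  throw new Error("charset not handled in this sketch: " + charset);
}

// "Männer" as ISO-8859-1 bytes, with no charset declared.
console.log(decodeResultPage(Uint8Array.of(0x4d, 0xe4, 0x6e, 0x6e, 0x65, 0x72)));
```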
Comment 36•24 years ago
So, what will happen if we specify a non-single-byte charset and somehow the
returned page is not encoded in that charset? Do we handle the conversion error
in the internet search? What we should do, if we hit that kind of problem, is
skip one byte, reset the converter, and try to convert the rest of the
received data.
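The recovery strategy suggested here (skip one byte, reset the converter, carry on) could look roughly like the sketch below. It is JavaScript with UTF-8 standing in for "a non-single-byte charset"; the function name is hypothetical and this is not the converter code used in the product. The input is assumed to be a Uint8Array.

```javascript
// Decode UTF-8 leniently: on an undecodable byte, emit U+FFFD, skip that one byte,
// start over with a fresh decoder, and keep going with the rest of the data.
function decodeUtf8SkippingErrors(bytes) {
  let out = "";
  let i = 0;
  while (i < bytes.length) {
    let decoded = null;
    let used = 0;
    // A UTF-8 sequence is 1 to 4 bytes long; try each length with a fresh strict decoder.
    for (let len = 1; len <= 4 && i + len <= bytes.length; len++) {
      try {
        const strict = new TextDecoder("utf-8", { fatal: true, ignoreBOM: true });
        decoded = strict.decode(bytes.subarray(i, i + len));
        used = len;
        break;
      } catch (e) {
        // Not a complete, valid sequence at this length; try a longer one.
      }
    }
    if (decoded === null) {
      out += "\uFFFD"; // mark the bad byte and skip exactly one byte
      i += 1;
    } else {
      out += decoded;
      i += used;
    }
  }
  return out;
}
```

Creating a new decoder per attempt is slow but keeps the "reset the converter" step explicit; a real implementation would reset state in place.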
Assignee
Comment 37•24 years ago
That topic is not directly related to this bug. But I think when the user sets a
charset by menu or a charset is auto-detected, and if that matches the charset
of the sherlock file, then the parse will succeed after the reload.
Assignee
Comment 38•24 years ago
I have a patch for the entity interpretation but I found another problem with the
search query. The search code is expecting an escaped "UTF-8" query string but
it gets an escaped "ISO-8859-1" query string instead. Because of that, only "M"
is sent for the query instead of "Männer".
I think this has changed recently. We have been getting "UTF-8" and it
has been working fine. I think we can encode to "UTF-8" explicitly in .js file
when we escape the query.
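To illustrate the mismatch, a sketch (JavaScript, hypothetical helper names) of a receiver that insists on escaped UTF-8. The real search code is C++ and is not shown in this report; it apparently kept the text before the bad byte ("M"), whereas this sketch simply rejects the whole string.

```javascript
// Turn %XX escapes into raw bytes (other characters are assumed to be ASCII in this sketch).
function percentDecodeToBytes(escaped) {
  const bytes = [];
  for (let i = 0; i < escaped.length; i++) {
    const hex = escaped.slice(i + 1, i + 3);
    if (escaped[i] === "%" && /^[0-9A-Fa-f]{2}$/.test(hex)) {
      bytes.push(parseInt(hex, 16));
      i += 2; // skip the two hex digits just consumed
    } else {
      bytes.push(escaped.charCodeAt(i) & 0xff);
    }
  }
  return Uint8Array.from(bytes);
}

// Strictly interpret an escaped query as UTF-8; returns null when the bytes are not valid UTF-8.
function readUtf8Query(escaped) {
  try {
    return new TextDecoder("utf-8", { fatal: true }).decode(percentDecodeToBytes(escaped));
  } catch (e) {
    return null;
  }
}
```

With this, readUtf8Query("M%C3%A4nner") gives back the full word, while the ISO-8859-1 form "M%E4nner" is rejected.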
Assignee
Comment 39•24 years ago
Adding vidur and brendan.
The query problem started from 8/22 build. Escape behavior has changed since
that build.
Assignee
Comment 40•24 years ago
Example of the problem:
Before 8/22 "Santé" was escaped as "Sant%C3%A9",
after 8/22 escaped as "Sant%E9".
I backed out (locally) three files which were checked in by brendan on 8/21 then
the escape returned to the old behavior and the internet search query worked
fine again.
nsGlobalWindow.cpp rev=1.322
nsJSEnvironment.cpp rev=1.103
jsapi.c rev=3.66
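In a present-day JavaScript engine the two escaped forms quoted above can be reproduced with the built-in escape() and encodeURIComponent(). This is only an illustration; it is not the code path the 2000 builds used.

```javascript
const word = "Sant\u00e9"; // "Santé"

// Legacy escape(): one %XX per code unit up to 0xFF (the post-8/22 form).
const legacyEscaped = escape(word);

// encodeURIComponent(): one %XX per UTF-8 byte (the pre-8/22 form).
const utf8Escaped = encodeURIComponent(word);

console.log(legacyEscaped, utf8Escaped);
```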
Comment 41•24 years ago
(Cc'ing leger. Why is this bug Netscape-Confidential? I couldn't tell from
"View Bug Activity".)
Lazy JS class initialization broke the dom/src/base code's attempt to override
the ECMA-standard escape and unescape functions, which the JS engine implements
for all embeddings, irrespective of any dom/src/base/nsJSWindow.cpp overrides.
I didn't realize we overrode the engine's standards compliant implementations of
those functions. Do the overriding, dom-based ones really comply with ECMA-262
Edition 3?
My change to lazily initialize the String class obviously breaks these overrides
if the String class is initialized after the nsJSWindow.cpp code eagerly defines
the overriding escape and unescape functions: as soon as the script then uses
String, String.prototype, or another not-yet-defined name to do with strings and
String in JS, then js_InitStringClass will be called, and it will override the
overrides with the original, JS-engine-based versions of escape and unescape.
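The ordering problem in the previous paragraph can be modeled with a toy global object. This is a JavaScript sketch with hypothetical names; it only mirrors the sequence of events and is not the engine's actual C code.

```javascript
// Stand-ins for the two implementations; the prefixes just make them distinguishable.
function engineEscape(s) { return "engine:" + s; }
function domEscape(s) { return "dom:" + s; }

// Toy model of a global object whose standard names are resolved lazily.
function createLazyGlobal() {
  const props = {};
  let stringClassReady = false;

  // Stands in for js_InitStringClass: defines String and escape together.
  function initStringClass() {
    stringClassReady = true;
    props.String = String;
    props.escape = engineEscape; // clobbers whatever was defined earlier
  }

  return {
    define(name, value) { props[name] = value; },
    lookup(name) {
      if (!(name in props) && name === "String" && !stringClassReady) initStringClass();
      return props[name];
    },
  };
}

const lazyGlobal = createLazyGlobal();
lazyGlobal.define("escape", domEscape);                     // eager override by the embedding
const beforeStringUse = lazyGlobal.lookup("escape")("x");   // still the DOM version
lazyGlobal.lookup("String");                                // first use of String triggers lazy init
const afterStringUse = lazyGlobal.lookup("escape")("x");    // the engine version is back
```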
Before we try to fix this bug, or back any code out, we need to determine why
the ECMA-compliant implementations of escape and unescape from the JS engine are
not sufficient, and (if they are not) that the overriding implementations that
are defined in nsGlobalWindow.cpp are ECMA-compliant, as well as good for all
locales.
Cc'ing mccabe, rogerl, and waldemar. The overriding implementations of escape
and unescape are at
http://lxr.mozilla.org/mozilla/source/dom/src/base/nsGlobalWindow.cpp#1900 and
below. I believe the "nsEscape.h" include and nsEscape (global function? or is
it a ctor?) call in nsGlobalWindow.cpp get you this code:
http://lxr.mozilla.org/mozilla/source/xpcom/io/nsEscape.cpp
/be
Assignee
Comment 42•24 years ago
*** Bug 39956 has been marked as a duplicate of this bug. ***
Assignee
Comment 43•24 years ago
There is a bug about escape and unescape (bug 44272).
The overriding code does URL escaping like %xx, while the ECMA-compliant way is a
mixture of %xx and \uxxxx. As mentioned in bug 44272, there is a compatibility
issue: scripts which use escape() to generate a URL string will break if we
generate \uxxxx.
The other issue is the regression, this bug and 39956. I think it is also used in
local file attachment. And I see other places using escape() when I search our
.js files.
So I want the old behavior for escape().
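For a character outside Latin-1 the difference is easy to see in a current engine, where the legacy escape() writes the Unicode form as %uXXXX. The sketch below is JavaScript for illustration only and says nothing about how the 2000 builds behaved.

```javascript
const hiragana = "\u3042"; // "あ", a character outside Latin-1

// Legacy escape(): a single %uXXXX token, which is not valid URL percent-encoding.
const legacyForm = escape(hiragana);

// encodeURIComponent(): the UTF-8 bytes, each written as %XX.
const urlForm = encodeURIComponent(hiragana);

console.log(legacyForm, urlForm);
```

A server that only understands %XX byte escapes has no way to interpret the first form, which is the compatibility concern raised above.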
Comment 44•24 years ago
We should use the escape from dom/src/base. This issue has a long and tangled
history, and joki has lost much hair because of it.
The issue isn't with escape and unescape per se, but with the fact that string
information that is to be transmitted to servers primarily passes through them.
Interaction with servers evolved back when JavaScript used 8-bit characters in
strings, and more importantly, when JavaScript worked directly with the document
character encoding rather than with a unicode reflection of it.
After we moved escape and unescape into the JS engine in an effort to keep up
with the evolving ECMA standard, all of these server interactions broke, as they
were now getting unicode instead of, say, shift-JIS. We ended up keeping the
ECMA-draft versions in the engine (note that they are now not part of the
standard, because of this issue) and Joki implemented escape and unescape
functions in the DOM, where they had access to the document character encoding
and could do back-and-forth translations.
In short -
We need the engine versions, because embedding uses might expect them.
We need the DOM version, and it has to support legacy encoding behavior.
The DOM version needs to live there, lest the engine learn about document
character encodings.
Hm, I suppose we *could* #ifdef or otherwise detect that the engine is being
compiled into the DOM, and omit the engine escape and unescape in that case.
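As an illustration of why the DOM version needs the document character encoding, here is a sketch (JavaScript, hypothetical function names) of an escape that first converts the string to bytes in the document charset and then percent-escapes the bytes. Only ISO-8859-1 and UTF-8 are covered; a real browser would handle many more charsets (e.g. Shift_JIS).

```javascript
// Encode `text` to bytes in the given charset (two charsets only in this sketch).
function encodeInCharset(text, charset) {
  if (charset === "UTF-8") return new TextEncoder().encode(text);
  if (charset === "ISO-8859-1") {
    return Uint8Array.from(text, (c) => {
      const code = c.charCodeAt(0);
      if (code > 0xff) throw new RangeError("not representable in ISO-8859-1");
      return code;
    });
  }
  throw new Error("charset not handled in this sketch: " + charset);
}

// Percent-escape every byte that is not an ASCII letter, digit, or one of "-_.~".
function escapeForDocument(text, documentCharset) {
  let out = "";
  for (const b of encodeInCharset(text, documentCharset)) {
    const ch = String.fromCharCode(b);
    out += /[A-Za-z0-9\-_.~]/.test(ch)
      ? ch
      : "%" + b.toString(16).toUpperCase().padStart(2, "0");
  }
  return out;
}

console.log(escapeForDocument("M\u00e4nner", "ISO-8859-1"), escapeForDocument("M\u00e4nner", "UTF-8"));
```

The same input yields different escaped output depending on the charset, which is information the JS engine alone does not have.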
Comment 45•24 years ago
mccabe: might not some newer authors want the ECMA ed. 3 escape and unescape?
Maybe we should provide them under different names (ECMAescape, ECMAunescape --
what could be uglier?). Anyway, our claim to be ECMA ed. 3 compliant will suffer
slightly.
I'll take this bug, but please give me your thoughts on how to provide both sets
of functions.
/be
Assignee: nhotta → brendan
Status: ASSIGNED → NEW
Updated•24 years ago
Status: NEW → ASSIGNED
Target Milestone: M19 → M18
Comment 46•24 years ago
escape/unescape are now appendices to ECMA-262, but aren't needed for
compliance.
encodeURI/decodeURI are part of the ECMA-262 standard, and provide the
unicode-friendly behavior that's missing from the legacy-compatible escape and
unescape.
I think given what's out there, it's sufficient to just provide the DOM
escape/unescape when in the browser, whether this is enforced by #ifdef or some
API interaction.
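A short sketch of the encodeURI/decodeURI pair mentioned above, as it behaves in a current JavaScript engine. The URL is a made-up example.com address, not one from this bug.

```javascript
// Hypothetical search URL containing "Männer".
const searchUrl = "http://example.com/search?q=M\u00e4nner";

// encodeURI() keeps URL syntax characters (":", "/", "?", "=") and writes
// non-ASCII characters as UTF-8 percent escapes.
const encodedUrl = encodeURI(searchUrl);

// decodeURI() reverses the transformation.
const roundTrip = decodeURI(encodedUrl);

console.log(encodedUrl, roundTrip === searchUrl);
```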
Comment 47•24 years ago
Ah, good. I think I'll simply take escape and unescape out of the lazy-init
table in jsapi.c, using #ifndef MOZILLA_CLIENT.
/be
Comment 48•24 years ago
Comment 49•24 years ago
Patch checked in, with mccabe's presumptive blessing (he's taken off on vacation
but I'm highly confident).
/be
Status: ASSIGNED → RESOLVED
Closed: 24 years ago → 24 years ago
Resolution: --- → FIXED
Comment 50•24 years ago
r=mccabe, it's the right fix.
Assignee
Comment 51•24 years ago
I checked that the query problem is fixed on my local build. But the original
problem is not fixed, so reopening.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee
Comment 52•24 years ago
Reassign to nhotta.
Assignee: brendan → nhotta
Status: REOPENED → NEW
Assignee
Updated•24 years ago
Status: NEW → ASSIGNED
Assignee
Comment 53•24 years ago
Looks like I didn't really recompile the last time I tested the change.
The JS problem is still there. GlobalWindowImpl::Escape is still not called in
the current build. Reassign to brendan, please reassign back to me when the JS
problem is fixed.
Assignee: nhotta → brendan
Status: ASSIGNED → NEW
Comment 54•24 years ago
Whoops, forgot an #ifndef -- sorry about that. Fixed in js/src/jsstr.c rev
3.38, back to nhotta.
/be
Assignee: brendan → nhotta
Assignee
Updated•24 years ago
Status: NEW → ASSIGNED
Assignee
Comment 56•24 years ago
Checked in a fix for the search text truncation problem.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago → 24 years ago
Resolution: --- → FIXED
Comment 57•24 years ago
VERIFIED Fixed with 200008311 builds. *all I did was test the original bug as reported and
currently summarized, that nhotta says is fixed. If there's something else it really should've
been another bug.
Status: RESOLVED → VERIFIED
Comment 58•24 years ago
brendan's fix to js/src/jsstr.c rev 3.38 is broken. This #ifdef means that any
JS code that uses the dll built in the mozilla tree doesn't get an escape or
unescape function in global scope *unless* they use a DOM window as the global
object. This is a hidden gotcha for anyone using the dll or the build system
unmodified (including xpcshell). AND it breaks ECMA promises for JS code in our
browser executed in prefs, xpinstall, and as JS components. This is bad.
Can't we init these globals in the engine and then explicitly override them in
the DOM window?
Status: VERIFIED → REOPENED
Resolution: FIXED → ---
Assignee
Comment 59•24 years ago
John, I think you should reopen bug 44272 instead of this one.
Comment 60•24 years ago
I disagree. The change that breaks non-DOM users of JS was made to fix this bug
and specifically references this bug.
Comment 61•24 years ago
(reassigning to brendan@mozilla.org is always an option)
Assignee
Comment 62•24 years ago
Originally, this bug had nothing to do with that JS issue.
The last time, the JS change caused a regression and broke the search feature
itself so the fix for the original problem was blocked.
Comment 63•24 years ago
jband: escape and unescape are not normative in ECMA-262 Edition 3, according to
mccabe's comment in this bug, and to my reading (they're in an informative
appendix). Are you missing them in some context? As mccabe points out, there
are "better for Unicode" ECMA methods available in all contexts.
Of course, I should have debloated jsstr.c by #ifndef'ing MOZILLA_CLIENT the
function definitions too. I'll do that, if you agree that we can afford to take
the cheap #ifndef way out here.
/be
Comment 64•24 years ago
[re-closing this]
brendan: Sorry, rginda and I were taking Flanagan's (informal and dated) word
that escape and unescape were in ECMA. I thought the other bug was more about
the lack of unicode support. I now see mccabe's closing comment and I see how
ECMA 3 treats these functions. If you think leaving them out is the right thing,
then I'm OK with that.
This came to my attention because rginda mentioned to me that the functions do
not appear in xpcshell (but do in jsshell). I'm not sure if this is a hardship
for him or not. When I saw the mechanism by which they are left out I was
worried that you'd temporarily overlooked the fact that 'compiled with
MOZILLA_CLIENT' != 'running in DOM window or even in seamonkey'. I don't know
how to gauge people's expectations about the availability of these functions. I
just wanted to make sure you'd made your decision keeping in mind other users of
the code as built in the mozilla tree.
Status: REOPENED → RESOLVED
Closed: 24 years ago → 24 years ago
Resolution: --- → FIXED
Comment 65•24 years ago
MOZILLA_CLIENT is defined by autoconf.mk for the client build, but it's not
defined by Makefile.ref or js.mak. I really don't think the #ifndef solution is
going to cause anyone hardship (phil and rob should comment on how standalone JS
is packaged, but I bet it's not with MOZILLA_CLIENT hardcoded).
/be
Comment 66•24 years ago
I'm betting that most projects using other mozilla technology in addition to the
JS engine (e.g. nspr & xpcom & js & xpconnect & maybe more of gecko) will build
using the mozilla build system and then just import the built parts into their
own build systems. Or they'll clone enough of the mozilla build system to make
it work for them. JS by itself using the standalone makefiles is not the only
game in town. XPCOM_STANDALONE and XPCONNECT_STANDALONE are very minor
variations on the mozilla system and I know they don't undo MOZILLA_CLIENT.
Again, I'm not too worried. But I think the effects of the MOZILLA_CLIENT
#define are too subtle for your average embedder to notice.
Comment 67•24 years ago
Re-verifying. The behavior this bug references is fixed and verified. Anything else
is an offline issue or needs to be a different bug. Feel free to reference this bug #.
It'll still be here; it'll just be VERIFIED Fixed.
Status: RESOLVED → VERIFIED
Comment 68•24 years ago
The bug originally reported here was corrected by a checkin made by nhotta on
August 30 (file nsInternetSearchService.cpp). There is no patch posted here for
that checkin.
Unfortunately that change is the cause of a new bug whereby any plus signs that
the user types into the sidebar search field will get erased after the search is
done. This is bug 55052. I'm currently investigating how to modify nhotta's
checkin so that it doesn't cause the plus signs to be erased.
Comment 69•23 years ago
I don't get why this is Netscape Confidential either, opening this up to be
world-viewable since nobody gave a reason when Brendan asked why this needs to
be Netscape Confidential back in comment #41.
Group: netscapeconfidential?
Updated•16 years ago
Product: Core → SeaMonkey