Closed Bug 42221 Opened 24 years ago Closed 24 years ago

DE: Searching using extended char. cuts off word in sidebar search widget

Categories

(SeaMonkey :: Search, defect, P2)

x86
Windows NT
defect

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: lynnw, Assigned: nhottanscp)

References

Details

(Whiteboard: [nsbeta3+][nsbeta2-])

Attachments

(4 files)

Performing a search from the sidebar using an extended character such as an "a" with an umlaut (use a word such as "Männer") produces the correct results, but the word left in the search box is cut off where the extended character should be (in this case, if the user searched for "männer", all the user will see in the search box after the search is finished is "m").
qa: claudius
QA Contact: shrir → claudius
Keywords: nsbeta2
Putting on [nsbeta2-]. Once you start typing, this is set, and you would have to start again if hitting reload. This is the way it works.
Whiteboard: [nsbeta-]
Let me clarify this situation: 1. In sidebar search, enter an English word, such as "car", into the search box. Click on "search" button. When the results pop up, the search box in the sidebar still contains the word "car". 2. In the sidebar search, enter a foreign word such as "männer", into the search box. Click on "search" button. When the results pop up, the search box in the sidebar contains the word "m". The rest of the word has been dropped. If the word had been "manner", the whole word would have been in the box. I don't understand what the last comment on this bug has to do with the situation I have described here.
Correcting spelling for beta2 minus
Whiteboard: [nsbeta-] → [nsbeta2-]
m19
Target Milestone: --- → M19
Nav triage team: NEED INFO; reassigning to Frank.
Assignee: slamm → ftang
Whiteboard: [nsbeta2-] → [nsbeta2-][NEED INFO]
nhotta, can you take a look at this? Is it caused by non-convertible chars?
Assignee: ftang → nhotta
Accepting, it still happens (used win32 build ID 2000072704). Adding rjc, ftang to cc.
Status: NEW → ASSIGNED
Keywords: nsbeta3
Per i18n triage meeting, the status is now [nsbeta3+]. Please check on the current status of this problem. I see it working with 7/29 build.
Whiteboard: [nsbeta2-][NEED INFO] → [nsbeta3+]
could someone tell me how to generate the a-umlaut on a US pc keyboard, so that I may test this properly?
alt + 0228 (on the num pad) ä Tip: Run Character Map (charmap.exe) to see an entire font and the ALT key combinations needed to create extended characters on a non native keyboard.
You can generate the a+umlaut with this command: ALT+NUMLOCK+0228.
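For anyone wondering where 0228 comes from, here is a minimal Python sketch. It assumes the Alt+0nnn form selects the Windows ANSI code page (taken to be cp1252 on a US system) and that the form without the leading zero selects the OEM code page (taken to be cp437); both code page choices are assumptions about the tester's machine, not something stated in this bug.

```python
# Alt+0228 enters byte 228 (0xE4) in the Windows ANSI code page (assumed cp1252).
ansi_char = bytes([228]).decode("cp1252")

# Without the leading zero, the OEM code page (assumed cp437) applies instead,
# so the same number yields a different character.
oem_char = bytes([228]).decode("cp437")

# In cp437 the a-umlaut lives at 132 (0x84).
oem_a_umlaut = bytes([132]).decode("cp437")

print(ansi_char, oem_char, oem_a_umlaut)
```

The leading zero matters: leaving it out on such a system produces a different character from the one needed to reproduce this bug.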
Whiteboard: [nsbeta3+] → [nsbeta3+][nsbeta2-]
I cannot see the problem using M18 win32 build ID2000080308. Marking as FIXED.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Actually, I saw the same problem again this morning while using the JA PR2 browser, so it doesn't appear to be fixed. Sometimes it works, but sometimes it doesn't. Will have to pinpoint the problem.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Since I saw it's fixed in M18, please reopen when you see that again in M18 (ID2000080308 or later).
Status: REOPENED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
yeah this went out broken in PR2 (it was nsbeta2-). It is however VERIFIED Fixed in m18 builds (2000080908).
Status: RESOLVED → VERIFIED
Closing, since verified fixed.
Status: VERIFIED → CLOSED
This is still an issue in the Aug. 15, 2000 build of Netscape 6. It seems to be something wrong with the way the Google search results are being handled. I will attach a file showing the problems in the sidebar search box and another problem in the sidebar results. There is also another bug I will open against the US NetscapeSearch.src file, since this file does not show Google results if a user is looking at ODP results and then clicks on the link to see more results using Google. I'll let you know what the bug number is once I create it.
Status: CLOSED → REOPENED
Resolution: FIXED → ---
Attached image googleresults08152000 (deleted)
I used the same build but I cannot reproduce this. ODP found the search results for "Männer" and did not go to Google.
Sorry, forgot to mention that this is probably a bug in the German sherlock file.
Component: Sidebar → Search
Keywords: deb2
Summary: Searching using extended char. cuts off word in sidebar search widget → DE: Searching using extended char. cuts off word in sidebar search widget
There was no issue with the US NetscapeSearch.src file. Looks OK.
Okay, then the bug has to be reassigned. Who is responsible for the German version?
Correction, I AM seeing it in the US NetscapeSearch.src file. 1. Select the US Netscape Search for your sidebar search. 2. Type in männer in the sidebar search box and click on Search button. 3. Once your results come up, go to the bottom of the result page and click on the link "Additional results for ' männer ' using Googl." 4. Note that the results you get look like what you see in the attached image file.
I still cannot reproduce. Lynn, the build id of your screen shot says 2000080712. Is it also reproducible with today's build?
Note that in the screenshot you took, the Google results aren't even appearing in the sidebar, thus you can't see the problem I'm talking about. This problem with Google results not showing up in the sidebar was another bug I had started to report, but then I thought perhaps it was fixed. Looks like it isn't???
This is still an issue in the Aug. 15, 2000 build of Netscape 6.
So I cannot get to the problem because the side bar does not update, any workaround?
The odd thing is that it does update sometimes. Try removing the Users50 profile and use the default sidebar search. It's working for me right now.
I found that there is a problem parsing the result page when it contains named entities (e.g. &auml; for ä). It is possible to change the parser to support the HTML 4 Latin-1 entities (96 entities).
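To illustrate the entity problem, here is a small Python sketch using the standard library's HTML 4 entity table. It is not the Mozilla parser; the variable names are made up for the example. It shows a result string with a named entity being resolved, and counts the entities that fall in the Latin-1 range.

```python
import html
from html.entities import name2codepoint

# A result title as a search engine might return it, with a named entity.
raw_result = "M&auml;nner"
decoded = html.unescape(raw_result)

# HTML 4 entity names whose code points fall in the Latin-1 range U+00A0..U+00FF.
latin1_entities = sorted(
    name for name, codepoint in name2codepoint.items() if 0xA0 <= codepoint <= 0xFF
)

print(decoded, len(latin1_entities))
```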
Status: REOPENED → ASSIGNED
I see that Google and also another search engine we use for Germany do not define their charset in their search result page. Could this be the reason why the sidebar is unable to show the extended characters?
No, I checked our search pages and we don't define the charset either. I'll look at the latest US build to see if this problem has been fixed.
Charset "ISO-8859-1" is used when no charset/encoding is specified in the sherlock file, so the Google German result page is interpreted as "ISO-8859-1".
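The fallback described here can be sketched in a few lines of Python. The function and constant names are hypothetical and only model the rule "use the declared charset if there is one, otherwise ISO-8859-1"; this is not the actual search service code.

```python
DEFAULT_CHARSET = "ISO-8859-1"  # fallback when the sherlock file names no charset


def decode_result_page(body: bytes, declared_charset=None) -> str:
    """Decode a result page with the declared charset, else the default."""
    return body.decode(declared_charset or DEFAULT_CHARSET)


# A page whose bytes are Latin-1 and whose sherlock file declares nothing.
page = "Männer".encode("iso-8859-1")
print(decode_result_page(page))
```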
So, what will happen if we specify a non-single-byte charset and somehow the returned page is not encoded in that charset? Do we handle the conversion error in the internet search? What we should do, if we hit that kind of problem, is skip one byte, reset the converter, and try to convert the rest of the received data.
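The recovery strategy proposed here (skip one byte, reset the converter, continue) might look roughly like the following Python sketch. The function name is hypothetical, and UTF-8 stands in for the multi-byte charset; starting a fresh decode on the remaining bytes plays the role of resetting the converter.

```python
def decode_skipping_bad_bytes(data: bytes, charset: str) -> str:
    """Decode data, dropping one byte at each conversion error and resuming."""
    pieces = []
    pos = 0
    while pos < len(data):
        try:
            pieces.append(data[pos:].decode(charset))
            break
        except UnicodeDecodeError as err:
            # err.start is relative to the slice that was being decoded.
            pieces.append(data[pos:pos + err.start].decode(charset))
            # Skip the single offending byte; the next decode starts clean.
            pos += err.start + 1
    return "".join(pieces)


# Latin-1 bytes for "Männer" fed to a UTF-8 converter: 0xE4 is dropped,
# but the rest of the word survives instead of being cut off.
print(decode_skipping_bad_bytes(b"M\xe4nner", "utf-8"))
```

The result loses the unconvertible character but keeps the rest of the data, which is the trade-off the comment argues for.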
That topic is not directly related to this bug. But I think that when the user sets a charset from the menu, or a charset is auto-detected, and it matches the charset of the sherlock file, the parse will succeed after the reload.
I have a patch for the entity interpretation, but I found another problem with the search query. The search code is expecting an escaped "UTF-8" query string but it gets an escaped "ISO-8859-1" query string instead. Because of that, only "M" is sent for the query instead of "Männer". I think this has changed recently. We had been getting "UTF-8" and it had been working fine. I think we can encode to "UTF-8" explicitly in the .js file when we escape the query.
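The truncation to "M" can be reproduced in a short Python sketch. It assumes the consumer unescapes the query to bytes, decodes them as UTF-8, and keeps only what it converted before the first invalid byte; that stop-at-first-error behavior is an assumption consistent with the symptom, not something confirmed from the search code. The function name is hypothetical.

```python
from urllib.parse import unquote_to_bytes


def decode_until_error(escaped_query: str, charset: str = "utf-8") -> str:
    """Unescape a query and keep only the text decoded before the first error."""
    raw = unquote_to_bytes(escaped_query)
    try:
        return raw.decode(charset)
    except UnicodeDecodeError as err:
        return raw[:err.start].decode(charset)


# Query escaped from ISO-8859-1 bytes: 0xE4 is not valid UTF-8 here.
print(decode_until_error("M%E4nner"))
# Query escaped from UTF-8 bytes: decodes completely.
print(decode_until_error("M%C3%A4nner"))
```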
Adding vidur and brendan. The query problem started from 8/22 build. Escape behavior has changed since that build.
Example of the problem: Before 8/22 "Santé" was escaped as "Sant%C3%A9"; after 8/22 it is escaped as "Sant%E9". I backed out (locally) the three files that were checked in by brendan on 8/21; escape then returned to the old behavior and the internet search query worked fine again. The files: nsGlobalWindow.cpp rev=1.322, nsJSEnvironment.cpp rev=1.103, jsapi.c rev=3.66.
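The two escaped forms differ only in which byte encoding is percent-escaped. A quick Python check with the standard library (an illustration of the byte-level difference, not of either escape implementation) reproduces both strings:

```python
from urllib.parse import quote

word = "Santé"

# Percent-escape the UTF-8 bytes of the string (the pre-8/22 result).
escaped_utf8 = quote(word, encoding="utf-8")

# Percent-escape the ISO-8859-1 bytes of the string (the post-8/22 result).
escaped_latin1 = quote(word, encoding="iso-8859-1")

print(escaped_utf8, escaped_latin1)
```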
(Cc'ing leger. Why is this bug Netscape-Confidential? I couldn't tell from "View Bug Activity".) Lazy JS class initialization broke the dom/src/base code's attempt to override the ECMA-standard escape and unescape functions, which the JS engine implements for all embeddings, irrespective of any dom/src/base/nsJSWindow.cpp overrides. I didn't realize we overrode the engine's standards compliant implementations of those functions. Do the overriding, dom-based ones really comply with ECMA-262 Edition 3? My change to lazily initialize the String class obviously breaks these overrides if the String class is initialized after the nsJSWindow.cpp code eagerly defines the overriding escape and unescape functions: as soon as the script then uses String, String.prototype, or another not-yet-defined name to do with strings and String in JS, then js_InitStringClass will be called, and it will override the overrides with the original, JS-engine-based versions of escape and unescape. Before we try to fix this bug, or back any code out, we need to determine why the ECMA-compliant implementations of escape and unescape from the JS engine are not sufficient, and (if they are not) that the overriding implementations that are defined in nsGlobalWindow.cpp are ECMA-compliant, as well as good for all locales. Cc'ing mccabe, rogerl, and waldemar. The overriding implementations of escape and unescape are at http://lxr.mozilla.org/mozilla/source/dom/src/base/nsGlobalWindow.cpp#1900 and below. I believe the "nsEscape.h" include and nsEscape (global function? or is it a ctor?) call in nsGlobalWindow.cpp get you this code: http://lxr.mozilla.org/mozilla/source/xpcom/io/nsEscape.cpp /be
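The ordering problem described in this comment can be modeled with a toy Python scope. Everything here is hypothetical (class, method, and function names included) and greatly simplified; it only shows how a lazy initializer that defines its names unconditionally can clobber an override installed earlier.

```python
def engine_escape(text):
    return "engine:" + text


def dom_escape(text):
    return "dom:" + text


class GlobalScope:
    """Toy global object with lazy initialization of string-related names."""

    LAZY_STRING_NAMES = {"String", "escape", "unescape"}

    def __init__(self):
        self.names = {}
        self.string_initialized = False

    def define(self, name, value):
        self.names[name] = value

    def _init_string_class(self):
        # Defines its names unconditionally, overwriting anything already there.
        self.names["String"] = str
        self.names["escape"] = engine_escape
        self.string_initialized = True

    def resolve(self, name):
        needs_lazy_init = (
            name not in self.names
            and name in self.LAZY_STRING_NAMES
            and not self.string_initialized
        )
        if needs_lazy_init:
            self._init_string_class()
        return self.names[name]


scope = GlobalScope()
scope.define("escape", dom_escape)        # the DOM installs its override eagerly
before = scope.resolve("escape")("x")     # still the DOM version
scope.resolve("String")                   # first use of String triggers lazy init
after = scope.resolve("escape")("x")      # the override has been replaced
print(before, after)
```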
*** Bug 39956 has been marked as a duplicate of this bug. ***
There is a bug about escape and unescape (bug 44272). The overriding code does URL escaping like %xx, while the ECMA-compliant way is a mixture of %xx and %uxxxx. As mentioned in bug 44272, there is a compatibility issue: scripts that use escape() to generate URL strings will break if we generate %uxxxx. The other issue is the regression, this bug and 39956. I think escape() is also used in local file attachment, and I see other places that use it when I search our .js files. So I want the old behavior for escape().
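For reference, the engine-style escape can be approximated in Python as follows. This is a sketch written from the ECMA-262 description of escape (code units below 256 become %XX, larger ones become %uXXXX, and a fixed set of characters passes through); the function name is made up, and it is not the JS engine source.

```python
import string

# Characters that escape() leaves untouched.
UNESCAPED = set(string.ascii_letters + string.digits + "@*_+-./")


def ecma_escape(text: str) -> str:
    """Approximate the ECMA-262 escape() over UTF-16 code units."""
    out = []
    utf16 = text.encode("utf-16-be")
    for i in range(0, len(utf16), 2):
        unit = int.from_bytes(utf16[i:i + 2], "big")
        char = chr(unit)
        if char in UNESCAPED:
            out.append(char)
        elif unit < 256:
            out.append(f"%{unit:02X}")
        else:
            out.append(f"%u{unit:04X}")
    return "".join(out)


print(ecma_escape("Santé"))
print(ecma_escape("日本"))
```

A server that expects plain %xx URL escaping will not understand the second form, which is the compatibility concern raised in this comment.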
We should use the escape from dom/src/base. This issue has a long and tangled history, and joki has lost much hair because of it. The issue isn't with escape and unescape per se, but with the fact that string information that is to be transmitted to servers primarily passes through them. Interaction with servers evolved back when JavaScript used 8-bit characters in strings, and more importantly, when JavaScript worked directly with the document character encoding rather than with a unicode reflection of it. After we moved escape and unescape into the JS engine in an effort to keep up with the evolving ECMA standard, all of these server interactions broke, as they were now getting unicode instead of, say, shift-JIS. We ended up keeping the ECMA-draft versions in the engine (note that they are now not part of the standard, because of this issue) and Joki implemented escape and unescape functions in the DOM, where they had access to the document character encoding and could do back-and-forth translations. In short: we need the engine versions, because embedding uses might expect them; we need the DOM version, and it has to support legacy encoding behavior; and the DOM version needs to live there, lest the engine learn about document character encodings. Hm, I suppose we *could* #ifdef or otherwise detect that the engine is being compiled into the DOM, and omit the engine escape and unescape in that case.
mccabe: might not some newer authors want the ECMA ed. 3 escape and unescape? Maybe we should provide them under different names (ECMAescape, ECMAunescape -- what could be uglier?). Anyway, our claim to be ECMA ed. 3 compliant will suffer slightly. I'll take this bug, but please give me your thoughts on how to provide both sets of functions. /be
Assignee: nhotta → brendan
Status: ASSIGNED → NEW
Status: NEW → ASSIGNED
Target Milestone: M19 → M18
escape/unescape are now appendices to ECMA-262, but aren't needed for compliance. encodeURI/decodeURI are part of the ECMA-262 standard, and provide the unicode-friendly behavior that's missing from the legacy-compatible escape and unescape. I think given what's out there, it's sufficient to just provide the DOM escape/unescape when in the browser, whether this is enforced by #ifdef or some API interaction.
Ah, good. I think I'll simply take escape and unescape out of the lazy-init table in jsapi.c, using #ifndef MOZILLA_CLIENT. /be
Patch checked in, with mccabe's presumptive blessing (he's taken off on vacation but I'm highly confident). /be
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
r=mccabe, it's the right fix.
I checked that the query problem is fixed in my local build. The original problem is not fixed, though, so reopening.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Reassign to nhotta.
Assignee: brendan → nhotta
Status: REOPENED → NEW
Status: NEW → ASSIGNED
Looks like I didn't really recompile the last time I tested the change. The JS problem is still there. GlobalWindowImpl::Escape is still not called in the current build. Reassign to brendan, please reassign back to me when the JS problem is fixed.
Assignee: nhotta → brendan
Status: ASSIGNED → NEW
Whoops, forgot an #ifndef -- sorry about that. Fixed in js/src/jsstr.c rev 3.38, back to nhotta. /be
Assignee: brendan → nhotta
mark as P2
Priority: P3 → P2
Status: NEW → ASSIGNED
Checked in a fix for the search text truncation problem.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
VERIFIED Fixed with 200008311 builds. *all I did was test the original bug as reported and currently summarized, that nhotta says is fixed. If there's something else it really should've been another bug.
Status: RESOLVED → VERIFIED
brendan's fix to js/src/jsstr.c rev 3.38 is broken. This #ifdef means that any JS code that uses the dll built in the mozilla tree doesn't get an escape or unescape function in global scope *unless* they use a DOM window as the global object. This is a hidden gotcha for anyone using the dll or the build system unmodified (including xpcshell). AND it breaks ECMA promises for JS code in our browser executed in prefs, xpinstall, and as JS components. This is bad. Can't we init these globals in the engine and then explicitly override them in the DOM window?
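The gotcha described here can be modeled in Python, with a boolean standing in for the compile-time MOZILLA_CLIENT define. All names are hypothetical and the model is deliberately crude; it only shows that a build-time switch affects every global object in that build, not just DOM windows.

```python
def engine_escape(text):
    return "engine:" + text


def dom_escape(text):
    return "dom:" + text


def build_engine_globals(mozilla_client: bool) -> dict:
    """Names the engine defines; the flag stands in for #ifndef MOZILLA_CLIENT."""
    names = {"String": str}
    if not mozilla_client:
        names["escape"] = engine_escape
    return names


def make_dom_window_global(engine_globals: dict) -> dict:
    """A DOM window global adds its own escape on top of the engine names."""
    window = dict(engine_globals)
    window["escape"] = dom_escape
    return window


standalone = build_engine_globals(mozilla_client=False)
client_non_dom = build_engine_globals(mozilla_client=True)  # e.g. a shell or component
client_dom = make_dom_window_global(build_engine_globals(mozilla_client=True))

print("escape" in standalone, "escape" in client_non_dom, "escape" in client_dom)
```

In this model, code running in the client build without a DOM window global is the case that ends up with no escape at all.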
Status: VERIFIED → REOPENED
Resolution: FIXED → ---
John, I think you should reopen bug 44272 instead of this one.
I disagree. The change that breaks non-DOM users of JS was made to fix this bug and specifically references this bug.
(reassigning to brendan@mozilla.org is always an option)
Originally, this bug had nothing to do with that JS issue. The last time, the JS change caused a regression and broke the search feature itself so the fix for the original problem was blocked.
jband: escape and unescape are not normative in ECMA-262 Edition 3, according to mccabe's comment in this bug, and to my reading (they're in an informative appendix). Are you missing them in some context? As mccabe points out, there are "better for Unicode" ECMA methods available in all contexts. Of course, I should have debloated jsstr.c by #ifndef'ing MOZILLA_CLIENT the function definitions too. I'll do that, if you agree that we can afford to take the cheap #ifndef way out here. /be
[re-closing this] brendan: Sorry, rginda and I were taking Flanagan's (informal and dated) word that escape and unescape were in ECMA. I thought the other bug was more about the lack of unicode support. I now see mccabe's closing comment and I see how ECMA 3 treats these functions. If you think leaving them out is the right thing, then I'm OK with that. This came to my attention because rginda mentioned to me that the functions do not appear in xpcshell (but do in jsshell). I'm not sure if this is a hardship for him or not. When I saw the mechanism by which they are left out I was worried that you'd temporarily overlooked the fact that 'compiled with MOZILLA_CLIENT' != 'running in DOM window or even in seamonkey'. I don't know how to gauge people's expectations about the availability of these functions. I just wanted to make sure you'd made your decision keeping in mind other users of the code as built in the mozilla tree.
Status: REOPENED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
MOZILLA_CLIENT is defined by autoconf.mk for the client build, but it's not defined by Makefile.ref or js.mak. I really don't think the #ifndef solution is going to cause anyone hardship (phil and rob should comment on how standalone JS is packaged, but I bet it's not with MOZILLA_CLIENT hardcoded). /be
I'm betting that most projects using other mozilla technology in addition to the JS engine (e.g. nspr & xpcom & js & xpconnect & maybe more of gecko) will build using the mozilla build system and then just import the built parts into their own build systems. Or they'll clone enough of the mozilla build system to make it work for them. JS by itself using the standalone makefiles is not the only game in town. XPCOM_STANDALONE and XPCONNECT_STANDALONE are very minor variations on the mozilla system and I know they don't undo MOZILLA_CLIENT. Again, I'm not too worried. But I think the effects of the MOZILLA_CLIENT #define are too subtle for your average embedder to notice.
Re-verifying. The behavior this bug references is fixed and verified. Anything else is an offline issue or needs to be a different bug. Feel free to reference this bug #. It'll still be here; it'll just be VERIFIED Fixed.
Status: RESOLVED → VERIFIED
The bug originally reported here was corrected by a checkin made by nhotta on August 30 (file nsInternetSearchService.cpp). There is no patch posted here for that checkin. Unfortunately that change is the cause of a new bug whereby any plus signs that the user types into the sidebar search field will get erased after the search is done. This is bug 55052. I'm currently investigating how to modify nhotta's checkin so that it doesn't cause the plus signs to be erased.
I don't get why this is Netscape Confidential either. Opening this up to be world-viewable, since nobody gave a reason when Brendan asked why this needs to be Netscape Confidential back in comment #41.
Group: netscapeconfidential?
Product: Core → SeaMonkey