Closed
Bug 5933
Opened 27 years ago
Closed 25 years ago
International support for IMAP4 search
Categories
(MailNews Core :: Internationalization, defect, P1)
Tracking
(Not tracked)
VERIFIED
FIXED
M17
People
(Reporter: mozilla, Assigned: nhottanscp)
References
Details
(Whiteboard: [nsbeta2+]Exception Feature)
(This bug imported from BugSplat, Netscape's internal bugsystem. It
was known there as bug #88257
http://scopus.netscape.com/bugsplat/show_bug.cgi?id=88257
Imported into Bugzilla on 05/04/99 17:49)
The Messenger client should have a fallback mechanism in case the IMAP4 server
doesn't support the charset used with the SEARCH command. For example, when it's
working in a Japanese character encoding, it should work as follows:
result = SEARCH(UTF8 string with charset "UTF-8");
if (result == NO) { // UTF-8 may not be supported.
    result = SEARCH(ISO-2022-JP string with charset "ISO-2022-JP");
    if (result == NO) { // ISO-2022-JP may not be supported.
        result = SEARCH(AS IS without charset);
        if (result == NO)
            printf("Couldn't match any");
    }
}
Note: By checking whether the search string contains only ASCII, you can skip
the first two SEARCH() calls. That is up to the implementation.
The Messenger client in Communicator 4.0 doesn't work as described above. It sends the SEARCH
command with the "Shift_JIS" charset, and gives up without retrying if the server
response is "NO".
Actually, there is one change to the algorithm specified here.
As the very first step, if the search string contains only US-ASCII (regardless of
the encoding of the search UI), then SEARCH with charset=US-ASCII; otherwise continue
as listed here.
Comment 3•27 years ago
Seems more search related than IMAP related. If you disagree, assign back to me.
Comment 4•27 years ago
Yes; it's search related, so it goes to Scott :-)
We'll need to invent some way to allow multiple passes at a single search scope,
which we don't have right now.
To clarify jfriend's point, if the search string contains only US-ASCII, we only
try US-ASCII, and not any i18n charset stuff.
I'd also like to clarify whether the IMAP server must send a NO response
if it doesn't know the charset, or whether it can just search and not find any
matches.
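On the NO-versus-no-matches question: RFC 2060 section 6.4.4 says that a server which does not support the specified charset MUST return a tagged NO response (not a BAD), so a retry can key off the tagged status line. A minimal sketch of that classification follows; `ImapStatus` and `ClassifyTaggedResponse` are hypothetical names for illustration, not anything in the Messenger code, and the sketch assumes the status word arrives in upper case.

```cpp
#include <sstream>
#include <string>

// Outcome of the tagged completion line for one command.
enum class ImapStatus { Ok, No, Bad, Unknown };

// Hypothetical helper: classify a line such as "a1 NO unsupported charset"
// for the command that was sent with tag `tag`. Untagged lines ("* SEARCH 2 5")
// and lines for other tags are reported as Unknown.
ImapStatus ClassifyTaggedResponse(const std::string& line,
                                  const std::string& tag) {
  std::istringstream in(line);
  std::string seen_tag;
  std::string status;
  if (!(in >> seen_tag >> status) || seen_tag != tag) {
    return ImapStatus::Unknown;
  }
  if (status == "OK") return ImapStatus::Ok;    // search ran; may have 0 hits
  if (status == "NO") return ImapStatus::No;    // e.g. charset not supported
  if (status == "BAD") return ImapStatus::Bad;  // command syntax error
  return ImapStatus::Unknown;
}
```

With this split, an OK with an empty `* SEARCH` result means "no matches", while only a NO would trigger another pass with a different charset.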
Mass moving bugs from product version 5.0 to 4.5 since that's where the bugs are
now (no change to TFV).
Comment 9•27 years ago
Bulk change: Bug assigned to mail/news engineer but no component specified.
Changed to mail/news component.
Comment 10•27 years ago
<sorry for the bug notification intrusion. Product version on this bug shows
1.0 (due to a bugsplat bug). Correcting all mail/news bugs numbered < 90000 to
product version 4.0. Bulk changing this.>
Comment 11•27 years ago (Reporter)
FYI.
tintin.mcom.com is now running MS4.0 Beta, which supports various
charset options for IMAP SEARCH. If you don't have a test environment
now, ask sheneman@netscape.com for an account.
However, I would recommend having WSU's IMAP4 server as a reference
as well. It also has a good SEARCH implementation.
Comment 12•27 years ago
Phil, wasn't this the I18N bug we were talking about with Naoki and Bob? Were
you going to end up doing this one?
changing QA field to gbush
Comment 13•27 years ago
Bouncing over to Phil.
Comment 14•27 years ago
M15, I hope. I won't get to this for 4.5b2
Comment 15•26 years ago
Later. Too many more serious bugs for 4.5.
Comment 16•26 years ago
How can this remain "latered"? Negotiation of search
charsets with our own MS4.0 is something most major mail clients
can perform now, e.g. Outlook, WinBiff, etc. We should be
competitive and support the search charset negotiation. Without
this, our IMAP search for Japanese and other non-ASCII languages
would not work. How can we promote our clients to enterprise
customers without this feature working?
MS4.0 is nearing completion. This bug should be a perfect candidate
for 4.51.
Re-opening for consideration in 4.51.
Comment 17•26 years ago
In case we need to review how this functionality should work,
I consulted taka and came up with the following summary of the
spec.
** Proposed steps for negotiating down the IMAP search charset. **
0. Check the 'capability' of the IMAP server for UTF-8.
IMAP4 capability command should return something like the
following in response to "a capability" command:
a capability
* CAPABILITY IMAP4 IMAP4rev1 ACL QUOTA LITERAL NAMESPACE UIDPLUS
LANGUAGE XSENDER X-NETSCAPE XSERVERINFO AUTH=PLAIN AUTH=LOGIN
a OK Completed
If the return string contains "X-NETSCAPE", we can be assured of UTF-8
search capability with this server.
(Note: If you see X-NETSCAPE in the response to the CAPABILITY command, there's a
100% guarantee that the server will recognize the UTF-8 charset. Do NOT
rely on the banner message because it's configurable; the user may change
it to something else. You can always try UTF-8 as the charset whether or not
it's an IMAP4 server (it will fail if the server doesn't know UTF-8). )
1. Determine if the search string contains any 8-bit characters.
---> If not (=only 7-bit data), send the search string in ASCII.
2. If 1) is yes, then assume that the search string is in the System
Charset (or the global default -- e.g. in 4.5 we use the global default
for LDAP servers so that more than one charset can be used for search.)
Convert it to UTF-8 and send it to the server. If the server accepts it,
then it should return matches if there are any.
3. If the request in 2 is rejected by the server, then, send the string in
the standard mail charset matching the System (or the global default)
charset. (For example, iso-2022-jp for the Japanese Win/Mac system
charset, Shift_JIS.)
4. If the request in 3 is rejected, then send the raw search
string (as is) without any charset specification.
And this completes the client's responsibility.
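The steps above can be sketched as follows. This is only an illustration under stated assumptions: `HasCapability`, `SearchAttempt`, `BuildAttempts`, and `NegotiateSearch` are hypothetical names, the charset conversions for steps 2 and 3 are assumed to have been done by the caller, and the `send` callback stands in for "issue the SEARCH and report whether the server accepted it".

```cpp
#include <cstddef>
#include <functional>
#include <sstream>
#include <string>
#include <vector>

// Step 0: true if the "* CAPABILITY ..." line lists `token` as a whole word.
bool HasCapability(const std::string& capability_line,
                   const std::string& token) {
  std::istringstream in(capability_line);
  std::string word;
  while (in >> word) {
    if (word == token) return true;
  }
  return false;
}

// Step 1: the key is pure 7-bit if no byte has the high bit set.
bool IsSevenBit(const std::string& bytes) {
  for (unsigned char c : bytes) {
    if (c >= 0x80) return false;
  }
  return true;
}

struct SearchAttempt {
  std::string charset;  // empty means "send without a CHARSET argument"
  std::string key;      // search key bytes, already encoded in `charset`
};

// Steps 1-4: list the attempts in the order they should be tried.
std::vector<SearchAttempt> BuildAttempts(const std::string& raw_key,
                                         const std::string& utf8_key,
                                         const std::string& mail_charset,
                                         const std::string& mail_key) {
  std::vector<SearchAttempt> attempts;
  if (IsSevenBit(raw_key)) {
    attempts.push_back(SearchAttempt{"US-ASCII", raw_key});  // step 1
    return attempts;
  }
  attempts.push_back(SearchAttempt{"UTF-8", utf8_key});      // step 2
  attempts.push_back(SearchAttempt{mail_charset, mail_key}); // step 3
  attempts.push_back(SearchAttempt{"", raw_key});            // step 4
  return attempts;
}

// Try each attempt until `send` reports that the server accepted one.
// Returns the index of the accepted attempt, or -1 if all were rejected.
int NegotiateSearch(const std::vector<SearchAttempt>& attempts,
                    const std::function<bool(const SearchAttempt&)>& send) {
  for (std::size_t i = 0; i < attempts.size(); ++i) {
    if (send(attempts[i])) return static_cast<int>(i);
  }
  return -1;
}
```

A caller that has already seen X-NETSCAPE via `HasCapability` could expect the UTF-8 attempt to be accepted; the loop still covers servers that reject it.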
Open issue: Should we use the global default or the system charset
as the basis for the source charset? The global default is
more flexible in that we can input in different charsets
if proper keyboards or input methods are available as we
change the global default.
Comment 18•26 years ago
qa assigned shouldn't be gbush. Should be someone in msanz's group.
Comment 19•26 years ago (Assignee)
There are two issues.
First, the pref mailnews.force_ascii_search is set to true.
Second, we need to convert the search string to the mail charset,
which is JIS in the case of Japanese. We are currently using the folder csid, which
is ShiftJIS or EUC.
Here is a change I applied to my local tree.
Index: search.cpp
===================================================================
RCS file: /m/src/ns/lib/libmsg/search.cpp,v
retrieving revision 1.112.4.2.2.42
diff -c -r1.112.4.2.2.42 search.cpp
*** search.cpp 1998/10/01 04:24:55 1.112.4.2.2.42
--- search.cpp 1998/11/10 18:53:45
***************
*** 2182,2188 ****
--- 2182,2192 ----
// Ask the newsgroup/folder for its csid.
if (m_scope->m_folder)
{
    dst_csid = m_scope->m_folder->GetFolderCSID() & ~CS_AUTO;
    dst_csid = INTL_DefaultMailCharSetID(dst_csid);
}
}
// default means that our best guess is to get the default window char set ID
Comment 20•26 years ago
This sounds like a lot of work, so I think we shouldn't commit to doing this
for 4.51, unless a customer escalation comes in which forces us to do it.
Clearing TFV. Please see me before setting the TFV.
BTW, I think Naoki's proposed change above is partial, at best, and defeats the
per-folder CSID that we allow the user to set.
Comment 21•26 years ago (Reporter)
Why can it sound like a lot of work? Naoki shows everything to fix.
What is wrong with partial solution? Any serious side effect?
Although I don't mind what TFV it's got, I do care if customers in Japan
find all other IMAP clients work with Messaging Server 4.0, but
only Netscape client (except Messenger Express 4.1) doesn't with
Netscape's own IMAP server.
I've waited almost 10 month. And, seems like I have to keep
waiting more. Am I expecting too much?
Comment 22•26 years ago (Assignee)
>and defeats the per-folder CSID that we allow the user to set.
That has been true anyway, as we restrict to ASCII only. The other issue is that
we only support a single charset inside the search dialog. A more complicated
issue is the folder hierarchy, which may have mixed charsets.
So those issues need to be solved in the future. But I am not sure we should
support only ASCII until we solve those issues.
Comment 23•26 years ago
> Why can it sound like a lot of work?
Because none of the other searching code takes more than one attempt at a search
based on the results of previous attempts.
> Naoki shows everything to fix.
That is absolutely not true. Naoki shows how to convert to the mail server's
charset only. That does not implement the algorithm Kat showed in his 10/29/98
comments.
> I've waited almost 10 month. And, seems like I have to keep waiting more.
> Am I expecting too much?
As I said above, the question of when we add this feature is determined by
customer escalations. There are lots of other features that people have wanted
for longer than 10 months that we're not doing in 4.51.
Comment 24•26 years ago
After discussing various pros and cons, we have decided to
open a new bug for fulfilling a minimum IMAP search
requirement for the Japanese market. A new bug does not
ask for server-client negotiation, and should be handled by
the escalation team.
The new bug is: 334536.
Comment 25•26 years ago
TFV 5.0
Comment 26•26 years ago
I (or someone else) will be moving enhancements, etc, bugs targeted for 5.0 to
bugzilla in the near future.
------- Additional Comments From paulmac May-04-1999 17:44 -------
Okay, time to close out old bugsplat bugs - Please move to bugzilla if this
one is still relevant or mark won't fix, please.
------- Additional Comments From momoi May-04-1999 17:49 -------
Well, this is still a valid bug.
Let's move to 5.0 and send it to the Mail/News team.
Updated•26 years ago
Target Milestone: M9
Updated•26 years ago
Target Milestone: M9 → M13
Comment 27•26 years ago
search is moving out.
Comment 28•26 years ago
Search won't be implemented until after Beta 1, so this bug does not need to be
fixed until after Beta 1
Updated•26 years ago
Assignee: phil → mscott
Status: REOPENED → NEW
Target Milestone: M13 → M14
Comment 29•26 years ago
mscott owns the search backend, so reassigning to him for M14. Searching is not
a B1 feature.
Updated•25 years ago
Target Milestone: M14 → M16
Comment 31•25 years ago
Based on Beta2 Criteria http://client/seamonkey/prd/beta2criteria.html.
This is a beta2 P1 bug; should we add the beta2 keyword to this bug?
Comment 32•25 years ago
Karen, the beta2 doc says we need to implement a search back end, which is a
separate bug. We need the search backend before we can start fixing bugs like
this which have been around since 4.5. =(
I don't see any mention of this bug in the beta2 docs so I'm not sure what you
were looking at or maybe you were thinking about the comment to implement search
for beta2?
Comment 33•25 years ago
I suck; I was only looking under mail, not under mail i18n, in the beta2 docs.
Moving back to a beta2 milestone. Thanks for catching my mistake, Karen!
I18N, are you guys sure this is a beta2 stopper?
Target Milestone: M18 → M17
Comment 34•25 years ago
4.x didn't do this - I can't believe it would be a beta stopper for 6.0, and we
could ship with it as well - we always have before.
Comment 35•25 years ago
From Beta2 Criteria http://client/seamonkey/prd/beta2criteria.html.
1) Scroll down to see the Features
2) Select I18N Features.
3) Select Mail I18N
4) Search for Mail/News Tasks - IMAP I18N - IMAP search 5933 - P1
P.S. I don't know what I18N means. Does anybody know?
Comment 36•25 years ago
I18N = Internationalization. I believe that the i18n group says it's a beta
stopper. I just don't think we're going to have time to do it.
Comment 37•25 years ago
OK. I am just checking & trying to clarify that.
Then the document should be modified!!
Comment 38•25 years ago (Assignee)
This bug was transferred from 4.x bug system.
What we need for beta2 is for i18n IMAP search to work. It is working in 4.x.
In 4.x, if the ASCII search fails, it falls back to another query using the
folder charset.
But for Mozilla, it is easier and better to do a UTF-8 query since we have the
query string in Unicode.
Comment 39•25 years ago
This is an IMAP spec.
We made some very hard choices to ship 4.5 and this was one of the features
that was cut at the very end.
The mail server guys have been very adamant that the client needs to support
this and were very disappointed that it fell off the 4.5 list at the end of
that development cycle.
taka and jgmyers can provide more data on what will break for who without this
long awaited feature...
Comment 40•25 years ago
I'd be surprised if we get 80% of the search functionality that was in 4.5 into
6.0 - getting > 100% would be a miracle. If you hadn't noticed, we haven't even
started search yet!
Comment 41•25 years ago (Assignee)
Putting beta2 for i18n beta2 criteria items. Contact bobj for question.
Keywords: beta2
Comment 42•25 years ago
> This is an IMAP spec.
I don't see this mentioned in RFC 2060 or 2683. Please give the spec reference
which supports your claim.
Comment 44•25 years ago (Assignee)
As the bug is old and the original comment is not consistent with what we need
for beta2, I am rewriting the i18n requirement for beta2 (which is the same
level of support as the current 4.x client). I also changed the summary.
For beta2, we need US-ASCII search and charset-specified search (i18n search).
Here is how we can do it:
* Apply a 7-bit check to the search string. Assuming the search string is Unicode
(PRUnichar* or UTF-8), we can check that every character is < 128.
* If the search string is 7-bit, then do a US-ASCII search (search with no
charset specified).
* If the search string is 8-bit, then get the folder charset, convert the Unicode
string to the folder charset, and specify the charset in the search command.
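A minimal sketch of the beta2 rule above, assuming the search string is held as UTF-16 (`std::u16string` stands in for `PRUnichar*` here). `SearchCommandPrefix` is a hypothetical helper that only produces the start of the command; converting the key to the folder charset and appending it is left to the caller.

```cpp
#include <string>

// The 7-bit check: every UTF-16 code unit must be below 128.
bool IsSevenBit(const std::u16string& key) {
  for (char16_t unit : key) {
    if (unit >= 128) return false;
  }
  return true;
}

// Hypothetical helper: start of the SEARCH command for one attribute
// (e.g. "SUBJECT"). A 7-bit key gets no CHARSET argument; anything else
// names the folder charset so the server knows how to read the key.
std::string SearchCommandPrefix(const std::u16string& key,
                                const std::string& attribute,
                                const std::string& folder_charset) {
  if (IsSevenBit(key)) {
    return "SEARCH " + attribute;
  }
  return "SEARCH CHARSET " + folder_charset + " " + attribute;
}
```

For example, a key of "mail" with attribute SUBJECT yields `SEARCH SUBJECT`, while a Kanji key in an ISO-2022-JP folder yields `SEARCH CHARSET ISO-2022-JP SUBJECT`.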
Summary: IMAP4 search doesn't retry if first attempt fails → International support for IMAP4 search
Comment 46•25 years ago
ftang, why did you clear nsbeta2-..can you state your case?
Whiteboard: [NEED INFO]
Comment 47•25 years ago
Since search has been an approved feature exception, this goes hand in hand with
that. It basically says make our IMAP search I18N friendly when we implement it =).
Comment 48•25 years ago
On exception list for PR2, removing 5/16...giving [nsbeta2+]Exception Feature
status.
Whiteboard: [NEED INFO] → [nsbeta2+]Exception Feature
Comment 49•25 years ago
It's my understanding that the mail team cut search today.
Comment 50•25 years ago
so, like the last bug, I did a bunch of i18n work yesterday.
And a reality check from everyone: This bug is over 2 years old now, a carryover
from 4.5.. the general i18n-ness of search is already covered in bug 11659..
kinda seems like this should just be a dupe.
if however this bug is referring to the algorithm described at the top of this
file, I believe it may never have been implemented in 4.x.. and if that's the
case I'm not sure why this would be nsbeta2+
in any case, I think this should either go to bienvenu or myself to lighten
scott's load.
Comment 51•25 years ago
So after your i18n fixes, are we now close to parity
with 4.5 and later? The spec there was described in
nhotta@netscape.com 2000-05-01 16:00 comment above.
That should be the minimum -- it has been implemented before
and current users of Communicator will expect as much.
Comment 52•25 years ago
I _think_ so... we won't know for certain until we have a UI.
I haven't seen the equivalent of the algorithm described at the top of this
bug...it might be there though
Comment 53•25 years ago
The algorithm which retries with a different character set if no hits are found
was not implemented in 4.x. Since that's what this bug was about originally, I'm
guessing that we should separate that issue (which we're not addressing for
seamonkey) from the issue of 4.x parity WRT i18n searching (which we should
address for seamonkey)
Comment 54•25 years ago (Assignee)
>I _think_ so... we won't know for certain until we have a UI.
Do we have a bug for that? As soon as that is resolved iqa can test i18n search.
Comment 55•25 years ago
ok, does anyone object to me marking this a dupe of 11659 (which has been marked
fixed) then? bienvenu has apparently got IMAP search working, and I have
supposedly made the whole search backend i18n friendly...
*** This bug has been marked as a duplicate of 11659 ***
Status: NEW → RESOLVED
Closed: 25 years ago
Resolution: --- → DUPLICATE
Comment 56•25 years ago
nhotta, see bug 33101 for the search UI frontend bug.
I just added you to the CC
Comment 57•25 years ago
I object, actually. Alecf, this bug refers to a specific algorithm for IMAP4
searching that escalation engineering implemented in 4.6. This bug is to track this
when we implement search for IMAP.
It's separate from the random i18n filter and search bug you marked it a dup of.
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
Comment 58•25 years ago
I don't really see how i18n search can be done, despite what Alec has done. My
understanding was that the i18n group had to provide us with API's that existed
in 4.5 but no longer exist in 6.0 in order for i18n search to work. Of course,
I've been out of it for a long time, but that was my understanding.
Comment 59•25 years ago (Assignee)
The last time, we talked about the NNTP search API and we dropped that from beta2.
For IMAP, I think the things I mentioned on 2000-05-01 16:00 are available in 6.0
(e.g. getting a folder charset, conversion from Unicode to a folder
charset, etc.).
Comment 60•25 years ago
is the latter also true for local? Convert the headers to unicode using the
charset and do a unicode comparison with the utf8->unicode converted search
string? What about message bodies? We can't really convert the whole message
body to unicode in memory, can we?
Comment 61•25 years ago
I believe that some of the search code converts the unicode search term to the
folder's charset, then performs the search with this converted string.
Comment 62•25 years ago
alecf's description is how local searching is supposed to work (and did in 4.x).
Comment 63•25 years ago (Assignee)
For local search, header search requires MIME decoding. Since the MIME decoder
returns Unicode, that can be compared with the search term.
For local body search, I believe we converted the body (not the search term).
Here is the 4.x code; I believe DO_I18N was defined in 4.x (otherwise Japanese
search wouldn't work).
http://lxr.mozilla.org/mozilla/source/mailnews/base/search/src/nsMsgSearchTerm.cpp#687
#ifdef DO_I18N
    // In here we do I18N conversion if we get the converter
    char *newBody = nsnull;
    newBody = (char *)INTL_CallCharCodeConverter(conv, (unsigned char *) buf, (int32) PL_strlen(buf));
    if (newBody && (newBody != buf))
    {
        // CharCodeConverter return the char* to the orginal string
        // we don't want to free body in that case
        compare = newBody;
    }
#endif
Comment 64•25 years ago
DO_I18N is not in the 4.5 code; it was added to 6.0 so the code would compile
because things like INTL_CreateCharCodeConverter don't exist in 6.0 - I think
this was one area where we need a 6.0 equivalent way of doing this.
Comment 65•25 years ago
you know, it's actually going to be EASIER for me to convert the user-entered
value to the folder's charset and do the body search that way. Anyone object if
I do it that way? It'll be faster too.
Comment 66•25 years ago
I take that back, it's not as simple as I had hoped.. converting the body is the
easy way right now.
Comment 67•25 years ago (Assignee)
Reposting my comment in 2000-05-01 16:00 which contains I18N requirement for
nsbeta2.
> As the bug is old and the original comment is not consistent with what we need
> for beta2, I am rewriting the i18n requirement for beta2 (which is the same
> level of support as the current 4.x client). I also changed the summary.
> For beta2, we need US-ASCII search and charset specified search (i18n search).
>
> Here is how we can do,
> * Apply 7 bit check against search string. Assuming the search string is unicode
> (PRUnichar* or UTF-8), we can check < 128 against the search string.
> * If the search string is 7bit then do US-ASCII search (search with no
> charset specified).
> * If the search string is 8bit then get the folder charset, convert the unicode
> string to the folder charset and specify the charset in the search command.
Comment 68•25 years ago (Assignee)
Adding jaimejr@netscape.com and putterman@netscape.com to cc.
Comment 69•25 years ago
Added myself to Cc.
Comment 70•25 years ago (Assignee)
Taka and I started to look at the code. The search criteria string is UTF-8 and
there is also a function to get the folder charset. The 7-bit check can be done
easily against a UTF-8 string. Also, we can convert the string from UTF-8 to the
folder charset.
Taka pointed out that we can use a literal string instead of a quoted string (which
needs escaping for some charsets, e.g. ISO-2022-JP).
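To illustrate the literal-versus-quoted point: ISO-2022-JP is a 7-bit encoding, so its two-byte characters can contain the bytes 0x22 (") or 0x5C (\), which would have to be backslash-escaped inside an IMAP quoted string. A literal carries an octet count followed by the raw bytes and needs no escaping. The sketch below uses hypothetical helper names (`ToQuoted`, `ToLiteral`) and only formats the argument; with a synchronizing literal the client would also wait for the server's "+" continuation before sending the bytes after the CRLF.

```cpp
#include <string>

// Quoted form: wrap in double quotes and backslash-escape '"' and '\'.
std::string ToQuoted(const std::string& bytes) {
  std::string out = "\"";
  for (char c : bytes) {
    if (c == '"' || c == '\\') out += '\\';
    out += c;
  }
  out += '"';
  return out;
}

// Literal form: "{<octet count>}" CRLF, then the bytes exactly as they are.
std::string ToLiteral(const std::string& bytes) {
  return "{" + std::to_string(bytes.size()) + "}\r\n" + bytes;
}
```

For instance, the ideographic comma (JIS X 0208 code 0x2122) encodes in ISO-2022-JP as ESC $ B 0x21 0x22 ESC ( B, which contains a double-quote byte; `ToLiteral` sends those eight octets untouched after `{8}`.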
Comment 71•25 years ago (Assignee)
The patch (hooked up charset conversion) was reviewed. I will probably check in
tomorrow.
Comment 72•25 years ago (Assignee)
Checked in, testable once the UI is functional again.
Status: REOPENED → RESOLVED
Closed: 25 years ago → 25 years ago
Resolution: --- → FIXED
Comment 73•25 years ago
** Checked with 7/10/2000 Win32 build **
OK, we are finally able to check on this because I can now
see attribute names.
Here's what works:
1. With the default view charset set to ISO-2022-JP, a single condition
or "OR" with more than one attribute works OK to find relevant messages
when we input Japanese search keys.
What does not work:
1. Any search after the first one using a Japanese word produces no
change even if you change an attribute value to another Japanese
word. Even if you close the Search window and re-open it,
it does not seem possible to do any search. If you use ASCII values,
you can do more than 1 search at a time successfully. This problem
seems to be due to the use of non-ASCII data as search keys.
2. Any change in attribute category, e.g. from Subject to
Sender, or from the "OR" conjunction to the "AND" conjunction, forces
the server to send an error message saying that
"Required argument was missing."
This problem happens regardless of the charset of the attribute values
used.
There are other problems but I have not sorted them out yet.
For item 2, I'll look for an existing bug. But for Item 1, I need to
re-open this bug.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 74•25 years ago (Assignee)
Reassigning to me.
Can we file a separate bug for the first problem since the international search
itself has been enabled?
Assignee: mscott → nhotta
Status: REOPENED → NEW
Comment 75•25 years ago
Can we get a bit more analysis before deciding to file another
bug? If we know for sure that it has nothing to do with the
way non-ASCII was implemented, then let's file a new bug.
Problem #1 makes Search in Japanese very difficult since
users often try one key and then another in case the first didn't work.
Unless I am mistaken, the user will have to reboot Mozilla to try
the next search. That is really bad.
Comment 76•25 years ago
I've looked at Problem #1 a bit further and it seems that
the problem is a bit more complex than I had described above.
It seems that if you pick certain Japanese words, you can
do more than 1 search at a time. When you use some other word,
it does not work until you use some other data that do not have
this problem. One example of a problem word is "Ni-hon" (Japan
in Kanji). I have not been able to do any search with it.
Comment 77•25 years ago (Assignee)
I used win32 build ID 2000071008 on WinNT 4 Japanese and I can search Japanese
strings more than once.
First, I searched "mail" in Japanese then got some results.
Then I searched "homepage" in Japanese then got additional results and they were
appended to the search result.
And I searched "welcome" in Japanese then got additional results and they were
appended to the search result.
So I cannot reproduce the problem. There may be a condition to reproduce this.
Anyway, I prefer the problem to be filed separately.
Comment 78•25 years ago
Would you try "nihon" in Kanji and see if that works?
Comment 79•25 years ago
There seems to be another problem in search string
formation to send to the server. See the SCOPUS bug
we dealt with for Communicator, Bug ID 343598. The example
string described there, Hiragana "a", causes a server
error in Mozilla.
Comment 80•25 years ago
Please file a separate bug instead of reopening this feature bug. Individual bugs will
help us track different cases.
Comment 81•25 years ago
OK. I'll verify that the basic Intl IMAP functionality is
working.
There are some misses and they will be filed as separate bugs.
Status: NEW → RESOLVED
Closed: 25 years ago → 25 years ago
Resolution: --- → FIXED
Comment 82•25 years ago
** Checked on 7/10/2000 Win32, Mac, and Linux builds **
On the above builds, the basic non-ASCII search function is now
working as long as the search keys match the default view charset
set in the Preferences dialog.
Marking it verified as fixed.
We will file separate bugs for problems with this new feature.
Status: RESOLVED → VERIFIED
Updated•20 years ago
Product: MailNews → Core
Updated•16 years ago
Product: Core → MailNews Core