Closed Bug 166127 Opened 22 years ago Closed 20 years ago

Search string in Japanese doesn't match exactly

Categories

(Core :: Internationalization, defect)

x86
All
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: kazhik, Assigned: jshin1987)

References

Details

(Keywords: intl)

Attachments

(1 file, 3 obsolete files)

Search string in Japanese doesn't match exactly if there is a linebreak before kinsoku characters. Original report in Bugzilla-jp: http://bugzilla.mozilla.gr.jp/show_bug.cgi?id=2565 Testcase: http://bugzilla.mozilla.gr.jp/showattachment.cgi?attach_id=1083 Patch: http://bugzilla.mozilla.gr.jp/showattachment.cgi?attach_id=1105 Related: bug 135323
Shanjian, can you review the patch by saito?
I am a little confused. Could you explain the problem and how your patch fix it in English? thanks.
Keywords: intl
QA Contact: ruixu → kasumi
Steps to reproduce: (1) Open <http://bugzilla.mozilla.gr.jp/showattachment.cgi?attach_id=1083>. (2) Open "Find in This Page" dialog and enter "hairetsu" in Japanese. (3) Push Find button twice. "retsu no" is highlighted instead of "hairetsu" in the first paragraph. If the paragraph doesn't have linebreaks in it, String search works fine. Linebreaks are deleted in rendering, but aren't deleted for string search. I missed the saito's second patch. http://bugzilla.mozilla.gr.jp/showattachment.cgi?attach_id=1104 His patch deletes linebreaks for string search.
It supplements. The kinsoku means Japanese hyphenation, however, a hyphen character does not generate. In a peculiar to Japanese, there is usually a character which is not placed at the head of a line. In the layout of a character, it means moving to the head of a line without the Japanese character of several characters taking a break. layout/html/base/src/nsTextFrame.cpp - if (1 == wordLen && contentLen == 2 && IS_CJ_CHAR(*bp)) { + if ((contentLen - wordLen) == 1 && IS_CJ_CHAR(*bp)) { The above and a changed part mean that wordLen may become two or more.
It supplements also about the second patch. http://bugzilla.mozilla.gr.jp/showattachment.cgi?attach_id=1104 When searching a string, in order to search to the original text containing the new-line code between Kanji characters, the string cannot be searched. However, it will not approve of this patch, since it is copying using new operator.
A patch attachment 113902 [details] [diff] [review] of Bug 156369 has an effect on this ploblem.
I think both roy and me are off mozilla for more than 2 years. If these bugs are still here now, I think the real stauts is 'won't fix'. If you want to reopen it, please find a new owner for it first.
Status: NEW → RESOLVED
Closed: 20 years ago
Resolution: --- → WONTFIX
Mass Reassign Please excuse the spam
Assignee: tetsuroy → nobody
Mass Re-opening Bugs Frank Tang Closed on Wensday March 02 for no reason, all the spam is his fault feel free to tar and feather him
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Reassigning Franks old bugs to Jungshik Shin for triage - Sorry for spam
Assignee: nobody → jshin1987
Status: REOPENED → NEW
Attached patch patch (obsolete) (deleted) — Splinter Review
Attachment #177093 - Flags: review?(jshin1987)
jshin, I asked you to review. Please refer to comment #4 and comment #5 for the reasons of the changes, that is, taking Japanese hyphenation into the changes of Bug 135323 and ignoring one return code between kanjis for string-search.
Comment on attachment 177093 [details] [diff] [review] patch Sorry, I found a mistake.
Attachment #177093 - Attachment is obsolete: true
Attachment #177093 - Flags: review?(jshin1987)
Attached patch patch (obsolete) (deleted) — Splinter Review
Attached patch patch (obsolete) (deleted) — Splinter Review
Attachment #177107 - Attachment is obsolete: true
Attachment #177109 - Flags: review?(jshin1987)
Attachment #177109 - Flags: superreview?(jst)
Attachment #177109 - Flags: review?(jshin1987)
Attachment #177109 - Flags: review?(akkzilla)
The nsFind.cpp test looks safe. Can someone please run through the tests at http://www.mozilla.org/quality/browser/front-end/testcases/xpapps-gui/find-on-page/find-in-page.html and make sure it doesn't introduce any regressions? (I'd check it myself, but my devel machine is down with a bad motherboard and I'm leaving town tomorrow.) But it looks like this shouldn't affect anything but CJ characters, so I don't expect regressions. r=akkana on that part if someone runs through the tests. Unfortunately I'm not familiar with the nsTextFrame.cpp code being modified, and it'll be weeks before I could try to figure it out. You should probably get someone else to review that change. This change particularly made me wonder: - i = contentLen; + i = 2; If you can't find anyone else to review that file, I can look at it in a few weeks if you explain what the changes are doing, particularly looping back from 2 instead of from contentLen. The other change block after that makes more sense, but I'm not clear why you left the first while (--i >= 0) in place while introducing a second one (or is that just diff being wonky?)
Comment on attachment 177109 [details] [diff] [review] patch Looks right to me, I'll r=, but I think dbaron should have a look at at least the nsTextFrame.cpp changes before this goes in. Requesting his sr...
Attachment #177109 - Flags: superreview?(jst)
Attachment #177109 - Flags: superreview?(dbaron)
Attachment #177109 - Flags: review?(akkzilla)
Attachment #177109 - Flags: review+
I've looked at the nsTextFrame.cpp change a whole bunch of times, and I still have no idea what it's changing, or, for that matter, what the code that it's modifying does. Care to explain?
I think that the mapped index of the contents needs to match with the index of the displaied texts. In a following example, the mapped index of |sp| is equal to the mapped index of the next kanji character if this patch is applied. contents kk sp kk kk kk (|kk| means a kanji, |sp| means a white space) displaied texts kk kk kk kk mapped index of the contents kk sp kk kk kk 0 1 1 2 3 |kk| is obtained by the first call of GetNextWord and |kkkkkk| is obtained by the next call of it. |sp| is skipped by the second call. |kkkkkk| means that these characters are continuing and it cannot divide with reason of Japanese hyphenation.
(In reply to comment #19) > contents kk sp kk kk kk (|kk| means a kanji, |sp| means a white space) > displaied texts kk kk kk kk > > mapped index of the contents > kk sp kk kk kk > 0 1 1 2 3 Without the patch, 'kk sp kk sp(or other word-delimeters) .....' can be handled, but 'kk sp kk kk kk ...' cannot be handled. Am I right?
> Without the patch, 'kk sp kk sp(or other word-delimeters) .....' can be handled, It can be handled. > but 'kk sp kk kk kk ...' cannot be handled. Am I right? When |kkkkkk| can not be divided with reason of Japanese hyphenation, it can not be handled.
> contents kk sp kk kk kk (|kk| means a kanji, |sp| means a white space) I correct a mistake.&#12288;|sp| means a '\n'. The skipped character is one '\n' between CJ characters but not white space.
Depends on: 289857
Comment on attachment 177109 [details] [diff] [review] patch Reported problem will be solved by the fix of Bug 289857. However, the problem which cannot search CJ characters containing '\n' still remains. I will post a patch except for changes to nsTextFrame::PrepareUnicodeText.
Attachment #177109 - Flags: superreview?(dbaron)
Attached patch patch (deleted) — Splinter Review
patch to skip a '\n' between CJ characters in order to search only displayed CJ characters.
Attachment #177109 - Attachment is obsolete: true
Attachment #181498 - Flags: superreview?(jst)
Attachment #181498 - Flags: review?(jst)
Comment on attachment 181498 [details] [diff] [review] patch r+sr=jst
Attachment #181498 - Flags: superreview?(jst)
Attachment #181498 - Flags: superreview+
Attachment #181498 - Flags: review?(jst)
Attachment #181498 - Flags: review+
Comment on attachment 181498 [details] [diff] [review] patch In searching text, this patch affect only a '\n' with CJ characters, I think that this patch is low-risk.
Attachment #181498 - Flags: approval1.8b2?
Attachment #181498 - Flags: approval1.8b2? → approval1.8b2+
Johnny, this patch was approved for 1.8b2. Could you check in to the trunk?
jst, could you check in to the trunk and mark FIXED?
landed on the trunk Checking in embedding/components/find/src/nsFind.cpp; /cvsroot/mozilla/embedding/components/find/src/nsFind.cpp,v <-- nsFind.cpp new revision: 1.32; previous revision: 1.31
Status: NEW → RESOLVED
Closed: 20 years ago20 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: