Open
Bug 1302492
Opened 8 years ago
Updated 2 years ago
The new Find Whole Word / Find Exact String option does not find Chinese words
Categories
(Toolkit :: Find Toolbar, defect)
NEW
People
(Reporter: jonathan_walden, Unassigned)
Details
(Keywords: intl)
User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0
Build ID: 20160725105554
Steps to reproduce:
In the nightly beta, use find in page with the new whole word option to search for the single-character word 在 on the page https://zh.wikipedia.org/wiki/Wikipedia:%E9%A6%96%E9%A1%B5
Actual results:
Only one instance is found: " 在1960", the one in which the word is surrounded by non-CJK characters.
Expected results:
Many matches should show up, essentially the same set as when whole word matching is not used.
If you try a similar search in other browsers that support the whole word option, they do find Chinese words even with the whole word option selected.
The current code appears to rely on a word breaker that places a boundary wherever the character class changes. I suspect it does not work well for any language that does not separate words with spaces, such as Thai, Chinese, or Japanese. There is more information about languages that do not use spaces here: https://r12a.github.io/scripts/tutorial/part5
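The suspected behavior can be illustrated with a small Python sketch. This is not the Firefox implementation; the character classes and the function names `char_class`, `is_boundary`, and `find_whole_word` are hypothetical and only model "a boundary is a change in character class". Under that assumption, 在 is matched when it sits next to a space or a digit, but not when it is surrounded by other CJK characters.

```python
def char_class(ch: str) -> str:
    """Very coarse, hypothetical character classes for illustration only."""
    if "\u4e00" <= ch <= "\u9fff":   # CJK Unified Ideographs block
        return "cjk"
    if ch.isalnum():                 # other letters and digits
        return "alnum"
    return "break"                   # spaces, punctuation, everything else


def is_boundary(text: str, i: int) -> bool:
    """Treat index i as a word boundary if the classes on either side differ."""
    if i <= 0 or i >= len(text):
        return True                  # start and end of text are boundaries
    left, right = char_class(text[i - 1]), char_class(text[i])
    return left == "break" or right == "break" or left != right


def find_whole_word(text: str, needle: str) -> list[int]:
    """Return start offsets of needle that have a boundary on both sides."""
    hits = []
    start = text.find(needle)
    while start != -1:
        end = start + len(needle)
        if is_boundary(text, start) and is_boundary(text, end):
            hits.append(start)
        start = text.find(needle, start + 1)
    return hits


sample = "他在北京 在1960年"
print(sample.count("在"))            # plain substring search: 2 occurrences
print(find_whole_word(sample, "在")) # class-change model: [5], only " 在1960"
```

A segmenter that finds word boundaries inside runs of CJK text (for example, a dictionary-based one) would be needed for the first occurrence to count as a whole word.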
Bug 1282759 describes the option as one "which only matches strings surrounded by word-breaking characters, like spaces or punctuation marks in latin-derived languages."
Updated 2 years ago
Severity: normal → S3