Closed Bug 515560 Opened 15 years ago Closed 15 years ago

Decide on CJK plan for 3.0

Categories

(Thunderbird :: Search, defect)

Not set
normal

Tracking

(Not tracked)

RESOLVED WORKSFORME
Thunderbird 3.0rc1

People

(Reporter: wsmwk, Assigned: asuth)

References

Details

(Whiteboard: [no l10n impact][no in-product strings, relnote strings])

Disable gloda search for CJK in 3.0 builds (follow-up to an IRC conversation). CJK search returns no results, which will be a bad user experience for users of CJK locale builds, so those builds should offer quick search only.
Ref: Bug 472764 - gloda full-text search always uses SQLite's porter stemmer without regard to effective locale, etc.
We should relnote this if we're not disabling it before b4.
Flags: blocking-thunderbird3?
Keywords: relnote
(In reply to comment #0)
> CJK search returns no results, which will be bad user experience for users of
> CJK locale builds. So quick search only.

Again, this is not specific to CJK locale builds: searching for CJK text in other locales, such as en-US, also returns zero results from gloda.
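To illustrate why such searches come back empty in any locale, here is a minimal sketch. It assumes a simplified stand-in tokenizer that splits only on whitespace and ASCII punctuation; it is not SQLite's porter tokenizer, and the names `tokenize` and `matches` and the sample sentences are hypothetical. Because CJK text is usually written without spaces between words, a whole sentence ends up as a single token and a query for a word inside it finds nothing.

```python
import re

# Simplified stand-in for a Western-oriented tokenizer (an assumption for
# illustration, not gloda's or SQLite's actual code): split on whitespace
# and ASCII punctuation, then lowercase.
_SEPARATORS = re.compile(r"[\s!-/:-@\[-`{-~]+")


def tokenize(text):
    """Return the lowercased tokens of text."""
    return [t.lower() for t in _SEPARATORS.split(text) if t]


def matches(query, document):
    """True if every query token appears as a whole token in the document."""
    doc_tokens = set(tokenize(document))
    query_tokens = tokenize(query)
    return bool(query_tokens) and all(t in doc_tokens for t in query_tokens)


english = "Please review the search patch"
japanese = "検索のパッチを確認してください"

print(matches("patch", english))    # the word is its own token, so it is found
print(matches("パッチ", japanese))   # the sentence is one long token, so it is not
```

The result depends only on the message text, not on the build's locale, which is the point made in the comment above.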
Flags: blocking-thunderbird3? → blocking-thunderbird3+
Whiteboard: [has l10n impact]
Target Milestone: --- → Thunderbird 3.0rc1
Great news! Bug 472764 looks like it has a promising patch, and if that's fixed in the TB3 timeframe, this bug can be resolved invalid.
Depends on: 472
Depends on: 472764
No longer depends on: 472
Component: Build Config → Search
QA Contact: build-config → search
I'm morphing this bug into a decision bug. My understanding of the state of affairs:

1) Currently CJK tokenizing doesn't work, so gloda searches over messages in those languages pretty much fail. As a proxy for "detecting emails written in CJK languages", we'd disable gloda for the relevant locales. We know that doesn't cover searches for CJK messages in other localized builds, but it's the best we can do quickly.

2) There is a patch in bug 472764 which looks good, but it hasn't been reviewed yet and likely needs more tests.

3) If the aforementioned patch doesn't make it into the tree soon enough (and by that I mean well ahead of the code freeze), then we will ship without CJK support and have to disable gloda on those builds as described in 1). If the patch does land well, then we don't need to do anything particular for those locales.

4) _As far as I know_, in either possible future, no in-product string changes are needed. The release notes would be different for those locales, but that's not subject to the same string freeze date. Based on that understanding, I'm changing the l10n label of this bug. If [has l10n] is meant to go beyond string changes, then we can revert that change.

5) Andrew is the one best qualified to assess that patch, hence this bug. But he's going on a previously scheduled vacation next week. I'm assigning the bug to him to reflect that.

6) If we don't have time to get the CJK patch into 3.0, we'll most likely get it into a subsequent 1.9.2-based release, which we're hoping is not far in the future.

7) Gary -- asuth mentioned that he would love some more test data to add to the test suite. I think he'd like data of this form: (sentence, word, is_word_in_sentence). If you could rustle up some examples in C, and maybe, through your contacts, J or K, that'd be good! UTF8 strings only.

8) Planning for the worst-case scenario, it'd be good to get a list of the locales that we think would be most affected and where we expect most users to have mostly CJK messages. Sipaq, can you get that list?
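As a rough sketch of the test data requested in 7), the triples could be written down as below. The `Case` type, the `failing_cases` helper, and the sample sentences are hypothetical illustrations, not entries from the actual test suite, and the plain substring check is only a placeholder for whatever the search index would really answer.

```python
from typing import Callable, List, NamedTuple


class Case(NamedTuple):
    sentence: str
    word: str
    is_word_in_sentence: bool


# Illustrative entries only; real test data would come from native speakers
# and would be stored as UTF-8.
CASES = [
    Case("我喜欢自然语言处理", "语言", True),     # Chinese
    Case("今天天气很好", "明天", False),           # Chinese, word absent
    Case("東京都に住んでいます", "東京", True),   # Japanese
    Case("나는 학교에 갑니다", "학교", True),      # Korean
]


def failing_cases(contains: Callable[[str, str], bool],
                  cases: List[Case] = CASES) -> List[Case]:
    """Return the cases where the matcher disagrees with the expected flag."""
    return [c for c in cases
            if contains(c.sentence, c.word) != c.is_word_in_sentence]


def substring_contains(sentence: str, word: str) -> bool:
    # Placeholder matcher; a real run would query the search index instead.
    return word in sentence


print(failing_cases(substring_contains))
```

Passing a different `contains` function (for example, one backed by a tokenizer under test) would list exactly the triples it gets wrong.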
Assignee: nobody → bugmail
Summary: disable gloda search for CJK for 3.0 builds → Decide on CJK plan for 3.0
Whiteboard: [has l10n impact] → [no l10n impact][no in-product strings, relnote strings]
(In reply to comment #4)
> 8) Planning for the worse case scenario, it'd be good to get a list of the
> locales that we think would be most affected and where we expect most users to
> have mostly CJK messages. Sipaq, can you get that list?

Sure. As CJK stands for _C_hinese, _J_apanese, _K_orean, the following supported locales would be affected: ja (and ja-JP-mac), ko, zh-CN, zh-TW.

Please note that zh-CN was not part of beta4, but will very likely be part of the final release.

Japanese is a tier1 locale. That basically means (if we follow the same tier system as Firefox does, which we outline in https://developer.mozilla.org/En/Thunderbird_Localization#Locale_Tiers) that Japanese is equivalent to en-US in importance, so a broken feature there blocks the release.

I'm not sure what the download numbers for ko, zh-CN and zh-TW are. As all these locales were available for TB2, Rafael might have some up-to-date download numbers/usage ratios for them, which could help us determine how important those locales really are for us. My guess: very important.
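For the fallback plan, the locale list above could drive a simple switch. This is a minimal sketch only: `CJK_LOCALES` and `gloda_search_enabled` are hypothetical names, not Thunderbird code or preferences.

```python
# Locales named in this comment as affected by the CJK tokenizing problem.
CJK_LOCALES = frozenset({"ja", "ja-JP-mac", "ko", "zh-CN", "zh-TW"})


def gloda_search_enabled(locale: str) -> bool:
    """Hypothetical check: turn gloda search off for CJK locale builds."""
    return locale not in CJK_LOCALES


for locale in ("en-US", "ja", "zh-TW"):
    print(locale, gloda_search_enabled(locale))
```

As noted earlier in this bug, a locale check is only a proxy: it does nothing for users of other locales who search CJK messages.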
Since bug 472764 has a reviewed patch that needs only trivial code changes, and I'm pretty confident about it, I'm declaring that our strategy is to land that patch and marking this bug resolved.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → WORKSFORME
Keywords: relnote