Closed
Bug 1119426
Opened 10 years ago
Closed 8 years ago
Add Xhosa dictionary/wordlist
Categories
(Firefox OS Graveyard :: Gaia::Keyboard, defect)
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: delphine, Assigned: kscanne)
References
Details
Attachments
(6 files)
Xhosa shipping on 2.0. Needs dictionary/wordlist.
Friedel, Kevin: can you help out on this as well?
thanks! :)
Reporter
Comment 1•10 years ago
[Blocking Requested - why for this release]:
As per our call with Bus Dev, Rel Man and l10n team this morning, it is confirmed that this is needed on 2.0 (and onwards). Thus nominating for 2.0 work
blocking-b2g: --- → 2.0?
Updated•10 years ago
blocking-b2g: 2.0? → 2.0+
Comment 2•10 years ago
This is a complex problem, just as with Zulu (bug 1113395) and Swahili (bug 1098502). I'm not sure there is a quick solution for 2.0.
Kevin do you think it is worthwhile to simply brute force it to try and get something minimally useful together? I think I'll have to rely on you for a good corpus. I can add some of what I have here, but it won't be enough on its own. I'm not sure how well it would work... maybe we can discuss outside the bug if you have time.
Assignee
Comment 3•10 years ago
(In reply to Friedel Wolff from comment #2)
> Kevin do you think it is worthwhile to simply brute force it to try and get
> something minimally useful together?
I don't think it's worth doing by brute force. I know you understand these issues well, but let me attach some numbers to this for the people working on FxOS who might not be familiar with Bantu languages.
Here's a quick experiment one can do for any language: take a sample corpus of, say, 250k words in the language and create an autocorrect dictionary from it by including every word you see. Then choose a disjoint test corpus of ~25k words and count the percentage of its words that the autocorrect dictionary recognizes. For web text in English, Spanish, Italian, French, etc., this number hovers around 96-97%. Turkish, Basque, Finnish, Hungarian, and Georgian are between 87% and 90%, which is bordering on unusable (more than 1 out of every 10 words unrecognized). When I do this for Zulu I get 82%, and for Xhosa 81%, which is more like 1 out of every 5 words unrecognized.
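For readers who want to try the experiment described above, a minimal Python sketch might look like the following. The function names (`tokenize`, `build_wordlist`, `coverage`) and the tokenization rule are assumptions made for illustration; the comment does not say how words were split or normalized, and this is not part of any FxOS tooling.

```python
import re

# Assumption: a "word" is a run of Unicode letters, optionally joined by
# apostrophes or hyphens, compared case-insensitively.
WORD_RE = re.compile(r"[^\W\d_]+(?:['’\-][^\W\d_]+)*")


def tokenize(text: str) -> list[str]:
    """Split raw text into lowercase word tokens."""
    return [m.group(0).lower() for m in WORD_RE.finditer(text)]


def build_wordlist(training_text: str) -> set[str]:
    """Naive dictionary: every distinct word seen in the training corpus."""
    return set(tokenize(training_text))


def coverage(wordlist: set[str], test_text: str) -> float:
    """Fraction of tokens in the test corpus that appear in the wordlist."""
    tokens = tokenize(test_text)
    if not tokens:
        raise ValueError("test corpus contains no words")
    known = sum(1 for token in tokens if token in wordlist)
    return known / len(tokens)


if __name__ == "__main__":
    # Toy data only; a real run would use a ~250k-word training corpus
    # and a disjoint ~25k-word test corpus, as described in the comment.
    wordlist = build_wordlist("the cat sat on the mat")
    print(f"{coverage(wordlist, 'the dog sat on the log'):.0%}")
```

Coverage here is counted per token rather than per distinct word, which matches the "1 out of every N words unrecognized" framing. Languages with heavy agglutination produce many word forms that never appear in the training corpus, which is why a plain wordlist scores lower for them.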
Comment 4•10 years ago
Hi Kevin, Friedel,
Thanks for your input. Then in your opinion, is it feasible to have something basic ready for 2.0?
David
Comment 5•10 years ago
I don't think we can do anything that can be beneficial for the user in v2.0 time frame, per comment 3. I recommend we remove the 2.0 requirement for this feature, and properly work on this in a later release.
Reporter
Comment 6•10 years ago
Hey Kevin and Friedel,
I know this is kind of last minute, but do you think we could have something ready by this upcoming Friday? We might also be able to find someone who can help you out as well if needed.
And if this date is really unrealistic, can you please take this work on so that it will be available in the future? Thanks!
Flags: needinfo?(kscanne)
Flags: needinfo?(friedel)
Assignee
Comment 7•10 years ago
See comment 3. In short, anything we do via the existing autocorrect engine is going to be virtually unusable for Xhosa and Zulu.
Flags: needinfo?(kscanne)
Flags: needinfo?(friedel)
Reporter
Comment 8•10 years ago
Is it possible to generate a dictionary/wordlist in the meantime, while we figure out the engine issue internally? (Sorry if I've missed something, this is really tricky for me ;) )
Comment 9•10 years ago
Dear Delphine,
I just discussed this with Wesly, and I think we need to re-justify whether this is really needed in 2.0 for Fire E according to the launch plan at https://mana.mozilla.org/wiki/display/PM/T2M. Wesly also mentioned he will talk with the partner to see whether they can create a Xhosa dictionary/wordlist themselves.
I agree we should try to have our own Xhosa dictionary/wordlist, so I am nominating this for 2.2.
Dear Howie,
Please help triage this bug for 2.2. Thanks!
blocking-b2g: 2.0+ → 2.2?
Flags: needinfo?(hochang)
Reporter
Comment 10•10 years ago
Josh: according to https://mana.mozilla.org/wiki/display/PM/Firefox+OS+Wave+Launch+Cross+Functional+View and our weekly discussion with Business development and PMs, Xhosa is committed to 2.0.
David Palomino mentioned he was supposed to hand this to partners last Friday (comment 6). I have to admit I don't understand why we're still going back and forth on this and can't define clearly the scope, after multiple conversations. If this has changed, then the mana has to be updated.
Flagging Karen Ward and David Palomino so they can confirm the scope of this and advise on how to go forward. thanks
Flags: needinfo?(kward)
Flags: needinfo?(dpalomino.bugzilla)
Comment 11•10 years ago
Hi Delphine,
There have been no changes in the schedule for the launch (apart from a couple of days as we're closing some details regarding preload of apps). We just needed to confirm that everything was ready from the l10n part. TCL will generate the build in one or two days.
And a BIG thanks to all of you for the effort in committing the South African languages in time to launch with the partners. This will certainly help the launch a lot (for product, marketing, etc.).
Thanks!
David
Flags: needinfo?(dpalomino.bugzilla)
Reporter
Comment 13•10 years ago
Hi David. Basically, this bug means that there will not be a dictionary/wordlist available for Xhosa in time for the launch. Is that OK? I just want to make sure we're on the same page. Thanks
Flags: needinfo?(dpalomino.bugzilla)
Comment 14•10 years ago
Hi Delphine,
I think we have no choice here. IMO it is not a blocking issue, but it is something that would be very nice to have in the future, so I agree to have this in 2.2. If there are also any plans regarding 2.1, I'll let you know.
Thanks!
David
Flags: needinfo?(dpalomino.bugzilla)
Comment 15•10 years ago
Per David's non-blocking comment #14, removing the 2.2? nom.
blocking-b2g: 2.2? → ---
Comment 16•10 years ago
(In reply to Stephany Wilkes from comment #15)
> Per David's non-blocking comment #14, removing the 2.2? nom.
Hi Stephany,
I meant in comment #14 that we cannot delay the 2.0 launch because we lack the wordlist for Xhosa, but we are definitely missing that functionality a lot. I think we need to include this in future releases, so I am restoring the 2.2? nom.
Cheers,
David
blocking-b2g: --- → 2.2?
Comment 17•10 years ago
Hi,
As commented in bug #1113395, we'd need this for 2.2 (even though there are no committed launches for 2.2 yet). Just copying the comment here.
South Africa is one of our tier 1 countries, so work there is expected to continue with 2.2.
The problem is that the timing managed by carriers and OEMs is different from ours, and by the time they decide to go with 2.2, it will probably be too late to include this in 2.2, or 2.2 may even be closed.
Please let me know if we need to include this info in mana to get the 2.2+ (I think it could add some confusion, so I'd prefer not to include it if it's not needed).
Cheers,
David
Comment 18•10 years ago
Triage: Not blocking; it's too late for a 2.2 feature. But please keep moving forward.
Comment 19•10 years ago
Kevin,
FYI
Reporter
Comment 20•10 years ago
Spoke offline with Josh, explained Howie's concern. This needs minimal engineering resources and carries minimal risk to land on 2.2. The patch will be done by community/contractors and just needs to land once it's there.
Renominating for 2.2. Thanks!
blocking-b2g: - → 2.2?
Comment 21•10 years ago
Thanks for the clarification on Comment 20, blocking as 2.2+
blocking-b2g: 2.2? → 2.2+
tracking-b2g: + → ---
Updated•10 years ago
Comment 22•10 years ago
Delphine, which patch are you referring to? I didn't see any mention of a patch, just repeated comments that this is currently an unsolved problem.
Flags: needinfo?(lebedel.delphine)
Updated•10 years ago
Assignee: nobody → ian.henderson
Comment 23•10 years ago
FF37 corpus text
Comment 24•10 years ago
FF37 corpus text
Comment 25•10 years ago
FF37 corpus text
Comment 26•10 years ago
I have added various xh corpus files. That is about as much as we are able to do.
Assignee: ian.henderson → kscanne
Reporter
Comment 27•10 years ago
(Friedel: I meant once the patch is there. I was talking quickly and on multiple bugs ;) )
Flags: needinfo?(lebedel.delphine)
Comment 28•10 years ago
Comment 29•10 years ago
Comment 30•10 years ago
This keyboard is confirmed complete: http://www.101languages.net/xhosa/keyboard/
Comment 31•10 years ago
Kevin,
I believe Dwayne gave you access to locamotion.org. The files are complete for Xhosa as well.
http://mozilla.locamotion.org/xx/mozilla_lang/main.lang.po
http://mozilla.locamotion.org/xx/mozilla_lang/mozorg/home/index.lang.po
http://mozilla.locamotion.org/xx/mozilla_lang/firefox/os/index.lang.po
http://mozilla.locamotion.org/xx/mozilla_lang/firefox/os/devices.lang.po
http://mozilla.locamotion.org/xx/mozilla_lang/firefox/os/faq.lang.po
http://mozilla.locamotion.org/xx/mozilla_lang/firefox/partners/index.lang.po
http://mozilla.locamotion.org/xx/mozilla_lang/firefoxos/firefoxos.lang.po
http://mozilla.locamotion.org/xx/mozilla_lang/legal/index.lang.po
http://mozilla.locamotion.org/xx/mozilla_lang/tabzilla/tabzilla.lang.po
Comment 32•10 years ago
So are we ready to land and close this one?
Flags: needinfo?(ian.henderson)
Reporter
Comment 33•10 years ago
Hi Howie: Ian actually doesn't land stuff, he works with Kevin Scannell on generating wordlists for this. Also, there's no patch in this bug.
I think this is currently still a WIP.
Reporter
Comment 34•10 years ago
Also, as per comment 7, we need to work on an autocorrect engine to make this happen. We're looking into resolving this issue
Updated•10 years ago
Flags: needinfo?(ian.henderson)
Comment 35•10 years ago
Moving this out of 2.2 as our engine needs an update. Bug 1139255 will need to be completed first.
blocking-b2g: 2.2+ → ---
Comment 36•10 years ago
Kevin,
Some more raw data for your word lists:
https://localize.mozilla.org/xh/masterfirefoxos/
https://localize.mozilla.org/xh/sumo/
Assignee
Updated•8 years ago
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX