Closed
Bug 838227
Opened 12 years ago
Closed 12 years ago
Implement predictive text using ternary DAGs
Categories
(Firefox OS Graveyard :: Gaia::Keyboard, defect)
Tracking
(blocking-b2g:-, b2g18+ fixed)
VERIFIED
FIXED
blocking-b2g: -
People
(Reporter: ckerschb, Assigned: ckerschb)
References
Details
(Keywords: access, Whiteboard: ux-tracking)
Attachments
(1 file)
patch (deleted) | gal: review+ | gwagner: review+ | gal: approval-gaia-v1+
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:18.0) Gecko/20100101 Firefox/18.0 Build ID: 20130116073211
Updated•12 years ago
Assignee: nobody → mozilla
Status: UNCONFIRMED → NEW
Ever confirmed: true
Updated•12 years ago
Component: DOM: Device Interfaces → Gaia::Keyboard
Product: Core → Boot2Gecko
QA Contact: wachen
Comment 1•12 years ago (Assignee)
Dictionary size improvement, ternary DAGs vs. the current implementation (old size in parentheses):

de.dict     3.4MB (5.6MB)
en_us.dict  1.8MB (3.6MB)
es.dict     1.7MB (3.1MB)
fr.dict     2.8MB (4.7MB)
pt_br.dict  2.0MB (3.7MB)
Comment 2•12 years ago
Apologies; a commit landed with this bug number instead of Bug 838827. Leaving this as a breadcrumb trail. https://hg.mozilla.org/mozilla-central/rev/595da71ae45f
OS: Mac OS X → Gonk (Firefox OS)
Hardware: x86 → All
Comment 3•12 years ago (Assignee)
Performance comparison for input word 'restaurant' (left = DAG, right = prefix trie):

r        48 ms |  97 ms
re       49 ms | 192 ms
res      72 ms | 141 ms
rest     56 ms | 108 ms
resta    65 ms |  85 ms
restau   69 ms |  49 ms
restaur  75 ms |  50 ms
Comment 4•12 years ago (Assignee)
Currently we are evaluating two different node encodings:

node {
  int16 ch;
  int16 lPtr;       // left child
  int16 cPtr;       // center child
  int16 rPtr;       // right child
  int16 nPtr;       // next child, holds a pointer to the node with the next highest frequency
  int16 high;       // holds an overflow byte for lPtr, cPtr, rPtr, nPtr
  int16 frequency;
}

de.dict     3.4MB
en_us.dict  1.8MB
es.dict     1.7MB
fr.dict     2.8MB
pt_br.dict  2.0MB

r        48 ms
re       49 ms
res      72 ms
rest     56 ms
resta    65 ms
restau   69 ms
restaur  75 ms

source: https://github.com/ckerschb/gaia/tree/dag_ll

==============================================================

node {
  int16 frequency;
  int16 ch;
  int32 lPtr;
  int32 cPtr;
  int32 rPtr;
}

de.dict     6.6MB
en_us.dict  3.1MB
es.dict     2.8MB
fr.dict     4.9MB
pt_br.dict  3.1MB

r        14 ms
re       78 ms
res      70 ms
rest     42 ms
resta     6 ms
restau   26 ms
restaur   6 ms

source: https://github.com/ckerschb/gaia/tree/dag_typos
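For readers unfamiliar with the `high` overflow field in the first encoding, here is one plausible interpretation, sketched in Python. This is a hedged illustration only; the actual bit layout in the dag_ll branch may differ. The assumption is that each of the four 16-bit pointer fields borrows 4 extra high bits from the shared `high` word, giving 20-bit node addresses:

```python
# Hypothetical sketch of the shared-overflow-pointer idea; the real bit
# layout in the dag_ll branch may differ.

def pack_pointers(lptr, cptr, rptr, nptr):
    """Split four 20-bit pointers into 16-bit low words plus one shared high word."""
    ptrs = [lptr, cptr, rptr, nptr]
    lows = [p & 0xFFFF for p in ptrs]
    high = 0
    for i, p in enumerate(ptrs):
        high |= ((p >> 16) & 0xF) << (i * 4)  # 4 overflow bits per pointer
    return lows, high

def unpack_pointers(lows, high):
    """Recombine each low word with its overflow nibble."""
    return [lows[i] | (((high >> (i * 4)) & 0xF) << 16) for i in range(4)]

lows, high = pack_pointers(70000, 123, 65535, 0x3FFFF)
assert unpack_pointers(lows, high) == [70000, 123, 65535, 0x3FFFF]
```

Under that assumption, seven 16-bit words per node can still address up to 2^20 nodes, versus the three full 32-bit pointers of the second encoding, which is consistent with the roughly halved dictionary sizes reported above.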
Comment 5•12 years ago (Assignee)
Next, we should try to apply the same technique with the pointer overflow byte to the second approach and see if we can further reduce the size of the dictionaries.
Comment 6•12 years ago (Assignee)
We improved the quality of the engine by handling:

1) insertion, deletion, replacement, and transposition of characters
2) upper/lower case predictions
3) diacritics (e.g. umlauts)

Performance:

r          49 ms
re         51 ms
res        25 ms
rest        4 ms
resta       8 ms
restau      8 ms
restaur     5 ms
restaura   54 ms
restauran   7 ms

Find the code here: https://github.com/ckerschb/gaia/tree/tst

I guess this is ready to merge; I will do a pull request!
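For illustration, the four character edits listed in 1) can be sketched as a standalone candidate generator. This is not the Gaia engine code (which would presumably apply these edits while walking the tree rather than enumerating whole strings); it only shows what the four operations produce:

```python
# Illustrative sketch of one-edit candidate generation; not the actual
# engine code, which applies edits during tree traversal.
ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def edit_candidates(prefix):
    """Return all strings within one insertion, deletion, replacement,
    or transposition of `prefix`."""
    cands = set()
    for i in range(len(prefix) + 1):
        for ch in ALPHABET:                               # insertion
            cands.add(prefix[:i] + ch + prefix[i:])
    for i in range(len(prefix)):
        cands.add(prefix[:i] + prefix[i + 1:])            # deletion
        for ch in ALPHABET:                               # replacement
            cands.add(prefix[:i] + ch + prefix[i + 1:])
    for i in range(len(prefix) - 1):                      # transposition
        cands.add(prefix[:i] + prefix[i + 1] + prefix[i] + prefix[i + 2:])
    cands.discard(prefix)
    return cands

assert "test" in edit_candidates("tset")    # transposition fixes the 'ts' swap
assert "rest" in edit_candidates("resst")   # deletion drops the extra 's'
```

Each candidate can then be looked up in the dictionary, with matches ranked by word frequency.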
Comment 7•12 years ago (Assignee)
Attachment #721279 - Flags: review+
Updated•12 years ago
Attachment #721279 - Flags: review+ → review?(dflanagan)
Comment 8•12 years ago (Assignee)
https://github.com/mozilla-b2g/gaia/pull/8454
Comment 9•12 years ago
Sorry the review is taking a while. It's in my queue, but I've got some blocker bugs that are taking up my time. I look forward to reviewing this soon!
Comment 10•12 years ago
r=me with the nits by gregor and me addressed. David, feel free to request additional changes as a follow-up when you get around to this. Very nice work guys.
Updated•12 years ago
Attachment #721279 - Flags: review?(dflanagan)
Attachment #721279 - Flags: review?
Attachment #721279 - Flags: review+
Comment 11•12 years ago (Assignee)
Updated the patch following the suggestions of Gregor and Andreas.
Comment 12•12 years ago (Assignee)
Please find the new pull request here: https://github.com/mozilla-b2g/gaia/pull/8605
Comment 13•12 years ago
Guys, can you clarify the benefits of this implementation? I did some background reading at http://www.codeproject.com/Articles/32089/Using-Ternary-DAGs-for-Spelling-Correction. My understanding is that a ternary DAG enables 1) smaller dictionary file sizes and 2) faster word suggestions. If so, is the determination of _how_ to surface those recommendations (e.g. as a row of buttons across the top of the keyboard, versus as a button hovering near the cursor in the text field) an independent issue?
Updated•12 years ago
Whiteboard: u=user c=keyboard s=ux-most-wanted
Comment 14•12 years ago (Assignee)
Both correct: ternary DAGs allow for 1) smaller dictionaries and 2) faster suggestions. This implementation focuses solely on the prediction engine, which takes an input (partial) word and returns an array of suggestions; everything else is an independent issue.
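As a rough sketch of the lookup half of that contract (hypothetical Python, not the actual predictions.js, and ignoring frequency ranking and fuzzy matching): walk a ternary search tree with the typed prefix, then collect every completion below the prefix node.

```python
# Minimal ternary-search-tree prefix lookup; illustrative only, not the
# Gaia engine (no frequencies, no error correction).

class TNode:
    def __init__(self, ch):
        self.ch = ch
        self.left = self.mid = self.right = None
        self.word = None  # set to the full word at terminal nodes

def insert(root, word, i=0):
    ch = word[i]
    if root is None:
        root = TNode(ch)
    if ch < root.ch:
        root.left = insert(root.left, word, i)
    elif ch > root.ch:
        root.right = insert(root.right, word, i)
    elif i + 1 < len(word):
        root.mid = insert(root.mid, word, i + 1)
    else:
        root.word = word
    return root

def suggest(root, prefix):
    node, i = root, 0
    while node and i < len(prefix):           # walk down with the prefix
        ch = prefix[i]
        if ch < node.ch:
            node = node.left
        elif ch > node.ch:
            node = node.right
        else:
            i += 1
            if i == len(prefix):
                break
            node = node.mid
    if node is None or i < len(prefix):
        return []
    out = [node.word] if node.word else []
    stack = [node.mid]
    while stack:                              # collect all completions
        n = stack.pop()
        if n:
            if n.word:
                out.append(n.word)
            stack.extend([n.left, n.mid, n.right])
    return out

root = None
for w in ["rest", "restaurant", "resume"]:
    root = insert(root, w)
assert sorted(suggest(root, "rest")) == ["rest", "restaurant"]
```

The real engine additionally orders the collected words by frequency and caps the result at a small number of suggestions.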
Updated•12 years ago
Attachment #721279 - Flags: review? → review+
Comment 15•12 years ago
https://github.com/mozilla-b2g/gaia/commit/a783fe0514709000bc63ae73ea5138707b82348d
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Comment 16•12 years ago
tracking? -- significant improvements in text prediction
tracking-b2g18: --- → ?
Comment 17•12 years ago
This looks like a feature and we're now past the 3/15 feature complete date - is there a partner or product push for this to be uplifted?
Comment 18•12 years ago
Adding access keyword because this helps accessibility (input impairment).
Keywords: access
Comment 19•12 years ago
(In reply to Lukas Blakk [:lsblakk] from comment #17)
> This looks like a feature and we're now past the 3/15 feature complete date
> - is there a partner or product push for this to be uplifted?

What does this mean now? I understand that we don't want to take it for 1.0.1, but we are shipping future versions based on b2g18. Does this mean we are not allowed to put any code in b2g18 that is not requested by a partner? That doesn't sound right to me.
Comment 20•12 years ago
No partner is (yet) asking for this -- this one is sponsored by Mozilla. UX can probably best speak to how important they think this is for users. needinfo to Josh. Ideally I think we'd like to ship with text prediction turned on but we were forced to turn it off by default in 1.0.1 because performance was too slow.
Flags: needinfo?(jcarpenter)
Comment 21•12 years ago
It's definitely too late for 1.0.1. 1.1 could be an option, but we are late in the cycle for features too, and there's no room in that schedule for chasing regressions and performance work for non-blocking features. If we want to land it, though, we should do so now to enable maximum testing time.
Comment 22•12 years ago
Sorry, I didn't mean to imply 1.0.1 -- I would like to lobby for 1.1, though.
Comment 23•12 years ago
If the UX team is making a case for this in v1.1 go ahead with nomination for uplift.
Comment 24•11 years ago
Hello folks. Sorry for the slow response; was on PTO last week.

(In reply to Lucas Adamski [:ladamski] (plz needinfo) from comment #21)
> Its definitely too late for 1.0.1. 1.1 could be an option but we are late
> in the cycle for features too and there's no room in that schedule for
> chasing regressions and performance work for non-blocking features. IF we
> want to land it we should do so now though to enable maximum testing time.

In general we're strongly in favor of improvements to keyboard quality. If this improves performance of Word Suggestions to the point that we can turn them on by default, that's quite exciting. Who can I work with to get this loaded onto a build and tested? If the responsiveness is acceptable I'd push strongly for a v1.1 uplift.
Flags: needinfo?(jcarpenter)
Comment 25•11 years ago
Hey folks, I'm working w/ Gordon to test this and will follow up tomorrow (Friday Mar 29).
Comment 26•11 years ago
Hi all, let's uplift this. Improved keyboard performance is one of our highest priorities right now, and this looks like it may enable us to turn suggestions back on, which is huge. Let me know what I can do to nudge the process along. I'm not up to speed on the latest processes for nominations, uplifts, etc.
Comment 27•11 years ago
Comment on attachment 721279 [details] [diff] [review]
new pull request on github

NOTE: Please see https://wiki.mozilla.org/Release_Management/B2G_Landing to better understand the B2G approval process and landings.

[Approval Request Comment]
Bug caused by (feature/regressing bug #):
User impact if declined: Current performance of predictive text is known to be slow and partners are unhappy. We are unable to turn on word suggestions by default, so the keyboard experience is compromised.
Testing completed:
Risk to taking this patch (and alternatives if risky): Relatively low risk, in that we have only changed the way we generate the predictions and have not made any changes to UX. However, we will also need to fix bug 851565 in a follow-up.
String or UUID changes made by this patch: none
Attachment #721279 - Flags: approval-gaia-v1?
Comment 28•11 years ago
Comment on attachment 721279 [details] [diff] [review]
new pull request on github

Approving for v1 due to the keyboard performance concerns people are raising.
Attachment #721279 - Flags: approval-gaia-v1? → approval-gaia-v1+
Comment 30•11 years ago
The pull request doesn't show it being merged to master.... did it land there? What commit is it on master branch?
Comment 31•11 years ago
(In reply to John Ford [:jhford] from comment #30)
> The pull request doesn't show it being merged to master.... did it land
> there? What commit is it on master branch?

Hm, isn't this master? https://github.com/mozilla-b2g/gaia/commit/a783fe0514709000bc63ae73ea5138707b82348d
Comment 32•11 years ago
Ahh, didn't see comment 15; the pull request was closed without the merge taking place there. You also already cherry-picked this onto v1-train :)

v1-train: 1dfb75d6a6c7478b2d3251ea510769fc4a8a26b9
status-b2g18: --- → fixed
Comment 33•11 years ago
I don't think this is the right time to add this to v1.0.1, but I'm happy to hear other opinions.
blocking-b2g: tef? → -
Comment 34•11 years ago
Christoph,

As a very late review comment, I'd like to ask you to document (here, or even better in the code) how the linked list (nPtr) works. I haven't been able to figure it out by reading your code, and I have serious concerns about our ability to make good predictions because of the way the conversion from TST to DAG combines suffixes and discards word frequency information. I assume that the linked list is intended to work around this problem, but I'd like to understand the workaround before we go any further with the DAG-based dictionaries.

http://www.strchr.com/ternary_dags is about offering suggestions for misspelled words. In this case the input is a complete word that the user is done typing. It is only necessary to find a full list of correctly-spelled words that are close to it. This can be done adequately without word frequency information.

But for word prediction, we're trying to guess what the user wants to type, and I think that doing that requires us to know how common different words are. Since the DAG approach combines common suffixes, how do we store the frequency for two words (like "reading" and "writing") that end with the same suffix?

The word "being" has a frequency of 167, and the word "yacking" has a frequency of 1. I see code in the dictionary creation script that averages node frequencies. Is that what's happening in this case? Would "being" and "yacking" end up with the same frequency? Or does the linked list allow us to avoid this problem?

For quality auto-correction and word prediction, I don't think we want to compromise on the word frequency data. So if the linked list isn't a fully adequate workaround, I think we should use a bigger dictionary to get better results. We could switch back to a plain TST (doubling the dictionary size) or compromise and only allow combining of suffixes for words whose frequency is less than some threshold.

(In the English dictionary, there are 12.5K words with a frequency of 100 and above, and 148K words with frequencies 99 and below, so it might not be too big a memory hit to create a complete tree for all of the high-frequency words.)
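The averaging concern raised here can be made concrete with a small sketch. This is hypothetical Python; `Node`, `mergeable`, and `merge` are illustrative names, not the actual xml2dict.py code, and only the frequency-averaging behavior is modeled:

```python
# Hypothetical sketch of TST-to-DAG suffix merging with frequency averaging;
# names and structure are illustrative, not the real xml2dict.py.

class Node:
    def __init__(self, ch, freq, center=None):
        self.ch, self.freq, self.center = ch, freq, center

def mergeable(a, b):
    """Suffix chains merge when every character matches (frequencies ignored)."""
    if a is None or b is None:
        return a is b
    return a.ch == b.ch and mergeable(a.center, b.center)

def merge(a, b):
    """Collapse chain b into a, averaging frequencies along the shared suffix."""
    while a and b:
        a.freq = (a.freq + b.freq) // 2
        a, b = a.center, b.center

# "being" (f=167) and "yacking" (f=2) share the suffix "ing":
ing_being = Node('i', 167, Node('n', 167, Node('g', 167)))
ing_yacking = Node('i', 2, Node('n', 2, Node('g', 2)))
if mergeable(ing_being, ing_yacking):
    merge(ing_being, ing_yacking)
assert ing_being.freq == 84  # both words now look equally frequent past 'i'
```

Under this model, the averaged value 84 is what both words report once the shared suffix is reached, which matches the f: 84 visible in the node dump Christoph posts in comment 35.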
Comment 35•11 years ago (Assignee)
David, for illustration purposes I created an XML file based on your example:

<wordlist locale="en_US" description="English (US)" date="1340038693" version="16">
  <w f="167" flags="">being</w>
  <w f="2" flags="">yacking</w>
</wordlist>

Please note that I updated the frequency of yacking from 1 to 2, because we skip words with a frequency less than or equal to 1. I compiled the XML to a dict file using:

python3 xml2dict.py dictionaries/test.xml -v -o test.dict

Using the -v option allows us to inspect the generated TST (including the DAG):

[1]  { ch: b, L: 0, C: 2,  R: 3, N: 3, h: 0, f: 167}
[2]  { ch: e, L: 0, C: 4,  R: 0, N: 0, h: 0, f: 167}
[3]  { ch: y, L: 0, C: 5,  R: 0, N: 0, h: 0, f: 2}
[4]  { ch: i, L: 0, C: 6,  R: 0, N: 0, h: 0, f: 84}
[5]  { ch: a, L: 0, C: 7,  R: 0, N: 0, h: 0, f: 2}
[6]  { ch: n, L: 0, C: 8,  R: 0, N: 0, h: 0, f: 84}
[7]  { ch: c, L: 0, C: 9,  R: 0, N: 0, h: 0, f: 2}
[8]  { ch: g, L: 0, C: 10, R: 0, N: 0, h: 0, f: 84}
[9]  { ch: k, L: 0, C: 4,  R: 0, N: 0, h: 0, f: 2}
[10] { ch: $, L: 0, C: 0,  R: 0, N: 0, h: 0, f: 84}

You are right, we store the average for frequencies in common suffix nodes (e.g. node [4]).

Loading the dict file and applying the following diff shows what happens behind the scenes:

diff --git a/apps/keyboard/js/imes/latin/predictions.js b/apps/keyboard/js/imes/latin/predictions.js
index c4adf30..cc66e84 100644
--- a/apps/keyboard/js/imes/latin/predictions.js
+++ b/apps/keyboard/js/imes/latin/predictions.js
@@ -277,6 +277,7 @@ var Predictions = function() {
     }
     // Record the suggestion and move to the next best candidate
     if (!(prefix in _suggestions_index)) {
+      dump("cand: " + cand.prefix + ", sugg: " + prefix + ", cand.freq: " + cand.freq + ", mult: " + cand.multiplier + "\n");
       _suggestions.push(prefix);
       _suggestions_index[prefix] = true;
     }

The user taps 'b':

cand: b, sugg: being, cand.freq: 668, mult: 4
cand: , sugg: yacking, cand.freq: 2, mult: 1

The user taps 'y':

cand: , sugg: being, cand.freq: 167, mult: 1
cand: y, sugg: yacking, cand.freq: 8, mult: 4

We see that both 'being' for 'b' and 'yacking' for 'y' are suggested, displaying the right frequency. For longer input words, the user taps 'bei':

cand: bei, sugg: being, cand.freq: 168, mult: 2

You were again right with your assumption that we now use the average frequency in the node. In my opinion, this doesn't really matter because we have already narrowed the search space to 'bei'.

Anyway, one of the big issues we had with our previous engine was the dictionary size and lookup speed. We have optimized the file size of the dictionary as well as the lookup speed using TDAGs. In my opinion we are providing accurate predictions even with averaging the suffix frequencies in the node. I guess it's not my decision, Andreas?
Comment 36•11 years ago
Thanks for working through the example. I wish I'd proposed something more realistic! I'm not so worried about the frequency of "being" being under-estimated when the user types "bei" as I am about the frequency of "yacking" being inflated when the user types "yacki". If "yackic" were a real word with a medium frequency of 50 or something, wouldn't the low-frequency "yacking" be suggested first?

Again, I wish I had a realistic example. http://www.strchr.com/ternary_dags mentions "head" and "read" as words that share a suffix (and I'd add "bead", "dead", "lead", and "mead"). The fact that these have common suffixes so close to their stem means that their frequency will be indistinguishable after the user has typed the e. "lead" is common and "mead" is uncommon. But l and m are close on the keyboard and one could be mis-typed for the other. Once we start looking at the actual x,y coordinates of the user's touch, consider what happens if the user types right in between the l and the m and then types e and a (and hits those two keys right in the center so they are unambiguous). If we don't have frequency information to distinguish the common word from the uncommon one, we can't make a meaningful prediction here.

I'm not sure that example is one that would make a practical difference, so I'm sort of playing devil's advocate here, but I am still concerned about this.
Comment 37•11 years ago (Assignee)
David, I am totally with you. There is a realistic chance that this algorithm overestimates the actual frequency of a candidate. On the other hand, we return 3 suggestions total, which means we would still display 2 other suggestions in such a case. Having the same overestimation problem on all 3 suggestions is rather unlikely, isn't it?

As already mentioned, the dictionary size was a bottleneck in the previous implementation of the engine. Therefore I applied the following patch and compiled all dictionaries again, to see what difference it would make if we only compress suffix nodes with the same frequency (which would avoid averaging the frequency in compressed nodes).

diff --git a/apps/keyboard/js/imes/latin/dictionaries/xml2dict.py b/apps/keyboard/js/imes/latin/dictionaries/xml2dict.py
index e7ba3e1..04d0243 100644
--- a/apps/keyboard/js/imes/latin/dictionaries/xml2dict.py
+++ b/apps/keyboard/js/imes/latin/dictionaries/xml2dict.py
@@ -218,6 +218,7 @@ class TSTTree:
         # children are equal
         mapping[nodeA] = nodeB
         return mapping if nodeA.ch == nodeB.ch and \
+            nodeA.frequency == nodeB.frequency and \
             self.equal(nodeA.left, nodeB.left, mapping) and \
             self.equal(nodeA.center, nodeB.center, mapping) and \
             self.equal(nodeA.right, nodeB.right, mapping) else False

This minor fix compresses suffix nodes only if the frequency of every node matches in addition to the character, which is reflected in the bigger file size of the dictionaries:

de.dict     3.4MB vs. 11.9MB
en_us.dict  1.8MB vs.  5.5MB
es.dict     1.7MB vs.  5.7MB
fr.dict     2.8MB vs.  8.9MB
pt_br.dict  2.0MB vs.  6.4MB

As we can see, compressing suffix nodes only if both the character and the frequency match increases the file size dramatically. For en_us.dict, for example, we create 474298 - 79632 = 394666 nodes, in comparison to 474298 - 345723 = 128575 nodes when we average the frequency in compressed nodes. This means our algorithm compresses 345k nodes if only the character in the node has to match, whereas we can only compress 79k nodes if both the character and the frequency have to match.

I guess if we really want more accuracy here, it's a better idea to switch algorithms again, because such big dictionaries are not an option from what I remember.
Updated•11 years ago
Whiteboard: u=user c=keyboard s=ux-most-wanted → ux-tracking
Comment 38•11 years ago
The Predictive Text keyboard feature has been implemented and is functional on the current master build.

Tested on b2g/nightly/mozilla-central-unagi/latest.
Kernel Date: Dec 5
Gecko: http://hg.mozilla.org/mozilla-central/rev/3c6f2394995d
Gaia: d2ad0f1a5a6be8a2d6df721ba3c2b33e037c423b
Status: RESOLVED → VERIFIED
Comment 39•11 years ago
Please add a test case for this bug to moztrap for the 1.1 test suite. If one is added, mark this in-moztrap+ when completed; if not, mark this in-moztrap-.
Flags: in-moztrap?(cschmoeckel)
Comment 40•11 years ago
Added Keyboard Suite test cases:
TC #8501 - [Keyboard] Auto Correct's text prediction works correctly in the English language
TC #8502 - [Keyboard] Text prediction text entry in English keeps the capitalization entered by the user
Flags: in-moztrap?(cschmoeckel) → in-moztrap+