Closed
Bug 838227
Opened 12 years ago
Closed 12 years ago
Implement predictive text using ternary DAGs
Categories
(Firefox OS Graveyard :: Gaia::Keyboard, defect)
Tracking
(blocking-b2g:-, b2g18+ fixed)
VERIFIED
FIXED
blocking-b2g: -
People
(Reporter: ckerschb, Assigned: ckerschb)
References
Details
(Keywords: access, Whiteboard: ux-tracking)
Attachments
(1 file)
patch (deleted) | gal: review+ | gwagner: review+ | gal: approval-gaia-v1+
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:18.0) Gecko/20100101 Firefox/18.0 Build ID: 20130116073211
Updated•12 years ago
Assignee: nobody → mozilla
Status: UNCONFIRMED → NEW
Ever confirmed: true
Updated•12 years ago
Component: DOM: Device Interfaces → Gaia::Keyboard
Product: Core → Boot2Gecko
QA Contact: wachen
Comment 1•12 years ago (Assignee)
Dictionary size improvement, ternary DAGs vs. the current implementation (old size in parentheses):

de.dict     3.4MB (5.6MB)
en_us.dict  1.8MB (3.6MB)
es.dict     1.7MB (3.1MB)
fr.dict     2.8MB (4.7MB)
pt_br.dict  2.0MB (3.7MB)
Comment 2•12 years ago
Apologies; a commit landed with this bug number instead of Bug 838827. Leaving this as a breadcrumb trail. https://hg.mozilla.org/mozilla-central/rev/595da71ae45f
OS: Mac OS X → Gonk (Firefox OS)
Hardware: x86 → All
Comment 3•12 years ago (Assignee)
Performance comparison for input word 'restaurant' (left = DAG, right = prefix trie):

r        48 ms |  97 ms
re       49 ms | 192 ms
res      72 ms | 141 ms
rest     56 ms | 108 ms
resta    65 ms |  85 ms
restau   69 ms |  49 ms
restaur  75 ms |  50 ms
Comment 4•12 years ago (Assignee)
Currently we are evaluating two different node encodings:

node {
  int16 ch;
  int16 lPtr;       // left child
  int16 cPtr;       // center child
  int16 rPtr;       // right child
  int16 nPtr;       // next child, holds a pointer to the node with the next highest frequency
  int16 high;       // holds an overflow byte for lPtr, cPtr, rPtr, nPtr
  int16 frequency;
}

de.dict     3.4MB
en_us.dict  1.8MB
es.dict     1.7MB
fr.dict     2.8MB
pt_br.dict  2.0MB

r        48 ms
re       49 ms
res      72 ms
rest     56 ms
resta    65 ms
restau   69 ms
restaur  75 ms

source: https://github.com/ckerschb/gaia/tree/dag_ll

==============================================================

node {
  int16 frequency;
  int16 ch;
  int32 lPtr;
  int32 cPtr;
  int32 rPtr;
}

de.dict     6.6MB
en_us.dict  3.1MB
es.dict     2.8MB
fr.dict     4.9MB
pt_br.dict  3.1MB

r        14 ms
re       78 ms
res      70 ms
rest     42 ms
resta     6 ms
restau   26 ms
restaur   6 ms

source: https://github.com/ckerschb/gaia/tree/dag_typos
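For readers unfamiliar with the `high` overflow field in the first encoding, here is one plausible interpretation, sketched in Python. This is a hedged illustration only; the actual bit layout in the dag_ll branch may differ. The assumption is that each of the four 16-bit pointer fields borrows 4 extra high bits from the shared `high` word, giving 20-bit node addresses:

```python
# Hypothetical sketch of the shared-overflow-pointer idea; the real bit
# layout in the dag_ll branch may differ.

def pack_pointers(lptr, cptr, rptr, nptr):
    """Split four 20-bit pointers into 16-bit low words plus one shared high word."""
    ptrs = [lptr, cptr, rptr, nptr]
    lows = [p & 0xFFFF for p in ptrs]
    high = 0
    for i, p in enumerate(ptrs):
        high |= ((p >> 16) & 0xF) << (i * 4)  # 4 overflow bits per pointer
    return lows, high

def unpack_pointers(lows, high):
    """Recombine each low word with its overflow nibble."""
    return [lows[i] | (((high >> (i * 4)) & 0xF) << 16) for i in range(4)]

lows, high = pack_pointers(70000, 123, 65535, 0x3FFFF)
assert unpack_pointers(lows, high) == [70000, 123, 65535, 0x3FFFF]
```

Under that assumption, seven 16-bit words per node can still address up to 2^20 nodes, versus the three full 32-bit pointers of the second encoding, which is consistent with the roughly halved dictionary sizes reported above.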
Comment 5•12 years ago (Assignee)
Next, we should try to apply the same technique with the pointer overflow byte to the second approach and see if we can further reduce the size of the dictionaries.
Comment 6•12 years ago (Assignee)
We improved the quality of the engine by handling:

1) insertion, deletion, replacement, and transposition of characters
2) upper/lower case predictions
3) diacritics (e.g. umlauts)

Performance:

r          49 ms
re         51 ms
res        25 ms
rest        4 ms
resta       8 ms
restau      8 ms
restaur     5 ms
restaura   54 ms
restauran   7 ms

Find the code here: https://github.com/ckerschb/gaia/tree/tst

I guess this is ready to merge; I will do a pull request!
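For illustration, the four character edits listed in 1) can be sketched as a standalone candidate generator. This is not the Gaia engine code (which would presumably apply these edits while walking the tree rather than enumerating whole strings); it only shows what the four operations produce:

```python
# Illustrative sketch of one-edit candidate generation; not the actual
# engine code, which applies edits during tree traversal.
ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def edit_candidates(prefix):
    """Return all strings within one insertion, deletion, replacement,
    or transposition of `prefix`."""
    cands = set()
    for i in range(len(prefix) + 1):
        for ch in ALPHABET:                               # insertion
            cands.add(prefix[:i] + ch + prefix[i:])
    for i in range(len(prefix)):
        cands.add(prefix[:i] + prefix[i + 1:])            # deletion
        for ch in ALPHABET:                               # replacement
            cands.add(prefix[:i] + ch + prefix[i + 1:])
    for i in range(len(prefix) - 1):                      # transposition
        cands.add(prefix[:i] + prefix[i + 1] + prefix[i] + prefix[i + 2:])
    cands.discard(prefix)
    return cands

assert "test" in edit_candidates("tset")    # transposition fixes the 'ts' swap
assert "rest" in edit_candidates("resst")   # deletion drops the extra 's'
```

Each candidate can then be looked up in the dictionary, with matches ranked by word frequency.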
Comment 7•12 years ago (Assignee)
Attachment #721279 - Flags: review+
Updated•12 years ago
Attachment #721279 - Flags: review+ → review?(dflanagan)
Comment 8•12 years ago (Assignee)
https://github.com/mozilla-b2g/gaia/pull/8454
Comment 9•12 years ago
Sorry the review is taking a while. It's in my queue, but I've got some blocker bugs that are taking up my time. I look forward to reviewing this soon!
Comment 10•12 years ago
r=me with the nits by gregor and me addressed. David, feel free to request additional changes as a follow-up when you get around to this. Very nice work guys.
Updated•12 years ago
Attachment #721279 - Flags: review?(dflanagan)
Attachment #721279 - Flags: review?
Attachment #721279 - Flags: review+
Comment 11•12 years ago (Assignee)
Updated the patch following the suggestions of Gregor and Andreas.
Comment 12•12 years ago (Assignee)
Please find the new pull request here: https://github.com/mozilla-b2g/gaia/pull/8605
Comment 13•12 years ago
Guys, can you clarify the benefits of this implementation? I did some background reading at http://www.codeproject.com/Articles/32089/Using-Ternary-DAGs-for-Spelling-Correction. My understanding is that a ternary DAG enables 1) smaller dictionary file sizes and 2) faster word suggestions. If so, is the determination of _how_ to surface those recommendations (e.g. as a row of buttons across the top of the keyboard, versus as a button hovering near the cursor in the text field) an independent issue?
Updated•12 years ago
Whiteboard: u=user c=keyboard s=ux-most-wanted
Comment 14•12 years ago (Assignee)
Both correct: ternary DAGs allow for 1) smaller dictionaries and 2) faster suggestions. This implementation focuses solely on the prediction engine, which takes an input (partial) word and returns an array of suggestions; everything else is an independent issue.
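As a rough sketch of the lookup half of that contract (hypothetical Python, not the actual predictions.js, and ignoring frequency ranking and fuzzy matching): walk a ternary search tree with the typed prefix, then collect every completion below the prefix node.

```python
# Minimal ternary-search-tree prefix lookup; illustrative only, not the
# Gaia engine (no frequencies, no error correction).

class TNode:
    def __init__(self, ch):
        self.ch = ch
        self.left = self.mid = self.right = None
        self.word = None  # set to the full word at terminal nodes

def insert(root, word, i=0):
    ch = word[i]
    if root is None:
        root = TNode(ch)
    if ch < root.ch:
        root.left = insert(root.left, word, i)
    elif ch > root.ch:
        root.right = insert(root.right, word, i)
    elif i + 1 < len(word):
        root.mid = insert(root.mid, word, i + 1)
    else:
        root.word = word
    return root

def suggest(root, prefix):
    node, i = root, 0
    while node and i < len(prefix):           # walk down with the prefix
        ch = prefix[i]
        if ch < node.ch:
            node = node.left
        elif ch > node.ch:
            node = node.right
        else:
            i += 1
            if i == len(prefix):
                break
            node = node.mid
    if node is None or i < len(prefix):
        return []
    out = [node.word] if node.word else []
    stack = [node.mid]
    while stack:                              # collect all completions
        n = stack.pop()
        if n:
            if n.word:
                out.append(n.word)
            stack.extend([n.left, n.mid, n.right])
    return out

root = None
for w in ["rest", "restaurant", "resume"]:
    root = insert(root, w)
assert sorted(suggest(root, "rest")) == ["rest", "restaurant"]
```

The real engine additionally orders the collected words by frequency and caps the result at a small number of suggestions.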
Updated•12 years ago
Attachment #721279 - Flags: review? → review+
Comment 15•12 years ago
https://github.com/mozilla-b2g/gaia/commit/a783fe0514709000bc63ae73ea5138707b82348d
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Comment 16•12 years ago
tracking? -- significant improvements in text prediction
tracking-b2g18: --- → ?
Comment 17•12 years ago
This looks like a feature and we're now past the 3/15 feature complete date - is there a partner or product push for this to be uplifted?
Comment 18•12 years ago
Adding access keyword because this helps accessibility (input impairment).
Keywords: access
Comment 19•12 years ago
(In reply to Lukas Blakk [:lsblakk] from comment #17)
> This looks like a feature and we're now past the 3/15 feature complete date
> - is there a partner or product push for this to be uplifted?

What does this mean now? I understand that we don't want to take it for 1.0.1, but we are shipping future versions based on b2g18. Does this mean we are not allowed to put any code in b2g18 that is not requested by a partner? That doesn't sound right to me.
Comment 20•12 years ago
No partner is (yet) asking for this -- this one is sponsored by Mozilla. UX can probably best speak to how important they think this is for users. needinfo to Josh. Ideally I think we'd like to ship with text prediction turned on but we were forced to turn it off by default in 1.0.1 because performance was too slow.
Flags: needinfo?(jcarpenter)
Comment 21•12 years ago
It's definitely too late for 1.0.1. 1.1 could be an option, but we are late in the cycle for features too, and there's no room in that schedule for chasing regressions and performance work for non-blocking features. If we want to land it, though, we should do so now to enable maximum testing time.
Comment 22•12 years ago
Sorry, I didn't mean to imply 1.0.1 -- I would like to lobby for 1.1, though.
Comment 23•12 years ago
If the UX team is making a case for this in v1.1 go ahead with nomination for uplift.
Comment 24•11 years ago
Hello folks. Sorry for the slow response; was on PTO last week.

(In reply to Lucas Adamski [:ladamski] (plz needinfo) from comment #21)
> Its definitely too late for 1.0.1. 1.1 could be an option but we are late
> in the cycle for features too and there's no room in that schedule for
> chasing regressions and performance work for non-blocking features. IF we
> want to land it we should do so now though to enable maximum testing time.

In general we're strongly in favor of improvements to keyboard quality. If this improves performance of Word Suggestions to the point that we can turn them on by default, that's quite exciting. Who can I work with to get this loaded onto a build and tested? If the responsiveness is acceptable I'd push strongly for a v1.1 uplift.
Flags: needinfo?(jcarpenter)
Comment 25•11 years ago
Hey folks, I'm working w/ Gordon to test this and will follow up tomorrow (Friday Mar 29).
Comment 26•11 years ago
Hi all, let's uplift this. Improved keyboard performance is one of our highest priorities right now, and this looks like it may enable us to turn suggestions back on, which is huge. Let me know what I can do to nudge the process along. I'm not up to speed on the latest processes for nominations, uplifts, etc.
Comment 27•11 years ago
Comment on attachment 721279 [details] [diff] [review]
new pull request on github

NOTE: Please see https://wiki.mozilla.org/Release_Management/B2G_Landing to better understand the B2G approval process and landings.

[Approval Request Comment]
Bug caused by (feature/regressing bug #):
User impact if declined: Current performance of predictive text is known to be slow and partners are unhappy. We are unable to turn on word suggestions by default, so the keyboard experience is compromised.
Testing completed:
Risk to taking this patch (and alternatives if risky): Relatively low risk, in that we have only changed the way we generate the predictions and have not made any changes to UX. However, we will also need to fix bug 851565 in a follow-up.
String or UUID changes made by this patch: none
Attachment #721279 - Flags: approval-gaia-v1?
Comment 28•11 years ago
Comment on attachment 721279 [details] [diff] [review]
new pull request on github

Approving for v1 due to the keyboard performance concerns people are raising.
Attachment #721279 - Flags: approval-gaia-v1? → approval-gaia-v1+
Comment 30•11 years ago
The pull request doesn't show it being merged to master.... did it land there? What commit is it on master branch?
Comment 31•11 years ago
(In reply to John Ford [:jhford] from comment #30)
> The pull request doesn't show it being merged to master.... did it land
> there? What commit is it on master branch?

Hm, isn't this master? https://github.com/mozilla-b2g/gaia/commit/a783fe0514709000bc63ae73ea5138707b82348d
Comment 32•11 years ago
Ahh, didn't see comment 15; the pull request was closed without the merge taking place there. You also already cherry-picked this onto v1-train :)

v1-train: 1dfb75d6a6c7478b2d3251ea510769fc4a8a26b9
status-b2g18: --- → fixed
Comment 33•11 years ago
I don't think this is the right time to add this to v1.0.1, but I'm happy to hear other opinions.
blocking-b2g: tef? → -
Comment 34•11 years ago
Christoph,

As a very late review comment, I'd like to ask you to document (here, or even better in the code) how the linked list (nPtr) works. I haven't been able to figure it out by reading your code, and I have serious concerns about our ability to make good predictions because of the way the conversion from TST to DAG combines suffixes and discards word frequency information. I assume that the linked list is intended to work around this problem, but I'd like to understand the workaround before we go any further with the DAG-based dictionaries.

http://www.strchr.com/ternary_dags is about offering suggestions for misspelled words. In this case the input is a complete word that the user is done typing. It is only necessary to find a full list of correctly-spelled words that are close to it. This can be done adequately without word frequency information.

But for word prediction, we're trying to guess what the user wants to type, and I think that doing that requires us to know how common different words are. Since the DAG approach combines common suffixes, how do we store the frequency for two words (like "reading" and "writing") that end with the same suffix?

The word "being" has a frequency of 167, and the word "yacking" has a frequency of 1. I see code in the dictionary creation script that averages node frequencies. Is that what's happening in this case? Would "being" and "yacking" end up with the same frequency? Or does the linked list allow us to avoid this problem?

For quality auto-correction and word prediction, I don't think we want to compromise on the word frequency data. So if the linked list isn't a fully adequate workaround, I think we should use a bigger dictionary to get better results. We could switch back to a plain TST (doubling the dictionary size) or compromise and only allow combining of suffixes for words whose frequency is less than some threshold.

(In the English dictionary, there are 12.5K words with a frequency of 100 and above, and 148K words with frequencies 99 and below, so it might not be too big a memory hit to create a complete tree for all of the high-frequency words.)
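The averaging concern raised here can be made concrete with a small sketch. This is hypothetical Python; `Node`, `mergeable`, and `merge` are illustrative names, not the actual xml2dict.py code, and only the frequency-averaging behavior is modeled:

```python
# Hypothetical sketch of TST-to-DAG suffix merging with frequency averaging;
# names and structure are illustrative, not the real xml2dict.py.

class Node:
    def __init__(self, ch, freq, center=None):
        self.ch, self.freq, self.center = ch, freq, center

def mergeable(a, b):
    """Suffix chains merge when every character matches (frequencies ignored)."""
    if a is None or b is None:
        return a is b
    return a.ch == b.ch and mergeable(a.center, b.center)

def merge(a, b):
    """Collapse chain b into a, averaging frequencies along the shared suffix."""
    while a and b:
        a.freq = (a.freq + b.freq) // 2
        a, b = a.center, b.center

# "being" (f=167) and "yacking" (f=2) share the suffix "ing":
ing_being = Node('i', 167, Node('n', 167, Node('g', 167)))
ing_yacking = Node('i', 2, Node('n', 2, Node('g', 2)))
if mergeable(ing_being, ing_yacking):
    merge(ing_being, ing_yacking)
assert ing_being.freq == 84  # both words now look equally frequent past 'i'
```

Under this model, the averaged value 84 is what both words report once the shared suffix is reached, which matches the f: 84 visible in the node dump Christoph posts in comment 35.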
Comment 35•11 years ago (Assignee)
David, for illustration purposes I created an XML file based on your example:

<wordlist locale="en_US" description="English (US)" date="1340038693" version="16">
  <w f="167" flags="">being</w>
  <w f="2" flags="">yacking</w>
</wordlist>

Please note that I updated the frequency of yacking from 1 to 2, because we skip words with a frequency less than or equal to 1. I compiled the XML to a dict file using:

python3 xml2dict.py dictionaries/test.xml -v -o test.dict

Using the -v option allows us to inspect the generated TST (including the DAG):

[1]  { ch: b, L: 0, C: 2,  R: 3, N: 3, h: 0, f: 167}
[2]  { ch: e, L: 0, C: 4,  R: 0, N: 0, h: 0, f: 167}
[3]  { ch: y, L: 0, C: 5,  R: 0, N: 0, h: 0, f: 2}
[4]  { ch: i, L: 0, C: 6,  R: 0, N: 0, h: 0, f: 84}
[5]  { ch: a, L: 0, C: 7,  R: 0, N: 0, h: 0, f: 2}
[6]  { ch: n, L: 0, C: 8,  R: 0, N: 0, h: 0, f: 84}
[7]  { ch: c, L: 0, C: 9,  R: 0, N: 0, h: 0, f: 2}
[8]  { ch: g, L: 0, C: 10, R: 0, N: 0, h: 0, f: 84}
[9]  { ch: k, L: 0, C: 4,  R: 0, N: 0, h: 0, f: 2}
[10] { ch: $, L: 0, C: 0,  R: 0, N: 0, h: 0, f: 84}

You are right, we store the average for frequencies in common suffix nodes (e.g. node [4]).

Loading the dict file and applying the following diff shows what happens behind the scenes:

diff --git a/apps/keyboard/js/imes/latin/predictions.js b/apps/keyboard/js/imes/latin/predictions.js
index c4adf30..cc66e84 100644
--- a/apps/keyboard/js/imes/latin/predictions.js
+++ b/apps/keyboard/js/imes/latin/predictions.js
@@ -277,6 +277,7 @@ var Predictions = function() {
     }
     // Record the suggestion and move to the next best candidate
     if (!(prefix in _suggestions_index)) {
+      dump("cand: " + cand.prefix + ", sugg: " + prefix + ", cand.freq: " + cand.freq + ", mult: " + cand.multiplier + "\n");
       _suggestions.push(prefix);
       _suggestions_index[prefix] = true;
     }

The user taps 'b':

cand: b, sugg: being, cand.freq: 668, mult: 4
cand: , sugg: yacking, cand.freq: 2, mult: 1

The user taps 'y':

cand: , sugg: being, cand.freq: 167, mult: 1
cand: y, sugg: yacking, cand.freq: 8, mult: 4

We see that both 'being' for 'b' and 'yacking' for 'y' are suggested, displaying the right frequency. For longer input words, the user taps 'bei':

cand: bei, sugg: being, cand.freq: 168, mult: 2

You were again right with your assumption that we now use the average frequency in the node. In my opinion, this doesn't really matter because we have already narrowed the search space to 'bei'.

Anyway, one of the big issues we had with our previous engine was the dictionary size and lookup speed. We have optimized the file size of the dictionary as well as the lookup speed using TDAGs. In my opinion we are providing accurate predictions even with averaging the suffix frequencies in the node. I guess it's not my decision, Andreas?
Comment 36•11 years ago
Thanks for working through the example. I wish I'd proposed something more realistic! I'm not so worried about the frequency of "being" being under-estimated when the user types "bei" as I am about the frequency of "yacking" being inflated when the user types "yacki". If "yackic" were a real word with a medium frequency of 50 or something, wouldn't the low-frequency "yacking" be suggested first?

Again, I wish I had a realistic example. http://www.strchr.com/ternary_dags mentions "head" and "read" as words that share a suffix (and I'd add "bead", "dead", "lead", and "mead"). The fact that these have common suffixes so close to their stem means that their frequency will be indistinguishable after the user has typed the e. "lead" is common and "mead" is uncommon. But l and m are close on the keyboard and one could be mis-typed for the other. Once we start looking at the actual x,y coordinates of the user's touch, consider what happens if the user types right in between the l and the m and then types e and a (and hits those two keys right in the center so they are unambiguous). If we don't have frequency information to distinguish the common word from the uncommon one, we can't make a meaningful prediction here.

I'm not sure that example is one that would make a practical difference, so I'm sort of playing devil's advocate here, but I am still concerned about this.
Comment 37•11 years ago (Assignee)
David, I am totally with you. There is a realistic chance that this algorithm overestimates the actual frequency of a candidate. On the other hand, we return 3 suggestions total, which means we would still display 2 other suggestions in such a case. Having the same overestimation problem on all 3 suggestions is rather unlikely, isn't it?

As already mentioned, the dictionary size was a bottleneck in the previous implementation of the engine. Therefore I applied the following patch and compiled all dictionaries again, to see what difference it would make if we only compress suffix nodes with the same frequency (which would avoid averaging the frequency in compressed nodes).

diff --git a/apps/keyboard/js/imes/latin/dictionaries/xml2dict.py b/apps/keyboard/js/imes/latin/dictionaries/xml2dict.py
index e7ba3e1..04d0243 100644
--- a/apps/keyboard/js/imes/latin/dictionaries/xml2dict.py
+++ b/apps/keyboard/js/imes/latin/dictionaries/xml2dict.py
@@ -218,6 +218,7 @@ class TSTTree:
         # children are equal
         mapping[nodeA] = nodeB
         return mapping if nodeA.ch == nodeB.ch and \
+            nodeA.frequency == nodeB.frequency and \
             self.equal(nodeA.left, nodeB.left, mapping) and \
             self.equal(nodeA.center, nodeB.center, mapping) and \
             self.equal(nodeA.right, nodeB.right, mapping) else False

This minor fix compresses suffix nodes only if the frequency of every node matches in addition to the character, which is reflected in the bigger file size of the dictionaries:

de.dict     3.4MB vs. 11.9MB
en_us.dict  1.8MB vs.  5.5MB
es.dict     1.7MB vs.  5.7MB
fr.dict     2.8MB vs.  8.9MB
pt_br.dict  2.0MB vs.  6.4MB

As we can see, compressing suffix nodes only if both the character and the frequency match increases the file size dramatically. For en_us.dict, for example, we create 474298 - 79632 = 394666 nodes, in comparison to 474298 - 345723 = 128575 nodes when we average the frequency in compressed nodes. This means our algorithm compresses 345k nodes if only the character in the node has to match, whereas we can only compress 79k nodes if both the character and the frequency have to match.

I guess if we really want more accuracy here, it's a better idea to switch algorithms again, because such big dictionaries are not an option from what I remember.
Updated•11 years ago
Whiteboard: u=user c=keyboard s=ux-most-wanted → ux-tracking
Comment 38•11 years ago
The Predictive Text keyboard feature has been implemented and is functional on the current master build.

Tested on b2g/nightly/mozilla-central-unagi/latest.
Kernel Date: Dec 5
Gecko: http://hg.mozilla.org/mozilla-central/rev/3c6f2394995d
Gaia: d2ad0f1a5a6be8a2d6df721ba3c2b33e037c423b
Status: RESOLVED → VERIFIED
Comment 39•11 years ago
Please add a test case for this bug to moztrap for the 1.1 test suite. If one is added, mark this in-moztrap+ when completed; if not, mark this in-moztrap-.
Flags: in-moztrap?(cschmoeckel)
Comment 40•11 years ago
Added Keyboard Suite test cases:
TC #8501 - [Keyboard] Auto Correct's text prediction works correctly in the English language
TC #8502 - [Keyboard] Text prediction text entry in English keeps the capitalization entered by the user
Flags: in-moztrap?(cschmoeckel) → in-moztrap+