Closed
Bug 1180113
Opened 9 years ago
Closed 9 years ago
Allow for out of dictionary words to be recognized in Pocketsphinx
Categories
(Core :: Web Speech, defect)
Core
Web Speech
Tracking
()
Tracking | Status | |
---|---|---|
firefox42 | --- | fixed |
People
(Reporter: kdavis, Assigned: anatal)
References
Details
(Whiteboard: [webspeechapi][vaani])
Attachments
(2 files, 9 obsolete files)
(deleted),
patch
|
anatal
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
anatal
:
review+
|
Details | Diff | Splinter Review |
Currently Pocketsphinx only allows for words in its dictionary to be recognized. This is insufficient for our needs.
For example, for applications such as calling we need proper names such as "Sandip" to be recognized even if they are not in Pocketsphinx's dictionary.
The goal of this bug is to allow Pocketsphinx to recognize words not in its dictionary.
Assignee: nobody → anatal
Whiteboard: [webspeechapi][vaani][systemsfe]
Whiteboard: [webspeechapi][vaani][systemsfe] → [webspeechapi][vaani]
Blocks: Meta-Vaani
Assignee | ||
Comment 1•9 years ago
|
||
Part 1 of 2 : Adds the new Arpa model trained from CMU LM to the repo next to cmudict.
Assignee | ||
Comment 2•9 years ago
|
||
Part 1 of 2 : Adds the new Arpa model trained from CMU LM to the repo next to cmudict.
Attachment #8635099 -
Attachment is obsolete: true
Assignee | ||
Comment 3•9 years ago
|
||
Part 2 of 2: Introduces the g2p algorithm integrated into pocketsphinx codebase
Assignee | ||
Comment 4•9 years ago
|
||
Assignee | ||
Comment 6•9 years ago
|
||
(In reply to kdavis from comment #5)
> Who should review this?
I am waiting the successful end of the run on try and the CMU review to then ask it inside Mozilla, but if you want to do it before, it's up to you.
Ok. Once I'm done some work on Bug 1178326 I'll take a look too.
Attachment #8635100 -
Flags: review?(kdavis)
Attachment #8635100 -
Flags: review?(bugs)
Attachment #8635101 -
Flags: review?(kdavis)
Attachment #8635101 -
Flags: review?(bugs)
Assignee | ||
Updated•9 years ago
|
Attachment #8635100 -
Attachment is obsolete: true
Attachment #8635100 -
Flags: review?(kdavis)
Attachment #8635100 -
Flags: review?(bugs)
Assignee | ||
Updated•9 years ago
|
Attachment #8635101 -
Attachment is obsolete: true
Attachment #8635101 -
Flags: review?(kdavis)
Attachment #8635101 -
Flags: review?(bugs)
Assignee | ||
Comment 9•9 years ago
|
||
Assignee | ||
Comment 10•9 years ago
|
||
Assignee | ||
Comment 11•9 years ago
|
||
Assignee | ||
Comment 12•9 years ago
|
||
Remarks:
1 - This patch was reviewed by pocketsphinx group and will eventually land in their repo and then maybe cmudict will be no more required.
2 - This algorithm works to every language that have a proper arpa model generated from a statistical language model
3 - Try is still being executed and when finish the review flags will be requested
Assignee | ||
Updated•9 years ago
|
Attachment #8635217 -
Flags: review?(kdavis)
Attachment #8635217 -
Flags: review?(bugs)
Assignee | ||
Updated•9 years ago
|
Attachment #8635218 -
Flags: review?(kdavis)
Attachment #8635218 -
Flags: review?(bugs)
Assignee | ||
Updated•9 years ago
|
Attachment #8635217 -
Attachment is obsolete: true
Attachment #8635217 -
Attachment is patch: false
Attachment #8635217 -
Flags: review?(kdavis)
Attachment #8635217 -
Flags: review?(bugs)
Assignee | ||
Updated•9 years ago
|
Attachment #8635218 -
Attachment is obsolete: true
Attachment #8635218 -
Flags: review?(kdavis)
Attachment #8635218 -
Flags: review?(bugs)
Assignee | ||
Updated•9 years ago
|
Attachment #8635217 -
Attachment is obsolete: false
Attachment #8635217 -
Attachment is patch: true
Attachment #8635217 -
Flags: review?(kdavis)
Attachment #8635217 -
Flags: review?(bugs)
Assignee | ||
Comment 13•9 years ago
|
||
Attachment #8635608 -
Flags: review?(kdavis)
Attachment #8635608 -
Flags: review?(bugs)
Assignee | ||
Updated•9 years ago
|
Attachment #8635608 -
Attachment description: This parch introduces new g2p algorithm into pocketsphinx to allow out of dictionary words to be added to grammars → This patch introduces new g2p algorithm into pocketsphinx to allow out of dictionary words to be added to grammars
Assignee | ||
Updated•9 years ago
|
Attachment #8635217 -
Attachment description: Bug 1180113 - Part 1 : Add arpa model into gecko for g2p generation → This patch adds arpa model into gecko for g2p generation
Assignee | ||
Comment 14•9 years ago
|
||
Comment on attachment 8635608 [details] [diff] [review]
This patch introduces new g2p algorithm into pocketsphinx to allow out of dictionary words to be added to grammars
This version have improvements searching the best path.
Assignee | ||
Comment 15•9 years ago
|
||
Comment 16•9 years ago
|
||
Comment on attachment 8635217 [details] [diff] [review]
This patch adds arpa model into gecko for g2p generation
I assume the license model for en-US.dic.dmp is something we can use, and not anything special. Could you please check.
Attachment #8635217 -
Flags: review?(bugs) → review+
Comment 17•9 years ago
|
||
Comment on attachment 8635608 [details] [diff] [review]
This patch introduces new g2p algorithm into pocketsphinx to allow out of dictionary words to be added to grammars
>- char const *dictfile = NULL, *fillerfile = NULL;
>+ char const *dictfile = NULL, *fillerfile = NULL, *arpafile = NULL;
>
> if (config) {
> dictfile = cmd_ln_str_r(config, "-dict");
> fillerfile = cmd_ln_str_r(config, "-fdict");
>+ arpafile = string_join(dictfile, ".dmp", NULL);
> }
>
> /*
>@@ -303,6 +304,17 @@ dict_init(cmd_ln_t *config, bin_mdef_t * mdef)
> * Also check for type size restrictions.
> */
> d = (dict_t *) ckd_calloc(1, sizeof(dict_t)); /* freed in dict_free() */
>+
>+ if (arpafile) {
>+ ngram_model_t *ngram_g2p_model = ngram_model_read(NULL,arpafile,NGRAM_AUTO,logmath);
>+ if (!ngram_g2p_model) {
>+ E_ERROR("No arpa model found \n");
>+ return NULL;
You leak arpafile here.
And in fact you leak if if the method returns anytime before the current ckd_free(arpafile); call.
So, please do
if (config) {
arpafile = string_join(dictfile, ".dmp", NULL);
}
right before if (arpafile) { and then free the allocated string buffer after ngram_model_read(NULL,arpafile,NGRAM_AUTO,logmath); call.
I can't review the rest without having some kind of spec for the file type.
Say, I have no idea what dict_split_unigram should do. I can only read what its current implementation does.
Are you kdavis more familiar with this stuff?
Attachment #8635608 -
Flags: review?(bugs)
Assignee | ||
Comment 18•9 years ago
|
||
(In reply to Olli Pettay [:smaug] (about to have some vacation, expect slower than usual reviews) from comment #16)
> Comment on attachment 8635217 [details] [diff] [review]
> This patch adds arpa model into gecko for g2p generation
>
> I assume the license model for en-US.dic.dmp is something we can use, and
> not anything special. Could you please check.
I compiled the .dmp using sphinx_lm_convert from its text version [1] that was generated from standard pocketsphinx language model using phonetisaurus.
The license is same as everything else from pocketsphinx [2]
[1] https://www.dropbox.com/s/bj3mgli5iv5qdcx/cmudict-g2p.lm?dl=0
[2] https://github.com/cmusphinx/pocketsphinx/blob/master/LICENSE
Assignee | ||
Comment 19•9 years ago
|
||
Attachment #8635608 -
Attachment is obsolete: true
Attachment #8635608 -
Flags: review?(kdavis)
Attachment #8635877 -
Flags: review?(kdavis)
Assignee | ||
Comment 20•9 years ago
|
||
(In reply to Olli Pettay [:smaug] (about to have some vacation, expect slower than usual reviews) from comment #17)
> Comment on attachment 8635608 [details] [diff] [review]
> This patch introduces new g2p algorithm into pocketsphinx to allow out of
> dictionary words to be added to grammars
>
> >- char const *dictfile = NULL, *fillerfile = NULL;
> >+ char const *dictfile = NULL, *fillerfile = NULL, *arpafile = NULL;
> >
> > if (config) {
> > dictfile = cmd_ln_str_r(config, "-dict");
> > fillerfile = cmd_ln_str_r(config, "-fdict");
> >+ arpafile = string_join(dictfile, ".dmp", NULL);
> > }
> >
> > /*
> >@@ -303,6 +304,17 @@ dict_init(cmd_ln_t *config, bin_mdef_t * mdef)
> > * Also check for type size restrictions.
> > */
> > d = (dict_t *) ckd_calloc(1, sizeof(dict_t)); /* freed in dict_free() */
> >+
> >+ if (arpafile) {
> >+ ngram_model_t *ngram_g2p_model = ngram_model_read(NULL,arpafile,NGRAM_AUTO,logmath);
> >+ if (!ngram_g2p_model) {
> >+ E_ERROR("No arpa model found \n");
> >+ return NULL;
> You leak arpafile here.
> And in fact you leak if if the method returns anytime before the current
> ckd_free(arpafile); call.
> So, please do
> if (config) {
> arpafile = string_join(dictfile, ".dmp", NULL);
> }
> right before if (arpafile) { and then free the allocated string buffer after
> ngram_model_read(NULL,arpafile,NGRAM_AUTO,logmath); call.
>
Ok, fixed on new patch.
> I can't review the rest without having some kind of spec for the file type.
> Say, I have no idea what dict_split_unigram should do. I can only read what
> its current implementation does.
I sent the text arpa lm file on comment #18
Reporter | ||
Comment 21•9 years ago
|
||
(In reply to Andre Natal from comment #20)
> (In reply to Olli Pettay [:smaug] (about to have some vacation, expect
> slower than usual reviews) from comment #17)
> > Comment on attachment 8635608 [details] [diff] [review]
> > This patch introduces new g2p algorithm into pocketsphinx to allow out of
> > dictionary words to be added to grammars
> >
> > >- char const *dictfile = NULL, *fillerfile = NULL;
> > >+ char const *dictfile = NULL, *fillerfile = NULL, *arpafile = NULL;
> > >
> > > if (config) {
> > > dictfile = cmd_ln_str_r(config, "-dict");
> > > fillerfile = cmd_ln_str_r(config, "-fdict");
> > >+ arpafile = string_join(dictfile, ".dmp", NULL);
> > > }
> > >
> > > /*
> > >@@ -303,6 +304,17 @@ dict_init(cmd_ln_t *config, bin_mdef_t * mdef)
> > > * Also check for type size restrictions.
> > > */
> > > d = (dict_t *) ckd_calloc(1, sizeof(dict_t)); /* freed in dict_free() */
> > >+
> > >+ if (arpafile) {
> > >+ ngram_model_t *ngram_g2p_model = ngram_model_read(NULL,arpafile,NGRAM_AUTO,logmath);
> > >+ if (!ngram_g2p_model) {
> > >+ E_ERROR("No arpa model found \n");
> > >+ return NULL;
> > You leak arpafile here.
> > And in fact you leak if if the method returns anytime before the current
> > ckd_free(arpafile); call.
> > So, please do
> > if (config) {
> > arpafile = string_join(dictfile, ".dmp", NULL);
> > }
> > right before if (arpafile) { and then free the allocated string buffer after
> > ngram_model_read(NULL,arpafile,NGRAM_AUTO,logmath); call.
> >
>
> Ok, fixed on new patch.
>
> > I can't review the rest without having some kind of spec for the file type.
> > Say, I have no idea what dict_split_unigram should do. I can only read what
> > its current implementation does.
>
Andre, I think you forgot to add Olli as a reviewer on the new patch?
Comment 22•9 years ago
|
||
(In reply to Andre Natal from comment #18)
> I compiled the .dmp using sphinx_lm_convert from its text version [1] that
> was generated from standard pocketsphinx language model using phonetisaurus.
Would it be possible to have also the text version in our repository?
Or perhaps only that and run sphinx_lm_convert when compiling Gecko?
Reporter | ||
Comment 23•9 years ago
|
||
Comment on attachment 8635217 [details] [diff] [review]
This patch adds arpa model into gecko for g2p generation
Review of attachment 8635217 [details] [diff] [review]:
-----------------------------------------------------------------
I am wondering about new languages.
It seems to me that there are several steps that are hidden
here, and it’d be great if we had the tool chain used to do
these hidden steps so we can use it for new languages.
As far as I can surmise this process starts with two things:
* Phonetic dictionary
* Text corpus
The process then is:
1. One converts the text corpus to phonemes using the phonetic dictionary to obtain a phonetic text corpus
2. One coverts this phonetic text corpus to a "phonetic language model" in lm format
3. One converts the lm format "phonetic language model" to a dmp file using sphinx_lm_convert
It’d be great if you could create a github repo that contains the scripts
that go from a phonetic dictionary and text corpus to the final dmp so
we can apply these scripts to new languages. Can you open a new bug for
this task and put me on CC?
For FxOS 2.5 we are committed to the languages English, Spanish,
and French. For each of these languages CMU provides a phonetic
dictionary. Also, I’ve created text corpora for each of these
languages on S3 from Wikipedia dumps. So, all we need now is your
github repo with the associated scripts!
From a legal standpoint I guess my question is:
"What is the licensing status of cmudict-g2p.lm?”
You mentioned its the same as Pocketsphinx, but I don’t see where
one can download cmudict-g2p.lm from CMU with an associated
license. Can you send out the link?
Attachment #8635217 -
Flags: review?(kdavis) → review+
Attachment #8635877 -
Flags: review?(bugs)
Reporter | ||
Comment 24•9 years ago
|
||
(In reply to Olli Pettay [:smaug] (away-ish until July 26, expect slower than usual reviews) from comment #17)
> Comment on attachment 8635608 [details] [diff] [review]
>
> Say, I have no idea what dict_split_unigram should do. I can only read what
> its current implementation does.
As Andre hasn't chimed in to explain his code, I'll give my take on what the
function dict_split_unigram does.
The function dict_split_unigram is tasked with parsing a "word", not really
a word but more on that later, into a unigram_t, a mapping of one, or more,
graphemes (the smallest unit used in the writing system of a language) to
one or more phonemes (basic unit of a language's phonology).
More specifically the dmp file, which is only a reformatted lm file, contains
n-grams, not for words, but for phonemes and their associated graphemes.
So, for example the lm version of the dmp file would have lines of the form...
...
-5.2688 a|c}AA -0.5826
-5.0135 a|c}AE -1.7016
-5.3657 a|g}AE -0.6989
-3.4066 a|i}EH -1.6086
...
These lines are for 1-grams, and, for example, the line
-5.2688 a|c}AA -0.5826
indicates that the graphemes 'a' or 'c' are pronounced as 'AA' (This sounds
like the beginning of the word "odd") with probability P where the log base
10 of P is -5.2688 The final number -0.5826 is a backoff weight which I will
purposely ignore for now.
The "word" I talked about earlier is not really a word, but is one of the
string entries, including the pipe '|' and '}', from the lm file. So from
our example all the "words" are as follows
...
a|c}AA
a|c}AE
a|g}AE
a|i}EH
...
what the function dict_split_unigram does it to, given one of these words,
"a|c}AA" for example, split it into 'letter' part, in this case "ac", and
a 'phone' part, in this case "AA" then creates a unigram_t from the two
parts as follows
unigram_t unigram = { letter , phone};
and returns this to the caller.
Reporter | ||
Comment 25•9 years ago
|
||
(In reply to Olli Pettay [:smaug] (away-ish until July 26, expect slower than usual reviews) from comment #17)
> Comment on attachment 8635608 [details] [diff] [review]
>
> I can't review the rest without having some kind of spec for the file type.
As Andre hasn't chimed in to explain the file types, I'll try to do so.
The lm file format, that Andre linked to on dropbox, is used to describe n-grams.
There is no "official" description of the file format. But, for example, a good
to adequate description of the lm file format is given here:
http://www.speech.sri.com/projects/srilm/manpages/ngram-format.5.html
The dmp file format serves the same purpose as the lm file format, describing of
n-grams. However, the dmp file format is binary. The tool Andre mentions above,
sphinx_lm_convert, is used to convert from an lm file to a dmp file.
As to the format of the dmp file, the only "documentation" I know of is the source
of the file that does the conversion from a ngram_model_t to a dmp file:
https://github.com/skerit/cmusphinx/blob/01f79d8bf76f94318b4b9f37ab28fbbe9e78d4bf/multisphinx/sphinxbase/ngram_model_dmp.c#L872
Unfortunately, there doesn't seem to be any real human readable documentation of
the file format.
Reporter | ||
Comment 26•9 years ago
|
||
(In reply to kdavis from comment #23)
> Comment on attachment 8635217 [details] [diff] [review]
> This patch adds arpa model into gecko for g2p generation
>
> From a legal standpoint I guess my question is:
>
> "What is the licensing status of cmudict-g2p.lm?”
>
> You mentioned its the same as Pocketsphinx, but I don’t see where
> one can download cmudict-g2p.lm from CMU with an associated
> license. Can you send out the link?
Flags: needinfo?(anatal)
Assignee | ||
Comment 27•9 years ago
|
||
(In reply to kdavis from comment #24)
> (In reply to Olli Pettay [:smaug] (away-ish until July 26, expect slower
> than usual reviews) from comment #17)
> > Comment on attachment 8635608 [details] [diff] [review]
> >
> > Say, I have no idea what dict_split_unigram should do. I can only read what
> > its current implementation does.
>
> As Andre hasn't chimed in to explain his code, I'll give my take on what the
> function dict_split_unigram does.
I was about to do it, but I talked with Olli by email and he said was in vacations and only next week would do it. But congratulations for your explanation. It is not completely correct but give an idea.
Assignee | ||
Comment 28•9 years ago
|
||
Hi Andrew,
can you please help with this?
Flags: needinfo?(overholt)
Assignee | ||
Updated•9 years ago
|
Flags: needinfo?(anatal)
Reporter | ||
Comment 29•9 years ago
|
||
(In reply to Andre Natal from comment #27)
> (In reply to kdavis from comment #24)
> > (In reply to Olli Pettay [:smaug] (away-ish until July 26, expect slower
> > than usual reviews) from comment #17)
> > > Comment on attachment 8635608 [details] [diff] [review]
> > >
> > > Say, I have no idea what dict_split_unigram should do. I can only read what
> > > its current implementation does.
> >
> > As Andre hasn't chimed in to explain his code, I'll give my take on what the
> > function dict_split_unigram does.
>
> I was about to do it, but I talked with Olli by email and he said was in
> vacations and only next week would do it. But congratulations for your
> explanation. It is not completely correct but give an idea.
Could you explain what's wrong with my explanation? I think knowing will
help us now and in future with the code.
Reporter | ||
Comment 30•9 years ago
|
||
Comment on attachment 8635877 [details] [diff] [review]
This patch introduces new g2p algorithm into pocketsphinx to allow out of dictionary words to be added to grammars
Review of attachment 8635877 [details] [diff] [review]:
-----------------------------------------------------------------
Generally this seems fine
There are a few little changes that should be made
* Removal of some trailing white space
* Light documentation of the new methods (See Comment 17)
* Change signature of dict_get_winner_wid to not take total_unigrams
* Fix the leaks mentioned in my comments
In addition
* Olli should review this code before it is committed
* The toolchain for dmp files for other languages should be made available
::: media/pocketsphinx/src/dict.c
@@ +503,5 @@
> E_INFO_NOFN("\n");
> }
> +
> +int
> +dict_starts_with(const char *pre, const char *str)
Please add a comment describing this function.
Remove the trailing white space.
@@ +510,5 @@
> + return lenstr < lenpre ? 0 : strncmp(pre, str, lenpre) == 0;
> +}
> +
> +unigram_t
> +dict_split_unigram (const char * word)
Please add a comment describing this function.
@@ +530,5 @@
> + token_pos = w;
> + continue;
> + }
> + if (!token_pos)
> + total_letters++;
In a multi-byte encoding this is the total number of bytes, not letters.
@@ +532,5 @@
> + }
> + if (!token_pos)
> + total_letters++;
> + else
> + total_phone++;
In a multi-byte encoding this is the total number of bytes, not letters.
@@ +562,5 @@
> + return unigram;
> +};
> +
> +struct winner_t
> +dict_get_winner_wid(ngram_model_t *model, const char * word_grapheme, glist_t history_list, const int32 total_unigrams,
Please add a comment describing this function
Why does this function take the argument 'total_unigrams'? This value can
be derived from 'model' with a single call to ngram_model_get_counts().
@@ +586,5 @@
> + }
> +
> + for (i = 0; i < total_unigrams; i++) {
> + vocab = ngram_word(model, i);
> + unigram = dict_split_unigram(vocab);
What is the logic of allocating and deallocating unigram.word and unigram.phone
in this loop again and again?
As these are simply the unigrams from the dmp file, wouldn't it make more sense
to allocate then once and be done with it?
@@ +611,5 @@
> + return winner;
> +}
> +
> +char *
> +dict_g2p(char const *word_grapheme, ngram_model_t *ngram_g2p_model)
Please add a comment describing this function
Remove the trailing white space.
@@ +636,5 @@
> + winner = dict_get_winner_wid(ngram_g2p_model, word_grapheme, history_list, *total_unigrams, word_offset);
> + increment = winner.length_match;
> + if (increment == 0) {
> + E_ERROR("Error trying to find matching phoneme (%s) Exiting.. \n" , word_grapheme);
> + return NULL;
Leaking a history_list here
@@ +653,5 @@
> + continue;
> + }
> + word = ngram_word(ngram_g2p_model, gnode_int32(gn));
> +
> + if (!word)
Why would 'word' be null here? Didn't we construct history list only
from elements in ngram_gp2_model?
@@ +681,5 @@
> + return final_phone;
> +}
> +
> +int
> +dict_add_g2p_word(dict_t * dict, char const *word)
Please add a comment describing this function
::: media/pocketsphinx/src/dict.h
@@ +115,5 @@
> * Return ptr to dict_t if successful, NULL otherwise.
> */
> dict_t *dict_init(cmd_ln_t *config, /**< Configuration (-dict, -fdict, -dictcase) or NULL */
> + bin_mdef_t *mdef, /**< For looking up CI phone IDs (or NULL) */
> + logmath_t *logmath // To load ngram_model for g2p load
Can you document that logmath must be retained with logmath_retain()
if it is to be used elsewhere.
@@ +218,5 @@
> void dict_report(dict_t *d /**< A dictionary structure */
> );
>
> +// g2p functions
> +int dict_add_g2p_word(dict_t * dict, char const *word);
For consistency use
int dict_add_g2p_word(dict_t *dict, char const *word);
and not
int dict_add_g2p_word(dict_t * dict, char const *word);
Attachment #8635877 -
Flags: review?(kdavis) → review+
Assignee | ||
Updated•9 years ago
|
Flags: needinfo?(overholt)
Reporter | ||
Comment 31•9 years ago
|
||
(In reply to kdavis from comment #26)
> (In reply to kdavis from comment #23)
> > Comment on attachment 8635217 [details] [diff] [review]
> > This patch adds arpa model into gecko for g2p generation
> >
> > From a legal standpoint I guess my question is:
> >
> > "What is the licensing status of cmudict-g2p.lm?”
> >
> > You mentioned its the same as Pocketsphinx, but I don’t see where
> > one can download cmudict-g2p.lm from CMU with an associated
> > license. Can you send out the link?
Just to make sure there are no legal problems, I think we should double
check the licensing status of cmudict-g2p.lm and the associated dmp.
So, I'm going to needinfo Andrew on this again, because, as far as I can
tell, there was no concrete resolution.
Flags: needinfo?(overholt)
Comment 32•9 years ago
|
||
(In reply to kdavis from comment #31)
> (In reply to kdavis from comment #26)
> > (In reply to kdavis from comment #23)
> > > Comment on attachment 8635217 [details] [diff] [review]
> > > This patch adds arpa model into gecko for g2p generation
> > >
> > > From a legal standpoint I guess my question is:
> > >
> > > "What is the licensing status of cmudict-g2p.lm?”
> > >
> > > You mentioned its the same as Pocketsphinx, but I don’t see where
> > > one can download cmudict-g2p.lm from CMU with an associated
> > > license. Can you send out the link?
>
> Just to make sure there are no legal problems, I think we should double
> check the licensing status of cmudict-g2p.lm and the associated dmp.
>
> So, I'm going to needinfo Andrew on this again, because, as far as I can
> tell, there was no concrete resolution.
If you are thinking about the legal situation, I suggest we ask Gerv.
Flags: needinfo?(overholt) → needinfo?(gerv)
Assignee | ||
Comment 33•9 years ago
|
||
(In reply to kdavis from comment #30)
> Comment on attachment 8635877 [details] [diff] [review]
> This patch introduces new g2p algorithm into pocketsphinx to allow out of
> dictionary words to be added to grammars
>
> Review of attachment 8635877 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> Generally this seems fine
>
> There are a few little changes that should be made
>
> * Removal of some trailing white space
Fixed
> * Light documentation of the new methods (See Comment 17)
Done
> * Change signature of dict_get_winner_wid to not take total_unigrams
Fixed
> * Fix the leaks mentioned in my comments
Fixed
> In addition Olli should review this code before it is committed
Ok, sure.
> * The toolchain for dmp files for other languages should be made available
>
Ok, created. Please refer bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1184849 and repo: https://github.com/mozilla/g2p/
> ::: media/pocketsphinx/src/dict.c
> @@ +503,5 @@
> > E_INFO_NOFN("\n");
> > }
> > +
> > +int
> > +dict_starts_with(const char *pre, const char *str)
>
> Please add a comment describing this function.
>
> Remove the trailing white space.
Done
> @@ +510,5 @@
> > + return lenstr < lenpre ? 0 : strncmp(pre, str, lenpre) == 0;
> > +}
> > +
> > +unigram_t
> > +dict_split_unigram (const char * word)
>
> Please add a comment describing this function.
>
Done
> @@ +530,5 @@
> > + token_pos = w;
> > + continue;
> > + }
> > + if (!token_pos)
> > + total_letters++;
>
> In a multi-byte encoding this is the total number of bytes, not letters.
>
> @@ +532,5 @@
> > + }
> > + if (!token_pos)
> > + total_letters++;
> > + else
> > + total_phone++;
>
> In a multi-byte encoding this is the total number of bytes, not letters.
>
Correct, Thanks. I changed it to total_graphemes, that makes more sense than bytes and letters
> @@ +562,5 @@
> > + return unigram;
> > +};
> > +
> > +struct winner_t
> > +dict_get_winner_wid(ngram_model_t *model, const char * word_grapheme, glist_t history_list, const int32 total_unigrams,
>
> Please add a comment describing this function
>
Done.
> Why does this function take the argument 'total_unigrams'? This value can
> be derived from 'model' with a single call to ngram_model_get_counts().
>
> @@ +586,5 @@
> > + }
> > +
> > + for (i = 0; i < total_unigrams; i++) {
Correct, thanks, just fixed.
> > + vocab = ngram_word(model, i);
> > + unigram = dict_split_unigram(vocab);
>
> What is the logic of allocating and deallocating unigram.word and
> unigram.phone
> in this loop again and again?
>
> As these are simply the unigrams from the dmp file, wouldn't it make more
> sense
> to allocate then once and be done with it?
>
The unigrams are already allocated inside ngram_model_t *model object.
The reason of the loop is to search for the next winner unigram at that position in the word. This is the reason the strucuture unigram is deallocated on the end of the loop and allocated again when next unigram from the model ( vocab = ngram_word(model, i);) comes to the challenge with others.. and this continues until the last unigram from the model compete.
> @@ +611,5 @@
> > + return winner;
> > +}
> > +
> > +char *
> > +dict_g2p(char const *word_grapheme, ngram_model_t *ngram_g2p_model)
>
> Please add a comment describing this function
>
> Remove the trailing white space.
Ok, done.
>
> @@ +636,5 @@
> > + winner = dict_get_winner_wid(ngram_g2p_model, word_grapheme, history_list, *total_unigrams, word_offset);
> > + increment = winner.length_match;
> > + if (increment == 0) {
> > + E_ERROR("Error trying to find matching phoneme (%s) Exiting.. \n" , word_grapheme);
> > + return NULL;
>
> Leaking a history_list here
Fixed, thank you.
>
> @@ +653,5 @@
> > + continue;
> > + }
> > + word = ngram_word(ngram_g2p_model, gnode_int32(gn));
> > +
> > + if (!word)
>
> Why would 'word' be null here? Didn't we construct history list only
> from elements in ngram_gp2_model?
>
We can have winners with no phoneme in the case of models generated with errors, for example. In this particular case, if I have a wid with problem that does not match a unigram from ngram_word , I remove it from the composition of the final phoneme word.
> @@ +681,5 @@
> > + return final_phone;
> > +}
> > +
> > +int
> > +dict_add_g2p_word(dict_t * dict, char const *word)
>
> Please add a comment describing this function
>
Ok, done.
> ::: media/pocketsphinx/src/dict.h
> @@ +115,5 @@
> > * Return ptr to dict_t if successful, NULL otherwise.
> > */
> > dict_t *dict_init(cmd_ln_t *config, /**< Configuration (-dict, -fdict, -dictcase) or NULL */
> > + bin_mdef_t *mdef, /**< For looking up CI phone IDs (or NULL) */
> > + logmath_t *logmath // To load ngram_model for g2p load
>
> Can you document that logmath must be retained with logmath_retain()
> if it is to be used elsewhere.
>
Done.
> @@ +218,5 @@
> > void dict_report(dict_t *d /**< A dictionary structure */
> > );
> >
> > +// g2p functions
> > +int dict_add_g2p_word(dict_t * dict, char const *word);
>
> For consistency use
>
> int dict_add_g2p_word(dict_t *dict, char const *word);
>
> and not
>
> int dict_add_g2p_word(dict_t * dict, char const *word);
Thanks for noticing this, fixed.
Assignee | ||
Comment 34•9 years ago
|
||
Attachment #8635877 -
Attachment is obsolete: true
Attachment #8635877 -
Flags: review?(bugs)
Attachment #8639696 -
Flags: review?(bugs)
Assignee | ||
Comment 35•9 years ago
|
||
(In reply to Olli Pettay [:smaug] from comment #22)
> (In reply to Andre Natal from comment #18)
> > I compiled the .dmp using sphinx_lm_convert from its text version [1] that
> > was generated from standard pocketsphinx language model using phonetisaurus.
> Would it be possible to have also the text version in our repository?
> Or perhaps only that and run sphinx_lm_convert when compiling Gecko?
Olli, we have a repo with the models and the toolchain to generate them: https://github.com/mozilla/g2p/
Ships the binaries is better. Generate them gecko's compilation is impracticable, since depends from a lot of other stuff.
Assignee | ||
Comment 36•9 years ago
|
||
Attachment #8639696 -
Attachment is obsolete: true
Attachment #8639696 -
Flags: review?(bugs)
Attachment #8639704 -
Flags: review?(bugs)
Assignee | ||
Updated•9 years ago
|
Attachment #8639704 -
Attachment description: 0001-Bug-1180113-Part-2-Introducing-g2p-algorithm-inside-.patch → This patch introduces new g2p algorithm into pocketsphinx to allow out of dictionary words to be added to grammars
Comment 37•9 years ago
|
||
(In reply to Andrew Overholt [:overholt] from comment #32)
> If you are thinking about the legal situation, I suggest we ask Gerv.
Would someone be kind enough to clearly define the question, along with links to relevant source code / license text? Thanks :-)
Gerv
Flags: needinfo?(gerv)
Reporter | ||
Comment 38•9 years ago
|
||
(In reply to Gervase Markham [:gerv] from comment #37)
> (In reply to Andrew Overholt [:overholt] from comment #32)
> > If you are thinking about the legal situation, I suggest we ask Gerv.
>
> Would someone be kind enough to clearly define the question, along with
> links to relevant source code / license text? Thanks :-)
>
> Gerv
This question is really the same question you just answered here[1].
So, you answered the question here without even knowing it.
[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1184849#c7
Reporter | ||
Comment 39•9 years ago
|
||
I guess in light of Bug 1184849 Comment7 from Gerv we should
regenerate the dmp file before this is committed.
Comment 40•9 years ago
|
||
Comment on attachment 8639704 [details] [diff] [review]
This patch introduces new g2p algorithm into pocketsphinx to allow out of dictionary words to be added to grammars
Sorry about delay with this stuff. vacation and tons of other stuff to review.
>+++ b/media/pocketsphinx/src/dict.c
>@@ -8,27 +8,27 @@
> * are met:
> *
> * 1. Redistributions of source code must retain the above copyright
>- * notice, this list of conditions and the following disclaimer.
>+ * notice, this list of conditions and the following disclaimer.
> *
> * 2. Redistributions in binary form must reproduce the above copyright
> * notice, this list of conditions and the following disclaimer in
> * the documentation and/or other materials provided with the
> * distribution.
> *
>- * This work was supported in part by funding from the Defense Advanced
>- * Research Projects Agency and the National Science Foundation of the
>+ * This work was supported in part by funding from the Defense Advanced
>+ * Research Projects Agency and the National Science Foundation of the
> * United States of America, and the CMU Sphinx Speech Consortium.
> *
>- * THIS SOFTWARE IS PROVIDED BY CARNEGIE MELLON UNIVERSITY ``AS IS'' AND
>- * ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
>+ * THIS SOFTWARE IS PROVIDED BY CARNEGIE MELLON UNIVERSITY ``AS IS'' AND
>+ * ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
> * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
> * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL CARNEGIE MELLON UNIVERSITY
> * NOR ITS EMPLOYEES BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>- * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>- * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>- * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>- * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
>+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> *
Random whitespace change here. Please drop them. pocketsphinx is external code, so
better to try to keep our modifications to it minimal.
> * ====================================================================
>@@ -51,7 +51,7 @@
>
> #if WIN32
> #define snprintf sprintf_s
>-#endif
>+#endif
same also here.
>+// This function splits an unigram received (in format e|w}UW) and return a structure
>+// containing two fields: the grapheme (before }) in unigram.word and the phoneme (after }) unigram.phone
>+unigram_t
>+dict_split_unigram (const char * word)
nit, for consistency with the existing code
dict_split_unigram(const char *word)
>+ letter = ckd_calloc(1,total_graphemes+1);
based on the existing code, the coding style should be
letter = ckd_calloc(1, total_graphemes + 1);
>+ letter[w-add] = word[w];
and here
letter[w - add] = word[w];
consistency in coding style greatly helps with code readability.
>+// This function calculates the most likely unigram to appear in the current position at the word
>+// based on the three latest chosen/winners unigrams (history) and return a structure containing
>+// the word id (wid), and lengths of the phoneme and the word
>+struct winner_t
>+dict_get_winner_wid(ngram_model_t *model, const char * word_grapheme, glist_t history_list, int word_offset)
const char *word_grapheme
>+{
>+ int32 current_prob = -2147483647;
Why not initialize to LONG_MIN
>+ int32 ngram_order = ngram_model_get_size(model);
>+ int32 *history = ckd_calloc((size_t)ngram_order+1, sizeof(int32));
space before and after +
but it isn't clear to me why + 1 (and why we start assigning value in the for-loop below from index ngram_order, and not
ngram_order - 1 which would be the normal last element of an array without this + 1 here.)
Please explain why this way, or remove + 1 and start indexing from ngram_order - j - 1
(and hopefully ngram_order isn't anything too huge)
>+ for (gn = history_list; gn; gn = gnode_next(gn)) {
>+ history[ngram_order-j] = gnode_int32(gn);
Please add a comment why we build history from last to first.
(and if gn is null before history is full, history starts with some 0 entries. I assume ngram_ng_prob can deal with that.)
>+ if (unigram.word)
>+ ckd_free(unigram.word);
>+ if (unigram.phone)
>+ ckd_free(unigram.phone);
I would have added a helper function to free unigram_t members and the use the method in
dict_get_winner_wid and dict_g2p. That would simplify the code a bit.
Something like
void
free_unigram_t(unigram_t *unigram)
{
ckd_free(unigram->word);
ckd_free(unigram->phone);
}
(free is null safe)
>+ if (dict_starts_with(unigram.word, sub)){
existing coding style seems to hint there should be space between ) and {
>+ struct winner_t winner;
>+ int32 i = 0, j = 0;
>+ int nused;
>+ int32 ngram_order = ngram_model_get_size(model);
>+ int32 *history = ckd_calloc((size_t)ngram_order+1, sizeof(int32));
again some missing spaces before and after +
>+// This functions manages the winner unigrams and build the history of winners to properly generate the final phoneme.
s/build/builds/
>+ In the first part,
>+// I gets the most likely unigrams which graphemes compose the word and build a history of wids that is used in this search. In second part, the we
s/I/It/
>+// This functions just receives the dict lacking word from fsg_search, call the main function dict_g2p, and then add the word to the memory dict.
s/functions/function/
s/add/adds/
>+// The second part of this function is the same as pocketsphinx.c: https://github.com/cmusphinx/pocketsphinx/blob/ba6bd21b3601339646d2db6d2297d02a8a6b7029/src/libpocketsphinx/pocketsphinx.c#L816
Could we not share the implementation then?
But I guess this is fine too.
> * are met:
> *
> * 1. Redistributions of source code must retain the above copyright
>- * notice, this list of conditions and the following disclaimer.
>+ * notice, this list of conditions and the following disclaimer.
> *
> * 2. Redistributions in binary form must reproduce the above copyright
> * notice, this list of conditions and the following disclaimer in
> * the documentation and/or other materials provided with the
> * distribution.
> *
>- * This work was supported in part by funding from the Defense Advanced
>- * Research Projects Agency and the National Science Foundation of the
>+ * This work was supported in part by funding from the Defense Advanced
>+ * Research Projects Agency and the National Science Foundation of the
> * United States of America, and the CMU Sphinx Speech Consortium.
> *
>- * THIS SOFTWARE IS PROVIDED BY CARNEGIE MELLON UNIVERSITY ``AS IS'' AND
>- * ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
>+ * THIS SOFTWARE IS PROVIDED BY CARNEGIE MELLON UNIVERSITY ``AS IS'' AND
>+ * ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
> * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
> * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL CARNEGIE MELLON UNIVERSITY
> * NOR ITS EMPLOYEES BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>- * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>- * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>- * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>- * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
>+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> *
> * ====================================================================
>@@ -39,11 +39,12 @@
> #define _S3_DICT_H_
>
> /** \file dict.h
>- * \brief Operations on dictionary.
>+ * \brief Operations on dictionary.
No need for all these whitespace changes
>
>-/**
>+/**
> \struct dictword_t
>- \brief a structure for one dictionary word.
>+ \brief a structure for one dictionary word.
> */
> typedef struct {
> char *word; /**< Ascii word string */
>@@ -68,9 +69,9 @@ typedef struct {
> s3wid_t basewid; /**< Base pronunciation id */
> } dictword_t;
>
>-/**
>+/**
> \struct dict_t
>- \brief a structure for a dictionary.
>+ \brief a structure for a dictionary.
> */
>
not all these whitespace changes
r+ with all those fixed/explained
Attachment #8639704 -
Flags: review?(bugs) → review+
Assignee | ||
Comment 41•9 years ago
|
||
(In reply to Olli Pettay [:smaug] from comment #40)
> Comment on attachment 8639704 [details] [diff] [review]
> This patch introduces new g2p algorithm into pocketsphinx to allow out of
> dictionary words to be added to grammars
>
> Sorry about delay with this stuff. vacation and tons of other stuff to
> review.
>
> >+++ b/media/pocketsphinx/src/dict.c
> >@@ -8,27 +8,27 @@
> > * are met:
> > *
> > * 1. Redistributions of source code must retain the above copyright
> >- * notice, this list of conditions and the following disclaimer.
> >+ * notice, this list of conditions and the following disclaimer.
> > *
> > * 2. Redistributions in binary form must reproduce the above copyright
> > * notice, this list of conditions and the following disclaimer in
> > * the documentation and/or other materials provided with the
> > * distribution.
> > *
> >- * This work was supported in part by funding from the Defense Advanced
> >- * Research Projects Agency and the National Science Foundation of the
> >+ * This work was supported in part by funding from the Defense Advanced
> >+ * Research Projects Agency and the National Science Foundation of the
> > * United States of America, and the CMU Sphinx Speech Consortium.
> > *
> >- * THIS SOFTWARE IS PROVIDED BY CARNEGIE MELLON UNIVERSITY ``AS IS'' AND
> >- * ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
> >+ * THIS SOFTWARE IS PROVIDED BY CARNEGIE MELLON UNIVERSITY ``AS IS'' AND
> >+ * ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
> > * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
> > * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL CARNEGIE MELLON UNIVERSITY
> > * NOR ITS EMPLOYEES BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> >- * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> >- * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> >- * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> >- * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> >- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> >+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> >+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> >+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> >+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> >+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > *
> Random whitespace change here. Please drop them. pocketsphinx is external
> code, so
> better to try to keep our modifications to it minimal.
>
>
Ok. I changed my editor to remove every trailing space after the requests on this comments (https://bugzilla.mozilla.org/show_bug.cgi?id=1180113#c30) and did mess with everything. I wasn't expecting their code to have trailing spaces.
>
> > * ====================================================================
> >@@ -51,7 +51,7 @@
> >
> > #if WIN32
> > #define snprintf sprintf_s
> >-#endif
> >+#endif
> same also here.
>
Fixed.
> >+// This function splits an unigram received (in format e|w}UW) and return a structure
> >+// containing two fields: the grapheme (before }) in unigram.word and the phoneme (after }) unigram.phone
> >+unigram_t
> >+dict_split_unigram (const char * word)
> nit, for consistency with the existing code
> dict_split_unigram(const char *word)
>
fixed
> >+ letter = ckd_calloc(1,total_graphemes+1);
> based on the existing code, the coding style should be
> letter = ckd_calloc(1, total_graphemes + 1);
>
fixed, thanks.
> >+ letter[w-add] = word[w];
> and here
> letter[w - add] = word[w];
>
> consistency in coding style greatly helps with code readability.
>
>
> >+// This function calculates the most likely unigram to appear in the current position at the word
> >+// based on the three latest chosen/winners unigrams (history) and return a structure containing
> >+// the word id (wid), and lengths of the phoneme and the word
> >+struct winner_t
> >+dict_get_winner_wid(ngram_model_t *model, const char * word_grapheme, glist_t history_list, int word_offset)
> const char *word_grapheme
>
> >+{
> >+ int32 current_prob = -2147483647;
> Why not initialize to LONG_MIN
>
Ok, but I needed to add #include <limits.h>.
> >+ int32 ngram_order = ngram_model_get_size(model);
> >+ int32 *history = ckd_calloc((size_t)ngram_order+1, sizeof(int32));
> space before and after +
> but it isn't clear to me why + 1 (and why we start assigning value in the
> for-loop below from index ngram_order, and not
> ngram_order - 1 which would be the normal last element of an array without
> this + 1 here.)
> Please explain why this way, or remove + 1 and start indexing from
> ngram_order - j - 1
>
> (and hopefully ngram_order isn't anything too huge)
>
ok, I changed to this way. ngram_order in this case is 3, but in the future can go expand but not too much.
> >+ for (gn = history_list; gn; gn = gnode_next(gn)) {
> >+ history[ngram_order-j] = gnode_int32(gn);
> Please add a comment why we build history from last to first.
> (and if gn is null before history is full, history starts with some 0
> entries. I assume ngram_ng_prob can deal with that.)
>
>
Yes, ngram_ng_prob can deal with that. I added the comment (we need to build history from last to first because glist returns itens from last to first)
> >+ if (unigram.word)
> >+ ckd_free(unigram.word);
> >+ if (unigram.phone)
> >+ ckd_free(unigram.phone);
> I would have added a helper function to free unigram_t members and the use
> the method in
> dict_get_winner_wid and dict_g2p. That would simplify the code a bit.
>
> Something like
> void
> free_unigram_t(unigram_t *unigram)
> {
> ckd_free(unigram->word);
> ckd_free(unigram->phone);
> }
>
> (free is null safe)
>
Good idea, added it. Thanks
>
> >+ if (dict_starts_with(unigram.word, sub)){
> existing coding style seems to hint there should be space between ) and {
>
thanks, fixed.
> >+ struct winner_t winner;
> >+ int32 i = 0, j = 0;
> >+ int nused;
> >+ int32 ngram_order = ngram_model_get_size(model);
> >+ int32 *history = ckd_calloc((size_t)ngram_order+1, sizeof(int32));
> again some missing spaces before and after +
>
Ok, but I removed the +1 following your suggestion above.
> >+// This functions manages the winner unigrams and build the history of winners to properly generate the final phoneme.
> s/build/builds/
>
fixed
> >+ In the first part,
> >+// I gets the most likely unigrams which graphemes compose the word and build a history of wids that is used in this search. In second part, the we
> s/I/It/
fixed
> >+// This functions just receives the dict lacking word from fsg_search, call the main function dict_g2p, and then add the word to the memory dict.
> s/functions/function/
> s/add/adds/
>
fixed
> >+// The second part of this function is the same as pocketsphinx.c: https://github.com/cmusphinx/pocketsphinx/blob/ba6bd21b3601339646d2db6d2297d02a8a6b7029/src/libpocketsphinx/pocketsphinx.c#L816
> Could we not share the implementation then?
> But I guess this is fine too.
We can't since only some part is shared. The main function have more code.
> > * are met:
> > *
> > * 1. Redistributions of source code must retain the above copyright
> >- * notice, this list of conditions and the following disclaimer.
> >+ * notice, this list of conditions and the following disclaimer.
> > *
> > * 2. Redistributions in binary form must reproduce the above copyright
> > * notice, this list of conditions and the following disclaimer in
> > * the documentation and/or other materials provided with the
> > * distribution.
> > *
> >- * This work was supported in part by funding from the Defense Advanced
> >- * Research Projects Agency and the National Science Foundation of the
> >+ * This work was supported in part by funding from the Defense Advanced
> >+ * Research Projects Agency and the National Science Foundation of the
> > * United States of America, and the CMU Sphinx Speech Consortium.
> > *
> >- * THIS SOFTWARE IS PROVIDED BY CARNEGIE MELLON UNIVERSITY ``AS IS'' AND
> >- * ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
> >+ * THIS SOFTWARE IS PROVIDED BY CARNEGIE MELLON UNIVERSITY ``AS IS'' AND
> >+ * ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
> > * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
> > * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL CARNEGIE MELLON UNIVERSITY
> > * NOR ITS EMPLOYEES BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> >- * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> >- * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> >- * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> >- * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> >- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> >+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> >+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> >+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> >+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> >+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > *
> > * ====================================================================
> >@@ -39,11 +39,12 @@
> > #define _S3_DICT_H_
> >
> > /** \file dict.h
> >- * \brief Operations on dictionary.
> >+ * \brief Operations on dictionary.
> No need for all these whitespace changes
>
Removed
> >
> >-/**
> >+/**
> > \struct dictword_t
> >- \brief a structure for one dictionary word.
> >+ \brief a structure for one dictionary word.
> > */
> > typedef struct {
> > char *word; /**< Ascii word string */
> >@@ -68,9 +69,9 @@ typedef struct {
> > s3wid_t basewid; /**< Base pronunciation id */
> > } dictword_t;
> >
> >-/**
> >+/**
> > \struct dict_t
> >- \brief a structure for a dictionary.
> >+ \brief a structure for a dictionary.
> > */
> >
> not all these whitespace changes
>
>
Removed
>
> r+ with all those fixed/explained
Assignee | ||
Comment 42•9 years ago
|
||
We did too many deep changes, so I created a new try before commit
https://treeherder.mozilla.org/#/jobs?repo=try&revision=f548fe8fdb40
Assignee | ||
Comment 43•9 years ago
|
||
Attachment #8639704 -
Attachment is obsolete: true
Assignee | ||
Comment 44•9 years ago
|
||
This is the arpa model generated by mitlm
Attachment #8635217 -
Attachment is obsolete: true
Assignee | ||
Updated•9 years ago
|
Attachment #8643527 -
Flags: review+
Assignee | ||
Updated•9 years ago
|
Attachment #8643528 -
Flags: review+
Reporter | ||
Comment 45•9 years ago
|
||
Did you see the B2G Desktop OS X debug build failure?
Assignee | ||
Comment 46•9 years ago
|
||
This is the right Try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=ece372204e9b
Assignee | ||
Comment 47•9 years ago
|
||
(In reply to kdavis from comment #45)
> Did you see the B2G Desktop OS X debug build failure?
That was already fixed. See https://bugzilla.mozilla.org/show_bug.cgi?id=1180113#c46
Reporter | ||
Comment 48•9 years ago
|
||
Cool. I only saw the try from comment 42
Assignee | ||
Updated•9 years ago
|
Keywords: checkin-needed
Comment 49•9 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/5eafba8094d0
https://hg.mozilla.org/integration/mozilla-inbound/rev/9238f5ea7bde
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/5eafba8094d0
https://hg.mozilla.org/mozilla-central/rev/9238f5ea7bde
Status: NEW → RESOLVED
Closed: 9 years ago
status-firefox42:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla42
You need to log in
before you can comment on or make changes to this bug.
Description
•