Expose SpeechRecognition to the web

Categories: Core :: Web Speech, enhancement, P1
People: Reporter: sebo; Assigned: anatal; NeedInfo
References: Depends on 9 open bugs, Blocks 4 open bugs
Keywords: dev-doc-needed, feature; Whiteboard: [webcompat:p1]
Attachments: 1 file, 6 obsolete files (text/x-phabricator-request)
Comment 56•6 years ago
Andre, Oli, and Andreas, what's the current status of this?
It looks like Andreas' comments in his last review have not been addressed.
Alex was thinking about finishing this patch and integrating Deep Speech too.
Should he just jump in, fix Andreas' request for changes, and do it?
Comment 57•6 years ago
Andre is driving this, so check with him. Last I heard he was working on setting up a mock server for the mochitests, but he has been disrupted by other work a number of times too.
I'm happy to review any media bits, whoever writes them.
Comment 58•6 years ago
Many of Andreas' comments were already addressed after the last All Hands, including a major refactor of the media part. As he said, I was working on the mochitests and on finding the root cause of a memory leak that occurs when closing a tab while microphone capture is in progress.
If the ultimate goal is to integrate Deep Speech, I believe a better use of Alex's time would be to work on the backend instead of the frontend being discussed here, since they should be completely decoupled: i.e., finish the Docker container for DeepSpeech and deploy it to Mozilla's services cloud infrastructure for online decoding, and/or create another bug and patch to integrate DeepSpeech's inference stack and models into Gecko, plus the HTTP service that will receive the requests from the frontend here, for the offline case.
Both are still missing and currently need more attention than the patch being worked on here, which is already functional if applied manually to Gecko.
Comment 59•6 years ago
> create another bug and patch to integrate DeepSpeech's inference stack and models into Gecko

For offline there is already bug 1474084.
Comment 60•6 years ago
(In reply to Andre Natal from comment #58)
> If the ultimate goal is to integrate Deep Speech, I believe a better use of Alex's time would be to work on the backend instead of the frontend being discussed here, since they should be completely decoupled: i.e., finish the Docker container for DeepSpeech and deploy it to Mozilla's services cloud infrastructure for online decoding, and/or create another bug and patch to integrate DeepSpeech's inference stack and models into Gecko, plus the HTTP service that will receive the requests from the frontend here, for the offline case.

There's not that much to finish; it's working, and it is iso-feature with the other implementation served by the speech proxy. I guess it would be more a question of production deployment etc. :)
Comment 61•6 years ago
(In reply to Andre Natal from comment #58)
> If the ultimate goal is to integrate Deep Speech, I believe a better use of Alex's time would be to work on the backend instead of the frontend being discussed here, since they should be completely decoupled: i.e., finish the Docker container for DeepSpeech and deploy it to Mozilla's services cloud infrastructure for online decoding, and/or create another bug and patch to integrate DeepSpeech's inference stack and models into Gecko, plus the HTTP service that will receive the requests from the frontend here, for the offline case.
> Both are still missing and currently need more attention than the patch being worked on here, which is already functional if applied manually to Gecko.

The Deep Speech backend already exists: https://gitlab.com/deepspeech/ds-srv
The associated Dockerfile is there too: https://gitlab.com/deepspeech/ds-srv/blob/master/Dockerfile.gpu
So there is no blocker in that regard.
However, Alex and I are interested in bringing STT on device (bug 1474084), so no servers are required, and Alex wants to resolve both this bug and bug 1474084.
Comment 62•6 years ago
(In reply to Alexandre LISSY :gerard-majax from comment #60)
> (In reply to Andre Natal from comment #58)
> > If the ultimate goal is to integrate Deep Speech, I believe a better use of Alex's time would be to work on the backend instead of the frontend being discussed here, since they should be completely decoupled: i.e., finish the Docker container for DeepSpeech and deploy it to Mozilla's services cloud infrastructure for online decoding, and/or create another bug and patch to integrate DeepSpeech's inference stack and models into Gecko, plus the HTTP service that will receive the requests from the frontend here, for the offline case.
> There's not that much to finish; it's working, and it is iso-feature with the other implementation served by the speech proxy. I guess it would be more a question of production deployment etc. :)

Okay, I'll create a thread including you and Mozilla Services to roll that out to production.
Comment 63•6 years ago
(In reply to kdavis from comment #61)
> (In reply to Andre Natal from comment #58)
> > If the ultimate goal is to integrate Deep Speech, I believe a better use of Alex's time would be to work on the backend instead of the frontend being discussed here, since they should be completely decoupled: i.e., finish the Docker container for DeepSpeech and deploy it to Mozilla's services cloud infrastructure for online decoding, and/or create another bug and patch to integrate DeepSpeech's inference stack and models into Gecko, plus the HTTP service that will receive the requests from the frontend here, for the offline case.
> > Both are still missing and currently need more attention than the patch being worked on here, which is already functional if applied manually to Gecko.
> The Deep Speech backend already exists: https://gitlab.com/deepspeech/ds-srv
> The associated Dockerfile is there too: https://gitlab.com/deepspeech/ds-srv/blob/master/Dockerfile.gpu
> So there is no blocker in that regard.
> However, Alex and I are interested in bringing STT on device (bug 1474084), so no servers are required, and Alex wants to resolve both this bug and bug 1474084.

Cool, so it is better to work on the bits to integrate DeepSpeech into Gecko on bug 1474084 instead of this one.
The goal of the patch here is to create an agnostic frontend which can talk to whichever decoder through an HTTP REST API, regardless of online or offline.
If the goal is to create a local DeepSpeech speech server exposed via HTTP, you can use this as the frontend; but if the goal is to do something different, for example injecting the frames directly into the inference stack, then it is better to create a completely new SpeechRecognitionService instead of injecting decoder-specific code into this patch.
Comment 64•6 years ago
(In reply to Andre Natal from comment #63)
> The goal of the patch here is to create an agnostic frontend which can talk to whichever decoder through an HTTP REST API, regardless of online or offline.

We want to use REST for offline?
Comment 65•6 years ago
I meant that if you point DEFAULT_RECOGNITION_ENDPOINT at a local HTTP service, it should work.
If that's not your goal, you should work on a whole new SpeechRecognitionService containing the DeepSpeech-specific code on bug 1474084.
Hope that helps.
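A minimal sketch of what such an endpoint override could look like, assuming the landed implementation reads the recognition endpoint from the media.webspeech.service.endpoint pref mentioned later in this bug; the localhost URL is a hypothetical local decoder service, not anything shipped by Mozilla:

```js
// user.js — sketch only; assumes the endpoint is pref-controlled as described later in this bug.
// The localhost URL is a hypothetical local HTTP decoder, not a Mozilla service.
user_pref("media.webspeech.recognition.enable", true);
user_pref("media.webspeech.recognition.force_enable", true);
user_pref("media.webspeech.service.endpoint", "http://localhost:8080/");
```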
Comment 66•6 years ago
(not sure I have anything to say here. I'm just reviewing whatever is dumped to my review queue ;)
And happy to review DOM/Gecko side of this)
Comment 67•6 years ago
(In reply to Andre Natal from comment #62)
> Okay, I'll create a thread including you and Mozilla Services to roll that out to production.

Could you put me on CC? Thanks.
Comment 70 (Assignee)•6 years ago
This patch introduces a Speech Recognition Service which interfaces with Mozilla's remote STT endpoint, which is currently being used by multiple services.
Comment 71 (Assignee)•6 years ago
Try run:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=1c95e0fea71e&selectedJob=238007445
Please note that try is now showing the leaks described in the blocking bugs I filed yesterday:
https://bugzilla.mozilla.org/show_bug.cgi?id=1541298
https://bugzilla.mozilla.org/show_bug.cgi?id=1541290
Comment 72•6 years ago
See bug 1547409. Migrating webcompat priority whiteboard tags to project flags.
Comment 73•5 years ago
For testing reference, there are web platform tests for SpeechRecognition, though not all of them appear on wpt.fyi and it's unclear to me how good the coverage is. It appears that Blink has a few extra tests.
Andre, is this still on track for 70?
And, can you suggest a release note (either for 70 or for whatever future release this ends up in)? Thanks!
Release Note Request (optional, but appreciated)
[Why is this notable]:
[Affects Firefox for Android]:
[Suggested wording]:
[Links (documentation, blog post, etc)]:
Comment 75 (Assignee)•5 years ago
(In reply to Liz Henry (:lizzard) from comment #74)
> Andre, is this still on track for 70?
> And, can you suggest a release note (either for 70 or for whatever future release this ends up in)? Thanks!
> Release Note Request (optional, but appreciated)
> [Why is this notable]:
> [Affects Firefox for Android]:
> [Suggested wording]:
> [Links (documentation, blog post, etc)]:

Hi Liz, we moved it to 71. Is it possible to track it there?
Thanks!
Comment 76•5 years ago
Updated to track our 71 release. André, will this need a mention in our release notes, a blog post or a mention on our Nightly twitter account to ask our core community to test it? Thanks
Comment 77•5 years ago
(In reply to Pascal Chevrel:pascalc from comment #76)
> Updated to track our 71 release. André, will this need a mention in our release notes, a blog post or a mention on our Nightly twitter account to ask our core community to test it? Thanks

It's not going to ride the trains. The plan is to hold the feature in Nightly. I guess that means it doesn't go into the release notes, but asking folks to flip the pref and test it in Nightly would be good.
Comment 78•5 years ago
(In reply to Nils Ohlmeier [:drno] from comment #77)
> (In reply to Pascal Chevrel:pascalc from comment #76)
> > Updated to track our 71 release. André, will this need a mention in our release notes, a blog post or a mention on our Nightly twitter account to ask our core community to test it? Thanks
> It's not going to ride the trains. The plan is to hold the feature in Nightly. I guess that means it doesn't go into the release notes, but asking folks to flip the pref and test it in Nightly would be good.

We have release notes for the Nightly channel: https://www.mozilla.org/en-US/firefox/71.0a1/releasenotes/
Comment 79•5 years ago
(In reply to Nils Ohlmeier [:drno] from comment #77)
> It's not going to ride the trains. The plan is to hold the feature in Nightly. I guess that means it doesn't go into the release notes, but asking folks to flip the pref and test it in Nightly would be good.

Are there any instructions for testing this in Nightly? I'm a web developer who runs Firefox Nightly and has given several talks on the Web Speech API. Very keen to help with testing.
Comment 80 (Assignee)•5 years ago
Hi Pascal, yes, let's do it, but let's wait until the code is fully merged before starting this discussion. The landing date is still uncertain since the code hasn't been fully reviewed yet.
Comment 81 (Assignee)•5 years ago
(In reply to jason.oneil from comment #79)
> (In reply to Nils Ohlmeier [:drno] from comment #77)
> > It's not going to ride the trains. The plan is to hold the feature in Nightly. I guess that means it doesn't go into the release notes, but asking folks to flip the pref and test it in Nightly would be good.
> Are there any instructions for testing this in Nightly? I'm a web developer who runs Firefox Nightly and has given several talks on the Web Speech API. Very keen to help with testing.

Hi Jason,
the API hasn't landed and isn't available in Nightly yet, but as soon as it is, enabling it will just be a matter of switching a couple of flags on (if at all).
Comment 82 (Assignee)•5 years ago
We will land this in Nightly 72.
Comment 83•5 years ago
Pushed by nbeleuzu@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e322e2112b1f
Introducing an online speech recognition service for Web Speech API r=smaug,pehrsons,padenot
Comment 84•5 years ago
bugherder
Comment 85•5 years ago
WOOHOOO!! Congratulations everyone! So excited to see this ship after filing this bug four years ago (whoa, time flies!).
Comment 86•5 years ago
(er, duped, but filed a bunch of these originally, so seeing the notifications coming in!!)
Comment 88•5 years ago
Is the speech recognition code free open source software?
Why is the speech recognition code not shipped with the browser?
Comment 89•5 years ago
(In reply to guest271314 from comment #88)
> Is the speech recognition code free open source software?
> Why is the speech recognition code not shipped with the browser?

There is a related issue regarding shipping DeepSpeech with Firefox: https://bugzilla.mozilla.org/show_bug.cgi?id=1474084. The code for DeepSpeech is, of course, open source. It can be found at https://github.com/mozilla/DeepSpeech
Comment 90•5 years ago
(In reply to brandmairstefan from comment #89)
> (In reply to guest271314 from comment #88)
> > Is the speech recognition code free open source software?
> > Why is the speech recognition code not shipped with the browser?
> There is a related issue regarding shipping DeepSpeech with Firefox: https://bugzilla.mozilla.org/show_bug.cgi?id=1474084. The code for DeepSpeech is, of course, open source. It can be found at https://github.com/mozilla/DeepSpeech

Tried to use SpeechRecognition on Nightly 73 - with the necessary flags set - and the browser crashed.
Am trying to compare the results of various STT services. So far I have tested https://speech-to-text-demo.ng.bluemix.net/ and https://cloud.google.com/speech-to-text/, which provide different results for a 33-second audio file (WAV).
Is there a means to send an audio file to the Mozilla endpoint without using SpeechRecognition?
Comment 91•5 years ago
SpeechRecognition appears to stop at a brief pause between sounds, given 33 seconds of audio:
audiostart 266094
start 266095
speechstart 266419
speechend 267451
audioend 267453
end 268580
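A minimal sketch of how such an event timeline can be captured, assuming the Web Speech prefs discussed in this bug are enabled in Nightly; the timestamps come from the generic DOM event objects rather than any Firefox-specific API:

```js
// Sketch: log the lifecycle events of a recognition session with their timestamps,
// producing a timeline like the one pasted above.
const recognition = new (window.SpeechRecognition || window.webkitSpeechRecognition)();

for (const type of ["audiostart", "start", "speechstart", "speechend", "audioend", "end"]) {
  recognition.addEventListener(type, (event) => {
    // event.timeStamp is in milliseconds relative to the page's time origin.
    console.log(type, Math.round(event.timeStamp));
  });
}

recognition.addEventListener("result", (event) => {
  console.log("result", event.results[0][0].transcript);
});

recognition.start();
```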
Comment 92•4 years ago
Not working for me. I have updated to Firefox Nightly v79 (18/06/2020).
I have set both values to true and relaunched the browser. After visiting https://speechnotes.co, I am unable to click on the microphone. I also tried another site, https://dictation.io/speech; there the tool does not recognize my voice.
Comment 93•4 years ago
Greetings to everyone. I'm new here, but I've filed speech recognition issues with Chromium / Android; they answered that they will solve them this year.
I would like to give you an idea of a different approach I experienced.
If we take any word, let it be "house", and build a container around this object, something similar to a cell which has a DNA and organelles, we could achieve the following:
Steps:
1.) "house" + a collection of sound-API equalizer registrations which get acquired and update this DNA chain
2.) pass the collection through sound fingerprinting to compute common patterns (Philips technology, Netherlands, but it does cost)
3.) place the fingerprinting patterns into the house DNA chain / house object.
Redo 1 to 3 on a regular basis, say every 10 / 20 new equalizer registrations you obtain from users requesting to have their words recognized.
Server-wise, place the fingerprints into its memory. Once voice recognition is requested, preprocess the input voice as a fingerprint, get the matching fingerprint or fingerprints from the server memory, and return the word "house" contained in the object.
Nobody will forbid you to combine, connect, or link different objects into one, such as: house, Haus (German), casa (Italian), casa (Portuguese), maison (French), and this way also obtain immediate translation possibilities, or words with the same meaning that sound alike.
Building this as single objects and not as a database is necessary for performance reasons, as a database becomes slower and slower the more data it contains. I know this from music recognition engines used to monitor radio and television stations and report the author rights to be settled (a little like Shazam).
The other reason is that you do not need to search for a pattern if the pattern is already named by its result, so it becomes nothing more than a file lookup, which is much faster than a database lookup.
An example of a data container describing the content (which could be the sound-API equalizer recordings and the fingerprintings):
data-styled="{"x":161,"y":403,"width":"79vw","height":"49vh","top":"47vh","right":"91vw","bottom":"96vh","left":"12vw","font-family":"Times","font-weight":"400","font-size":"6.884480746791131vmin","line-height":"normal","text-align":"start","objHref":"","dna-mouse-up":"exchangeForm,","dna-enter":"secondary,","objCode":"1603813965238","objTitle":"Museo Gregoriano Egizio","objUrl":"","objData":"","objMime":"image/jpeg","objText":"","objScreenX":"","objScreenY":"","objScreenZ":"","objName":"","objAddr":"","objEmail":"","objWhatsapp":"","objSkype":"","objPhone":"","objWWW":"","objBuild":"Tue, 27 Oct 2020 15:52:45 GMT","objExpire":"","objGroup":"","objOwner":"","objCmd":"pasted image","objUpdate":"Fri, 01 Jan 2021 15:28:20 GMT","objParent":"1603813859288","name":"","org-height":"734","org-width":"981","org-size":"","compression":"0.6","res-width":"981","res-height":"734","res-size":"154787","scale-factor":"1","play-dna":"","objTime":16856080079020.5,"objStart":1609514900047,"background-color":"Tomato"}"
and allowing the data, with:
"dna-mouse-up":"exchangeForm,",
"dna-enter":"secondary,",
to execute commands / programs based upon mouse-up or entering a screen area.
(I took it from a website I've built.)
Or with such a set of instructions:
data-styled="{"clickdog29021":"50,50,50,50,2000,5","clickdog41510":"89.47368421052632,39.39393939393939,7.236842105263158,39.928698752228165,412,3","clickdog51838":"7.894736842105263,40.28520499108734,75.6578947368421,40.106951871657756,352,-3","clickdog59010":"12.5,45.27629233511586,73.02631578947368,7.8431372549019605,468,9","clickdog65855":"73.35526315789474,16.22103386809269,17.105263157894736,47.41532976827095,354,-9","clickdog77738":"50,50,50,50,2000,6"}"
The numbers are a unique processing id, starting x,y, ending x,y (vh/vw), time used (ms), and the action: run a hand to greet (5), then show the visitor how to swipe to change page forwards and backwards (3, -3), last and first page (9, -9), then finally wave goodbye (6). Clicking would be 1, long click 2, and very long click 4 (used to issue different commands available on the screen).
The advantage of such a thing is that each single data container will continue to learn and process, and as the real-world usage of those data containers builds up a major "understanding" (data collection) of the word "house", the containers less used or unused will time out ("objExpire":"") and/or ("objUpdate":"Fri, 01 Jan 2021 15:28:20 GMT"), and you have something like a natural selection or evolution of containers adapting themselves to the millions of speakers using the system to dictate their hopefully intelligent words.
I hope I could explain the idea or technique in a way that makes it understandable.
At your disposal.
(I do not place a link to the above; I would consider that publicity, but the thing is also on the web and works.)
Claudio Klemp
Comment 94 (Reporter)•4 years ago
Hi Claudio and Happy New Year! Please note that Bugzilla is meant to track the implementation of specific feature requests or bug fixes. In this case it is the SpeechRecognition interface of the Web Speech API, and this bug is already closed.
More general discussions should happen at https://discourse.mozilla.org/, which might then end up as specific requests here.
Sebastian
Comment 95•4 years ago
(In reply to André Natal from comment #81)
> (In reply to jason.oneil from comment #79)
> > (In reply to Nils Ohlmeier [:drno] from comment #77)
> > > It's not going to ride the trains. The plan is to hold the feature in Nightly. I guess that means it doesn't go into the release notes, but asking folks to flip the pref and test it in Nightly would be good.
> > Are there any instructions for testing this in Nightly? I'm a web developer who runs Firefox Nightly and has given several talks on the Web Speech API. Very keen to help with testing.
> Hi Jason,
> the API hasn't landed and isn't available in Nightly yet, but as soon as it is, enabling it will just be a matter of switching a couple of flags on (if at all).

Which "couple of flags"? I observed devtools.chrome.enabled = true and media.webspeech.recognition.enable = true from this discussion and set those accordingly. I am testing on https://mdn.github.io/web-speech-api/speech-color-changer/index.html, but the page does not load properly due to a "webkitSpeechRecognition is not defined" error. Testing on Nightly 86.0a1 (2021-01-03).
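As an aside, that demo breaks because it references the prefixed constructor unconditionally; a sketch of the usual prefix-tolerant feature-detection pattern, which avoids that error regardless of which name the browser exposes:

```js
// Sketch: prefix-tolerant detection of the Web Speech recognition API.
// Firefox Nightly (with the prefs enabled) exposes the unprefixed SpeechRecognition,
// while Chromium-based browsers historically exposed webkitSpeechRecognition.
const SpeechRecognitionCtor = window.SpeechRecognition || window.webkitSpeechRecognition;

if (SpeechRecognitionCtor) {
  const recognition = new SpeechRecognitionCtor();
  recognition.lang = "en-US";
  recognition.onresult = (event) => {
    console.log("Heard:", event.results[0][0].transcript);
  };
  recognition.start();
} else {
  console.warn("Speech recognition is not available in this browser.");
}
```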
Comment 96•4 years ago
Additional info: using macOS version 10.13.6.
Comment 97•4 years ago
Re comment 95: the last time I tested this, the page I used for information on speech recognition was https://wiki.mozilla.org/Web_Speech_API_-_Speech_Recognition#How_can_I_test_with_Deep_Speech.3F; see https://bugzilla.mozilla.org/show_bug.cgi?id=1604994.
Comment 98•4 years ago
(In reply to guest271314 from comment #97)
> Re comment 95: the last time I tested this, the page I used for information on speech recognition was https://wiki.mozilla.org/Web_Speech_API_-_Speech_Recognition#How_can_I_test_with_Deep_Speech.3F; see https://bugzilla.mozilla.org/show_bug.cgi?id=1604994.

I get "Voice input isn't supported on this browser".
When running a simple script, window.SpeechRecognition is undefined.
Comment 99•4 years ago
You'll need to enable both media.webspeech.recognition.enable and media.webspeech.recognition.force_enable to test out the API in its current state.
Comment 100•4 years ago
(In reply to Andreas Pehrson [:pehrsons] from comment #99)
> You'll need to enable both media.webspeech.recognition.enable and media.webspeech.recognition.force_enable to test out the API in its current state.

Thank you, that works. media.webspeech.recognition.force_enable wasn't mentioned in this discussion before.
Am I to understand that setting media.webspeech.service.endpoint to https://dev.speaktome.nonprod.cloudops.mozgcp.net/ will cause SpeechRecognition to use DeepSpeech instead of the OS-native service?
Comment 101•4 years ago
> to use DeepSpeech instead of the OS-native service?

What is meant by "OS-native service"? Which OS are you running?
AFAIK, on Linux there is no "native OS service" for speech recognition.
Comment 102•4 years ago
(In reply to JulianHofstadter from comment #100)
> (In reply to Andreas Pehrson [:pehrsons] from comment #99)
> > You'll need to enable both media.webspeech.recognition.enable and media.webspeech.recognition.force_enable to test out the API in its current state.
> Thank you, that works. media.webspeech.recognition.force_enable wasn't mentioned in this discussion before.
> Am I to understand that setting media.webspeech.service.endpoint to https://dev.speaktome.nonprod.cloudops.mozgcp.net/ will cause SpeechRecognition to use DeepSpeech instead of the OS-native service?

See the wiki page mentioned in comment 97. It lists the default destination, and further down how to switch to Deep Speech.
Comment 103•4 years ago
In trying to understand the docs, I read the interimResults [1] property to mean that results will be returned at intervals while speaking, thus firing the result event more than once before speech recognition ends. However, in practice I am observing that setting interimResults to true has no effect, and only a single result is returned at the end of recognition.
Am I interpreting this incorrectly, and if so, can someone tell me what practical difference I should see between setting it to true and false?
[1] https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition/interimResults
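For context, a sketch of the behavior interimResults is meant to produce per the spec and the MDN page linked above: several result events carrying non-final results while speaking, followed by a final one. Whether Firefox's current implementation honors the flag is exactly the open question in this comment.

```js
// Sketch of the spec-described interimResults behavior (see the MDN link above).
// Whether the current Firefox implementation delivers interim results is the open question here.
const recognition = new (window.SpeechRecognition || window.webkitSpeechRecognition)();
recognition.interimResults = true;

recognition.onresult = (event) => {
  for (let i = event.resultIndex; i < event.results.length; i++) {
    const result = event.results[i];
    const label = result.isFinal ? "final" : "interim";
    console.log(`${label}: ${result[0].transcript}`);
  }
};

recognition.start();
```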
Comment 104•4 years ago
This implementation is far from the spec, but at least basic recognition should work. I don't think it allows much configuration at all, but if you want an authoritative answer, read the source.
That said, this bug is not the place to keep discussions, so this will be my last comment. Feel free to continue on Matrix.
Comment 105•4 years ago
(In reply to Andreas Pehrson [:pehrsons] from comments 102 and 104)
> See the wiki page mentioned in comment 97. It lists the default destination, and further down how to switch to Deep Speech.

Very informative, that helps a lot, thank you.

> This implementation is far from the spec, but at least basic recognition should work. I don't think it allows much configuration at all.

Ah, that's helpful.

> That said, this bug is not the place to keep discussions, so this will be my last comment. Feel free to continue on Matrix.

Thanks, I'll take your advice.