WebSpeech API GetUserMedia request should use some optimized audio settings
Categories
(Core :: Web Speech, enhancement)
People
(Reporter: gerard-majax, Assigned: gerard-majax)
Attachments
(1 obsolete file)
We should enable noise cancellation at the very least, and maybe others.
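For illustration only, here is a minimal sketch of what "optimized audio settings" could look like from the web-page side of getUserMedia. It is not the Gecko patch attached to this bug. The helper name `buildSpeechAudioConstraints` and its local interface are hypothetical; the three property names are the standard audio constraints from the Media Capture and Streams spec.

```typescript
// Hypothetical helper, not code from this bug's patch.
// The property names match standard MediaTrackConstraints members.
interface SpeechAudioConstraints {
  echoCancellation: boolean;
  noiseSuppression: boolean;
  autoGainControl: boolean;
}

// Build a getUserMedia-style constraints object with audio processing
// enabled by default; callers can override individual settings.
function buildSpeechAudioConstraints(
  overrides: Partial<SpeechAudioConstraints> = {}
): { audio: SpeechAudioConstraints; video: false } {
  const defaults: SpeechAudioConstraints = {
    echoCancellation: true,
    noiseSuppression: true, // the setting this bug asks for "at the very least"
    autoGainControl: true,
  };
  return { audio: { ...defaults, ...overrides }, video: false };
}
```

In a page, the result would be passed as `navigator.mediaDevices.getUserMedia(buildSpeechAudioConstraints())`. Whether the browser honors each setting depends on the platform and device.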
Comment 1•5 years ago
Comment 2•5 years ago
I have attempted to verify this implementation with the following procedure:
- I had 2 builds open at the same time, to compare how Google Translate recognizes verbal sounds on each of them.
build a: Nightly v72.0a1 from 2019-11-25
build b: Nightly v72.0a1 from https://treeherder.mozilla.org/#/jobs?repo=try&revision=b5bb8a18c9e522895110785c741684eaa9478bc9
- I took a generic webcam to use as audio input because its microphone audio quality is the most raw.
- I used a mobile phone to record my voice in samples that I fed as audio input.
- On another mobile phone, I played different YouTube noise videos.
- I tried to find the best working set-up (so that the samples are barely understood) by modifying sample or noise volumes, or by moving the phones closer to or further from the webcam's microphone. These set-ups changed from one sample to another and from one noise video to another.
These are the test results: https://docs.google.com/spreadsheets/d/145pulOOjma7gfjLlxZXXC_txzRWcQesyVmCCybE16VM/edit?usp=sharing
In conclusion, I can definitely say that there is no noticeable improvement on the build with the audio quality improvements compared to the latest Nightly build. Please draw your own conclusions from the test results. Thank you.
P.S. If this is a feature that is only intended to be activated during testing, then it is not needed. I wanted to use noise reduction/cancellation hardware/software only to diversify the testing procedures for the Web Speech API feature, not because recognition (translating voice into a string) was not good enough.
Comment 3•5 years ago
(In reply to Bodea Daniel [:danibodea] from comment #2)
> [...]
> In conclusion, I can definitely say that there is no noticeable improvement on the build with the audio quality improvements compared to the latest Nightly build. Please draw your own conclusions from the test results. Thank you.
So you tested that only against Google STT, right? That is very likely to be much more robust to noise, for now, than the DeepSpeech implementation. If it's not regressing and it can help with QA'ing DeepSpeech, that's good for me.
Updated•5 years ago
So are we ignoring any benefit this could have for non-Google engines?
Comment 5•5 years ago
A better solution is https://github.com/WICG/speech-api/issues/66 to give the application full control (and where these settings are on by default).
The API won't ship before that's in the spec and implemented anyway, AIUI.