I've seen quite a lot of projects using the Web Speech API, the browser API for speech recognition.
While this is great, since we need more people building voice user interfaces in browsers, I'm not sure it's suitable for production use at all.
First of all, browser support is severely lacking, so it's pretty much a Chrome (and Chromium derivatives) specific feature.
Second, since it only does speech recognition, it's not easy to leverage it for anything beyond just replacing the keyboard.
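To illustrate the support situation: even where the API exists, it's still vendor-prefixed in Chromium-based browsers, so you have to feature-detect it. Here's a minimal sketch (the prefixed `webkitSpeechRecognition` lookup and the `globalThis` fallback are just one way to do it):

```javascript
// Feature-detect the Web Speech API. In Chromium-based browsers the
// constructor is still exposed under the webkit prefix; elsewhere it
// may simply not exist at all.
const globalScope = typeof window !== 'undefined' ? window : globalThis;
const SpeechRecognition =
  globalScope.SpeechRecognition || globalScope.webkitSpeechRecognition;

if (SpeechRecognition) {
  const recognition = new SpeechRecognition();
  recognition.lang = 'en-US';
  recognition.onresult = (event) => {
    // Log the top transcript candidate for the first result.
    console.log(event.results[0][0].transcript);
  };
  recognition.start();
} else {
  console.log('Web Speech API not supported in this environment');
}
```

The fact that this kind of guard is mandatory, and that the unsupported branch is what runs in Firefox and many other environments, is exactly the production concern.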
And if you think about it, speech is not just a keyboard operated by voice. Written and spoken language are practically two different languages: saying something like "Turn off the lights" is intuitive and fast, but typing that sentence to perform the same action would be poor UX.
This is why any voice user interface needs a natural language understanding component to extract meaning from what the user said. Even with the simplest voice user interfaces, actual users will come up with many different ways of expressing the same intent, and grepping text transcripts is not a viable solution.
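A quick made-up example of why transcript grepping falls apart (the utterances and the pattern here are hypothetical, just to show the shape of the problem):

```javascript
// Four real-world phrasings of the same "lights off" intent.
const utterances = [
  'turn off the lights',
  'lights off please',
  'kill the lights',
  'could you switch the lights off',
];

// Exact-phrase matching on the transcript catches only one variant.
const exactMatches = utterances.filter(u => u === 'turn off the lights');
console.log(exactMatches.length); // 1 of 4

// Even a hand-rolled pattern has to anticipate every surface form,
// and real users will keep producing new ones.
const isLightsOff = u => /\b(off|kill)\b/.test(u) && /\blights?\b/.test(u);
console.log(utterances.filter(isLightsOff).length); // 4 of 4
```

Hand-maintaining patterns like `isLightsOff` doesn't scale past a handful of intents, which is the gap an NLU component is supposed to fill.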
So here's my question: are you using the Web Speech API, and for what kinds of projects? What do you see as the benefits of the API compared to other solutions? Have you considered alternatives, and if so, why did you ditch them?