Mycroft-core: Local STT Fallback?

Created on 21 Jan 2017  ·  9 Comments  ·  Source: MycroftAI/mycroft-core

I don't know how much effort this would take, or how much interest there may be in this.

While I understand that the typical local (offline) STT systems are not as good at recognition as the systems currently used, I would still like the option to use them, as well as an option to fall back to them if the online service takes too long to respond (or goes down for whatever reason).

Personal Background:

Home internet is crappy, sometimes basic requests lag significantly, sometimes the connection goes out (while almost everything I use is online based, there will likely be eventual offline only functions like home automation).

Suggestion:

  • Add support for one or more local recognition tools.
  • Add a "fallback" option in the STT settings to use the local recognition if the remote service is unresponsive (no connection, or takes longer than X seconds to give a result)
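A minimal sketch of the second suggestion, assuming the remote and local recognizers can each be wrapped as a plain callable that takes audio and returns text. The names transcribe_with_fallback, remote_stt, local_stt and timeout_s are hypothetical and are not part of mycroft-core:

```python
from concurrent.futures import ThreadPoolExecutor


def transcribe_with_fallback(audio, remote_stt, local_stt, timeout_s=3.0):
    """Return (text, source): try remote_stt first, use local_stt if it
    fails or takes longer than timeout_s seconds.

    remote_stt and local_stt are hypothetical callables: audio -> text.
    """
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(remote_stt, audio)
    try:
        # Raises a timeout error if the remote service is too slow, or
        # re-raises whatever the remote call raised (e.g. no connection).
        return future.result(timeout=timeout_s), "remote"
    except Exception:
        # Unresponsive or failed: a poor answer beats no answer.
        return local_stt(audio), "local"
    finally:
        # Do not block on the abandoned request; note that it still runs
        # to completion in its background thread.
        pool.shutdown(wait=False)
```

The X seconds from the suggestion map to timeout_s here. One trade-off of this approach is that a slow remote request is abandoned rather than cancelled, so it keeps using the connection until it finishes.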

Issues Resolved by this:

  • Privacy (users who are willing to sacrifice quality for privacy can do so, potentially with a wholly offline Mycroft)
  • Continue operating in poor connectivity (option to prefer a "poor response" over "no response")
Enhancement - proposed help wanted

All 9 comments

Agreed entirely! In fact, if I read one of the blog posts / newsletters correctly, I'd say this should be firmly in scope.

Hi,

See pull request #656 and related issue #655. It implements the necessary changes in client/speech to use local pocketsphinx as STT device.

It was developed and tested with Spanish internationalization in mind, but it should work for any other language supported by PocketSphinx.

See wiki page here for more details. In particular, the chapter titled "Select and configure an STT with Spanish support".

There is also pull request #438 if you want to use kaldi.
I'm not sure it will work on the raspberry pi, but works fine on my desktop.

A few years later, but sharing a non-optimized way of using both online and offline STTs depending on internet connectivity, in case it can help someone. Personally, I use PocketSphinx as the offline STT.

On mycroft.stt.__init__.py, add the import:

    from mycroft.util import connected

then, in the create() method, add:

    if not connected():
        return PocketSphinxSTT()
    else:
        try:
            ...

Then, in your mainskill.__init__.py, add this call to the handle_boot_finished method:

    self.schedule_repeating_event(self.check_connection_switchSTT, None, 30)

Every 30 seconds, this calls check_connection_switchSTT, which checks internet connectivity and restarts the audio services if it has changed.
My function looks like this:

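    # Needs: import subprocess
    #        from mycroft.util import connected
    #        from mycroft.util.log import LOG
    def check_connection_switchSTT(self):
        # Record the current connectivity state.
        self.newstate = "yes" if connected() else "no"
        # Restart the voice service only when the state has changed,
        # so that create() picks the STT matching the new state.
        if self.prevstate != self.newstate:
            self.prevstate = self.newstate
            LOG.info("Internet connectivity changed")
            subprocess.call(['/path/to/stop-mycroft.sh', 'voice'])
            subprocess.call(['/path/to/start-mycroft.sh', 'voice'])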
self.prevstate has to be defined beforehand, during the skill's initialization, according to the connection state at that time.
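The state tracking described above can also be sketched independently of Mycroft, which makes it easier to test. This is only an illustration: ConnectivityWatcher, is_connected and on_change are hypothetical names. In the setup above, is_connected would be mycroft.util.connected and on_change would run the stop and start scripts.

```python
class ConnectivityWatcher:
    """Call on_change(current_state) only when connectivity flips.

    is_connected: hypothetical callable returning True when online.
    on_change:    hypothetical callback, e.g. one that restarts the
                  voice service.
    """

    def __init__(self, is_connected, on_change):
        self._is_connected = is_connected
        self._on_change = on_change
        # Take the initial state at construction time, as prevstate is
        # set during the skill's initialization.
        self._prev = bool(is_connected())

    def poll(self):
        """Check connectivity once; return True if the state changed."""
        current = bool(self._is_connected())
        if current == self._prev:
            return False
        self._prev = current
        self._on_change(current)
        return True
```

poll() is meant to be called from a repeating timer, such as the 30-second scheduled event above.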

Not a very good way of solving the problem, but I couldn't find anything else on the subject and badly needed a fallback for when my internet is down. If anyone has thought about this issue during the past three years, please share.

pocketsphinx is a dead end, it sucks to the point of not being usable even as a fallback

i made https://github.com/MycroftAI/mycroft-core/pull/1184 to add an offline pocketsphinx STT, but live testing ended up 99.9% of the time with me giving up and just using the cli

kaldi live streaming is an option, works slightly better, i only give up 75% of times!

your best option is to self host deepspeech/kaldi , but even that is not ready for prime time

you might be interested in my pocketsphinx local listener (limited vocab works ok) and kaldi spotter

those are proof of concepts but not really maintained projects, Pull Requests welcome to keep them up to date

i do think a fallback STT makes a lot of sense, but only once we have something usable, currently it's a waste of resources

I second having trouble with pocketsphinx, mostly due to low accuracy, personally I had way better luck with using kaldi with https://github.com/jcsilva/docker-kaldi-gstreamer-server

Last time I tried it was in 2017 so ymmv, hopefully I get some time to look at it again, I miss my mycroft :)

Thanks, the local listener is very interesting indeed, i'll PR with an update.

Vosk can run on RPI and has small (50MB) STT-models for currently 16 languages. I will give this a try now...

@domcross see vosk here https://github.com/HelloChatterbox/speech2text/blob/dev/speech2text/engines/kaldi.py

just waiting for #2594 to be merged so i can add it :)
