Javascript Implementation of Snowboy #98

mslinn · 2016-12-31T20:41:37Z

A version of Snowboy that could run in most popular web browsers would be really great!

Nixellion · 2017-01-05T19:00:17Z

I'll +1 this issue. It would definitely be awesome to have a front-end, JavaScript-based hotword detection system. If I'm correct, Snowboy and Sonus both require Node.js and other server-side stuff? So basically you can build a standalone Alexa-like hardware tool running Snowboy to detect hotwords from one microphone input (well, I know you can connect many mics and mix them into one channel, but it'll still be a pain in the ass compared to a web-based UI).

I'm writing my own home assistant bot as well, using Python for command processing, and I only use the browser as a UI that recognizes speech and sends text commands to the Python Flask server.

I chose this approach because this way I can just put a few cheap Android or Windows tablets around the house, instead of dealing with and mixing a lot of microphones routed to one PC, or building multiple RPi 'assistants'. It also allows me to use my AI when I'm not at home. That makes it more like a Cortana/OK Google/Alexa server.

So I'm really curious about how to detect hotwords with browser-side JS.
Not feeling like writing a standalone app for this yet :)

chenguoguo · 2017-01-05T19:09:35Z

It's not impossible to turn a C++ library/binary into javascript, e.g., I've done this for sox with Emscripten.

For Snowboy, however, there will be a lot of difficulties. E.g., how can we turn the CBLAS functions we use in Snowboy into JavaScript? Also, JavaScript basically means open sourcing it (I mean the source code, not just the library), so it's also a decision to make on our side...

I'll leave this issue open for a while.

mslinn · 2017-01-05T20:31:26Z

"You can use f2c to convert the BLAS/LAPACK code to c and then it compiles straightforwardly with emscripten... The GNU Scientific Library has a c implementation of BLAS, as well as a whole load of other useful stuff, and it also compiles well with emscripten" https://groups.google.com/d/msg/emscripten-discuss/4Qt1OXKCKrk/0ETZBsbFVxwJ

Do not confuse open source licensing with source readability. Just because someone can do something does not mean they are allowed to do so. The topics of licensing and source code availability are orthogonal.

If a customer has read-only access to source, support costs go down. Right to modify for internal use could cost more. Tiered licensing can have a no cost or low cost entry tier, with successively more expensive tiers.

The technique of using licensing as a marketing device has worked for many companies over the years. A startup that uses a free tier to launch is going to be able to pay as they gain traction. Those same companies would be unable to pay up front. Competitors that do not adopt this approach would be at a disadvantage.

A dual license is also popular. This would provide widest possible distribution at no cost. Once usage by an organization or project grows beyond some metric, payments would be required.

The market for hands-free voice control applications will explode throughout 2017. An open source library of this type is inevitable. Will it be yours, or a competitor's?

Mike

chenguoguo · 2017-01-05T20:44:22Z

Thanks @mslinn for the detailed writeup! We do have algorithms that we don't want to release to the public yet. If it's just an implementation of something well known, then as you suggested licensing should solve the issue. That's why I said "it's also a decision to make on our side".

mslinn · 2017-01-05T21:09:24Z

A JS obfuscator might help. Yes, obfuscators can be cracked. I believe your existing code is equally subject to reverse engineering. Make it easy to keep regular folk honest. Those bent on criminal behavior won't be deterred from hacking your existing product.

evancohen · 2017-01-06T23:37:05Z

@chenguoguo I'd be very willing to help with this if you choose to take that route :)

chenguoguo · 2017-01-06T23:52:23Z

@evancohen I'm seriously considering this, but no decision yet :-) so I'll leave this up for a while.

CBLAS functions like sgemm usually require quite some optimization at the assembly code level, and generic implementations of those functions can be very slow. So I'm also not sure how this will turn out.

evancohen · 2017-01-07T00:19:25Z

Weblas claims to have "performance comparable to native". That might be a good place to start. Having a truly cross-platform version of snowboy would be amazing!

Also, I'm just throwing this out there because I'm not sure how it handles libraries like CBLAS, but another option would be to use Native Client. Unfortunately this would really only work in Chrome/Chromium and, having attempted to create one in the past, I can say it has its own drawbacks.

Thalhammer · 2017-05-08T08:27:56Z

How about WebAssembly?
It's far easier to port existing C++ code, faster than JS and would work in most major browsers.
Your algorithms would be protected no less than now.

gauthamzz · 2018-01-05T10:47:33Z

Any update on this feature? Is it coming soon?

HeyFood · 2018-01-30T12:46:12Z

Also looking for updates on this feature.

chenguoguo · 2018-01-31T17:44:08Z

Not yet.

marcus7777 · 2018-02-09T11:50:35Z

need this too

cfmaley3 · 2018-02-10T20:43:54Z

This feature would be a big help.

gauthamzz · 2018-02-27T09:25:10Z

I would like to work on this. Could you guide me on how to do this?

Thalhammer · 2018-02-27T10:28:08Z

You would probably need the source of this library, which is not open source.

gauthamzz · 2018-02-27T10:34:50Z

So will this feature ever come?

chenguoguo · 2018-04-11T05:38:26Z

We don't have resources for that at this point, and we've put it in the low-priority category...

richtier · 2018-08-18T13:54:21Z

As a workaround I stream HTML5 Web Audio to a Django (Python) web server via a WebSocket, convert the audio to WAV, and feed it into Snowboy.

I'll share the code if there is any interest in it

gauthamzz · 2018-08-18T13:55:51Z

Yes, do share the code, it would be a great help.

richtier · 2018-08-18T16:00:13Z

The webaudio to wav converter: https://github.com/richtier/voice-command-lifecycle/blob/1e03fb8e434a4ad86532c59952bcce91d82eca35/command_lifecycle/helpers.py#L14

I use this approach to use a browser as an Alexa sim: https://github.com/richtier/alexa-browser-client

You can fork alexa-browser-client and change the behaviour here: https://github.com/richtier/alexa-browser-client/blob/2ec5b215e4fd263b37b1c2526431a819d61aed84/alexa_browser_client/alexa_browser_client/helpers.py#L27
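For readers who do not want to follow the links, the "convert to WAV" step of this workaround can be sketched with the Python standard library. This is not the code of the linked helper; the function name `pcm16_to_wav` and its defaults are hypothetical, and the sketch assumes the audio has already been reduced to 16-bit mono PCM at 16 kHz.

```python
import io
import wave


def pcm16_to_wav(pcm: bytes, sample_rate: int = 16000, channels: int = 1) -> bytes:
    """Wrap raw 16-bit little-endian PCM bytes in a WAV container, in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)     # 1 = mono
        wav.setsampwidth(2)            # 2 bytes per sample = 16 bit
        wav.setframerate(sample_rate)  # samples per second
        wav.writeframes(pcm)           # header sizes are patched when the writer closes
    return buf.getvalue()


if __name__ == "__main__":
    one_second_of_silence = b"\x00\x00" * 16000
    wav_bytes = pcm16_to_wav(one_second_of_silence)
    print(len(wav_bytes), "bytes including the WAV header")
```

The resulting bytes can be written to a file or handed to whatever consumes WAV data on the server side.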

Nixellion · 2018-08-19T15:58:48Z

@richtier This is a good workaround, and I thought about it too. Thanks for sharing your implementation! It's awesome.

However, one downside of this, and the reason I did not pursue this approach much, is that audio will be transmitted almost constantly to a web server, polluting the local network. It may not seem like much traffic, but add multiple clients, combine it with regular user traffic and other things, and it starts to look worse. That is especially true if it's noisy or if there's music playing, so filtering by sound level is not enough. It is certainly better than using Google's speech recognition for the hotword :D But it's much worse than client-side hotword detection.

richtier · 2018-08-20T07:15:59Z

@Nixellion good point.

To quantify that, currently every 0.37 seconds approx 0.5 megabytes (533800 bytes) are sent to the server.

Given that 162 payloads of 533800 bytes are sent per minute, that's:

86.5 megabytes a minute (86475600 bytes)
5.2 gigabytes per hour (5188536000 bytes)
124.5 gigabytes per day (124524864000 bytes)
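
The figures above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the payload size and interval are the ones quoted in this comment, and the names `traffic_estimate`, `PAYLOAD_BYTES` and `PAYLOAD_INTERVAL_S` are made up for illustration.

```python
PAYLOAD_BYTES = 533_800     # size of one payload, as quoted above
PAYLOAD_INTERVAL_S = 0.37   # approximate time between payloads, as quoted above


def traffic_estimate(payload_bytes: int, interval_s: float) -> dict:
    """Return payloads per minute and bytes sent per minute, hour and day."""
    payloads_per_minute = int(60 / interval_s)  # whole payloads that fit in one minute
    per_minute = payloads_per_minute * payload_bytes
    return {
        "payloads_per_minute": payloads_per_minute,
        "bytes_per_minute": per_minute,
        "bytes_per_hour": per_minute * 60,
        "bytes_per_day": per_minute * 60 * 24,
    }


if __name__ == "__main__":
    for name, value in traffic_estimate(PAYLOAD_BYTES, PAYLOAD_INTERVAL_S).items():
        print(f"{name}: {value:,}")
```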

For context, watching Netflix uses 2.5GB of data per hour, albeit external data from outside the local network (so a somewhat useful benchmark).

I'm planning on having at least four devices connected. Roughly 500 gigabytes per day of extra internal network usage looks bad; it will certainly encourage me to optimize it by not transmitting "silence". In my case, only two people are in the house, and we close doors, so normally only one device will be active at any given time.

I considered compressing the audio before transmitting it. A cursory gzipping actually increases the size. That's probably because compressing random data is "hard".
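
The effect is easy to reproduce, under the assumption that the sample data looks close to random at the byte level. The sketch below uses `os.urandom` as a stand-in for such data (real microphone audio is not fully random, so actual ratios will differ), and the helper name `gzip_ratio` is hypothetical.

```python
import gzip
import os


def gzip_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (above 1.0 means it grew)."""
    return len(gzip.compress(data)) / len(data)


if __name__ == "__main__":
    noise_like = os.urandom(500_000)  # stand-in for noise-like sample data (assumption)
    all_zero = bytes(500_000)         # an all-zero buffer, for contrast
    print(f"noise-like buffer: {gzip_ratio(noise_like):.3f}")
    print(f"all-zero buffer:   {gzip_ratio(all_zero):.3f}")
```

Random bytes have no redundancy for gzip to exploit, so the container overhead makes the output slightly larger than the input, while highly repetitive data shrinks to almost nothing.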

Perhaps we can compress by reducing the range from 32 bit to 16 bit somehow. It won't be lossless compression, but as long as it's good enough for Snowboy to understand, it should be good enough. That would cut the payload size in half.
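
One way this reduction could look, as a minimal sketch: it assumes the 32-bit data are little-endian floating-point samples in the range -1.0 to 1.0 (the format the Web Audio API uses for its sample buffers), and the function name `float32_to_int16_pcm` is hypothetical.

```python
import struct


def float32_to_int16_pcm(raw: bytes) -> bytes:
    """Convert little-endian 32-bit float samples to 16-bit signed PCM bytes."""
    count = len(raw) // 4  # 4 bytes per float32 sample; trailing partial bytes are dropped
    samples = struct.unpack(f"<{count}f", raw[: count * 4])
    # Clamp to [-1.0, 1.0] so out-of-range samples cannot overflow 16 bits,
    # then scale to the signed 16-bit range.
    ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    return struct.pack(f"<{count}h", *ints)


if __name__ == "__main__":
    floats = struct.pack("<4f", 0.0, 1.0, -1.0, 0.5)
    pcm = float32_to_int16_pcm(floats)
    print(len(floats), "bytes in,", len(pcm), "bytes out")
```

Each sample shrinks from 4 bytes to 2, which is where the halving comes from.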

I wonder though if high internal traffic is too much of a bad thing? Might require a router upgrade? Maybe increase electricity usage?

Thalhammer · 2018-08-20T17:11:55Z

Another way to reduce network usage would be to use a lower sample rate and bit depth and only transmit 1 channel. Snowboy only accepts audio with a 16 kHz sample rate, 16-bit samples and 1 channel anyway, so there is no reason to transmit higher quality.

16k samples/s * 16 bit (2 bytes) * 1 channel = 32 kBytes/second uncompressed. You could use, for example, flac.js* to compress it even further, but 32k/s seems low enough for an internal network.

https://blog.rillke.com/flac.js/

32k/s = 1.875M/minute = 112.5M/h ≈ 2.6G per day
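
A rough sketch of that reduction in Python, to make the data-rate difference concrete. The 48 kHz stereo float input is an assumed example of what a browser might capture, all function names are hypothetical, and the linear-interpolation resampler has no anti-aliasing filter, so a real implementation would want a proper resampler.

```python
from typing import List, Sequence


def downmix_to_mono(interleaved: Sequence[float], channels: int) -> List[float]:
    """Average each frame of interleaved samples into a single mono sample."""
    return [
        sum(interleaved[i : i + channels]) / channels
        for i in range(0, len(interleaved) - channels + 1, channels)
    ]


def resample_linear(samples: Sequence[float], src_rate: int, dst_rate: int) -> List[float]:
    """Resample by linear interpolation (sketch only: no low-pass filtering)."""
    if not samples:
        return []
    out_len = int(len(samples) * dst_rate / src_rate)
    out = []
    for n in range(out_len):
        pos = n * src_rate / dst_rate  # position in the source signal
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, len(samples) - 1)]
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
    return out


def pcm_bytes_per_second(sample_rate: int, bytes_per_sample: int, channels: int) -> int:
    """Uncompressed data rate of a PCM stream."""
    return sample_rate * bytes_per_sample * channels


if __name__ == "__main__":
    one_second_stereo = [0.0] * (48_000 * 2)  # assumed capture format: 48 kHz, 2 channels
    mono_16k = resample_linear(downmix_to_mono(one_second_stereo, 2), 48_000, 16_000)
    print(len(mono_16k), "samples per second after conversion")
    print(pcm_bytes_per_second(48_000, 4, 2), "bytes/s before")
    print(pcm_bytes_per_second(16_000, 2, 1), "bytes/s after")
```

Fewer samples per second, fewer bytes per sample and a single channel each cut the number of bytes needed to represent the same second of audio.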

This, however, is only a solution if you control both client and server. A public website that streams all of my microphone input to a remote server (even if the intent is honest) would be a website I would never visit again.

richtier · 2018-08-20T20:37:43Z

@Thalhammer good tip. thanks!

My use case is indeed for my internal network. I'm using a browser as the human interface for controlling my smart home - primarily via voice commands. I'm not CIA :)

How would lowering the sample rate help? No matter how small we cut a pizza, we're still left with one pizza.

Nixellion · 2018-08-22T14:00:07Z

@richtier Internal traffic usage is not THAT bad, though it will certainly put much more work on the router. It's about how much it will saturate your network. If you're watching TV over the network, playing games, watching some YouTube and downloading something, then uncompressed audio like that may slow down other things on your network, add some ping or lag, etc. I'm no expert here though.

You can also consider filtering low frequencies, as they transmit through walls and such.

richtier · 2018-08-23T22:47:25Z

@Nixellion that makes sense, thanks.

richtier · 2018-09-18T09:24:53Z

I now do the webaudio to wav encoding client side: ircleci.com/gh/uktrade.

This reduced data transfer by 90%

scredii · 2019-05-21T12:14:47Z

Any update?

Sushantmkarande · 2019-08-01T06:26:06Z

@richtier I am

> As a workaround I stream Html5 webaudio to a Django (python) websever via a websocket and convert the webaudio to wav and feed it into snowboy.

I did not find any HTML code which serves this purpose. Can you please help me with this?

CT83 · 2020-04-05T13:05:05Z

I found this helpful as an alternative.

https://github.com/TalAter/annyang
