[Feedback] Voice Calls (alpha) #175
Comments
Suggestion: Suno Bark, an open-source alternative to the ElevenLabs API for speech synthesis.
Info about the free and open source speech synthesis model Bark: |
That's my feedback; hope it's useful. Keep up the good work with big-agi. I use it every day! |
See also #175. This accomplishes a similar function in an elegant way.
None of the voice features work in Brave. |
Yes, sadly Brave does not support the Web Speech API for voice input. |
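For anyone who wants to check their own browser, here is a minimal feature-detection sketch. It assumes the Web Speech API is exposed as either `SpeechRecognition` or the prefixed `webkitSpeechRecognition` on the global object. The function name `detectSpeechRecognition` and the `SpeechGlobals` type are made up for illustration and are not part of big-AGI. Note that finding the constructor only shows the API is exposed; recognition may still fail at runtime in browsers that do not ship a speech backend.

```typescript
// Hypothetical helper, for illustration only (not big-AGI code).
type SpeechGlobals = {
  SpeechRecognition?: unknown;
  webkitSpeechRecognition?: unknown;
};

type SpeechSupport = 'standard' | 'webkit' | 'unsupported';

// Reports which speech recognition constructor, if any, is exposed.
function detectSpeechRecognition(globals: SpeechGlobals): SpeechSupport {
  if (typeof globals.SpeechRecognition === 'function') return 'standard';
  if (typeof globals.webkitSpeechRecognition === 'function') return 'webkit';
  return 'unsupported';
}
```

In a browser you would pass `window` (cast to `SpeechGlobals`) and show a warning when the result is `'unsupported'`.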
Having issues in Firefox (on Mac). While I activated speech recognition in the browser settings, it does not seem to work: I talk but I get no reaction from the AI. |
I think the voice feature is great, and I've really been looking for something like this, but it would be best if it really worked like a phone call. Right now it keeps chiming when "listening", which is annoying and disruptive to the conversation, especially in hands-free mode (as opposed to push-to-talk). I've seen this done in other browser-based chats, where it works more like a stream listening to the microphone.
To get rid of the sound loop where the AI hears itself speaking and responds to itself, I've seen implementations that shut off the microphone while the computer is speaking, until the sound has stopped playing. In the case I'm talking about, voxta.ai, the microphone icon turns red with a slash to show that it's not listening while the AI is speaking. This stops the sound loop, so it even works without headphones. The implementation on voxta.ai works smoothly: you can go back and forth like with Pi AI or the ChatGPT conversation mode. It's really cool. They also have a setting, meant to be used with headphones, that keeps it listening all the time, so you can interrupt the AI even while it's speaking.
If you could get the conversation mode to work more like either of those, this would be the killer app: you get to pick the LLM you want, you get to customize things, and you can have seamless two-way conversations with just about any LLM there is, especially with all the choices on something like openrouter.ai. It would be very, very cool to have smooth conversations with just about any LLM out there using your software. It'd really get to be like the movie Her. Great job on this software!
One other thing: as it's implemented now, when in a "call" it didn't consistently play the speech responses. It was hit or miss: sometimes it would speak what the AI was saying and other times it wouldn't. It always displayed the response, but every other time it didn't speak it... |
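The mute-while-speaking idea from the comment above can be sketched as a small gate that drops microphone input while speech is playing. This is only an illustration under assumptions: the class `HalfDuplexGate`, its methods, and the `allowInterrupt` flag are hypothetical names, and this is not how big-AGI or voxta.ai actually implement it.

```typescript
// Hypothetical sketch, not big-AGI's or voxta.ai's actual implementation.
type CallState = 'listening' | 'speaking';

class HalfDuplexGate {
  private state: CallState = 'listening';
  private readonly allowInterrupt: boolean;

  // allowInterrupt = true models a "headphones" mode where the mic stays open.
  constructor(allowInterrupt: boolean = false) {
    this.allowInterrupt = allowInterrupt;
  }

  // Call these when speech playback starts and ends.
  onPlaybackStart(): void {
    this.state = 'speaking';
  }

  onPlaybackEnd(): void {
    this.state = 'listening';
  }

  // True when microphone input should be processed.
  get micOpen(): boolean {
    return this.state === 'listening' || this.allowInterrupt;
  }

  // Returns the transcript to forward to the model, or null if it is dropped.
  acceptTranscript(text: string): string | null {
    if (!this.micOpen) return null; // assistant is speaking: ignore its own echo
    const trimmed = text.trim();
    return trimmed.length > 0 ? trimmed : null;
  }
}
```

With the default settings, anything heard between `onPlaybackStart()` and `onPlaybackEnd()` is discarded, which is what prevents the assistant from answering itself over speakers. Constructing the gate with `true` keeps the microphone open during playback, which only makes sense with headphones.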
This is very nifty, and almost anyone can set it up (as long as they use Google Chrome on desktop). But... any niceness gets erased when you have a great or funny conversation that's almost impossible to repeat, that you want to screenshot or record... and then you resize the window only to run into this:
Yes, resizing the window too. Really?!
Come on!!! What the hell? |
Instructions and feedback thread for Voice Calls in big-AGI.
1. Start a Voice call
There are two ways of initiating a Voice Call from an existing chat:
2. System Check
Make sure all the checks are green, or try to resolve the issues before proceeding. This wizard will only be shown
the first time, unless the issues persist.
3. Call Options
During a call, you can switch "Push To Talk" on or off. If active (default), the microphone button needs to be
pushed before speaking. This is best for avoiding echoes and other ambient noise.
Note - you can also say the following commands during a call. These single words will be interpreted as system commands:
Known limitations:
🙌
Looking forward to your feedback to prioritize the right integration and development!