Coqui Telegram Bot #173
@@ -0,0 +1,36 @@
# client_backend
note for later - and let's add this to linear, this will need to be in docs/
        # Return an AudioSegment object from the audio data
        return AudioSegment.from_wav(io.BytesIO(audio_data))  # type: ignore

    def get_request(self, text: str) -> tuple[str, dict[str, str], dict[str, str]]:
i'd guess this line is what is breaking mypy, you'll need to use typing.Tuple instead of tuple and typing.Dict instead of dict
for compatibility with python 3.8
fixed in other PR
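For context on this thread: subscripting the built-in `tuple` and `dict` types in annotations only became valid at runtime in Python 3.9 (PEP 585), so code that must run on 3.8 needs the `typing` aliases. A minimal sketch of the 3.8-compatible signature, written as a free function; the URL, headers, and body values are placeholders and not the PR's actual request:

```python
from typing import Dict, Tuple


def get_request(text: str) -> Tuple[str, Dict[str, str], Dict[str, str]]:
    """Build (url, headers, body) for a synthesis request.

    All three values below are illustrative placeholders.
    """
    url = "https://example.invalid/synthesize"  # hypothetical endpoint
    headers = {"Content-Type": "application/json"}
    body = {"text": text}
    return url, headers, body
```

The same annotation written as `tuple[str, dict[str, str], dict[str, str]]` raises `TypeError` when the function is defined on Python 3.8, unless annotation evaluation is deferred with `from __future__ import annotations`.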
        continue
    # Concatenate the current chunk and the sentence, and add a period to the end
    proposed_chunk = current_chunk + sentence
    if len(proposed_chunk) > 250:
nit: magic number
fixed in other PR
proposed_chunk = current_chunk + sentence
if len(proposed_chunk) > 250:
    chunks.append(current_chunk.strip())
    current_chunk = sentence + "."
as kian said before, this will need to preserve the correct sentence ending that it was split on
fixed in other PR
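The two chunking nits (the magic number and the lost sentence endings) can be addressed together. The sketch below is one possible shape, not the code from the follow-up PR: `MAX_CHUNK_CHARS` and `split_text` are hypothetical names, and the regex is an assumed way of keeping each sentence's own terminator attached instead of re-appending a period.

```python
import re
from typing import List

# Hypothetical named constant replacing the literal 250 in the diff.
MAX_CHUNK_CHARS = 250


def split_text(text: str, max_chars: int = MAX_CHUNK_CHARS) -> List[str]:
    """Group sentences into chunks of at most max_chars characters.

    Each sentence keeps the punctuation it was split on (., ! or ?).
    A single sentence longer than max_chars is still emitted as its
    own oversized chunk.
    """
    # A run of non-terminator characters followed by its terminators, if any.
    sentences = re.findall(r"[^.!?]+[.!?]*", text)
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        if not sentence.strip():
            continue
        proposed = current + sentence
        if len(proposed) > max_chars and current:
            # Adding this sentence would overflow: flush the current chunk.
            chunks.append(current.strip())
            current = sentence
        else:
            current = proposed
    if current.strip():
        chunks.append(current.strip())
    return chunks
```

Passing a small limit makes the behavior easy to check by hand, for example `split_text("Hello there! How are you? Fine.", max_chars=15)`.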
# Create an aiohttp session and post the request asynchronously using await
async with aiohttp.ClientSession() as session:
    async with session.post(url, headers=headers, json=body) as response:
        assert response.status == 201, (
ok for now, but this sort of assert that is "expected" (not something that should never happen) - should probably be some other error class
added linear followup
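One way the follow-up could look: since a non-201 response is an expected failure mode, and `assert` statements are stripped when Python runs with `-O`, an explicit exception is more robust. The class and function names below are hypothetical, and the check takes a plain status code so the sketch runs without aiohttp.

```python
class SynthesisRequestError(RuntimeError):
    """Hypothetical error raised when the synthesis API rejects a request."""

    def __init__(self, status: int, detail: str = "") -> None:
        super().__init__(f"Synthesis request failed with status {status}: {detail}")
        self.status = status
        self.detail = detail


def check_response_status(status: int, detail: str = "", expected: int = 201) -> None:
    """Raise SynthesisRequestError unless status matches the expected code."""
    if status != expected:
        raise SynthesisRequestError(status, detail)
```

Inside the `async with session.post(...)` block from the diff, this would presumably be called with `response.status` and the response text in place of the assert.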
apps/telegram_bot/main.py
Outdated
) -> None:
    assert update.effective_chat, "Chat must be defined!"
    chat_id = update.effective_chat.id
    if type(self.synthesizer) is not CoquiSynthesizer:
nit: isinstance instead of type(...) is not CoquiSynthesizer, in case in the future we subclass CoquiSynthesizer, for example
done
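To illustrate the nit: an exact `type(...) is` comparison rejects subclasses, while `isinstance` accepts them. The classes below are empty stand-ins for illustration only; `CustomCoquiSynthesizer` is a hypothetical subclass.

```python
class CoquiSynthesizer:
    """Stand-in for the real synthesizer class."""


class CustomCoquiSynthesizer(CoquiSynthesizer):
    """Hypothetical subclass, as in the reviewer's example."""


synthesizer = CustomCoquiSynthesizer()

# Exact type comparison: a subclass instance does not match.
exact_match = type(synthesizer) is CoquiSynthesizer

# isinstance: a subclass instance matches.
instance_match = isinstance(synthesizer, CoquiSynthesizer)
```

Here `exact_match` is `False` and `instance_match` is `True`, so the original check would have sent the "not supported" message for a subclass.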
apps/telegram_bot/main.py
Outdated
if type(self.synthesizer) is not CoquiSynthesizer:
    await context.bot.send_message(
        chat_id=chat_id,
        text="Sorry, voice creation is only supported for Coqui TTS.",
Suggested change:
- text="Sorry, voice creation is only supported for Coqui TTS.",
+ text="Sorry, voice creation is only supported for Coqui.",
since we have a "Coqui TTS" synthesizer which is their OSS thing
fixed
don't see the change
apps/telegram_bot/main.py
Outdated
def get_agent(self, chat_id: int) -> ChatGPTAgent:
    # Get current voice name and description from DB
    _, voice_name, voice_description = self.db[chat_id].get(
        "current_voice", (None, None, None)
nit: we can turn this tuple into a pydantic class:
class Voice(pydantic.BaseModel):
    voice_id...
done
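A rough sketch of the idea behind this nit: replacing the positional `(id, name, description)` tuple with a named record. The PR uses `pydantic.BaseModel`; a standard-library `dataclass` is swapped in here so the example runs without extra dependencies. The field names follow the diff comment ("id, name and description"), while the field types and the `describe` helper are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Voice:
    # Field types are assumptions; the PR's model may differ.
    id: Optional[str] = None
    name: Optional[str] = None
    description: Optional[str] = None


def describe(voice: Voice) -> str:
    """Hypothetical helper: summarize a voice by its named fields."""
    if voice.name is None:
        return "default voice"
    if voice.description is None:
        return voice.name
    return f"{voice.name}: {voice.description}"
```

With named fields, the call site reads `voice.name` and `voice.description` and no longer depends on tuple order or a `(None, None, None)` default.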
apps/telegram_bot/main.py
Outdated
chat_id = update.effective_chat.id
user_voices = self.db[chat_id]["voices"]  # array (id, name, description)
# Make string table of id, name, description
voices = "\n".join(
nit
Suggested change:
- voices = "\n".join(
+ voices_formatted = "\n".join(
done
apps/telegram_bot/main.py
Outdated
- Use /help to see this help message again.
"""
assert update.effective_chat, "Chat must be defined!"
if type(self.synthesizer) is CoquiSynthesizer:
nit isinstance
done
looks great! nice work
) -> None:
    self.transcriber = transcriber
    self.system_prompt = system_prompt
    self.synthesizer = synthesizer
    self.db = ChatsDB(db if db else {})
    self.db: Dict[int, Chat] = defaultdict(Chat)
nice, very clean
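A small sketch of why `defaultdict(Chat)` handles unseen chat ids cleanly: the first lookup of a new id constructs a default `Chat`, so handlers need no "does this user exist" branch. Dataclasses stand in for the PR's pydantic models here so the example is dependency-free, and the `Chat` fields shown are assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Voice:
    id: Optional[str] = None
    name: Optional[str] = None
    description: Optional[str] = None


@dataclass
class Chat:
    # Assumed per-chat state: the known voices and the active one.
    voices: List[Voice] = field(default_factory=list)
    current_voice: Voice = field(default_factory=Voice)


db: Dict[int, Chat] = defaultdict(Chat)

# First access to chat 42 creates a fresh Chat automatically.
db[42].voices.append(Voice(id="v1", name="Narrator"))
```

Note that indexing with `db[chat_id]` inserts an entry as a side effect, whereas `chat_id in db` and `db.get(chat_id)` do not.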
# Initialize an empty dictionary to store user data
self.db = db
# Define a Voice model with id, name and description fields
class Voice(BaseModel):
👍
@ajar98 Fixed the branding issue and merged again. Good to go from my end.
* Add async synthesize, xtts, and prompt to coqui TB
* add speechrecognition and aiohttp dependencies
* add optional memory arg to turn-based ChatGPTAgent
* add coqui telegram bot
* fix mypy issue
* pr feedback
* Rename defaultdict
* fix py3.8 typing issue
* another py3.8 fix
* [broken] pydantic progress
* use pydantic and defaultdict
* more nit fixes
* Fix Coqui branding
* fix type error
Prompt-To-Voice Telegram Bot
The bot (based on albertwujj's work) uses the python-telegram-bot library to handle user messages and commands, the WhisperTranscriber class to transcribe voice messages from users, and the ChatGPTAgent class to generate text responses based on a system prompt and the user input. The system prompt is customized with the name and description of the current voice. The bot also allows the user to select or create different voices using Coqui TTS's voice creation APIs.

The bot supports the following commands:
- /start: Initializes the user data and sends a welcome message.
- /voice <voice_id>: Changes the current voice to the one with the given id and resets the conversation. The voice id must be an integer corresponding to one of the available voices.
- /create <voice_description>: Creates a new Coqui TTS voice from a text prompt and switches to it. The voice description must be a string that describes how the voice should sound.
- /list: Lists all the available voices with their ids, names, and descriptions (if any).
- /who: Shows the name and description (if any) of the current voice.
- /help: Shows a help message with all the available commands.

TODO: Determine whether the InMemoryDB wrapper is the best way to handle non-existent users or if there is a better alternative.
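
As a rough illustration of the command format described above, the sketch below splits a message such as `/voice 3` into a command and its argument and validates the integer id that `/voice` requires. `parse_command` and `parse_voice_id` are hypothetical helpers and not part of the bot; in python-telegram-bot itself, command arguments are available to handlers through `context.args`.

```python
from typing import Optional, Tuple


def parse_command(message: str) -> Tuple[str, Optional[str]]:
    """Split '/command argument text' into (command, argument or None)."""
    command, _, argument = message.strip().partition(" ")
    argument = argument.strip()
    return command.lower(), argument or None


def parse_voice_id(argument: Optional[str]) -> Optional[int]:
    """Return the voice id as an int, or None if it is missing or invalid."""
    if argument is None:
        return None
    try:
        return int(argument)
    except ValueError:
        return None
```

A handler for `/voice` could use a `None` result from `parse_voice_id` to reply with a usage hint instead of switching voices.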