API ‐ Standard TTS Generation API

This endpoint generates Text-to-Speech (TTS) audio from text input. It supports both character and narrator speech generation.

Endpoint Details

URL: http://{ipaddress}:{port}/api/tts-generate
Method: POST
Content-Type: application/x-www-form-urlencoded

Request Parameters

Parameter	Type	Description
`text_input`	string	The text you want the TTS engine to produce.
`text_filtering`	string	Filter for text. Options: `none`, `standard`, `html`
`character_voice_gen`	string	The name of the character's voice file (WAV format).
`rvccharacter_voice_gen`	string	The name of the RVC voice file for the character. Format: `folder\file.pth` or `Disabled`
`rvccharacter_pitch`	integer	The pitch for the RVC voice for the character. Range: -24 to 24
`narrator_enabled`	boolean	Enable or disable the narrator function.
`narrator_voice_gen`	string	The name of the narrator's voice file (WAV format).
`rvcnarrator_voice_gen`	string	The name of the RVC voice file for the narrator. Format: `folder\file.pth` or `Disabled`
`rvcnarrator_pitch`	integer	The pitch for the RVC voice for the narrator. Range: -24 to 24
`text_not_inside`	string	Specify handling of lines not inside quotes or asterisks. Options: `character`, `narrator`, `silent`
`language`	string	Choose the language for TTS. (See supported languages below)
`output_file_name`	string	The name of the output file (excluding the .wav extension).
`output_file_timestamp`	boolean	Add a timestamp to the output file name.
`autoplay`	boolean	Enable or disable playing the generated TTS to your standard sound output device.
`autoplay_volume`	float	Set the autoplay volume. Range: 0.1 to 1.0
`speed`	float	Set the speed of the generated audio. Range: 0.25 to 2.0
`pitch`	integer	Set the pitch of the generated audio. Range: -10 to 10
`temperature`	float	Set the temperature for the TTS engine. Range: 0.1 to 1.0
`repetition_penalty`	float	Set the repetition penalty for the TTS engine. Range: 1.0 to 20.0
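
The ranges and option lists in the table can be checked on the client before a request is sent. The following is a minimal sketch, not part of AllTalk: `NUMERIC_RANGES`, `ALLOWED_OPTIONS` and `check_params` are hypothetical names, and the values are copied from the table above. How the server itself treats out-of-range values is not covered here.

```python
# Numeric ranges copied from the parameter table above.
NUMERIC_RANGES = {
    "rvccharacter_pitch": (-24, 24),
    "rvcnarrator_pitch": (-24, 24),
    "autoplay_volume": (0.1, 1.0),
    "speed": (0.25, 2.0),
    "pitch": (-10, 10),
    "temperature": (0.1, 1.0),
    "repetition_penalty": (1.0, 20.0),
}

# Enumerated options copied from the parameter table above.
ALLOWED_OPTIONS = {
    "text_filtering": {"none", "standard", "html"},
    "text_not_inside": {"character", "narrator", "silent"},
}


def check_params(params):
    """Return a list of problems found; an empty list means nothing was flagged."""
    problems = []
    for name, value in params.items():
        if name in NUMERIC_RANGES:
            low, high = NUMERIC_RANGES[name]
            if not low <= float(value) <= high:
                problems.append(f"{name}={value} is outside {low} to {high}")
        elif name in ALLOWED_OPTIONS:
            if value not in ALLOWED_OPTIONS[name]:
                options = sorted(ALLOWED_OPTIONS[name])
                problems.append(f"{name}={value!r} is not one of {options}")
    return problems
```

Parameters that are not listed in either dictionary (for example `text_input`) are passed through unchecked.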

Supported Languages

Code	Language
`ar`	Arabic
`zh-cn`	Chinese (Simplified)
`cs`	Czech
`nl`	Dutch
`en`	English
`fr`	French
`de`	German
`hi`	Hindi (limited support)
`hu`	Hungarian
`it`	Italian
`ja`	Japanese
`ko`	Korean
`pl`	Polish
`pt`	Portuguese
`ru`	Russian
`es`	Spanish
`tr`	Turkish
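
As a small illustration, the language table can be kept as a lookup on the client side. This is a sketch only: `SUPPORTED_LANGUAGES` and `language_name` are hypothetical names, and lowercasing the code before the lookup is an assumption made here to match the table's spelling, not documented server behavior.

```python
# Codes and names copied from the Supported Languages table above.
SUPPORTED_LANGUAGES = {
    "ar": "Arabic",
    "zh-cn": "Chinese (Simplified)",
    "cs": "Czech",
    "nl": "Dutch",
    "en": "English",
    "fr": "French",
    "de": "German",
    "hi": "Hindi (limited support)",
    "hu": "Hungarian",
    "it": "Italian",
    "ja": "Japanese",
    "ko": "Korean",
    "pl": "Polish",
    "pt": "Portuguese",
    "ru": "Russian",
    "es": "Spanish",
    "tr": "Turkish",
}


def language_name(code):
    """Return the language name for a code, or None if the code is not listed."""
    return SUPPORTED_LANGUAGES.get(code.strip().lower())
```

A listed code can still be unsupported by the TTS engine that is currently loaded, so this check only catches codes that are absent from the table.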

Example Requests

Standard TTS Speech Example

Generate a time-stamped file for standard text. Autoplay is disabled in this example; set `autoplay=true` to also play the audio on the standard sound output device:

curl -X POST "http://127.0.0.1:7851/api/tts-generate" \
     -d "text_input=All of this is text spoken by the character. This is text not inside quotes, though that doesn't matter in the slightest" \
     -d "text_filtering=standard" \
     -d "character_voice_gen=female_01.wav" \
     -d "narrator_enabled=false" \
     -d "narrator_voice_gen=male_01.wav" \
     -d "text_not_inside=character" \
     -d "language=en" \
     -d "output_file_name=myoutputfile" \
     -d "output_file_timestamp=true" \
     -d "autoplay=false" \
     -d "autoplay_volume=0.8"

Narrator Example

Generate a time-stamped file for text with narrator and character speech. Autoplay is disabled in this example; set `autoplay=true` to also play the audio on the standard sound output device:

curl -X POST "http://127.0.0.1:7851/api/tts-generate" \
     -d "text_input=*This is text spoken by the narrator* \"This is text spoken by the character\". This is text not inside quotes." \
     -d "text_filtering=standard" \
     -d "character_voice_gen=female_01.wav" \
     -d "narrator_enabled=true" \
     -d "narrator_voice_gen=male_01.wav" \
     -d "text_not_inside=character" \
     -d "language=en" \
     -d "output_file_name=myoutputfile" \
     -d "output_file_timestamp=true" \
     -d "autoplay=false" \
     -d "autoplay_volume=0.8"

Note: If your text contains double quotes, escape them with \" (see the narrator example).
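
The narrator example marks narrator text with asterisks and character text with double quotes; everything else is handled according to `text_not_inside`. The sketch below previews that split on the client side. It illustrates the convention only and is not AllTalk's own parser; `classify` and the `other` label are hypothetical names.

```python
import re

# An asterisk-wrapped span, a double-quoted span, or any other run of text.
_SEGMENT = re.compile(
    r'\*(?P<narrator>[^*]+)\*|"(?P<character>[^"]+)"|(?P<other>[^*"]+)'
)


def classify(text):
    """Split text into (kind, content) pairs, skipping whitespace-only runs."""
    segments = []
    for match in _SEGMENT.finditer(text):
        kind = match.lastgroup  # name of the alternative that matched
        content = match.group(kind).strip()
        if content:
            segments.append((kind, content))
    return segments
```

Segments labelled `other` are the ones that `text_not_inside` would route to the character voice, the narrator voice, or silence.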

Minimal Request Example

You can send a request with any mix of settings you wish. Missing fields will be populated using default API Global settings and default TTS engine settings:

curl -X POST "http://127.0.0.1:7851/api/tts-generate" \
     -d "text_input=All of this is text spoken by the character. This is text not inside quotes, though that doesn't matter in the slightest"
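
The same minimal request can be built from Python with the standard library alone. This is a sketch under stated assumptions: `build_tts_request` is a hypothetical helper, and the address `http://127.0.0.1:7851` is taken from the curl examples. The request is only constructed here; sending it needs a running server.

```python
import urllib.parse
import urllib.request


def build_tts_request(base_url, **fields):
    """Build (but do not send) a form-encoded POST request for /api/tts-generate."""
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/tts-generate",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


request = build_tts_request(
    "http://127.0.0.1:7851",
    text_input="All of this is text spoken by the character.",
)

# To send it against a running server:
#     with urllib.request.urlopen(request) as response:
#         raw_json = response.read().decode("utf-8")
```

Because `urlencode` handles the escaping, double quotes in `text_input` need no manual backslashes here. Pass boolean settings as the strings `"true"` and `"false"`, as in the curl examples, since Python's `True` would be encoded as `True`.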

Response

The API returns a JSON object with the following properties:

Property	Description
`status`	Indicates whether the generation was successful (`generate-success`) or failed (`generate-failure`).
`output_file_path`	The on-disk location of the generated WAV file.
`output_file_url`	The HTTP location for accessing the generated WAV file for browser playback.
`output_cache_url`	The HTTP location for accessing the generated WAV file as a pushed download.

Example response:

{
    "status": "generate-success",
    "output_file_path": "C:\\text-generation-webui\\extensions\\alltalk_tts\\outputs\\myoutputfile_1704141936.wav",
    "output_file_url": "/audio/myoutputfile_1704141936.wav",
    "output_cache_url": "/audiocache/myoutputfile_1704141936.wav"
}

Note: The response no longer includes the IP address and port number. Prepend them (for example `http://127.0.0.1:7851`) to `output_file_url` and `output_cache_url` in your own software/extension.
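
One way to add the address back is to join the server's base URL with the relative paths from the response. The sketch below assumes the response shape shown in the example above; `absolute_urls` is a hypothetical helper, and the base URL is the one used in the curl examples.

```python
import json
from urllib.parse import urljoin


def absolute_urls(response_text, base_url):
    """Parse the JSON response and return absolute URLs for the generated audio."""
    result = json.loads(response_text)
    if result.get("status") != "generate-success":
        raise RuntimeError(f"TTS generation failed: {result.get('status')}")
    return {
        key: urljoin(base_url, result[key])
        for key in ("output_file_url", "output_cache_url")
    }


sample = (
    '{"status": "generate-success",'
    ' "output_file_url": "/audio/myoutputfile_1704141936.wav",'
    ' "output_cache_url": "/audiocache/myoutputfile_1704141936.wav"}'
)
urls = absolute_urls(sample, "http://127.0.0.1:7851")
```

With the sample above, `urls["output_file_url"]` is `http://127.0.0.1:7851/audio/myoutputfile_1704141936.wav`.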

Additional Notes

All global settings for the API endpoint can be configured within the AllTalk interface under Global Settings > AllTalk API Defaults.
TTS engine-specific settings, such as voices to use or engine parameters, can be set on an engine-by-engine basis in TTS Engine Settings > TTS Engine of your choice.
You can send any of these variables/settings, but the loaded TTS engine only applies those it supports. For example, you can request a TTS generation in Russian, but if the loaded TTS model only supports English, it will only generate English-sounding text-to-speech.
Voices sent in the request must match the voices available in the loaded TTS engine. Requests with voices that don't match will generate nothing and may return an error message.
