API ‐ Standard TTS Generation API
This endpoint allows you to generate Text-to-Speech (TTS) audio based on text input. It supports both character and narrator speech generation.
To understand how TTS requests to this endpoint flow through AllTalk V2, please see the flowchart here.
- URL: `http://{ipaddress}:{port}/api/tts-generate`
- Method: `POST`
- Content-Type: `application/x-www-form-urlencoded`
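
As a small illustration of the endpoint details above, the following Python sketch builds the request URL from a host and port. The helper name `build_tts_url` is hypothetical, and the default host and port are simply the values used in the examples on this page (`127.0.0.1` and `7851`); your installation may differ.

```python
def build_tts_url(ipaddress: str = "127.0.0.1", port: int = 7851) -> str:
    """Return the tts-generate endpoint URL for the given host and port."""
    return f"http://{ipaddress}:{port}/api/tts-generate"


# Request details that stay the same for every call to this endpoint
METHOD = "POST"
CONTENT_TYPE = "application/x-www-form-urlencoded"

if __name__ == "__main__":
    print(METHOD, build_tts_url(), CONTENT_TYPE)
```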

Parameter | Type | Description |
---|---|---|
`text_input` | string | The text you want the TTS engine to produce. |
`text_filtering` | string | Filter for text. Options: `none`, `standard`, `html` |
`character_voice_gen` | string | The name of the character's voice file (WAV format). |
`rvccharacter_voice_gen` | string | The name of the RVC voice file for the character. Format: `folder\file.pth` or `Disabled` |
`rvccharacter_pitch` | integer | The pitch for the RVC voice for the character. Range: -24 to 24 |
`narrator_enabled` | boolean | Enable or disable the narrator function. |
`narrator_voice_gen` | string | The name of the narrator's voice file (WAV format). |
`rvcnarrator_voice_gen` | string | The name of the RVC voice file for the narrator. Format: `folder\file.pth` or `Disabled` |
`rvcnarrator_pitch` | integer | The pitch for the RVC voice for the narrator. Range: -24 to 24 |
`text_not_inside` | string | Specify handling of lines not inside quotes or asterisks. Options: `character`, `narrator`, `silent` |
`language` | string | Choose the language for TTS. (See supported languages below) |
`output_file_name` | string | The name of the output file (excluding the .wav extension). |
`output_file_timestamp` | boolean | Add a timestamp to the output file name. |
`autoplay` | boolean | Enable or disable playing the generated TTS to your standard sound output device at the Terminal/Command prompt window. |
`autoplay_volume` | float | Set the autoplay volume. Range: 0.1 to 1.0 |
`speed` | float | Set the speed of the generated audio. Range: 0.25 to 2.0 |
`pitch` | integer | Set the pitch of the generated audio. Range: -10 to 10 |
`temperature` | float | Set the temperature for the TTS engine. Range: 0.1 to 1.0 |
`repetition_penalty` | float | Set the repetition penalty for the TTS engine. Range: 1.0 to 20.0 |
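
The ranges and option lists in the parameter table can be checked on the client before a request is sent. The sketch below is one possible way to do that in Python; `validate_payload`, `NUMERIC_RANGES`, and `CHOICES` are hypothetical names, and treating an empty `text_input` as an error is an assumption rather than documented server behavior.

```python
# Numeric ranges copied from the parameter table: (minimum, maximum), inclusive
NUMERIC_RANGES = {
    "rvccharacter_pitch": (-24, 24),
    "rvcnarrator_pitch": (-24, 24),
    "autoplay_volume": (0.1, 1.0),
    "speed": (0.25, 2.0),
    "pitch": (-10, 10),
    "temperature": (0.1, 1.0),
    "repetition_penalty": (1.0, 20.0),
}

# Allowed values for the option-style string parameters
CHOICES = {
    "text_filtering": {"none", "standard", "html"},
    "text_not_inside": {"character", "narrator", "silent"},
}


def validate_payload(payload: dict) -> list:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    # Assumption: a request without any text is not useful
    if not str(payload.get("text_input", "")).strip():
        problems.append("text_input must not be empty")
    for name, (low, high) in NUMERIC_RANGES.items():
        if name not in payload:
            continue  # missing fields fall back to server-side defaults
        try:
            value = float(payload[name])
        except (TypeError, ValueError):
            problems.append(f"{name} must be a number")
            continue
        if not low <= value <= high:
            problems.append(f"{name}={value} is outside {low} to {high}")
    for name, allowed in CHOICES.items():
        if name in payload and payload[name] not in allowed:
            problems.append(f"{name} must be one of {sorted(allowed)}")
    return problems
```

Calling `validate_payload({"text_input": "Hi", "pitch": 11})` would report the out-of-range pitch, while fields that are absent are skipped.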

Supported languages:

Code | Language |
---|---|
`ar` | Arabic |
`zh-cn` | Chinese (Simplified) |
`cs` | Czech |
`nl` | Dutch |
`en` | English |
`fr` | French |
`de` | German |
`hi` | Hindi (limited support) |
`hu` | Hungarian |
`it` | Italian |
`ja` | Japanese |
`ko` | Korean |
`pl` | Polish |
`pt` | Portuguese |
`ru` | Russian |
`es` | Spanish |
`tr` | Turkish |
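
The language codes listed above can be kept in a lookup table so that a client rejects unknown codes before sending a request. This is only a sketch: `SUPPORTED_LANGUAGES` and `language_name` are hypothetical names, and normalizing the code to lower case is an assumption about how strict the server is.

```python
# Language codes and names copied from the table above
SUPPORTED_LANGUAGES = {
    "ar": "Arabic",
    "zh-cn": "Chinese (Simplified)",
    "cs": "Czech",
    "nl": "Dutch",
    "en": "English",
    "fr": "French",
    "de": "German",
    "hi": "Hindi (limited support)",
    "hu": "Hungarian",
    "it": "Italian",
    "ja": "Japanese",
    "ko": "Korean",
    "pl": "Polish",
    "pt": "Portuguese",
    "ru": "Russian",
    "es": "Spanish",
    "tr": "Turkish",
}


def language_name(code: str) -> str:
    """Return the language name for a code, or raise ValueError if it is not listed."""
    key = code.strip().lower()
    if key not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language code: {code!r}")
    return SUPPORTED_LANGUAGES[key]
```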
Generate a time-stamped file for standard text (set `autoplay=true` to also play the audio at the command prompt/terminal):
```bash
curl -X POST "http://127.0.0.1:7851/api/tts-generate" \
     -d "text_input=All of this is text spoken by the character. This is text not inside quotes, though that doesn't matter in the slightest" \
     -d "text_filtering=standard" \
     -d "character_voice_gen=female_01.wav" \
     -d "narrator_enabled=false" \
     -d "narrator_voice_gen=male_01.wav" \
     -d "text_not_inside=character" \
     -d "language=en" \
     -d "output_file_name=myoutputfile" \
     -d "output_file_timestamp=true" \
     -d "autoplay=false" \
     -d "autoplay_volume=0.8"
```
Generate a time-stamped file for text with narrator and character speech (set `autoplay=true` to also play the audio at the command prompt/terminal):
```bash
curl -X POST "http://127.0.0.1:7851/api/tts-generate" \
     -d "text_input=*This is text spoken by the narrator* \"This is text spoken by the character\". This is text not inside quotes." \
     -d "text_filtering=standard" \
     -d "character_voice_gen=female_01.wav" \
     -d "narrator_enabled=true" \
     -d "narrator_voice_gen=male_01.wav" \
     -d "text_not_inside=character" \
     -d "language=en" \
     -d "output_file_name=myoutputfile" \
     -d "output_file_timestamp=true" \
     -d "autoplay=false" \
     -d "autoplay_volume=0.8"
```
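
To make the roles in the narrator example easier to see, the Python sketch below splits a text into asterisk-wrapped, quoted, and remaining segments, then applies the `text_not_inside` option to the remainder. It is purely illustrative and is not AllTalk's actual parsing logic; `SEGMENT_RE` and `classify_segments` are hypothetical names.

```python
import re

# One alternative per segment type: *narrator*, "character", or anything else
SEGMENT_RE = re.compile(
    r'\*(?P<narrator>[^*]+)\*|"(?P<character>[^"]+)"|(?P<other>[^*"]+)'
)


def classify_segments(text: str, text_not_inside: str = "character") -> list:
    """Return (voice, content) pairs for a text, as an illustration only."""
    segments = []
    for match in SEGMENT_RE.finditer(text):
        kind = match.lastgroup  # name of the alternative that matched
        content = match.group(kind).strip()
        if not content:
            continue  # skip whitespace between segments
        if kind == "other":
            if text_not_inside == "silent":
                continue  # unmarked text is dropped
            kind = text_not_inside  # unmarked text goes to the chosen voice
        segments.append((kind, content))
    return segments


if __name__ == "__main__":
    example = ('*This is text spoken by the narrator* '
               '"This is text spoken by the character". '
               'This is text not inside quotes.')
    for voice, content in classify_segments(example):
        print(f"{voice}: {content}")
```

With `text_not_inside="character"`, the unmarked trailing sentence is attributed to the character voice; with `silent`, it is omitted from the result.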
Note: If your text contains double quotes, escape them with \" (see the narrator example).
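
The backslash escaping in the note above is needed because the text sits inside a double-quoted shell string. When the request is built in code instead, form encoding takes care of quotes and other special characters. The following sketch uses Python's standard `urllib.parse` module to show the encoding round trip; the sample text is taken from the narrator example.

```python
from urllib.parse import parse_qs, urlencode

text = '*This is text spoken by the narrator* "This is text spoken by the character".'

# urlencode percent-encodes quotes and other reserved characters in the form body
body = urlencode({"text_input": text, "narrator_enabled": "true"})
print(body)

# Decoding the body recovers the original text unchanged
decoded = parse_qs(body)
print(decoded["text_input"][0] == text)
```

When calling from curl, the `--data-urlencode` option performs comparable encoding, whereas `-d` sends the value as written.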
You can send a request with any mix of settings you wish. Missing fields will be populated using default API Global settings and default TTS engine settings:
```bash
curl -X POST "http://127.0.0.1:7851/api/tts-generate" \
     -d "text_input=All of this is text spoken by the character. This is text not inside quotes, though that doesn't matter in the slightest"
```
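
Because missing fields fall back to defaults, a client only needs to send the settings it wants to override. The sketch below shows one way to build such a partial payload in Python; `build_payload` is a hypothetical helper, and writing booleans as the strings `true` and `false` follows the curl examples above.

```python
def build_payload(text_input: str, **overrides) -> dict:
    """Build a form payload containing only the settings that were explicitly given."""
    payload = {"text_input": text_input}
    for name, value in overrides.items():
        if value is None:
            continue  # leave the field out so the server default applies
        if isinstance(value, bool):
            value = "true" if value else "false"  # form fields are plain strings
        payload[name] = str(value)
    return payload


if __name__ == "__main__":
    # Only the language and speed are overridden; everything else uses defaults
    print(build_payload("Hello there.", language="en", speed=1.2, pitch=None))
```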
The API returns a JSON object with the following properties:
Property | Description |
---|---|
`status` | Indicates whether the generation was successful (`generate-success`) or failed (`generate-failure`). |
`output_file_path` | The on-disk location of the generated WAV file. |
`output_file_url` | The HTTP location for accessing the generated WAV file for browser playback. |
`output_cache_url` | The HTTP location for accessing the generated WAV file as a pushed download. |
Example response:
```json
{
    "status": "generate-success",
    "output_file_path": "C:\\text-generation-webui\\extensions\\alltalk_tts\\outputs\\myoutputfile_1704141936.wav",
    "output_file_url": "/audio/myoutputfile_1704141936.wav",
    "output_cache_url": "/audiocache/myoutputfile_1704141936.wav"
}
```
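
A client may find it convenient to load this response into a small typed object. The sketch below does so with a Python dataclass; `TTSResult` and `parse_tts_response` are hypothetical names, and defaulting the three path fields to empty strings is an assumption, since this page does not show what a failed response contains.

```python
import json
from dataclasses import dataclass


@dataclass
class TTSResult:
    status: str
    output_file_path: str = ""
    output_file_url: str = ""
    output_cache_url: str = ""

    @property
    def ok(self) -> bool:
        """True when the server reported a successful generation."""
        return self.status == "generate-success"


def parse_tts_response(body: str) -> TTSResult:
    """Parse the JSON response text into a TTSResult."""
    data = json.loads(body)
    return TTSResult(
        status=data.get("status", ""),
        output_file_path=data.get("output_file_path", ""),
        output_file_url=data.get("output_file_url", ""),
        output_cache_url=data.get("output_cache_url", ""),
    )
```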
Note: The response no longer includes the IP address and port number. You will need to add these in your own software/extension.
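
Since the returned URLs are relative, the client has to prepend the server address itself. A minimal Python sketch using the standard library's `urljoin` follows; `absolute_url` is a hypothetical helper, and the base address is the one used in the examples on this page.

```python
from urllib.parse import urljoin


def absolute_url(base_url: str, relative_path: str) -> str:
    """Combine the server address with a relative path from the response."""
    return urljoin(base_url, relative_path)


if __name__ == "__main__":
    print(absolute_url("http://127.0.0.1:7851", "/audio/myoutputfile_1704141936.wav"))
    # prints http://127.0.0.1:7851/audio/myoutputfile_1704141936.wav
```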
- All global settings for the API endpoint can be configured within the AllTalk interface under Global Settings > AllTalk API Defaults.
- TTS engine-specific settings, such as voices to use or engine parameters, can be set on an engine-by-engine basis in TTS Engine Settings > TTS Engine of your choice.
- Although you can send all variables/settings, the loaded TTS engine will only support them if it is capable. For example, you can request a TTS generation in Russian, but if the TTS model that is loaded only supports English, it will only generate English-sounding text-to-speech.
- Voices sent in the request have to match the voices available within the TTS engine loaded. Generation requests where the voices don't match will result in nothing being generated and possibly an error message.
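
Because mismatched voices can result in nothing being generated, a client may want to compare the requested voices against the voices it knows the loaded engine provides before sending a request. The sketch below assumes the caller already has that list; `find_unknown_voices` is a hypothetical helper, and how the list is obtained is outside the scope of this page.

```python
def find_unknown_voices(payload: dict, available_voices) -> list:
    """Return the requested voice names that are not in available_voices."""
    requested = [payload.get("character_voice_gen")]
    # The narrator voice only matters when the narrator is enabled
    if str(payload.get("narrator_enabled", "false")).lower() == "true":
        requested.append(payload.get("narrator_voice_gen"))
    return [voice for voice in requested if voice and voice not in available_voices]


if __name__ == "__main__":
    voices = {"female_01.wav", "male_01.wav"}  # example list supplied by the caller
    request = {
        "character_voice_gen": "female_01.wav",
        "narrator_enabled": "true",
        "narrator_voice_gen": "male_99.wav",
    }
    print(find_unknown_voices(request, voices))
```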

Python example:

```python
import requests

# API endpoint
API_URL = "http://127.0.0.1:7851/api/tts-generate"


# Function to generate TTS
def generate_tts(text, character_voice, narrator_voice=None, language="en", output_file="output", autoplay=False):
    # Prepare the payload; the narrator is enabled only when a narrator voice is given
    payload = {
        "text_input": text,
        "text_filtering": "standard",
        "character_voice_gen": character_voice,
        "narrator_enabled": "true" if narrator_voice else "false",
        "narrator_voice_gen": narrator_voice if narrator_voice else "",
        "text_not_inside": "character",
        "language": language,
        "output_file_name": output_file,
        "output_file_timestamp": "true",
        "autoplay": str(autoplay).lower(),
        "autoplay_volume": "0.8"
    }

    # Send POST request to the API (requests form-encodes the dict passed as data)
    response = requests.post(API_URL, data=payload)

    # Check if the request was successful
    if response.status_code == 200:
        result = response.json()
        if result["status"] == "generate-success":
            print("TTS generated successfully!")
            print(f"File path: {result['output_file_path']}")
            print(f"File URL: {result['output_file_url']}")
            print(f"Cache URL: {result['output_cache_url']}")
        else:
            print("TTS generation failed.")
    else:
        print(f"Error: {response.status_code} - {response.text}")


# Example usage
if __name__ == "__main__":
    text = "Hello, this is a test of the TTS API. *This part is narrated.* \"And this is spoken by a character.\""
    character_voice = "female_01.wav"
    narrator_voice = "male_01.wav"
    generate_tts(text, character_voice, narrator_voice)

# Note: Make sure to replace the API_URL with the correct IP address and port if different from the default
# You can customize the payload further by adding more parameters as needed (e.g., pitch, speed, temperature)
# Error handling can be improved for production use
```
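
As one possible direction for the improved error handling mentioned above, the sketch below wraps the request in a function that raises a single exception type for network errors, HTTP errors, malformed JSON, and failed generations. It uses only the Python standard library so that it has no extra dependencies. `TTSRequestError` and `post_tts` are hypothetical names, the 60-second timeout is an arbitrary choice, and the `opener` parameter exists only so the function can be exercised without a running server.

```python
import json
import urllib.error
import urllib.parse
import urllib.request


class TTSRequestError(Exception):
    """Hypothetical exception raised for any failed TTS request."""


def post_tts(payload, url="http://127.0.0.1:7851/api/tts-generate",
             timeout=60, opener=urllib.request.urlopen):
    """POST a form-encoded payload and return the parsed JSON on success."""
    data = urllib.parse.urlencode(payload).encode("utf-8")
    request = urllib.request.Request(url, data=data, method="POST")
    try:
        with opener(request, timeout=timeout) as response:
            body = response.read().decode("utf-8")
    except urllib.error.HTTPError as exc:  # server answered with an error status
        raise TTSRequestError(f"HTTP {exc.code} from server") from exc
    except (urllib.error.URLError, TimeoutError) as exc:  # unreachable or too slow
        raise TTSRequestError(f"Could not reach server: {exc}") from exc
    try:
        result = json.loads(body)
    except json.JSONDecodeError as exc:
        raise TTSRequestError("Response was not valid JSON") from exc
    if not isinstance(result, dict) or result.get("status") != "generate-success":
        raise TTSRequestError(f"Generation failed: {result!r}")
    return result
```

A caller can then wrap `post_tts(...)` in a single `try`/`except TTSRequestError` block instead of checking status codes and JSON fields separately.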

JavaScript example:

```javascript
// API endpoint
const API_URL = "http://127.0.0.1:7851/api/tts-generate";

// Function to generate TTS
async function generateTTS(text, characterVoice, narratorVoice = null, language = "en", outputFile = "output", autoplay = false) {
    // Prepare the payload
    const payload = new URLSearchParams({
        text_input: text,
        text_filtering: "standard",
        character_voice_gen: characterVoice,
        narrator_enabled: narratorVoice ? "true" : "false",
        narrator_voice_gen: narratorVoice || "",
        text_not_inside: "character",
        language: language,
        output_file_name: outputFile,
        output_file_timestamp: "true",
        autoplay: autoplay.toString(),
        autoplay_volume: "0.8"
    });

    try {
        // Send POST request to the API
        const response = await fetch(API_URL, {
            method: 'POST',
            body: payload,
            headers: {
                'Content-Type': 'application/x-www-form-urlencoded',
            },
        });

        if (!response.ok) {
            throw new Error(`HTTP error! status: ${response.status}`);
        }

        const result = await response.json();

        if (result.status === "generate-success") {
            console.log("TTS generated successfully!");
            console.log(`File path: ${result.output_file_path}`);
            console.log(`File URL: ${result.output_file_url}`);
            console.log(`Cache URL: ${result.output_cache_url}`);
            return result;
        } else {
            console.error("TTS generation failed.");
            return null;
        }
    } catch (error) {
        console.error("Error:", error);
        return null;
    }
}

// Example usage
const text = "Hello, this is a test of the TTS API. *This part is narrated.* \"And this is spoken by a character.\"";
const characterVoice = "female_01.wav";
const narratorVoice = "male_01.wav";

generateTTS(text, characterVoice, narratorVoice)
    .then(result => {
        if (result) {
            // Handle successful generation, e.g., play audio or update UI
        }
    });

// Note: Make sure to replace the API_URL with the correct IP address and port if different from the default
// You can customize the payload further by adding more parameters as needed (e.g., pitch, speed, temperature)
// This example uses async/await for better readability, but you can also use .then() chains if preferred
// Error handling can be improved for production use
// For browser usage, ensure CORS is properly configured on the server side
```