A FastAPI-based Text-to-Speech API service using Piper TTS. This service provides a simple REST API for converting text to speech with streaming capabilities.
Features:
- Text-to-Speech conversion using Piper TTS
- RESTful API with FastAPI
- Support for both full response and streaming audio
- Dockerized deployment
- WAV audio output
Prerequisites:
- Docker and Docker Compose
- ONNX voice model files (voice_model.onnx and voice_model.onnx.json)
- Place your ONNX voice model files in the project root:
voice_model.onnx
voice_model.onnx.json
- Build and run the Docker container:
docker-compose up --build
The API will be available at http://localhost:8000
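A missing model file is an easy mistake to make before the first build. As a minimal sketch, the check below reports which of the two expected files are absent from the project root. The helper name `missing_model_files` is hypothetical and not part of this project; only the two file names come from the steps above.

```python
from pathlib import Path

# File names taken from the setup steps above.
REQUIRED_MODEL_FILES = ("voice_model.onnx", "voice_model.onnx.json")


def missing_model_files(project_root: str = ".") -> list[str]:
    """Return the required model files that are not present in project_root."""
    root = Path(project_root)
    return [name for name in REQUIRED_MODEL_FILES if not (root / name).is_file()]


# Example usage (run from the project root before `docker-compose up --build`):
# missing = missing_model_files()
# if missing:
#     raise SystemExit(f"Missing model files: {', '.join(missing)}")
```

An empty list means both files are in place.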
POST /api/v1/tts
Request body:
{
  "text": "Your text to convert to speech"
}
Response: WAV audio file
POST /api/v1/tts/stream
Request body:
{
  "text": "Your text to convert to speech"
}
Response: Streamed WAV audio
Using curl:
# Full response
curl -X POST "http://localhost:8000/api/v1/tts" \
  -H "Content-Type: application/json" \
  -d '{"text":"Hello, world!"}' \
  --output output.wav

# Streaming response
curl -X POST "http://localhost:8000/api/v1/tts/stream" \
  -H "Content-Type: application/json" \
  -d '{"text":"Hello, world!"}' \
  --output output.wav
Using Python requests:
import requests

# Full response: the whole WAV file arrives in a single response body
response = requests.post(
    "http://localhost:8000/api/v1/tts",
    json={"text": "Hello, world!"},
)
response.raise_for_status()  # fail early instead of saving an error body as audio
with open("output.wav", "wb") as f:
    f.write(response.content)

# Streaming response: write audio chunks to disk as they arrive
response = requests.post(
    "http://localhost:8000/api/v1/tts/stream",
    json={"text": "Hello, world!"},
    stream=True,
)
response.raise_for_status()
with open("output.wav", "wb") as f:
    for chunk in response.iter_content(chunk_size=8192):
        f.write(chunk)
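Since both endpoints return WAV audio, a quick way to confirm that a response really is audio (and not, say, a JSON error body) is to parse it with Python's standard `wave` module. The sketch below is only an illustration; `describe_wav` is a hypothetical helper, not part of this service.

```python
import io
import wave


def describe_wav(data: bytes) -> dict:
    """Parse WAV bytes and return basic format information.

    Raises wave.Error (or EOFError for very short input) if the bytes
    are not a readable WAV file.
    """
    with wave.open(io.BytesIO(data), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": frames / rate if rate else 0.0,
        }


# Example usage with the file saved above:
# with open("output.wav", "rb") as f:
#     print(describe_wav(f.read()))
```

For the streaming endpoint, the frame count in the header may not reflect the full audio if the server writes the header before synthesis finishes. Whether that happens depends on the implementation, which this README does not specify.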
To run the service without Docker:
- Install dependencies:
pip install -r requirements.txt
- Run the service:
uvicorn app.main:app --host 0.0.0.0 --port 8000
Once the service is running, you can access the interactive API documentation at:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
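Both pages are generated from an OpenAPI schema, which FastAPI serves at `/openapi.json` by default. Assuming this service keeps that default, the following sketch lists the available operations from the schema. The helper names `list_operations` and `fetch_schema` are hypothetical.

```python
import json
from urllib.request import urlopen

# Keys under an OpenAPI path item that denote HTTP operations.
HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def list_operations(schema: dict) -> list[str]:
    """Return sorted 'METHOD /path' strings found in an OpenAPI schema dict."""
    operations = []
    for path, item in schema.get("paths", {}).items():
        for key in item:
            if key.lower() in HTTP_METHODS:
                operations.append(f"{key.upper()} {path}")
    return sorted(operations)


def fetch_schema(base_url: str = "http://localhost:8000") -> dict:
    """Download the schema; assumes the default FastAPI /openapi.json route."""
    with urlopen(f"{base_url}/openapi.json", timeout=10) as resp:
        return json.load(resp)


# Example usage (requires the service to be running):
# for op in list_operations(fetch_schema()):
#     print(op)
```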