FAQ, Quirks & General Questions
This page addresses common questions, quirks, and general information about AllTalk V2.
- Configuration Files
- TTS Engines and Models
- Language Support
- Interface Navigation
- Additional Information
Q: Which are the main configuration files changed by AllTalk TTS?
A: The two main configuration files that are changed are:
- `\alltalk_tts\confignew.json`: This file stores almost all the configuration settings.
- `\alltalk_tts\system\tts_engines\tts_engines.json`: This file stores the currently loaded TTS engine & model, as well as a list of available TTS engines and models.
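To get a quick overview of what these files hold without opening them in an editor, a short script can list their top-level keys. This is only a minimal sketch: the `summarize_config` helper is hypothetical and not part of AllTalk, it assumes the script is run from the folder that contains `alltalk_tts`, and it makes no assumption about which keys the files actually contain.

```python
import json
from pathlib import Path

# Paths taken from the answer above, relative to the folder holding alltalk_tts.
CONFIG_FILES = [
    Path("alltalk_tts") / "confignew.json",
    Path("alltalk_tts") / "system" / "tts_engines" / "tts_engines.json",
]


def summarize_config(path):
    """Return the sorted top-level keys of a JSON file, or None if it is missing.

    Hypothetical helper for illustration; it is not part of AllTalk.
    """
    path = Path(path)
    if not path.is_file():
        return None
    with path.open(encoding="utf-8") as handle:
        data = json.load(handle)
    # Only JSON objects have keys; anything else is reported as an empty list.
    return sorted(data) if isinstance(data, dict) else []


if __name__ == "__main__":
    for config_file in CONFIG_FILES:
        keys = summarize_config(config_file)
        if keys is None:
            print(f"{config_file}: not found")
        else:
            print(f"{config_file}: {len(keys)} top-level keys -> {keys}")
```

The script only reads the files, so it is safe to run against a working installation.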
Q: Where is the Python environment for AllTalk stored?
A: The Python environment for AllTalk is built in the `alltalk_environment` folder. This folder contains all the necessary Python packages and dependencies for running AllTalk.
Q: What are the "start_xxxx" files?
A: The installation process creates several "start_xxxx" batch (for Windows) or shell (for Unix-based systems) files:
- `start_alltalk`: Starts the main AllTalk application
- `start_environment`: Activates the AllTalk Python environment
- `start_finetune`: Starts the finetuning process
- `start_diagnostics`: Generates a diagnostics file
These files are created to make it easier to run different aspects of AllTalk without having to manually activate the Python environment each time.
Q: Can I transfer the `alltalk_environment` folder or "start_xxxx" files between different installations?
A: No, it's not recommended to transfer the `alltalk_environment` folder or the "start_xxxx" files between different installations or disk locations. Doing so can cause issues because these files and folders contain absolute paths specific to their original installation location. If you need to set up AllTalk in a new location, it's best to perform a fresh installation.
Q: How can I completely rebuild the Python environment from scratch?
A: To rebuild the Python environment from scratch:
- Delete the `alltalk_environment` folder entirely.
- Delete all the "start_xxxx" files.
- Run the installation process again (typically by running `atsetup.bat` or the equivalent for your system).
This will create a fresh Python environment and new startup scripts tailored to your current installation location.
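The two deletion steps can also be scripted. The sketch below is an illustration only: `clean_environment` is a hypothetical helper, not something AllTalk ships, and it assumes the "start_xxxx" files sit in the same folder as `alltalk_environment`. It defaults to a dry run that only reports what it would delete.

```python
import shutil
from pathlib import Path


def clean_environment(install_dir, dry_run=True):
    """List (and optionally delete) the environment folder and start_* files.

    Hypothetical helper; assumes the start_* files live directly in install_dir.
    """
    install_dir = Path(install_dir)
    targets = []
    env_dir = install_dir / "alltalk_environment"
    if env_dir.is_dir():
        targets.append(env_dir)
    # Matches start_alltalk, start_environment, start_finetune and
    # start_diagnostics, whatever their extension (.bat or .sh).
    targets.extend(sorted(p for p in install_dir.glob("start_*") if p.is_file()))
    if not dry_run:
        for target in targets:
            if target.is_dir():
                shutil.rmtree(target)
            else:
                target.unlink()
    return targets


if __name__ == "__main__":
    # Dry run: print what would be removed from the current folder.
    for item in clean_environment(".", dry_run=True):
        print("would delete:", item)
```

Once the dry-run output looks right, calling `clean_environment(".", dry_run=False)` performs the deletion, after which the installation process can be run again as described above.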
Q: What are the base features of each TTS engine?
Model | DeepSpeed | Pitch | Speed | RepPen | MultiLang | Streaming | Low VRAM | Temp | Multi Model | Notes |
---|---|---|---|---|---|---|---|---|---|---|
F5-TTS | No | No | Yes | No | *Yes | No | Yes | No | Yes | * |
Parler-TTS | No | No | No | No | No | No | Yes | No | Yes | ** |
Piper | No | No | Yes | No | *No | No | No | No | Yes | *** |
Coqui VITS | No | No | No | No | *No | No | Yes | No | Yes | *** |
Coqui XTTS | Yes | No | Yes | Yes | Yes | Yes | Yes | Yes | Yes | **** |
Notes:
- F5-TTS (*): Voice cloning is supported for Chinese and English only.
- Parler-TTS (**): Likely English-only TTS generation.
- Piper and Coqui VITS (***): Language support depends on the model file loaded.
- Coqui XTTS (****): Supports multiple languages and voice cloning.
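When choosing an engine for a particular feature, the table can be treated as a simple lookup. The following sketch transcribes the Yes/No values from the table into a dictionary and filters on them. The names `COLUMNS`, `ENGINE_FLAGS` and `engines_with` are hypothetical and exist only for this illustration; they are not part of AllTalk.

```python
# Column order follows the feature table above.
COLUMNS = ("DeepSpeed", "Pitch", "Speed", "RepPen", "MultiLang",
           "Streaming", "Low VRAM", "Temp", "Multi Model")

# One letter per column: Y = "Yes", N = "No", transcribed from the table.
# MultiLang entries marked with an asterisk in the table carry the caveats
# given in the notes (for example, language depends on the loaded model).
ENGINE_FLAGS = {
    "F5-TTS":     "NNYNYNYNY",
    "Parler-TTS": "NNNNNNYNY",
    "Piper":      "NNYNNNNNY",
    "Coqui VITS": "NNNNNNYNY",
    "Coqui XTTS": "YNYYYYYYY",
}


def engines_with(*features):
    """Return the engines whose table row says "Yes" for every given feature."""
    indexes = [COLUMNS.index(feature) for feature in features]
    return [
        engine
        for engine, flags in ENGINE_FLAGS.items()
        if all(flags[i] == "Y" for i in indexes)
    ]


if __name__ == "__main__":
    print("Streaming:", engines_with("Streaming"))
    print("Speed + Low VRAM:", engines_with("Speed", "Low VRAM"))
```

An unknown feature name raises `ValueError` from `COLUMNS.index`, which helps catch typos in column names.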
Q: How do I change/set the TTS Engine to XTTS, Piper, VITS, etc.?
A: You can change the TTS Engine in the Gradio interface:
- Go to the "Generate TTS" > "Generate" tab.
- Look for the "Swap TTS Engine" option to change the engine.
- Use the "Load Different Model" option to change the model for the selected engine.
Note: There is also a "Generate Help" tab that provides detailed explanations for this portion of the interface.
Q: How can I find more information about each TTS Engine?
A: The Gradio interface provides detailed information for each TTS Engine:
- Go to the "TTS Engine Settings" tab.
- Select the engine you're interested in.
- For each engine, you'll find:
  - Available settings you can configure
  - An "Engine Information" tab with details about the engine, including the developer's website
  - A "Models Download" area where you can download models for that TTS Engine
  - An "Engine Help" tab with specific information about the engine, including:
    - Where its models are stored
    - How to create voices (if available)
    - Any other relevant information or quirks
Q: What are the basic differences between the TTS Engines?
Engine | Type | Voice Cloning | Resource Usage | Generation Speed | Key Features |
---|---|---|---|---|---|
Coqui XTTS | Neural TTS (GPT-based) | Yes | High | Medium-Fast | Zero-shot voice cloning with just 3s of audio; supports 17 languages; streaming capable (<200ms latency); cross-language voice cloning |
Piper | Neural TTS (VITS-based) | No* | Low | Fast | Optimized for Raspberry Pi; ONNX runtime for efficiency; wide language support (30+ languages); local/offline use; streaming capable |
Parler | Neural TTS | No | Medium | Medium | Natural-language-controlled voice style; 34 built-in voices; high-quality audiobook-style speech; strong prosody control via punctuation |
VITS | Neural TTS | No* | Medium-High | Fast | End-to-end architecture; multi-speaker capabilities (with training); can use external speaker embeddings; HiFiGAN vocoder based |
F5-TTS | Neural TTS (Flow Matching) | Yes | Medium | Medium-Fast | Flow matching technique; multi-style/multi-speaker; chunk inference support; voice chat capabilities |
Note (*): VITS and Piper can support multiple speakers, but this requires full training or fine-tuning rather than zero-shot voice cloning. Please see the developers' websites for details on doing this.
Q: Can AllTalk TTS support Hindi?
A: Yes, but with some limitations:
- The Coqui XTTS engine can process Hindi, but only with the XTTS 2.0.3 model loaded as `apitts`.
- Idiap (which maintains the Coqui TTS engine) is working on updating the tokenizer to improve Hindi support. However, this update is not yet available.
- You can track the progress of this update in this GitHub commit.
Q: Where can I find help on using the Generate TTS interface?
A: In the Gradio interface, navigate to the "Generate TTS" > "Generate" tab. There is a "Generate Help" tab that provides comprehensive explanations for all the options and features available in this section of the interface.
Q: Are there any known quirks or limitations I should be aware of?
A: Yes, here are a few:
- Some languages may have limited support depending on the TTS engine and model you're using.
- Certain features or settings may only be available with specific engines or models.
- Always check the "Engine Help" tab for any engine-specific quirks or limitations.