Releases · DrewThomasson/ebook2audiobook

25 Dec 08:14

DrewThomasson

2.0

d7aed38

V2.0: Tons of improvements and support for 1,107+ languages! 🤯 Latest


New, improved v2.0 GUI
Easy access to fine-tuned models
Loading bar actually works 🤯
Support for 1,107+ languages! 🤯
Single run/installer script for running locally on Mac, Windows, and Linux
THANK YOU @ROBERT-MCDOWELL

What's Changed

Major update version 2.0.0 by @ROBERT-MCDOWELL in #35
pull attempty into v2.0 by @DrewThomasson in #43
swapped download_xttsv2_model with existing download_and_extract by @DrewThomasson in #44
PR#1 by @ROBERT-MCDOWELL in #45
V2.0 base model downloader patch by @DrewThomasson in #46
Update README.md by @ROBERT-MCDOWELL in #48
Update colab_ebook2audiobookxtts.ipynb by @pafend in #49
renamed split_long_sentence to get_sentences, added punctuation for each language into language_mapping, unused code removed by @ROBERT-MCDOWELL in #50
Added 1162 languages, removed unused code by @ROBERT-MCDOWELL in #51
Last PR before merge to main by @ROBERT-MCDOWELL in #57
Added around 57 more test_ebooks as well as a ebook generator by @DrewThomasson in #58
Add disclaimer to README about DRM and legal use by @DrewThomasson in #61
delete .DS_Store files by @DrewThomasson in #62
added Fairseq supported language list to english readme by @DrewThomasson in #63
Major commit by @ROBERT-MCDOWELL in #64
more fixes by @ROBERT-MCDOWELL in #65
rebuild test files, new tools folder, various typo fixes by @ROBERT-MCDOWELL in #66
Rebuild voices folder tree by @ROBERT-MCDOWELL in #67
regenerate test files, various bug fixes, new resume process implementation by @ROBERT-MCDOWELL in #70
double quotes to simple quotes normalization by @ROBERT-MCDOWELL in #71
various fixes by @ROBERT-MCDOWELL in #73
more fixes by @ROBERT-MCDOWELL in #74
Multiprocessing, multithread, multiuser ready, various changes and fixes by @ROBERT-MCDOWELL in #75
fix model loading, removed unused conf model options by @ROBERT-MCDOWELL in #77
Minor spelling and punctuation corrections by @Bynanaa in #79
fix chapter audio and sentences order, device gpu to cuda by @ROBERT-MCDOWELL in #81
added voice actor voices by @DrewThomasson in #80
implementation of fine-tuned and various fixes by @ROBERT-MCDOWELL in #82
fixed fine-tuned dropdown by @ROBERT-MCDOWELL in #83
fixed chapter combine audio bug. fine-tuned model cache still to solve by @ROBERT-MCDOWELL in #84
Added discord server link to readme by @DrewThomasson in #86
fixed fine-tuned, custom model upload back and working(?) by @ROBERT-MCDOWELL in #87
added ref.wav in custom model upload by @ROBERT-MCDOWELL in #88
custom_model should work now by @ROBERT-MCDOWELL in #89
v2.0 updated gui demo gif by @DrewThomasson in #90
various fixes by @ROBERT-MCDOWELL in #91
added convert to gif script to make it easier to create a new gif for readme by @DrewThomasson in #92
V2.0 update readme improved and added assets folder by @DrewThomasson in #93
added new options in conf.py, custom_model still in dev by @ROBERT-MCDOWELL in #94
updated custom_model now managed by session, fixed various bugs by @ROBERT-MCDOWELL in #97
fix conf inversion, added BobRoss in fine-tuned by @ROBERT-MCDOWELL in #98
added fine tuned models in conf.py, renamed some by @ROBERT-MCDOWELL in #99
optimize audio ffmpeg cmd by @ROBERT-MCDOWELL in #100
Various fixes by @ROBERT-MCDOWELL in #101
fixed crlf to lf unix on app.py, various other important fixes by @ROBERT-MCDOWELL in #104
removed old code by @ROBERT-MCDOWELL in #105
V2.0 update readme and Mac launcher by @DrewThomasson in #106
fix convert_btn and more... by @ROBERT-MCDOWELL in #108
fixed f-string errors by @ROBERT-MCDOWELL in #109
optimizing audio presence by @ROBERT-MCDOWELL in #110
various fixes by @ROBERT-MCDOWELL in #112
V2.0: Tons of improvements and support for 1,107+ languages! 🤯 by @DrewThomasson in #111

New Contributors

@pafend made their first contribution in #49
@Bynanaa made their first contribution in #79

Full Changelog: 1.2.1...2.0

Contributors

ROBERT-MCDOWELL, Bynanaa, and 2 other contributors


11 Oct 20:18

DrewThomasson

1.2.1

9cb2f33

V1.2.1

Fixed custom model loading issue.

What's Changed

chinese readme by @WUYIN66 in #25
Installation with pip in edit mode by @ROBERT-MCDOWELL in #26
Revert "Installation with pip in edit mode" by @DrewThomasson in #27
Merge new Kaggle additions by @DrewThomasson in #29

New Contributors

@WUYIN66 made their first contribution in #25
@ROBERT-MCDOWELL made their first contribution in #26
@DrewThomasson made their first contribution in #27

Full Changelog: 1.2...1.2.1

Contributors

ROBERT-MCDOWELL, WUYIN66, and DrewThomasson


09 Oct 03:30

DrewThomasson

1.2

c68d44e

V1.2

New and improved App

Single app that runs in gui or headless mode
Fixed sentence splitting for all 16 languages

New and Improved Web GUI

Added these parameters for headless mode:

usage: app.py [-h] [--share SHARE] [--headless HEADLESS] [--ebook EBOOK] [--voice VOICE]
              [--language LANGUAGE] [--use_custom_model USE_CUSTOM_MODEL]
              [--custom_model CUSTOM_MODEL] [--custom_config CUSTOM_CONFIG]
              [--custom_vocab CUSTOM_VOCAB] [--custom_model_url CUSTOM_MODEL_URL]
              [--temperature TEMPERATURE] [--length_penalty LENGTH_PENALTY]
              [--repetition_penalty REPETITION_PENALTY] [--top_k TOP_K] [--top_p TOP_P]
              [--speed SPEED] [--enable_text_splitting ENABLE_TEXT_SPLITTING]

Convert eBooks to Audiobooks using a Text-to-Speech model. You can either launch the
Gradio interface or run the script in headless mode for direct conversion.

options:
  -h, --help            show this help message and exit
  --share SHARE         Set to True to enable a public shareable Gradio link. Defaults
                        to False.
  --headless HEADLESS   Set to True to run in headless mode without the Gradio
                        interface. Defaults to False.
  --ebook EBOOK         Path to the ebook file for conversion. Required in headless
                        mode.
  --voice VOICE         Path to the target voice file for TTS. Optional, uses a default
                        voice if not provided.
  --language LANGUAGE   Language for the audiobook conversion. Options: en, es, fr, de,
                        it, pt, pl, tr, ru, nl, cs, ar, zh-cn, ja, hu, ko. Defaults to
                        English (en).
  --use_custom_model USE_CUSTOM_MODEL
                        Set to True to use a custom TTS model. Defaults to False. Must
                        be True to use custom models, otherwise you'll get an error.
  --custom_model CUSTOM_MODEL
                        Path to the custom model file (.pth). Required if using a custom
                        model.
  --custom_config CUSTOM_CONFIG
                        Path to the custom config file (config.json). Required if using
                        a custom model.
  --custom_vocab CUSTOM_VOCAB
                        Path to the custom vocab file (vocab.json). Required if using a
                        custom model.
  --custom_model_url CUSTOM_MODEL_URL
                        URL to download the custom model as a zip file. Optional, but
                        will be used if provided. Examples include David Attenborough's
                        model: 'https://huggingface.co/drewThomasson/xtts_David_Attenbor
                        ough_fine_tune/resolve/main/Finished_model_files.zip?download=tr
                        ue'. More XTTS fine-tunes can be found on my Hugging Face at
                        'https://huggingface.co/drewThomasson'.
  --temperature TEMPERATURE
                        Temperature for the model. Defaults to 0.65. Higher temperatures
                        lead to more creative output, i.e. more hallucinations. Lower
                        temperatures give more monotone output, i.e. fewer
                        hallucinations.
  --length_penalty LENGTH_PENALTY
                        A length penalty applied to the autoregressive decoder. Defaults
                        to 1.0.
  --repetition_penalty REPETITION_PENALTY
                        A penalty that prevents the autoregressive decoder from
                        repeating itself. Defaults to 2.0.
  --top_k TOP_K         Top-k sampling. Lower values mean more likely outputs and
                        increased audio generation speed. Defaults to 50.
  --top_p TOP_P         Top-p sampling. Lower values mean more likely outputs and
                        increased audio generation speed. Defaults to 0.8.
  --speed SPEED         Speed factor for the speech generation, i.e. how fast the
                        narrator will speak. Defaults to 1.0.
  --enable_text_splitting ENABLE_TEXT_SPLITTING
                        Enable splitting text into sentences. Defaults to True.

Example: python app.py --headless True --ebook path_to_ebook --voice path_to_voice
--language en --use_custom_model True --custom_model model.pth --custom_config
config.json --custom_vocab vocab.json
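
The headless flags listed above can also be assembled from Python, which may be convenient when converting several books in a batch. The following is a minimal sketch, not part of the project: `build_headless_command` is a hypothetical helper, the file names are placeholders, and passing the literal string `True` to `--headless` is an assumption based on the usage line (`--headless HEADLESS`) and the "Set to True" wording in the help text.

```python
import shlex
import sys


def build_headless_command(ebook, voice=None, language="en", extra=None):
    """Build an argv list for a headless run of app.py.

    Hypothetical helper for illustration only. Flag names are taken from the
    help text above; passing "True" as the value of --headless is an assumption.
    """
    cmd = [
        sys.executable, "app.py",
        "--headless", "True",
        "--ebook", ebook,
        "--language", language,
    ]
    if voice is not None:
        # --voice is optional; the help text says a default voice is used otherwise.
        cmd += ["--voice", voice]
    # Any further documented option, e.g. {"temperature": 0.65, "speed": 1.0}.
    for flag, value in (extra or {}).items():
        cmd += [f"--{flag}", str(value)]
    return cmd


if __name__ == "__main__":
    cmd = build_headless_command(
        "book.epub",  # placeholder path
        voice="voice.wav",  # placeholder path
        extra={"temperature": 0.65, "speed": 1.0},
    )
    # Print a shell-quoted version for inspection.
    print(shlex.join(cmd))
    # To actually launch the conversion: subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell-quoting problems with paths that contain spaces.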

What's Changed

1.wav missing - change to default_voice.wav by @matthiss in #12

New Contributors

@matthiss made their first contribution in #12

Full Changelog: 1.1...1.2

Contributors

matthiss


22 Feb 07:12

DrewThomasson

1.1

e6dd63f

V1.1

Full Changelog: https://github.com/DrewThomasson/ebook2audiobookXTTS/commits/1.1
