Major update version 2.0.0 #35

Merged
merged 121 commits into from
Nov 11, 2024

Conversation

ROBERT-MCDOWELL
Collaborator

@ROBERT-MCDOWELL ROBERT-MCDOWELL commented Oct 22, 2024

This PR is a big code refactoring: it fixes bugs and redundant, non-optimized code, renames folders for a more intuitive and scalable process, and embeds calibre and ffmpeg into a new Docker image to avoid never-ending version conflicts between the OS Python and calibre, ffmpeg, and the AI libraries.

This PR keeps backward compatibility with the previous version of ebook2audiobook, meaning you can still run it from the command line and through the gradio app.py as long as the OS Python is compatible. Running with Docker is unchanged too.

New directories and files:

  • ebooks: default input folder
  • audiobooks: default output folder
  • voices: default voice wav files for cloning, divided into 4 subfolders (elder, adult, teen, child), each containing female and male folders, which in turn contain one folder per language in ISO two-letter style, like "en" for English.
  • lib: folder containing conf.py (default config globals), lang.py (default lang globals), functions.py
  • models: default folder for downloaded models
  • tmp: folder where all conversions are processed; its contents are deleted once a conversion is finished
  • install.sh: installer for Linux/Mac to run in native mode
  • install.bat: Windows installer calling install.ps1 to run in native mode
  • pyproject.toml, setup.py, requirements.txt: files needed to install ebook2audiobook with pip, which is run by install.sh or install.bat
  • ebook2audiobook.cmd, ebook2audiobook.sh: scripts to run in native mode with the usual options (they replace calling app.py directly)
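
As an illustration of the voices layout described in the list above, here is a minimal sketch of how a path to a voice folder could be built. The function name `voice_dir` and the exact nesting order (age group, then gender, then language) are assumptions read from the description, not code taken from this PR.

```python
from pathlib import Path

# Folder names taken from the PR description.
AGE_GROUPS = ("elder", "adult", "teen", "child")
GENDERS = ("female", "male")


def voice_dir(root, age, gender, language):
    """Return voices/<age>/<gender>/<language> under root (hypothetical helper)."""
    if age not in AGE_GROUPS:
        raise ValueError(f"unknown age group: {age!r}")
    if gender not in GENDERS:
        raise ValueError(f"unknown gender: {gender!r}")
    # Languages are ISO two-letter codes such as "en".
    if len(language) != 2 or not language.isalpha():
        raise ValueError(f"expected a two-letter language code, got {language!r}")
    return Path(root) / "voices" / age / gender / language.lower()


print(voice_dir(".", "adult", "female", "en"))
```

Validating the three components up front keeps a typo in a folder name from silently pointing at a directory that does not exist.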

New options and behaviors:

  • all options requiring True or False can now be used without a Boolean value, in which case they are treated as True, e.g. --headless without True or False is accepted as True
  • --device: force the use of cpu or gpu; if gpu is not available, cpu is used
  • --ebooks_dir: batch conversion by providing a folder rather than a single file with --ebook
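
The option behaviors listed above can be sketched with `argparse` roughly as follows. This is only a sketch under assumptions: the helper names `str2bool`, `build_parser`, and `resolve_device` are hypothetical, and the real parser in app.py may be organized differently.

```python
import argparse


def str2bool(value):
    """Convert an explicit 'True'/'False' string from the command line to a bool."""
    text = str(value).strip().lower()
    if text in ("true", "1", "yes"):
        return True
    if text in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected True or False, got {value!r}")


def build_parser():
    parser = argparse.ArgumentParser(prog="ebook2audiobook")
    # nargs="?" makes the value optional; const=True is used when the flag
    # is given without a value, so "--headless" alone means True.
    parser.add_argument("--headless", nargs="?", const=True, default=False, type=str2bool)
    parser.add_argument("--device", choices=("cpu", "gpu"), default="cpu")
    parser.add_argument("--ebook", help="path to a single ebook file")
    parser.add_argument("--ebooks_dir", help="folder of ebooks for batch conversion")
    return parser


def resolve_device(requested, gpu_available):
    """Fall back to cpu when gpu is requested but not available."""
    return "gpu" if requested == "gpu" and gpu_available else "cpu"


args = build_parser().parse_args(["--headless", "--device", "gpu", "--ebooks_dir", "ebooks"])
print(args.headless, resolve_device(args.device, gpu_available=False))
```

With this shape, `--headless`, `--headless True`, and `--headless False` are all accepted, and omitting the flag leaves it False.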

Example usage:
Windows:
headless:
./ebook2audiobook.cmd --headless --ebook 'path_to_ebook' --voice 'path_to_voice' --language en --use_custom_model --custom_model 'model.zip' --custom_config config.json --custom_vocab vocab.json
Graphic Interface:
./ebook2audiobook.cmd
Linux/Mac:
headless:
./ebook2audiobook.sh --headless --ebook 'path_to_ebook' --voice 'path_to_voice' --language en --use_custom_model --custom_model 'model.zip' --custom_config config.json --custom_vocab vocab.json
Graphic Interface:
./ebook2audiobook.sh

@DrewThomasson
Suggestion about the repo name: why not rename ebook2audiobookXTTS to ebook2audiobook?

@DrewThomasson
Owner

DrewThomasson commented Oct 22, 2024

Edit: Added a Fix for the functions.py file that fixes this issue by ensuring the root_dir exists

Fix for this issue Seen here ---> Fix

When running the app.py command it doesn't appear to be automatically creating the /audiobooks/ folder?

🤔

as you can see here

(ebook2audiobook) drew@wmughal-CN4D09397T ebook2audiobook % python app.py
Running on local URL: http://localhost:7860
Running on local URL:  http://0.0.0.0:7860

To create a public link, set `share=True` in `launch()`.
/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/gradio/analytics.py:106: UserWarning: IMPORTANT: You are using gradio version 4.44.0, however version 4.44.1 is available, please upgrade. 
--------
  warnings.warn(
Traceback (most recent call last):
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/gradio/queueing.py", line 536, in process_events
    response = await route_utils.call_process_api(
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/gradio/route_utils.py", line 322, in call_process_api
    output = await app.get_blocks().process_api(
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/gradio/blocks.py", line 1935, in process_api
    result = await self.call_function(
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/gradio/blocks.py", line 1520, in call_function
    prediction = await anyio.to_thread.run_sync(  # type: ignore
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
    return await future
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
    result = context.run(func, *args)
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/gradio/utils.py", line 826, in wrapper
    response = f(*args, **kwargs)
  File "/Users/drew/Downloads/ebook2audiobook/lib/functions.py", line 1081, in process_conversion
    status, audiobook_file = convert_ebook(args, ui_needed)
  File "/Users/drew/Downloads/ebook2audiobook/lib/functions.py", line 812, in convert_ebook
    delete_old_web_folders(audiobooks_dir)
  File "/Users/drew/Downloads/ebook2audiobook/lib/functions.py", line 889, in delete_old_web_folders
    for folder_name in os.listdir(root_dir):
FileNotFoundError: [Errno 2] No such file or directory: '/Users/drew/Downloads/ebook2audiobook/audiobooks'
Traceback (most recent call last):
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/gradio/queueing.py", line 536, in process_events
    response = await route_utils.call_process_api(
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/gradio/route_utils.py", line 322, in call_process_api
    output = await app.get_blocks().process_api(
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/gradio/blocks.py", line 1935, in process_api
    result = await self.call_function(
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/gradio/blocks.py", line 1520, in call_function
    prediction = await anyio.to_thread.run_sync(  # type: ignore
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
    return await future
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
    result = context.run(func, *args)
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/gradio/utils.py", line 826, in wrapper
    response = f(*args, **kwargs)
  File "/Users/drew/Downloads/ebook2audiobook/lib/functions.py", line 1081, in process_conversion
    status, audiobook_file = convert_ebook(args, ui_needed)
  File "/Users/drew/Downloads/ebook2audiobook/lib/functions.py", line 818, in convert_ebook
    nlp = spacy.load(language + '_core_web_sm')
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/spacy/__init__.py", line 51, in load
    return util.load_model(
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/spacy/util.py", line 472, in load_model
    raise IOError(Errors.E050.format(name=name))
OSError: [E050] Can't find model 'en_core_web_sm'. It doesn't seem to be a Python package or a valid path to a data directory.
Traceback (most recent call last):
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/gradio/queueing.py", line 536, in process_events
    response = await route_utils.call_process_api(
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/gradio/route_utils.py", line 322, in call_process_api
    output = await app.get_blocks().process_api(
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/gradio/blocks.py", line 1935, in process_api
    result = await self.call_function(
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/gradio/blocks.py", line 1520, in call_function
    prediction = await anyio.to_thread.run_sync(  # type: ignore
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
    return await future
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
    result = context.run(func, *args)
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/gradio/utils.py", line 826, in wrapper
    response = f(*args, **kwargs)
  File "/Users/drew/Downloads/ebook2audiobook/lib/functions.py", line 1081, in process_conversion
    status, audiobook_file = convert_ebook(args, ui_needed)
  File "/Users/drew/Downloads/ebook2audiobook/lib/functions.py", line 812, in convert_ebook
    delete_old_web_folders(audiobooks_dir)
  File "/Users/drew/Downloads/ebook2audiobook/lib/functions.py", line 889, in delete_old_web_folders
    for folder_name in os.listdir(root_dir):
FileNotFoundError: [Errno 2] No such file or directory: '/Users/drew/Downloads/ebook2audiobook/audiobooks'
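
For reference, the kind of fix mentioned at the top of this comment (ensuring `root_dir` exists before listing it) could look roughly like the sketch below. Only the function name, the `root_dir` argument, the `os.listdir` loop, and the `web-*` folder prefix come from this thread; the age threshold and the rest of the body are assumptions, not the actual code in lib/functions.py.

```python
import os
import shutil
import time


def delete_old_web_folders(root_dir, max_age_seconds=24 * 3600):
    """Remove stale web-* session folders under root_dir (sketch, not the real code)."""
    # Create the folder first so os.listdir() cannot raise FileNotFoundError.
    os.makedirs(root_dir, exist_ok=True)
    cutoff = time.time() - max_age_seconds  # assumed retention window
    removed = []
    for folder_name in os.listdir(root_dir):
        folder_path = os.path.join(root_dir, folder_name)
        is_web_folder = folder_name.startswith("web-") and os.path.isdir(folder_path)
        if is_web_folder and os.path.getmtime(folder_path) < cutoff:
            shutil.rmtree(folder_path)
            removed.append(folder_name)
    return removed
```

`os.makedirs(..., exist_ok=True)` is a no-op when the folder is already there, so the call is safe on every run.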

@DrewThomasson
Owner

DrewThomasson commented Oct 22, 2024

It's decided: on the merge I'll be renaming the repo from ebook2audiobookxtts ---> ebook2audiobook and using the version name version 2.0

And I'll keep the v1.0 code in a repo under the old `ebook2audiobookxtts` repo name lol, that'll also point to the new one as the updated version for Version 2.0+

This Is to fix this error I was getting on my end lol

- Now it runs just fine on my m1 16gb 2021 Mac 😄 

```bash
(ebook2audiobook) drew@wmughal-CN4D09397T ebook2audiobook % python app.py
Running on local URL: http://localhost:7860
Running on local URL:  http://0.0.0.0:7860

To create a public link, set `share=True` in `launch()`.
/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/gradio/analytics.py:106: UserWarning: IMPORTANT: You are using gradio version 4.44.0, however version 4.44.1 is available, please upgrade. 
--------
  warnings.warn(
Traceback (most recent call last):
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/gradio/queueing.py", line 536, in process_events
    response = await route_utils.call_process_api(
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/gradio/route_utils.py", line 322, in call_process_api
    output = await app.get_blocks().process_api(
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/gradio/blocks.py", line 1935, in process_api
    result = await self.call_function(
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/gradio/blocks.py", line 1520, in call_function
    prediction = await anyio.to_thread.run_sync(  # type: ignore
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
    return await future
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
    result = context.run(func, *args)
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/gradio/utils.py", line 826, in wrapper
    response = f(*args, **kwargs)
  File "/Users/drew/Downloads/ebook2audiobook/lib/functions.py", line 1081, in process_conversion
    status, audiobook_file = convert_ebook(args, ui_needed)
  File "/Users/drew/Downloads/ebook2audiobook/lib/functions.py", line 812, in convert_ebook
    delete_old_web_folders(audiobooks_dir)
  File "/Users/drew/Downloads/ebook2audiobook/lib/functions.py", line 889, in delete_old_web_folders
    for folder_name in os.listdir(root_dir):
FileNotFoundError: [Errno 2] No such file or directory: '/Users/drew/Downloads/ebook2audiobook/audiobooks'
Traceback (most recent call last):
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/gradio/queueing.py", line 536, in process_events
    response = await route_utils.call_process_api(
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/gradio/route_utils.py", line 322, in call_process_api
    output = await app.get_blocks().process_api(
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/gradio/blocks.py", line 1935, in process_api
    result = await self.call_function(
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/gradio/blocks.py", line 1520, in call_function
    prediction = await anyio.to_thread.run_sync(  # type: ignore
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
    return await future
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
    result = context.run(func, *args)
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/gradio/utils.py", line 826, in wrapper
    response = f(*args, **kwargs)
  File "/Users/drew/Downloads/ebook2audiobook/lib/functions.py", line 1081, in process_conversion
    status, audiobook_file = convert_ebook(args, ui_needed)
  File "/Users/drew/Downloads/ebook2audiobook/lib/functions.py", line 818, in convert_ebook
    nlp = spacy.load(language + '_core_web_sm')
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/spacy/__init__.py", line 51, in load
    return util.load_model(
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/spacy/util.py", line 472, in load_model
    raise IOError(Errors.E050.format(name=name))
OSError: [E050] Can't find model 'en_core_web_sm'. It doesn't seem to be a Python package or a valid path to a data directory.
Traceback (most recent call last):
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/gradio/queueing.py", line 536, in process_events
    response = await route_utils.call_process_api(
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/gradio/route_utils.py", line 322, in call_process_api
    output = await app.get_blocks().process_api(
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/gradio/blocks.py", line 1935, in process_api
    result = await self.call_function(
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/gradio/blocks.py", line 1520, in call_function
    prediction = await anyio.to_thread.run_sync(  # type: ignore
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
    return await future
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
    result = context.run(func, *args)
  File "/Users/drew/miniconda3/envs/ebook2audiobook/lib/python3.10/site-packages/gradio/utils.py", line 826, in wrapper
    response = f(*args, **kwargs)
  File "/Users/drew/Downloads/ebook2audiobook/lib/functions.py", line 1081, in process_conversion
    status, audiobook_file = convert_ebook(args, ui_needed)
  File "/Users/drew/Downloads/ebook2audiobook/lib/functions.py", line 812, in convert_ebook
    delete_old_web_folders(audiobooks_dir)
  File "/Users/drew/Downloads/ebook2audiobook/lib/functions.py", line 889, in delete_old_web_folders
    for folder_name in os.listdir(root_dir):
FileNotFoundError: [Errno 2] No such file or directory: '/Users/drew/Downloads/ebook2audiobook/audiobooks'
```
@DrewThomasson
Owner

@ROBERT-MCDOWELL

We'll have to add these changes to the readme file 😅

@ROBERT-MCDOWELL
Collaborator Author

audiobooks should be already existing since it's part of the repo folder :D

@DrewThomasson
Owner

audiobooks should be already existing since it's part of the repo folder :D

mmmmmm, nope? it doesn't?

I mean either way my code edit fixed that so should be fine without us doing anything else on that side lol

@ROBERT-MCDOWELL
Collaborator Author

ok it's root_dir from delete_old_web_folders(audiobooks_dir) which is the root_dir of all web-* subfolders created for the web interface to avoid process collision between users....
audiobooks_dir is set in conf.py :-\

@ROBERT-MCDOWELL
Collaborator Author

ROBERT-MCDOWELL commented Oct 22, 2024

I should add it as a global in the function calling delete_old_web_folders() to be sure
Updated: added to the PR.

On my repo I modified the ABOUT text to:
Convert multiple languages ebooks to audiobooks with chapters and metadata using dynamic A.I. models and voice cloning.

maybe it can inspire you ;)

@ROBERT-MCDOWELL
Collaborator Author

ROBERT-MCDOWELL commented Nov 26, 2024

later please.. I'll do it once v2.0.0 is merged. I'm too busy for now.... the issue is that you pull my repo, which is a fork, so now when you merge my PR it comes back as an update to my repo, overwriting all my files!... idk how you can remove this behavior but it's very annoying.... I code on inspiration, and knowing myself, when inspiration is gone it's very hard to continue.... so....

@DrewThomasson
Owner

You're good!

Take all the time you need; there's no rush. :/

In the future, I'll be excessively careful. Going forward, I'll only submit pull requests to your branch when you're working on ebook2audiobook.

And to be extra careful, I'll share changes as suggestions to your fork without performing any actual git merges when we're both working on the project.

Your pull requests and forks will always take top priority.

@ROBERT-MCDOWELL
Collaborator Author

Today I could remember about half of what I did yesterday, not perfect but better than nothing.
Did you pull my repo at least once? If yes, that's why there is a push/pull loop.

@DrewThomasson
Owner

Related to pulling your repo?

No, other than accepting the pull requests you sent me

The only files I've modified were

Via git merges you have also accepted:

  • README.md

But I did modify the ..... config file... at the latest when just updating the download link.....
shit....I didn't do the double checking with you on the merge there...

is that what probably caused it?

80a9f56

@DrewThomasson
Owner

DrewThomasson commented Nov 26, 2024

shit it was probably caused by this wasn't it...

80a9f56

cause the commit before that was this
2ada95a

And that one is just accepting your merge....

@ROBERT-MCDOWELL
Collaborator Author

it could come from my laptop's hard drive too, or the PR was too big and created a mess, idk

@DrewThomasson
Owner

DrewThomasson commented Nov 27, 2024

Oh I think I was having an issue with one of these in my testing when trying to run it on a gpu

at around the 50% mark it ran into this error

Traceback (most recent call last):
  File "/home/user/app/lib/functions.py", line 721, in convert_chapters_to_audio
    tts.tts_with_vc_to_file(
  File "/home/user/app/TTS/api.py", line 455, in tts_with_vc_to_file
    wav = self.tts_with_vc(
  File "/home/user/app/TTS/api.py", line 415, in tts_with_vc
    self.tts_to_file(
  File "/home/user/app/TTS/api.py", line 334, in tts_to_file
    wav = self.tts(
  File "/home/user/app/TTS/api.py", line 276, in tts
    wav = self.synthesizer.tts(
  File "/home/user/app/TTS/utils/synthesizer.py", line 398, in tts
    outputs = synthesis(
  File "/home/user/app/TTS/tts/utils/synthesis.py", line 221, in synthesis
    outputs = run_model_torch(
  File "/home/user/app/TTS/tts/utils/synthesis.py", line 53, in run_model_torch
    outputs = _func(
  File "/home/user/.pyenv/versions/3.10.15/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/home/user/app/TTS/tts/models/vits.py", line 1157, in inference
    attn = generate_path(w_ceil.squeeze(1), attn_mask.squeeze(1).transpose(1, 2))
IndexError: Dimension out of range (expected to be in range of [-2, 1], but got 2)
Caught DependencyError: Dimension out of range (expected to be in range of [-2, 1], but got 2)
Exception: Dimension out of range (expected to be in range of [-2, 1], but got 2)

I tried making it retry on CPU if a chunk ever runs into that error on GPU, but idk if that's a real solution....
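
The retry idea described just above can be sketched in a device-agnostic way. Everything here is hypothetical: `synthesize` stands in for whatever callable actually runs the TTS on one chunk, and the device names are placeholders; this is not the code that was tried in the PR.

```python
def synthesize_with_fallback(synthesize, chunk, devices=("cuda", "cpu")):
    """Try each device in order and return (result, device_used).

    `synthesize(chunk, device)` is a hypothetical callable wrapping the real
    TTS call; the first device that succeeds wins.
    """
    if not devices:
        raise ValueError("at least one device is required")
    last_error = None
    for device in devices:
        try:
            return synthesize(chunk, device), device
        except (IndexError, RuntimeError) as error:
            # IndexError matches the traceback above; RuntimeError is a
            # common type for GPU-side failures.
            last_error = error
    raise last_error


def fake_synthesize(chunk, device):
    # Stand-in that fails on the GPU the same way the log above does.
    if device == "cuda":
        raise IndexError("Dimension out of range")
    return f"audio for {chunk!r}"


print(synthesize_with_fallback(fake_synthesize, "sentence one"))
```

A fallback like this hides the failure rather than explaining it, so logging which chunks needed the CPU path would still be useful for tracking down the root cause.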

It was from one of these test files I think...

Long_Swahili_test.txt

long_Bengali_test.txt

@ROBERT-MCDOWELL
Collaborator Author

PR pushed, you can try with the last git.

@DrewThomasson
Owner

👍👍🫡

I'll test tomorrow

@ROBERT-MCDOWELL
Collaborator Author

I plan to work on the TTS loading. For now it loads for each conversion, which is very bad. It should be loaded into RAM once for XTTS-v2 and once per language with Fairseq.... my discord is zirmook888
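
A minimal sketch of that loading scheme, assuming a simple in-process dictionary cache: one entry for XTTS regardless of language, and one entry per language for Fairseq. The names `get_tts_model`, `_MODEL_CACHE`, and the `loader` callable are hypothetical and only illustrate the idea.

```python
_MODEL_CACHE = {}


def get_tts_model(engine, language, loader):
    """Return a cached model, calling loader(engine, language) only on a miss.

    XTTS is multilingual, so it is cached once; Fairseq models are cached
    per language. `loader` is a hypothetical callable doing the real load.
    """
    key = (engine, None) if engine == "xtts" else (engine, language)
    if key not in _MODEL_CACHE:
        _MODEL_CACHE[key] = loader(engine, language)
    return _MODEL_CACHE[key]


load_calls = []


def fake_loader(engine, language):
    # Stand-in for an expensive model load; records each real load.
    load_calls.append((engine, language))
    return object()


get_tts_model("xtts", "en", fake_loader)
get_tts_model("xtts", "fr", fake_loader)      # cache hit: same XTTS model
get_tts_model("fairseq", "sw", fake_loader)
get_tts_model("fairseq", "bn", fake_loader)   # new language: loads again
print(len(load_calls))
```

A cache like this keeps every loaded model in memory, so a long-running web instance serving many Fairseq languages would probably need an eviction policy as well.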

@DrewThomasson
Owner

Kk👍

@DrewThomasson
Owner

Just sent a request 👌

@DrewThomasson
Owner

Also I just realized that when importing a custom xtts model, it might support different languages,

So we should probably have the languages that default to xtts be dependent on/linked to the custom xtts model's config.py

@DrewThomasson
Owner

Cause if not someone will run into an error if for instance:

They import a custom xtts model that doesn't support Vietnamese but they select Vietnamese and then the program defaults to xtts cause it's in the hard coded xtts support list
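
One way to avoid that situation could be to read the supported languages from the custom model's config file instead of the hard-coded list. The sketch below assumes a JSON config with a top-level `languages` list and uses Fairseq as the fallback engine; the key name, both function names, and the fallback choice are assumptions, not something confirmed in this thread.

```python
import json


def supported_languages(config_path, default_languages):
    """Return the languages declared by a custom model config (assumed layout).

    Falls back to default_languages when the config has no "languages" list.
    """
    with open(config_path, encoding="utf-8") as handle:
        config = json.load(handle)
    languages = config.get("languages")
    return set(languages) if languages else set(default_languages)


def pick_engine(language, xtts_languages):
    """Use xtts only if the loaded model actually supports the language."""
    return "xtts" if language in xtts_languages else "fairseq"
```

For example, with a custom config listing only `["en", "fr"]`, selecting Vietnamese (`vi`) would route to the fallback engine instead of failing inside xtts.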

@ROBERT-MCDOWELL
Collaborator Author

later later... the whole custom model logic must be reviewed completely to avoid bad situations.

@DrewThomasson
Owner

Tru tru

@DrewThomasson
Owner

I may have found a way to increase the generation speed by 10X or even 15X, specifically when running on the GPU and only for the XTTS models

At a slight reduction in audio quality of course

I'll report back when I get time to do so. It looks very promising though!

@ROBERT-MCDOWELL
Collaborator Author

k

@DrewThomasson
Owner

DrewThomasson commented Dec 3, 2024

But it does break when running on cpu,

made a demo space to test it out with

https://huggingface.co/spaces/drewThomasson/fast_xtts_tts_test

Based on this github:

https://github.com/astramind-ai/Auralis

Sucks that it breaks when trying to run or install on a computer with no GPU, so it might increase the complexity of our code.

A sample output wav file from it tho:

tmpd5psiu05.wav.zip

and the generation time results I got running on a 15 GB VRAM CUDA GPU

Non-enhanced
Generated in 39.14 seconds, Audio Length: 525.02 seconds


Enhanced:
Generated in 35.64 seconds, Audio Length: 491.31 seconds
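
To make these two results easier to compare with "X times real time" figures, the speed factor is just audio length divided by generation time. A small sketch (the function name `real_time_factor` is only illustrative):

```python
def real_time_factor(audio_seconds, generation_seconds):
    """How many seconds of audio are produced per second of compute."""
    if generation_seconds <= 0:
        raise ValueError("generation time must be positive")
    return audio_seconds / generation_seconds


# Numbers copied from the two runs reported above.
runs = {
    "non-enhanced": (525.02, 39.14),
    "enhanced": (491.31, 35.64),
}
for name, (audio, generation) in runs.items():
    print(f"{name}: {real_time_factor(audio, generation):.1f}x real time")
```

Both runs land between 13x and 14x real time.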


@DrewThomasson
Owner

I do prefer how our standard way of using Coqui TTS works tho

Cause our way still sounds better in my opinion, this seems slightly more grainy

So perhaps this only as an experimental feature :/

@ROBERT-MCDOWELL
Collaborator Author

ROBERT-MCDOWELL commented Dec 3, 2024

humm, a 4 sec improvement.... not worth it for now...
do you know mojo? https://www.modular.com/mojo
it's a new language with python syntax claiming to be 35000x faster than python.
promising... the only downside is that its license is proprietary for now.

@DrewThomasson
Owner

DrewThomasson commented Dec 3, 2024

Interesting...

Heck, I realized my wording may have been confusing

  • "enhanced" is an audio enhancement parameter in the new method

If you wanted a comparison between our method and theirs it's

Our Coqui tts = Best GPU is 2X real time

This new method I found = Best GPU is 10X-15X real time

@DrewThomasson
Owner

Wouldn't the coqui tts code and all the modules we're using need to be written in Mojo tho?

humm, 4 sec improvements.... not worth for now...

do you know mojo? https://www.modular.com/mojo

it's a new language with python syntax claiming to be 35000x faster than python.

promising... the only down is it's license proprietary for now.

@ROBERT-MCDOWELL
Collaborator Author

it's only been out for 6 months.... someone will certainly find a way to make a bridge/conversion

@DrewThomasson
Owner

Good point, we'll focus on what we know works

And if it's good enough someone will probably integrate it into coqui tts

@ROBERT-MCDOWELL
Collaborator Author

no time for now to test on GPU, I dev on CPU only, and when I finish the multiX things, I have to install the OS and SET my xavier jansen minibox and mini ITX for tests...

@ROBERT-MCDOWELL
Collaborator Author

there is certainly a patent war between A.I. solutions, and seeing so many A.I. projects flourish and then, after 2 or 3 years, get abandoned or just close their open source section, nothing looks very positive for further A.I. open source dev....
the official coqui-tts is already abandoned without any notice; only the fork we are using is trying to maintain and improve the project.

@scruffynerf

do you guys have a discord yet?

@DrewThomasson
Owner

Just made one

https://discord.gg/zk8AAd4T

I guess I'll add that in the readme for v2.0

@DrewThomasson
Owner

Join Our Discord Server!

Click the badge below to join the Ebook2audiobook Discord Server!
Discord

3 participants