Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(packages): add speaches support for SileroVAD and Whisper STT as OpenAI API #872

Open
wants to merge 2 commits into
base: dev
Choose a base branch
from

Conversation

OriNachum
Copy link
Collaborator

@OriNachum OriNachum commented Feb 22, 2025

Note: I couldn't get the test to run.
It goes right into server startup.

When I run the test (From another docker) it works great.

Edit: Also, onnx for TTS not supported for now (onnx based)

Copy link
Collaborator

@ms1design ms1design left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there @OriNachum, thanks for your PR! 🥇

I’ve just wrapped up my review, and everything looks well crafted. Found just few places where we could improve and resolve some of the issues that you are facing.

Appreciate your hard work and contribution! 🫶

#---
# name: faster-whisper:speaches
# group: audio
# depends: [ctranslate2:master-builder,faster-whisper, python:3.12, numpy]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason to not use ctranslate2 here instead of builder? This impacts the image size and build time. We can easily reduce them.

# docs: docs.md
# notes: run with cuda12.6
#---

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's fix the tabs ;)

&& rm -rf /var/lib/apt/lists/*

# Remove any existing .venv directories
RUN find / -name ".venv" -type d -exec rm -rf {} + 2>/dev/null || true
Copy link
Collaborator

@ms1design ms1design Feb 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running find / as root inside a Docker RUN command can be slow. If you want to remove a .venv dir at a specific path, you can use:

RUN [ -d "/app/.venv" ] && rm -rf /app/.venv || true

Comment on lines +32 to +43
# Build faster-whisper wheel
RUN git clone https://github.com/guillaumekln/faster-whisper && \
cd faster-whisper && \
sed -i \
-e 's|^onnxruntime.*||g' \
-e 's|^ctranslate2.*||g' \
-e 's|^huggingface_hub.*||g' \
-e 's|^tokenizers.*|tokenizers|g' \
requirements.txt && \
pip3 install --no-cache-dir --verbose -r requirements.txt && \
python3 setup.py bdist_wheel && \
cp dist/faster_whisper*.whl ${WHEEL_DIR}/
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You already included the faster-whisper in the depends, so it's wheel is already installed. Safe to remove I assume.

FROM ${BASE_IMAGE}

ARG SPEACHES_ROOT="/opt/speaches"
ARG WHEEL_DIR="/opt/wheels"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dusty-nv would we like to set JC default wheels dir and set this ENV in python package instead?

RUN chmod +x ${SPEACHES_ROOT}/start-speaches-server.sh

# Expose port
EXPOSE 8000
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NTH: Make the port configurable by ARG

Comment on lines +71 to +72
COPY start-speaches-server.sh ${SPEACHES_ROOT}/start-speaches-server.sh
RUN chmod +x ${SPEACHES_ROOT}/start-speaches-server.sh
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I doubt that anybody would build jetson-containers on Windows, so we are safe to assume that you can set (on Linux) start-speaches-server.sh permissions before copying the file into docker image as the Docker will typically preserve the permissions if you're working in a Unix-like environment.

Comment on lines +56 to +62
# Print locations of globally installed packages (for debugging)
RUN python3 -c "import sys, numpy, ctranslate2, faster_whisper; \
print(f'Python path: {sys.path}'); \
print(f'NumPy: {numpy.__file__}'); \
print(f'ctranslate2: {ctranslate2.__file__}'); \
print(f'faster_whisper: {faster_whisper.__file__}')" || \
echo "Warning: Not all packages could be imported."
Copy link
Collaborator

@ms1design ms1design Feb 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not needed and can be moved to tests.

# requires: '>=34.1.0'
# test: [test.py]
# docs: docs.md
# notes: run with cuda12.6
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you dare to create config.py for this package we could create a minimum requirement for this package and pin it to cu126

@@ -0,0 +1,77 @@
#---
# name: faster-whisper:speaches
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dusty-nv I know that the initial package name comes from package root directory. Should we allow to use colon : within the package name in Dockerfile?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants