Implementation of batched whisper and updates on Audio pipeline #53

Jiltseb · 2024-02-14T09:27:01Z

This PR implements the feature requested in #41: it integrates a faster, batched version of ASR (Whisper) into the Aana SDK. It also adds an Audio dataclass for type handling in audio pipelines.

Components:

VAD deployment and related scripts.
Updated Whisper deployment and related scripts.
Updated pipelines and endpoints.
Audio dataclass creation, file handling and execution throughout the pipeline.
Tests for the Audio dataclass, the extract_audio method, the VAD and Whisper deployments, plus integration tests for batched Whisper.
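
For readers unfamiliar with the approach, here is a minimal sketch of how VAD output can feed batched transcription: speech segments are merged into chunks of bounded length, and the chunks are then grouped into fixed-size batches for the ASR model. This is an illustration only, not the code in this PR. `Segment`, `merge_segments`, `make_batches`, and the default values of `max_duration` and `batch_size` are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A speech region in seconds (hypothetical type, not the SDK's)."""

    start: float
    end: float


def merge_segments(segments: list[Segment], max_duration: float = 30.0) -> list[Segment]:
    """Greedily merge consecutive VAD segments into chunks of at most max_duration.

    A single segment that is already longer than max_duration is kept as-is;
    this sketch does not split it.
    """
    chunks: list[Segment] = []
    for seg in segments:
        if chunks and seg.end - chunks[-1].start <= max_duration:
            # Extending the current chunk keeps it within the limit.
            chunks[-1] = Segment(chunks[-1].start, seg.end)
        else:
            # Start a new chunk at this segment.
            chunks.append(Segment(seg.start, seg.end))
    return chunks


def make_batches(chunks: list[Segment], batch_size: int = 4) -> list[list[Segment]]:
    """Group chunks into lists of at most batch_size items."""
    return [chunks[i : i + batch_size] for i in range(0, len(chunks), batch_size)]


if __name__ == "__main__":
    vad_output = [Segment(0, 10), Segment(12, 25), Segment(28, 40), Segment(100, 110)]
    for batch in make_batches(merge_segments(vad_output), batch_size=2):
        print(batch)
```

An empty VAD result simply yields no chunks and no batches, so a caller can return an empty transcription without invoking the model.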

Jiltseb · 2024-03-12T15:22:11Z

Added the changes requested in @movchan74's review. Please resolve the comments whose issues are addressed and point out any that are not.

Jiltseb added 30 commits January 26, 2024 16:31

added required packages

c4ab130

added endpoint for transcribe_batch

280a7c8

added vad deployment details

a838ca4

added nodes for batched inference and storage

8339600

vad deployment for segmenting the audio

d68d84a

modified whisper deployment for batched_inference

34681a7

modify output datamodel to accommodate batched inference

ab89fcf

output datamodel for vad_deployment

3c89095

parameters of the vad model and inference

b3e6fc5

added default parameters for batched asr

b642373

added example in the demo

1d6087d

Added end point for batched transcription

ce1160f

Added nodes for audio extraction, vad processing, batched transcription and saving batched transcription

ac2055c

Modified deployment for faster audio reading

f5d40a4

Modified deployment for faster audio reading and handling empty vad segments

60c31ac

Added extract_audio function utility for video to audio conversion

aa6a33a

Added examples and benchmarking scripts on documentary data

de6d00a

test script for vad_deployment

f860166

changes to vad deployment

5c697d7

batched_inference endpoint

25a97a7

pipeline changes for Audio dataclass

90d41de

added audio_dir

b4fe114

input type changes, fixes

1d9a7bf

input type changes, fixes

5e67eff

added AudioReadingException

a25c8ce

added core Audio dataclass

c0a641d

tests for vad deployment

399a8f2

updated tests for whisper deployment

2961ab0

adding test audio file

ef503ec

added expected vad_output for testing

e0a8807

changes for audio PR after reviews

61dc28f

movchan74 reviewed Mar 12, 2024

aana/utils/general.py Outdated

movchan74 reviewed Mar 12, 2024

aana/utils/general.py Outdated

Jiltseb added 3 commits March 13, 2024 00:07

changes for vad deployment, general util files and adding batched version for chat_with_video

71c5cf3

updating cache files

eff4085

updated cache files

04c64df

movchan74 reviewed Mar 13, 2024

aana/tests/deployments/test_vad_deployment.py Outdated

vad_deployment test and rerun tests

8f3916e

movchan74 reviewed Mar 13, 2024

aana/utils/general.py Outdated

Jiltseb added 2 commits March 13, 2024 09:01

ruff checks passed

f493f22

update Path once

1342d2f

movchan74 reviewed Mar 13, 2024

aana/models/core/audio.py Outdated

movchan74 reviewed Mar 13, 2024

aana/models/core/audio.py Outdated

Jiltseb added 3 commits March 13, 2024 14:30

renamed model, added model_dir

996eb27

changed reading from bytes

8a6b5d9

linked issue, removed duplicate

28b4c40

evanderiel approved these changes Mar 14, 2024

aana/utils/video.py Outdated

Jiltseb and others added 5 commits March 14, 2024 12:32

updated docstrings

c3ea371

Merge branch 'main' into js/batched_whisper

b29c7b7

Updated test files and cache

b737a14

Cosmetical fixes

9e6c8e7

Merge branch 'main' into js/batched_whisper

d648973

movchan74 approved these changes Mar 15, 2024

Update content-hash in poetry.lock

0bb4336

movchan74 requested a review from evanderiel March 15, 2024 14:57

evanderiel approved these changes Mar 16, 2024

Jiltseb merged commit 6f4be69 into main Mar 16, 2024
2 checks passed

movchan74 deleted the js/batched_whisper branch June 28, 2024 08:50

movchan74 pushed a commit that referenced this pull request Sep 19, 2024

Merge pull request #53 from mobiusml/js/batched_whisper

c8e1259

Implementation of batched whisper and updates on Audio pipeline
