tdrz and coreml support? #1088

Open
whicks1 opened this issue Jul 7, 2023 · 12 comments

Comments

@whicks1

whicks1 commented Jul 7, 2023

Setting up a new MacBook Pro (M2), added Core ML, works great! Except with the new tdrz feature.

Running ./models/generate-coreml-model.sh small.en-tdrz fails: small.en-tdrz is missing from the conversion script's list of options.

Traceback (most recent call last):
  File "/whisper.cpp/models/convert-whisper-to-coreml.py", line 306, in <module>
    raise ValueError("Invalid model name")
ValueError: Invalid model name
coremlc: error: Model does not exist at models/coreml-encoder-small.en-tdrz.mlpackage -- file:////whisper.cpp/
mv: rename models/coreml-encoder-small.en-tdrz.mlmodelc to models/ggml-small.en-tdrz-encoder.mlmodelc: No such file or directory
@akeybl

akeybl commented Jul 12, 2023

cc @akashmjn, who proposed the original (awesome) PR. This is important for speed improvements on iPhones/Mac.

@akeybl

akeybl commented Jul 12, 2023

Was able to work around this by adding the following two lines

https://github.com/akashmjn/tinydiarize/blob/886a8d3523c5bf6e8bfc57f3441f1ce6f4345ad4/whisper/__init__.py#L23
https://github.com/akashmjn/tinydiarize/blob/886a8d3523c5bf6e8bfc57f3441f1ce6f4345ad4/whisper/__init__.py#L40

to the whisper package installed in my miniconda environment

~/miniconda3/envs/py310-whisper/lib/python3.10/site-packages/whisper/__init__.py

Probably a better way, but it worked heh...

@ldenoue

ldenoue commented Aug 16, 2023

@akeybl do you have this model uploaded somewhere? Perhaps we could add it to the same folder on Hugging Face where @akashmjn already saved the non-CoreML version? https://huggingface.co/akashmjn/tinydiarize-whisper.cpp

@akashmjn
Contributor

Hey @akeybl, thanks for the cc! I was on break for a bit, hence the delay. Looks like I missed the CoreML conversion in this PR #1001

I'll take a look and fix this later today (both conversion from pytorch checkpoint, and directly supporting small.en-tdrz in download-coreml-model.sh)

Excited to see someone give this a spin on an iPhone!

@ldenoue

ldenoue commented Aug 16, 2023

@akashmjn would it also be possible to get tiny.en-tdrz?

@akashmjn
Contributor

akashmjn commented Aug 21, 2023

1). Regarding a finetuned tiny.en-tdrz, I'd tried it but it didn't actually work very well. Likely because it is a very weirdly shaped model (token embeddings are >50% of total params).

2). I just looked into generate-coreml-model.sh. Everything should work - you just need to ensure your local whisper package finds the small.en-tdrz checkpoint name.

This can be done either by

  • replacing the Python openai-whisper package in your env with my fork: pip install git+https://github.com/akashmjn/tinydiarize.git
  • just hacking whisper/__init__.py as @akeybl did (totally valid 🙂) by adding a path to my mirrored PyTorch checkpoint.

3). Regarding download-coreml-model.sh or hosting of pre-converted coreML checkpoints: it appears from #566 that until things stabilize with coreML, the maintainer (ggerganov)'s recommendation is that everyone convert locally themselves. So in the meantime would adding a -tdrz section to ./models/README.md help?

I don't actually have the right Mac/iPhone hardware atm to test the CoreML stuff so let me know how it goes.

@ScriptTiger

ScriptTiger commented Mar 5, 2024

1). Regarding a finetuned tiny.en-tdrz , i'd tried it but it didn't actually work very well. Likely because it is a very weirdly shaped model (token embeddings are >50% of total params).

@akashmjn, would a base.en-tdrz have the same issue?

@bwuzhang

The link has expired. Does anyone have the pt file hosted somewhere else? https://github.com/akashmjn/tinydiarize/blob/886a8d3523c5bf6e8bfc57f3441f1ce6f4345ad4/whisper/__init__.py#L23

@dylanfrankcom

The link has expired. Does anyone have the pt file hosted somewhere else? https://github.com/akashmjn/tinydiarize/blob/886a8d3523c5bf6e8bfc57f3441f1ce6f4345ad4/whisper/__init__.py#L23

Interested in this as well!

@charlie-ac

The link has expired. Does anyone have the pt file hosted somewhere else? https://github.com/akashmjn/tinydiarize/blob/886a8d3523c5bf6e8bfc57f3441f1ce6f4345ad4/whisper/__init__.py#L23

If any of you have worked with the tdrz model last year, could you please check your ~/.cache/whisper for small.en-tdrz.pt? The ggml version is on Hugging Face, but the original .pt file seems to be lost to time.
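
A quick way to run that check is sketched below. It assumes the openai-whisper default download location, which to the best of current knowledge is `$XDG_CACHE_HOME/whisper`, falling back to `~/.cache/whisper`; the helper names `default_whisper_cache` and `find_checkpoint` are hypothetical, and a custom `download_root` passed at load time would put the file somewhere else.

```python
import os
from pathlib import Path


def default_whisper_cache():
    """Default download directory assumed for openai-whisper checkpoints."""
    fallback = os.path.join(os.path.expanduser("~"), ".cache")
    return Path(os.getenv("XDG_CACHE_HOME", fallback)) / "whisper"


def find_checkpoint(filename="small.en-tdrz.pt", cache_dir=None):
    """Return the path of a cached checkpoint, or None if it is not there."""
    root = Path(cache_dir) if cache_dir is not None else default_whisper_cache()
    candidate = root / filename
    return candidate if candidate.is_file() else None


found = find_checkpoint()
print(found if found else "small.en-tdrz.pt not found in the whisper cache")
```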

@prafulkl

prafulkl commented Apr 5, 2024

I tried to load the ggml model from the link below, but I don't understand how to load it:
https://huggingface.co/akashmjn/tinydiarize-whisper.cpp/tree/main
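
For context, the ggml file is a whisper.cpp model rather than a PyTorch checkpoint, so it is loaded by the whisper.cpp binary, not by the Python whisper package. A minimal sketch of building that invocation, mirroring the command used in a later comment in this thread: it assumes the binary has been built as `./main`, that the model was saved as `models/ggml-small.en-tdrz.bin`, and `audio.wav` is a placeholder input; `build_tdrz_command` is a hypothetical helper.

```python
import shlex


def build_tdrz_command(model_path, wav_path, binary="./main"):
    """Build the whisper.cpp argument list for a tdrz model and a WAV file."""
    # -m selects the ggml model, -f the input audio, -tdrz enables tinydiarize.
    return [binary, "-m", model_path, "-f", wav_path, "-tdrz"]


cmd = build_tdrz_command("models/ggml-small.en-tdrz.bin", "audio.wav")
print(shlex.join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```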

@novakai

novakai commented Dec 28, 2024

Is anyone else experiencing issues with this? All I see when running the small model using the "hack" above is

$ ./main -pp -oj -tdrz -m models/ggml-small.en-tdrz.bin -f $WAV_PATH

...

[00:11:15.580 --> 00:11:17.560]  ,
[00:11:25.000 --> 00:11:46.440]  ,
[00:11:46.440 --> 00:11:46.440]  ,.
[00:11:46.440 --> 00:12:07.880]  ,.
[00:12:07.880 --> 00:12:36.400]  .
[00:12:36.400 --> 00:12:57.840]  .
[00:12:57.840 --> 00:13:19.280]  .
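
To quantify how much of a transcript is affected, the segments whose text contains no letters or digits can be counted. A minimal sketch, assuming the `[start --> end]  text` line format shown above; `empty_segments` is a hypothetical helper, and the third sample line is made up for contrast.

```python
import re

# Matches lines such as "[00:11:15.580 --> 00:11:17.560]  ,"
SEGMENT_RE = re.compile(
    r"^\[(\d{2}:\d{2}:\d{2}\.\d{3}) --> (\d{2}:\d{2}:\d{2}\.\d{3})\]\s*(.*)$"
)


def empty_segments(lines):
    """Return (start, end, text) for segments with no letters or digits."""
    flagged = []
    for line in lines:
        match = SEGMENT_RE.match(line.strip())
        if match is None:
            continue  # not a segment line, e.g. "..." or log output
        start, end, text = match.groups()
        if not any(ch.isalnum() for ch in text):
            flagged.append((start, end, text))
    return flagged


sample = [
    "[00:11:15.580 --> 00:11:17.560]  ,",
    "[00:11:46.440 --> 00:11:46.440]  ,.",
    "[00:00:00.000 --> 00:00:02.000]  Hello there.",  # made-up healthy segment
]
print(empty_segments(sample))
```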


10 participants