WeNet 3.0 Roadmap #1192
Comments
Do we have a plan to enhance post-processing, such as punctuation restoration? |
Yes, ITN and punctuation are in our plan, and the solution should be simple and elegant. |
Do you have a plan to implement text-to-speech models? |
I found this: https://github.com/wenet-e2e/wetts |
For binding, there are three questions: 1. Getting a model by language type: can we supply a small and a big model for each language? (The small model could be trained with a KD method.) 2. Shrinking libtorch: it is currently a little big, but libtorch has many backends such as MKL or OpenMP, so it is not easy to make it small just by passing compile arguments. It seems we need to open a repo to do this? 3. For advanced usage, should we open more APIs for developers in other languages, like an ONNX model? |
Hello, any plan for a VAD in the future? |
Hi, do you have plans to introduce some text only domain adaptation methods? Or do you have any suggestions on the topic? |
Under testing... |
New architecture? Any paper for reference? |
The server side is the old architecture with a smaller acoustic unit, but we don't need forced alignment. From @Mddct:
|
@robin1001 Hi, it cannot be built on macOS M1. |
More data augmentation, like RIR: torchaudio will add multi-channel RIR based on pyroomacoustics. |
Great! |
What about adding a timestamp per word? The current implementation does not seem accurate with its 100 ms fixed width. |
It is really exciting to see the Raspberry Pi on the 3.0 roadmap, but there are now so many great AArch64 platforms, often with friendlier OpenCL-capable GPUs and NPUs, that maybe the focus should not be on the Raspberry Pi alone, especially as its stock availability is not looking good. It would be great to see a loosely coupled system where you can mix and match modules from any vendor to make up a voice system. Linux needs standalone modules with standard conf files that are weakly linked by queues. There is a natural serial queue to voice processing: 1... Mic/KWS input. That is, in a nutshell, how simple a native Linux voice system can be: it is just a series of queues, and keeping it simple with native Linux methods rather than embedded programming means it scales to the complex. Each Mic/KWS is allocated to a zone (room) and a channel, which should remain in the /etc/conf Linux file system and likely mirror the zone and channel of the audio system outputs. |
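To make the "series of queues" idea concrete, here is a minimal sketch of stages linked only by queues, using nothing but the Python standard library. The stage functions (`tag_zone`, `fake_asr`), the zone and channel values, and the `run_pipeline` helper are all hypothetical names chosen for illustration; they are not part of WeNet, wekws, or any existing Linux voice stack, and the stages after the Mic/KWS input are placeholders because the original list is not spelled out.

```python
import queue
import threading

STOP = object()  # sentinel that tells each stage to shut down


def stage(fn, inbox, outbox):
    """Read items from inbox, apply fn, and forward results to outbox."""
    while True:
        item = inbox.get()
        if item is STOP:
            outbox.put(STOP)  # propagate shutdown to the next stage
            break
        outbox.put(fn(item))


def run_pipeline(items, fns):
    """Run items through fns, one thread per stage, linked only by queues."""
    queues = [queue.Queue() for _ in range(len(fns) + 1)]
    threads = [
        threading.Thread(target=stage, args=(fn, queues[i], queues[i + 1]))
        for i, fn in enumerate(fns)
    ]
    for t in threads:
        t.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(STOP)

    results = []
    while True:
        out = queues[-1].get()
        if out is STOP:
            break
        results.append(out)
    for t in threads:
        t.join()
    return results


# Hypothetical stages: a Mic/KWS front end that tags audio with its
# zone (room) and channel, followed by a placeholder ASR stage.
def tag_zone(audio):
    return {"zone": "kitchen", "channel": 0, "audio": audio}


def fake_asr(msg):
    # A real system would call an ASR engine here; this only fakes a result.
    return {**msg, "text": f"<transcript of {msg['audio']}>"}


if __name__ == "__main__":
    for result in run_pipeline(["chunk-0", "chunk-1"], [tag_zone, fake_asr]):
        print(result["zone"], result["channel"], result["text"])
```

Because each stage only sees its inbox and outbox, any stage could be swapped for another vendor's module without touching the rest, which is the loose coupling described above.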
Sorry, I don't get the point. WeNet focuses on ASR. It should be easy to integrate WeNet into your system if it requires ASR. |
You have wekws & wetts as well? |
Yes. |
Would it be convenient for you to provide the code for knowledge distillation based on WeNet? |
Update: for 2, we now support the ORT backend in wenetruntime. |
If you are interested in WeNet 3.0, please see our roadmap https://github.com/wenet-e2e/wenet/blob/main/ROADMAP.md, and discuss it here.
WeNet is a community-driven project and we love your feedback and proposals on where we should be heading.
Feel free to volunteer yourself if you are interested in trying out some items (they do not have to be on the list).