
Request to support aarch64 #15

Closed · trholding opened this issue Oct 3, 2022 · 3 comments
Labels: build (Build related issues) · enhancement (New feature or request)

@trholding
Contributor

Make errors out on an aarch64 server:

make base.en
#gcc -pthread -O3 -c ggml.c
gcc -pthread -O3 -mcpu=cortex-a72 -mfloat-abi=hard -mfpu=neon-fp-armv8 -mfp16-format=ieee -mno-unaligned-access -c ggml.c
gcc: error: unrecognized command-line option ‘-mfloat-abi=hard’
gcc: error: unrecognized command-line option ‘-mfpu=neon-fp-armv8’
gcc: error: unrecognized command-line option ‘-mfp16-format=ieee’
gcc: error: unrecognized command-line option ‘-mno-unaligned-access’
make: *** [Makefile:7: ggml.o] Error 1

Perhaps these C flags would be enough: -Ofast -g -mfpu=neon
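For reference, a minimal sketch of how the Makefile could pick flags per architecture instead of hard-coding the 32-bit ARM ones. The uname -m detection and -mcpu=native are assumptions here, not what the repo currently does; on aarch64, NEON is part of the base ISA, and the -mfloat-abi/-mfpu/-mfp16-format options simply do not exist in the 64-bit gcc backend:

# Sketch only (GNU make): select CFLAGS by machine type.
UNAME_M := $(shell uname -m)

CFLAGS := -pthread -O3

ifeq ($(UNAME_M),aarch64)
    # 64-bit ARM: NEON is mandatory, so just tune for the host CPU.
    CFLAGS += -mcpu=native
else ifneq (,$(filter armv7% armv8%,$(UNAME_M)))
    # 32-bit ARM: the original flags are valid here.
    CFLAGS += -mfloat-abi=hard -mfpu=neon-fp-armv8 -mfp16-format=ieee -mno-unaligned-access
endif

ggml.o: ggml.c
	gcc $(CFLAGS) -c ggml.c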

@ggerganov ggerganov added the enhancement New feature or request label Oct 3, 2022
@WilliamTambellini
Contributor

A solution: link against
https://github.com/oneapi-src/oneDNN
and let oneDNN do all the optimizations for the local CPU at runtime, whatever the CPU model/brand.
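A rough sketch of what that build could look like, assuming libdnnl is installed system-wide and that ggml's hot loops were actually ported to call oneDNN primitives (purely hypothetical; this comment only proposes it). DNNL_VERBOSE=1 is oneDNN's standard diagnostic switch and logs which instruction set it dispatched to at runtime:

# Hypothetical build: the -ldnnl link only helps once ggml calls oneDNN.
gcc -pthread -O3 -c ggml.c
g++ -pthread -O3 main.cpp ggml.o -ldnnl -o main

# oneDNN selects the best kernels for the local CPU at runtime;
# DNNL_VERBOSE=1 prints the ISA it detected (e.g. NEON on aarch64).
DNNL_VERBOSE=1 ./main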

@ggerganov
Owner

@trholding Yes, I think this will work. Obviously, the current Makefile is not very cross-platform - I have mainly focused on being able to compile on M1 and Intel processors. With time, it will be improved to support various other platforms.

@ggerganov ggerganov added the build Build related issues label Oct 5, 2022
@trholding
Contributor Author

@ggerganov Works awesome on aarch64!

time make base.en 
bash ./download-ggml-model.sh base.en
Downloading ggml model base.en ...
Model base.en already exists. Skipping download.

===============================================
Running base.en on all samples in ./samples ...
===============================================

----------------------------------------------
[+] Running base.en on samples/jfk.wav ... (run 'ffplay samples/jfk.wav' to listen)
----------------------------------------------

whisper_model_load: loading model from 'models/ggml-base.en.bin'
whisper_model_load: n_vocab       = 51864
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 512
whisper_model_load: n_audio_head  = 8
whisper_model_load: n_audio_layer = 6
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 512
whisper_model_load: n_text_head   = 8
whisper_model_load: n_text_layer  = 6
whisper_model_load: n_mels        = 80
whisper_model_load: f16           = 1
whisper_model_load: type          = 2
whisper_model_load: mem_required  = 377.00 MB
whisper_model_load: adding 1607 extra tokens
whisper_model_load: ggml ctx size = 163.43 MB
whisper_model_load: memory size =    22.83 MB 
whisper_model_load: model size  =   140.54 MB

main: processing 'samples/jfk.wav' (176000 samples, 11.0 sec), 4 threads, lang = en, task = transcribe, timestamps = 1 ...

[00:00.000 --> 00:11.000]   And so my fellow Americans, ask not what your country can do for you, ask what you can do for your country.


whisper_print_timings:     load time =   154.29 ms
whisper_print_timings:      mel time =    82.46 ms
whisper_print_timings:   sample time =     7.87 ms
whisper_print_timings:   encode time =  5479.56 ms / 913.26 ms per layer
whisper_print_timings:   decode time =   553.81 ms / 92.30 ms per layer
whisper_print_timings:    total time =  6278.83 ms


real    0m6.331s
user    0m23.265s
sys     0m0.257s
