Build for TARGET_ARCH=fusion_f1 via reference implementation fallback. #45464
Thanks for contributing to TensorFlow Lite Micro. To keep this process moving along, we'd like to make sure that you have completed the items on this list:
We would like to have a discussion on the GitHub issue first to determine the best path forward, and then proceed to the PR review.
tagging @pnikam-cad @nyadla-sys @kpraving
This change adds reference fallbacks to the optimized xtensa kernels for the case when `TARGET_ARCH` is anything other than hifimini.

This sets the stage for a baseline from which we can incrementally optimize for architectures other than hifimini.

The goal is to have a starting point where all the unit tests pass for `TARGET_ARCH=hifimini` (which will use the optimized implementations) or any other `TARGET_ARCH` (with reference fallback).

Tested for `TARGET_ARCH=fusion_f1` with:

```
make -f tensorflow/lite/micro/tools/make/Makefile -j8 TARGET=xtensa OPTIMIZED_KERNEL_DIR=xtensa TARGET_ARCH=fusion_f1 XTENSA_CORE=Google_F1 test
```

With the following profiling results:

```
make -f tensorflow/lite/micro/tools/make/Makefile -j8 TARGET=xtensa OPTIMIZED_KERNEL_DIR=xtensa TARGET_ARCH=fusion_f1 XTENSA_CORE=Google_F1 test_keyword_benchmark
InitializeKeywordRunner() took 239061 ticks (239 ms)
KeywordRunNIerations(1) took 168564 ticks (168 ms)
KeywordRunNIerations(10) took 1685111 ticks (1685 ms)
```

```
make -f tensorflow/lite/micro/tools/make/Makefile -j8 TARGET=xtensa OPTIMIZED_KERNEL_DIR=xtensa TARGET_ARCH=fusion_f1 XTENSA_CORE=Google_F1 keyword_benchmark BUILD_TYPE=release
xt-size tensorflow/lite/micro/tools/make/gen/xtensa_fusion_f1/bin/keyword_benchmark
   text    data     bss     dec     hex filename
  48256   40132   24952  113340   1babc tensorflow/lite/micro/tools/make/gen/xtensa_fusion_f1/bin/keyword_benchmark
```

After this change, we can:

* add a continuous build for Hifi4
* add optimizations for Hifi4 on a per-kernel basis and keep profiling the impact of these optimizations on the keyword benchmark cycles and binary size.
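For readers unfamiliar with the pattern, the fallback described above can be sketched as compile-time dispatch between an optimized kernel and a portable reference kernel. This is a minimal illustration, not the code in this PR: the macro `SKETCH_HIFIMINI`, the `sketch` namespace, and the `Relu` functions are all hypothetical names, and the "optimized" body is only a stand-in for architecture-specific intrinsics.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

namespace sketch {

// Portable reference kernel: plain C++ that builds for any TARGET_ARCH.
inline void ReferenceRelu(const int8_t* input, int8_t* output,
                          std::size_t size, int8_t zero_point) {
  for (std::size_t i = 0; i < size; ++i) {
    output[i] = std::max(input[i], zero_point);
  }
}

#if defined(SKETCH_HIFIMINI)
// Stand-in for an architecture-specific kernel. A real one would use
// intrinsics that only exist on that core; here it simply delegates.
inline void OptimizedRelu(const int8_t* input, int8_t* output,
                          std::size_t size, int8_t zero_point) {
  ReferenceRelu(input, output, size, zero_point);
}
#endif

// Single entry point used by callers. The hypothetical SKETCH_HIFIMINI
// macro is assumed to be defined by the build only for the optimized
// architecture; every other architecture takes the reference path.
inline void Relu(const int8_t* input, int8_t* output, std::size_t size,
                 int8_t zero_point) {
#if defined(SKETCH_HIFIMINI)
  OptimizedRelu(input, output, size, zero_point);
#else
  ReferenceRelu(input, output, size, zero_point);
#endif
}

// Reports which path was compiled in, useful when reading benchmark logs.
constexpr bool UsesOptimizedKernels() {
#if defined(SKETCH_HIFIMINI)
  return true;
#else
  return false;
#endif
}

}  // namespace sketch
```

Keeping one entry point per kernel means a new architecture can be optimized one kernel at a time, by adding another guarded branch, while every other kernel keeps passing the unit tests through the reference path.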
Also tested that `TARGET_ARCH=hifimini` is unaffected:

```
make -f tensorflow/lite/micro/tools/make/Makefile -j8 TARGET=xtensa OPTIMIZED_KERNEL_DIR=xtensa TARGET_ARCH=hifimini XTENSA_CORE=mini1m1m_RG test_keyword_benchmark
InitializeKeywordRunner() took 1392788 ticks (1392 ms)
KeywordRunNIerations(1) took 89195 ticks (89 ms)
KeywordRunNIerations(10) took 891509 ticks (891 ms)
```

```
make -f tensorflow/lite/micro/tools/make/Makefile -j8 TARGET=xtensa OPTIMIZED_KERNEL_DIR=xtensa TARGET_ARCH=hifimini XTENSA_CORE=mini1m1m_RG keyword_benchmark BUILD_TYPE=release
xt-size tensorflow/lite/micro/tools/make/gen/xtensa_hifimini/bin/keyword_benchmark
   text    data     bss     dec     hex filename
  46080   40204   24952  111236   1b284 tensorflow/lite/micro/tools/make/gen/xtensa_hifimini/bin/keyword_benchmark
```
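To put the two sets of numbers side by side, the arithmetic can be checked with a few compile-time constants. This is only a sketch over the figures quoted in this description; the `report` namespace and all names in it are made up for illustration, and the tick comparison assumes a tick means the same thing on both cores, which the logs alone do not confirm.

```cpp
#include <cstdint>

namespace report {

// Section sizes as printed by xt-size in the release builds above.
struct Size {
  uint32_t text;
  uint32_t data;
  uint32_t bss;
  constexpr uint32_t total() const { return text + data + bss; }
};

constexpr Size kFusionF1{48256, 40132, 24952};  // reference fallback build
constexpr Size kHifimini{46080, 40204, 24952};  // optimized build

// The "dec" column is the sum of text, data and bss.
static_assert(kFusionF1.total() == 113340, "fusion_f1 dec column");
static_assert(kHifimini.total() == 111236, "hifimini dec column");

// Extra bytes in the fusion_f1 binary relative to hifimini.
constexpr uint32_t kSizeDelta = kFusionF1.total() - kHifimini.total();

// Ticks for a single KeywordRunNIerations(1) call on each target.
constexpr uint32_t kFusionF1Ticks = 168564;
constexpr uint32_t kHifiminiTicks = 89195;

// Ratio of reference-fallback ticks to optimized ticks for one inference.
constexpr double kSlowdown =
    static_cast<double>(kFusionF1Ticks) / kHifiminiTicks;

}  // namespace report
```

With these figures the fusion_f1 binary is 2104 bytes larger, and one inference takes roughly 1.9 times as many ticks as on hifimini, which is the baseline that later per-kernel optimizations would be measured against.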
7024ab8 to 00f5e3c
Internal checks were failing (while the external build was OK) because there is an automatic clang-format step before the code is imported into the Google codebase, and my original commit had not been clang-formatted and was missing an associated header. d2fd64f fixes the issue.
#45464 added a new file but did not add the Apache header. Instead the internal change was force submitted. This resulted in breaking all sync between internal and external.

PiperOrigin-RevId: 346613912
Change-Id: I078c18f677dcf05be01966b2277f28b4ef42ad68